MIRERC 047/2025: Clustering-Based Client Selection Technique for Federated Learning in Heterogeneous Environments
Abstract
Federated Learning (FL) enables decentralized intelligence by allowing multiple clients to collaboratively train models while preserving data privacy. However, the diversity inherent in heterogeneous environments, ranging from varied data distributions and computational capabilities to fluctuating network conditions, poses significant challenges to achieving efficient and accurate model convergence. This thesis proposes a clustering-based client selection technique to address these challenges. The proposed framework groups clients based on key performance indicators and data characteristics, ensuring that only subsets of clients with similar profiles participate in each training round. The clustering mechanism optimizes the selection process by identifying groups whose aggregated local model updates are most beneficial to global convergence. This not only reduces communication overhead by cutting redundant or misaligned updates but also mitigates the adverse effects of data and system heterogeneity. The technique adjusts dynamically to the evolving environment, re-clustering and reassigning clients as necessary to maintain optimal learning conditions throughout the training process. The framework will be validated through simulation-based experiments and evaluation on real-world data. Model accuracy, convergence speed, and communication cost will be used to benchmark performance improvements over traditional client selection techniques.
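The selection idea described above can be illustrated with a minimal sketch. It is not the thesis implementation: the profile features (compute speed, bandwidth, label skew), the use of plain k-means with farthest-point initialization, the round-robin choice of cluster, and the names `cluster_clients` and `select_for_round` are all assumptions made for illustration only.

```python
import random
from typing import Dict, List, Sequence


def _sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two profile vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def cluster_clients(profiles: Dict[str, Sequence[float]], k: int,
                    iters: int = 20) -> List[List[str]]:
    """Group client IDs into at most k clusters with plain k-means.

    Assumption: each profile is a numeric vector already scaled to a
    comparable range, e.g. (compute speed, bandwidth, label skew).
    """
    ids = sorted(profiles)
    points = [list(profiles[cid]) for cid in ids]
    # Deterministic farthest-point initialization: start from the first
    # client, then repeatedly add the point farthest from all centroids.
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(_sq_dist(p, c) for c in centroids)))
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each client joins its nearest centroid.
        labels = [min(range(k), key=lambda c: _sq_dist(p, centroids[c]))
                  for p in points]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members)
                                for col in zip(*members)]
    clusters: Dict[int, List[str]] = {}
    for cid, lab in zip(ids, labels):
        clusters.setdefault(lab, []).append(cid)
    return [clusters[lab] for lab in sorted(clusters)]


def select_for_round(clusters: List[List[str]], round_idx: int,
                     per_round: int, seed: int = 0) -> List[str]:
    """Pick one cluster per round (round-robin, an assumed policy) and
    sample up to per_round clients from it."""
    group = clusters[round_idx % len(clusters)]
    rng = random.Random(seed + round_idx)
    return rng.sample(group, min(per_round, len(group)))


# Hypothetical profiles: (compute speed, bandwidth, label skew) in [0, 1].
example_profiles = {
    "c1": (0.10, 0.15, 0.20), "c2": (0.12, 0.10, 0.25),
    "c3": (0.90, 0.85, 0.80), "c4": (0.88, 0.92, 0.75),
}
example_clusters = cluster_clients(example_profiles, k=2)
example_round0 = select_for_round(example_clusters, round_idx=0, per_round=2)
```

To mirror the dynamic adjustment described in the abstract, `cluster_clients` could be called again whenever client profiles change, so that later rounds draw from the refreshed clusters.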