Project Details
Description
Federated Learning (FL) is a Machine Learning (ML) technique that enables multiple devices (workers) to jointly learn a statistical model without exchanging data across these devices. Although many FL algorithms have been proposed in the literature, it is not clear whether they actually converge to the result of the centralized version, in which all distributed datasets are placed on a single server and used for training there. Some studies have investigated the performance gap between FL and the centralized version; however, these studies consider only deterministic bounds. Since all FL algorithms belong to the category of statistical ML, only statistical performance bounds should be considered. This project represents the first attempt to explore the fundamental statistical limits of FL. Specifically, we propose to develop a thermodynamics-based approach and apply ergodic theory from thermodynamics to study the necessary and sufficient conditions for the existence of statistical convergence bounds. We will study the fundamental limits of FL training performance, and analyze the statistical convergence properties of FL, from three aspects: computation efficiency, communication efficiency, and the impact of heterogeneous data. First, to improve computation efficiency, we propose to employ a first-order momentum method, Nesterov Accelerated Gradient (NAG) momentum, on both the worker and aggregator sides to increase FL convergence speed; furthermore, the interaction between worker and aggregator momenta will be analyzed, forming the basis for optimizing the momentum factors. Second, to improve communication efficiency, we will develop a three-tier hierarchical FL framework that confines communication overhead to local networks while leveraging momentum acceleration at all three tiers (worker-edge-cloud). Third, to mitigate the negative effect of heterogeneous data, we will develop contrastive aggregation personalization, contrastive model personalization, and contrastive momentum personalization, inspired by contrastive learning and knowledge distillation.
The proposed research has the potential to significantly advance the biomedical industry, which is characterized by its sensitivity to diagnosis accuracy, diagnosis time, and patient data privacy. This is because the proposed methodologies address the performance limits of FL from three key aspects (i.e., computation efficiency, communication efficiency, and data heterogeneity), which align with the aforementioned characteristics of the biomedical industry. In this project, we will implement a corneal disease classifier trained by FL as a real-world demonstration of our theoretical results. Digital twin technologies will also be incorporated, providing feedback on the theoretical methodologies to facilitate further improvements.
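The idea of applying NAG momentum on both the worker and aggregator sides, mentioned in the description above, can be illustrated with a minimal Python sketch. It is not the project's algorithm. It is a toy under stated assumptions: each worker holds a scalar quadratic loss 0.5 * (w - target)^2 with its own target (a stand-in for heterogeneous data), worker momentum is reset every round, and the aggregator applies a Nesterov-style extrapolation to the averaged model. All function names, parameter names, and default values are hypothetical.

```python
def nag_step(w, v, grad_fn, lr, gamma):
    """One Nesterov step: the gradient is evaluated at the look-ahead point."""
    lookahead = w + gamma * v
    v_new = gamma * v - lr * grad_fn(lookahead)
    return w + v_new, v_new


def local_training(w_start, target, steps, lr, gamma):
    """Worker side: a few NAG steps on the toy loss 0.5 * (w - target)**2."""
    def grad_fn(w):
        return w - target

    w, v = w_start, 0.0  # assumption: worker momentum is reset each round
    for _ in range(steps):
        w, v = nag_step(w, v, grad_fn, lr, gamma)
    return w


def federated_nag(targets, rounds=50, local_steps=5, lr=0.1,
                  gamma_worker=0.5, gamma_agg=0.3):
    """Toy FL loop with momentum on both sides (illustrative only)."""
    x = 0.0       # model broadcast to the workers (extrapolated point)
    y_prev = 0.0  # averaged model from the previous round
    for _ in range(rounds):
        # Each worker trains locally, starting from the broadcast model.
        local_models = [
            local_training(x, t, local_steps, lr, gamma_worker)
            for t in targets
        ]
        # Aggregator: plain average, then Nesterov-style extrapolation.
        y = sum(local_models) / len(local_models)
        x = y + gamma_agg * (y - y_prev)
        y_prev = y
    return y_prev


if __name__ == "__main__":
    # Three workers with different local optima (heterogeneous data).
    print(federated_nag([1.0, 2.0, 6.0]))
```

In this toy setting every worker has the same curvature, so the averaged model should approach the centralized optimum, which is the mean of the targets (3.0 in the usage example). The two momentum factors `gamma_worker` and `gamma_agg` are the quantities whose interaction would have to be tuned; the values here are arbitrary.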
| Project number | 9043833 |
|---|---|
| Grant type | GRF |
| Status | Active |
| Effective start/end date | 1/01/26 → … |