GSASG: Global Sparsification With Adaptive Aggregated Stochastic Gradients for Communication-Efficient Federated Learning

Research output: Journal Publications and Reviews · RGC 21 - Publication in refereed journal · peer-review

Author(s)

  • Runmeng Du
  • Daojing He
  • Zikang Ding
  • Miao Wang
  • Xuru Li

Detail(s)

Original language: English
Number of pages: 14
Journal / Publication: IEEE Internet of Things Journal
Publication status: Online published - 20 May 2024

Abstract

This paper addresses the challenge of communication efficiency in federated learning by proposing an algorithm called global sparsification with adaptive aggregated stochastic gradients (GSASG). GSASG combines the advantages of local sparse communication, global sparsification communication, and adaptive aggregated gradients. More specifically, we devise an efficient global top-k’ sparsification operator. Applying this operator to the aggregated gradients obtained from top-k sparsification sparsifies the global model parameter and reduces the downloaded bits from O(dMT) to O(k’MT), where d is the dimension of the gradient, M is the number of workers, T is the total number of epochs, and k’ < k < d. Meanwhile, an adaptive aggregated gradient method is adopted to skip unnecessary communication and reduce the number of communication rounds. Experiments on deep neural network training demonstrate that, compared with previous algorithms, GSASG significantly reduces communication cost without sacrificing model performance. For instance, on the MNIST dataset with k = 1%d and k’ = 0.5%d, GSASG reduces the number of communication rounds by 91% relative to sparse communication, by 90% relative to adaptive aggregated gradients, and by 56% relative to the combination of sparse communication with adaptive aggregated gradients. In terms of communication bits, GSASG requires only 1% of the bits needed by previous algorithms. © 2024 IEEE.
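
As a rough illustration of the scheme summarized above (and not the authors' implementation), the following minimal NumPy sketch shows the two-level idea: each worker uploads a top-k sparsified gradient, the server averages the sparse uploads, and a global top-k’ operator sparsifies the aggregated update before it is broadcast. The worker count, gradient dimension, and helper names are illustrative assumptions.

    import numpy as np

    def top_k(vec, k):
        """Keep the k largest-magnitude entries of vec and zero out the rest."""
        out = np.zeros_like(vec)
        idx = np.argpartition(np.abs(vec), -k)[-k:]
        out[idx] = vec[idx]
        return out

    # Illustrative setup (assumed values): M workers, gradient dimension d,
    # local sparsity k and global sparsity k', with k' < k < d.
    rng = np.random.default_rng(0)
    M, d = 4, 1000
    k = int(0.01 * d)         # local top-k, e.g. k = 1% of d
    k_prime = int(0.005 * d)  # global top-k', e.g. k' = 0.5% of d

    # Upload: each worker sends only its top-k sparsified gradient.
    local_grads = [rng.standard_normal(d) for _ in range(M)]
    sparse_uploads = [top_k(g, k) for g in local_grads]

    # Server aggregates the sparse gradients ...
    aggregated = np.mean(sparse_uploads, axis=0)

    # ... and applies a second, global top-k' sparsification before broadcasting,
    # so the downloaded update carries at most k' non-zero entries instead of d.
    global_update = top_k(aggregated, k_prime)
    print("non-zeros uploaded per worker:", np.count_nonzero(sparse_uploads[0]))
    print("non-zeros broadcast to workers:", np.count_nonzero(global_update))

Because k’ < k < d, each broadcast update carries at most k’ non-zero entries, which is the source of the O(k’MT) download cost quoted in the abstract.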

Research Area(s)

  • Adaptation models, adaptive gradients, Computational modeling, Convergence, Costs, distributed learning, Federated learning, Internet of Things, Servers, sparse communication, Training