Data Source Selection in Federated Learning : A Submodular Optimization Approach

Research output: Chapters, Conference Papers, Creative and Literary WorksRGC 32 - Refereed conference paper (with host publication)peer-review

4 Scopus Citations
View graph of relations

Author(s)

  • Ruisheng Zhang
  • Yansheng Wang
  • Ziyao Ren
  • Yongxin Tong
  • Ke Xu

Detail(s)

Original languageEnglish
Title of host publicationDatabase Systems for Advanced Applications
Subtitle of host publication27th International Conference, DASFAA 2022, Virtual Event, April 11–14, 2022, Proceedings, Part II
EditorsArnab Bhattacharya, Janice Lee, Mong Li, Divyakant Agrawal, Krishna Reddy, Mukesh Mohania, Anirban Mondal, Vikram Goyal, Rage Uday Kiran
Place of PublicationCham
PublisherSpringer 
Pages606-614
VolumePart II
ISBN (electronic)978-3-031-00126-0
ISBN (print)9783031001253
Publication statusPublished - 2022
Externally publishedYes

Publication series

NameLecture Notes in Computer Science
Volume13246
ISSN (Print)0302-9743
ISSN (electronic)1611-3349

Conference

Title27th International Conference on Database Systems for Advanced Applications (DASFAA-2022)
LocationVirtual
PlaceIndia
CityHyderabad
Period11 - 14 April 2022

Abstract

Federated learning is a new learning paradigm that jointly trains a model from multiple data sources without sharing raw data. For the practical deployment of federated learning, data source selection is compulsory due to the limited communication cost and budget in real-world applications. The necessity of data source selection is further amplified in presence of data heterogeneity among clients. Prior solutions are either low in efficiency with exponential time cost or lack theoretical guarantees. Inspired by the diminishing marginal accuracy phenomenon in federated learning, we study the problem from the perspective of submodular optimization. In this paper, we aim at efficient data source selection with theoretical guarantees. We prove that data source selection in federated learning is a monotone submodular maximization problem and propose FDSS, an efficient algorithm with a constant approximate ratio. Furthermore, we extend FDSS to FDSS-d for dynamic data source selection. Extensive experiments on CIFAR10 and CIFAR100 validate the efficiency and effectiveness of our algorithms. © 2022, The Author(s), under exclusive license to Springer Nature Switzerland AG.

Research Area(s)

  • Data source selection, Federated learning, Submodularity

Citation Format(s)

Data Source Selection in Federated Learning: A Submodular Optimization Approach. / Zhang, Ruisheng; Wang, Yansheng; Zhou, Zimu et al.
Database Systems for Advanced Applications: 27th International Conference, DASFAA 2022, Virtual Event, April 11–14, 2022, Proceedings, Part II. ed. / Arnab Bhattacharya; Janice Lee; Mong Li; Divyakant Agrawal; Krishna Reddy; Mukesh Mohania; Anirban Mondal; Vikram Goyal; Rage Uday Kiran. Vol. Part II Cham: Springer , 2022. p. 606-614 (Lecture Notes in Computer Science; Vol. 13246).

Research output: Chapters, Conference Papers, Creative and Literary WorksRGC 32 - Refereed conference paper (with host publication)peer-review