DarkDistill: Difficulty-Aligned Federated Early-Exit Network Training on Heterogeneous Devices

Lehao Qu (Co-first Author), Shuyuan Li (Co-first Author), Zimu Zhou, Boyi Liu, Yi Xu*, Yongxin Tong*

*Corresponding author for this work

Research output: Chapters, Conference Papers, Creative and Literary Works › RGC 32 - Refereed conference paper (with host publication) › peer-review

Abstract

Early-exit networks (EENs), which adapt their computational depth to each input sample, are widely adopted to accelerate inference in edge computing applications. The effectiveness of EENs relies on difficulty-aware training, which tailors shallow exits to simple samples and deep exits to complex ones. However, existing difficulty-aware training schemes assume a centralized environment with sufficient data, an assumption that breaks down on real-world edge devices. In this paper, we explore difficulty-aware training in a federated manner, where EENs are collaboratively trained on heterogeneous devices. We observe the cross-model exit unalignment phenomenon, a problem unique to aggregating local EENs into a cohesive global model. To address it, we design DarkDistill, a novel Difficulty-Aligned Reverse Knowledge Distillation scheme that preserves difficulty-specific specialization when aggregating heterogeneous local models. Instead of direct parameter averaging, it trains difficulty-conditional data generators and selectively transfers generated knowledge of a specific difficulty among matched exits of heterogeneous EENs. Evaluations show that DarkDistill outperforms the state of the art in both full-parameter and parameter-efficient fine-tuning of EENs. © 2025 Copyright held by the owner/author(s).
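The early-exit mechanism the abstract builds on can be illustrated with a minimal sketch (plain Python; the `blocks`, `heads`, and confidence threshold here are illustrative assumptions, not the paper's implementation): an input returns at the first classifier head that is confident enough, so easy samples exit shallow while hard samples fall through to deeper exits.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def early_exit_forward(x, blocks, heads, threshold=0.9):
    """Run backbone blocks in order; stop at the first exit head whose
    top-class probability reaches `threshold`. The final exit always fires,
    so every input produces a prediction. Returns (predicted_class, exit_depth)."""
    h = x
    n = len(blocks)
    for depth, (block, head) in enumerate(zip(blocks, heads), start=1):
        h = block(h)                # one backbone stage
        probs = softmax(head(h))    # this stage's exit classifier
        if max(probs) >= threshold or depth == n:
            return probs.index(max(probs)), depth
```

A sample whose first exit already yields high-margin logits leaves at depth 1; an ambiguous sample traverses all blocks and is classified by the final exit, which is exactly the per-sample depth adaptation that difficulty-aware training specializes each exit for.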
Original language: English
Title of host publication: KDD ’25
Subtitle of host publication: Proceedings of the 31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining
Publisher: Association for Computing Machinery
Pages: 2374-2385
Number of pages: 12
Volume: 2
ISBN (Print): 9798400714542
DOIs
Publication status: Published - 3 Aug 2025
Event31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD 2025) - Toronto, Canada
Duration: 3 Aug 2025 - 7 Aug 2025
https://kdd2025.kdd.org/
https://dl.acm.org/conference/kdd/proceedings

Publication series

Name: Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
ISSN (Print): 2154-817X

Conference

Conference: 31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD 2025)
Place: Canada
City: Toronto
Period: 3/08/25 - 7/08/25

Funding

This work was partially supported by the National Key Research and Development Program of China under Grant No. 2023YFF0725103, the National Natural Science Foundation of China (NSFC) under Grant Nos. 62425202, U21A20516, and 62336003, the CityU APRC grant (No. 9610633), the Beijing Natural Science Foundation (Z230001), the Fundamental Research Funds for the Central Universities (No. JK2024-03), the Didi Collaborative Research Program, and the State Key Laboratory of Complex & Critical Software Environment (SKLCCSE).

Research Keywords

  • early-exit networks
  • federated learning on heterogeneous devices
  • knowledge distillation

