Abstract
Early-exit networks (EENs), which adapt their computational depth to each input sample, are widely used to accelerate inference in edge computing applications. Their effectiveness relies on difficulty-aware training, which tailors shallow exits to simple samples and deep exits to complex ones. However, existing difficulty-aware training schemes assume a centralized environment with sufficient data, an assumption that does not hold for real-world edge devices. In this paper, we explore difficulty-aware training in a federated setting, where EENs are collaboratively trained across heterogeneous devices. We observe the cross-model exit unalignment phenomenon, a problem unique to aggregating local EENs into a cohesive global model. To address this problem, we design a novel Difficulty-Aligned Reverse Knowledge Distillation scheme, named DarkDistill, that preserves difficulty-specific specialization when aggregating heterogeneous local models. Instead of directly averaging parameters, it trains difficulty-conditional data generators and selectively transfers generated knowledge of a specific difficulty between matched exits of heterogeneous EENs. Evaluations show that DarkDistill outperforms state-of-the-art methods in both full-parameter and parameter-efficient fine-tuning of EENs. © 2025 Copyright held by the owner/author(s).
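To make the early-exit mechanism concrete, the following is a minimal, generic sketch of confidence-thresholded early-exit inference: a sample leaves the network at the first exit whose top softmax probability clears a threshold, so easy samples use a shallow exit and hard samples fall through to deeper ones. This is a standard illustration of EEN inference, not DarkDistill itself; the threshold value and the toy per-exit logits are hypothetical.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of logits.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def early_exit_predict(exit_logits, threshold=0.9):
    """Return (predicted_class, exit_index).

    exit_logits: per-exit logit vectors, ordered shallow to deep.
    Stops at the first exit whose top softmax probability
    reaches the confidence threshold.
    """
    for i, logits in enumerate(exit_logits):
        probs = softmax(logits)
        top = max(probs)
        if top >= threshold:
            return probs.index(top), i
    # No exit was confident enough: fall back to the deepest exit.
    probs = softmax(exit_logits[-1])
    return probs.index(max(probs)), len(exit_logits) - 1

# "Easy" sample: already confident at the shallow exit (index 0).
easy = [[4.0, 0.0, 0.0], [5.0, 0.0, 0.0]]
# "Hard" sample: only the deep exit (index 1) is confident.
hard = [[1.0, 0.9, 0.8], [6.0, 0.0, 0.0]]
```

Running `early_exit_predict(easy)` exits at index 0, while `early_exit_predict(hard)` defers to index 1, saving the deeper layers' computation only for samples that need it.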
| Original language | English |
|---|---|
| Title of host publication | KDD ’25 |
| Subtitle of host publication | Proceedings of the 31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining |
| Publisher | Association for Computing Machinery |
| Pages | 2374-2385 |
| Number of pages | 12 |
| Volume | 2 |
| ISBN (Print) | 9798400714542 |
| Publication status | Published - 3 Aug 2025 |
| Event | 31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD 2025) - Toronto, Canada Duration: 3 Aug 2025 → 7 Aug 2025 https://kdd2025.kdd.org/ https://dl.acm.org/conference/kdd/proceedings |
Publication series
| Name | Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining |
|---|---|
| ISSN (Print) | 2154-817X |
Conference
| Conference | 31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD 2025) |
|---|---|
| Place | Canada |
| City | Toronto |
| Period | 3/08/25 → 7/08/25 |
Funding
This work was partially supported by the National Key Research and Development Program of China under Grant No. 2023YFF0725103, the National Natural Science Foundation of China (NSFC) (Grant Nos. 62425202, U21A20516, 62336003), the CityU APRC grant (No. 9610633), the Beijing Natural Science Foundation (Z230001), the Fundamental Research Funds for the Central Universities (No. JK2024-03), the Didi Collaborative Research Program, and the State Key Laboratory of Complex & Critical Software Environment (SKLCCSE).
Research Keywords
- early-exit networks
- federated learning on heterogeneous devices
- knowledge distillation