Abstract
Current knowledge distillation methods for semantic segmentation are designed primarily for knowledge transfer within homogeneous networks and are less effective for heterogeneous ones. Features produced by heterogeneous networks pose several challenges, including mismatched feature scales and differing capacities to represent local and global contextual information. To address these issues, we propose a novel Heterogeneous Model Knowledge Distillation (HMKD) framework that uses a dual alignment method to improve distillation between CNN-based and Transformer-based semantic segmentation models. Specifically, we introduce the Patch-based Self-attention Alignment Module (PSAM), which computes and aligns patch-level self-attention across distinct feature map spaces, enabling the transfer of local and global contextual information between heterogeneous models. In addition, the Heterogeneous Scale Alignment Module (HSAM) ensures consistency across heterogeneous feature scales and enriches the semantic content. We conduct extensive experiments on two benchmark datasets (Cityscapes and CamVid) to validate the effectiveness and superiority of our approach over several recent state-of-the-art (SOTA) methods. Our code is available at https://github.com/xumingzhu989/HMKD-ICMR. © 2025 ACM.
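To make the patch-level alignment idea concrete, here is a minimal NumPy sketch of distilling patch-wise self-attention maps between a teacher and a student feature map. This is an illustrative reconstruction, not the authors' implementation: the function names (`patch_self_attention`, `psam_loss`) and the MSE matching objective are assumptions; the paper's PSAM may use a different attention formulation and loss.

```python
import numpy as np

def patch_self_attention(feat, patch=2):
    """Split a (C, H, W) feature map into patch x patch non-overlapping
    patches and compute a scaled dot-product self-attention map over the
    spatial tokens inside each patch. Returns (patch*patch, N, N)."""
    C, H, W = feat.shape
    ph, pw = H // patch, W // patch
    maps = []
    for i in range(patch):
        for j in range(patch):
            # Tokens of one patch: (N, C) with N = ph * pw spatial positions.
            tokens = feat[:, i * ph:(i + 1) * ph, j * pw:(j + 1) * pw]
            tokens = tokens.reshape(C, -1).T
            logits = tokens @ tokens.T / np.sqrt(C)
            # Numerically stable row-wise softmax.
            logits -= logits.max(axis=1, keepdims=True)
            attn = np.exp(logits)
            attn /= attn.sum(axis=1, keepdims=True)
            maps.append(attn)
    return np.stack(maps)

def psam_loss(student_feat, teacher_feat, patch=2):
    """MSE between student and teacher patch-level attention maps."""
    a_s = patch_self_attention(student_feat, patch)
    a_t = patch_self_attention(teacher_feat, patch)
    return float(np.mean((a_s - a_t) ** 2))
```

One property worth noting: each attention map is N x N over spatial tokens, so its shape is independent of the channel dimension. A CNN student and a Transformer teacher with different channel counts can therefore be compared at the same spatial resolution without a projection layer, which is one plausible reason attention-structure matching suits heterogeneous pairs.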
| Original language | English |
|---|---|
| Title of host publication | ICMR '25: Proceedings of the 2025 International Conference on Multimedia Retrieval |
| Publisher | Association for Computing Machinery |
| Pages | 1635-1643 |
| ISBN (Print) | 9798400718779 |
| DOIs | |
| Publication status | Published - 30 Jun 2025 |
| Event | 2025 International Conference on Multimedia Retrieval (ICMR 2025) - Chicago, United States. Duration: 30 Jun 2025 → 3 Jul 2025 |
Publication series
| Name | ICMR - Proceedings of the International Conference on Multimedia Retrieval |
|---|---|
Conference
| Conference | 2025 International Conference on Multimedia Retrieval (ICMR 2025) |
|---|---|
| Place | United States |
| City | Chicago |
| Period | 30/06/25 → 3/07/25 |
Funding
This work was supported by the National Natural Science Foundation of China (No. 62206157, No. 62376137), the Natural Science Foundation of Shandong Province (No. ZR2022QF047, No. ZR2022YQ59), and the Key R&D Program of Shandong Province, China (Major Scientific and Technological Innovation Projects) (No. 2022CXGC020107).
Research Keywords
- feature alignment
- heterogeneous network
- knowledge distillation
- semantic segmentation