Heterogeneous Model Knowledge Distillation via Dual Alignment for Semantic Segmentation

Mingzhu Xu, Jing Wang, Mingcai Wang, Yiping Li, Yupeng Hu, Xuemeng Song*, Weili Guan

*Corresponding author for this work

Research output: Chapters, Conference Papers, Creative and Literary Works › RGC 32 - Refereed conference paper (with host publication) › peer-review

Abstract

Current knowledge distillation methods for semantic segmentation are primarily designed for knowledge transfer within homogeneous networks, and are less effective for heterogeneous networks. The feature information output by heterogeneous networks faces several challenges, including differences in feature scales and varying capabilities to represent local and global contextual information. To address these issues, we propose a novel Heterogeneous Model Knowledge Distillation (HMKD) framework using a dual alignment method to improve distillation performance between CNN-based and Transformer-based semantic segmentation models. Specifically, we introduce the Patch-based Self-attention Alignment Module (PSAM), which computes and aligns patch-level self-attention across distinct feature map spaces, enabling the transfer of local or global contextual information between heterogeneous models. Additionally, the Heterogeneous Scale Alignment Module (HSAM) is designed to ensure consistency across heterogeneous feature scales and enrich the semantic content. We have also conducted extensive experiments on two benchmark datasets (Cityscapes and CamVid) to validate the effectiveness and superiority of our approach compared with several recent state-of-the-art (SOTA) methods. Our code is deposited at https://github.com/xumingzhu989/HMKD-ICMR. © 2025 ACM.
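The idea behind aligning patch-level self-attention can be illustrated with a minimal sketch. This is not the authors' PSAM implementation (see their repository for that); it only shows, under stated assumptions, why an attention map over patches is comparable across heterogeneous features: the map is N x N for N patches, regardless of each model's channel count or spatial resolution. The function names, the average-pooled patch descriptors, the scaled dot-product attention, and the mean-squared-error loss are all assumptions chosen for illustration.

```python
import math

def patch_descriptors(fmap, grid):
    """Average-pool a C x H x W feature map (nested lists) into grid*grid patch vectors.

    Assumption: H and W are divisible by `grid`.
    """
    channels, height, width = len(fmap), len(fmap[0]), len(fmap[0][0])
    ph, pw = height // grid, width // grid  # patch height and width
    patches = []
    for gy in range(grid):
        for gx in range(grid):
            vec = []
            for c in range(channels):
                total = sum(
                    fmap[c][y][x]
                    for y in range(gy * ph, (gy + 1) * ph)
                    for x in range(gx * pw, (gx + 1) * pw)
                )
                vec.append(total / (ph * pw))
            patches.append(vec)
    return patches

def self_attention(patches):
    """Row-wise softmax of scaled dot products between patch vectors (N x N)."""
    dim = len(patches[0])
    attn = []
    for query in patches:
        scores = [
            sum(a * b for a, b in zip(query, key)) / math.sqrt(dim)
            for key in patches
        ]
        peak = max(scores)  # subtract the max for numerical stability
        exps = [math.exp(s - peak) for s in scores]
        norm = sum(exps)
        attn.append([e / norm for e in exps])
    return attn

def attention_alignment_loss(student_fmap, teacher_fmap, grid=2):
    """Mean squared difference between student and teacher patch-attention maps.

    Hypothetical loss for illustration; the paper's actual objective may differ.
    """
    s = self_attention(patch_descriptors(student_fmap, grid))
    t = self_attention(patch_descriptors(teacher_fmap, grid))
    n = len(s)
    return sum((s[i][j] - t[i][j]) ** 2 for i in range(n) for j in range(n)) / (n * n)

# Toy features with different channel counts and resolutions (made-up values).
student = [[[float((c + 1) * (y + x)) for x in range(4)] for y in range(4)] for c in range(2)]
teacher = [[[float((c + 1) * (y * x)) for x in range(8)] for y in range(8)] for c in range(3)]
loss = attention_alignment_loss(student, teacher, grid=2)
```

Here a 2-channel 4 x 4 map and a 3-channel 8 x 8 map both reduce to a 4 x 4 attention map when `grid=2`, so the two can be compared directly without first projecting one feature space onto the other.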
Original language: English
Title of host publication: ICMR '25: Proceedings of the 2025 International Conference on Multimedia Retrieval
Publisher: Association for Computing Machinery
Pages: 1635-1643
ISBN (Print): 9798400718779
Publication status: Published - 30 Jun 2025
Event: 2025 International Conference on Multimedia Retrieval (ICMR 2025) - Chicago, United States
Duration: 30 Jun 2025 to 3 Jul 2025

Publication series

Name: ICMR - Proceedings of the International Conference on Multimedia Retrieval

Conference

Conference: 2025 International Conference on Multimedia Retrieval (ICMR 2025)
Place: United States
City: Chicago
Period: 30/06/25 to 3/07/25

Funding

This work has been supported by the National Natural Science Foundation of China (No. 62206157, No. 62376137), the Natural Science Foundation of Shandong Province (No. ZR2022QF047, No. ZR2022YQ59), and the Key R&D Program of Shandong Province, China (Major Scientific and Technological Innovation Projects) (No. 2022CXGC020107).

Research Keywords

  • feature alignment
  • heterogeneous network
  • knowledge distillation
  • semantic segmentation
