Fast Multimodal Edge Inference via Selective Feature Distillation

Jinyu Chen, Wenchao Xu*, Yunfeng Fan, Haozhao Wang*, Quan Chen, Jing Li

*Corresponding author for this work

Research output: Journal Publications and Reviews › RGC 21 - Publication in refereed journal › peer-review

Abstract

Inferring user status at the edge is essential for delivering personalized services, such as detecting emotional states. However, deploying large-scale models directly on user devices is impractical due to substantial computational overhead and the scarcity of labeled data. Conversely, uploading raw data to the cloud for processing raises significant privacy concerns and incurs prohibitive communication costs. To address this challenge, we propose a privacy-preserving multimodal inference framework that leverages large-scale public data while safeguarding sensitive information and optimizing computational efficiency. Specifically, we first train a teacher model in the cloud using publicly available data. Through a feature distillation process, the knowledge from this teacher model is transferred to a lightweight encoder deployed at the user end. This transfer is tailored to the user's data, ensuring that only relevant knowledge is distilled. To accommodate varying communication constraints, we introduce a feature compression mechanism that significantly reduces communication overhead without compromising inference accuracy. Extensive experiments on emotion recognition tasks demonstrate that the proposed framework effectively balances privacy preservation, resource efficiency, and inference accuracy, facilitating seamless collaboration between cloud and edge devices. © 2025 IEEE.
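The abstract describes two core mechanisms: distilling teacher features into a lightweight student encoder, and compressing features to cut communication cost. As a rough illustration only (the paper's actual architecture, projection head, and compression scheme are not specified here), the interplay can be sketched with stand-in NumPy arrays; all dimensions, names, and the top-k compression rule below are illustrative assumptions, not the authors' method:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions: the cloud teacher yields richer features
# than the lightweight on-device student encoder.
teacher_dim, student_dim, batch = 128, 32, 4

# Stand-in features; in the framework these would come from the
# teacher (cloud) and student (edge) encoders on the user's data.
t_feat = rng.standard_normal((batch, teacher_dim))
s_feat = rng.standard_normal((batch, student_dim))

# Linear projection aligning student features with the teacher's
# space -- a common choice in feature distillation, assumed here.
W = rng.standard_normal((student_dim, teacher_dim)) / np.sqrt(student_dim)

def distill_loss(student, teacher, proj):
    """Mean-squared error between projected student and teacher features."""
    return float(np.mean((student @ proj - teacher) ** 2))

def compress_topk(feat, k):
    """Keep the k largest-magnitude entries per sample, zeroing the rest --
    a simple stand-in for the paper's feature compression mechanism."""
    out = np.zeros_like(feat)
    idx = np.argsort(-np.abs(feat), axis=1)[:, :k]
    rows = np.arange(feat.shape[0])[:, None]
    out[rows, idx] = feat[rows, idx]
    return out

loss = distill_loss(s_feat, t_feat, W)          # drives student training
compressed = compress_topk(t_feat, k=16)        # fewer values to transmit
```

The distillation loss would be minimized over the student's parameters (and the projection) during training, while the compression step trades feature fidelity against communication overhead at inference time.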
Original language: English
Pages (from-to): 11337-11350
Journal: IEEE Transactions on Mobile Computing
Volume: 24
Issue number: 11
Online published: 23 Jun 2025
DOIs
Publication status: Published - Nov 2025

Funding

This work was supported in part by the National Natural Science Foundation of China under Grant 62302184 and Grant 62372118, in part by the Guangdong Basic and Applied Basic Research Foundation under Grant 2024A1515030136, in part by the Post-Doc Fellowship Award from the Hong Kong RGC under Grant CityU PDFS2425-1S02, and in part by the Research Grants Council of the Hong Kong Special Administrative Region, China under Grant PolyU15222621 and Grant PolyU15225023.

Research Keywords

  • Data models
  • Feature extraction
  • Training
  • Semantic communication
  • Privacy
  • Image edge detection
  • Computational modeling
  • Accuracy
  • Synchronization
  • Mobile computing
  • Cloud-edge collaborative inference
  • multimodal inference
  • knowledge distillation

RGC Funding Information

  • RGC-funded
