MT4MTL-KD: A Multi-Teacher Knowledge Distillation Framework for Triplet Recognition

Shuangchun Gui, Zhenkun Wang*, Jixiang Chen, Xun Zhou, Chen Zhang, Yi Cao

*Corresponding author for this work

Research output: Journal Publications and Reviews › RGC 21 - Publication in refereed journal › peer-review

Abstract

The recognition of surgical triplets plays a critical role in the practical application of surgical videos. It involves the sub-tasks of recognizing instruments, verbs, and targets, while establishing precise associations between them. Existing methods face two significant challenges in triplet recognition: 1) the imbalanced class distribution of surgical triplets may lead to spurious task association learning, and 2) the feature extractors cannot reconcile local and global context modeling. To overcome these challenges, this paper presents a novel multi-teacher knowledge distillation framework for multi-task triplet learning, known as MT4MTL-KD. MT4MTL-KD leverages teacher models trained on less imbalanced sub-tasks to assist multi-task student learning for triplet recognition. Moreover, we adopt different categories of backbones for the teacher and student models, facilitating the integration of local and global context modeling. To further align the semantic knowledge between the triplet task and its sub-tasks, we propose a novel feature attention module (FAM). This module utilizes attention mechanisms to assign multi-task features to specific sub-tasks. We evaluate the performance of MT4MTL-KD on both the 5-fold cross-validation and the CholecTriplet challenge splits of the CholecT45 dataset. The experimental results consistently demonstrate the superiority of our framework over state-of-the-art methods, achieving significant improvements of up to 6.4% on the cross-validation split. © 2023 IEEE.
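The core idea of the abstract, distilling knowledge from sub-task teachers (instrument, verb, target) into a single multi-task student, can be sketched as below. This is a minimal illustration only: the function names, the temperature-softened KL objective, and the equal averaging across sub-tasks are assumptions for exposition, not the paper's exact MT4MTL-KD formulation.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax (numerically stabilized)."""
    z = np.exp((logits - logits.max()) / temperature)
    return z / z.sum()

def kd_loss(student_logits, teacher_logits, temperature=2.0):
    """KL divergence between softened teacher and student distributions."""
    p_t = softmax(teacher_logits, temperature)
    p_s = softmax(student_logits, temperature)
    return float(np.sum(p_t * (np.log(p_t) - np.log(p_s))))

def multi_teacher_kd_loss(student_logits_by_task, teacher_logits_by_task,
                          temperature=2.0):
    """Average per-sub-task distillation losses from multiple teachers.

    Each teacher is trained on one (less imbalanced) sub-task, e.g.
    'instrument', 'verb', or 'target'; the student predicts all of them.
    """
    losses = [
        kd_loss(student_logits_by_task[task],
                teacher_logits_by_task[task],
                temperature)
        for task in teacher_logits_by_task
    ]
    return sum(losses) / len(losses)
```

In practice this distillation term would be combined with the supervised triplet-recognition loss; the averaging here is the simplest possible aggregation of the teachers' signals.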
Original language: English
Pages (from-to): 1628-1639
Journal: IEEE Transactions on Medical Imaging
Volume: 43
Issue number: 4
Online published: 21 Dec 2023
Publication status: Published - Apr 2024

Research Keywords

  • knowledge distillation
  • multi-label image classification
  • Surgical activity recognition
