MT4MTL-KD: A Multi-Teacher Knowledge Distillation Framework for Triplet Recognition
Research output: Journal Publications and Reviews › RGC 21 - Publication in refereed journal › peer-review
Author(s)
Shuangchun Gui, Zhenkun Wang, Jixiang Chen et al.
Detail(s)
| Original language | English |
| --- | --- |
| Pages (from-to) | 1628-1639 |
| Journal / Publication | IEEE Transactions on Medical Imaging |
| Volume | 43 |
| Issue number | 4 |
| Online published | 21 Dec 2023 |
| Publication status | Published - Apr 2024 |
Abstract
The recognition of surgical triplets plays a critical role in the practical analysis of surgical videos. It involves the sub-tasks of recognizing instruments, verbs, and targets, while establishing precise associations between them. Existing methods face two significant challenges in triplet recognition: 1) the imbalanced class distribution of surgical triplets may lead to spurious task association learning, and 2) the feature extractors cannot reconcile local and global context modeling. To overcome these challenges, this paper presents MT4MTL-KD, a novel multi-teacher knowledge distillation framework for multi-task triplet learning. MT4MTL-KD leverages teacher models trained on the less imbalanced sub-tasks to assist multi-task student learning for triplet recognition. Moreover, we adopt different categories of backbones for the teacher and student models, facilitating the integration of local and global context modeling. To further align the semantic knowledge between the triplet task and its sub-tasks, we propose a novel feature attention module (FAM), which uses attention mechanisms to assign multi-task features to specific sub-tasks. We evaluate MT4MTL-KD on both the 5-fold cross-validation and the CholecTriplet challenge splits of the CholecT45 dataset. The experimental results consistently demonstrate the superiority of our framework over state-of-the-art methods, with improvements of up to 6.4% on the cross-validation split. © 2023 IEEE.
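To make the framework concrete, the sketch below shows how per-sub-task teachers can supervise a multi-task student whose shared features are routed to each sub-task through an attention gate. This is a minimal illustration under stated assumptions, not the authors' implementation: the sigmoid-gated attention, the linear stand-in backbone, the soft-target BCE distillation loss `kd_loss`, the temperature, and the 0.5 loss weight are all hypothetical choices; only the three sub-tasks and their CholecT45 class counts (6 instruments, 10 verbs, 15 targets) come from the paper's setting.

```python
# Minimal PyTorch sketch of multi-teacher distillation with a
# feature attention module (FAM). Module shapes, the attention form,
# the KD loss, and all weights are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


class FeatureAttentionModule(nn.Module):
    """Gates shared multi-task features toward one specific sub-task."""

    def __init__(self, dim: int):
        super().__init__()
        self.attn = nn.Sequential(nn.Linear(dim, dim), nn.Sigmoid())

    def forward(self, feat: torch.Tensor) -> torch.Tensor:  # feat: (B, D)
        return feat * self.attn(feat)  # sub-task-specific features


class Student(nn.Module):
    """Shared backbone plus one FAM and one classifier head per sub-task
    (instrument, verb, target); class counts follow CholecT45."""

    def __init__(self, in_dim: int = 512, dim: int = 256, n_cls=(6, 10, 15)):
        super().__init__()
        self.backbone = nn.Linear(in_dim, dim)  # stand-in for a CNN/transformer
        self.fams = nn.ModuleList([FeatureAttentionModule(dim) for _ in n_cls])
        self.heads = nn.ModuleList([nn.Linear(dim, c) for c in n_cls])

    def forward(self, x: torch.Tensor):
        feat = self.backbone(x)
        return [head(fam(feat)) for fam, head in zip(self.fams, self.heads)]


def kd_loss(student_logit, teacher_logit, T: float = 2.0):
    """Soft-target BCE: distills the teacher's per-class sigmoid scores,
    matching the multi-label nature of triplet recognition."""
    soft_target = torch.sigmoid(teacher_logit / T).detach()
    return F.binary_cross_entropy_with_logits(student_logit / T, soft_target)


if __name__ == "__main__":
    student = Student()
    x = torch.randn(4, 512)  # a batch of pre-extracted frame features
    sub_logits = student(x)  # [instrument, verb, target] logits

    # Stand-ins for the frozen per-sub-task teachers and the ground truth.
    teacher_logits = [torch.randn_like(s) for s in sub_logits]
    labels = [torch.randint(0, 2, s.shape).float() for s in sub_logits]

    # Supervised BCE plus distillation from each sub-task teacher.
    loss = sum(
        F.binary_cross_entropy_with_logits(s, y) + 0.5 * kd_loss(s, t)
        for s, y, t in zip(sub_logits, labels, teacher_logits)
    )
    loss.backward()
    print(f"total loss: {loss.item():.4f}")
```

In the paper's design, the teachers are trained beforehand on the less imbalanced sub-tasks and use a different backbone category from the student, so their soft targets carry the complementary local or global context the student's backbone alone would miss.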
Research Area(s)
- knowledge distillation, multi-label image classification, surgical activity recognition
Citation Format(s)
MT4MTL-KD: A Multi-Teacher Knowledge Distillation Framework for Triplet Recognition. / Gui, Shuangchun; Wang, Zhenkun; Chen, Jixiang et al.
In: IEEE Transactions on Medical Imaging, Vol. 43, No. 4, 04.2024, p. 1628-1639.