Towards Annotation-efficient Deep Learning for Automated Medical Image Analysis


Student thesis: Doctoral Thesis



Award date: 24 Aug 2022


Medical image analysis, such as medical image classification and segmentation, can provide quantitative assessments for various healthcare applications, including disease diagnosis, treatment planning, surgical navigation, and prognosis monitoring. With the global expansion of medical imaging and advances in imaging technology, the amount of acquired medical image data is growing at a pace far faster than human experts can interpret it, necessitating automated medical image analysis algorithms to assist clinicians in accurate and real-time image-based analysis.

Although deep learning has made considerable progress in automated medical image analysis, the generalizability of deep learning-based models in clinical deployment is limited by their heavy reliance on large amounts of high-quality annotated training data. In practice, since annotating medical images is extremely expensive and labor-intensive, the available training dataset may contain only scarce labels, consist largely of unlabeled data, exhibit evident domain shifts, or be annotated with label noise. These imperfections in training data underscore the growing demand for annotation-efficient deep learning algorithms. This thesis pursues annotation-efficient deep learning solutions that enable generalizable and robust learning from imperfectly annotated training data for automated medical image analysis.

In the first part, we propose a confidence-guided manifold mixup (CGMMix) data augmentation method to enrich limited labeled data with confidence guidance at both the image and feature levels. The design of the confidence guidance integrates clinical considerations to guarantee that the newly generated image-label pairs are clinically meaningful. Consequently, the proposed CGMMix method augments the limited labeled data and promotes the generalizability of the optimized deep models on scarcely annotated datasets.
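To make the mixup idea concrete, the following is a minimal sketch of plain manifold-mixup-style augmentation: a convex combination of two image-label pairs. It is illustrative only; the thesis's CGMMix additionally weights the combination with model-confidence guidance at both image and feature levels, which is not reproduced here.

```python
import numpy as np

def mixup(x1, y1, x2, y2, alpha=0.4, rng=None):
    """Generic mixup-style augmentation (illustrative sketch, not CGMMix):
    draw a mixing coefficient from a Beta distribution and form convex
    combinations of both the inputs and their (one-hot) labels."""
    if rng is None:
        rng = np.random.default_rng()
    lam = rng.beta(alpha, alpha)          # mixing coefficient in (0, 1)
    x = lam * x1 + (1.0 - lam) * x2       # mixed image (or feature map)
    y = lam * y1 + (1.0 - lam) * y2       # mixed soft label
    return x, y
```

Because the same coefficient mixes image and label, the mixed label remains a valid probability vector whenever the inputs' labels are.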

In the second part, considering that unlabeled medical images are abundant and easily obtained in clinical practice, we are the first to investigate semi-supervised learning for the diagnosis of Wireless Capsule Endoscopy (WCE) images and propose a synergic network structure with two branches. By encouraging consistent outputs from the two branches for the same input, the proposed method fully exploits both the limited labeled data and the extensive unlabeled data.
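A common way to realize this consistency idea is sketched below: cross-entropy on the labeled subset plus a mean-squared-error consistency term between the two branches' predictions on all samples. This is an assumed, generic form of a consistency objective, not the thesis's exact synergic loss.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def semi_supervised_loss(logits_a, logits_b, labels, labeled_mask, w=1.0):
    """Generic consistency-based semi-supervised loss (assumed form):
    supervised cross-entropy on labeled samples plus an MSE term that
    encourages the two branches to agree on every input."""
    p_a, p_b = softmax(logits_a), softmax(logits_b)
    # supervised cross-entropy on the labeled subset (branch A)
    ce = -np.log(p_a[labeled_mask, labels[labeled_mask]] + 1e-12).mean()
    # unsupervised consistency over all samples, labeled and unlabeled
    consistency = ((p_a - p_b) ** 2).sum(axis=-1).mean()
    return ce + w * consistency
```

The consistency term requires no labels, which is how the extensive unlabeled data contributes to training.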

In the third part, taking advantage of data augmentation and semi-supervised learning, we further propose a new strategy, coined Labeled-to-unlabeled Distribution Translation, to translate the labeled data distribution under the guidance of extensive unlabeled data. The translated labeled data distribution, together with the unlabeled data distribution, can approximate the desired data distribution, which alleviates the distribution bias caused by scarce labeled data and contributes to an accurate decision boundary.
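One simple instance of translating a labeled distribution toward statistics estimated from unlabeled data is moment matching, sketched below. This is an assumed illustration of the general idea only, not the thesis's Labeled-to-unlabeled Distribution Translation formulation.

```python
import numpy as np

def translate_labeled_features(feat_labeled, feat_unlabeled, eps=1e-6):
    """Illustrative distribution translation via mean/std matching
    (assumed form): standardize labeled features, then rescale and shift
    them to the per-dimension statistics of the abundant unlabeled data."""
    mu_l, sd_l = feat_labeled.mean(0), feat_labeled.std(0) + eps
    mu_u, sd_u = feat_unlabeled.mean(0), feat_unlabeled.std(0) + eps
    return (feat_labeled - mu_l) / sd_l * sd_u + mu_u
```

After translation, the labeled features share the unlabeled data's first- and second-order statistics, reducing the bias a scarce labeled sample induces.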

In the fourth part, we transfer the knowledge learned from virtually synthesized labeled data, public labeled data, or labeled data from other medical centers (source-domain data) to our unlabeled target-domain data. A domain-aware meta-learning strategy is proposed to guide the optimization on target-domain data with meta-knowledge, namely the underlying label distribution of clean data learned from the source domain.
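As a loose illustration of guiding target predictions with a source-learned label distribution, the sketch below rescales target pseudo-label probabilities by a class prior estimated on clean source data (a distribution-alignment-style step). This is a hypothetical simplification, not the thesis's domain-aware meta-learning procedure.

```python
import numpy as np

def align_with_source_prior(target_probs, source_prior, eps=1e-12):
    """Illustrative use of meta-knowledge (assumed form): reweight each
    target sample's class probabilities by the ratio of the source-domain
    class prior to the current average target prediction, then renormalize."""
    guided = target_probs * source_prior / (target_probs.mean(0) + eps)
    return guided / guided.sum(axis=-1, keepdims=True)
```

The reweighting nudges the aggregate target predictions toward the label distribution observed on clean source data.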

In the fifth part, considering the inefficient transmission and privacy concerns of source-domain data, we instead use a black-box source model to generate pseudo labels for the unlabeled target-domain data. Due to domain discrepancy and label shift, the pseudo-labeled target data contain a mixture of closed-set and open-set label noise; we therefore propose a simplex noise transition matrix to model the mixed noisy-label distributions and purify the transferred knowledge (i.e., the pseudo labels).
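The core mechanism of a noise transition matrix can be sketched in its standard closed-set form: the noisy-label posterior is the clean posterior multiplied by a row-stochastic matrix T, with T[i, j] = P(noisy label j | clean label i). The thesis's simplex variant extends this to also cover open-set noise, which is not shown here.

```python
import numpy as np

def apply_transition(clean_probs, T):
    """Standard noise-transition model (generic sketch): map a batch of
    clean class posteriors (rows summing to 1) to the corresponding
    noisy-label posteriors via a row-stochastic transition matrix T."""
    return clean_probs @ T
```

Training then fits the model so that its transitioned output matches the observed noisy labels, while the pre-transition output estimates the clean posterior.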

In the last part, under a limited annotation budget, the resulting training data inevitably contain label noise, which prevents models from learning precise semantic correlations. To combat label noise in medical image segmentation, we propose a robust Joint Class-Affinity Segmentation (JCAS) framework, based on the observation that capturing pair-wise affinity relations between pixels can greatly reduce the label noise rate. By unifying pixel-wise and pair-wise supervision, JCAS corrects the supervision signals derived from noisy class and affinity labels to enhance noise resistance.
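The pair-wise affinity label referred to above can be derived directly from a per-pixel label map: two pixels have affinity 1 if they share a class and 0 otherwise. The sketch below shows this construction only; the JCAS correction of noisy class and affinity supervision is not reproduced.

```python
import numpy as np

def affinity_map(labels):
    """Build a pair-wise affinity matrix from a per-pixel label map
    (illustrative sketch): entry (p, q) is 1.0 if pixels p and q carry
    the same class label, else 0.0."""
    flat = labels.reshape(-1)
    return (flat[:, None] == flat[None, :]).astype(np.float32)
```

Intuitively, a single mislabeled pixel corrupts all of its class supervision but only the affinity entries involving that pixel that actually flip agreement, which is the observation motivating the lower pair-wise noise rate.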