Towards Intelligent Medical Image Diagnosis: Exploration on Imperfect Data

邁向智能醫學圖像診斷: 不完美數據的探索

Student thesis: Doctoral Thesis

Award date: 18 Feb 2022


Recent years have witnessed the superior performance of deep learning techniques in medical image diagnosis. Tailored deep learning methods learn diagnostic knowledge effectively from annotated medical datasets and have been applied in clinical practice to alleviate the workload of physicians. However, this paradigm relies heavily on collecting large quantities of high-quality medical data with exhaustive manual annotations. In real-world applications, imperfect data caused by various factors can significantly degrade the performance of diagnostic algorithms. In this thesis, we focus on three challenges of imperfect data: resolution degradation, weak supervision and data decentralization. Specifically: (1) Access to expensive high-end imaging equipment is limited in remote and impoverished areas, where medical images generally have inferior spatial resolution. The resolution degradation of medical images can cause a loss of diagnostic clues and interfere with diagnosis by both experts and algorithms. (2) Detailed manual annotation is impractical for specific types of medical data, e.g., histopathologic whole-slide images (WSIs) with gigapixels and computed tomography (CT) scans with hundreds of slices. Most existing diagnosis methods are designed for full supervision and are inapplicable in such weakly-supervised scenarios, where only coarse labels are available. (3) Privacy and ethical concerns are increasingly critical, making it difficult to collect large quantities of data from multiple institutions. Compared with the traditional centralized paradigm, developing future intelligent healthcare systems in a decentralized manner is promising.

In this thesis, we present a series of deep learning techniques that handle these challenges from different aspects. First, to tackle the over-smoothing of existing super-resolution (SR) methods, we explicitly reconstruct high-frequency details in the wavelet domain. Specifically, we propose a Spatial-Wavelet Dual-stream Network (SWD-Net) integrated with Refined Context Fusion (RCF). To fully exploit the capability of networks in the wavelet domain, we design Wavelet Features Adaptation (WFA) to adjust the wavelet coefficients into an appropriate range, and Wavelet-Aware Convolutional (WAC) blocks to efficiently extract contextual information in the wavelet domain. Supervised by a spatial loss and a wavelet loss, SWD-Net can recover high-resolution images with clear structural boundaries.
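
The idea of supervising a reconstruction in both the spatial and the wavelet domain can be illustrated with a minimal sketch. This is not the SWD-Net implementation: the single-level Haar transform, the L1 distance, the `wavelet_weight` parameter and the function names `haar_dwt2` and `spatial_wavelet_loss` are all assumptions chosen for illustration.

```python
import numpy as np


def haar_dwt2(img):
    """Single-level orthonormal 2D Haar transform.

    Returns four half-resolution sub-bands: one approximation band
    followed by three detail (high-frequency) bands.
    """
    img = np.asarray(img, dtype=float)
    if img.shape[0] % 2 or img.shape[1] % 2:
        raise ValueError("height and width must be even")
    a = img[0::2, 0::2]  # top-left pixel of each 2x2 block
    b = img[0::2, 1::2]  # top-right
    c = img[1::2, 0::2]  # bottom-left
    d = img[1::2, 1::2]  # bottom-right
    approx = (a + b + c + d) / 2.0
    detail_cols = (a - b + c - d) / 2.0  # differences across columns
    detail_rows = (a + b - c - d) / 2.0  # differences across rows
    detail_diag = (a - b - c + d) / 2.0  # diagonal differences
    return approx, detail_cols, detail_rows, detail_diag


def spatial_wavelet_loss(pred, target, wavelet_weight=1.0):
    """L1 loss in pixel space plus mean L1 loss over Haar sub-bands.

    Both the L1 distance and the weighting are illustrative assumptions.
    """
    pred = np.asarray(pred, dtype=float)
    target = np.asarray(target, dtype=float)
    spatial = np.mean(np.abs(pred - target))
    band_losses = [
        np.mean(np.abs(p - t))
        for p, t in zip(haar_dwt2(pred), haar_dwt2(target))
    ]
    return spatial + wavelet_weight * np.mean(band_losses)


# An over-smoothed prediction (flat grey) of a checkerboard target:
target = np.tile(np.array([[0.0, 1.0], [1.0, 0.0]]), (2, 2))
blurred = np.full_like(target, 0.5)
loss = spatial_wavelet_loss(blurred, target)
```

In this toy case the flat prediction matches the target's approximation band exactly, so the whole wavelet penalty comes from the missing diagonal detail band, which is the kind of high-frequency error a wavelet-domain term is meant to expose.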

Second, to improve diagnosis performance using high-frequency details inferred from low-resolution (LR) images, we present an SR-enhanced diagnosis framework consisting of an efficient SR network and a diagnosis network. Instead of learning from scratch, we propose a recursion distillation scheme that optimizes the efficient SR network with the knowledge of temporal context. The diagnosis network jointly utilizes the reliable original images and the more informative SR images through two branches, with the proposed Sample Affinity Interaction (SAI) blocks at different stages to effectively extract and integrate discriminative features for diagnosis. Moreover, two constraints, sample affinity consistency and sample affinity regularization, are devised to refine the features and achieve mutual promotion of the two branches.

Third, we develop diagnosis frameworks under the weak supervision of coarse labels for WSIs and 3D medical data. For extremely large WSIs, we propose a Pathologist-Tree Network (PTree-Net) that models the WSI sparsely and efficiently in a multi-scale manner, following the fact that pathologists jointly analyze visual fields at multiple objective powers to reach diagnostic predictions. For the diagnosis of 3D medical data, we propose the Importance-aware multi-Instance Graph Convolutional Network (I2GCN), based on the observation that individual 2D slices of 3D medical data carry explicit diagnostic value. With the diagnostic importance of each instance calculated by a preliminary classifier, the I2GCN can extract discriminative features in the topology space of both instance importance and semantic features.

Lastly, Federated Learning (FL) provides a decentralized solution for training models collaboratively without exchanging private data. However, applying FL in real-world medical scenarios encounters two challenges: retrogress and class imbalance. To tackle these two problems, we propose a personalized retrogress-resilient FL framework. For the retrogress problem, we devise Progressive Fourier Aggregation (PFA) at the server to integrate the global knowledge of client models in the frequency domain. Then, with a deputy model introduced to receive the aggregated server model, we design a Deputy-Enhanced Transfer (DET) strategy at the client to smoothly improve the personalized local model with the global knowledge. For the class imbalance problem, we propose the Conjoint Prototype-Aligned (CPA) loss to facilitate balanced optimization of the FL framework. Because private local data are inaccessible in FL, the CPA loss calculates a global conjoint objective based on the global imbalance, and then adjusts client-side local training through prototype-aligned refinement to close the imbalance gap with respect to this balanced goal.
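
As a rough illustration of what aggregating client models in the frequency domain could look like, the sketch below shares only the low-frequency part of each client's 2D weight matrix and keeps the high-frequency part local. The abstract does not specify how PFA works, so everything here is an assumption: the low-frequency window, the `ratio` parameter (which a "progressive" schedule might grow over communication rounds), and the function names `low_freq_mask` and `fourier_aggregate`.

```python
import numpy as np


def low_freq_mask(shape, ratio):
    """Boolean mask of a centred low-frequency window (fftshift layout).

    ratio=0 keeps only the zero-frequency (mean) term; ratio=1 keeps all.
    """
    h, w = shape
    half_h = int(np.ceil(h * ratio / 2))
    half_w = int(np.ceil(w * ratio / 2))
    ch, cw = h // 2, w // 2  # position of zero frequency after fftshift
    mask = np.zeros(shape, dtype=bool)
    mask[max(ch - half_h, 0):ch + half_h + 1,
         max(cw - half_w, 0):cw + half_w + 1] = True
    return mask


def fourier_aggregate(client_weights, ratio):
    """Average low frequencies across clients; keep each client's high ones.

    client_weights: list of equally shaped 2D real arrays.
    Returns one merged array per client.
    """
    spectra = [np.fft.fftshift(np.fft.fft2(w)) for w in client_weights]
    mean_spectrum = np.mean(spectra, axis=0)
    mask = low_freq_mask(client_weights[0].shape, ratio)
    merged = []
    for spectrum in spectra:
        mixed = np.where(mask, mean_spectrum, spectrum)
        merged.append(np.fft.ifft2(np.fft.ifftshift(mixed)).real)
    return merged


# Hypothetical usage: three clients, sharing more frequencies each round.
rng = np.random.default_rng(0)
clients = [rng.normal(size=(4, 6)) for _ in range(3)]
for ratio in (0.25, 0.5, 1.0):
    clients = fourier_aggregate(clients, ratio)
```

With `ratio=1.0` every frequency is shared and the result reduces to plain parameter averaging; smaller values leave more of each client's own parameters untouched, which is one plausible way to soften the effect of aggregation on personalized models.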