Abstract
Medical image analysis has experienced remarkable advancements in disease diagnosis, treatment planning, surgical navigation, and prognosis monitoring. The rapid evolution of biomedical imaging technologies has driven the need for efficient deep learning techniques to automate image analysis, aiding clinicians and medical robots in precise and real-time image-based analyses.
Despite notable advancements, the practical application of deep learning in clinical environments is often limited by the need for extensive annotated training data and the computational intensity of existing models. Additionally, much existing research overlooks the analysis of small objects, those with an area ratio below 10% of the image, which is crucial for early-stage disease detection. This research aims to overcome these challenges by developing innovative mixed sample data augmentation techniques, light-weight neural networks, and methods for analysing small medical objects, enhancing the efficiency and effectiveness of medical image analysis with limited annotated data for real-time medical robotic applications.
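To make the small-object criterion concrete, the following is a minimal sketch of how an area ratio could be computed from a binary segmentation mask. It assumes the area ratio means object pixels divided by total image pixels; the function names `area_ratio` and `is_small_object` are hypothetical and are not taken from the thesis.

```python
import numpy as np


def area_ratio(mask: np.ndarray) -> float:
    """Fraction of pixels in a binary mask that belong to the object.

    Assumption: 'area ratio' is object pixels over total image pixels.
    """
    mask = np.asarray(mask, dtype=bool)
    return float(mask.sum()) / mask.size


def is_small_object(mask: np.ndarray, threshold: float = 0.10) -> bool:
    """True when the object covers less than `threshold` of the image."""
    return area_ratio(mask) < threshold


# Example: a 20x20 object in a 100x100 image covers 400 / 10000 pixels.
small_mask = np.zeros((100, 100), dtype=bool)
small_mask[10:30, 10:30] = True

# Example: a 50x50 object in the same image covers 2500 / 10000 pixels.
large_mask = np.zeros((100, 100), dtype=bool)
large_mask[0:50, 0:50] = True
```

Under this assumed definition, `small_mask` falls below the 10% threshold and `large_mask` does not.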
Mixed sample data augmentation (MSDA) methods enhance data variability and model generalisability without burdening the inference process, as they require computational resources only during training. Previous MSDA methods, such as Cutout, CutMix, and CutBlur, treated all regions within an image as equally important. This research proposes two novel data augmentation techniques, rotary-cutting mixture (RotMix) and saliency mixture (SaliencyMix), to enrich the forms of data mixing. RotMix employs relative rotation between two cropped patches to diversify data.
SaliencyMix utilises the minimum barrier distance transform to assess pixel connectivity, emphasising undervalued regions.
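As a rough illustration of the general idea behind mixing samples with a rotated patch, the sketch below pastes a rotated square crop from one image into another and mixes the labels by area, in the style of CutMix. It is not the RotMix or SaliencyMix procedure from this research: the function name `rotated_patch_mix`, the fixed square patch, the restriction to 90-degree rotations via `np.rot90`, and the area-proportional label weights are all assumptions made for the example.

```python
import numpy as np


def rotated_patch_mix(img_a, img_b, label_a, label_b, patch=16, k=1, rng=None):
    """CutMix-style mixing with a rotated square patch (illustrative only).

    img_a, img_b: arrays of shape (H, W, C) with identical shapes.
    label_a, label_b: one-hot label vectors of equal length.
    patch: side length of the square patch (hypothetical parameter).
    k: number of 90-degree rotations applied to the patch taken from img_b.
    """
    rng = np.random.default_rng() if rng is None else rng
    h, w = img_a.shape[:2]

    # Pick a random top-left corner so the patch fits inside the image.
    top = int(rng.integers(0, h - patch + 1))
    left = int(rng.integers(0, w - patch + 1))

    # Crop from img_b, rotate in the spatial plane, and paste into a copy of img_a.
    crop = img_b[top:top + patch, left:left + patch]
    rotated = np.rot90(crop, k=k, axes=(0, 1))
    mixed = img_a.copy()
    mixed[top:top + patch, left:left + patch] = rotated

    # Weight the labels by the share of pixels each source image contributes.
    lam = 1.0 - (patch * patch) / (h * w)
    mixed_label = lam * label_a + (1.0 - lam) * label_b
    return mixed, mixed_label, lam


# Usage with two dummy images: all zeros (class 0) and all ones (class 1).
rng = np.random.default_rng(0)
image_a = np.zeros((32, 32, 3), dtype=np.float32)
image_b = np.ones((32, 32, 3), dtype=np.float32)
mixed, mixed_label, lam = rotated_patch_mix(
    image_a, image_b, np.array([1.0, 0.0]), np.array([0.0, 1.0]), patch=16, rng=rng
)
```

Because the mixing happens only when training batches are built, an augmentation of this kind adds no cost at inference time, which matches the property of MSDA methods described above.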
Furthermore, while ensemble convolutional neural networks (CNNs) have been used to improve classification accuracy, their application to tasks such as skin lesion recognition consumes significant computational resources. This research develops HierAttn, a light-weight, hierarchical attention-based network optimised for medical image classification. HierAttn incorporates multi-stage and multi-branch attention mechanisms to leverage multi-scale features with only a minor increase in computational load.
Additionally, the convolution and pooling operations in traditional CNNs progressively compress feature maps and degrade information retention. This research introduces SvANet, a scale-variant attention-based network designed to enhance the representation of small-scale medical objects with an area ratio below 10%. SvANet integrates progressively compressed and high-resolution features from the initial stages, supported by cross-scale guidance and scale-variant attention.
To capture diverse feature scales, SvANet also incorporates Monte Carlo attention and tensor-assembly-based convolution with transformer.
Moreover, this research applies these deep learning techniques to microrobotic manipulation for in vitro fertilisation, eliminating the need for invasive staining techniques and enabling non-invasive segmentation and detection of sperm.
Lastly, this research presents MobileViM, an architecture tailored for efficient 3D medical image segmentation. Leveraging a vision-Mamba-based framework, MobileViM features a dimension-independent mechanism, dual-direction information flow, and cross-scale bridging, achieving segmentation speeds of over 90 FPS on a single RTX 4090 GPU and surpassing existing models in performance metrics. This capability is particularly beneficial in clinical surgeries requiring real-time 3D image examination.
In summary, this research significantly advances automated biomedical image analysis, enhancing the potential for early diagnosis and treatment and providing solutions for medical robotics.
By addressing deep learning limitations in medical image analysis, this research facilitates the development of efficient, accurate, and practical techniques for clinical deployment.
| Date of Award | 25 Aug 2025 |
|---|---|
| Original language | English |
| Awarding Institution | |
| Supervisor | Steven WANG (Supervisor) & Jun Liu (External Co-Supervisor) |