Zoom in Lesions for Better Diagnosis: Attention Guided Deformation Network for WCE Image Classification

Research output: Journal Publications and Reviews (RGC: 21, 22, 62); 21_Publication in refereed journal; peer-review

11 Scopus Citations

Original language: English
Article number: 9143178
Pages (from-to): 4047-4059
Journal / Publication: IEEE Transactions on Medical Imaging
Issue number: 12
Online published: 17 Jul 2020
Publication status: Published - Dec 2020


Wireless capsule endoscopy (WCE) is a novel imaging tool that allows noninvasive visualization of the entire gastrointestinal (GI) tract without causing discomfort to patients. Although convolutional neural networks (CNNs) perform favorably against traditional machine learning methods, they show limited capacity in WCE image classification because lesions are small and the background interferes. To overcome these limitations, we propose a two-branch Attention Guided Deformation Network (AGDN) for WCE image classification. Specifically, the attention maps of branch 1 guide the amplification of lesion regions in the input images of branch 2, leading to better representation and inspection of small lesions. In addition, we devise Third-order Long-range Feature Aggregation (TLFA) modules and insert them into the network. By capturing long-range dependencies and aggregating contextual features, the TLFA modules give the network a global contextual view and stronger feature representation and discrimination capability. Furthermore, we propose a novel Deformation based Attention Consistency (DAC) loss to refine the attention maps and let the two branches promote each other. Finally, the global feature embeddings from the two branches are fused to predict the image label. Extensive experiments show that the proposed AGDN outperforms state-of-the-art methods with an overall classification accuracy of 91.29% on two public WCE datasets. The source code is available at https://github.com/hathawayxxh/WCE-AGDN.
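
The abstract does not say how the attention-guided amplification is computed, so the following is only an illustrative sketch and not the authors' implementation. It assumes an attention map with the same spatial size as the image and uses a generic inverse-CDF resampling along each axis, so that rows and columns with more attention receive more output pixels. The function names `attention_guided_grid` and `deform`, the `eps` parameter, and nearest-neighbour sampling are all assumptions made for this example.

```python
import numpy as np


def attention_guided_grid(attention, out_size, eps=1e-6):
    """Return row and column sampling coordinates that oversample
    high-attention areas (hypothetical helper, not from the paper)."""
    # eps keeps the cumulative sums strictly increasing for np.interp.
    att = np.asarray(attention, dtype=np.float64) + eps

    def axis_coords(marginal, n_in):
        # Cumulative distribution of attention along one axis, starting at 0.
        cdf = np.concatenate(([0.0], np.cumsum(marginal) / marginal.sum()))
        # Evenly spaced output positions, mapped through the inverse CDF.
        targets = (np.arange(out_size) + 0.5) / out_size
        return np.interp(targets, cdf, np.arange(n_in + 1)) - 0.5

    rows = axis_coords(att.sum(axis=1), att.shape[0])
    cols = axis_coords(att.sum(axis=0), att.shape[1])
    return rows, cols


def deform(image, attention, out_size):
    """Resample `image` so that high-attention regions are enlarged.
    Nearest-neighbour sampling is used here for brevity."""
    rows, cols = attention_guided_grid(attention, out_size)
    r = np.clip(np.rint(rows), 0, image.shape[0] - 1).astype(int)
    c = np.clip(np.rint(cols), 0, image.shape[1] - 1).astype(int)
    return image[np.ix_(r, c)]


# Toy example: a 2x2 "lesion" inside an 8x8 image.
image = np.zeros((8, 8))
image[2:4, 2:4] = 1.0
attention = np.zeros((8, 8))
attention[2:4, 2:4] = 1.0          # attention concentrated on the lesion
zoomed = deform(image, attention, out_size=8)
```

With uniform attention this resampling leaves the image unchanged, and with attention concentrated on the lesion the lesion covers more of the output. A trainable network would need a differentiable sampler (for example bilinear interpolation) rather than the nearest-neighbour indexing used here.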

Research Area(s)

  • attention consistency, attention guided image deformation, image classification, long-range feature aggregation, Wireless capsule endoscopy