MKCNet: Learning Multi-Granularity Key Cues for FGVC for Intelligent Monitoring Robot via Transformers

  • Hai Liu
  • Yu Song
  • Tingting Liu
  • Li Liu
  • Zhaoli Zhang
  • You-Fu Li

Research output: Chapters, Conference Papers, Creative and Literary Works; RGC 32 - Refereed conference paper (with host publication); peer-review

Abstract

Fine-grained visual classification (FGVC), which aims to distinguish visually similar subcategories, plays a major role in robotic vision and has a significant impact on applications such as the conservation of endangered and rare birds. In this paper, we propose a novel Multi-Granularity Key Cue Learning Network (MKCNet) for FGVC. First, we design a multi-granularity feature extraction module to capture local information at different scales and enhance high-level feature representations with discriminative local features across different granularities. Next, to extract partial features of different shapes at each granularity, we introduce a partial sampling attention mechanism that comprehensively samples implicit semantic parts on the feature map. This partial sampling not only takes into account the importance of sample localization, but also applies a local loss to mitigate overfitting. In extensive experiments on the CUB-200-2011 and NABirds datasets, our approach demonstrates superior performance compared to current top techniques. © 2024 IEEE.
Original language: English
Title of host publication: 2024 9th International Conference on Robotics and Automation Engineering (ICRAE 2024)
Publisher: IEEE
Pages: 158-163
ISBN (Electronic): 9798331518301
ISBN (Print): 9798331518318
DOIs
Publication status: Published - Nov 2024
Event: 9th International Conference on Robotics and Automation Engineering (ICRAE 2024) - Hybrid, Singapore
Duration: 15 Nov 2024 to 17 Nov 2024
https://www.icrae.org/2024.html

Publication series

Name: International Conference on Robotics and Automation Engineering, ICRAE

Conference

Conference: 9th International Conference on Robotics and Automation Engineering (ICRAE 2024)
Place: Singapore
Period: 15/11/24 to 17/11/24
Internet address

Funding

This work was supported in part by the National Key Research and Development Program of China under Grant 2021YFC3340802; in part by the National Natural Science Foundation of China under Grant 6247077114, Grant 62377037, Grant 62277041, Grant 62173286, Grant 62177019, and Grant 62177018; in part by the Research Grants Council of Hong Kong under Grant 9043323 and Grant 11213420; in part by the Jiangxi Provincial Natural Science Foundation under Grant 20242BAB2S107 and Grant 20232BAB212026; in part by the National Natural Science Foundation of Hubei Province under Grant 2022CFB529 and Grant 2022CFB971; in part by the University Teaching Reform Research Project of Jiangxi Province under Grant JXJG-23-27-6; and in part by the Shenzhen Science and Technology Program under Grant JCYJ20230807152900001.

Research Keywords

  • Computer Vision
  • Image classification
  • Robotic vision
  • Vision Transformer

RGC Funding Information

  • RGC-funded
