
Pixel-Level Semantics Boosted Fine-Grained Bird Image Classification

Haoxiang Ma, Yongjian Deng*, Bochen Xie, Jian Liu, Hai Liu, Youfu Li, Zhen Yang*

*Corresponding author for this work

Research output: Journal Publications and Reviews; RGC 21 - Publication in refereed journal; peer-review

Abstract

Fine-grained bird image classification (FBIC) is crucial for endangered bird conservation and biodiversity research. However, existing methods often struggle to capture detailed features and manage the interference caused by complex backgrounds. To address these challenges, we propose a novel Pixel-Level Semantic Boosted Fine-Grained Bird Image Classification (PFIC) framework, which enhances fine-grained bird image classification by incorporating pixel-level semantic information. PFIC consists of two core components: the Grouped Detail Enhancement (GDE) module and the Background–Foreground Enhancement (BFE) strategy. GDE integrates multi-level pixel-level semantic information, derived from a segmentation feature extractor, into classification features via two submodules: grouped aggregation and detail enhancement. This approach enhances the model's ability to capture fine-grained details. BFE augments training samples by restricting background ranges and applying random shifts to foreground objects, thereby improving the model's capability to recognize foreground objects in complex environments. Experimental results demonstrate that our method achieves state-of-the-art performance on the CUB-200-2011 and NABirds datasets. Additionally, further experiments on the Stanford Cars dataset validate the framework's potential for generalization to other fine-grained image classification tasks. © 2025 Elsevier Ltd
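The abstract does not give implementation details for BFE. As a rough illustration of the "random shifts to foreground objects" idea only, the sketch below assumes a binary foreground mask is already available (for example, from a segmentation model). The function name `shift_foreground`, the `max_shift` parameter, and the mean-colour fill of the vacated region are all assumptions made for this example, not the authors' method, and the restriction of background ranges is not modelled.

```python
import numpy as np


def shift_foreground(image, mask, max_shift, rng):
    """Move the masked foreground by a random offset (illustrative sketch).

    image: (H, W, C) array.
    mask: (H, W) boolean array, True on foreground pixels (assumed input).
    max_shift: largest absolute offset in pixels per axis (hypothetical parameter).
    rng: a numpy.random.Generator.
    Returns the augmented image and the shifted mask.
    """
    h, w = mask.shape
    # Draw one integer offset per axis from [-max_shift, max_shift].
    dy, dx = rng.integers(-max_shift, max_shift + 1, size=2)

    ys, xs = np.nonzero(mask)
    ty, tx = ys + dy, xs + dx
    # Drop foreground pixels that would land outside the image.
    keep = (ty >= 0) & (ty < h) & (tx >= 0) & (tx < w)

    out = image.copy()
    # Crude stand-in for background inpainting: fill the vacated
    # foreground region with the mean background colour.
    background = image[~mask]
    fill = background.mean(axis=0) if background.size else 0
    out[ys, xs] = fill

    # Paste the foreground pixels at their shifted positions.
    out[ty[keep], tx[keep]] = image[ys[keep], xs[keep]]

    new_mask = np.zeros_like(mask)
    new_mask[ty[keep], tx[keep]] = True
    return out, new_mask
```

In a training pipeline, a function like this would typically be applied per sample with a fresh random offset, so that the classifier sees the same object at varying positions against its background.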
Original language: English
Article number: 112089
Number of pages: 14
Journal: Engineering Applications of Artificial Intelligence
Volume: 161
Issue number: Part B
Online published: 5 Sept 2025
Publication status: Published - 9 Dec 2025

Funding

This work is jointly supported by the National Natural Science Foundation of China (62203024), the Beijing Natural Science Foundation (4252026), the Research and Development Program of Beijing Municipal Education Commission (KM202310005027), and the National Key Research and Development Program of China (2022YFF0610000).

UN SDGs

This output contributes to the following UN Sustainable Development Goals (SDGs)

  1. SDG 15 - Life on Land

Research Keywords

  • Image classification
  • Image segmentation
  • Pixel-level semantic information
  • Vision transformer
