Abstract
Fine-grained bird image classification (FBIC) is crucial for endangered bird conservation and biodiversity research. However, existing methods often struggle to capture detailed features and manage the interference caused by complex backgrounds. To address these challenges, we propose a novel Pixel-Level Semantic Boosted Fine-Grained Bird Image Classification (PFIC) framework, which enhances fine-grained bird image classification by incorporating pixel-level semantic information. PFIC consists of two core components: the Grouped Detail Enhancement (GDE) module and the Background–Foreground Enhancement (BFE) strategy. GDE integrates multi-level pixel-level semantic information, derived from a segmentation feature extractor, into classification features via two submodules: grouped aggregation and detail enhancement. This approach enhances the model's ability to capture fine-grained details. BFE augments training samples by restricting background ranges and applying random shifts to foreground objects, thereby improving the model's capability to recognize foreground objects in complex environments. Experimental results demonstrate that our method achieves state-of-the-art performance on the CUB-200-2011 and NABirds datasets. Additionally, further experiments on the Stanford Cars dataset validate the framework's potential for generalization to other fine-grained image classification tasks. © 2025 Elsevier Ltd
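The BFE idea summarized in the abstract (restricting the background range and randomly shifting the foreground object) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `bfe_style_augment`, the parameters `margin` and `max_shift`, the bounding-box crop, and the mean-color fill of vacated pixels are all assumptions made for illustration. A binary foreground mask is assumed to be available, for example from a segmentation model.

```python
import numpy as np


def bfe_style_augment(image, mask, margin=0.2, max_shift=0.1, rng=None):
    """Hypothetical BFE-style augmentation (illustrative only).

    image: H x W or H x W x C array; mask: H x W foreground mask.
    Returns the augmented crop and the shifted foreground mask.
    """
    rng = np.random.default_rng() if rng is None else rng
    mask = mask.astype(bool)
    h, w = mask.shape
    ys, xs = np.nonzero(mask)
    if ys.size == 0:
        # No foreground found: return the sample unchanged.
        return image.copy(), mask.copy()

    # Foreground bounding box (end indices are exclusive).
    y0, y1 = ys.min(), ys.max() + 1
    x0, x1 = xs.min(), xs.max() + 1

    # 1) Restrict the background range: keep only a window around the
    #    bounding box, enlarged by `margin` and clipped to the image.
    my = int(round(margin * (y1 - y0)))
    mx = int(round(margin * (x1 - x0)))
    cy0, cy1 = max(0, y0 - my), min(h, y1 + my)
    cx0, cx1 = max(0, x0 - mx), min(w, x1 + mx)
    crop = image[cy0:cy1, cx0:cx1]
    crop_mask = mask[cy0:cy1, cx0:cx1]
    ch, cw = crop_mask.shape

    # 2) Draw a random integer shift, bounded by a fraction of the crop size.
    ry, rx = int(max_shift * ch), int(max_shift * cw)
    sy = int(rng.integers(-ry, ry + 1))
    sx = int(rng.integers(-rx, rx + 1))

    # Fill the vacated foreground area with the mean background value
    # (a crude stand-in for inpainting, assumed here for simplicity).
    out = crop.copy()
    background = ~crop_mask
    if background.any():
        out[crop_mask] = crop[background].mean(axis=0).astype(image.dtype)

    # Paste the foreground pixels at the shifted location, dropping any
    # pixels that would fall outside the crop.
    fy, fx = np.nonzero(crop_mask)
    ty, tx = fy + sy, fx + sx
    keep = (ty >= 0) & (ty < ch) & (tx >= 0) & (tx < cw)
    out[ty[keep], tx[keep]] = crop[fy[keep], fx[keep]]

    shifted_mask = np.zeros_like(crop_mask)
    shifted_mask[ty[keep], tx[keep]] = True
    return out, shifted_mask
```

In a training pipeline, a function like this would typically be applied per sample before resizing to the network input size. The shift is kept small relative to the margin so that the object usually stays inside the restricted background window.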
| Original language | English |
|---|---|
| Article number | 112089 |
| Number of pages | 14 |
| Journal | Engineering Applications of Artificial Intelligence |
| Volume | 161 |
| Issue number | Part B |
| Online published | 5 Sept 2025 |
| Publication status | Published - 9 Dec 2025 |
Funding
This work is jointly supported by the National Natural Science Foundation of China (62203024), the Beijing Natural Science Foundation (4252026), the Research and Development Program of Beijing Municipal Education Commission (KM202310005027), and the National Key Research and Development Program of China (2022YFF0610000).
UN SDGs
This output contributes to the following UN Sustainable Development Goals (SDGs):
- SDG 15 Life on Land
Research Keywords
- Image classification
- Image segmentation
- Pixel-level semantic information
- Vision transformer