Facilitating Semi-Supervised Pedestrian Detection with Structurally Controllable Instance Synthesis

Tianyou Zhang, Wenhao Wu*, Si Wu*, Rui Li

*Corresponding author for this work

Research output: Chapters, Conference Papers, Creative and Literary Works; RGC 32 - Refereed conference paper (with host publication); peer-review

Abstract

The performance of pedestrian detectors typically relies on sufficient labeled data, and semi-supervised learning is a promising way to compensate for scarce manual annotations by exploiting abundant unlabeled images. In this work, we design a Structure-Controllable Pedestrian Instance Generation approach (SCPIG), which is tailored to semi-supervised pedestrian detection. Specifically, we adopt a mask encoder to transform mask images into embeddings that encapsulate structural knowledge. In addition, we incorporate a mapping network to transform a random latent code and a conditional generation network to synthesize diverse pedestrian instances, where the transformed code and the mask embedding control pedestrian appearance and structure, respectively. The synthesized pedestrian instances are used to construct high-quality pseudo-labeled images for training pedestrian detectors. Extensive experiments validate the effectiveness of SCPIG in controllable pedestrian instance synthesis and semi-supervised pedestrian detection. © 2025 IEEE.
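The data flow described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the layer sizes, the single random linear layer standing in for each network, and all function names (`encode_mask`, `map_latent`, `generate`, `synthesize`) are assumptions made purely for illustration. It only shows how a mask embedding (structure) and a transformed latent code (appearance) might jointly condition a generator.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy sizes, chosen arbitrarily for illustration (assumptions, not from the paper).
H, W = 16, 8                  # height and width of a pedestrian instance
Z_DIM, STYLE_DIM, STRUCT_DIM = 32, 32, 24


def random_linear(in_dim, out_dim):
    """Untrained random weights standing in for a learned network."""
    return rng.standard_normal((in_dim, out_dim)) / np.sqrt(in_dim)


W_MASK = random_linear(H * W, STRUCT_DIM)                # mask encoder
W_MAP = random_linear(Z_DIM, STYLE_DIM)                  # mapping network
W_GEN = random_linear(STYLE_DIM + STRUCT_DIM, H * W * 3)  # conditional generator


def encode_mask(mask):
    """Turn an (H, W) binary mask into a structure embedding."""
    return np.tanh(mask.reshape(-1) @ W_MASK)


def map_latent(z):
    """Transform a random latent code into an appearance code."""
    return np.tanh(z @ W_MAP)


def generate(style, structure):
    """Produce an (H, W, 3) image conditioned on appearance and structure."""
    logits = np.concatenate([style, structure]) @ W_GEN
    return (1.0 / (1.0 + np.exp(-logits))).reshape(H, W, 3)


def synthesize(mask, z):
    return generate(map_latent(z), encode_mask(mask))


# One mask (fixed structure), two latent codes (different appearance).
mask = np.zeros((H, W))
mask[2:14, 2:6] = 1.0
img_a = synthesize(mask, rng.standard_normal(Z_DIM))
img_b = synthesize(mask, rng.standard_normal(Z_DIM))
print(img_a.shape, np.allclose(img_a, img_b))
```

In this sketch, holding `mask` fixed while resampling the latent code changes only the appearance input to the generator, which mirrors the separation of structure and appearance control that the abstract describes.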
Original language: English
Title of host publication: Proceedings of the 2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Publisher: IEEE
Number of pages: 5
ISBN (Electronic): 979-8-3503-6874-1
ISBN (Print): 979-8-3503-6875-8
Publication status: Published - 2025
Event: 50th IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2025) - Hyderabad International Convention Centre, Hyderabad, India
Duration: 6 Apr 2025 to 11 Apr 2025
https://2025.ieeeicassp.org/

Publication series

Name: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
ISSN (Print): 1520-6149
ISSN (Electronic): 2379-190X

Conference

Conference: 50th IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2025)
Abbreviated title: ICASSP2025
Place: India
City: Hyderabad
Period: 6/04/25 to 11/04/25

Funding

This work was supported in part by the Key Realm Research and Development Program of Guangzhou (Project No. 2024B01W0007), in part by the National Natural Science Foundation of China (Project No. 62072189), in part by the GuangDong Basic and Applied Basic Research Foundation (Project No. 2024A1515011437), and in part by TCL Science and Technology Innovation Fund (Project No. 20231752).

Research Keywords

  • Generative model
  • pedestrian generation
  • semi-supervised learning
