TY - JOUR
T1 - Controllable instance synthesis with hierarchical regularization for semi-supervised pedestrian detection
AU - Chen, Tianyi
AU - Li, Gaozhe
AU - Sai, Xiangyu
AU - Xie, Lianxin
AU - Zhang, Yunfei
AU - Wu, Si
PY - 2025
Y1 - 2025/2/14
N2 - Generative Adversarial Network (GAN)-based pedestrian image synthesis opens up the possibility of using synthesized data with sufficient diversity to cover unseen variations, which facilitates semi-supervised pedestrian detection. To improve synthesized image quality and diversity in the semi-supervised setting, we propose a Hierarchical Regularized GAN (HiR-GAN), which allows a controllable instance generation process. In the proposed model, a pedestrian instance is synthesized and stitched on top of a background image. By imposing shape-based regularization on intermediate results and complete pedestrian images, a generator learns to disentangle pedestrian shape and appearance without any supervision. As a result, we achieve effective control of background, shape, and appearance via explicit latent codes. Further, the generator competes with two discriminators, which focus more on pedestrian shape and image quality, respectively. By jointly training a background-pedestrian classifier, the generator is guided to capture more precise semantics of real pedestrian data, while at the same time the classifier is able to produce more accurate annotations of unlabeled data. Experimental results on multiple benchmarks demonstrate the superiority of HiR-GAN over previous state-of-the-art methods in semi-supervised pedestrian image synthesis and the effectiveness of high-fidelity synthesized data in reducing the dependence on labeled data in the pedestrian detection task. © 2024 Elsevier B.V.
AB - Generative Adversarial Network (GAN)-based pedestrian image synthesis opens up the possibility of using synthesized data with sufficient diversity to cover unseen variations, which facilitates semi-supervised pedestrian detection. To improve synthesized image quality and diversity in the semi-supervised setting, we propose a Hierarchical Regularized GAN (HiR-GAN), which allows a controllable instance generation process. In the proposed model, a pedestrian instance is synthesized and stitched on top of a background image. By imposing shape-based regularization on intermediate results and complete pedestrian images, a generator learns to disentangle pedestrian shape and appearance without any supervision. As a result, we achieve effective control of background, shape, and appearance via explicit latent codes. Further, the generator competes with two discriminators, which focus more on pedestrian shape and image quality, respectively. By jointly training a background-pedestrian classifier, the generator is guided to capture more precise semantics of real pedestrian data, while at the same time the classifier is able to produce more accurate annotations of unlabeled data. Experimental results on multiple benchmarks demonstrate the superiority of HiR-GAN over previous state-of-the-art methods in semi-supervised pedestrian image synthesis and the effectiveness of high-fidelity synthesized data in reducing the dependence on labeled data in the pedestrian detection task. © 2024 Elsevier B.V.
KW - Generative adversarial nets
KW - Hierarchical regularization
KW - Pedestrian image synthesis
KW - Semi-supervised learning
UR - http://www.scopus.com/inward/record.url?scp=85211062178&partnerID=8YFLogxK
UR - https://www.scopus.com/record/pubmetrics.uri?eid=2-s2.0-85211062178&origin=recordpage
U2 - 10.1016/j.neucom.2024.128831
DO - 10.1016/j.neucom.2024.128831
M3 - RGC 21 - Publication in refereed journal
SN - 0925-2312
VL - 618
JO - Neurocomputing
JF - Neurocomputing
M1 - 128831
ER -