Improving RGB-Thermal Semantic Scene Understanding with Synthetic Data Augmentation for Autonomous Driving

Haotian Li, Henry K. Chu, Yuxiang Sun*

*Corresponding author for this work

Research output: Journal Publications and Reviews > RGC 21 - Publication in refereed journal > peer-review

Abstract

Semantic scene understanding is an important capability for autonomous vehicles. Despite recent advances in RGB-Thermal (RGB-T) semantic segmentation, existing methods often rely on parameter-heavy models, which are particularly constrained by the lack of precisely labeled training data. To alleviate this limitation, we propose a data-driven method, SyntheticSeg, to enhance RGB-T semantic segmentation. Specifically, we use generative models to produce synthetic RGB-T images from the semantic layouts in real datasets and construct a large-scale, high-fidelity synthetic dataset that provides the segmentation models with sufficient training data. We also introduce a novel metric that measures both the scarcity and the segmentation difficulty of semantic layouts, guiding sampling from the synthetic dataset to alleviate class imbalance and improve overall segmentation performance. Experimental results on a public dataset demonstrate that our method outperforms the state of the art. © 2025 IEEE.
Original language: English
Pages (from-to): 4452-4459
Journal: IEEE Robotics and Automation Letters
Volume: 10
Issue number: 5
Online published: 5 May 2025
Publication status: Published - May 2025

Funding

This article was recommended for publication by Associate Editor M. Ramezani and Editor C. Cadena upon evaluation of the reviewers’ comments. This work was supported in part by the Hong Kong Innovation and Technology Fund under Grant ITS/145/21 and in part by City University of Hong Kong under Grant 9610675.

Research Keywords

  • Autonomous Driving
  • RGB-T Fusion
  • Semantic Scene Understanding
  • Synthetic Image Generation
