Abstract
Semantic scene understanding is an important capability for autonomous vehicles. Despite recent advances in RGB-Thermal (RGB-T) semantic segmentation, existing methods often rely on parameter-heavy models, which are particularly constrained by the lack of precisely labeled training data. To alleviate this limitation, we propose a data-driven method, SyntheticSeg, to enhance RGB-T semantic segmentation. Specifically, we utilize generative models to generate synthetic RGB-T images from the semantic layouts in real datasets and construct a large-scale, high-fidelity synthetic dataset that provides segmentation models with sufficient training data. We also introduce a novel metric that measures both the scarcity and segmentation difficulty of semantic layouts, guiding sampling from the synthetic dataset to alleviate class imbalance and improve overall segmentation performance. Experimental results on a public dataset demonstrate our superior performance over state-of-the-art methods. © 2025 IEEE.
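The abstract only describes the scarcity-and-difficulty sampling metric at a high level. As a rough illustration (not the paper's actual formulation), the sketch below weights each synthetic layout by a class-rarity term and a segmentation-difficulty term and turns the result into sampling probabilities; the function name, the linear `alpha` trade-off, and the per-image difficulty input are all assumptions.

```python
import numpy as np

def sampling_weights(layouts, class_freq, difficulty, alpha=0.5):
    """Hypothetical sketch (not from the paper): score synthetic layouts by
    how rare their classes are and how hard they are to segment.

    layouts    : list of 2-D integer arrays of class IDs, one per synthetic image
    class_freq : per-class pixel frequencies measured on the real dataset
    difficulty : per-image difficulty scores, e.g. 1 - mean IoU of a baseline
                 segmenter on each layout (assumed proxy)
    alpha      : trade-off between scarcity and difficulty (assumed)
    """
    rarity = np.empty(len(layouts))
    for i, layout in enumerate(layouts):
        classes = np.unique(layout)
        # Rare classes have low frequency, so invert it and average over the layout.
        rarity[i] = np.mean(1.0 / (class_freq[classes] + 1e-6))
    rarity /= rarity.max()

    difficulty = np.asarray(difficulty, dtype=float)
    difficulty /= difficulty.max()

    scores = alpha * rarity + (1.0 - alpha) * difficulty
    return scores / scores.sum()  # normalized sampling probabilities

# Usage sketch: draw a training subset from the synthetic pool.
# probs = sampling_weights(layouts, class_freq, difficulty)
# picked = np.random.choice(len(layouts), size=num_train, replace=False, p=probs)
```

Under this assumed scheme, layouts containing rare classes or layouts a baseline segmenter struggles with are sampled more often, which is one plausible way to realize the class-imbalance mitigation the abstract describes.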
| Field | Value |
| --- | --- |
| Original language | English |
| Pages (from-to) | 4452-4459 |
| Journal | IEEE Robotics and Automation Letters |
| Volume | 10 |
| Issue number | 5 |
| Online published | 5 May 2025 |
| DOIs | |
| Publication status | Published - May 2025 |
Funding
This article was recommended for publication by Associate Editor M. Ramezani and Editor C. Cadena upon evaluation of the reviewers’ comments. This work was supported in part by Hong Kong Innovation and Technology Fund under Grant ITS/145/21 and in part by City University of Hong Kong under Grant 9610675.
Research Keywords
- Autonomous Driving
- RGB-T Fusion
- Semantic Scene Understanding
- Synthetic Image Generation