Occlusion-Aware Diffusion Model for Pedestrian Intention Prediction

Yu Liu, Zhijie Liu, Zedong Yang, You-Fu Li, He Kong*

*Corresponding author for this work

Research output: Journal Publications and Reviews; RGC 21 - Publication in refereed journal; peer-review

Abstract

Predicting pedestrian crossing intentions is crucial for the navigation of mobile robots and intelligent vehicles. Although recent deep learning-based models have shown significant success in forecasting intentions, few consider incomplete observation under occlusion scenarios. To tackle this challenge, we propose an Occlusion-Aware Diffusion Model (ODM) that reconstructs occluded motion patterns and leverages them to guide future intention prediction. During the denoising stage, we introduce an occlusion-aware diffusion transformer architecture to estimate noise features associated with occluded patterns, thereby enhancing the model’s ability to capture contextual relationships in occluded semantic scenarios. Furthermore, an occlusion mask-guided reverse process is introduced to effectively utilize observation information, reducing the accumulation of prediction errors and enhancing the accuracy of reconstructed motion features. The performance of the proposed method under various occlusion scenarios is comprehensively evaluated and compared with existing methods on popular benchmarks, namely PIE and JAAD. Extensive experimental results demonstrate that the proposed method achieves more robust performance than existing methods in the literature. © 2026 IEEE.
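The occlusion mask-guided reverse process mentioned in the abstract can be illustrated with a generic inpainting-style sampler. This is a minimal sketch and not the authors' implementation: the abstract does not give the exact update rule, so the code assumes a common form of mask guidance in which, at every reverse step, observed entries are overwritten with a forward-noised copy of the observation and only occluded entries are taken from the denoiser. The names `make_alpha_bars`, `noise_observation`, `mask_guided_reverse`, and `denoise_step`, the linear beta schedule, and the use of one scalar motion feature per frame are all hypothetical choices made for illustration.

```python
import math
import random


def make_alpha_bars(num_steps, beta_start=1e-4, beta_end=0.02):
    """Cumulative products of (1 - beta_t) for an assumed linear beta schedule."""
    alpha_bars, running = [], 1.0
    for t in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * t / max(num_steps - 1, 1)
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def noise_observation(x0, alpha_bar, rng):
    """Forward-diffuse clean observed values to the noise level alpha_bar."""
    scale, sigma = math.sqrt(alpha_bar), math.sqrt(1.0 - alpha_bar)
    return [scale * v + sigma * rng.gauss(0.0, 1.0) for v in x0]


def mask_guided_reverse(observed, mask, denoise_step, num_steps=50, seed=0):
    """Reconstruct occluded frames while keeping observed frames anchored.

    mask[i] == 1 marks frame i as observed, mask[i] == 0 as occluded.
    denoise_step(x, t) is a hypothetical stand-in for a trained denoiser
    that proposes the less noisy sample for the next reverse step.
    """
    rng = random.Random(seed)
    alpha_bars = make_alpha_bars(num_steps)
    x = [rng.gauss(0.0, 1.0) for _ in observed]  # start from pure noise
    for t in reversed(range(num_steps)):
        predicted = denoise_step(x, t)
        if t > 0:
            # Observed frames: re-noise the observation to the matching level.
            known = noise_observation(observed, alpha_bars[t - 1], rng)
        else:
            # Final step: observed frames take their clean values.
            known = list(observed)
        # Blend: observation where visible, model proposal where occluded.
        x = [m * k + (1 - m) * p for m, k, p in zip(mask, known, predicted)]
    return x


if __name__ == "__main__":
    # Toy denoiser that only shrinks its input; a real model would be learned.
    def toy_denoiser(x, t):
        return [0.9 * v for v in x]

    features = [0.5, 0.0, 0.0, -0.25]   # occluded slots hold placeholders
    occlusion_mask = [1, 0, 0, 1]       # frames 1 and 2 are occluded
    print(mask_guided_reverse(features, occlusion_mask, toy_denoiser, num_steps=10))
```

Because the observed entries are re-imposed at every step, errors made by the denoiser on occluded frames cannot drift into the visible frames, which is one plausible reading of the abstract's claim about reducing accumulated prediction error.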
Original language: English
Pages (from-to): 3579-3593
Journal: IEEE Transactions on Intelligent Transportation Systems
Volume: 27
Issue number: 3
Online published: 27 Feb 2026
Publication status: Published - Mar 2026

Research Keywords

  • deep neural network
  • diffusion
  • occluded observation
  • pedestrian intention prediction
