Abstract
Autonomous driving systems demand tight coordination between prediction and planning to ensure safety in complex urban environments. While modular architectures separating these two tasks offer practical advantages in development flexibility, they inherently struggle with cascading prediction errors and semantic fragmentation, where critical interaction patterns captured during prediction are inadequately utilized in planning. This dissertation addresses the fundamental challenge of bridging prediction and planning through modular integration, enabling robust decision-making without sacrificing system maintainability.
Existing modular approaches face three key limitations: (1) prediction models often employ decoupled representations for agent-agent and agent-map interactions, leading to inefficient environment modeling; (2) traditional planners primarily utilize trajectory waypoints while ignoring richer interaction semantics from prediction; (3) learning-based planners lack standardized interfaces to harness advanced prediction outputs like multimodal trajectories or joint probability distributions. These gaps result in suboptimal planning performance, particularly in urban traffic scenarios that involve complex interactions.
To bridge these gaps, we present three synergistic approaches. First, we propose the Temporal Occupancy Flow Graph (TOFG) to establish a unified spatiotemporal representation that jointly encodes high-definition maps and agent trajectories through fine-grained lane segmentation. The TOFG-GAT prediction model derived from this representation reduces model parameters by 12% while achieving state-of-the-art prediction accuracy, thereby elevating planning performance through enhanced prediction reliability. Second, for traditional sampling-based planners, we introduce TOFG-attention-guided FMT*, which explicitly integrates TOFG-GAT’s interaction attention maps into motion planning. This method prioritizes sampling in high-interaction regions, improving the overall planning success rate and efficiency. Third, to generalize prediction-planning integration, we propose MAPLE, a modular framework that standardizes the translation of diverse trajectory prediction outputs into latent features compatible with any learning-based planner. MAPLE improves closed-loop planning performance by 3-9% across nuPlan scenarios through non-intrusive feature fusion, demonstrating compatibility with multiple planner architectures and prediction models.
Collectively, this research demonstrates that modular autonomous driving systems can achieve superior performance through integration of prediction and planning. By systematically addressing representation heterogeneity, prediction semantic underspecification, and interface incompatibility across modules, our framework preserves the practical benefits of modular architectures (upgradability, interpretability, and computational efficiency) while significantly enhancing planning performance in dense traffic scenarios. Experiments on various benchmarks confirm these advancements, providing a viable path toward deployable yet adaptable autonomous driving systems.
| Date of Award | 5 Sept 2025 |
|---|---|
| Original language | English |
| Awarding Institution | |
| Supervisor | Jianping WANG (Supervisor) & Xiaonan Nancy YU (Co-supervisor) |