MATLIT: MAT-Based Cooperative Reinforcement Learning for Urban Traffic Signal Control

Research output: Journal Publications and Reviews, RGC 21 - Publication in refereed journal, peer-review

Author(s)

  • Kaixiang Su
  • Enshu Wang
  • Weizhen Han
  • Libing Wu
  • Chunming Qiao

Detail(s)

Original language: English
Journal / Publication: IEEE Transactions on Intelligent Transportation Systems
Online published: 11 Feb 2025
Publication status: Online published - 11 Feb 2025

Abstract

Effective multi-intersection collaboration is crucial for mitigating urban traffic congestion through reinforcement learning (RL)-based traffic signal control (TSC). Existing work mainly considers scenarios involving a single vehicle type, where cooperation is typically limited to neighboring intersections. However, in urban traffic scenarios where high-priority vehicles coexist with ordinary vehicles, considering only a limited number of neighboring nodes may be insufficient to ensure the swift passage of high-priority vehicles while minimizing the impact on overall traffic efficiency. Therefore, we formulate the decision-making process of multiple intersections in urban scenarios as a Markov game and propose a novel centralized cooperative RL framework called MATLIT to solve the game. Specifically, we adopt a multi-agent transformer (MAT)-based architecture that facilitates efficient global cooperation among intersections. The attention mechanism and auto-regressive process of the MAT effectively mitigate the curse of dimensionality, which enables MATLIT to tackle large-scale traffic scenarios. Meanwhile, the stability and sequential action generation capacity of the MAT-based architecture are further enhanced by incorporating a gated mechanism into MAT. Furthermore, considering the inherent topological constraints in urban traffic scenarios, we utilize graph attention networks (GATs) to capture graph-structured mutual influences. Additionally, to handle urban traffic scenarios with various types of high-priority vehicles that have time-varying priorities, we integrate the soft actor-critic (SAC) algorithm to enhance the exploration capabilities of our framework, allowing it to learn robust strategies in heterogeneous traffic conditions. Extensive experiments demonstrate that our proposed MATLIT framework outperforms all baselines and can reduce high-priority vehicles’ waiting time by 24.57% while reducing the average waiting time of all vehicles by 18.51% in realistic urban scenarios.
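
To illustrate what "graph-structured mutual influences" can mean in practice, the following is a minimal sketch of attention restricted to a road-network graph. It is not the MATLIT implementation. As a simplifying assumption it uses plain scaled dot-product scores with no learned weights, whereas a GAT learns its attention function. The function name `masked_attention`, the three-intersection layout, and the feature values are all hypothetical.

```python
import math

def masked_attention(features, adjacency):
    """Aggregate each intersection's neighbours with softmax attention weights.

    features:  per-intersection feature vectors (hypothetical inputs)
    adjacency: adjacency[i][j] is True when intersection j is linked to i
    """
    n = len(features)
    dim = len(features[0])
    outputs, weights = [], []
    for i in range(n):
        # Scaled dot-product score for each linked intersection (self included).
        scores = []
        for j in range(n):
            if i == j or adjacency[i][j]:
                dot = sum(a * b for a, b in zip(features[i], features[j]))
                scores.append(dot / math.sqrt(dim))
            else:
                scores.append(float("-inf"))  # no road link: masked out
        # Numerically stable softmax; masked entries become exactly 0.
        peak = max(scores)
        exps = [math.exp(s - peak) for s in scores]
        total = sum(exps)
        row = [e / total for e in exps]
        weights.append(row)
        # Weighted sum of the feature vectors of the linked intersections.
        outputs.append(
            [sum(row[j] * features[j][k] for j in range(n)) for k in range(dim)]
        )
    return outputs, weights

# Hypothetical corridor of three intersections: 0 - 1 - 2.
adjacency = [
    [False, True, False],
    [True, False, True],
    [False, True, False],
]
# Made-up features, e.g. (queue length, priority-vehicle indicator).
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

outputs, weights = masked_attention(features, adjacency)
```

In this sketch, intersections 0 and 2 share no link, so each assigns the other zero weight, while intersection 1 attends to both of its neighbours. A learned GAT layer would replace the fixed dot-product score with trained parameters, but the masking by topology works the same way.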

© 2025 IEEE. All rights reserved, including rights for text and data mining, and training of artificial intelligence and similar technologies. Personal use is permitted, but republication/redistribution requires IEEE permission.