Cooperative traffic signal control using Multi-step return and Off-policy Asynchronous Advantage Actor-Critic Graph algorithm

Research output: Journal Publications and Reviews (RGC: 21, 22, 62) > 21_Publication in refereed journal > peer-review

28 Scopus Citations




Original language: English
Article number: 104855
Journal / Publication: Knowledge-Based Systems
Online published: 22 Jul 2019
Publication status: Published - 1 Nov 2019


Intelligent traffic signal control helps to reduce traffic congestion and has been studied for several decades. Multi-intersection cooperative traffic signal control (CTSC), which is more practical than single-intersection traffic signal control, has attracted much attention in recent years. Existing work on multi-intersection CTSC derives responsive policies from the sequence of agents' actions. One issue in multi-intersection CTSC is that every agent's actions are mapped from its own road information alone, while other useful information, e.g., the distance between adjacent agents, is ignored, which may lead to suboptimal traffic signal control policies. To address this issue, this paper proposes a decentralized coordination graph algorithm, referred to as the Multi-step return and Off-policy Asynchronous Advantage Actor-Critic Graph (MOA3CG) algorithm. The MOA3CG algorithm is based on an asynchronous method of multiagent deep reinforcement learning and a coordination graph; it makes traffic signal control policies based on current traffic states, the history of observations, and other information. A new reward function and an Adjusting Matrix of Traffic Signal Phase Control (AMTSPC) are also proposed, both of which are used by the MOA3CG algorithm in the policy-making process; the AMTSPC adjusts the selection of actions by taking the distance between adjacent agents into account. Experimental results on real-world road scenarios show that the proposed algorithm outperforms four other state-of-the-art algorithms in terms of average delay, average traveling time of vehicles, and vehicle throughput, and thus helps to mitigate traffic congestion.
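
As a rough illustration of the "multi-step return" and "advantage" terms in the algorithm's name, the following Python sketch computes bootstrapped multi-step returns and advantages for a single rollout segment, in the style commonly used with A3C. It is not the paper's implementation: the abstract does not specify the exact formulation, and the off-policy correction, the coordination graph, the reward function, and the AMTSPC are all omitted. The function names, the discount factor `gamma`, and the sample numbers are assumptions made for this example only.

```python
from typing import List


def n_step_returns(rewards: List[float],
                   bootstrap_value: float,
                   gamma: float = 0.99) -> List[float]:
    """Discounted multi-step returns for one rollout segment.

    rewards:          rewards r_t collected over the segment (hypothetical data)
    bootstrap_value:  critic's value estimate for the state after the segment
    gamma:            discount factor (assumed; not given in the abstract)
    """
    returns = [0.0] * len(rewards)
    running = bootstrap_value
    # Work backwards so each return reuses the one after it:
    # R_t = r_t + gamma * R_{t+1}, with R_T = bootstrap_value.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


def advantages(returns: List[float], values: List[float]) -> List[float]:
    """Advantage estimate A_t = R_t - V(s_t) used to weight the actor update."""
    return [g - v for g, v in zip(returns, values)]


if __name__ == "__main__":
    rets = n_step_returns([1.0, 1.0, 1.0], bootstrap_value=0.0, gamma=0.5)
    print(rets)                                # [1.75, 1.5, 1.0]
    print(advantages(rets, [1.0, 1.0, 1.0]))   # [0.75, 0.5, 0.0]
```

In an actor-critic setting, the returns would serve as regression targets for the critic and the advantages would scale the policy-gradient term for the actor; how MOA3CG combines these with its off-policy component is described in the paper itself, not here.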

Research Area(s)

  • Asynchronous Advantage Actor-Critic (A3C) algorithm, Cooperative traffic signal control, Coordination graph algorithm, Multiagent deep reinforcement learning, Transfer planning