TY - JOUR
T1 - Cooperative traffic signal control using Multi-step return and Off-policy Asynchronous Advantage Actor-Critic Graph algorithm
AU - Yang, Shantian
AU - Yang, Bo
AU - Wong, Hau-San
AU - Kang, Zhongfeng
PY - 2019/11/1
Y1 - 2019/11/1
N2 - Intelligent traffic signal control helps to reduce traffic congestion and has therefore been studied for decades. Multi-intersection cooperative traffic signal control (CTSC), which is more practical than single-intersection traffic signal control, has attracted considerable research attention in recent years. Existing works on multi-intersection CTSC make responsive policies based on the sequence of agents’ actions. One issue in multi-intersection CTSC is that each agent's actions are mapped from its own road information, while some useful information, e.g., the distance between adjacent agents, is ignored, which may lead to suboptimal traffic signal control policies. To address this issue, this paper proposes a decentralized coordination graph algorithm, referred to as the Multi-step return and Off-policy Asynchronous Advantage Actor-Critic Graph (MOA3CG) algorithm. The MOA3CG algorithm is based on an asynchronous method of multiagent deep reinforcement learning and a coordination graph; it makes traffic signal control policies based on current traffic states, the history of observations, and other information. A new reward function and an Adjusting Matrix of Traffic Signal Phase Control (AMTSPC) are also proposed, both of which are used by the MOA3CG algorithm in the policy-making process; the AMTSPC alters the selection of actions by considering the distance between adjacent agents. Experimental results on real-world road scenarios show that the proposed algorithm outperforms four other state-of-the-art algorithms in terms of average delay, average travel time of vehicles, and vehicle throughput, and thus ultimately helps to mitigate traffic congestion.
AB - Intelligent traffic signal control helps to reduce traffic congestion and has therefore been studied for decades. Multi-intersection cooperative traffic signal control (CTSC), which is more practical than single-intersection traffic signal control, has attracted considerable research attention in recent years. Existing works on multi-intersection CTSC make responsive policies based on the sequence of agents’ actions. One issue in multi-intersection CTSC is that each agent's actions are mapped from its own road information, while some useful information, e.g., the distance between adjacent agents, is ignored, which may lead to suboptimal traffic signal control policies. To address this issue, this paper proposes a decentralized coordination graph algorithm, referred to as the Multi-step return and Off-policy Asynchronous Advantage Actor-Critic Graph (MOA3CG) algorithm. The MOA3CG algorithm is based on an asynchronous method of multiagent deep reinforcement learning and a coordination graph; it makes traffic signal control policies based on current traffic states, the history of observations, and other information. A new reward function and an Adjusting Matrix of Traffic Signal Phase Control (AMTSPC) are also proposed, both of which are used by the MOA3CG algorithm in the policy-making process; the AMTSPC alters the selection of actions by considering the distance between adjacent agents. Experimental results on real-world road scenarios show that the proposed algorithm outperforms four other state-of-the-art algorithms in terms of average delay, average travel time of vehicles, and vehicle throughput, and thus ultimately helps to mitigate traffic congestion.
KW - Asynchronous Advantage Actor-Critic (A3C) algorithm
KW - Cooperative traffic signal control
KW - Coordination graph algorithm
KW - Multiagent deep reinforcement learning
KW - Transfer planning
UR - http://www.scopus.com/inward/record.url?scp=85069686288&partnerID=8YFLogxK
UR - https://www.scopus.com/record/pubmetrics.uri?eid=2-s2.0-85069686288&origin=recordpage
U2 - 10.1016/j.knosys.2019.07.026
DO - 10.1016/j.knosys.2019.07.026
M3 - RGC 21 - Publication in refereed journal
SN - 0950-7051
VL - 183
JO - Knowledge-Based Systems
JF - Knowledge-Based Systems
M1 - 104855
ER -