TY - JOUR
T1 - Neuro-dynamic programming for optimal control of macroscopic fundamental diagram systems
AU - Su, Z.C.
AU - Chow, Andy H.F.
AU - Zheng, N.
AU - Huang, Y.P.
AU - Liang, E.M.
AU - Zhong, R.X.
PY - 2020/7
Y1 - 2020/7
N2 - The macroscopic fundamental diagram (MFD) can effectively reduce the spatial dimension involved in dynamic optimization of traffic performance for large-scale networks. Solving the Hamilton-Jacobi-Bellman (HJB) equation takes center stage in yielding solutions to the optimal control problem. At the core of solving the HJB equation is the value function that represents choosing a sequence of actions to optimize the system performance. However, this problem generally becomes intractable for possible discontinuities in the solution and the curse of dimensionality for systems with all but modest dimension. To address these challenges, a neural network is used to approximate the value function to obtain the optimal controls through policy iteration. Furthermore, a saturated operator is embedded in the neural network approximator to handle the difficulty caused by the control and state constraints. This policy iteration can be implemented as an iterative data-driven technique that integrates with the model-based optimal design based on real-time observations. Numerical experiments are conducted to show that the neuro-dynamic programming approach can achieve optimization goals while stabilizing the system by regulating the traffic state to the desired uncongested equilibrium.
AB - The macroscopic fundamental diagram (MFD) can effectively reduce the spatial dimension involved in dynamic optimization of traffic performance for large-scale networks. Solving the Hamilton-Jacobi-Bellman (HJB) equation takes center stage in yielding solutions to the optimal control problem. At the core of solving the HJB equation is the value function that represents choosing a sequence of actions to optimize the system performance. However, this problem generally becomes intractable for possible discontinuities in the solution and the curse of dimensionality for systems with all but modest dimension. To address these challenges, a neural network is used to approximate the value function to obtain the optimal controls through policy iteration. Furthermore, a saturated operator is embedded in the neural network approximator to handle the difficulty caused by the control and state constraints. This policy iteration can be implemented as an iterative data-driven technique that integrates with the model-based optimal design based on real-time observations. Numerical experiments are conducted to show that the neuro-dynamic programming approach can achieve optimization goals while stabilizing the system by regulating the traffic state to the desired uncongested equilibrium.
KW - Macroscopic fundamental diagram
KW - Hamilton-Jacobi-Bellman equation
KW - Neuro-dynamic programming
KW - Policy iteration
KW - Saturated state and input
UR - http://www.scopus.com/inward/record.url?scp=85084421696&partnerID=8YFLogxK
UR - https://www.scopus.com/record/pubmetrics.uri?eid=2-s2.0-85084421696&origin=recordpage
U2 - 10.1016/j.trc.2020.102628
DO - 10.1016/j.trc.2020.102628
M3 - 21_Publication in refereed journal
VL - 116
JO - Transportation Research. Part C: Emerging Technologies
JF - Transportation Research. Part C: Emerging Technologies
SN - 0968-090X
M1 - 102628
ER -