Adaptive Metro Service Schedule and Train Composition With a Proximal Policy Optimization Approach Based on Deep Reinforcement Learning

Cheng-Shuo Ying, Andy H. F. Chow*, Yi-Hui Wang, Kwai-Sang Chin

*Corresponding author for this work

Research output: Journal Publications and Reviews › RGC 21 - Publication in refereed journal › peer-review

46 Citations (Scopus)

Abstract

This paper presents an integrated approach to metro service scheduling and train unit deployment, using proximal policy optimization within a deep reinforcement learning framework. The optimization problem is formulated as a Markov decision process (MDP) subject to a set of operational constraints. To address the computational complexity, the value function and control policy are parameterized by artificial neural networks (ANNs), into which the operational constraints are incorporated through a purpose-built mask scheme. A proximal policy optimization (PPO) approach is developed for training the ANNs via successive transition simulations. The optimization framework is implemented and tested on a real-world scenario configured with the Victoria Line of London Underground, UK. The results show that the proposed methodology outperforms a set of selected evolutionary heuristics in terms of both solution quality and computational efficiency. The results also illustrate the advantages of flexible train composition in saving operational costs and reducing service irregularities. This study contributes to real-time metro operations with limited resources and state-of-the-art optimization techniques.
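The two mechanisms named in the abstract, a feasibility mask on the policy output and the PPO training objective, can be illustrated with a minimal Python sketch. It is a generic illustration, not the authors' implementation: the abstract does not give the network architecture, the form of the mask, or any hyperparameters, so the function names `masked_softmax` and `ppo_clipped_objective`, the three-action example, and the clipping value `epsilon=0.2` are all assumptions. The objective shown is the standard clipped surrogate from the general PPO literature.

```python
import math


def masked_softmax(logits, mask):
    """Softmax over policy logits; actions with mask False get probability 0.

    Hypothetical helper: the paper's actual mask scheme is not described
    in the abstract.
    """
    if not any(mask):
        raise ValueError("at least one action must be feasible")
    # Subtract the largest feasible logit for numerical stability.
    peak = max(l for l, ok in zip(logits, mask) if ok)
    exps = [math.exp(l - peak) if ok else 0.0 for l, ok in zip(logits, mask)]
    total = sum(exps)
    return [e / total for e in exps]


def ppo_clipped_objective(new_probs, old_probs, advantages, epsilon=0.2):
    """Mean of the standard PPO clipped surrogate over a batch of samples.

    epsilon=0.2 is a commonly used default, assumed here for illustration.
    """
    terms = []
    for p_new, p_old, adv in zip(new_probs, old_probs, advantages):
        ratio = p_new / p_old
        clipped = min(max(ratio, 1.0 - epsilon), 1.0 + epsilon)
        terms.append(min(ratio * adv, clipped * adv))
    return sum(terms) / len(terms)


# Hypothetical example: three dispatch actions, the third infeasible
# (for instance, no spare train unit available).
probs = masked_softmax([0.0, 0.0, 5.0], [True, True, False])
print(probs)  # [0.5, 0.5, 0.0]
```

Masking before normalization means an infeasible action can never be sampled, regardless of how large its raw logit is, so the constraint holds by construction and does not have to be learned through penalties.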
Original language: English
Pages (from-to): 6895-6906
Journal: IEEE Transactions on Intelligent Transportation Systems
Volume: 23
Issue number: 7
Online published: 11 Mar 2021
Publication status: Published - Jul 2022

Research Keywords

  • deep reinforcement learning
  • Dynamic scheduling
  • Markov decision process
  • Metro service scheduling
  • Processor scheduling
  • proximal policy optimization
  • Reinforcement learning
  • Schedules
  • Scheduling
  • train composition
  • Training
  • Urban areas

Project
  • TBRS: Safety, Reliability, and Disruption Management of High Speed Rail and Metro Systems

    XIE, M. (Principal Investigator / Project Coordinator), BENSOUSSAN, A. (Co-Principal Investigator), LO, S. M. (Co-Principal Investigator), SHOU, B. (Co-Principal Investigator), SINGPURWALLA, N. D. (Co-Principal Investigator), TSE, W. T. P. (Co-Principal Investigator), TSUI, K. L. (Co-Principal Investigator), YU, Y. (Co-Principal Investigator), YUEN, K. K. R. (Co-Principal Investigator), CHAN, A. B. (Co-Investigator), CHAN, N.-H. (Co-Investigator), CHIN, K. S. (Co-Investigator), CHOW, H. A. (Co-Investigator), Chow, W. K. (Co-Investigator), EDESESS, M. (Co-Investigator), GOLDSMAN, D. M. (Co-Investigator), Huang, J. (Co-Investigator), LEE, W. M. (Co-Investigator), LI, L. (Co-Investigator), LI, C. L. (Co-Investigator), LING, M. H. A. (Co-Investigator), LIU, S. (Co-Investigator), MURAKAMI, J. (Co-Investigator), NG, S. Y. S. (Co-Investigator), NI, M. C. (Co-Investigator), TAN, M.H.-Y. (Co-Investigator), Wang, W. (Co-Investigator), Wang, J. (Co-Investigator), WONG, C. K. (Co-Investigator), WONG, S. Y. Z. (Co-Investigator), WONG, S. C. (Co-Investigator), Xu, Z. (Co-Investigator), ZHANG, Z. (Co-Investigator), Zhang, D. (Co-Investigator), ZHAO, J. L. (Co-Investigator) & Zhou, Q. (Co-Investigator)

    1/01/16 → 31/12/21

    Project: Research
