A Policy Optimization Method Towards Optimal-time Stability
Research output: Chapters, Conference Papers, Creative and Literary Works › RGC 32 - Refereed conference paper (with host publication) › peer-review
Detail(s)
Original language | English |
---|---|
Title of host publication | Proceedings of The 7th Conference on Robot Learning |
Editors | Jie Tan, Marc Toussaint, Kourosh Darvish |
Publisher | ML Research Press |
Number of pages | 29 |
Publication status | Published - 2023 |
Publication series
Name | Proceedings of Machine Learning Research |
---|---|
Volume | 229 |
ISSN (Print) | 2640-3498 |
Conference
Title | 2023 Conference on Robot Learning (CoRL 2023) |
---|---|
Place | United States |
City | Atlanta |
Period | 6 - 9 November 2023 |
Link(s)
Document Link | Links |
---|---|
Link to Scopus | https://www.scopus.com/record/display.uri?eid=2-s2.0-85184352811&origin=recordpage |
Permanent Link | https://scholars.cityu.edu.hk/en/publications/publication(f57b44c2-690f-43c4-954b-efe1d6200d84).html |
Abstract
In current model-free reinforcement learning (RL) algorithms, stability criteria based on sampling methods are commonly utilized to guide policy optimization. However, these criteria only guarantee the infinite-time convergence of the system's state to an equilibrium point, which leads to sub-optimality of the policy. In this paper, we propose a policy optimization technique incorporating sampling-based Lyapunov stability. Our approach enables the system's state to reach an equilibrium point within an optimal time and maintain stability thereafter, referred to as "optimal-time stability". To achieve this, we integrate the optimization method into the Actor-Critic framework, resulting in the development of the Adaptive Lyapunov-based Actor-Critic (ALAC) algorithm. Through evaluations conducted on ten robotic tasks, our approach outperforms previous studies significantly, effectively guiding the system to generate stable patterns. © 2023 Proceedings of Machine Learning Research. All Rights Reserved.
Research Area(s)
- Reinforcement Learning, Robotic Control, Stability
Bibliographic Note
Research Unit(s) information for this publication is provided by the author(s) concerned.
Citation Format(s)
A Policy Optimization Method Towards Optimal-time Stability. / Wang, Shengjie; Lan, Fengbo; Zheng, Xiang et al.
Proceedings of The 7th Conference on Robot Learning. ed. / Jie Tan; Marc Toussaint; Kourosh Darvish. ML Research Press, 2023. (Proceedings of Machine Learning Research; Vol. 229).