Value Iteration and Data-Driven Optimal Output Regulation of Linear Continuous-Time Systems

Yi Jiang, Weinan Gao*

*Corresponding author for this work

Research output: RGC 32 - Refereed conference paper (with host publication), peer-reviewed

2 Citations (Scopus)

Abstract

This paper investigates the linear optimal output regulation problem (LO2RP) and proposes a reinforcement learning (RL) based approach to learn the optimal regulator. The problem is solved by tackling two optimization problems: a static constrained optimization problem to find the optimal solution to the output regulation equations, and a dynamic programming problem to obtain the optimal feedback control gain. Instead of relying on prior knowledge of the system dynamics and an initial stabilizing feedback control gain, a novel online value iteration (VI) algorithm is proposed that learns the optimal feedback and feedforward control gains from measurable data. Finally, numerical analysis shows that the proposed approach achieves the desired disturbance rejection and tracking performance. © 2022 IEEE.
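The dynamic-programming part of the abstract can be illustrated with a minimal sketch. The paper's algorithm is data-driven and model-free; the version below, by contrast, assumes a known system (A, B) and iterates the value matrix P along the continuous-time Riccati flow, which is the standard model-based form of VI that, notably, does not require an initial stabilizing gain. All matrices, step sizes, and function names here are hypothetical illustrations, not the authors' implementation.

```python
import numpy as np

def value_iteration_lqr(A, B, Q, R, step=0.01, iters=5000):
    """Model-based VI sketch for the LQR sub-problem:
    iterate P <- P + step * (A'P + P A + Q - P B R^{-1} B' P),
    starting from P = 0 (no stabilizing gain needed)."""
    n = A.shape[0]
    P = np.zeros((n, n))
    Rinv = np.linalg.inv(R)
    for _ in range(iters):
        # Riccati residual; the fixed point of the iteration solves the CARE.
        residual = A.T @ P + P @ A + Q - P @ B @ Rinv @ B.T @ P
        P = P + step * residual
    K = Rinv @ B.T @ P  # optimal feedback gain, u = -K x
    return P, K

# Hypothetical second-order example system.
A = np.array([[0.0, 1.0], [-1.0, -0.5]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])
P, K = value_iteration_lqr(A, B, Q, R)
```

The paper replaces the explicit (A, B)-dependent iteration above with updates computed from measured state and input data, so the same fixed point is reached without identifying the model.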
Original language: English
Title of host publication: Proceedings of the 34th Chinese Control and Decision Conference
Publisher: IEEE
Pages: 1509-1514
ISBN (Electronic): 978-1-6654-7896-0
ISBN (Print): 978-1-6654-7897-7
DOIs
Publication status: Published - Aug 2022
Event: 34th Chinese Control and Decision Conference, CCDC 2022 - Virtual, Hefei, China
Duration: 15 Aug 2022 - 17 Aug 2022

Publication series

Name: Proceedings of the Chinese Control and Decision Conference, CCDC
ISSN (Print): 1948-9439
ISSN (Electronic): 1948-9447

Conference

Conference: 34th Chinese Control and Decision Conference, CCDC 2022
Place: China
City: Hefei
Period: 15/08/22 - 17/08/22

Research Keywords

  • optimal output regulation
  • reinforcement learning (RL)
  • value iteration (VI)
