
Robust φ-Divergence MDPs

Research output: Chapters, Conference Papers, Creative and Literary Works > RGC 32 - Refereed conference paper (with host publication) > peer-review

Abstract

In recent years, robust Markov decision processes (MDPs) have emerged as a prominent modeling framework for dynamic decision problems affected by uncertainty. In contrast to classical MDPs, which only account for stochasticity by modeling the dynamics through a stochastic process with a known transition kernel, robust MDPs additionally account for ambiguity by optimizing in view of the most adverse transition kernel from a prescribed ambiguity set. In this paper, we develop a novel solution framework for robust MDPs with s-rectangular ambiguity sets that decomposes the problem into a sequence of robust Bellman updates and simplex projections. Exploiting the rich structure present in the simplex projections corresponding to φ-divergence ambiguity sets, we show that the associated s-rectangular robust MDPs can be solved substantially faster than with state-of-the-art commercial solvers as well as a recent first-order solution scheme, thus rendering them attractive alternatives to classical MDPs in practical applications. © 2022 Neural information processing systems foundation. All rights reserved.
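
To make the idea of optimizing against the most adverse transition kernel concrete, the following is a minimal sketch of robust value iteration. It is not the solution framework developed in the paper: for simplicity it assumes (s,a)-rectangular ambiguity sets (not the s-rectangular sets studied here), uses the Kullback-Leibler divergence as one example of a φ-divergence, and solves the inner worst-case problem through its standard one-dimensional dual instead of a simplex projection. The names `worst_case_expectation_kl`, `robust_value_iteration`, `P0`, `R`, and `kappa` are hypothetical and chosen for illustration only.

```python
import numpy as np


def worst_case_expectation_kl(v, p0, kappa, iters=60):
    """Approximate min_p p @ v over {p in simplex : KL(p || p0) <= kappa}.

    Uses the scalar dual  max_{lam > 0} -lam * log E_p0[exp(-v / lam)] - lam * kappa,
    maximized by ternary search (the dual objective is concave in lam).
    """
    assert kappa > 0
    v = np.asarray(v, dtype=float)
    p0 = np.asarray(p0, dtype=float)
    support = p0 > 0  # a KL-feasible p cannot put mass outside the support of p0
    v, p0 = v[support], p0[support]
    v_min, v_max = v.min(), v.max()
    if v_max - v_min < 1e-12:
        return float(v_min)  # constant values: every distribution gives v_min

    def dual(lam):
        z = -(v - v_min) / lam  # shifted by v_min so that exp never overflows
        return v_min - lam * np.log(np.sum(p0 * np.exp(z))) - lam * kappa

    # The dual maximizer is at most (v_max - v_min) / kappa.
    lo, hi = 1e-10, max((v_max - v_min) / kappa, 2e-10)
    for _ in range(iters):
        m1, m2 = lo + (hi - lo) / 3, hi - (hi - lo) / 3
        if dual(m1) < dual(m2):
            lo = m1
        else:
            hi = m2
    # The true worst case is never below v_min, so clamping is safe.
    return float(max(v_min, dual(0.5 * (lo + hi))))


def robust_value_iteration(P0, R, gamma, kappa, tol=1e-6, max_iter=10_000):
    """Iterate the robust Bellman update v(s) = max_a [R(s,a) + gamma * worst case of p @ v].

    P0: nominal kernel of shape (S, A, S); R: rewards of shape (S, A).
    Assumption: one independent KL ball of radius kappa per state-action pair.
    """
    S, A, _ = P0.shape
    v = np.zeros(S)
    for _ in range(max_iter):
        q = np.array([[R[s, a] + gamma * worst_case_expectation_kl(v, P0[s, a], kappa)
                       for a in range(A)] for s in range(S)])
        v_new = q.max(axis=1)
        if np.max(np.abs(v_new - v)) < tol:
            v = v_new
            break
        v = v_new
    return v, q.argmax(axis=1)


# Toy instance with made-up numbers: 2 states, 2 actions.
P0 = np.array([[[0.9, 0.1], [0.5, 0.5]],
               [[0.2, 0.8], [0.6, 0.4]]])
R = np.array([[1.0, 0.5],
              [0.0, 2.0]])
v_robust, policy = robust_value_iteration(P0, R, gamma=0.9, kappa=0.1)
print("robust values:", v_robust, "greedy policy:", policy)
```

Because any positive dual variable yields a lower bound on the worst-case expectation, the inner routine errs on the conservative side when the search stops early. Increasing `kappa` enlarges the ambiguity set and can only lower the robust values.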
Original language: English
Title of host publication: Thirty-Sixth Conference on Neural Information Processing Systems, NeurIPS 2022
Editors: S. Koyejo, S. Mohamed, A. Agarwal, D. Belgrave, K. Cho, A. Oh
Publisher: Neural Information Processing Systems (NeurIPS)
Number of pages: 14
ISBN (Print): 9781713871088
Publication status: Published - Nov 2022
Event: 36th Conference on Neural Information Processing Systems (NeurIPS 2022) - Hybrid, New Orleans Convention Center, New Orleans, United States
Duration: 28 Nov 2022 to 9 Dec 2022
https://neurips.cc/
https://nips.cc/Conferences/2022
https://proceedings.neurips.cc/paper_files/paper/2022

Publication series

Name: Advances in Neural Information Processing Systems
Volume: 35
ISSN (Print): 1049-5258

Conference

Conference: 36th Conference on Neural Information Processing Systems (NeurIPS 2022)
Abbreviated title: NIPS '22
Place: United States
City: New Orleans
Period: 28/11/22 to 9/12/22

Funding

This work was supported, in part, by the Engineering and Physical Sciences Research Council (EPSRC) grant EP/W003317/1, by the CityU Start-Up Grant (Project No. 9610481), by the National Natural Science Foundation of China (Project No. 72032005), by the Chow Sang Sang Group Research Fund sponsored by Chow Sang Sang Holdings International Limited (Project No. 9229076), and by the NSF grant No. 1815275. Any opinions, findings, conclusions, or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the Engineering and Physical Sciences Research Council and the National Natural Science Foundation of China.
