Combinatorial Multivariant Multi-Armed Bandits with Applications to Episodic Reinforcement Learning and Beyond

Xutong Liu, Siwei Wang, Jinhang Zuo, Han Zhong, Xuchuang Wang, Zhiyong Wang, Shuai Li, Mohammad Hajiesmaili, John C.S. Lui, Wei Chen

Research output: Chapters, Conference Papers, Creative and Literary Works; RGC 32 - Refereed conference paper (with host publication); peer-review

Abstract

We introduce a novel framework of combinatorial multi-armed bandits (CMAB) with multivariant and probabilistically triggering arms (CMAB-MT), where the outcome of each arm is a d-dimensional multivariant random variable and the feedback follows a general arm triggering process. Compared with existing CMAB works, CMAB-MT not only enhances the modeling power but also allows improved results by leveraging distinct statistical properties of multivariant random variables. For CMAB-MT, we propose a general 1-norm multivariant and triggering probability-modulated smoothness condition, and an optimistic CUCB-MT algorithm built upon this condition. Our framework covers many important problems as applications, such as episodic reinforcement learning (RL) and probabilistic maximum coverage for goods distribution, all of which meet the above smoothness condition and achieve matching or improved regret bounds compared to existing works. Through our new framework, we build the first connection between the episodic RL and CMAB literature by offering a new angle for solving episodic RL through the lens of CMAB, which may encourage more interactions between these two important directions.

Copyright 2024 by the author(s).

Original language: English
Title of host publication: Proceedings of the 41st International Conference on Machine Learning
Pages: 32139-32172
Publication status: Published - Jul 2024
Externally published: Yes
Event: 41st International Conference on Machine Learning, ICML 2024 - Vienna, Austria
Duration: 21 Jul 2024 to 27 Jul 2024
https://proceedings.mlr.press/v235/

Publication series

Name: Proceedings of Machine Learning Research
Volume: 235
ISSN (Print): 2640-3498

Conference

Conference: 41st International Conference on Machine Learning, ICML 2024
Country/Territory: Austria
City: Vienna
Period: 21/07/24 to 27/07/24

Funding

The work of Xutong Liu was done during his visit to the University of Massachusetts Amherst. The work of John C.S. Lui was supported in part by the RGC GRF 14215722. The work of Mohammad Hajiesmaili was supported by CPS-2136199, CNS-2106299, CNS-2102963, CCF-2325956, and CAREER-2045641. The corresponding author, Shuai Li, is supported by the National Science and Technology Major Project (2022ZD0114804) and is partly supported by the Guangdong Provincial Key Laboratory of Mathematical Foundations for Artificial Intelligence (2023B1212010001).
