Distributed online bandit optimization under random quantization
Research output: Journal Publications and Reviews (RGC: 21, 22, 62) › 21_Publication in refereed journal › peer-review
Detail(s)
Field | Value |
---|---|
Original language | English |
Article number | 110590 |
Journal / Publication | Automatica |
Volume | 146 |
Online published | 6 Sep 2022 |
Publication status | Published - Dec 2022 |
Abstract
This paper considers distributed online optimization over a network of multiple interacting nodes. Each node in the network is endowed with a sequence of loss functions, each of which is revealed to the node after a decision has been committed. The goal of the network is to minimize the cumulative loss of the nodes in a distributed fashion, subject to two types of information constraints: message quantization and bandit feedback. To this end, a quantized distributed online bandit optimization algorithm is proposed that combines a random quantization operation with a one-point gradient estimator. We show the convergence of the algorithm by establishing an $O(dT^{3/4})$ regret bound, where $d$ is the dimension of the states and $T$ is the total number of rounds. Finally, an online distributed quadratic programming problem is investigated to validate the theoretical findings of the paper.
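The two ingredients named in the abstract can be illustrated generically. The sketch below is not the paper's algorithm: the function names, the uniform-grid quantizer with width `step`, and the exploration radius `delta` are all assumptions chosen for illustration. It shows an unbiased randomized-rounding quantizer and the standard one-point gradient estimator, which queries the loss at a single perturbed point.

```python
import math
import random


def random_quantize(x, step, rng):
    """Randomized rounding of each coordinate to a grid of width `step` (hypothetical helper)."""
    out = []
    for xi in x:
        low = math.floor(xi / step)
        frac = xi / step - low  # fractional position inside the grid cell, in [0, 1)
        # Round up with probability `frac`, so the expectation equals xi.
        out.append(step * (low + (1 if rng.random() < frac else 0)))
    return out


def unit_sphere_sample(d, rng):
    """Uniform direction on the unit sphere, obtained by normalizing a Gaussian vector."""
    v = [rng.gauss(0.0, 1.0) for _ in range(d)]
    norm = math.sqrt(sum(vi * vi for vi in v))
    return [vi / norm for vi in v]


def one_point_gradient(loss, x, delta, rng):
    """One-point estimate (d / delta) * loss(x + delta * u) * u from a single loss value."""
    d = len(x)
    u = unit_sphere_sample(d, rng)
    value = loss([xi + delta * ui for xi, ui in zip(x, u)])
    return [(d / delta) * value * ui for ui in u]


rng = random.Random(0)  # fixed seed, for reproducibility only
print(random_quantize([0.3, -1.26], step=0.5, rng=rng))
print(one_point_gradient(lambda y: sum(v * v for v in y), [1.0, -2.0], delta=0.1, rng=rng))
```

In this sketch the quantizer is unbiased by construction, and the one-point estimate is an unbiased estimate of the gradient of a smoothed version of the loss. Its magnitude scales with `d / delta`, so a smaller `delta` reduces smoothing bias at the cost of higher variance.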
Research Area(s)
- Bandit feedback, Online distributed optimization, Random quantization, Regret bound
Citation Format(s)
Distributed online bandit optimization under random quantization. / Yuan, Deming; Zhang, Baoyong; Ho, Daniel W.C. et al.
In: Automatica, Vol. 146, 110590, 12.2022.