Distributed online bandit optimization under random quantization

Research output: Journal Publications and Reviews (RGC: 21, 22, 62) > 21_Publication in refereed journal > peer-review

1 Scopus Citations

Detail(s)

Original language: English
Article number: 110590
Journal / Publication: Automatica
Volume: 146
Online published: 6 Sep 2022
Publication status: Published - Dec 2022

Abstract

This paper considers distributed online optimization over a network of multiple interacting nodes. Each node is endowed with a sequence of loss functions, each of which is revealed to the node only after a decision has been committed. The goal of the network is to minimize the cumulative loss of the nodes in a distributed fashion, subject to two types of information constraints: message quantization and bandit feedback. To this end, a quantized distributed online bandit optimization algorithm is proposed that combines a random quantization operation with a one-point gradient estimator. We show the convergence of the algorithm by establishing an O(dT^{3/4}) regret bound, where d is the dimension of the states and T is the total number of rounds. Finally, an online distributed quadratic programming problem is investigated to validate the theoretical findings of the paper.
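The two building blocks named in the abstract can be illustrated with a minimal Python sketch. This is not the paper's algorithm: the abstract does not specify the quantizer or the estimator parameters, so the grid spacing `step`, the exploration radius `delta`, and the function names `random_quantize` and `one_point_gradient` are assumptions chosen for illustration. The quantizer shown is a standard unbiased randomized rounding scheme, and the estimator is the standard one-point form that queries the loss at a single perturbed point.

```python
import numpy as np


def random_quantize(x, step=0.1, rng=None):
    """Randomized rounding of x onto the grid {k * step} (assumed quantizer).

    Each entry is rounded up with probability equal to its fractional
    position between the two neighbouring grid points, so the quantized
    value equals x in expectation.
    """
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x, dtype=float)
    scaled = x / step
    low = np.floor(scaled)
    round_up = rng.random(x.shape) < (scaled - low)
    return step * (low + round_up)


def one_point_gradient(loss, x, delta, rng=None):
    """One-point gradient estimate from a single loss evaluation.

    Draws u uniformly from the unit sphere and returns
    (d / delta) * loss(x + delta * u) * u, where d is the dimension of x.
    Only the scalar value loss(x + delta * u) is observed (bandit feedback).
    """
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x, dtype=float)
    d = x.size
    u = rng.standard_normal(d)
    u /= np.linalg.norm(u)  # normalised Gaussian is uniform on the sphere
    return (d / delta) * loss(x + delta * u) * u
```

In a quantized distributed scheme of this kind, a node would typically send `random_quantize(x)` to its neighbours in place of its exact state `x`, and would use `one_point_gradient` in place of the true gradient when updating its decision.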

Research Area(s)

  • Bandit feedback, Online distributed optimization, Random quantization, Regret bound

Citation Format(s)

Distributed online bandit optimization under random quantization. / Yuan, Deming; Zhang, Baoyong; Ho, Daniel W.C. et al.

In: Automatica, Vol. 146, 110590, 12.2022.
