Essays on the Thresholding Bandit Problem and Simulation Input Uncertainty


Student thesis: Doctoral Thesis

View graph of relations


Related Research Unit(s)


Awarding Institution
Award date23 Aug 2019


In this thesis, we study two problems in the context of estimation for stochastic systems with noise, including the thresholding bandit problem (TBP) where one aims to identify the systems with means greater than a given threshold, and an input uncertainty problem where one wants to quantify the uncertainty of simulation outputs resulting from misspecification of input distributions due to limited data.

The TBP is a variant of the multi-armed bandit (MAB) problem that attempts to maximize the total reward by carefully balancing the trade-off between exploration and exploitation. Unlike the traditional MAB problem, the TBP is a pure exploration problem, aiming at identifying as accurately as possible the systems with means greater than the given threshold with a fixed amount of sampling budget. While traditional MAB has been intensively studied in the literature, the research on the TBP remains relatively underdeveloped. In the first essay of this thesis, a parameter-free policy for the TBP is proposed based on large-deviation theory, referred to as a large deviation (LD) algorithm. We show that the LD algorithm is asymptotically optimal in the sense that its probability of incorrect identification achieves the fastest rate of convergence as sample size goes to infinity. However, the LD algorithm requires solving an optimization problem to obtain the large deviation decay rate, which may lead to large errors when sample size is small. As an alternative, we further propose an approximate large-deviation (ALD) algorithm based on an approximation of the decay rate using sample means and sample variances which is easier to implement. Numerical experiments show that both algorithms may outperform existing ones in many cases.

In the second essay, we study how to quantify simulation input uncertainty when the input distributions to a simulation model is obtained by fitting to a limited amount of data. We propose algorithms to construct asymptotically valid confidence intervals (CIs) which take into account input uncertainty based on a bootstrap resampling procedure. The proposed algorithms can be applied to both the cases of parameter uncertainty and distributional uncertainty. We establish statistical guarantees for the proposed algorithms, including their central limit theorems, and consistency of variance estimators. Numerical examples show that the proposed algorithms work well and produce asymptotically CIs that take input uncertainty into consideration in simulation estimation.

    Research areas

  • thresholding bandit problem, exploration problem, asymptotically optimal, fixed budget allocation, parameter free, input uncer tainty, confidence intervals, bootstrap resampling procedure, parameter uncertainty, distributional uncertainty