Simulation-Based Decision Making with Covariates


Student thesis: Doctoral Thesis




Awarding Institution
Award date: 27 Jul 2021


Large-scale complex systems are common in systems engineering applications. Such systems usually involve complex operating logic and dynamics and do not satisfy the assumptions of analytical models. As a result, practitioners often resort to simulation to describe the performance of these systems accurately and to make decisions in areas such as manufacturing, supply chains, transportation, healthcare, and finance.

In simulation-based decision making, the common practice is to conduct simulation experiments to collect samples of system performance for each design and to select the best design based on these samples. Two challenges can lead to a poor-quality selection. 1) The simulation samples are stochastic, and the total number of samples (the simulation budget) is often limited. By the law of large numbers, the sample means approach the true performance only as the number of samples grows, so the selection cannot be highly accurate if the designs are not sufficiently simulated, and with a finite simulation budget we can never identify the best design with probability one. 2) The need to evaluate system performance accurately varies among designs. Intuitively, when comparing designs, we can quickly exclude a design whose performance is clearly suboptimal and spend more simulation effort on the designs that, according to the samples, are likely to be the best. To enable high-quality selection, we need effective techniques for determining the number of samples allocated to each design.
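
The first challenge can be illustrated with a small experiment. The following is a minimal sketch, not taken from the thesis: it assumes three hypothetical designs with Gaussian simulation noise, an equal allocation of samples, and that a larger mean is better. It estimates how often the design with the largest sample mean is truly the best.

```python
import random


def simulate(true_mean, rng, noise_sd=1.0):
    """One noisy performance sample (toy Gaussian model, an assumption)."""
    return rng.gauss(true_mean, noise_sd)


def select_best(true_means, n_per_design, rng):
    """Equal allocation: n samples per design, pick the largest sample mean."""
    sample_means = [
        sum(simulate(m, rng) for _ in range(n_per_design)) / n_per_design
        for m in true_means
    ]
    return max(range(len(true_means)), key=lambda i: sample_means[i])


def estimate_pcs(true_means, n_per_design, reps=2000, seed=0):
    """Monte Carlo estimate of the probability of correct selection."""
    rng = random.Random(seed)
    best = max(range(len(true_means)), key=lambda i: true_means[i])
    hits = sum(
        select_best(true_means, n_per_design, rng) == best for _ in range(reps)
    )
    return hits / reps
```

Calling `estimate_pcs([1.0, 0.8, 0.0], n)` for increasing `n` should show the estimate rising with the budget while staying below one, and most of the errors come from the two close designs, which is why spending equal effort on the clearly inferior third design is wasteful.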

In addition to the design, system performance also depends on covariates. Here, covariates are parameters of the system that are not under the decision maker's control. For example, the time period in a transportation system and a patient's biometric information in healthcare services are covariates for the respective decision problems. A single fixed design is generally not the best for all possible covariate values. A transportation system design with a high service rate is favorable in the rush hour but wastes resources in idle periods. In healthcare services, a treatment that works best for most patients might cause serious allergic reactions in some patients.

We refer to the selection of the best design in the presence of covariates as simulation-based decision making with covariates. It aims to select the best design for each covariate value that might appear in practice. This problem is gaining importance because sensors deployed on large-scale systems now provide massive amounts of covariate data. In this thesis, we develop efficient algorithms for simulation-based decision making with covariates.

Specifically, we consider two cases of this problem. In the first case, the covariates take a finite number of values. The goal is to determine the optimal allocation of the simulation budget among the covariate values and designs so as to efficiently identify the best design for each covariate value that might appear. We call this problem contextual ranking and selection (CR&S). We build on the optimal computing budget allocation (OCBA) approach from classic ranking and selection and solve the problem by developing appropriate objective measures, identifying the rate-optimal budget allocation rule, and analyzing the convergence of the selection algorithm. We test the proposed algorithm numerically on a set of abstract problems, and the results demonstrate its superior performance.
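
As a point of reference, the classic OCBA allocation ratios from ordinary ranking and selection can be sketched as below. This is the standard single-problem rule applied independently to each covariate value, not the rate-optimal CR&S rule derived in the thesis. The function name, the covariate labels, the toy means and standard deviations, and the larger-is-better convention are all assumptions made for illustration.

```python
import math


def ocba_fractions(means, stds):
    """Classic OCBA budget fractions for one ranking-and-selection problem.

    Assumes a unique best design (largest mean) and positive standard
    deviations; in practice both inputs are sample estimates.
    """
    k = len(means)
    b = max(range(k), key=lambda i: means[i])  # index of the current best
    ratios = [0.0] * k
    for i in range(k):
        if i != b:
            delta = means[b] - means[i]         # gap to the best design
            ratios[i] = (stds[i] / delta) ** 2  # N_i proportional to (sigma_i / delta_i)^2
    # N_b = sigma_b * sqrt(sum over non-best designs of N_i^2 / sigma_i^2)
    ratios[b] = stds[b] * math.sqrt(
        sum((ratios[i] / stds[i]) ** 2 for i in range(k) if i != b)
    )
    total = sum(ratios)
    return [r / total for r in ratios]


# Hypothetical covariate values, each with (means, stds) for three designs.
contexts = {
    "rush_hour": ([1.0, 0.9, 0.0], [1.0, 1.0, 1.0]),
    "idle": ([0.2, 0.5, 0.4], [1.0, 1.0, 1.0]),
}
allocation = {c: ocba_fractions(m, s) for c, (m, s) in contexts.items()}
```

In this sketch each covariate value receives its own allocation in isolation. Deciding how to split the total budget across covariate values as well as designs is the part that the CR&S formulation addresses and that this example leaves out.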

In the second case, the covariates are continuous. We adopt stochastic kriging (SK) models to learn the relationship between system performance and covariates for each design, and use the SK models to predict the system performance and the best design under a new covariate value. We study how fast the prediction errors converge as the number of sampled covariate points grows, which quantifies the relationship between sampling effort and prediction quality. In particular, we develop measures for assessing the prediction errors and establish convergence rates for these measures under different covariance kernels and conditions.
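
The prediction step can be sketched for a one-dimensional covariate as follows. This is a minimal illustration under stated assumptions rather than the thesis's implementation: it uses a squared exponential kernel with fixed hyperparameters, a known constant trend `beta0`, known simulation noise variances, and a larger-is-better convention. All function names and the `models` layout are hypothetical.

```python
import math


def sq_exp_kernel(x, y, variance=1.0, length_scale=1.0):
    """Squared exponential covariance between two scalar covariate values."""
    return variance * math.exp(-((x - y) ** 2) / (2.0 * length_scale ** 2))


def solve(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    z = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (M[i][n] - tail) / M[i][i]
    return z


def sk_predict(x_new, xs, ybars, noise_vars, reps, beta0=0.0):
    """SK predictor: beta0 + k(x_new)^T (K + Sigma_eps)^{-1} (ybar - beta0).

    Sigma_eps is diagonal with entries noise_vars[i] / reps[i], the variance
    of the sample mean observed at covariate point xs[i].
    """
    n = len(xs)
    K = [
        [
            sq_exp_kernel(xs[i], xs[j])
            + (noise_vars[i] / reps[i] if i == j else 0.0)
            for j in range(n)
        ]
        for i in range(n)
    ]
    weights = solve(K, [y - beta0 for y in ybars])
    return beta0 + sum(
        sq_exp_kernel(x_new, xs[i]) * weights[i] for i in range(n)
    )


def predict_best_design(x_new, models):
    """Fit-free selection: predict every design at x_new, return the largest."""
    preds = {name: sk_predict(x_new, **m) for name, m in models.items()}
    return max(preds, key=preds.get), preds
```

Here `models` maps each design name to a dictionary with the keys `xs`, `ybars`, `noise_vars`, and `reps`. With zero noise the predictor interpolates the observed sample means, and with positive noise it shrinks toward `beta0`, so the prediction error depends on both the number of covariate points and the replications at each point.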

Finally, we test the algorithms developed for the two cases on real problems. We apply the CR&S algorithm and the stochastic kriging-assisted selection algorithm to the prevention of cervical cancer and esophageal adenocarcinoma, two common types of cancer. The numerical results show that the proposed methods are highly efficient in solving simulation-based decision making problems with covariates.