Abstract
Robust sequential decision-making under uncertainty is a fundamental problem in operations research, machine learning, and supply chain management. It involves designing policies that perform well when system dynamics or rewards are not known exactly, but are subject to uncertainty. Such problems arise in many applications, including reinforcement learning and inventory control, where decisions must be made over time without full knowledge. This thesis develops robust models and solution methods to improve decision quality in uncertain and dynamic environments.
The first part of the thesis introduces a distributionally robust Markov decision process (MDP) model based on Wasserstein ambiguity sets. In this model, the transition kernels are treated as random vectors whose distribution is unknown but belongs to a set defined by the Wasserstein distance. We develop efficient algorithms with quasi-linear time complexity to compute the distributionally robust optimal values and corresponding policies. Numerical experiments confirm the efficiency and scalability of our proposed approaches.
Besides the transition kernels, the cost functions are often uncertain and suffer from ambiguity, particularly in constrained MDPs where multiple types of costs must be considered. Moreover, classical constrained MDP models are risk-neutral and may lead to unsafe decisions in deployment. To address these issues, the second part of the thesis proposes a new ambiguous constrained MDP model that incorporates worst-case risk measures under scenario-wise ambiguity sets. We reformulate the model as a convex optimization problem and design a first-order algorithm to solve it efficiently. Numerical results show that our model improves decision reliability, and our algorithm outperforms commercial solvers in terms of computational speed.
As shown in the first two parts of the thesis, practical models often involve multiple sources of uncertainty. In the third part, we study an inventory control problem with both uncertain demand and uncertain yield. While these two types of uncertainty have been examined separately in the literature, our work takes a first step toward addressing them jointly within an adaptive distributionally robust optimization framework. We introduce a new decision rule, called the doubly linear decision rule, which allows for a tractable reformulation of the problem. We assess the performance of this rule through numerical experiments, showing that it achieves both cost efficiency and robustness compared to benchmark strategies, while retaining computational tractability.
In summary, this thesis investigates robust sequential decision-making models under uncertainty. By adopting ambiguous chance constraints and doubly linear decision rules, we contribute to the development of reliable and scalable decision-making methods for dynamic systems affected by multiple sources of uncertainty.
| Date of Award | 29 Sept 2025 |
|---|---|
| Original language | English |
| Awarding Institution | |
| Supervisor | Chin Pang HO (Supervisor) |