Computing Stationary Equilibria and Their Refinements in Stochastic Games with Differentiable Homotopy Methods
求解隨機博弈中平穩均衡及其精煉的可微同輪算法
Student thesis: Doctoral Thesis
Author(s)
Detail(s)
Awarding Institution | |
---|---|
Supervisors/Advisors |
|
Award date | 26 Jul 2022 |
Link(s)
Permanent Link | https://scholars.cityu.edu.hk/en/theses/theses(fff09b8d-649b-450f-8bbd-7a617419f68d).html |
---|---|
Other link(s) | Links |
Abstract
The concept of the stochastic game serves as a powerful mechanism for strategic interaction analysis in a dynamic environment with conflicts of interest. However, the structure of a stochastic game is very complicated, making it difficult to investigate and compute its equilibria. The notion of the subgame perfect equilibrium in stationary strategies (SSPE) provides a rational and tractable equilibrium form and has been confirmed as one of the most crucial solution concepts in stochastic games.
Although the computation of an SSPE and its refinements is significant in applications of stochastic games, it remains a challenging problem in the literature. One of the very few broadly accepted approaches, specifically for computing SSPEs, is the stochastic linear tracing procedure (SLTP), which was proposed by Herings and Peeters in 2004. Notwithstanding, the SLTP has several limitations in computation. This thesis designs several more effective and efficient methods to compute SSPEs, which can remedy the existing issues with the SLTP. Furthermore, with regard to a stochastic game with multiple SSPEs, we develop several smooth homotopy methods to eliminate some of the counterintuitive equilibria and select a refined stationary equilibrium.
In the existing standard SLTP, all the players are given a prior belief and solve a Markov decision problem (MDP) to obtain their optimal strategy. The starting point of the standard SLTP is the combination of the solutions to those MDPs that need to be explicitly computed using additional algorithms, such as the value iteration algorithm and policy iteration algorithm. In this thesis, we formulate an arbitrary starting linear tracing procedure (ASLTP) by extending the range of the homotopy variable $t$ to $[0,2]$. The starting point of the ASLTP is an arbitrary point at $t=2$. Following the smooth path derived from the ASLTP, one can readily attain the starting point for the standard SLTP as $t$ decreases to one, which requires nearly negligible computational time. As $t$ varies between zero and one, the ASLTP essentially becomes identical to the standard SLTP and eventually converges to an SSPE for the stochastic game of interest. The ASLTP also employs the prior beliefs and preserves the equilibrium selection interpretation possessed by the standard SLTP. The design of the arbitrary starting point integrates the whole computation process, and no additional operation is therefore required for obtaining the starting point. Moreover, a slight perturbation to the homotopy system ensures that the ASLTP is applicable to all stochastic games while the standard SLTP only works for generic cases.
Although the SLTP has been proven as a globally convergent method to compute SSPEs, it has not been designed to achieve the highest numerical efficiency. The homotopy path of the SLTP or ASLTP is only piecewise smooth, and one has to devise a smoothing technique to avoid switching between different systems of equations when following the path, which may lead to additional computational costs. In this thesis, we incorporate the idea of interior-point methods into differentiable homotopy methods and develop an interior-point differentiable path-following method (IPM) to compute SSPEs. Similar to the ASLTP, the IPM is applicable to all stochastic games, and one can arbitrarily choose the starting strategy profile. Furthermore, the IPM brings several other advantages over the existing methods. On the one hand, the homotopy path of the IPM is forced to stay in the interior of its domain so that one does not need to switch between different systems of equations to follow the path. Numerical comparisons show that the IPM is more than three times as efficient as the ASLTP. On the other hand, a stochastic game can be reformulated as a mixed complementarity problem and solved by the PATH solver. We employ the IPM and the PATH solver to solve a number of the same stochastic games. The numerical results reveal that, for some stochastic games, the PATH solver may fail to find an SSPE, while the IPM is successful in doing so for all stochastic games, which confirms the reliability and stability of the IPM.
It is known that a multiplicity of equilibrium points frequently occurs in stochastic games, and some of them do not coincide with our intuition about the outcome of a game. To define a more plausible equilibrium notion, we extend the perfectness and properness concepts for strategic games to stochastic games and formulate two strict refinements of stationary equilibrium: perfect stationary equilibrium (PeSE) and proper stationary equilibrium (PrSE). The computation of the refined stationary equilibria so far remains an open question in the literature. To fill this gap, in this thesis, we form several effective and efficient homotopy methods to compute PeSEs and PrSEs. The key idea is to design a continuously differentiable function of the homotopy variable $t$. The function value remains zero when $t$ varies between zero and a small number. A well-chosen transformation of variables used in the proposed methods resolves the conflict between the interior requirement of the homotopy process and the perfectness or properness criterion. Moreover, for the properness scenario, we propose a compact formulation to reduce the number of constraints in the perturbed strategy space, which provides the insights required to compute proper stationary equilibria for practical problems that can be modeled as stochastic games with a large number of actions. Extensive numerical experiments further affirm the effectiveness and efficiency of the proposed methods.
Although the computation of an SSPE and its refinements is significant in applications of stochastic games, it remains a challenging problem in the literature. One of the very few broadly accepted approaches, specifically for computing SSPEs, is the stochastic linear tracing procedure (SLTP), which was proposed by Herings and Peeters in 2004. Notwithstanding, the SLTP has several limitations in computation. This thesis designs several more effective and efficient methods to compute SSPEs, which can remedy the existing issues with the SLTP. Furthermore, with regard to a stochastic game with multiple SSPEs, we develop several smooth homotopy methods to eliminate some of the counterintuitive equilibria and select a refined stationary equilibrium.
In the existing standard SLTP, all the players are given a prior belief and solve a Markov decision problem (MDP) to obtain their optimal strategy. The starting point of the standard SLTP is the combination of the solutions to those MDPs that need to be explicitly computed using additional algorithms, such as the value iteration algorithm and policy iteration algorithm. In this thesis, we formulate an arbitrary starting linear tracing procedure (ASLTP) by extending the range of the homotopy variable $t$ to $[0,2]$. The starting point of the ASLTP is an arbitrary point at $t=2$. Following the smooth path derived from the ASLTP, one can readily attain the starting point for the standard SLTP as $t$ decreases to one, which requires nearly negligible computational time. As $t$ varies between zero and one, the ASLTP essentially becomes identical to the standard SLTP and eventually converges to an SSPE for the stochastic game of interest. The ASLTP also employs the prior beliefs and preserves the equilibrium selection interpretation possessed by the standard SLTP. The design of the arbitrary starting point integrates the whole computation process, and no additional operation is therefore required for obtaining the starting point. Moreover, a slight perturbation to the homotopy system ensures that the ASLTP is applicable to all stochastic games while the standard SLTP only works for generic cases.
Although the SLTP has been proven as a globally convergent method to compute SSPEs, it has not been designed to achieve the highest numerical efficiency. The homotopy path of the SLTP or ASLTP is only piecewise smooth, and one has to devise a smoothing technique to avoid switching between different systems of equations when following the path, which may lead to additional computational costs. In this thesis, we incorporate the idea of interior-point methods into differentiable homotopy methods and develop an interior-point differentiable path-following method (IPM) to compute SSPEs. Similar to the ASLTP, the IPM is applicable to all stochastic games, and one can arbitrarily choose the starting strategy profile. Furthermore, the IPM brings several other advantages over the existing methods. On the one hand, the homotopy path of the IPM is forced to stay in the interior of its domain so that one does not need to switch between different systems of equations to follow the path. Numerical comparisons show that the IPM is more than three times as efficient as the ASLTP. On the other hand, a stochastic game can be reformulated as a mixed complementarity problem and solved by the PATH solver. We employ the IPM and the PATH solver to solve a number of the same stochastic games. The numerical results reveal that, for some stochastic games, the PATH solver may fail to find an SSPE, while the IPM is successful in doing so for all stochastic games, which confirms the reliability and stability of the IPM.
It is known that a multiplicity of equilibrium points frequently occurs in stochastic games, and some of them do not coincide with our intuition about the outcome of a game. To define a more plausible equilibrium notion, we extend the perfectness and properness concepts for strategic games to stochastic games and formulate two strict refinements of stationary equilibrium: perfect stationary equilibrium (PeSE) and proper stationary equilibrium (PrSE). The computation of the refined stationary equilibria so far remains an open question in the literature. To fill this gap, in this thesis, we form several effective and efficient homotopy methods to compute PeSEs and PrSEs. The key idea is to design a continuously differentiable function of the homotopy variable $t$. The function value remains zero when $t$ varies between zero and a small number. A well-chosen transformation of variables used in the proposed methods resolves the conflict between the interior requirement of the homotopy process and the perfectness or properness criterion. Moreover, for the properness scenario, we propose a compact formulation to reduce the number of constraints in the perturbed strategy space, which provides the insights required to compute proper stationary equilibria for practical problems that can be modeled as stochastic games with a large number of actions. Extensive numerical experiments further affirm the effectiveness and efficiency of the proposed methods.