Efficient Learning of Causal Networks
因果網絡圖的快速學習方法
Student thesis: Doctoral Thesis
Author(s)
Related Research Unit(s)
Detail(s)
Awarding Institution | |
---|---|
Supervisors/Advisors |
|
Award date | 5 Oct 2023 |
Link(s)
Permanent Link | https://scholars.cityu.edu.hk/en/theses/theses(a6c1fd26-ba30-4fcf-8a9a-059cdfa038df).html |
---|---|
Other link(s) | Links |
Abstract
Causal networks, named by directed acyclic graphs (DAG), provide the graphical representation to detect the causal relationships from observational data, which has great advantages over interventions or randomized experiments, and attracts much attention in many real applications, including economics, environmental modelling, computer science, and biology. Although much progress has been made to learn the causal structures based on the graphical modelling during the last two decades, challenges arising from the existing approaches remain persistently unsolved. First, the identifiability issue of directed acyclic graphs heavily depends on the underdeterministic topological ordering, which leads to low efficiency and precision. Second, the structure learning methods are faced with difficulties arising from the large search space, which requires extensive computation and suffers from the curse of dimensionality. Therefore, the statistical learning of causal networks considerably call for efficient approaches.
In the first part of this thesis, we study a special class of non-Gaussian DAG models, where the conditional variance of each node given its parents is a quadratic function of its conditional mean. Such a class of non-Gaussian DAG models are fairly flexible and admit many popular distributions as special cases, including Poisson, Binomial, Geometric, Exponential, and Gamma. To facilitate learning, we introduce a novel concept of topological layers, which ensures that descendants are located in the lower layers behind ancestors and thus guarantees acyclicity. We develop an efficient DAG learning algorithm, where it first reconstructs the topological layers in a hierarchical fashion based on the proposed ratio-based measures and then recovers the directed edges between nodes in different layers, which requires much less computational cost than most existing algorithms in literature. Its superior performance and high efficiency are also demonstrated in a number of simulated examples, as well as the applications to two real-life datasets, including an NBA player statistics data and a cosmetic sales data collected by Alibaba. We also establish the theoretical results that the proposed method can consistently recover the underlying DAG under mild conditions.
In the second part of this thesis, we propose an efficient method to learn linear heavy-tailed DAGs where the number of nodes diverges as the sample size increases at some certain rate. The proposed method leverages the concept of topological layers to facilitate learning in a two-step algorithm where we first reconstruct the topological layers hierarchically in a top-down fashion and recover the directed edges via modified conditional independent tests for heavy-tailed distributions. The theoretical results and extensive simulations show that the proposed algorithm is statistically consistent in the moderate dimensional case. Finally, we demonstrate our proposed method through the financial exchange rates data example, and then discover the source and pathways of financial contagions for risk diversification.
In the end, we summarize our contributions and discuss some future work. Some possible research directions include generalizing the current model settings to unmeasured confounders and cyclic graphs.
In the first part of this thesis, we study a special class of non-Gaussian DAG models, where the conditional variance of each node given its parents is a quadratic function of its conditional mean. Such a class of non-Gaussian DAG models are fairly flexible and admit many popular distributions as special cases, including Poisson, Binomial, Geometric, Exponential, and Gamma. To facilitate learning, we introduce a novel concept of topological layers, which ensures that descendants are located in the lower layers behind ancestors and thus guarantees acyclicity. We develop an efficient DAG learning algorithm, where it first reconstructs the topological layers in a hierarchical fashion based on the proposed ratio-based measures and then recovers the directed edges between nodes in different layers, which requires much less computational cost than most existing algorithms in literature. Its superior performance and high efficiency are also demonstrated in a number of simulated examples, as well as the applications to two real-life datasets, including an NBA player statistics data and a cosmetic sales data collected by Alibaba. We also establish the theoretical results that the proposed method can consistently recover the underlying DAG under mild conditions.
In the second part of this thesis, we propose an efficient method to learn linear heavy-tailed DAGs where the number of nodes diverges as the sample size increases at some certain rate. The proposed method leverages the concept of topological layers to facilitate learning in a two-step algorithm where we first reconstruct the topological layers hierarchically in a top-down fashion and recover the directed edges via modified conditional independent tests for heavy-tailed distributions. The theoretical results and extensive simulations show that the proposed algorithm is statistically consistent in the moderate dimensional case. Finally, we demonstrate our proposed method through the financial exchange rates data example, and then discover the source and pathways of financial contagions for risk diversification.
In the end, we summarize our contributions and discuss some future work. Some possible research directions include generalizing the current model settings to unmeasured confounders and cyclic graphs.
- Causality, Directed acyclic graph, Identifiablity, Topological layers, Structural equation model