Structured Sparse Learning and Its Applications
結構性稀疏模型及其應用的研究
Student thesis: Doctoral Thesis
Detail(s)
Awarding Institution | |
---|---|
Supervisors/Advisors | |
Award date | 22 Mar 2018 |
Link(s)
Permanent Link | https://scholars.cityu.edu.hk/en/theses/theses(b47bb29b-a252-4a5c-82d9-c10c5f60a817).html |
---|---|
Abstract
Structured sparsity has become an important topic in machine learning and statistical learning. In some real-world applications, structured sparse learning algorithms achieve better generalization performance by selecting important factors and discovering structural information in the data. This thesis develops structured sparsity models that address real-world applications by exploiting structural information within the data. More specifically, we develop three structured sparse learning algorithms and address their related applications:
1. In some real-world applications, data may contain important correlation structures. For example, in gene expression data analysis, some genes can be divided into groups based on their biological pathways. To this end, we propose a sparse logistic regression model with a structured penalty for feature selection, which is able to identify the unknown correlation structure within the data. We evaluate this model on gene expression data analysis.
2. Multi-task feature selection methods have become increasingly important for many real-world applications, especially in high-dimensional settings. In this thesis, we propose a graph-guided approach to multi-task feature selection. Unlike existing multi-task feature selection methods, our approach assumes that only related tasks share similar features and that each task can also be associated with a few task-specific features.
3. Most existing multi-task learning methods assume that all tasks are positively correlated and exploit the structures shared among tasks to improve learning performance. However, some real-world applications also exhibit competitive structures (negative relationships) among tasks, and conventional multi-task learning methods that exploit shared structures across tasks may perform unsatisfactorily in this setting. Another challenge in some multi-task learning applications, especially in high-dimensional settings, is to exclude irrelevant features from the final model (sparse structure). For this purpose, we propose a new method, referred to as Sparse Exclusive Lasso (SpEL), which captures the competitive relationships among tasks while removing from the final model the unimportant features that are common across tasks.
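To make the contrast between shared and competitive structures concrete, the following is a minimal Python sketch of two standard penalties from the multi-task literature. It is an illustration under stated assumptions, not the formulation used in the thesis: the abstract does not give the SpEL objective, and the function names, the weight layout `W[j][t]` (feature `j`, task `t`), and the toy matrices are all hypothetical. The exclusive-lasso-style term sums, over features, the squared l1 norm of each feature's weights across tasks, so it penalizes several tasks using the same feature. The row-wise group term sums the l2 norms of the rows, so it favors dropping a feature from all tasks at once.

```python
import math


def exclusive_lasso_penalty(W):
    """Sum over features of (sum over tasks of |W[j][t]|) squared.

    Assumed layout: W[j][t] is the weight of feature j in task t.
    """
    return sum(sum(abs(w) for w in row) ** 2 for row in W)


def row_group_penalty(W):
    """Sum over features of the l2 norm of that feature's weights across tasks."""
    return sum(math.sqrt(sum(w * w for w in row)) for row in W)


# Two toy weight matrices (2 features x 2 tasks) with the same total |weight|.
W_shared = [[1.0, 1.0],      # both tasks rely on feature 0
            [0.0, 0.0]]      # feature 1 is unused by every task
W_competing = [[1.0, 0.0],   # task 0 relies on feature 0
               [0.0, 1.0]]   # task 1 relies on feature 1

for name, W in [("shared", W_shared), ("competing", W_competing)]:
    print(name, exclusive_lasso_penalty(W), row_group_penalty(W))
```

In this toy case the exclusive-lasso-style term is smaller for the competing pattern, while the row-wise group term is smaller for the shared pattern. A method that must handle both competition among tasks and removal of features irrelevant to every task would need to balance terms of these two kinds; how the thesis does so is not stated in the abstract.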