Structured Penalized Logistic Regression for Gene Selection in Gene Expression Data Analysis
Research output: Journal Publications and Reviews (RGC: 21, 22, 62) › 21_Publication in refereed journal › peer-review
Detail(s)
Original language | English |
---|---|
Pages (from-to) | 312-321 |
Journal / Publication | IEEE/ACM Transactions on Computational Biology and Bioinformatics |
Volume | 16 |
Issue number | 1 |
Online published | 30 Oct 2017 |
Publication status | Published - Feb 2019 |
Abstract
In gene expression data analysis, the problems of cancer classification and gene selection are closely related. Successfully selecting informative genes can significantly improve classification performance. Various methods have been proposed to identify informative genes from a large number of candidate genes. However, gene expression data may include important correlation structures, and some genes can be divided into groups based on their biological pathways. Many existing methods do not take the exact correlation structure within the data into consideration. Therefore, from both the knowledge discovery and biological perspectives, an ideal gene selection method should take this structural information into account. Moreover, better generalization performance can be obtained by discovering the correlation structure within the data. To discover structural information in the data and improve learning performance, we propose a structured penalized logistic regression model that simultaneously performs feature selection and model learning for gene expression data analysis. An efficient coordinate descent algorithm is developed to optimize the model. Numerical simulation studies demonstrate that our method is able to select highly correlated features. In addition, results on real gene expression datasets show that the proposed method performs competitively with previous approaches.
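The general idea of penalized logistic regression fitted by coordinate descent can be illustrated with a minimal sketch. The abstract does not state the form of the structured penalty, so the code below substitutes an elastic-net penalty (an L1 term plus a ridge term), which is known to favor keeping correlated features together. The function names, the parameters `lam1`, `lam2`, and `n_sweeps`, and the synthetic data are all assumptions made for illustration; this is not the authors' model or algorithm.

```python
import numpy as np

def sigmoid(z):
    # Clip to avoid overflow in exp for extreme linear predictors.
    return 1.0 / (1.0 + np.exp(-np.clip(z, -30.0, 30.0)))

def soft_threshold(v, t):
    # Proximal operator of t * |.|: shrinks v toward zero, exactly zero if |v| <= t.
    return np.sign(v) * max(abs(v) - t, 0.0)

def objective(X, y, w, b, lam1, lam2):
    # Mean logistic loss (labels in {0, 1}) plus the assumed elastic-net penalty.
    z = X @ w + b
    loss = np.mean(np.logaddexp(0.0, z) - y * z)
    return loss + lam1 * np.abs(w).sum() + 0.5 * lam2 * (w ** 2).sum()

def fit_enet_logistic_cd(X, y, lam1=0.05, lam2=0.05, n_sweeps=200):
    # Cyclic coordinate descent: each coordinate takes a proximal step on a
    # quadratic upper bound of the smooth part (sigmoid' <= 1/4).
    n, d = X.shape
    w = np.zeros(d)
    b = 0.0
    z = np.zeros(n)  # running linear predictor X @ w + b
    # Per-coordinate curvature bounds; the tiny constant guards all-zero columns.
    L = (X ** 2).sum(axis=0) / (4.0 * n) + lam2 + 1e-12
    for _ in range(n_sweeps):
        # Unpenalized intercept: gradient step with curvature bound 1/4.
        step = np.mean(sigmoid(z) - y) / 0.25
        b -= step
        z -= step
        for j in range(d):
            g = X[:, j] @ (sigmoid(z) - y) / n + lam2 * w[j]
            w_new = soft_threshold(w[j] - g / L[j], lam1 / L[j])
            if w_new != w[j]:
                z += (w_new - w[j]) * X[:, j]  # keep predictor in sync
                w[j] = w_new
    return w, b

# Synthetic stand-in for expression data: features 0 and 1 are two noisy
# copies of one latent signal (a highly correlated pair); the rest are noise.
rng = np.random.default_rng(0)
n, d = 200, 30
latent = rng.normal(size=n)
X = rng.normal(size=(n, d))
X[:, 0] = latent + 0.1 * rng.normal(size=n)
X[:, 1] = latent + 0.1 * rng.normal(size=n)
X = (X - X.mean(axis=0)) / X.std(axis=0)  # standardize columns
y = (latent + 0.3 * rng.normal(size=n) > 0).astype(float)

w, b = fit_enet_logistic_cd(X, y, lam1=0.05, lam2=0.05)
print("selected features:", np.flatnonzero(w))
```

In this sketch the L1 term drives most coefficients to exactly zero, which is what performs the feature selection, while the ridge term discourages the fit from arbitrarily dropping one member of a correlated pair. A penalty built from known pathway groupings or an estimated correlation structure would replace the elastic-net term in a model closer to the one the abstract describes.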
Research Area(s)
- Analytical models, Correlation, Data analysis, Data models, Gene expression, Logistics, Microarray, Penalized logistic regression model, Structured penalized regularization
Citation Format(s)
Structured Penalized Logistic Regression for Gene Selection in Gene Expression Data Analysis. / Liu, Cheng; Wong, Hau San.
In: IEEE/ACM Transactions on Computational Biology and Bioinformatics, Vol. 16, No. 1, 02.2019, p. 312-321.