Identifying Genetic Risk Factors for Alzheimer's Disease via Shared Tree-Guided Feature Learning Across Multiple Tasks

Weizhong Zhang*, Tingjin Luo, Shuang Qiu, Jieping Ye, Deng Cai, Xiaofei He, Jie Wang

*Corresponding author for this work

Research output: Journal Publications and Reviews > RGC 21 - Publication in refereed journal > peer-review

8 Citations (Scopus)

Abstract

The genome-wide association study (GWAS) is a popular approach to identify disease-associated genetic factors for Alzheimer's Disease (AD). However, it remains challenging because of the small number of samples, very high feature dimensionality and complex structures. To accurately identify genetic risk factors for AD, we propose a novel method based on an in-depth exploration of the hierarchical structure among the features and the commonality across related tasks. Specifically, we first extract and encode the tree hierarchy among features; then, we integrate the tree structures with multi-task feature learning (MTFL) to learn the shared features - that are predictive of AD - among related tasks simultaneously. Thus, we can unify the strength of both the prior structure information and MTFL to boost the prediction performance. However, due to the highly complex regularizer that encodes the tree structure and the extremely high feature dimensionality, the learning process can be computationally prohibitive. To address this, we further develop a novel safe screening rule to quickly identify and remove the irrelevant features before training. Experimental results demonstrate that the proposed approach significantly outperforms the state-of-the-art in detecting genetic risk factors of AD and the speedup gained by the proposed screening can be several orders of magnitude. © 1989-2012 IEEE.
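
The abstract does not spell out the regularizer, so the following is only a minimal sketch of what a tree-guided penalty shared across tasks might look like, assuming the generic tree-structured group Lasso form: each tree node groups a set of features, and the penalty sums a weighted Frobenius norm of the corresponding rows of the weight matrix over all tasks. The function name `tree_multitask_penalty`, the node weights, and the toy tree are hypothetical and are not taken from the paper.

```python
import math

def tree_multitask_penalty(W, tree_nodes):
    """Sum of weighted group norms over the nodes of a feature tree.

    W          : list of rows, one row per feature, one column per task.
    tree_nodes : list of (feature_indices, weight) pairs, one per tree node
                 (hypothetical encoding; the paper's encoding may differ).
    """
    total = 0.0
    for indices, weight in tree_nodes:
        # Squared Frobenius norm of the rows in this node, across all tasks.
        sq = sum(value ** 2 for i in indices for value in W[i])
        total += weight * math.sqrt(sq)
    return total

# Toy example: 3 features, 2 tasks.
W = [[3.0, 4.0],
     [0.0, 0.0],
     [1.0, 0.0]]

# Root covers all features, one internal node covers features 0 and 1,
# and each leaf covers a single feature. All weights are set to 1 here.
tree_nodes = [
    ([0, 1, 2], 1.0),
    ([0, 1], 1.0),
    ([0], 1.0),
    ([1], 1.0),
    ([2], 1.0),
]

penalty = tree_multitask_penalty(W, tree_nodes)
print(penalty)
```

Because every node's norm is taken over all task columns at once, a feature whose row is zero in every task (feature 1 above) contributes nothing at its leaf. That is the sense in which such a penalty favors features shared across tasks, and rows driven to zero in this way are the kind of irrelevant features a screening rule would aim to discard before training.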
Original language: English
Article number: 8317000
Pages (from-to): 2145-2156
Journal: IEEE Transactions on Knowledge and Data Engineering
Volume: 30
Issue number: 11
DOIs
Publication status: Published - 1 Nov 2018
Externally published: Yes

Bibliographical note

Publication details (e.g. title, author(s), publication statuses and dates) are captured on an “AS IS” and “AS AVAILABLE” basis at the time of record harvesting from the data source. Suggestions for further amendments or supplementary information can be sent to [email protected].

Research Keywords

  • Alzheimer's disease
  • genome-wide association studies
  • multi-task learning
  • screening
  • Tree-structured group Lasso
