Mutation region detection for closely related individuals without a known pedigree

Research output: Journal Publications and Reviews; RGC 21 - Publication in refereed journal; peer-review

5 Scopus Citations

Detail(s)

Original language: English
Article number: 6051420
Pages (from-to): 499-510
Journal / Publication: IEEE/ACM Transactions on Computational Biology and Bioinformatics
Volume: 9
Issue number: 2
Publication status: Published - 2012

Abstract

Linkage analysis serves as a way of finding the locations of genes that cause genetic diseases. Linkage studies have facilitated the identification of several hundred human genes that can harbor mutations that by themselves lead to a disease phenotype. The fundamental problem in linkage analysis is to identify regions whose allele is shared by all or almost all affected members but by none or few unaffected members. Almost all existing methods for linkage analysis are for families with clearly given pedigrees. Little work has been done for the case where the sampled individuals are closely related but their pedigree is not known. This situation often arises when the individuals share a common ancestor from at least six generations ago. Solving this case will tremendously extend the use of linkage analysis for finding genes that cause genetic diseases. In this paper, we propose a mathematical model (the shared center problem) for inferring the allele-sharing status of a given set of individuals using a database of confirmed haplotypes as reference. We show the NP-completeness of the shared center problem and present a ratio-2 polynomial-time approximation algorithm for its minimization version (called the closest shared center problem). We then convert the approximation algorithm into a heuristic algorithm for the shared center problem. Based on this heuristic, we finally design a heuristic algorithm for mutation region detection. We further implement the algorithms to obtain a software package. Our experimental data show that the software is both fast and accurate. The package is available at http://www.cs.cityu.edu.hk/~lwang/software/LDWP/ for noncommercial use. © 2012 IEEE.

Research Area(s)

  • allele-sharing status, approximation algorithm, haplotype inference, linkage analysis, pedigree