Classification of short human exons and introns based on statistical features

Yonghui Wu, Alan Wee-Chung Liew, Hong Yan, Mengsu Yang

    Research output: Journal Publications and Reviews; RGC 21 - Publication in refereed journal; peer-review

    Abstract

    The classification of human gene sequences into exons and introns is a difficult problem in DNA sequence analysis. In this paper, we define a set of features, called the simple Z (SZ) features, which are derived from the Z-curve features, for the recognition of human exons and introns. The classification results show that the SZ features, while fewer in number (three in total), preserve the high recognition rate of the original nine Z-curve features. Since there are only one-third as many SZ features as Z-curve features, the dimensionality of the feature space is much smaller, and better recognition efficiency is achieved. If the stop codon feature is used together with the three SZ features, a recognition rate of up to 92% can be obtained for short sequences of length < 140 bp.
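
For orientation, the nine Z-curve features mentioned in the abstract are commonly computed from base frequencies at each of the three codon positions. The following is a minimal sketch under that standard formulation, not the authors' code: the function name `z_curve_features` is hypothetical, and the abstract does not say how the three SZ features or the stop codon feature are defined, so neither is reproduced here.

```python
from collections import Counter


def z_curve_features(seq: str) -> list[float]:
    """Return nine Z-curve-style features (x, y, z for each codon position).

    Assumption: the standard Z-curve transform on per-position base
    frequencies; the paper's exact normalization may differ.
    """
    seq = seq.upper()
    features: list[float] = []
    for phase in range(3):
        # Bases at codon position `phase` (0, 1 or 2), ignoring non-ACGT symbols.
        counts = Counter(b for b in seq[phase::3] if b in "ACGT")
        total = sum(counts.values())
        if total == 0:
            features.extend([0.0, 0.0, 0.0])
            continue
        a, c, g, t = (counts[b] / total for b in "ACGT")
        features.extend([
            (a + g) - (c + t),  # x: purine vs. pyrimidine
            (a + c) - (g + t),  # y: amino vs. keto
            (a + t) - (g + c),  # z: weak vs. strong hydrogen bonding
        ])
    return features
```

Calling `z_curve_features` on a sequence returns a list of nine values in the range [-1, 1], ordered as (x, y, z) for codon positions 1, 2 and 3; such a vector could then be passed to any standard classifier.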
    Original language: English
    Article number: 061916
    Journal: Physical Review E
    Volume: 67
    Issue number: 6
    Publication status: Published - Jun 2003
