SCS: Signal, context, and structure features for genome-wide human promoter recognition

Research output: Journal Publications and Reviews (RGC: 21, 22, 62); 21_Publication in refereed journal; peer-review

20 Scopus Citations


  • Jia Zeng
  • Xiao-Yu Zhao
  • Xiao-Qin Cao
  • Hong Yan


Original language: English
Article number: 4609377
Pages (from-to): 550-562
Journal / Publication: IEEE/ACM Transactions on Computational Biology and Bioinformatics
Issue number: 3
Publication status: Published - 2010


This paper integrates signal, context, and structure features for genome-wide human promoter recognition, which is important for improving genome annotation and analyzing transcriptional regulation without experimental support from ESTs, cDNAs, or mRNAs. First, CpG islands are salient biological signals associated with approximately 50 percent of mammalian promoters. Second, the genomic context of promoters may have biological significance; this context is characterized by n-mers (sequences n bases long) and their statistics estimated from training samples. Third, sequence-dependent DNA flexibility originates from DNA 3D structures and plays an important role in guiding transcription factors to the target site in promoters. Employing decision trees, we combine the above signal, context, and structure features to build a hierarchical promoter recognition system called SCS. Experimental results on controlled data sets and the entire human genome demonstrate that SCS is significantly superior in terms of sensitivity and specificity compared to other state-of-the-art methods. The SCS promoter recognition system is available online as supplemental material for academic use and can be found on the Computer Society Digital Library. © 2006 IEEE.
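
As a rough illustration of the signal and context features named in the abstract, the sketch below computes GC content, the CpG observed/expected ratio, and n-mer counts for a DNA string. It is not the SCS implementation: the function names are hypothetical, and the observed/expected ratio is the commonly used Gardiner-Garden and Frommer definition, assumed here rather than taken from the paper. The DNA flexibility feature and the decision-tree combination are not sketched.

```python
from collections import Counter


def gc_content(seq):
    """Fraction of bases that are G or C (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def cpg_obs_exp(seq):
    """CpG observed/expected ratio: (#CG * length) / (#C * #G).

    Assumed definition (Gardiner-Garden and Frommer style); returns 0.0
    when the sequence has no C or no G, since the ratio is undefined.
    """
    seq = seq.upper()
    c, g = seq.count("C"), seq.count("G")
    if c == 0 or g == 0:
        return 0.0
    return seq.count("CG") * len(seq) / (c * g)


def nmer_counts(seq, n):
    """Count every overlapping n-mer (substring of n bases) in seq."""
    seq = seq.upper()
    return Counter(seq[i:i + n] for i in range(len(seq) - n + 1))


def sketch_features(seq, n=2):
    """Bundle the illustrative signal and context features for one window."""
    return {
        "gc_content": gc_content(seq),
        "cpg_obs_exp": cpg_obs_exp(seq),
        "nmer_counts": nmer_counts(seq, n),
    }
```

In a pipeline of this kind, such per-window feature values would typically be passed to a classifier; the abstract states that SCS uses decision trees for that step, but the exact features and thresholds are those of the paper, not of this sketch.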

Research Area(s)

  • Biology and genetics, classifier combination, feature extraction, genome analysis, Pattern Recognition, Promoter recognition