GPRED-GC: A Gene PREDiction model accounting for 5′-3′ GC gradient
Research output: Journal Publications and Reviews (RGC: 21, 22, 62) › 21_Publication in refereed journal › peer-review
Detail(s)
Original language | English |
---|---|
Article number | 482 |
Journal / Publication | BMC Bioinformatics |
Volume | 20 |
Issue number | Suppl 15 |
Online published | 24 Dec 2019 |
Publication status | Published - 2019 |
Conference
Title | 14th International Symposium on Bioinformatics Research and Applications (ISBRA’18) |
---|---|
Place | China |
City | Beijing |
Period | 8 - 11 June 2018 |
Link(s)
Link to Scopus | https://www.scopus.com/record/display.uri?eid=2-s2.0-85077127673&origin=recordpage |
---|---|
Permanent Link | https://scholars.cityu.edu.hk/en/publications/publication(6957171a-fa05-4b65-bf12-2ea18bde7794).html |
Abstract
Background: Gene prediction is a key step in genome annotation. Ab initio gene prediction enables gene annotation of new genomes regardless of the availability of homologous sequences. A number of ab initio gene prediction tools exist, and they have been widely used for gene annotation in various species. However, existing tools are not optimized for identifying genes with highly variable GC content. In addition, some genes in grass genomes exhibit a sharp 5′-3′ decreasing GC content gradient, which is not carefully modeled by available gene prediction tools. Thus, there is still room to improve the sensitivity and accuracy of predicting genes with GC gradients.
Results: In this work, we designed and implemented a new hidden Markov model (HMM)-based ab initio gene prediction tool, which is optimized for finding genes with highly variable GC contents, such as the genes with negative GC gradients in grass genomes. We tested the tool on three datasets from Arabidopsis thaliana and Oryza sativa. The results showed that our tool can identify genes missed by existing tools due to the highly variable GC contents.
Conclusions: GPRED-GC can effectively predict genes with highly variable GC contents without manual intervention. It provides a useful complementary tool to existing ones such as Augustus for more sensitive gene discovery. The source code is freely available at https://sourceforge.net/projects/gpred-gc/.
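As a rough illustration of what a 5′-3′ GC gradient means, the sketch below measures GC content in consecutive windows along a sequence and fits a least-squares slope to those values. It is not the GPRED-GC model, which the abstract describes only as HMM-based. The function names, the window size, and the toy sequence are all assumptions made for this example.

```python
def gc_fraction(seq):
    """Return the fraction of G and C bases in seq (0.0 for an empty string)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return sum(1 for base in seq if base in "GC") / len(seq)


def windowed_gc(seq, window, step):
    """GC fraction of each full window, scanning from the 5' end to the 3' end."""
    return [
        gc_fraction(seq[i:i + window])
        for i in range(0, len(seq) - window + 1, step)
    ]


def gc_slope(values):
    """Least-squares slope of GC fraction against window index.

    A negative slope means GC content decreases from 5' to 3'.
    """
    n = len(values)
    if n < 2:
        return 0.0
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    num = sum((i - mean_x) * (v - mean_y) for i, v in enumerate(values))
    den = sum((i - mean_x) ** 2 for i in range(n))
    return num / den


# Toy sequence (hypothetical): GC-rich at the 5' end, AT-rich at the 3' end.
toy_seq = "GC" * 50 + "GCAT" * 25 + "AT" * 50
profile = windowed_gc(toy_seq, window=100, step=100)
slope = gc_slope(profile)
print(profile)  # [1.0, 0.5, 0.0]
print(slope)    # -0.5
```

A sequence with a slope well below zero under this measure would be a candidate for the kind of negative GC gradient the abstract mentions in grass genomes; the threshold for "sharp" is not given in the source.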
Research Area(s)
- GC contents, Gene finding, Grass genomes, Hidden Markov model, Plant genome gene prediction
Citation Format(s)
GPRED-GC: A Gene PREDiction model accounting for 5′-3′ GC gradient. / Techa-Angkoon, Prapaporn; Childs, Kevin L.; Sun, Yanni.
In: BMC Bioinformatics, Vol. 20, No. Suppl 15, 482, 2019.