TY - JOUR
T1 - Randomized algorithms for motif detection
AU - Wang, Lusheng
AU - Dong, Liang
AU - Fan, Hui
PY - 2004
Y1 - 2004
N2 - Motivation: Motif detection for DNA sequences has many important applications in biological studies, e. g. , locating binding sites and regulatory signals, and designing genetic probes etc. In this paper, we propose a randomized algorithm, design an improved EM algorithm and combine them to form a software. Results: (1) We design a randomized algorithm for consensus pattern problem. We can show that with high probability, our randomized algorithm finds a pattern in polynomial time with cost error at most ε×l for each string, where l is the length of the motif and ε can be any positive number given by the user. (2) We design an improved EM (Expectation Maximization) algorithm that outperforms the original EM algorithm. (3) We develop a software MotifDetector that uses our randomized algorithm to find good seeds and uses the improved EM algorithm to do local search. We compare MotifDetector with Buhler and Tompa's PROJECTION which is considered to be the best known software for motif detection. Simulations show that MotifDetector is slower than PROJECTION when the pattern length is relatively small, and outperforms PROJECTION when the pattern length becomes large. Availability: Free from http://www. cs. cityu. edu. hk/-lwang/software/ motif/index. html, subject to copyright restrictions. © Springer-Verlag 2004.
AB - Motivation: Motif detection for DNA sequences has many important applications in biological studies, e. g. , locating binding sites and regulatory signals, and designing genetic probes etc. In this paper, we propose a randomized algorithm, design an improved EM algorithm and combine them to form a software. Results: (1) We design a randomized algorithm for consensus pattern problem. We can show that with high probability, our randomized algorithm finds a pattern in polynomial time with cost error at most ε×l for each string, where l is the length of the motif and ε can be any positive number given by the user. (2) We design an improved EM (Expectation Maximization) algorithm that outperforms the original EM algorithm. (3) We develop a software MotifDetector that uses our randomized algorithm to find good seeds and uses the improved EM algorithm to do local search. We compare MotifDetector with Buhler and Tompa's PROJECTION which is considered to be the best known software for motif detection. Simulations show that MotifDetector is slower than PROJECTION when the pattern length is relatively small, and outperforms PROJECTION when the pattern length becomes large. Availability: Free from http://www. cs. cityu. edu. hk/-lwang/software/ motif/index. html, subject to copyright restrictions. © Springer-Verlag 2004.
UR - http://www.scopus.com/inward/record.url?scp=35048877825&partnerID=8YFLogxK
UR - https://www.scopus.com/record/pubmetrics.uri?eid=2-s2.0-35048877825&origin=recordpage
M3 - RGC 21 - Publication in refereed journal
SN - 0302-9743
VL - 3341
SP - 884
EP - 895
JO - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
JF - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
ER -