TY - GEN
T1 - Approximation algorithms for Bi-clustering problems
AU - Wang, Lusheng
AU - Lin, Yu
AU - Liu, Xiaowen
PY - 2006
Y1 - 2006
N2 - One of the main goals in the analysis of microarray data is to identify groups of genes and groups of experimental conditions (including environments, individuals and tissues), that exhibit similar expression patterns. This is the so-called bi-clustering problem. In this paper, we consider two variations of the bi-clustering problem: the Consensus Submatrix Problem and the Bottleneck Submatrix Problem. The input of the problems contains a m × n matrix A and integers l and k. The Consensus Submatrix Problem is to find a l × k submatrix with l < m and k < n and a consensus vector such that the sum of distance between all rows in the submatrix and the vector is minimized. The Bottleneck Submatrix Problem is to find a l × k submatrix with l < m and k < n, an integer d and a center vector such that the distance between every row in the submatrix and the vector is at most d and d is minimized. We show that both problems are NP-hard and give randomized approximation algorithms for special cases of the two problems. Using standard techniques, we can derandomize the algorithms to get polynomial time approximation schemes for the two problems. To our knowledge, this is the first time that approximation algorithms with guaranteed ratio are presented for microarray analysis. © Springer-Verlag Berlin Heidelberg 2006.
AB - One of the main goals in the analysis of microarray data is to identify groups of genes and groups of experimental conditions (including environments, individuals and tissues), that exhibit similar expression patterns. This is the so-called bi-clustering problem. In this paper, we consider two variations of the bi-clustering problem: the Consensus Submatrix Problem and the Bottleneck Submatrix Problem. The input of the problems contains a m × n matrix A and integers l and k. The Consensus Submatrix Problem is to find a l × k submatrix with l < m and k < n and a consensus vector such that the sum of distance between all rows in the submatrix and the vector is minimized. The Bottleneck Submatrix Problem is to find a l × k submatrix with l < m and k < n, an integer d and a center vector such that the distance between every row in the submatrix and the vector is at most d and d is minimized. We show that both problems are NP-hard and give randomized approximation algorithms for special cases of the two problems. Using standard techniques, we can derandomize the algorithms to get polynomial time approximation schemes for the two problems. To our knowledge, this is the first time that approximation algorithms with guaranteed ratio are presented for microarray analysis. © Springer-Verlag Berlin Heidelberg 2006.
UR - https://www.scopus.com/pages/publications/33750225843
UR - https://www.scopus.com/record/pubmetrics.uri?eid=2-s2.0-33750225843&origin=recordpage
U2 - 10.1007/11851561_29
DO - 10.1007/11851561_29
M3 - RGC 32 - Refereed conference paper (with host publication)
SN - 9783540395836
T3 - Lecture Notes in Computer Science
SP - 310
EP - 320
BT - Algorithms in Bioinformatics
A2 - Bücher, Philipp
A2 - Moret, Bernard M. E.
PB - Springer
CY - Berlin, Heidelberg
T2 - 6th International Workshop on Algorithms in Bioinformatics (WABI 2006)
Y2 - 11 September 2006 through 13 September 2006
ER -