Abstract
High-throughput technologies for gene analysis have drawn considerable attention to biclustering. However, existing methods suffer from high time and space complexity. This paper proposes a biclustering method with low time and space complexity, called Row and Column Structure-based Biclustering (RCSBC), to find checkerboard patterns in microarray data. First, the paper describes the structure of a bicluster in terms of the structure of rows and columns. Second, it selects representative rows and columns using two algorithms. Finally, the gene expression data are biclustered in the space spanned by the representative rows and columns. To the best of our knowledge, this paper is the first to exploit the relationship between the row/column structure of a gene expression matrix and the structure of biclusters. Both synthetic datasets and real-life gene expression datasets are used to validate the effectiveness of the method. The experimental results show that RCSBC outperforms state-of-the-art algorithms in both clustering accuracy and time/space complexity. This study offers new insights into biclustering large-scale gene expression data without loading the whole dataset into memory.
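The three steps outlined in the abstract can be illustrated with a toy sketch. The abstract does not specify the two selection algorithms, so the code below is not RCSBC itself: it substitutes greedy farthest-point selection for choosing representatives and nearest-representative assignment for the final clustering. All function names (`make_checkerboard`, `farthest_point_indices`, `assign`, `bicluster`) are hypothetical and chosen for illustration only.

```python
import random

def make_checkerboard(block_means, rows_per_block, cols_per_block, noise=0.1, seed=0):
    """Build a small noisy checkerboard matrix from a grid of block means."""
    rng = random.Random(seed)
    matrix = []
    for row_means in block_means:
        for _row in range(rows_per_block):
            matrix.append([m + rng.uniform(-noise, noise)
                           for m in row_means
                           for _col in range(cols_per_block)])
    return matrix

def sq_dist(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

def farthest_point_indices(vectors, k):
    """Assumed stand-in for representative selection: greedily pick k
    mutually distant vectors, starting from the first one."""
    chosen = [0]
    while len(chosen) < k:
        # Pick the vector whose nearest already-chosen vector is farthest away.
        best = max(range(len(vectors)),
                   key=lambda i: min(sq_dist(vectors[i], vectors[c]) for c in chosen))
        chosen.append(best)
    return chosen

def assign(vectors, rep_idx, coords):
    """Label each vector by its nearest representative, comparing only
    the coordinates listed in `coords`."""
    def project(v):
        return [v[j] for j in coords]
    reps = [project(vectors[r]) for r in rep_idx]
    return [min(range(len(reps)), key=lambda c: sq_dist(project(v), reps[c]))
            for v in vectors]

def bicluster(matrix, n_row_clusters, n_col_clusters):
    """Cluster rows and columns using only representative rows and columns."""
    rows = [list(r) for r in matrix]
    cols = [list(c) for c in zip(*matrix)]
    rep_rows = farthest_point_indices(rows, n_row_clusters)
    rep_cols = farthest_point_indices(cols, n_col_clusters)
    row_labels = assign(rows, rep_rows, rep_cols)  # rows compared on representative columns
    col_labels = assign(cols, rep_cols, rep_rows)  # columns compared on representative rows
    return row_labels, col_labels

# Toy data: 2 row groups x 3 column groups, 3 rows and 2 columns per block.
data = make_checkerboard([[0, 5, 10], [10, 0, 5]], rows_per_block=3, cols_per_block=2)
row_labels, col_labels = bicluster(data, 2, 3)
```

In this sketch the assignment step compares rows only on the representative columns (and columns only on the representative rows), which is one simple reading of clustering "in the space spanned by representative rows and columns". Unlike the method described in the abstract, the sketch holds the whole matrix in memory.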
| Field | Value |
|---|---|
| Original language | English |
| Pages (from-to) | 1117-1129 |
| Journal | IEEE/ACM Transactions on Computational Biology and Bioinformatics |
| Volume | 19 |
| Issue number | 2 |
| Online published | 7 Sept 2020 |
| Publication status | Published - Mar 2022 |
Research Keywords
- Biclustering
- checkerboard pattern
- row and column selection
RGC Funding Information
- RGC-funded
Publication: Row and Column Structure-Based Biclustering for Gene Expression Data
Projects
2 Finished
- GRF: Investigation of EGFR Inter-domain Relations and Their Roles in Lung Cancer Drug Resistance
  YAN, H. (Principal Investigator / Project Coordinator)
  1/01/19 → 9/06/23
  Project: Research
- CRF: Efficient Algorithms and Hardware Accelerators for Tensor Decomposition and Their Applications to Multidimensional Data Analysis
  YAN, H. (Principal Investigator / Project Coordinator), CHEUNG, C. C. R. (Co-Principal Investigator), CHAN, R. H. F. (Co-Investigator), LEE, V. H. F. (Co-Investigator), NG, M. K. P. (Co-Investigator) & QI, L. (Co-Investigator)
  1/06/16 → 9/11/20
  Project: Research