scBGEDA : deep single-cell clustering analysis via a dual denoising autoencoder with bipartite graph ensemble clustering
Research output: Journal Publications and Reviews (RGC: 21, 22, 62) › 21_Publication in refereed journal › peer-review
Detail(s)
Original language | English |
---|---|
Article number | btad075 |
Journal / Publication | Bioinformatics |
Volume | 39 |
Issue number | 2 |
Online published | 3 Feb 2023 |
Publication status | Published - Feb 2023 |
Link(s)
DOI | DOI |
---|---|
Link to Scopus | https://www.scopus.com/record/display.uri?eid=2-s2.0-85148112935&origin=recordpage |
Permanent Link | https://scholars.cityu.edu.hk/en/publications/publication(d25a99a5-2930-46de-af6c-461431e120be).html |
Abstract
Motivation: Single-cell RNA sequencing (scRNA-seq) is an increasingly popular technique for transcriptomic analysis of gene expression at the single-cell level. Cell-type clustering is the first crucial task in the analysis of scRNA-seq data; it facilitates accurate identification of cell types and the study of the characteristics of their transcripts. Recently, several computational models based on deep autoencoders and ensemble clustering have been developed to analyze scRNA-seq data. However, current deep autoencoders are not sufficient to learn the latent representations of scRNA-seq data, and obtaining consensus partitions from these feature representations remains under-explored.
Results: To address this challenge, we propose scBGEDA, a single-cell deep clustering model that combines a dual denoising autoencoder with bipartite graph ensemble clustering, to identify specific cell populations in single-cell transcriptome profiles. First, a single-cell dual denoising autoencoder network projects the data into a compressed low-dimensional space and learns feature representations by jointly optimizing the zero-inflated negative binomial reconstruction loss and the denoising reconstruction loss. Then, a bipartite graph ensemble clustering algorithm exploits the relationships between cells and the learned latent embedded space by means of a graph-based consensus function. Multiple comparison experiments were conducted on 20 scRNA-seq datasets from different sequencing platforms using a variety of clustering metrics. The experimental results indicated that scBGEDA outperforms other state-of-the-art methods on these datasets, and also demonstrated its scalability to large-scale scRNA-seq datasets. Moreover, scBGEDA was able to identify cell-type specific marker genes and provide functional genomic analysis by quantifying the influence of genes on cell clusters, bringing new insights into identifying cell types and characterizing the scRNA-seq data from different perspectives.
Availability and implementation: The source code of scBGEDA is available at https://github.com/wangyh082/scBGEDA. The software and the supporting data can be downloaded from https://figshare.com/articles/software/scBGEDA/19657911.
Supplementary information: Supplementary data are available at Bioinformatics online.
© The Author(s) 2023. Published by Oxford University Press.
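The abstract names a zero-inflated negative binomial (ZINB) reconstruction loss but does not give its form. As a rough illustration only, the sketch below computes the negative log-likelihood of a single count under the standard ZINB distribution. It is not the authors' implementation: the parameter names `mu` (mean), `theta` (dispersion) and `pi` (dropout probability), and both function names, are assumptions introduced here.

```python
import math


def nb_log_pmf(x, mu, theta):
    """Log-probability of count x under a negative binomial with
    mean mu > 0 and dispersion theta > 0 (hypothetical parameter names)."""
    return (
        math.lgamma(x + theta) - math.lgamma(theta) - math.lgamma(x + 1)
        + theta * math.log(theta / (theta + mu))
        + x * math.log(mu / (theta + mu))
    )


def zinb_nll(x, mu, theta, pi):
    """Negative log-likelihood of count x under a ZINB distribution.

    With probability pi the count is a structural zero (dropout);
    otherwise it is drawn from the negative binomial above.
    """
    nb = math.exp(nb_log_pmf(x, mu, theta))
    if x == 0:
        likelihood = pi + (1.0 - pi) * nb  # zero from dropout or from the NB
    else:
        likelihood = (1.0 - pi) * nb       # non-zero counts only come from the NB
    return -math.log(likelihood)
```

In a denoising autoencoder of the kind described, a loss like this would typically be averaged over all cells and genes, with `mu`, `theta` and `pi` produced by the decoder; how scBGEDA combines it with the denoising reconstruction loss is not specified in this record.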
Citation Format(s)
scBGEDA : deep single-cell clustering analysis via a dual denoising autoencoder with bipartite graph ensemble clustering. / Wang, Yunhe; Yu, Zhuohan; Li, Shaochuan et al.
In: Bioinformatics, Vol. 39, No. 2, btad075, 02.2023.