scBGEDA : deep single-cell clustering analysis via a dual denoising autoencoder with bipartite graph ensemble clustering

Research output: Journal Publications and Reviews (RGC: 21, 22, 62) > 21_Publication in refereed journal > peer-review

Author(s)

  • Yunhe Wang
  • Zhuohan Yu
  • Shaochuan Li
  • Chuang Bian
  • Yanchun Liang
  • Xiangtao Li

Detail(s)

Original language: English
Article number: btad075
Journal / Publication: Bioinformatics
Volume: 39
Issue number: 2
Online published: 3 Feb 2023
Publication status: Published - Feb 2023

Abstract

Motivation: Single-cell RNA sequencing (scRNA-seq) is an increasingly popular technique for transcriptomic analysis of gene expression at the single-cell level. Cell-type clustering is the first crucial task in the analysis of scRNA-seq data; it facilitates accurate identification of cell types and the study of the characteristics of their transcripts. Recently, several computational models based on deep autoencoders and ensemble clustering have been developed to analyze scRNA-seq data. However, current deep autoencoders are not sufficient to learn the latent representations of scRNA-seq data, and obtaining consensus partitions from these feature representations remains under-explored.

Results: To address this challenge, we propose scBGEDA, a single-cell deep clustering model based on a dual denoising autoencoder with bipartite graph ensemble clustering, to identify specific cell populations in single-cell transcriptome profiles. First, a single-cell dual denoising autoencoder network is proposed to project the data into a compressed low-dimensional space and to learn feature representations via explicit modeling of the synergistic optimization of the zero-inflated negative binomial reconstruction loss and the denoising reconstruction loss. Then, a bipartite graph ensemble clustering algorithm is designed to exploit the relationships between cells and the learned latent embedded space by means of a graph-based consensus function. Multiple comparison experiments were conducted on 20 scRNA-seq datasets from different sequencing platforms using a variety of clustering metrics. The experimental results indicated that scBGEDA outperforms other state-of-the-art methods on these datasets, and also demonstrated its scalability to large-scale scRNA-seq datasets. Moreover, scBGEDA was able to identify cell-type specific marker genes and provide functional genomic analysis by quantifying the influence of genes on cell clusters, bringing new insights into identifying cell types and characterizing the scRNA-seq data from different perspectives.

Availability and implementation: The source code of scBGEDA is available at https://github.com/wangyh082/scBGEDA. The software and the supporting data can be downloaded from https://figshare.com/articles/software/scBGEDA/19657911.

Supplementary information: Supplementary data are available at Bioinformatics online.

© The Author(s) 2023. Published by Oxford University Press.
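Illustrative note: the abstract names a zero-inflated negative binomial (ZINB) reconstruction loss without giving its form. The sketch below shows the standard ZINB negative log-likelihood for a single count, using only the Python standard library. It is not taken from the scBGEDA source code; the function names and the parameterization (mean mu, dispersion theta, dropout probability pi) are assumptions chosen for illustration, and how scBGEDA combines this term with the denoising reconstruction loss is not specified here.

```python
import math


def nb_log_pmf(x, mu, theta):
    """Log-probability of count x under a negative binomial
    with mean mu > 0 and dispersion theta > 0 (assumed parameterization)."""
    return (
        math.lgamma(x + theta) - math.lgamma(theta) - math.lgamma(x + 1)
        + theta * math.log(theta / (theta + mu))
        + x * math.log(mu / (theta + mu))
    )


def zinb_nll(x, mu, theta, pi):
    """Negative log-likelihood of count x under a ZINB model.

    pi is the probability of an extra ("dropout") zero; with
    probability 1 - pi the count is drawn from the negative binomial.
    """
    nb_prob = math.exp(nb_log_pmf(x, mu, theta))
    if x == 0:
        # A zero can come from dropout or from the negative binomial itself.
        prob = pi + (1.0 - pi) * nb_prob
    else:
        prob = (1.0 - pi) * nb_prob
    return -math.log(prob)


if __name__ == "__main__":
    # Hypothetical values: one gene in one cell.
    for count in (0, 1, 5):
        print(count, zinb_nll(count, mu=2.0, theta=1.5, pi=0.3))
```

In an autoencoder setting, mu, theta and pi would typically be produced by the decoder for every gene in every cell, and the loss would be the sum or mean of this quantity over the expression matrix.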

Citation Format(s)

scBGEDA : deep single-cell clustering analysis via a dual denoising autoencoder with bipartite graph ensemble clustering. / Wang, Yunhe; Yu, Zhuohan; Li, Shaochuan et al.

In: Bioinformatics, Vol. 39, No. 2, btad075, 02.2023.
