Deep embedded clustering with multiple objectives on scRNA-seq data

Research output: Journal Publications and ReviewsRGC 21 - Publication in refereed journalpeer-review

7 Scopus Citations
View graph of relations


Related Research Unit(s)


Original languageEnglish
Article numberbbab090
Journal / PublicationBriefings in Bioinformatics
Issue number5
Online published5 Apr 2021
Publication statusPublished - Sept 2021


In recent years, single-cell RNA sequencing (scRNA-seq) technologies have been widely adopted to interrogate gene expression of individual cells; it brings opportunities to understand the underlying processes in a high-throughput manner. Deep embedded clustering (DEC) was demonstrated successful in high-dimensional sparse scRNA-seq data by joint feature learning and cluster assignment for identifying cell types simultaneously. However, the deep network architecture for embedding clustering is not trivial to optimize. Therefore, we propose an evolutionary multiobjective DEC by synergizing the multiobjective evolutionary optimization to simultaneously evolve the hyperparameters and architectures of DEC in an automatic manner. Firstly, a denoising autoencoder is integrated into the DEC to project the high-dimensional sparse scRNA-seq data into a low-dimensional space. After that, to guide the evolution, three objective functions are formulated to balance the model's generality and clustering performance for robustness. Meanwhile, migration and mutation operators are proposed to optimize the objective functions to select the suitable hyperparameters and architectures of DEC in the multiobjective framework. Multiple comparison analyses are conducted on twenty synthetic data and eight real data from different representative single-cell sequencing platforms to validate the effectiveness. The experimental results reveal that the proposed algorithm outperforms other state-of-the-art clustering methods under different metrics. Meanwhile, marker genes identification, gene ontology enrichment and pathology analysis are conducted to reveal novel insights into the cell type identification and characterization mechanisms.