Distribution-Agnostic Deep Learning Enables Accurate Single-Cell Data Recovery and Transcriptional Regulation Interpretation

Research output: Journal Publications and Reviews › RGC 21 - Publication in refereed journal › peer-review

3 Scopus Citations

Detail(s)

Original language: English
Article number: 2307280
Journal / Publication: Advanced Science
Volume: 11
Issue number: 16
Online published: 21 Feb 2024
Publication status: Published - 24 Apr 2024

Abstract

Single-cell RNA sequencing (scRNA-seq) is a robust method for studying gene expression at the single-cell level, but accurate quantification of genetic material is often hindered by limited mRNA capture, resulting in many missing expression values. Existing imputation methods rely on strict data assumptions, which limits their broader application, and lack reliable supervision, which leads to biased signal recovery. To address these challenges, the authors developed Bis, a distribution-agnostic deep learning model for accurately recovering missing single-cell gene expression from multiple platforms. Bis is an optimal transport-based autoencoder model that can capture the intricate distribution of scRNA-seq data while addressing the characteristic sparsity by regularizing the cellular embedding space. Additionally, the authors propose a module that uses bulk RNA-seq data to guide reconstruction and ensure expression consistency. Experimental results show that Bis outperforms other models across simulated and real datasets, showcasing superiority in various downstream analyses including batch effect removal, clustering, differential expression analysis, and trajectory inference. Moreover, Bis successfully restores gene expression levels in rare cell subsets in a tumor-matched peripheral blood dataset, revealing developmental characteristics of cytokine-induced natural killer cells within a head and neck squamous cell carcinoma microenvironment. © 2024 The Authors.

Research Area(s)

  • imputation, optimal transport, single-cell RNA sequencing
