Probabilistic tensor decomposition extracts better latent embeddings from single-cell multiomic data

Research output: Journal Publications and Reviews; RGC 21 - Publication in refereed journal; peer-review

4 Scopus Citations

Detail(s)

Original language: English
Article number: e81
Journal / Publication: Nucleic Acids Research
Volume: 51
Issue number: 15
Online published: 5 Jul 2023
Publication status: Published - 25 Aug 2023

Abstract

Single-cell sequencing technology enables the simultaneous capture of multiomic data from multiple cells. The captured data can be represented as tensors, i.e. higher-order generalizations of matrices. However, existing analysis tools often treat the data as a collection of second-order matrices, discarding the correspondences among the features. We therefore propose a probabilistic tensor decomposition framework, SCOIT, to extract embeddings from single-cell multiomic data. SCOIT incorporates various distributions, including Gaussian, Poisson, and negative binomial distributions, to deal with sparse, noisy, and heterogeneous single-cell data. Our framework decomposes a multiomic tensor into a cell embedding matrix, a gene embedding matrix, and an omic embedding matrix, allowing for various downstream analyses. We applied SCOIT to eight single-cell multiomic datasets from different sequencing protocols. With the cell embeddings, SCOIT achieves superior cell clustering performance compared to nine state-of-the-art tools under various metrics, demonstrating its ability to dissect cellular heterogeneity. With the gene embeddings, SCOIT enables cross-omics gene expression analysis and integrative gene regulatory network study. Furthermore, the embeddings allow simultaneous cross-omics imputation, outperforming current imputation methods with the Pearson correlation coefficient increased by 3.38-39.26%; moreover, SCOIT accommodates the scenario in which subsets of the cells have only one omic profile available. © The Author(s) 2023. Published by Oxford University Press on behalf of Nucleic Acids Research.
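To make the idea of factoring a cells x genes x omics tensor into three embedding matrices concrete, here is a minimal sketch. It is not SCOIT's implementation: it assumes a plain CP (CANDECOMP/PARAFAC) model fitted by alternating least squares under a squared-error loss, whereas the abstract describes a probabilistic framework with Gaussian, Poisson, and negative binomial distributions. The function names `khatri_rao` and `cp_als`, the synthetic data, and the chosen rank are all hypothetical and used only for illustration.

```python
import numpy as np


def khatri_rao(X, Y):
    """Column-wise Kronecker product: (I*J) x R from I x R and J x R."""
    R = X.shape[1]
    return np.einsum("ir,jr->ijr", X, Y).reshape(-1, R)


def cp_als(T, rank, n_iter=100, seed=0):
    """Fit T[i, j, k] ~ sum_r A[i, r] * B[j, r] * C[k, r] by alternating least squares.

    Illustrative only: squared-error loss, no missing-value handling,
    no probabilistic likelihood as described for SCOIT.
    """
    rng = np.random.default_rng(seed)
    n_cells, n_genes, n_omics = T.shape
    A = rng.standard_normal((n_cells, rank))   # cell embeddings
    B = rng.standard_normal((n_genes, rank))   # gene embeddings
    C = rng.standard_normal((n_omics, rank))   # omic embeddings

    # Mode unfoldings: one matrix per tensor axis.
    T0 = T.reshape(n_cells, -1)                      # cells x (genes*omics)
    T1 = np.moveaxis(T, 1, 0).reshape(n_genes, -1)   # genes x (cells*omics)
    T2 = np.moveaxis(T, 2, 0).reshape(n_omics, -1)   # omics x (cells*genes)

    errors = []
    for _ in range(n_iter):
        # Each update solves an ordinary least-squares problem for one factor
        # while the other two are held fixed.
        A = np.linalg.lstsq(khatri_rao(B, C), T0.T, rcond=None)[0].T
        B = np.linalg.lstsq(khatri_rao(A, C), T1.T, rcond=None)[0].T
        C = np.linalg.lstsq(khatri_rao(A, B), T2.T, rcond=None)[0].T
        T_hat = np.einsum("ir,jr,kr->ijk", A, B, C)
        errors.append(np.linalg.norm(T - T_hat) / np.linalg.norm(T))
    return A, B, C, errors


# Synthetic rank-2 tensor standing in for 20 cells, 15 genes, 3 omics.
rng = np.random.default_rng(1)
A0 = rng.standard_normal((20, 2))
B0 = rng.standard_normal((15, 2))
C0 = rng.standard_normal((3, 2))
T = np.einsum("ir,jr,kr->ijk", A0, B0, C0)

cell_emb, gene_emb, omic_emb, errors = cp_als(T, rank=2)
print("relative reconstruction error:", errors[-1])
```

In a sketch like this, the rows of `cell_emb` could be passed to a clustering routine and the reconstructed tensor could serve as an imputed version of the input; how SCOIT itself performs those steps is not specified in the abstract.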

Bibliographic Note

Research Unit(s) information for this publication is provided by the author(s) concerned.
