Coding genomes with gapped pattern graph convolutional network

Research output: Journal Publications and Reviews; RGC 21 - Publication in refereed journal; peer-review

1 Scopus Citations


Detail(s)

Original language: English
Article number: btae188
Journal / Publication: Bioinformatics
Volume: 40
Issue number: 4
Online published: 11 Apr 2024
Publication status: Published - Apr 2024


Abstract

Motivation: Genome sequencing technologies reveal a huge amount of genomic sequences. Neural network-based methods can be prime candidates for retrieving insights from these sequences because of their applicability to large and diverse datasets. However, the highly variable lengths of genome sequences severely impair the presentation of sequences as input to the neural network. Genetic variations further complicate tasks that involve sequence comparison or alignment.

Results: Inspired by the theory and applications of “spaced seeds,” we propose a graph representation of genome sequences called “gapped pattern graph.” These graphs can be transformed through a Graph Convolutional Network to form lower-dimensional embeddings for downstream tasks. On the basis of the gapped pattern graphs, we implemented a neural network model and demonstrated its performance on diverse tasks involving microbe and mammalian genome data. Our method consistently outperformed all the other state-of-the-art methods across various metrics on all tasks, especially for the sequences with limited homology to the training data. In addition, our model was able to identify distinct gapped pattern signatures from the sequences.

© The Author(s) 2024. Published by Oxford University Press.
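The abstract does not spell out how a gapped pattern graph is built, so the following is only an illustrative sketch of the general idea, not the authors' implementation. It assumes a spaced seed written as a 0/1 mask (1 = keep the base, 0 = ignore it), treats each distinct gapped pattern as a node, links patterns that occur in consecutive windows, and applies one standard graph-convolution step with symmetric normalization. The function names `gapped_patterns`, `pattern_graph`, and `gcn_layer`, the seed `"101"`, and the edge rule are all hypothetical choices made for this example.

```python
import numpy as np


def gapped_patterns(seq, seed):
    """Slide the seed over seq and keep only bases under '1' positions."""
    keep = [i for i, c in enumerate(seed) if c == "1"]
    return [
        "".join(seq[start + i] for i in keep)
        for start in range(len(seq) - len(seed) + 1)
    ]


def pattern_graph(seq, seed):
    """Assumed construction: nodes are distinct gapped patterns; an edge
    a -> b is weighted by how often b follows a in adjacent windows."""
    pats = gapped_patterns(seq, seed)
    nodes = sorted(set(pats))
    index = {p: i for i, p in enumerate(nodes)}
    adj = np.zeros((len(nodes), len(nodes)))
    for a, b in zip(pats, pats[1:]):
        adj[index[a], index[b]] += 1.0
    return nodes, adj


def gcn_layer(adj, features, weight):
    """One graph-convolution step: ReLU(D^-1/2 (A + I) D^-1/2 X W)."""
    a_hat = adj + adj.T + np.eye(adj.shape[0])  # symmetrize, add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    norm = a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(norm @ features @ weight, 0.0)


# Usage: the graph size depends on the number of distinct patterns, not on
# sequence length, and mean pooling yields a fixed-size embedding.
nodes, adj = pattern_graph("ACGTACGT", "101")
rng = np.random.default_rng(0)
weight = rng.normal(size=(len(nodes), 2))          # untrained, illustrative
hidden = gcn_layer(adj, np.eye(len(nodes)), weight)
embedding = hidden.mean(axis=0)                    # shape (2,)
```

Under these assumptions, sequences of very different lengths map to graphs over the same pattern vocabulary, which is one way to read the abstract's point about variable-length input.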


Bibliographic Note

Research Unit(s) information for this publication is provided by the author(s) concerned.

Citation Format(s)

Coding genomes with gapped pattern graph convolutional network. / Wang, Ruo Han; Ng, Yen Kaow; Zhang, Xianglilan et al.
In: Bioinformatics, Vol. 40, No. 4, btae188, 04.2024.

