Abstract
Pathway analysis is a cornerstone of systems biology. In particular, pathway similarity search plays a key role in establishing structural, functional, and evolutionary relationships between different biological entities. Given a query pathway and a database, a pathway similarity search aims to identify novel pathways that are homologous to the query pathway. Unfortunately, pathway similarity search is computationally inefficient due to the NP-complete graph isomorphism problem. In this study, we introduce a novel algorithmic framework for pathway similarity search, named PathEmb (Pathway Embedding), which is analogous to the Skip-gram model, with each pathway represented as a "document". PathEmb uses a second-order random walk strategy to explore diverse pathway patterns. Each signaling path traversed by a random walk is regarded as a "sentence", and the sentences from one pathway are then assembled into a "document". The "document" for each pathway is then mapped into a low-dimensional feature space for downstream tasks. Furthermore, PathEmb is a topology-free pathway similarity search algorithm, so it can handle pathways of arbitrary structure. We have extensively evaluated PathEmb and other state-of-the-art methods on three pathway datasets. The experimental results demonstrate that PathEmb outperforms the existing methods in terms of computational efficiency and search accuracy. The source code of PathEmb is freely available online at https://github.com/zhangjiaobxy/PathEmb.
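The walk-to-document idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a node2vec-style second-order walk with hypothetical return and in-out parameters `p` and `q`, an undirected toy pathway stored as a dictionary of neighbor sets, and hypothetical function names (`second_order_walk`, `pathway_to_document`). Each walk plays the role of a "sentence", and the list of walks for one pathway plays the role of its "document".

```python
import random


def second_order_walk(graph, start, length, p=1.0, q=1.0, rng=None):
    """Return one second-order walk (a "sentence") starting at `start`.

    Assumption: node2vec-style bias, where the next step depends on both
    the current node and the previously visited node.
    """
    rng = rng or random.Random()
    walk = [start]
    while len(walk) < length:
        cur = walk[-1]
        neighbors = sorted(graph[cur])
        if not neighbors:
            break  # dead end: stop the walk early
        if len(walk) == 1:
            # First step has no previous node, so sample uniformly.
            walk.append(rng.choice(neighbors))
            continue
        prev = walk[-2]
        weights = []
        for nxt in neighbors:
            if nxt == prev:
                weights.append(1.0 / p)  # return to the previous node
            elif nxt in graph[prev]:
                weights.append(1.0)      # stay close to the previous node
            else:
                weights.append(1.0 / q)  # move away from the previous node
        walk.append(rng.choices(neighbors, weights=weights, k=1)[0])
    return walk


def pathway_to_document(graph, walks_per_node=2, length=4, p=1.0, q=1.0, seed=0):
    """Collect walks from every node into one "document" (a list of sentences)."""
    rng = random.Random(seed)
    document = []
    for _ in range(walks_per_node):
        for node in sorted(graph):
            document.append(second_order_walk(graph, node, length, p, q, rng))
    return document


# Hypothetical toy pathway (undirected, for illustration only).
toy_pathway = {
    "EGF": {"EGFR"},
    "EGFR": {"EGF", "GRB2"},
    "GRB2": {"EGFR", "SOS1"},
    "SOS1": {"GRB2", "RAS"},
    "RAS": {"SOS1"},
}

document = pathway_to_document(toy_pathway)
for sentence in document:
    print(" ".join(sentence))
```

In a full pipeline, a document such as this one would be passed to a document-embedding model to obtain the low-dimensional pathway representation; that step is omitted here because the abstract does not specify its details.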
| Original language | English |
|---|---|
| Pages (from-to) | 1329-1335 |
| Journal | IEEE Journal of Biomedical and Health Informatics |
| Volume | 23 |
| Issue number | 3 |
| Online published | 27 Apr 2018 |
| Publication status | Published - May 2019 |
Research Keywords
- AdaBoost regression
- Biological information theory
- Biological system modeling
- Databases
- document embedding
- Electronic mail
- feature learning
- Global pathway search
- Informatics
- random walks
- Task analysis