Abstract
Background: Precise prediction of epitope presentation on human leukocyte antigen (HLA) molecules is crucial for advancing vaccine development and immunotherapy. Conventional HLA-peptide binding affinity prediction tools often focus on specific alleles and lack a universal approach for comprehensive HLA site analysis. This limitation hinders efficient filtering of invalid peptide segments.
Results: We introduce TransHLA, a tool for epitope prediction across all HLA alleles that integrates Transformer and Residue CNN architectures. TransHLA uses the ESM2 large language model for sequence and structure embeddings. For HLA class I, it reaches an accuracy of 84.72% and an AUC of 91.95% on IEDB test data. For HLA class II, it achieves an accuracy of 79.94% and an AUC of 88.14%. In case studies on datasets such as CEDAR and VDJdb, TransHLA surpasses existing models in specificity and sensitivity for identifying immunogenic epitopes and neoepitopes.
Conclusions: TransHLA significantly enhances vaccine design and immunotherapy by efficiently identifying broadly reactive peptides. Our resources, including data and code, are publicly accessible at https://github.com/SkywalkerLuke/TransHLA
The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity.
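The accuracy and AUC figures quoted in the abstract are standard binary-classification metrics. The following is a minimal sketch of how such metrics can be computed from predicted scores; it is not taken from the TransHLA code base. The function names, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration.

```python
def accuracy(labels, scores, threshold=0.5):
    """Fraction of peptides whose thresholded score matches the label.

    The 0.5 threshold is an assumption for this sketch, not a value
    reported for TransHLA.
    """
    preds = [1 if s >= threshold else 0 for s in scores]
    correct = sum(p == y for p, y in zip(preds, labels))
    return correct / len(labels)


def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive (epitope)
    receives a higher score than a randomly chosen negative, with ties
    counted as one half.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): 1 = presented epitope, 0 = non-epitope.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]

print(f"accuracy = {accuracy(labels, scores):.4f}")
print(f"AUC      = {roc_auc(labels, scores):.4f}")
```

The pairwise AUC is quadratic in the number of samples, which is fine for a small illustration; for large test sets a rank-based computation or a library routine would be the more practical choice.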
| Original language | English |
|---|---|
| Journal | GigaScience |
| Volume | 14 |
| Online published | 27 Feb 2025 |
| DOIs | |
| Publication status | Published - 2025 |
Research Keywords
- Epitope Presentation
- Pre-trained language model
- Deep Learning
Publisher's Copyright Statement
- This full text is made available under CC-BY 4.0. https://creativecommons.org/licenses/by/4.0/
Datasets
- Supporting data for "TransHLA: A Hybrid Transformer Model for HLA-Presented Epitope Detection"
  LU, T. (Creator), WANG, X. (Creator), NIE, W. (Creator), HUO, M. (Creator) & LI, S. (Creator), GigaScience Press, 17 Dec 2024
  DOI: 10.5524/102633