TransHLA : A Hybrid Transformer Model for HLA-Presented Epitope Detection

Research output: Journal Publications and ReviewsRGC 21 - Publication in refereed journalpeer-review

View graph of relations

Related Research Unit(s)

Detail(s)

Original languageEnglish
Journal / PublicationGigaScience
Publication statusAccepted/In press/Filed - 21 Jan 2025

Abstract

Background: Precise prediction of epitope presentation on human leukocyte antigen (HLA) molecules is crucial for advancing vaccine development and immunotherapy. Conventional HLA-peptide binding affinity prediction tools often focus on specific alleles and lack a universal approach for comprehensive HLA site analysis. This limitation hinders efficient filtering of invalid peptide segments.
Results: We introduce TransHLA, a pioneering tool designed for epitope prediction across all HLA alleles, integrating Transformer and Residue CNN architectures. TransHLA utilizes the ESM2 large language model for sequence and structure embeddings, achieving high predictive accuracy. For HLA class I, it reaches an accuracy of 84.72% and an AUC of 91.95% on IEDB test data. For HLA class II, it achieves 79.94% accuracy and an AUC of 88.14%. Our case studies using datasets like CEDAR and VDJdb demonstrate that TransHLA surpasses existing models in specificity and sensitivity for identifying immunogenic epitopes and neoepitopes.
Conclusions: TransHLA significantly enhances vaccine design and immunotherapy by efficiently identifying broadly reactive peptides. Our resources, including data and code, are publicly accessible at https://github.com/SkywalkerLuke/TransHLA

The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. 

Research Area(s)

  • Epitope Presentation, Pre-trained language model, Deep Learning