Abstract
Protein kinases are pivotal regulators of cellular signaling, and their genetic variations are frequently implicated in diseases. Although numerous kinase mutations have been identified as drivers of altered activity, with a few successfully targeted therapeutically, the functional impact of most variants remains uncharacterized. To bridge this gap, we curate a comprehensive dataset that contains 2553 experimentally validated kinase activity-related key alterations (KAKAs) from the literature. While many mutations outside canonical functional regions are known to affect kinase activity, systematic methods to predict their functional consequences are lacking. Consequently, we develop a computational method to predict potential KAKAs, leveraging transfer learning on the pre-trained protein language model ProtBert. Our model, termed pKAKA, achieves an impressive AUC score of 0.9593 and outperforms the AlphaMissense benchmark in comparative testing. Systematic analysis of kinase missense mutations underscores the critical role of KAKAs in pathogenesis, with highlights including JAK2 V617F in atherosclerotic cardiovascular disease, LRRK2 G2385R in Parkinson’s disease, EGFR L858R in lung adenocarcinoma, and EGFR G598V in glioma. Overall, this study significantly advances our understanding of how mutations that influence kinase activity contribute to disease mechanisms.
© 2026, Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, and Genetics Society of China.
| Original language | English |
|---|---|
| Journal | Journal of Genetics and Genomics |
| Online published | 14 Mar 2026 |
| Publication status | Online published - 14 Mar 2026 |
Funding
The present study was supported by the National Key R&D Program of China (2021YFA1302100), National Natural Science Foundation of China (32370698, 32100532, 32570802), Young Talents Program of Sun Yat-sen University Cancer Center (YTP-SYSUCC-0029), Medical Scientific Research Foundation of Guangdong Province (A2022054), Key Research Program of Higher Education Institutions in Henan Province (26A310010), and Youth Talent Lifting Project of the Henan Association for Science and Technology (2026HYTP076).
Research Keywords
- Genetic variations
- Kinase activity
- Protein language model
- Transfer learning
- Cancer
- Atherosclerotic cardiovascular disease
- Parkinson's disease
Fingerprint
Dive into the research topics of 'pKAKA: a protein language model for prioritizing kinase-disrupting variants in diseases'. Together they form a unique fingerprint.