
AutoProteinEngine: A Large Language Model Driven Agent Framework for Multimodal AutoML in Protein Engineering

  • Yungeng Liu (Co-first Author)
  • Zan Chen (Co-first Author)
  • Yu Guang Wang
  • Yiqing Shen*

*Corresponding author for this work

Research output: Chapters, Conference Papers, Creative and Literary Works; RGC 32 - Refereed conference paper (with host publication); peer-review


Abstract

Protein engineering is important for biomedical applications, but conventional approaches are often inefficient and resource-intensive. While deep learning (DL) models have shown promise, training them or applying them to protein engineering remains challenging for biologists without specialized computational expertise. To address this gap, we propose AutoProteinEngine (AutoPE), an agent framework that leverages large language models (LLMs) for multimodal automated machine learning (AutoML) in protein engineering. AutoPE allows biologists without DL backgrounds to interact with DL models using natural language, lowering the entry barrier for protein engineering tasks. AutoPE integrates LLMs with AutoML to handle model selection for both protein sequence and graph modalities, automatic hyperparameter optimization, and automated data retrieval from protein databases. We evaluated AutoPE on two real-world protein engineering tasks, demonstrating substantial performance improvements compared to traditional zero-shot and manual fine-tuning approaches. By bridging the gap between DL and biologists’ domain expertise, AutoPE empowers researchers to leverage DL without extensive programming knowledge.

© 2025 Association for Computational Linguistics.
Original language: English
Title of host publication: Proceedings of the 31st International Conference on Computational Linguistics
Subtitle of host publication: Industry Track
Publisher: Association for Computational Linguistics
Pages: 422-430
ISBN (Print): 9798891761971
Publication status: Published - Jan 2025
Externally published: Yes
Event: 31st International Conference on Computational Linguistics (COLING 2025) - Abu Dhabi, United Arab Emirates
Duration: 19 Jan 2025 - 24 Jan 2025
https://aclanthology.org/volumes/2025.coling-main/

Publication series

Name: Proceedings - International Conference on Computational Linguistics, COLING
ISSN (Print): 2951-2093

Conference

Conference: 31st International Conference on Computational Linguistics (COLING 2025)
Place: United Arab Emirates
City: Abu Dhabi
Period: 19/01/25 - 24/01/25

Publisher's Copyright Statement

  • This full text is made available under CC-BY 4.0. https://creativecommons.org/licenses/by/4.0/
