Large language model for interpreting research policy using adaptive two-stage retrieval augmented fine-tuning method

Runtao Ren*, Jian Ma, Zhimin Zheng

*Corresponding author for this work

Research output: Journal Publications and Reviews › RGC 21 - Publication in refereed journal › peer-review

1 Citation (Scopus)
14 Downloads (CityUHK Scholars)

Abstract

Accurate interpretation of scientific funding policies is crucial for government funding agencies and research institutions to make informed decisions and allocate research funds effectively. However, current large language model (LLM)-based systems often generate responses without references, leading to a lack of interpretability needed for policy enforcement. This study introduces the Adaptive Two-stage Retrieval Augmented Fine-Tuning (AT-RAFT) method, a novel LLM-based approach specifically designed for science policy interpretation. AT-RAFT incorporates three complementary artifacts: a two-stage retrieval mechanism, adaptive hard-negative fine-tuning, and an interpretable response interface. It is trained directly on policy documents, allowing the model to provide reference answers based on retrieved text while also offering the original policy context to enhance interpretability. Our experiments demonstrate that AT-RAFT improves retrieval accuracy by 48% and generation performance by 44% compared to existing baseline systems, effectively supporting real-world decision-making tasks for stakeholders in research institutions and funding agencies. Our proposed method has been adopted by ScholarMate, the largest professional research social networking platform in China, and is now deployed on their platform, providing global users with access to advanced policy interpretation tools. Additionally, a demo version of the instantiated interface is available at https://github.com/renruntao/ResearchPolicy_RAG. © 2025 The Author(s).
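As a rough illustration of the two-stage retrieval idea described in the abstract, the sketch below retrieves a coarse candidate pool and then reranks it, returning the retrieved policy passages alongside the answer so they can be shown as references. This is not the authors' AT-RAFT implementation; the scoring functions, names (Passage, two_stage_retrieve), and sample corpus are illustrative assumptions standing in for the dense retriever and the adaptively fine-tuned reranker described in the paper.

```python
# Minimal sketch (assumed, not the published AT-RAFT code) of a
# retrieve-then-rerank pipeline whose output keeps the original policy
# passages, so every generated claim can be traced back to its source.
from dataclasses import dataclass


@dataclass
class Passage:
    doc_id: str   # identifier of the source policy document
    text: str     # original policy text, preserved for the interpretable interface


def coarse_score(query: str, passage: Passage) -> float:
    """Stage 1: cheap recall-oriented score (stand-in for a dense retriever)."""
    q, p = set(query.lower().split()), set(passage.text.lower().split())
    return len(q & p) / (len(q) or 1)


def rerank_score(query: str, passage: Passage) -> float:
    """Stage 2: precision-oriented score (stand-in for a fine-tuned reranker)."""
    q = query.lower().split()
    p = passage.text.lower()
    return sum(p.count(tok) for tok in q) / (len(p.split()) or 1)


def two_stage_retrieve(query: str, corpus: list[Passage],
                       k_coarse: int = 20, k_final: int = 3) -> list[Passage]:
    # Stage 1: narrow the corpus to a small candidate pool.
    pool = sorted(corpus, key=lambda p: coarse_score(query, p), reverse=True)[:k_coarse]
    # Stage 2: rerank the pool and keep the top passages as citable evidence.
    return sorted(pool, key=lambda p: rerank_score(query, p), reverse=True)[:k_final]


if __name__ == "__main__":
    corpus = [
        Passage("policy-A", "Funding applicants must hold a full-time faculty position."),
        Passage("policy-B", "Equipment purchases above the budget cap require prior approval."),
    ]
    for p in two_stage_retrieve("Who is eligible to apply for funding?", corpus):
        print(f"[{p.doc_id}] {p.text}")   # passages surfaced alongside the generated answer
```

In the actual system the two stages would be learned components rather than token-overlap heuristics, but the overall flow of narrowing to a candidate pool and then reranking it before generation is the same.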
Original language: English
Article number: 127330
Journal: Expert Systems with Applications
Volume: 278
Online published: 27 Mar 2025
DOIs
Publication status: Published - 10 Jun 2025

Research Keywords

  • Fine-tuning
  • Generative AI
  • Interpretability
  • Large Language Model
  • Retrieval-augmented Generation

Publisher's Copyright Statement

  • This full text is made available under CC-BY-NC 4.0. https://creativecommons.org/licenses/by-nc/4.0/

