Vul-R2: A Reasoning LLM for Automated Vulnerability Repair

  • Xin-Cheng Wen
  • Zirui Lin
  • Yijun Yang
  • Cuiyun Gao*
  • Deheng Ye
*Corresponding author for this work

Research output: Chapters, Conference Papers, Creative and Literary Works; RGC 32 - Refereed conference paper (with host publication); peer-review

Abstract

The exponential increase in software vulnerabilities has created an urgent need for automatic vulnerability repair (AVR) solutions. Recent research has formulated AVR as a sequence generation problem and has leveraged large language models (LLMs) to address this problem. Typically, these approaches prompt or fine-tune LLMs to generate repairs for vulnerabilities directly. Although these methods show state-of-the-art performance, they face the following challenges: (1) Lack of high-quality, vulnerability-related reasoning data. Current approaches primarily rely on foundation models that mainly encode general programming knowledge. Without vulnerability-related reasoning data, they tend to fail to capture the diverse vulnerability repair patterns. (2) Hard to verify the intermediate vulnerability repair process during LLM training. Existing reinforcement learning methods often leverage intermediate execution feedback from the environment (e.g., sandbox-based execution results) to guide reinforcement learning training. In contrast, the vulnerability repair process generally lacks such intermediate, verifiable feedback, which poses additional challenges for model training.

To address these challenges, we propose to model the vulnerability repair task from a reasoning perspective and train a reasoning LLM, termed Vulnerability Reasoner and Repair (Vul-R2), which consists of two key modules: (1) a domain-aware reasoning learning module, which comprises a reasoning answer construction component, a reasoning data filtering process, and a supervised fine-tuning process for learning vulnerability-related reasoning knowledge; and (2) a curriculum-based verifiable rewarded training module, which dynamically applies reinforcement learning with verifiable rewards, based on multiple-choice question answering in an easy stage and character-level matching in a hard stage. We evaluate Vul-R2 on the real-world C/C++ dataset PrimeVul to demonstrate its effectiveness in vulnerability repair. Specifically, Vul-R2 outperforms the best baseline by 11.27% in exact match (EM) and successfully repairs 49 additional vulnerabilities. Furthermore, we demonstrate the effectiveness of the proposed paradigm: fine-tuning Vul-R2 on PrimeVul improves EM by 8.78% on the human-curated dataset SVEN, even without additional training. © 2025 IEEE.
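
The abstract names two verifiable reward signals (multiple-choice question answering in the easy stage, character-level matching in the hard stage) and the exact match (EM) metric, but it does not give their formulas. The following is a minimal Python sketch of what such signals could look like, not the authors' implementation. All function names are hypothetical, the whitespace normalization is an assumption, and `difflib.SequenceMatcher` is swapped in as a stand-in for whatever character-level matching the paper actually uses.

```python
from difflib import SequenceMatcher


def normalize(code: str) -> str:
    """Collapse runs of whitespace (an assumed normalization, not from the paper)."""
    return " ".join(code.split())


def multiple_choice_reward(predicted_option: str, correct_option: str) -> float:
    """Easy stage (assumed form): 1.0 if the chosen option label is correct, else 0.0."""
    same = predicted_option.strip().upper() == correct_option.strip().upper()
    return 1.0 if same else 0.0


def char_match_reward(predicted_fix: str, reference_fix: str) -> float:
    """Hard stage (assumed form): character-level similarity in [0, 1]."""
    matcher = SequenceMatcher(
        None, normalize(predicted_fix), normalize(reference_fix), autojunk=False
    )
    return matcher.ratio()


def exact_match(predicted_fix: str, reference_fix: str) -> bool:
    """EM as commonly defined: the normalized prediction equals the reference."""
    return normalize(predicted_fix) == normalize(reference_fix)


def curriculum_reward(stage: str, prediction: str, reference: str) -> float:
    """Select the reward by curriculum stage ('easy' or 'hard')."""
    if stage == "easy":
        return multiple_choice_reward(prediction, reference)
    if stage == "hard":
        return char_match_reward(prediction, reference)
    raise ValueError(f"unknown stage: {stage!r}")


if __name__ == "__main__":
    # Illustrative inputs only; they are not taken from PrimeVul or SVEN.
    print(curriculum_reward("easy", "b", "B"))
    print(curriculum_reward("hard", "if (len < size)", "if (len <= size)"))
    print(exact_match("return  0;", "return 0;"))
```

Both rewards can be computed from the model output and a reference alone, which is the sense in which they are verifiable without sandbox execution feedback. How the training switches between stages is not described in the abstract, so the sketch leaves the stage as an explicit argument.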
Original language: English
Title of host publication: Proceedings - 2025 40th IEEE/ACM International Conference on Automated Software Engineering (ASE 2025)
Publisher: IEEE
Pages: 26-38
Number of pages: 13
ISBN (Electronic): 979-8-3503-5733-2
Publication status: Published - Nov 2025
Event: 40th IEEE/ACM International Conference on Automated Software Engineering (ASE 2025) - Seoul, Korea, Republic of
Duration: 16 Nov 2025 to 20 Nov 2025
https://conf.researchr.org/home/ase-2025

Publication series

Name: Proceedings - IEEE/ACM International Conference on Automated Software Engineering, ASE

Conference

Conference: 40th IEEE/ACM International Conference on Automated Software Engineering (ASE 2025)
Abbreviated title: ASE'25
Place: Korea, Republic of
City: Seoul
Period: 16/11/25 to 20/11/25

Funding

This research is supported by the National Natural Science Foundation of China (Projects No. 62472126 and No. 62276075), the Natural Science Foundation of Guangdong Province (Project No. 2023A1515011959), and the Shenzhen-Hong Kong Jointly Funded Project (Category A, No. SGDX20230116091246007).

Research Keywords

  • Large Language Model
  • Reinforcement Learning
  • Vulnerability Repair
