Automated Identification and Representation of System Requirements Based on Large Language Models and Knowledge Graphs

Lei Wang*, Ming-Chao Wang, Yuan-Rong Zhang, Jian Ma, Hong-Yu Shao*, Zhi-Xing Chang

*Corresponding author for this work

Research output: Journal Publications and Reviews; RGC 21 - Publication in refereed journal; peer-review

Abstract

In the product design and manufacturing process, the effective management and representation of system requirements (SRs) are crucial for ensuring product quality and consistency. However, current methods are hindered by document ambiguity, weak requirement interdependencies, and limited semantic expressiveness in model-based systems engineering. To address these challenges, this paper proposes a prompt-driven integrated framework that combines large language models (LLMs) and knowledge graphs (KGs) to automate both structured knowledge extraction from SR text and its visualization. Specifically, this paper introduces an information extraction template applicable to arbitrary requirement documents, designed around five SysML-defined SR categories: functional requirements, interface requirements, performance requirements, physical requirements, and design constraints. By defining structured elements for each category and using the GPT-4 model to extract key information from unstructured text, the system extracts requirement information and presents it in structured form. Furthermore, the system constructs a knowledge graph to represent system requirements, visually illustrating the interdependencies and constraints between them. A case study applying this approach to Chapters 18–22 of the ‘Code for Design of Metro’ demonstrates the effectiveness of the proposed method in automating requirement representation, enhancing requirement traceability, and improving requirement management. Moreover, a comparison of information extraction accuracy between GPT-4, GPT-3.5-turbo, BERT, and RoBERTa on the same dataset shows that GPT-4 achieves an overall extraction accuracy of 84.76%, compared to 79.05% for GPT-3.5-turbo and 59.05% for both BERT and RoBERTa. These results support the effectiveness of the proposed method in information extraction and provide a new technical pathway for intelligent requirement management. © 2025 by the authors.
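
The pipeline described in the abstract (a category-based extraction template, an LLM reply, then a knowledge graph) can be illustrated with a minimal sketch. This is not the authors' implementation: the field names (`id`, `subject`, `statement`, `depends_on`), the relation names, the prompt wording, and the mock reply are all assumptions made for illustration, and no model is actually called. Only the five category names come from the abstract.

```python
import json

# The five SysML-defined SR categories named in the abstract.
SR_CATEGORIES = [
    "functional requirement",
    "interface requirement",
    "performance requirement",
    "physical requirement",
    "design constraint",
]

# Hypothetical structured elements; the paper's actual template fields
# are not given in the abstract.
FIELDS = ["id", "category", "subject", "statement", "depends_on"]


def build_prompt(text):
    """Assemble an extraction prompt asking for a JSON list of requirements."""
    return (
        "Classify each requirement in the text into one of: "
        + "; ".join(SR_CATEGORIES)
        + ".\nReturn a JSON list of objects with keys: "
        + ", ".join(FIELDS)
        + ".\n\nText:\n"
        + text
    )


def parse_requirements(raw):
    """Parse a JSON reply and drop items whose category is not recognized."""
    items = json.loads(raw)
    return [item for item in items if item.get("category") in SR_CATEGORIES]


def to_triples(reqs):
    """Convert requirement records into (head, relation, tail) triples."""
    triples = []
    for req in reqs:
        triples.append((req["id"], "has_category", req["category"]))
        triples.append((req["id"], "applies_to", req["subject"]))
        for dep in req.get("depends_on", []):
            triples.append((req["id"], "depends_on", dep))
    return triples


# Invented stand-in for a model reply; not taken from the paper or from
# the 'Code for Design of Metro'.
MOCK_REPLY = json.dumps([
    {"id": "R1", "category": "functional requirement",
     "subject": "ventilation system",
     "statement": "The system shall supply fresh air to the platform.",
     "depends_on": []},
    {"id": "R2", "category": "performance requirement",
     "subject": "ventilation system",
     "statement": "Air supply shall meet a specified flow rate.",
     "depends_on": ["R1"]},
    {"id": "R3", "category": "unclassified",
     "subject": "n/a", "statement": "n/a", "depends_on": []},
])

if __name__ == "__main__":
    requirements = parse_requirements(MOCK_REPLY)
    for triple in to_triples(requirements):
        print(triple)
```

In this sketch the triples are kept as plain tuples so that they could be loaded into whichever graph store is preferred; the abstract does not state which graph tooling the authors used.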
Original language: English
Article number: 3502
Journal: Applied Sciences (Switzerland)
Volume: 15
Issue number: 7
Online published: 23 Mar 2025
Publication status: Published - Apr 2025

Research Keywords

  • information extraction
  • knowledge graphs
  • large language models
  • requirement representation

Publisher's Copyright Statement

  • This full text is made available under CC-BY 4.0. https://creativecommons.org/licenses/by/4.0/
