Abstract
Purpose – Smart question-answering (QA) for construction laws (CLs) holds significant promise for helping domain professionals with legal inquiries. Existing studies of construction law question-answering (CLQA) rely on learning-based models, which require extensive training data and are limited to a narrow QA scope. Meanwhile, general-purpose large language models (GPLLMs) have great potential for CLQA but lack domain-specific knowledge. This study proposes a data-driven, expertise-based approach to developing a construction law knowledge repository (CLKR) and validates its effectiveness in enhancing the CLQA performance of GPLLMs.
Design/methodology/approach – The methodology comprises (1) identifying 702 candidate CL documents from 374,992 official judgments, (2) building a CLKR from 387 filtered documents covering eight CL knowledge areas, (3) integrating the CLKR with seven representative GPLLMs and (4) constructing a 2,140-question CLQA dataset from the Professional Construction Engineer Qualification Examinations (PCEQEs) of 2014–2023 to compare the CLQA performance of each of the seven GPLLMs with and without the CLKR.
Findings – The CLKR significantly enhances the CLQA performance of the seven GPLLMs, yielding an average accuracy increase of 21.1%, with individual improvements ranging from 9.9% to 44.9%. The CLKR also raises accuracy on single-answer questions by 14.9% and on multiple-answer questions by 38.3%. Across the eight CL knowledge areas, accuracy improvements range from 14.5% to 28.2%.
Originality/value – This study proposes an approach to developing the CLKR as an external knowledge base that empowers GPLLMs, significantly expanding the scope of CLQA while bypassing the complex training of traditional learning-based models. Moreover, this study confirms the effectiveness of the CLKR in augmenting GPLLM performance and offers a reusable CLQA test dataset as a benchmark.
© 2025 Shenghua Zhou, Hongyu Wang, S. Thomas Ng, Dezhi Li, Shenming Xie, Kaiwen Chen and Wentao Wang. Published by Emerald Publishing Limited.
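The with/without comparison reported in the Findings can be illustrated with a minimal sketch. It rests on assumptions the abstract does not state: a multiple-answer question is scored as correct only on an exact match of the selected options, and the "accuracy increase" is the plain difference between the two accuracies. The names `Question`, `accuracy` and `accuracy_gain` and the toy data are hypothetical and are not taken from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Question:
    qid: str
    kind: str          # "single" or "multiple" (assumed labels)
    answer: frozenset  # set of correct option letters


def is_correct(question, predicted):
    # Assumed scoring rule: the predicted option set must match exactly.
    return frozenset(predicted) == question.answer


def accuracy(questions, predictions, kind=None):
    # Share of correctly answered questions, optionally for one question type.
    pool = [q for q in questions if kind is None or q.kind == kind]
    if not pool:
        return 0.0
    hits = sum(is_correct(q, predictions.get(q.qid, "")) for q in pool)
    return hits / len(pool)


def accuracy_gain(questions, baseline, augmented, kind=None):
    # Accuracy with the knowledge repository minus accuracy without it.
    return accuracy(questions, augmented, kind) - accuracy(questions, baseline, kind)


# Toy data for illustration only; not drawn from the CLQA dataset.
questions = [
    Question("q1", "single", frozenset("A")),
    Question("q2", "single", frozenset("C")),
    Question("q3", "multiple", frozenset("AC")),
    Question("q4", "multiple", frozenset("BD")),
]
without_clkr = {"q1": "A", "q2": "B", "q3": "A", "q4": "BD"}
with_clkr = {"q1": "A", "q2": "C", "q3": "AC", "q4": "BD"}

overall_gain = accuracy_gain(questions, without_clkr, with_clkr)
single_gain = accuracy_gain(questions, without_clkr, with_clkr, kind="single")
multiple_gain = accuracy_gain(questions, without_clkr, with_clkr, kind="multiple")
```

Reporting the gain separately per question type, as the abstract does, matters under an exact-match rule because a partially correct selection on a multiple-answer question counts as wrong.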
| Original language | English |
|---|---|
| Pages (from-to) | 518–546 |
| Number of pages | 29 |
| Journal | Engineering, Construction and Architectural Management |
| Volume | 32 |
| Issue number | 13 |
| Online published | 1 May 2025 |
| Publication status | Published - 15 Dec 2025 |
Funding
This study is financially supported by the National Natural Science Foundation of China (No. 72201057) and the Social Science Foundation of Jiangsu Province (No. 23GLC020).
Research Keywords
- Construction laws
- Knowledge repository
- Large language models
- Question-answering
Fingerprint
Dive into the research topics of 'Building a construction law knowledge repository to enhance general-purpose large language model performance on domain question-answering: a case of China'. Together they form a unique fingerprint.