A Study of Traditional and AI-driven Approaches for Agile Requirements Engineering
關於傳統以及人工智能驅動方法在敏捷需求工程中的研究
Student thesis: Doctoral Thesis
Author(s)
Related Research Unit(s)
Detail(s)
Awarding Institution | |
---|---|
Supervisors/Advisors | |
Award date | 30 Aug 2024 |
Link(s)
Permanent Link | https://scholars.cityu.edu.hk/en/theses/theses(d3718285-81c2-4d8a-951f-a7bfb5e1ce3a).html |
---|---|
Abstract
In recent decades, agile software development has been widely adopted by companies of all sizes and has become a cornerstone of the software development process. Requirements Engineering (RE) is pivotal in this process, providing the foundation for effort management, architectural design, development framework construction, and testing strategy definition. Understanding the challenges of agile RE in real-world contexts is crucial for developing advanced RE techniques tailored to the complexities of agile development.
This thesis presents two case studies conducted on industry-academia projects to explore the challenges encountered in agile RE within real-world contexts. Through action research methodology, traditional RE techniques were applied to address identified challenges. Furthermore, the thesis proposes leveraging Large Language Models (LLMs) to automate certain agile RE tasks, such as requirements elicitation and analysis, to reduce human effort in agile development.
Global disruptions such as the COVID-19 pandemic forced the widespread adoption of Work-From-Home (WFH) arrangements, which make it considerably harder to convey requirements in agile RE. Because WFH practices are expected to continue after COVID, identifying and addressing these challenges is important. During the pandemic, we engaged in an industry-academia collaboration and used action research to analyze agile RE practices in the WFH context. We identified key challenges, then proposed collaborative RE techniques and refined them iteratively over three intervention cycles, complemented by in-depth interviews. The study also yields guidance on effective collaborative RE techniques and lessons learned.
Machine Learning (ML) has emerged as a core technology across various domains and has attracted substantial attention in Software Engineering (SE). However, the complexity of developing ML applications introduces additional challenges for RE activities. Responding to concerns raised by RE researchers, this thesis observes RE activities for ML-enabled systems in a real-world context, focusing on an ML-enabled FinTech project. By engaging data scientists to clarify ML-related requirements, the study explores RE difficulties from the perspectives of both data scientists and requirements engineers. It also proposes an RE framework that iteratively adapts to data and model relevance in ML-enabled FinTech application development, helping teams complete RE activities tailored to ML-specific requirements.
Within agile RE, Generating Acceptance Criteria (GAC) to elaborate user stories is critical during sprint planning. However, the lack of labeled datasets of User Stories associated with Acceptance Criteria (US-AC) poses significant challenges for supervised learning techniques that attempt to automate this process. Recent advances in LLM technology offer a potential alternative, motivating our proposal of an automated framework named SimAC. SimAC leverages LLMs to simulate collaborative practices within an agile environment in order to improve acceptance criteria generation. Because no ground truth was available, we invited practitioners to build a gold standard, which serves as a benchmark for evaluating the completeness and validity of auto-generated US-AC against human-crafted ones. In addition, we invited eight experienced agile practitioners to evaluate the quality of US-AC using the INVEST framework. The results demonstrate consistent improvements across all tested LLMs, including the LLaMA and GPT-3.5 series. The thesis also presents case studies that illustrate SimAC's effectiveness and limitations, shedding light on the potential of LLMs in automated agile RE.
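As a rough illustration of scoring auto-generated acceptance criteria against a gold standard, the sketch below treats completeness as the share of gold-standard criteria that are covered and validity as the share of generated criteria that match a gold-standard criterion. These definitions, the function name `score_acceptance_criteria`, and the index-pair representation of human judgments are assumptions made for this example; the abstract does not state how the thesis computes either measure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scores:
    completeness: float  # share of gold-standard criteria covered (assumed definition)
    validity: float      # share of generated criteria matching the gold standard (assumed definition)


def score_acceptance_criteria(n_gold, n_generated, matches):
    """Score one user story.

    matches is a set of (gold_index, generated_index) pairs that a human
    rater judged to be equivalent. This helper is hypothetical.
    """
    covered_gold = {gold for gold, _ in matches}
    valid_generated = {generated for _, generated in matches}
    completeness = len(covered_gold) / n_gold if n_gold else 0.0
    validity = len(valid_generated) / n_generated if n_generated else 0.0
    return Scores(completeness, validity)


# Four gold criteria, five generated criteria, three judged matches.
scores = score_acceptance_criteria(4, 5, {(0, 0), (1, 2), (1, 3)})
print(scores)
```

In this toy case two of the four gold criteria are covered and three of the five generated criteria are judged valid, so several generated criteria mapping to the same gold criterion raise validity without raising completeness.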
Finally, to support further requirements analysis, automating the derivation of conceptual models, such as class diagrams, from user stories holds promise for streamlining agile projects. We explore the potential of LLMs and of CoTUML, a Chain-of-Thought (CoT) prompting technique tailored to this task. In a preliminary study, we compare LLM-based approaches with guided human extraction and find that the LLM-based approaches perform better, particularly when combined with well-crafted few-shot prompts. CoTUML further enhances the comprehensiveness of relationship identification. Qualitative analyses additionally highlight areas of suboptimal performance, suggesting avenues for improvement and directions for future research in requirements analysis automation.
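To make the idea of few-shot Chain-of-Thought prompting concrete, the sketch below assembles a prompt from worked demonstrations followed by new user stories. It only builds a string and does not call any model. The function name `build_cot_prompt`, the instruction wording, and the demonstration are invented for illustration and should not be read as the actual CoTUML prompt.

```python
def build_cot_prompt(examples, user_stories):
    """Assemble a few-shot Chain-of-Thought prompt (hypothetical layout).

    examples is a list of (story, reasoning, model) triples that serve as
    worked demonstrations; user_stories is the list of new stories to analyze.
    """
    parts = [
        "Derive classes and relationships from the user stories. "
        "Think step by step."
    ]
    for story, reasoning, model in examples:
        parts.append(
            f"User story: {story}\nReasoning: {reasoning}\nModel: {model}"
        )
    stories = "\n".join(f"- {s}" for s in user_stories)
    # End on "Reasoning:" so the model writes its reasoning before the model.
    parts.append(f"User stories:\n{stories}\nReasoning:")
    return "\n\n".join(parts)


demo = (
    "As a customer, I want to place an order so that I can buy products.",
    "The actor 'customer' and the object 'order' are candidate classes; "
    "'place' links them.",
    "Customer --places--> Order",
)
prompt = build_cot_prompt(
    [demo], ["As an admin, I want to cancel an order."]
)
print(prompt)
```

Ending the prompt on the reasoning cue, rather than asking for the model directly, is the design choice that makes this a Chain-of-Thought prompt: the demonstrations show intermediate reasoning, and the new case invites the same.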
- Agile Requirements Engineering, Action Research, Large Language Models, Prompt Engineering