Data Preparation for Deep Learning based Code Smell Detection: A Systematic Literature Review
Research output: Journal Publications and Reviews › RGC 21 - Publication in refereed journal › peer-review
Author(s)
Related Research Unit(s)
Detail(s)
Original language | English |
---|---|
Number of pages | 29 |
Journal / Publication | Journal of Systems and Software |
Online published | 12 Jun 2024 |
Publication status | Online published - 12 Jun 2024 |
Abstract
Code Smell Detection (CSD) plays a crucial role in improving software quality and maintainability. Deep Learning (DL) techniques have emerged as a promising approach to CSD due to their superior performance. However, the effectiveness of DL-based CSD methods relies heavily on the quality of the training data. Despite its importance, little attention has been paid to analyzing the data preparation process.
This systematic literature review analyzes the data preparation techniques used in DL-based CSD methods. We identify 36 relevant papers published by December 2023 and provide a thorough analysis of the critical considerations in constructing CSD datasets, including data requirements, collection, labeling, and cleaning. We also summarize seven primary challenges and corresponding solutions in the literature.
Finally, we offer actionable recommendations for preparing and accessing high-quality CSD data, emphasizing the importance of data diversity, standardization, and accessibility. This survey provides valuable insights for researchers and practitioners to harness the full potential of DL techniques in CSD.
Research Area(s)
- Code Smell Detection, Deep Learning, Data Preparation, Systematic Literature Review
Bibliographic Note
Information for this record is supplemented by the author(s) concerned.
Citation Format(s)
Data Preparation for Deep Learning based Code Smell Detection: A Systematic Literature Review. / Zhang, Fengji; Zhang, Zexian; Keung, Jacky Wai et al.
In: Journal of Systems and Software, 12.06.2024.