The Impact of the bug number on Effort-Aware Defect Prediction: An Empirical Study

Research output: Chapters, Conference Papers, Creative and Literary Works (RGC: 12, 32, 41, 45) | 32_Refereed conference paper (with host publication) | peer-review

Author(s)

  • Peixin Yang
  • Lin Zhu
  • Wenhua Hu
  • Liping Lu
  • Jianwen Xiang

Detail(s)

Original language: English
Title of host publication: 14th Asia-Pacific Symposium on Internetware (Internetware 2023) - Proceedings
Publisher: Association for Computing Machinery
Pages: 67-78
ISBN (Print): 9798400708947
Publication status: Published - 2023

Publication series

Name: ACM International Conference Proceeding Series

Conference

Title: 14th Asia-Pacific Symposium on Internetware (Internetware 2023)
Place: China
City: Hangzhou
Period: 4-6 August 2023

Abstract

Previous research has relied on public software defect datasets such as NASA, RELINK, and SOFTLAB, which contain only class label information, and almost all Effort-Aware Defect Prediction (EADP) studies are built on these datasets. However, EADP studies typically rely on bug density (i.e., the ratio of the number of bugs to the lines of code) to rank software modules. To investigate how neglecting bug number information in software defect datasets affects the performance of EADP models, we examine the performance degradation of the best-performing learning-to-rank methods when class labels are used instead of bug numbers. The experimental results show that neglecting bug number information when building EADP models results in an increase in the detected bugs. However, it also leads to a significant increase in the initial false alarms, ranging from 45.5% to 90.9% of the datasets, and a significant increase in the modules that need to be inspected, ranging from 5.2% to 70.4%. Therefore, we recommend that not only the class labels but also the bug number information be disclosed when publishing software defect datasets, so that more accurate EADP models can be constructed. © 2023 Copyright held by the owner/author(s). Publication rights licensed to ACM.
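
The bug density definition in the abstract (number of bugs divided by lines of code) can be illustrated with a small sketch. This is not the paper's method: it ranks modules by their true density rather than by a learned model, and the `Module` type, the helper names, and the toy data are all hypothetical. It only shows how replacing bug numbers with a binary class label can change the ranking.

```python
from dataclasses import dataclass


@dataclass
class Module:
    name: str
    loc: int   # lines of code, used here as the inspection-effort proxy
    bugs: int  # number of bugs recorded for the module


def bug_density(m: Module) -> float:
    """Bug density as defined in the abstract: bug number / lines of code."""
    return m.bugs / m.loc if m.loc > 0 else 0.0


def label_density(m: Module) -> float:
    """The same ratio when only a class label (defective or not) is known."""
    label = 1 if m.bugs > 0 else 0
    return label / m.loc if m.loc > 0 else 0.0


def rank(modules, score):
    """Return module names sorted by descending score."""
    return [m.name for m in sorted(modules, key=score, reverse=True)]


# Hypothetical toy data, not taken from the paper.
modules = [
    Module("a", loc=100, bugs=1),
    Module("b", loc=400, bugs=8),
    Module("c", loc=50, bugs=0),
]

by_bug_number = rank(modules, bug_density)
by_class_label = rank(modules, label_density)
print(by_bug_number, by_class_label)
```

With this toy data, the large module `b` ranks first when bug numbers are available but drops below `a` when only class labels are used, because its eight bugs collapse to a single "defective" label.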

Research Area(s)

  • Bug Number, Effort-Aware, Learning to Rank, Software Defect Prediction

Citation Format(s)

The Impact of the bug number on Effort-Aware Defect Prediction: An Empirical Study. / Yang, Peixin; Zhu, Lin; Hu, Wenhua et al.
14th Asia-Pacific Symposium on Internetware (Internetware 2023) - Proceedings. Association for Computing Machinery, 2023. p. 67-78 (ACM International Conference Proceeding Series).