The Impact of the bug number on Effort-Aware Defect Prediction: An Empirical Study

Peixin Yang, Lin Zhu, Wenhua Hu*, Jacky Wai Keung, Liping Lu, Jianwen Xiang

*Corresponding author for this work

Research output: Chapters, Conference Papers, Creative and Literary Works; RGC 32 - Refereed conference paper (with host publication); peer-review

1 Citation (Scopus)

Abstract

Previous research has utilized public software defect datasets such as NASA, RELINK, and SOFTLAB, which contain only class label information. Almost all Effort-Aware Defect Prediction (EADP) studies are carried out on these datasets. However, EADP studies typically rely on bug density (i.e., the ratio of the number of bugs to the lines of code) to rank software modules. To investigate how neglecting bug number information in software defect datasets affects the performance of EADP models, we examine the performance degradation of the best-performing learning-to-rank methods when class labels are used instead of bug numbers. The experimental results show that neglecting bug number information when building EADP models increases the number of detected bugs. However, it also leads to a significant increase in initial false alarms, ranging from 45.5% to 90.9% of the datasets, and a significant increase in the number of modules that need to be inspected, ranging from 5.2% to 70.4%. Therefore, we recommend that not only the class labels but also the bug number information be disclosed when publishing software defect datasets, so that more accurate EADP models can be constructed.

© 2023 Copyright held by the owner/author(s). Publication rights licensed to ACM.
Original language: English
Title of host publication: 14th Asia-Pacific Symposium on Internetware (Internetware 2023) - Proceedings
Publisher: Association for Computing Machinery
Pages: 67-78
ISBN (Print): 9798400708947
DOIs
Publication status: Published - 2023
Event: 14th Asia-Pacific Symposium on Internetware (Internetware 2023) - Hangzhou, China
Duration: 4 Aug 2023 to 6 Aug 2023

Publication series

Name: ACM International Conference Proceeding Series

Conference

Conference: 14th Asia-Pacific Symposium on Internetware (Internetware 2023)
Place: China
City: Hangzhou
Period: 4/08/23 to 6/08/23

Research Keywords

  • Bug Number
  • Effort-Aware
  • Learning to Rank
  • Software Defect Prediction
