The impact of class imbalance techniques on crashing fault residence prediction models

Research output: Journal Publications and Reviews, RGC 21 - Publication in refereed journal, peer-review

3 Scopus Citations

Author(s)

  • Kunsong Zhao
  • Zhou Xu
  • Meng Yan
  • Tao Zhang
  • Lei Xue
  • Ming Fan

Detail(s)

Original language: English
Article number: 49
Journal / Publication: Empirical Software Engineering
Volume: 28
Issue number: 2
Online published: 22 Feb 2023
Publication status: Published - Mar 2023

Abstract

Software crashes occur when a program executes incorrectly or is forcibly interrupted, which negatively impacts user experience. Because stack traces carry exception-related information about software crashes, researchers have used features collected from the stack trace to automatically identify whether the fault residence of a crash lies inside the stack trace, with the aim of accelerating crash localization. A recent work conducted the first large-scale empirical study of how feature selection methods affect the performance of classification models for this task. However, crash data are intrinsically class-imbalanced, i.e., the number of crash instances whose fault lies inside the stack trace differs greatly from the number whose fault lies outside it, a characteristic the previous work ignored. To fill this gap, we conduct a large-scale empirical study of how different imbalanced learning techniques affect the performance of crashing fault residence prediction models, using a benchmark dataset comprising seven software projects and four evaluation indicators. Our experimental results demonstrate that two imbalanced variants of the bagging classifier perform better than the other compared techniques in both the normal and cross-project settings, and consistently deliver excellent prediction performance even as the imbalance level changes. © The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2023.

Research Area(s)

  • Crash localization, Empirical study, Imbalanced learning, Stack trace

Citation Format(s)

The impact of class imbalance techniques on crashing fault residence prediction models. / Zhao, Kunsong; Xu, Zhou; Yan, Meng et al.
In: Empirical Software Engineering, Vol. 28, No. 2, 49, 03.2023.
