TSTSS: A two-stage training subset selection framework for cross version defect prediction

Research output: Journal Publications and Reviews > RGC 21 - Publication in refereed journal > peer-review

29 Scopus Citations

Author(s)

  • Zhou Xu
  • Shuai Li
  • Xiapu Luo
  • Jin Liu
  • Tao Zhang
  • Yutian Tang
  • Jun Xu
  • Peipei Yuan

Detail(s)

Original language: English
Pages (from-to): 59-78
Journal / Publication: Journal of Systems and Software
Volume: 154
Online published: 23 Mar 2019
Publication status: Published - Aug 2019

Abstract

Cross Version Defect Prediction (CVDP) is a practical scenario in which a classification model is trained on the historical data of the prior version and then used to predict the defect labels of modules in the current version. Unfortunately, differences in data distribution across versions may hinder the effectiveness of the trained CVDP model. Thus, selecting a suitable training subset from the prior version to improve CVDP performance is not trivial. In this paper, we propose a novel method, called Two-Stage Training Subset Selection (TSTSS), to address this challenging issue. In the first stage, TSTSS uses a sparse modeling representative selection method to select an initial module subset from the prior version that can reconstruct the data of the prior version well. In the second stage, TSTSS leverages a dissimilarity-based sparse subset selection method to further refine the selected module subset, so that the selected modules represent the modules of the current version well. Finally, we use a novel weighted extreme learning machine classifier to construct the CVDP model. We evaluate the CVDP performance of TSTSS on 50 cross-version pairs using 6 indicators. The experiments show that TSTSS can efficiently improve the CVDP performance compared with 11 baseline methods.
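
The abstract names a weighted extreme learning machine as the final classifier but does not give its formulation. As a rough illustration only, the sketch below shows the generic weighted ELM idea: a random, untrained hidden layer followed by a class-weighted ridge solution for the output weights. It is not the paper's novel variant. The function names, the hidden layer size, the regularization constant `C`, and the inverse class-frequency weighting are all assumptions made for this example, and the two subset selection stages are not shown.

```python
import numpy as np


def train_weighted_elm(X, y, n_hidden=50, C=1.0, seed=0):
    """Fit a generic weighted ELM for binary labels y in {0, 1}.

    Hypothetical helper for illustration; not the paper's classifier.
    """
    rng = np.random.default_rng(seed)
    # Hidden-layer parameters are drawn at random and never trained.
    W_in = rng.normal(size=(X.shape[1], n_hidden))
    b = rng.normal(size=n_hidden)
    H = 1.0 / (1.0 + np.exp(-(X @ W_in + b)))  # sigmoid hidden outputs

    # Assumed weighting scheme: each sample is weighted by the inverse
    # of its class frequency, so the rare (defective) class counts more.
    counts = np.bincount(y, minlength=2)
    w = 1.0 / counts[y]

    t = 2.0 * y - 1.0  # map labels {0, 1} to targets {-1, +1}

    # Weighted ridge solution: beta = (I/C + H^T W H)^{-1} H^T W t
    HW = H * w[:, None]
    A = np.eye(n_hidden) / C + H.T @ HW
    beta = np.linalg.solve(A, HW.T @ t)
    return W_in, b, beta


def predict_weighted_elm(model, X):
    """Predict 0/1 labels from the sign of the ELM output."""
    W_in, b, beta = model
    H = 1.0 / (1.0 + np.exp(-(X @ W_in + b)))
    return (H @ beta > 0).astype(int)
```

In a CVDP setting, `X` and `y` would be the features and defect labels of the modules selected from the prior version, and `predict_weighted_elm` would be applied to the modules of the current version.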

Research Area(s)

  • Cross version defect prediction, Sparse modeling, Training subset selection, Weighted extreme learning machine

Citation Format(s)

TSTSS: A two-stage training subset selection framework for cross version defect prediction. / Xu, Zhou; Li, Shuai; Luo, Xiapu et al.
In: Journal of Systems and Software, Vol. 154, 08.2019, p. 59-78.
