Abstract
Cross-company defect prediction (CCDP) is a practical approach that trains a prediction model on one or more projects of a source company and then applies the model to a target company. Unfortunately, the performance of such CCDP models is susceptible to the highly imbalanced distribution of defect-prone and non-defect-prone classes in cross-company (CC) data. Class imbalance learning is applied to alleviate this issue. Because many class imbalance learning methods have been proposed, there is a pressing need to analyze and compare the performance of these methods for CCDP. Prior empirical studies have reported that the AdaBoost.NC algorithm achieves the best performance for defect prediction. This observation leads us to conduct a careful empirical study of whether and how class imbalance learning methods can benefit cross-company defect prediction. We investigate the effect of different types of class imbalance learning methods, including under-sampling, over-sampling, and over-sampling followed by under-sampling, on cross-company defect prediction performance over 15 publicly available datasets. Experimental results show that under-sampling achieves the best overall performance in terms of g-measure among the methods we studied.
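The two ideas the abstract relies on, under-sampling the majority class and scoring with the g-measure, can be illustrated with a minimal Python sketch. This is not the paper's implementation: the function names `random_undersample` and `g_measure` are hypothetical, random under-sampling to a 1:1 ratio is only one possible under-sampling variant, and the g-measure is assumed here to be the harmonic mean of the probability of detection (pd) and one minus the probability of false alarm (pf), a definition common in defect prediction work but not stated in the abstract.

```python
import random


def random_undersample(features, labels, seed=0):
    """Randomly drop majority-class rows until both classes are the same size.

    Assumption: labels are 1 (defect-prone) or 0 (non-defect-prone), and the
    target ratio is 1:1. The abstract does not specify either detail.
    """
    rng = random.Random(seed)
    defective = [i for i, y in enumerate(labels) if y == 1]
    clean = [i for i, y in enumerate(labels) if y == 0]
    minority, majority = sorted((defective, clean), key=len)
    # Keep every minority row plus an equally sized random majority subset.
    kept = sorted(minority + rng.sample(majority, len(minority)))
    return [features[i] for i in kept], [labels[i] for i in kept]


def g_measure(y_true, y_pred):
    """Harmonic mean of pd (recall) and 1 - pf (assumed definition)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    pd = tp / (tp + fn) if (tp + fn) else 0.0  # probability of detection
    pf = fp / (fp + tn) if (fp + tn) else 0.0  # probability of false alarm
    specificity = 1.0 - pf
    if pd + specificity == 0:
        return 0.0
    return 2 * pd * specificity / (pd + specificity)
```

In a cross-company setting, a sketch like this would be applied to the source-company training data only, with the g-measure then computed on the untouched target-company data, so that resampling does not distort the evaluation set.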
Original language | English |
---|---|
Title of host publication | SEKE 2017 |
Subtitle of host publication | Proceedings of the 29th International Conference on Software Engineering & Knowledge Engineering |
Publisher | KSI Research Inc. and Knowledge Systems Institute |
Pages | 117-122 |
ISBN (Print) | 1891706411 |
Publication status | Published - 5 Jul 2017 |
Event | 29th International Conference on Software Engineering and Knowledge Engineering, SEKE 2017 - Wyndham Pittsburgh University Center, Pittsburgh, United States; Duration: 5 Jul 2017 → 7 Jul 2017; http://ksiresearchorg.ipage.com/seke/seke17.html; http://ksiresearchorg.ipage.com/seke/Proceedings/seke/SEKE2017_Proceedings.pdf |
Publication series
Name | |
---|---|
ISSN (Print) | 2325-9000 |
ISSN (Electronic) | 2325-9086 |
Conference
Conference | 29th International Conference on Software Engineering and Knowledge Engineering, SEKE 2017 |
---|---|
Country/Territory | United States |
City | Pittsburgh |
Period | 5/07/17 → 7/07/17 |
Research Keywords
- Class imbalance learning
- Cross-company defect prediction
- Software defect prediction