Inter-release defect prediction with feature selection using temporal chunk-based learning: An empirical study
|Journal / Publication||Applied Soft Computing|
|Issue number||Part A|
|Online published||9 Sep 2021|
|Publication status||Published - Dec 2021|
|Link to Scopus||https://www.scopus.com/record/display.uri?eid=2-s2.0-85115035790&origin=recordpage|
Inter-release defect prediction (IRDP) is a practical scenario in which data from a previous release are used to build a prediction model that predicts defects in the current release of the same software project. A real software project goes through several releases, and the data of each release arrive as chunks in temporal order. The evolving data of each release can introduce new concepts to the model, a phenomenon known as concept drift, which negatively impacts the performance of IRDP models. In this study, we examine and assess the impact of feature selection (FS) on the performance of IRDP models and on their robustness to concept drift. We conduct empirical experiments using 36 releases of 10 open-source projects. The results of the Friedman test and the Nemenyi post-hoc test indicate statistically significant differences between the prediction results obtained with and without FS techniques. IRDP models trained on the data of the most recent releases were not always the best models. Furthermore, prediction models trained with carefully selected features could help reduce concept drift.
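The chunk-based setup described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the correlation-based filter, the nearest-centroid classifier, the function names, and the toy data are all assumptions chosen to keep the example self-contained. Each release is treated as one chunk, features are selected on release i, and the resulting model is evaluated on release i+1.

```python
from statistics import mean, pstdev

def correlation(xs, ys):
    """Pearson correlation; returns 0.0 when either series is constant."""
    mx, my = mean(xs), mean(ys)
    sx, sy = pstdev(xs), pstdev(ys)
    if sx == 0 or sy == 0:
        return 0.0
    cov = mean((x - mx) * (y - my) for x, y in zip(xs, ys))
    return cov / (sx * sy)

def select_features(X, y, k):
    """Hypothetical filter-style FS: keep the k columns most correlated with the label."""
    n_features = len(X[0])
    scores = [abs(correlation([row[j] for row in X], y)) for j in range(n_features)]
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k])

def fit_centroids(X, y, cols):
    """Stand-in classifier: one mean vector per class over the selected columns."""
    centroids = {}
    for label in set(y):
        rows = [row for row, lab in zip(X, y) if lab == label]
        centroids[label] = [mean(r[j] for r in rows) for j in cols]
    return centroids

def predict(centroids, cols, row):
    """Assign the class whose centroid is closest in squared Euclidean distance."""
    def dist(centroid):
        return sum((row[j] - cj) ** 2 for j, cj in zip(cols, centroid))
    return min(centroids, key=lambda label: dist(centroids[label]))

def inter_release_accuracy(releases, k):
    """Train on release i, test on release i+1, following temporal order."""
    results = []
    for (X_prev, y_prev), (X_cur, y_cur) in zip(releases, releases[1:]):
        cols = select_features(X_prev, y_prev, k)
        model = fit_centroids(X_prev, y_prev, cols)
        hits = sum(predict(model, cols, r) == lab for r, lab in zip(X_cur, y_cur))
        results.append(hits / len(y_cur))
    return results

# Toy data (invented): column 0 mimics a size metric, column 1 is noise;
# label 1 marks a defective module. Three consecutive releases of one project.
releases = [
    ([[10, 5], [12, 1], [50, 4], [55, 2]], [0, 0, 1, 1]),
    ([[11, 3], [14, 6], [52, 1], [60, 5]], [0, 0, 1, 1]),
    ([[9, 2], [13, 4], [48, 6], [58, 3]], [0, 0, 1, 1]),
]
scores = inter_release_accuracy(releases, k=1)
print(scores)
```

A drop in accuracy from one train/test pair to the next would be one informal sign that the data distribution has shifted between releases. The study itself relies on the Friedman and Nemenyi tests for its comparisons, which this sketch does not reproduce.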
- Software defect prediction, Inter-release defect prediction, Feature selection
Applied Soft Computing, Vol. 113, No. Part A, 107870, 12.2021.
Research output: Journal Publications and Reviews (RGC: 21, 22, 62) › 21_Publication in refereed journal › peer-review
Kabir MA, Keung J, Turhan B, Bennin KE. Inter-release defect prediction with feature selection using temporal chunk-based learning: An empirical study. Applied Soft Computing. 2021 Dec;113(Part A):107870. Epub 2021 Sep 9. doi: 10.1016/j.asoc.2021.107870