An Empirical Study of Learning to Rank Techniques for Effort-Aware Defect Prediction
Research output: Chapters, Conference Papers, Creative and Literary Works › RGC 32 - Refereed conference paper (with host publication) › peer-review
Detail(s)
Original language | English |
---|---|
Title of host publication | SANER '19 - Proceedings of the 2019 IEEE 26th International Conference on Software Analysis, Evolution, and Reengineering |
Editors | Xinyu Wang, David Lo, Emad Shihab |
Publisher | Institute of Electrical and Electronics Engineers, Inc. |
Pages | 298-309 |
ISBN (electronic) | 9781728105918 |
ISBN (print) | 9781728105925 |
Publication status | Published - Feb 2019 |
Publication series
Name | Proceedings of the ... IEEE International Conference on Software Analysis, Evolution, and Reengineering |
---|---|
ISSN (Print) | 1534-5351 |
Conference
Title | 26th IEEE International Conference on Software Analysis, Evolution, and Reengineering (SANER 2019) |
---|---|
Location | Zhejiang University |
Place | China |
City | Hangzhou |
Period | 24 - 27 February 2019 |
Abstract
Effort-Aware Defect Prediction (EADP) uses learning to rank algorithms to rank software modules by their likelihood of being defective, their predicted number of defects, or their defect density. Prior empirical studies compared only a few learning to rank algorithms on a small number of datasets, evaluated them with inappropriate performance measures or with only one type of measure, and relied on non-robust statistical tests. To address these concerns and investigate the impact of learning to rank algorithms on the performance of EADP models, we examine the practical effects of 23 learning to rank algorithms on 41 available defect datasets from the PROMISE repository using a module-based effort-aware performance measure (FPA) and a source lines of code (SLOC) based effort-aware performance measure (Norm(Popt)). In addition, we compare the performance of these algorithms when they are trained on a more relevant feature subset selected by the Information Gain feature selection method. Statistically significant differences are observed among these algorithms for both measures: BRR (Bayesian Ridge Regression) performs best in terms of FPA, and BRR and LTR (Learning-To-Rank) perform best in terms of Norm(Popt). When the algorithms are trained on the feature subset selected by Information Gain, LTR and BRR still perform best, with significant differences in terms of both FPA and Norm(Popt). Therefore, we recommend BRR and LTR for building EADP models in order to find more defects when inspecting a given number of modules or lines of code.
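The abstract names FPA as its module-based measure but does not state the formula. As a minimal sketch, assuming the commonly used fault-percentile-average definition (the mean, over every cutoff m, of the share of actual defects found in the top m ranked modules), the measure could be computed as below. The function name `fpa` and its arguments are illustrative and are not taken from the paper.

```python
from typing import Sequence


def fpa(predicted: Sequence[float], actual_defects: Sequence[int]) -> float:
    """Fault-percentile-average of a ranking (assumed definition; higher is better).

    predicted      -- model scores, one per module (hypothetical input)
    actual_defects -- true defect counts for the same modules
    """
    if len(predicted) != len(actual_defects):
        raise ValueError("predicted and actual_defects must have equal length")
    k = len(actual_defects)
    total = sum(actual_defects)
    if k == 0 or total == 0:
        raise ValueError("need at least one module and at least one defect")

    # Rank modules from most to least suspicious by predicted score.
    order = sorted(range(k), key=lambda i: predicted[i], reverse=True)

    found = 0               # defects found in the top-m modules so far
    cumulative_share = 0.0  # running sum of the top-m defect shares
    for i in order:
        found += actual_defects[i]
        cumulative_share += found / total

    # Average the top-m shares over all k cutoffs.
    return cumulative_share / k


if __name__ == "__main__":
    defects = [0, 3, 1, 0]
    print(fpa(defects, defects))               # ranking that matches the true counts
    print(fpa([-d for d in defects], defects)) # fully reversed ranking
```

Under this definition, a ranking that places the most defective modules first scores higher than one that places them last, which is the behaviour an EADP model is meant to optimise when inspection is limited to a fixed number of modules.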
Research Area(s)
- effort-aware defect prediction, empirical study, learning to rank, Scott-Knott ESD test
Citation Format(s)
An Empirical Study of Learning to Rank Techniques for Effort-Aware Defect Prediction. / Yu, Xiao; Bennin, Kwabena Ebo; Liu, Jin et al.
SANER '19 - Proceedings of the 2019 IEEE 26th International Conference on Software Analysis, Evolution, and Reengineering. ed. / Xinyu Wang; David Lo; Emad Shihab. Institute of Electrical and Electronics Engineers, Inc., 2019. p. 298-309 8668033 (Proceedings of the ... IEEE International Conference on Software Analysis, Evolution, and Reengineering).