An Empirical Study of Learning to Rank Techniques for Effort-Aware Defect Prediction

Research output: Chapters, Conference Papers, Creative and Literary Works (RGC: 12, 32, 41, 45) | 32_Refereed conference paper (with ISBN/ISSN) | Not applicable | peer-review

Author(s)

  • Xiao Yu
  • Kwabena Ebo Bennin
  • Jin Liu
  • Jacky Wai Keung
  • Xiaofei Yin
  • Zhou Xu

Detail(s)

Original language: English
Title of host publication: SANER '19 - Proceedings of the 2019 IEEE 26th International Conference on Software Analysis, Evolution, and Reengineering
Editors: Xinyu Wang, David Lo, Emad Shihab
Publisher: IEEE
Pages: 298-309
ISBN (Electronic): 9781728105918
ISBN (Print): 9781728105925
Publication status: Published - Feb 2019

Publication series

Name: Proceedings of the ... IEEE International Conference on Software Analysis, Evolution, and Reengineering
ISSN (Print): 1534-5351

Conference

Title: 26th IEEE International Conference on Software Analysis, Evolution, and Reengineering (SANER 2019)
Location: Zhejiang University
Place: China
City: Hangzhou
Period: 24 - 27 February 2019

Abstract

Effort-Aware Defect Prediction (EADP) uses learning to rank algorithms to rank software modules by their probability of being defective, their predicted number of defects, or their defect density. Prior empirical studies compared only a few learning to rank algorithms, considered a small number of datasets, evaluated with inappropriate measures or with a single type of performance measure, and used non-robust statistical tests. To address these concerns and investigate the impact of learning to rank algorithms on the performance of EADP models, we examine the practical effects of 23 learning to rank algorithms on 41 available defect datasets from the PROMISE repository using a module-based effort-aware performance measure (FPA) and a source lines of code (SLOC) based effort-aware performance measure (Norm(Popt)). In addition, we compare the performance of these algorithms when they are trained on a more relevant feature subset selected by the Information Gain feature selection method. In terms of FPA and Norm(Popt), statistically significant differences are observed among these algorithms, with BRR (Bayesian Ridge Regression) performing best in terms of FPA, and BRR and LTR (Learning-To-Rank) performing best in terms of Norm(Popt). When these algorithms are trained on a more relevant feature subset selected by Information Gain, LTR and BRR still perform best, with significant differences in terms of FPA and Norm(Popt). Therefore, we recommend BRR and LTR for building EADP models in order to find more defects by inspecting a given number of modules or lines of code.
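
The abstract names FPA without defining it. As a rough illustration only, the sketch below assumes the commonly used definition of fault-percentile-average: rank the modules by predicted score, then average, over every cutoff m, the share of actual defects found in the top m modules. The function name `fpa` and the toy data are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def fpa(predicted: Sequence[float], actual: Sequence[int]) -> float:
    """Assumed FPA definition: mean, over m = 1..k, of the share of actual
    defects contained in the top-m modules of the predicted ranking."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    k = len(actual)
    total = sum(actual)
    if k == 0 or total == 0:
        raise ValueError("need at least one module and one actual defect")

    # Rank modules by predicted score, highest first (ties keep input order).
    order = sorted(range(k), key=lambda i: predicted[i], reverse=True)

    found = 0          # actual defects seen so far while walking the ranking
    share_sum = 0.0    # running sum of the top-m defect shares
    for i in order:
        found += actual[i]
        share_sum += found / total
    return share_sum / k


# Toy example: three modules with 3, 0 and 1 actual defects.
good_ranking = fpa([0.9, 0.1, 0.5], [3, 0, 1])   # most defective module first
poor_ranking = fpa([0.1, 0.9, 0.5], [3, 0, 1])   # most defective module last
```

Under this assumed definition, a ranking that places the most defective modules first scores higher, which is why the measure suits comparisons of ranking models that inspect a fixed number of modules.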

Research Area(s)

  • effort-aware defect prediction, empirical study, learning to rank, Scott-Knott ESD test

Citation Format(s)

An Empirical Study of Learning to Rank Techniques for Effort-Aware Defect Prediction. / Yu, Xiao; Bennin, Kwabena Ebo; Liu, Jin; Keung, Jacky Wai; Yin, Xiaofei; Xu, Zhou.

SANER '19 - Proceedings of the 2019 IEEE 26th International Conference on Software Analysis, Evolution, and Reengineering. ed. / Xinyu Wang; David Lo; Emad Shihab. IEEE, 2019. p. 298-309 8668033 (Proceedings of the ... IEEE International Conference on Software Analysis, Evolution, and Reengineering).