An empirical analysis of three-stage data-preprocessing for analogy-based software effort estimation on the ISBSG data

Research output: Chapters, Conference Papers, Creative and Literary Works (RGC: 12, 32, 41, 45) > 32_Refereed conference paper (with ISBN/ISSN)

9 Scopus Citations

Author(s)

Huang, Jianglin; Li, Yan-Fu; Keung, Jacky Wai; Yu, Y. T.; Chan, W. K.

Detail(s)

Original language: English
Title of host publication: Proceedings - 2017 IEEE International Conference on Software Quality, Reliability and Security, QRS 2017
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 442-449
ISBN (Print): 9781538605929
Publication status: Published - 28 Jul 2017

Conference

Title: The 2017 IEEE International Conference on Software Quality, Reliability, and Security
Location: Faculty of Information Technology, Czech Technical University
Place: Czech Republic
City: Prague
Period: 25 - 29 July 2017

Abstract

Analogy-based software effort estimation is a method to estimate the cost of an unseen project based on analogies against previous projects sharing selected features. The validity of the selected features depends on many factors, and one of the most crucial factors is the effectiveness of the data-preprocessing techniques applied to the datasets of the previous projects. In this paper, we report the first controlled experiment that studies the class of three-stage data-preprocessing techniques, with stages of missing data imputation, data normalization, and feature selection, for analogy-based effort estimation. We conducted our investigation on the ISBSG data. The experimental results show that three-stage data-preprocessing techniques have significant impacts on the resultant effort estimation accuracy. The results also indicate that the combined use of Z-Score normalization, kNN imputation, and mutual-information-based feature weighting can be an effective choice for analogy-based effort estimation.
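To make the pipeline named in the abstract concrete, the following is a minimal sketch of kNN imputation, Z-Score normalization, and a feature-weighted analogy (nearest-neighbour) estimate. It is not the authors' implementation. The function names, the toy project data, the choice of k, and the equal feature weights are all assumptions made for illustration. In particular, the paper derives feature weights from mutual information, which this sketch does not compute; the weights here are placeholders passed in by hand.

```python
import math
from statistics import mean, pstdev


def knn_impute(rows, k=2):
    """Fill None entries with the mean of that feature over the k nearest rows.

    Distance is computed only over features observed in both rows.
    """
    filled = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            candidates = []
            for m, other in enumerate(rows):
                if m == i or other[j] is None:
                    continue
                shared = [(a - b) ** 2 for a, b in zip(row, other)
                          if a is not None and b is not None]
                if shared:
                    candidates.append((math.sqrt(mean(shared)), other[j]))
            if not candidates:
                raise ValueError(f"no donor rows for row {i}, feature {j}")
            candidates.sort(key=lambda c: c[0])
            filled[i][j] = mean([v for _, v in candidates[:k]])
    return filled


def zscore(rows):
    """Standardise each column to zero mean and unit population std."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sigmas = [pstdev(c) or 1.0 for c in cols]  # constant column: avoid division by zero
    scaled = [[(v - mu) / s for v, mu, s in zip(r, mus, sigmas)] for r in rows]
    return scaled, mus, sigmas


def estimate_effort(query, features, efforts, weights, k=2):
    """Mean effort of the k historical projects closest to the query."""
    def dist(a, b):
        return math.sqrt(sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b)))

    ranked = sorted(zip(features, efforts), key=lambda p: dist(query, p[0]))
    return mean([e for _, e in ranked[:k]])


# Hypothetical historical projects: [size, team size]; None marks a missing value.
history = [
    [100.0, 5.0],
    [120.0, None],
    [110.0, 6.0],
    [400.0, 12.0],
    [380.0, 11.0],
]
efforts = [1000.0, 1200.0, 1100.0, 4100.0, 3900.0]

# Stage 1: impute, Stage 2: normalise, Stage 3: apply feature weights.
filled = knn_impute(history, k=2)
scaled, mus, sigmas = zscore(filled)
weights = [0.5, 0.5]  # placeholder; the paper uses mutual-information-based weights

# The unseen project is scaled with the statistics of the historical data.
new_project = [115.0, 5.5]
query = [(v - mu) / s for v, mu, s in zip(new_project, mus, sigmas)]
estimate = estimate_effort(query, scaled, efforts, weights, k=2)
print(filled[1], estimate)
```

The unseen project is normalized with the means and standard deviations of the historical data rather than its own, so that distances are measured in the same scaled space as the analogies.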

Research Area(s)

  • Analogy-based effort estimation, Data normalization, Data-preprocessing, Feature selection, Missing data imputation

Citation Format(s)

An empirical analysis of three-stage data-preprocessing for analogy-based software effort estimation on the ISBSG data. / Huang, Jianglin; Li, Yan-Fu; Keung, Jacky Wai; Yu, Y. T.; Chan, W. K.

Proceedings - 2017 IEEE International Conference on Software Quality, Reliability and Security, QRS 2017. Institute of Electrical and Electronics Engineers Inc., 2017. p. 442-449 8009948.
