
An empirical analysis of three-stage data-preprocessing for analogy-based software effort estimation on the ISBSG data

Research output: Chapters, Conference Papers, Creative and Literary Works; RGC 32 - Refereed conference paper (with host publication); peer-review

Abstract

Analogy-based software effort estimation is a method to estimate the cost of an unseen project based on analogies with previous projects that share selected features. The validity of the selected features depends on many factors, and one of the most crucial is the effectiveness of the data-preprocessing techniques applied to the datasets of the previous projects. In this paper, we report the first controlled experiment that studies the class of three-stage data-preprocessing techniques, with stages of missing data imputation, data normalization, and feature selection, for analogy-based effort estimation. We conducted our investigation on the ISBSG data. The experimental results show that three-stage data-preprocessing techniques have a significant impact on the resulting effort estimation accuracy. The results also indicate that the combined use of Z-Score normalization, kNN imputation, and mutual-information-based feature weighting can be an effective choice for analogy-based effort estimation.
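The three stages named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`knn_impute`, `z_score`, `mi_weights`, `estimate`) are hypothetical, the equal-width binning used to compute mutual information is an assumption, and the sample values are made-up toy numbers rather than ISBSG data.

```python
import math
from collections import Counter


def knn_impute(rows, k=2):
    """Stage 1: fill None cells with the mean of the k nearest donor rows."""
    def dist(a, b):
        # Compare only on features observed in both rows.
        shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
        if not shared:
            return math.inf
        return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))

    filled = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is None:
                # Assumes at least one other row has feature j observed.
                donors = [r for idx, r in enumerate(rows) if idx != i and r[j] is not None]
                donors.sort(key=lambda r: dist(row, r))
                nearest = donors[:k]
                filled[i][j] = sum(r[j] for r in nearest) / len(nearest)
    return filled


def z_score(rows):
    """Stage 2: scale each feature to zero mean and unit standard deviation."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    # A constant feature has zero deviation; fall back to 1.0 to avoid dividing by zero.
    stds = [math.sqrt(sum((x - m) ** 2 for x in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    scaled = [[(x - m) / s for x, m, s in zip(r, means, stds)] for r in rows]
    return scaled, means, stds


def binned(values, bins=3):
    """Equal-width discretization (an assumption made for this sketch)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0
    return [min(int((v - lo) / width), bins - 1) for v in values]


def mutual_information(xs, ys):
    """Mutual information (in nats) between two discrete sequences."""
    n = len(xs)
    pxy, px, py = Counter(zip(xs, ys)), Counter(xs), Counter(ys)
    return sum(c / n * math.log(c * n / (px[x] * py[y])) for (x, y), c in pxy.items())


def mi_weights(rows, efforts, bins=3):
    """Stage 3: weight each feature by its mutual information with effort."""
    target = binned(efforts, bins)
    return [mutual_information(binned(list(col), bins), target) for col in zip(*rows)]


def estimate(query, rows, efforts, weights, k=2):
    """Analogy step: mean effort of the k nearest projects under a weighted distance."""
    def wdist(a, b):
        return math.sqrt(sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b)))
    nearest = sorted(range(len(rows)), key=lambda i: wdist(query, rows[i]))[:k]
    return sum(efforts[i] for i in nearest) / len(nearest)


if __name__ == "__main__":
    # Toy data: two hypothetical features per project, None marks a missing value.
    projects = [[120.0, 5.0], [300.0, None], [450.0, 12.0], [90.0, 4.0], [None, 9.0]]
    efforts = [800.0, 2100.0, 3300.0, 600.0, 2500.0]

    scaled, means, stds = z_score(knn_impute(projects, k=2))
    weights = mi_weights(scaled, efforts)
    # The unseen project must be scaled with the same means and deviations.
    query = [(v - m) / s for v, m, s in zip([200.0, 7.0], means, stds)]
    print(estimate(query, scaled, efforts, weights, k=2))
```

The sketch runs the stages in the order the abstract lists them: imputation, then normalization, then feature weighting. The paper compares combinations of techniques at each stage, so any single fixed pipeline such as this one is only one point in that space.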
Original language: English
Title of host publication: Proceedings - 2017 IEEE International Conference on Software Quality, Reliability and Security, QRS 2017
Publisher: IEEE
Pages: 442-449
ISBN (Print): 9781538605929
Publication status: Published - 28 Jul 2017
Event: The 2017 IEEE International Conference on Software Quality, Reliability, and Security - Faculty of Information Technology, Czech Technical University, Prague, Czech Republic
Duration: 25 Jul 2017 to 29 Jul 2017
Conference number: 17th
http://paris.utdallas.edu/qrs17
http://paris.utdallas.edu/qrs17/program/QRS-2017-Program.pdf

Conference

Conference: The 2017 IEEE International Conference on Software Quality, Reliability, and Security
Abbreviated title: QRS 2017
Place: Czech Republic
City: Prague
Period: 25/07/17 to 29/07/17

Research Keywords

  • Analogy-based effort estimation
  • Data normalization
  • Data-preprocessing
  • Feature selection
  • Missing data imputation
