Abstract
Analogy-based software effort estimation predicts the cost of an unseen project by analogy with previous projects that share selected features. The validity of the selected features depends on many factors, and one of the most crucial is the effectiveness of the data-preprocessing techniques applied to the datasets of previous projects. In this paper, we report the first controlled experiment that studies the class of three-stage data-preprocessing techniques, with stages of missing data imputation, data normalization, and feature selection, for analogy-based effort estimation. We conducted our investigation on the ISBSG data. The experimental results show that three-stage data-preprocessing techniques have a significant impact on the resulting effort estimation accuracy. The results also indicate that the combined use of Z-Score normalization, kNN imputation, and mutual-information-based feature weighting can be an effective choice for analogy-based effort estimation.
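As a rough illustration of the three stages named in the abstract, the sketch below chains kNN imputation, Z-Score normalization, and mutual-information feature weighting, then estimates effort from the nearest analogues. It is a minimal sketch under stated assumptions, not the paper's implementation: the function names, the toy project data, the equal-width binning used to estimate mutual information, and the choice of k are all hypothetical, and the stage order simply follows the order in which the abstract lists them.

```python
import math
from collections import Counter


def knn_impute(rows, k=2):
    """Fill each None with the mean of that feature over the k nearest donor rows."""

    def dist(a, b):
        # Root-mean-square difference over the features both rows have observed.
        shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
        if not shared:
            return math.inf
        return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))

    filled = []
    for i, row in enumerate(rows):
        new_row = list(row)
        for j, value in enumerate(row):
            if value is None:
                # Assumes at least one other row has feature j observed.
                donors = sorted(
                    (dist(row, other), other[j])
                    for m, other in enumerate(rows)
                    if m != i and other[j] is not None
                )[:k]
                new_row[j] = sum(v for _, v in donors) / len(donors)
        filled.append(new_row)
    return filled


def zscore(rows):
    """Column-wise Z-Score normalization (population standard deviation)."""
    columns = []
    for col in zip(*rows):
        mean = sum(col) / len(col)
        std = math.sqrt(sum((v - mean) ** 2 for v in col) / len(col)) or 1.0
        columns.append([(v - mean) / std for v in col])
    return [list(r) for r in zip(*columns)]


def mutual_information(xs, ys, bins=3):
    """Plug-in mutual information estimate after equal-width binning (an assumption)."""

    def binned(vals):
        lo, hi = min(vals), max(vals)
        width = (hi - lo) / bins or 1.0
        return [min(int((v - lo) / width), bins - 1) for v in vals]

    bx, by = binned(xs), binned(ys)
    n = len(bx)
    pxy, px, py = Counter(zip(bx, by)), Counter(bx), Counter(by)
    return sum(
        (c / n) * math.log(c * n / (px[a] * py[b])) for (a, b), c in pxy.items()
    )


def estimate_by_analogy(features, efforts, query, weights, k=2):
    """Mean effort of the k analogues closest in weighted Euclidean distance."""
    ranked = sorted(
        (math.sqrt(sum(w * (a - b) ** 2 for w, a, b in zip(weights, f, query))), e)
        for f, e in zip(features, efforts)
    )[:k]
    return sum(e for _, e in ranked) / len(ranked)


# Toy data invented for illustration (NOT taken from ISBSG): [size, team size].
projects = [
    [100.0, 5.0],
    [120.0, None],
    [300.0, 12.0],
    [None, 11.0],
    [110.0, 6.0],
]
efforts = [400.0, 450.0, 1500.0, 1400.0, 420.0]

filled = knn_impute(projects, k=2)        # stage 1: missing data imputation
normalized = zscore(filled)               # stage 2: data normalization
weights = [mutual_information(col, efforts) for col in zip(*normalized)]  # stage 3

# Leave-one-out: estimate project 0 from the remaining four projects.
guess = estimate_by_analogy(normalized[1:], efforts[1:], normalized[0], weights, k=2)
print(filled, weights, guess)
```

Swapping any single stage (for example, a different imputation rule) while holding the other two fixed is one way to compare preprocessing combinations on a dataset of interest.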
| Original language | English |
|---|---|
| Title of host publication | Proceedings - 2017 IEEE International Conference on Software Quality, Reliability and Security, QRS 2017 |
| Publisher | IEEE |
| Pages | 442-449 |
| ISBN (Print) | 9781538605929 |
| Publication status | Published - 28 Jul 2017 |
| Event | The 2017 IEEE International Conference on Software Quality, Reliability, and Security - Faculty of Information Technology, Czech Technical University, Prague, Czech Republic; Duration: 25 Jul 2017 → 29 Jul 2017; Conference number: 17th; http://paris.utdallas.edu/qrs17/ ; http://paris.utdallas.edu/qrs17/program/QRS-2017-Program.pdf |
Conference
| Conference | The 2017 IEEE International Conference on Software Quality, Reliability, and Security |
|---|---|
| Abbreviated title | QRS 2017 |
| Place | Czech Republic |
| City | Prague |
| Period | 25/07/17 → 29/07/17 |
Research Keywords
- Analogy-based effort estimation
- Data normalization
- Data-preprocessing
- Feature selection
- Missing data imputation
Title
An empirical analysis of three-stage data-preprocessing for analogy-based software effort estimation on the ISBSG data