Filter-INC : Handling effort-inconsistency in software effort estimation datasets

Research output: Chapters, Conference Papers, Creative and Literary Works (RGC: 12, 32, 41, 45) > 32_Refereed conference paper (with ISBN/ISSN)

Author(s)

Phannachitta, Passakorn; Keung, Jacky; Bennin, Kwabena Ebo; Monden, Akito; Matsumoto, Kenichi

Detail(s)

Original language: English
Title of host publication: Proceedings - Asia-Pacific Software Engineering Conference, APSEC
Publisher: IEEE Computer Society
Pages: 185-192
ISBN (Print): 9781509055753
Publication status: Published - 30 Mar 2017

Publication series

Name
ISSN (Print): 1530-1362

Conference

Title: 23rd Asia-Pacific Software Engineering Conference, APSEC 2016
Place: New Zealand
City: Hamilton
Period: 6-9 December 2016

Abstract

Effort-inconsistency is a situation in which the historical software project data used for software effort estimation (SEE) are contaminated by many project cases that have similar characteristics but were completed with significantly different amounts of effort. Using such data for SEE generally produces inaccurate results; however, an effective technique for handling the problem has yet to be made available. This study approaches the problem differently from common solutions, which typically attempt to remove every project case they detect as an outlier. Instead, we hypothesize that data inconsistency is caused by only a few deviant project cases, and that any attempt to remove cases other than these will reduce accuracy, largely because of the loss of useful information and data diversity. Filter-INC (short for Filtering technique for handling effort-INConsistency in SEE datasets) implements this hypothesis to decide whether a project case detected by an existing technique should be subject to removal. The evaluation compares the performance of 2 filtering techniques before and after Filter-INC is applied. The results, produced from 8 real-world datasets together with 3 machine-learning models and evaluated by 4 performance measures, show a significant accuracy improvement at the 95% confidence level. Based on the results, we recommend the proposed hypothesis as an important instrument for designing data preprocessing techniques that handle effort-inconsistency in SEE datasets, an important step forward in preprocessing data for more accurate SEE models.
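
The abstract does not spell out the Filter-INC algorithm, so the following Python sketch only illustrates the stated hypothesis and should not be read as the authors' method. It assumes a hypothetical stand-in filter that flags a project when its effort differs strongly from the median effort of its nearest neighbours; the function names, the threshold of 0.5, the neighbourhood size `k=2`, and the toy data are all assumptions made for illustration. On this data, a single deviant project causes the stand-in filter to flag its consistent neighbours as well, and a second step then limits removal to the most deviant case.

```python
import math
from statistics import median


def knn_effort_deviation(features, efforts, k=2):
    """Relative gap between each project's effort and the median effort
    of its k nearest neighbours in feature space (hypothetical score)."""
    scores = []
    for i, fi in enumerate(features):
        # Distances to every other project, nearest first.
        others = sorted(
            (math.dist(fi, fj), j) for j, fj in enumerate(features) if j != i
        )
        neighbour_effort = median(efforts[j] for _, j in others[:k])
        scores.append(abs(efforts[i] - neighbour_effort) / neighbour_effort)
    return scores


def select_removals(flagged, scores, max_removals=1):
    """Of the cases a filter flagged, remove only the most deviant few
    (an assumed reading of the hypothesis, not the published procedure)."""
    ranked = sorted(flagged, key=lambda i: scores[i], reverse=True)
    return set(ranked[:max_removals])


# Hypothetical toy data: (size, team size) -> effort in person-hours.
# Project 3 resembles projects 0-2 but took far more effort.
features = [(10, 3), (11, 3), (10, 4), (11, 4), (50, 8), (52, 8)]
efforts = [100, 110, 105, 900, 1000, 1050]

scores = knn_effort_deviation(features, efforts, k=2)

# Stand-in for an existing outlier filter: flag large deviations.
flagged = {i for i, s in enumerate(scores) if s > 0.5}

# Keep the flagged-but-consistent cases; drop only the most deviant one.
removed = select_removals(flagged, scores, max_removals=1)
kept = [i for i in range(len(features)) if i not in removed]

print("flagged by stand-in filter:", sorted(flagged))
print("actually removed:", sorted(removed))
print("kept:", kept)
```

In this toy case the stand-in filter flags projects 1, 2, and 3, because the one deviant project distorts the neighbourhood medians of projects 1 and 2. Removing all three would discard two consistent cases, which is the loss of information the hypothesis warns about.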

Research Area(s)

  • Data preprocessing, Effort-inconsistency, Empirical software engineering, Software effort estimation

Citation Format(s)

Filter-INC : Handling effort-inconsistency in software effort estimation datasets. / Phannachitta, Passakorn; Keung, Jacky; Bennin, Kwabena Ebo; Monden, Akito; Matsumoto, Kenichi.

Proceedings - Asia-Pacific Software Engineering Conference, APSEC. IEEE Computer Society, 2017. p. 185-192 7890587.
