Utilizing cluster quality in hierarchical clustering for analogy-based software effort estimation

Research output: Chapters, Conference Papers, Creative and Literary Works (RGC: 12, 32, 41, 45) › 32_Refereed conference paper (with ISBN/ISSN)

1 Scopus Citations

Author(s)

  • Wu, Jack H. C.
  • Keung, Jacky W.

Detail(s)

Original language: English
Title of host publication: Proceedings of 2017 IEEE 8th International Conference on Software Engineering and Service Science
Editors: Wenzheng Li, Prasad Babu M. Surendra, Xiaohui Lei
Publisher: IEEE
Pages: 1-4
ISBN (Electronic): 978-1-5386-0496-0, 978-1-5386-0497-7
ISBN (Print): 978-1-5386-4570-3
Publication status: Published - Nov 2017

Conference

Title: 2017 8th IEEE International Conference on Software Engineering and Service Science (ICSESS 2017)
Place: China
City: Beijing
Period: 24 - 26 November 2017

Abstract

Analogy-based software effort estimation is one of the most popular estimation methods. It is built upon the principle of case-based reasoning (CBR), deriving an estimate from the k most similar projects completed in the past. The choice of k is therefore crucial to prediction performance. Many studies have used a single, fixed k value in their experiments, although it is known that allocating k dynamically within an experiment yields optimized performance. This paper proposes a technique based on hierarchical clustering that produces a range for k through various cluster quality criteria. We find that complete linkage clustering is more suitable for large datasets, while single linkage clustering is suitable for small datasets. The method searches for optimized k values using the proposed heuristic optimization technique, which has the advantages of being easy to compute and of being optimized for the dataset under investigation. Datasets from the PROMISE repository were used to evaluate the proposed technique. The experimental results show that the proposed method is able to determine an optimized set of k values for analogy-based prediction and to give estimates that outperform traditional models based on a fixed k value. The implication is significant: the analogy-based model is optimized according to the dataset being used, without the need to ask an expert to determine a single, fixed k value.
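
To make the role of k concrete, the following is a minimal sketch of a generic analogy-based estimator, not the authors' implementation. It assumes numeric project features already scaled to comparable ranges, Euclidean distance as the similarity measure, and the mean effort of the k nearest past projects as the estimate. The helper `best_k` is a hypothetical stand-in that picks k from a candidate range by leave-one-out error; the paper instead derives the range for k from hierarchical clustering and cluster quality criteria, which this sketch does not reproduce. All names and the sample data are illustrative.

```python
import math


def estimate_effort(history, target, k):
    """Return the mean effort of the k past projects closest to `target`.

    `history` is a list of (features, effort) pairs. Features are assumed
    to be numeric tuples on comparable scales (an assumption of this sketch).
    """
    ranked = sorted(history, key=lambda case: math.dist(case[0], target))
    nearest = ranked[:k]
    return sum(effort for _, effort in nearest) / len(nearest)


def best_k(history, candidate_ks):
    """Hypothetical helper: choose the k with the lowest leave-one-out
    mean absolute error over the historical projects."""
    def loo_error(k):
        errors = []
        for i, (features, effort) in enumerate(history):
            rest = history[:i] + history[i + 1:]  # hold out project i
            errors.append(abs(estimate_effort(rest, features, k) - effort))
        return sum(errors) / len(errors)

    return min(candidate_ks, key=loo_error)


# Made-up data: (scaled size, scaled complexity) -> effort in person-months.
projects = [
    ((0.10, 0.20), 10.0),
    ((0.15, 0.25), 12.0),
    ((0.80, 0.70), 50.0),
    ((0.85, 0.75), 55.0),
    ((0.50, 0.50), 30.0),
]

if __name__ == "__main__":
    print(estimate_effort(projects, (0.12, 0.22), k=2))
    print(best_k(projects, candidate_ks=[1, 2, 3]))
```

Changing k changes which past projects contribute to the estimate, which is why a dataset-specific choice of k can matter more than a single value fixed in advance.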

Research Area(s)

  • Analogy, Clustering, K-NN, Software Effort Estimation, Software Metrics and Measurements

Citation Format(s)

Utilizing cluster quality in hierarchical clustering for analogy-based software effort estimation. / Wu, Jack H. C.; Keung, Jacky W.

Proceedings of 2017 IEEE 8th International Conference on Software Engineering and Service Science. ed. / Wenzheng Li; Prasad Babu M. Surendra; Xiaohui Lei. IEEE, 2017. p. 1-4.
