Exploiting Diversity in Ensembles for Software Effort Estimation
Description
Software Effort Estimation (SEE) is the process of predicting the most realistic amount of effort required to develop a target software application from the limited information available in the requirements and from many other uncertain project inputs and constraints. Estimating the required effort correctly at an early stage helps project managers budget and plan for the entire project, and is vital to the project's successful outcome.

Research on ensemble learning for SEE is still in its infancy. The composition of existing SEE ensembles is based mainly on the best-performing solo models, which are extracted according to the variation in their rankings, i.e. their stability. An expert first extracts the segment of top-ranked models, and models from that segment are then used to build ensembles with simple combination schemes. This can be viewed as a bootstrap aggregation approach in ensemble learning, since all of the selected models are weighted equally and only the top learners are used. As a result, however, weak models cannot contribute to the ensemble at all.

The novelty of this project lies in introducing more diversified SEE ensembles for different datasets and in providing a more comprehensive SEE ensemble estimation framework. Towards this goal, the project attempts to address two main challenges. The first is Method Extraction and Classification: identifying, from a large pool of options, the solo methods that offer higher ranking stability and are therefore superior, and also locating other segments that contain useful solo methods, so as to provide better model diversity when building ensembles. The second is Robust Combination Schemes: exploring, and providing empirical evidence on, how different robust combination schemes (such as Bagging, Boosting and Random Forests) should be adopted to build better ensembles.

Addressing these challenges is of great research value and has practical implications for software effort estimation and software engineering.
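The existing practice described above (extract a top-ranked segment, then combine its members with equal weights) can be illustrated with a minimal Python sketch. This is not the project's own method or code. The function names, the method names, the use of per-dataset ranks with their mean and standard deviation as a stability measure, and all numbers are assumptions made purely for illustration.

```python
from statistics import mean, pstdev


def rank_stability(ranks_by_method):
    """Return {method: (mean_rank, rank_spread)} from a list of ranks per method."""
    return {
        name: (mean(ranks), pstdev(ranks))
        for name, ranks in ranks_by_method.items()
    }


def select_top_segment(ranks_by_method, k):
    """Pick the k methods with the lowest mean rank; ties go to the lower spread."""
    stats = rank_stability(ranks_by_method)
    ordered = sorted(stats, key=lambda name: stats[name])
    return ordered[:k]


def combine_equal_weight(predictions_by_method, selected):
    """Average the selected methods' effort estimates, project by project."""
    columns = zip(*(predictions_by_method[name] for name in selected))
    return [mean(column) for column in columns]


# Hypothetical ranks of four solo methods on three datasets (1 = best).
ranks = {
    "knn": [1, 2, 1],
    "cart": [2, 1, 3],
    "linear": [3, 4, 2],
    "mean_baseline": [4, 3, 4],
}

# Hypothetical effort estimates (person-hours) for two new projects.
predictions = {
    "knn": [100.0, 220.0],
    "cart": [120.0, 200.0],
    "linear": [90.0, 260.0],
    "mean_baseline": [150.0, 150.0],
}

top = select_top_segment(ranks, k=2)
ensemble = combine_equal_weight(predictions, top)
print(top, ensemble)
```

In this sketch the two lower-ranked methods never influence the ensemble estimate, which is the limitation the description points out when it says weak models cannot contribute at all.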
Effective start/end date: 1/09/14 → 1/09/16