An evolutionary multi-objective optimization framework of discretization-based feature selection for classification

Research output: Journal Publications and Reviews › RGC 21 - Publication in refereed journal › peer-review

62 Scopus Citations

Detail(s)

Original language: English
Article number: 100770
Journal / Publication: Swarm and Evolutionary Computation
Volume: 60
Online published: 16 Sept 2020
Publication status: Published - Feb 2021

Abstract

Feature selection (FS) aims to identify the most relevant and non-redundant feature subset to improve classification accuracy, and is regarded as an NP-hard problem. Some heuristic methods, such as particle swarm optimization (PSO), have achieved great success; however, as the number of features increases, the solution space becomes too large, resulting in lower search efficiency. Recent discretization-based FS methods map the search from the feature domain into the cut-point domain, which shrinks the solution space and improves performance significantly. In this paper, considering the conflicts between different objectives, we propose an evolutionary multi-objective optimization framework for discretization-based FS. To obtain the Pareto solutions, a flexible cut-point PSO (FCPSO), which can select an arbitrary number of cut-points for discretization, is introduced to help better explore the relevant features. In FCPSO, a particle update and a novel adaptive mutation operator are used alternately to effectively find the relevant features and remove the redundant ones. Finally, to select the best feature subset, a Pareto ensemble method is designed to generate a number of feasible solutions based on the Pareto set, followed by a hierarchical solution selection process. We implemented the proposed framework using three representative multi-objective evolutionary algorithms and compared them with some state-of-the-art methods. Experimental results on ten benchmark microarray gene datasets demonstrate that our proposed framework significantly outperforms other methods in terms of test classification accuracy with a competitive feature subset size.
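
Two ideas mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the FCPSO algorithm or the Pareto ensemble method themselves. It only shows (1) how a set of cut-points can both discretize and select features, under the assumption that a feature with no cut-points is treated as unselected, and (2) how a non-dominated (Pareto) set can be extracted from candidate solutions, under the assumption that the two objectives are classification error and subset size, both minimized. The function names and sample values are hypothetical.

```python
from bisect import bisect_right


def discretize(rows, cut_points):
    """Discretize features by their cut-points.

    cut_points maps a feature index to a sorted list of cut-points.
    Assumption: a feature with an empty list is unselected and dropped.
    Returns the selected feature indices and the discretized rows.
    """
    selected = sorted(j for j, cuts in cut_points.items() if cuts)
    data = [[bisect_right(cut_points[j], row[j]) for j in selected]
            for row in rows]
    return selected, data


def dominates(a, b):
    """True if a is no worse than b in every objective and strictly
    better in at least one (all objectives are minimized)."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def pareto_front(points):
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical data: three samples, three continuous features.
rows = [[0.2, 5.0, 1.1],
        [0.9, 3.0, 1.0],
        [0.5, 4.0, 1.2]]
# Feature 0 has two cut-points, feature 1 none (dropped), feature 2 one.
cut_points = {0: [0.4, 0.8], 1: [], 2: [1.15]}
selected, data = discretize(rows, cut_points)

# Hypothetical candidates as (classification error, subset size).
candidates = [(0.10, 12), (0.08, 20), (0.10, 15), (0.05, 30), (0.09, 25)]
front = pareto_front(candidates)
```

Here `selected` is `[0, 2]`, and `front` keeps the three candidates that trade error against subset size without being dominated. Allowing a variable number of cut-points per feature is what lets a single encoding express both discretization and selection.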

Research Area(s)

  • Discretization, Evolutionary multi-objective algorithms, Feature selection, Pareto ensemble, Particle swarm optimization