Hybrid Incremental Ensemble Learning for Noisy Real-World Data Classification

Research output: Journal Publications and Reviews, RGC 21 - Publication in refereed journal, peer-review

42 Scopus Citations

Author(s)

  • Zhiwen Yu
  • Daxing Wang
  • Zhuoxiong Zhao
  • C. L. Philip Chen
  • Jane You
  • Jun Zhang

Detail(s)

Original language: English
Pages (from-to): 403-416
Journal / Publication: IEEE Transactions on Cybernetics
Volume: 49
Issue number: 2
Online published: 4 Dec 2017
Publication status: Published - Feb 2019

Abstract

Traditional ensemble learning approaches explore the feature space and the sample space separately, which prevents them from constructing more powerful learning models for noisy real-world dataset classification. The random subspace method searches only over the selection of features, while the bagging approach searches only over the selection of samples. To overcome these limitations, we propose the hybrid incremental ensemble learning (HIEL) approach, which considers the feature space and the sample space simultaneously to handle noisy datasets. Specifically, HIEL first adopts the bagging technique and linear discriminant analysis to remove noisy attributes, and generates a set of bootstraps and the corresponding ensemble members in the subspaces. Then, the classifiers are selected incrementally based on a classifier-specific criterion function and an ensemble criterion function, and the corresponding weights for the classifiers are assigned during the same process. Finally, the final label is determined by a weighted voting scheme and serves as the classification result. We also explore various classifier-specific criterion functions based on different newly proposed similarity measures, which alleviate the effect of noisy samples on the distance functions. In addition, the computational cost of HIEL is analyzed theoretically. A set of nonparametric tests is adopted to compare HIEL with other algorithms over several datasets. The experimental results show that HIEL performs well on noisy datasets: it outperforms most of the compared classifier ensemble methods on 14 out of 24 noisy real-world UCI and KEEL datasets.
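
The pipeline described in the abstract (hybrid sampling, incremental selection, weighted voting) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not give the HIEL criterion functions, similarity measures, or the LDA step, so the sketch assumes plain validation accuracy as a stand-in for both the classifier-specific and the ensemble criterion, and it works on precomputed prediction lists rather than trained classifiers. All function names (`hybrid_sample`, `weighted_vote`, `select_incrementally`) and the toy data are hypothetical.

```python
import random
from collections import defaultdict


def hybrid_sample(n_samples, n_features, n_keep, rng):
    """Draw a bootstrap of row indices (sample space) together with a
    random subset of feature indices (feature space)."""
    rows = [rng.randrange(n_samples) for _ in range(n_samples)]
    features = sorted(rng.sample(range(n_features), n_keep))
    return rows, features


def weighted_vote(labels, weights):
    """Return the label whose votes carry the largest total weight."""
    totals = defaultdict(float)
    for label, weight in zip(labels, weights):
        totals[label] += weight
    return max(totals, key=totals.get)


def accuracy(predicted, y_true):
    """Fraction of positions where the prediction matches the true label."""
    return sum(p == y for p, y in zip(predicted, y_true)) / len(y_true)


def ensemble_accuracy(member_preds, weights, y_true):
    """Accuracy of the weighted-vote fusion of several prediction lists."""
    fused = [weighted_vote(votes, weights) for votes in zip(*member_preds)]
    return accuracy(fused, y_true)


def select_incrementally(candidate_preds, y_true, max_members):
    """Greedy forward selection: at each step add the candidate that yields
    the highest ensemble accuracy (ASSUMED criterion, not the paper's).
    Each member's weight is its own accuracy (also an assumption)."""
    chosen, weights = [], []
    remaining = list(range(len(candidate_preds)))
    while remaining and len(chosen) < max_members:
        best_idx, best_score = None, -1.0
        for idx in remaining:
            trial = [candidate_preds[j] for j in chosen] + [candidate_preds[idx]]
            trial_w = weights + [accuracy(candidate_preds[idx], y_true)]
            score = ensemble_accuracy(trial, trial_w, y_true)
            if score > best_score:  # ties keep the earlier candidate
                best_idx, best_score = idx, score
        chosen.append(best_idx)
        weights.append(accuracy(candidate_preds[best_idx], y_true))
        remaining.remove(best_idx)
    return chosen, weights


# Toy validation labels and the predictions of four hypothetical classifiers.
y_val = [0, 1, 1, 0, 1, 0]
candidate_preds = [
    [0, 1, 1, 0, 0, 0],
    [0, 1, 0, 0, 1, 0],
    [1, 1, 1, 0, 1, 1],
    [1, 0, 0, 1, 0, 1],
]
chosen, weights = select_incrementally(candidate_preds, y_val, max_members=3)
```

In this sketch the weights are fixed at the moment a member is added, which mirrors the abstract's statement that weights are assigned during the selection process. A real pipeline would train one base classifier per `hybrid_sample` draw and evaluate on held-out data.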

Research Area(s)

  • Algorithm design and analysis, Bagging, classification, classifier ensemble, ensemble learning, Linear discriminant analysis, linear discriminant analysis (LDA), Noise measurement, Proteins, Space exploration, Training

Citation Format(s)

Hybrid Incremental Ensemble Learning for Noisy Real-World Data Classification. / Yu, Zhiwen; Wang, Daxing; Zhao, Zhuoxiong et al.
In: IEEE Transactions on Cybernetics, Vol. 49, No. 2, 02.2019, p. 403-416.
