Hashing-Based Undersampling Ensemble for Imbalanced Pattern Classification Problems

Research output: Journal Publications and Reviews (RGC: 21, 22, 62) › 21_Publication in refereed journal › peer-review

14 Scopus Citations

Author(s)

  • Wing W. Y. Ng
  • Shichao Xu
  • Jianjun Zhang
  • Xing Tian
  • Tongwen Rong

Detail(s)

Original language: English
Pages (from-to): 1269-1279
Journal / Publication: IEEE Transactions on Cybernetics
Volume: 52
Issue number: 2
Online published: 29 Jun 2020
Publication status: Published - Feb 2022

Abstract

Undersampling is a popular method for addressing imbalanced classification problems. However, it may remove too many majority samples, which can lead to the loss of informative samples. In this article, the hashing-based undersampling ensemble (HUE) is proposed to deal with this problem by constructing diversified training subspaces for undersampling. Samples in the majority class are divided into many subspaces by a hashing method. Each subspace corresponds to a training subset, which consists of most of the samples from that subspace and a few samples from surrounding subspaces. These training subsets, each combined with all minority class samples, are used to train an ensemble of classification and regression tree (CART) classifiers. The proposed method is tested on 25 UCI datasets against state-of-the-art methods. Experimental results show that HUE outperforms the other methods and yields good results on highly imbalanced datasets.
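
The subset construction described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the random-hyperplane sign hash, the Hamming-distance-1 definition of "surrounding" subspaces, and the names `sign_hash`, `build_subsets`, `own_frac`, and `near_frac` are all assumptions made for illustration. The sketch only builds the training subsets; each subset would then be passed to a tree learner of the reader's choice.

```python
import random
from collections import defaultdict


def sign_hash(x, planes):
    """Map a sample to a bit tuple via random hyperplanes.

    Assumption: the abstract does not name the hash function; this is a stand-in.
    """
    return tuple(1 if sum(w * v for w, v in zip(p, x)) >= 0 else 0 for p in planes)


def hamming(a, b):
    """Number of positions at which two hash codes differ."""
    return sum(i != j for i, j in zip(a, b))


def build_subsets(majority, minority, n_bits=2, own_frac=0.8, near_frac=0.2, seed=0):
    """Return one (code, X, y) training subset per non-empty hash bucket.

    own_frac and near_frac are hypothetical knobs: the share of a bucket that is
    kept, and the number of neighbouring samples added relative to bucket size.
    """
    rng = random.Random(seed)
    dim = len(majority[0])
    planes = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_bits)]

    # Divide the majority class into subspaces (hash buckets).
    buckets = defaultdict(list)
    for x in majority:
        buckets[sign_hash(x, planes)].append(x)

    subsets = []
    for code, own in buckets.items():
        # Most samples come from the subspace itself.
        chosen = rng.sample(own, max(1, round(own_frac * len(own))))
        # A few samples come from surrounding subspaces (assumed: Hamming distance 1).
        near = [x for c, xs in buckets.items() if hamming(c, code) == 1 for x in xs]
        chosen += rng.sample(near, min(len(near), round(near_frac * len(own))))
        # Every subset also contains all minority samples (label 1).
        X = chosen + list(minority)
        y = [0] * len(chosen) + [1] * len(minority)
        subsets.append((code, X, y))
    return subsets
```

With `n_bits` hash bits there are at most `2 ** n_bits` subsets, so under these assumptions the ensemble size is set by the hash length and not by a separate parameter.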

Research Area(s)

  • Bagging, hashing, imbalanced classification problems, undersampling

Citation Format(s)

Hashing-Based Undersampling Ensemble for Imbalanced Pattern Classification Problems. / Ng, Wing W. Y.; Xu, Shichao; Zhang, Jianjun et al.
In: IEEE Transactions on Cybernetics, Vol. 52, No. 2, 02.2022, p. 1269-1279.
