Evolving pathway activation from cancer gene expression data using nature-inspired ensemble optimization

Xubin Wang, Yunhe Wang*, Zhiqiang Ma, Ka-Chun Wong, Xiangtao Li

*Corresponding author for this work

Research output: Journal Publications and Reviews; RGC 21 - Publication in refereed journal; peer-review

1 Citation (Scopus)

Abstract

Class-imbalanced biological datasets pose significant challenges in machine learning and data analysis tasks. Prior methods for handling imbalance rely on data oversampling, which increases computational cost and the risk of overfitting. While feature selection and ensemble learning are promising techniques, their application in imbalanced contexts remains limited. To address these challenges, we present a novel framework called Hybrid Sampling Nature-Inspired Optimization Ensemble (HSNOE) to enhance the identification of hidden responders in imbalanced biological datasets. Our contributions are three-fold: 1) a hybrid undersampling and oversampling technique to mitigate class imbalance; 2) an ant colony optimization-based feature selection method that identifies informative feature subsets; 3) an ensemble classifier integrating diverse models trained on the optimized features to improve performance. Experiments on five biological datasets demonstrate that HSNOE exhibits more stable overall performance across six evaluation metrics than ten benchmark methods. We also conducted a biological analysis specifically on the Pan-cancer dataset. Moreover, the HSNOE method has been made publicly available on GitHub. © 2024 Elsevier Ltd.
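
The first contribution, hybrid undersampling and oversampling, can be illustrated with a minimal sketch. This is not the HSNOE implementation: the function name `hybrid_resample`, the choice of the midpoint between the two class counts as the shared target size, and the use of plain random sampling are all assumptions made for illustration, and the sketch handles only two classes.

```python
import random
from collections import Counter


def hybrid_resample(X, y, seed=0):
    """Balance a binary dataset by undersampling the majority class and
    oversampling the minority class to a shared target size.

    Hypothetical helper: the target (midpoint of the two class counts) and
    the plain random sampling are illustrative assumptions, not the paper's
    procedure.
    """
    rng = random.Random(seed)
    counts = Counter(y)
    if len(counts) != 2:
        raise ValueError("this sketch handles exactly two classes")

    # most_common sorts by count, so the majority class comes first.
    (maj_label, n_maj), (min_label, n_min) = counts.most_common(2)
    target = (n_maj + n_min) // 2

    maj_idx = [i for i, label in enumerate(y) if label == maj_label]
    min_idx = [i for i, label in enumerate(y) if label == min_label]

    # Undersample: keep a random subset of the majority class (no replacement).
    kept_maj = rng.sample(maj_idx, target)
    # Oversample: keep every minority sample, then draw extras with replacement.
    extra_min = rng.choices(min_idx, k=target - n_min)

    idx = kept_maj + min_idx + extra_min
    rng.shuffle(idx)
    return [X[i] for i in idx], [y[i] for i in idx]
```

For example, with 90 majority and 10 minority samples, the target is 50 per class: 40 majority samples are discarded and 40 minority samples are duplicated, so the resampled set keeps its original size of 100 while every original minority sample is retained.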
Original language: English
Article number: 123469
Journal: Expert Systems with Applications
Volume: 248
Online published: 12 Feb 2024
Publication status: Published - 15 Aug 2024

Research Keywords

  • Ant colony optimization
  • Class-imbalanced learning
  • Ensemble learning
  • Feature selection
  • Sampling
