Advancing text classification: a novel two-stage multi-objective feature selection framework

Yan Liu, Xian Cheng, Liao Shaoyi Stephen, Shansen Wei*

*Corresponding author for this work

Research output: Journal Publications and Reviews > RGC 21 - Publication in refereed journal > peer-review

Abstract

In text classification, feature selection is pivotal: it identifies relevant terms using filter indicators or accuracy measures. Many such indicators and measures exist, and because each reveals different information, they lead to disparate feature selection outcomes. This paper presents a novel two-stage multi-objective feature selection framework that incorporates multiple filter indicators and accuracy measures in both the filter and wrapper stages. Using Data Envelopment Analysis (DEA), the framework addresses the multi-objective decision-making challenge by exploring the Pareto efficient frontier. To assess the framework's efficacy, experiments were conducted on twelve datasets using six distinct classification algorithms. The results highlight the superiority of the DEA Filter-Wrapper model (DEAFW), which is built on this framework. DEAFW outperformed five single-objective filter models and a one-stage multi-objective filter model across six performance metrics in the majority of cases. For instance, with logistic regression, DEAFW achieved the highest average rank among the twelve datasets across all performance metrics. Furthermore, a comparison with four existing feature selection techniques confirmed this advantage: DEAFW attained the smallest grand average rank value across the twelve datasets for most performance metrics.

© The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2025.
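The idea of keeping only terms on the Pareto efficient frontier can be illustrated with a minimal sketch. It is not the authors' DEA model: it uses a plain pairwise dominance check instead of a DEA linear program, and a DEA formulation may classify some terms differently. The function names, the two unnamed indicators, and the toy scores are all hypothetical.

```python
from typing import Dict, List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is at least as good as b on every score and strictly better on one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_efficient_terms(scores: Dict[str, Sequence[float]]) -> List[str]:
    """Return, in sorted order, the terms not dominated by any other term."""
    return sorted(
        term
        for term, vec in scores.items()
        if not any(
            dominates(other, vec)
            for name, other in scores.items()
            if name != term
        )
    )


# Hypothetical filter scores per term: (indicator_1, indicator_2), higher is better.
toy_scores = {
    "refund": (0.90, 0.20),
    "invoice": (0.60, 0.70),
    "hello": (0.10, 0.10),
    "urgent": (0.30, 0.95),
    "please": (0.50, 0.60),
}

frontier = pareto_efficient_terms(toy_scores)
print(frontier)
```

In this toy data, "hello" and "please" are dropped because "invoice" scores higher on both indicators, while the remaining terms each win on at least one indicator and so cannot be ranked against each other without a further criterion.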
Original language: English
Article number: 107057
Journal: Information Technology and Management
Online published: 13 Apr 2025
Publication status: Online published - 13 Apr 2025

Research Keywords

  • Data envelopment analysis
  • Feature selection
  • Multi-objective decision making
  • Text classification
