Abstract
Feature selection is an important step before classification, used to reduce excessive computational costs and enhance classification performance. This paper presents a novel feature selection method based on the concept of utility, which is grounded in economics theory. In particular, we focus on a utility-based feature selection method for enhancing text classification. Unlike existing feature selection methods, the proposed method selects discriminative semantic terms according to how authors use terms to express the main ideas in textual documents, i.e., the “utility of terms,” a criterion that can be used to measure the usefulness of terms in expressing authors’ main ideas. To the best of our knowledge, our work represents successful research on leveraging economics theory to develop a semantically rich feature selection method that improves text classification. Our empirical tests on six UCI benchmark datasets confirm that the proposed method often outperforms other state-of-the-art feature selection methods in text classification. Moreover, our method provides an economics-based explanation of term weighting for information retrieval and semantic information acquisition in textual documents.
| Original language | English |
|---|---|
| Pages (from-to) | 197–226 |
| Journal | Knowledge and Information Systems |
| Volume | 61 |
| Issue number | 1 |
| Online published | 8 Dec 2018 |
| DOIs | |
| Publication status | Published - Oct 2019 |
Research Keywords
- Economics theory
- Feature selection
- Text classification
- Text mining
- Utility theory
RGC Funding Information
- RGC-funded
Fingerprint
Dive into the research topics of 'Utility-based feature selection for text classification'. Together they form a unique fingerprint.
Projects
- 2 Finished
- GRF: BigCredit: A Novel Framework for Big Social Media Data Enhanced Online Credit Scoring
  LAU, Y. K. R. (Principal Investigator / Project Coordinator), Li, C. (Co-Investigator) & WONG, C. S. M. (Co-Investigator)
  1/01/17 → 3/06/21
  Project: Research
- GRF: Big Data Analytics for Detecting Deceptive Product Comments in Online Social Media
  LAU, Y. K. R. (Principal Investigator / Project Coordinator) & Li, C. (Co-Investigator)
  1/01/16 → 24/06/20
  Project: Research