Topic-Based Instance and Feature Selection in Multilabel Classification

Research output: Journal Publications and Reviews; RGC 21 - Publication in refereed journal; peer-review

18 Scopus Citations

Detail(s)

Original language: English
Pages (from-to): 315-329
Number of pages: 15
Journal / Publication: IEEE Transactions on Neural Networks and Learning Systems
Volume: 33
Issue number: 1
Online published: 27 Oct 2020
Publication status: Published - Jan 2022

Abstract

Multilabel learning has been extensively studied in recent years because of its many applications across domains. It aims to annotate unseen data with labels based on training data, which are often high dimensional at both the instance and feature levels and often contain noisy and redundant information at both levels. As an effective data preprocessing step, instance selection and feature selection should both be performed: the former finds the relevant training instances for each testing instance, and the latter finds the relevant features for each label. However, most existing methods overlook the input-output correlation in each kind of selection, which leads to performance degradation. This article presents a formulation for multilabel learning from a topic view that exploits the dependence between features and labels in a topic space. Because the relationship between the input and output spaces is well captured in this latent topic space, instance and feature selection can be performed effectively there. The results of extensive experiments on various benchmarks demonstrate the effectiveness of the proposed framework.
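
The abstract does not give the formulation itself, so the following is only a minimal illustrative sketch of the general idea of selecting instances and features in a shared latent space. It is not the article's method. It assumes a truncated SVD of the stacked feature and label matrices as a stand-in "topic space", and the function names (`fit_topic_space`, `select_instances`, `select_features`), the synthetic data, and all parameter values are hypothetical.

```python
import numpy as np


def fit_topic_space(X, Y, n_topics):
    """Stand-in topic model (assumption): truncated SVD of the stacked [X | Y] matrix."""
    M = np.hstack([X, Y]).astype(float)
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    Z = U[:, :n_topics] * s[:n_topics]   # training instances in the latent space, (n, k)
    V = Vt[:n_topics].T                  # loadings for features and labels, (d + q, k)
    d = X.shape[1]
    return Z, V[:d], V[d:]               # instance embeddings, feature loadings, label loadings


def select_instances(x_new, Z, V_feat, m):
    """Return indices of the m training instances closest to x_new in the latent space."""
    # Labels of x_new are unknown, so project using the feature loadings only:
    # least-squares solution of V_feat @ z ~= x_new.
    z, *_ = np.linalg.lstsq(V_feat, x_new, rcond=None)
    sims = (Z @ z) / (np.linalg.norm(Z, axis=1) * np.linalg.norm(z) + 1e-12)
    return np.argsort(-sims)[:m]         # highest cosine similarity first


def select_features(V_feat, V_label, r):
    """Return, for each label, the indices of the r features most associated with it."""
    scores = np.abs(V_feat @ V_label.T)  # (d, q): feature-label association via shared topics
    return np.argsort(-scores, axis=0)[:r].T   # (q, r)


# Synthetic data, for illustration only.
rng = np.random.default_rng(0)
X = rng.random((40, 12))                          # 40 instances, 12 features
Y = (rng.random((40, 4)) > 0.6).astype(int)       # 4 binary labels

Z, V_feat, V_label = fit_topic_space(X, Y, n_topics=3)
neighbors = select_instances(X[0], Z, V_feat, m=5)
features_per_label = select_features(V_feat, V_label, r=4)
```

In this sketch both selections depend on the features and the labels together, because the latent space is fitted on the stacked matrix. That mirrors, in a very simplified form, the input-output correlation the abstract says most existing methods overlook.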

Research Area(s)

  • Input–output correlation, instance and feature selection, multilabel learning, topic