Noisy Label Removal for Partial Multi-Label Learning

Research output: Chapters, Conference Papers, Creative and Literary Works / RGC 32 - Refereed conference paper (with host publication) / peer-review


Author(s)

  • Fuchao Yang
  • Yuheng Jia
  • Hui Liu
  • Yongqiang Dong
  • Junhui Hou

Detail(s)

Original language: English
Title of host publication: KDD '24
Subtitle of host publication: Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining
Publisher: Association for Computing Machinery
Pages: 3724-3735
ISBN (print): 979-8-4007-0490-1
Publication status: Published - Aug 2024

Publication series

Name: Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
ISSN (Print): 2154-817X

Conference

Title: 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD 2024)
Location: Centre de Convencions Internacional de Barcelona
Place: Spain
City: Barcelona
Period: 25 - 29 August 2024

Abstract

This paper addresses the problem of partial multi-label learning (PML), a challenging weakly supervised learning framework in which each sample is associated with a candidate label set comprising both ground-truth labels and noisy labels. We theoretically reveal that an increased number of noisy labels in the candidate label set leads to an enlarged generalization error bound, consequently degrading the classification performance. Accordingly, the key to solving PML lies in accurately removing the noisy labels within the candidate label set. To achieve this objective, we leverage prior knowledge about the noisy labels in PML: they exist only within the candidate label set and take binary values. Specifically, we propose a constrained regression model to learn a PML classifier and select the noisy labels, where the constraints strictly enforce the location and value of the noisy labels. Meanwhile, the supervision information provided by the candidate label set is unreliable due to the presence of noisy labels, whereas the non-candidate labels of a sample precisely indicate the classes to which the sample does not belong. To aid in the selection of noisy labels, we therefore construct a competitive classifier based on the non-candidate labels. The PML classifier and the competitive classifier form a competitive relationship, encouraging mutual learning. We formulate the proposed model as a discrete optimization problem to effectively remove the noisy labels, and we solve it with an alternating algorithm. Extensive experiments conducted on 6 real-world partial multi-label data sets and 7 synthetic data sets, employing various evaluation metrics, demonstrate that our method significantly outperforms state-of-the-art PML methods. The code implementation is publicly available at https://github.com/Yangfc-ML/NLR. © 2024 Copyright held by the owner/author(s).
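To make the PML setting described above concrete, the toy sketch below builds a synthetic candidate label matrix and runs a simplified alternating scheme: fit a classifier to the current denoised labels, then re-select binary noise indicators restricted to the candidate set. This is only a minimal illustration under assumed choices (ridge regression, a fixed score threshold, the variable names used here); it is not the NLR model or the constrained discrete optimization proposed in the paper.

```python
# Illustrative sketch only -- NOT the NLR method from the paper.
# It mimics the PML setup from the abstract: noisy labels are searched
# only inside the candidate label set and are treated as binary variables.
# The ridge-regression classifier and the 0.25 threshold are assumptions.
import numpy as np

rng = np.random.default_rng(0)
n, d, q = 200, 20, 5                                # samples, features, classes

# Synthetic PML data: ground-truth labels plus extra noisy candidates.
X = rng.normal(size=(n, d))
W_true = rng.normal(size=(d, q))
Y_true = (X @ W_true > 0.5).astype(float)           # ground-truth label matrix
noise = (rng.random((n, q)) < 0.3) & (Y_true == 0)  # noisy additions
Y_cand = np.clip(Y_true + noise, 0, 1)              # candidate label matrix

# Alternating scheme: (1) fit a ridge-regression classifier to the current
# denoised labels; (2) flag low-scoring candidate labels as noisy, keeping
# the noise indicator binary and confined to the candidate set.
Y_clean = Y_cand.copy()
lam = 1.0
for _ in range(10):
    W = np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ Y_clean)
    scores = X @ W
    is_noisy = (Y_cand == 1) & (scores < 0.25)      # binary, candidate-only
    Y_clean = Y_cand * (1 - is_noisy.astype(float))

print(f"label agreement after removal: {(Y_clean == Y_true).mean():.3f}")
```

The paper additionally trains a competitive classifier on the non-candidate labels and couples the two classifiers in a discrete optimization; that part is omitted here for brevity.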

Research Area(s)

  • multi-label learning, partial label learning, partial multi-label learning

Citation Format(s)

Noisy Label Removal for Partial Multi-Label Learning. / Yang, Fuchao; Jia, Yuheng; Liu, Hui et al.
KDD '24: Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining. Association for Computing Machinery, 2024. p. 3724-3735 (Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining).
