
Noisy Label Removal for Partial Multi-Label Learning

Fuchao Yang, Yuheng Jia*, Hui Liu, Yongqiang Dong, Junhui Hou

*Corresponding author for this work

Research output: Chapters, Conference Papers, Creative and Literary Works > RGC 32 - Refereed conference paper (with host publication) > peer-review

Abstract

This paper addresses the problem of partial multi-label learning (PML), a challenging weakly supervised learning framework in which each sample is associated with a candidate label set comprising both ground-truth labels and noisy labels. We theoretically reveal that an increased number of noisy labels in the candidate label set leads to an enlarged generalization error bound, consequently degrading the classification performance. Accordingly, the key to solving PML lies in accurately removing the noisy labels within the candidate label set. To achieve this objective, we leverage prior knowledge about the noisy labels in PML: they exist only within the candidate label set and take binary values. Specifically, we propose a constrained regression model to learn a PML classifier and identify the noisy labels. The constraints of the model strictly enforce the location and value of the noisy labels. At the same time, the supervision information provided by the candidate label set is unreliable due to the presence of noisy labels. In contrast, the non-candidate labels of a sample precisely indicate the classes to which the sample does not belong. To aid in the identification of noisy labels, we construct a competitive classifier based on the non-candidate labels. The PML classifier and the competitive classifier form a competitive relationship, encouraging mutual learning. We formulate the proposed model as a discrete optimization problem to effectively remove the noisy labels, and we solve it using an alternating optimization algorithm. Extensive experiments conducted on 6 real-world partial multi-label data sets and 7 synthetic data sets, employing various evaluation metrics, demonstrate that our method significantly outperforms state-of-the-art PML methods. The code implementation is publicly available at https://github.com/Yangfc-ML/NLR. © 2024 Copyright held by the owner/author(s).
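The constraint described in the abstract (noise entries are binary and can appear only inside the candidate label set) can be illustrated with a small sketch. This is not the authors' model, which is available at the repository linked above: it swaps in a plain ridge regression for the PML classifier, omits the competitive classifier entirely, and every name in it (`fit_ridge`, `update_noise`, `alternating_sketch`, `lam`, `n_iter`, the synthetic data) is a hypothetical choice made for illustration only.

```python
import numpy as np

def fit_ridge(X, T, lam=1.0):
    """Closed-form ridge regression: W = (X^T X + lam*I)^-1 X^T T."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ T)

def update_noise(X, Y, W):
    """Choose a binary noise matrix N, restricted to candidate labels.

    For a candidate entry (Y_ij = 1) with prediction p = (XW)_ij,
    N_ij = 1 gives squared residual (0 - p)^2 and N_ij = 0 gives (1 - p)^2,
    so N_ij = 1 is the better choice exactly when p < 0.5.
    Non-candidate entries (Y_ij = 0) are never marked as noise.
    """
    P = X @ W
    return ((P < 0.5) & (Y == 1)).astype(int)

def alternating_sketch(X, Y, lam=1.0, n_iter=10):
    """Alternate between fitting W on the cleaned labels Y - N and updating N."""
    N = np.zeros_like(Y)
    for _ in range(n_iter):
        W = fit_ridge(X, Y - N, lam)
        N = update_noise(X, Y, W)
    return W, N

# Hypothetical synthetic data: true labels plus extra false-positive candidates.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 5))
Y_true = (X @ rng.normal(size=(5, 4)) > 0).astype(int)
flip = (rng.random(Y_true.shape) < 0.2) & (Y_true == 0)
Y = np.where(flip, 1, Y_true)          # candidate label matrix

W, N = alternating_sketch(X, Y)
Y_clean = Y - N                        # candidate labels with estimated noise removed
```

In this toy version nothing prevents a sample from losing all of its candidate labels, and there is no signal from the non-candidate labels beyond the regression targets themselves; the abstract indicates that the actual method addresses label selection with additional constraints and a competitive classifier.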
Original language: English
Title of host publication: KDD '24
Subtitle of host publication: Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining
Publisher: Association for Computing Machinery
Pages: 3724-3735
ISBN (Print): 979-8-4007-0490-1
Publication status: Published - Aug 2024
Event: 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD 2024) - Centre de Convencions Internacional de Barcelona, Barcelona, Spain
Duration: 25 Aug 2024 to 29 Aug 2024
https://kdd2024.kdd.org/
https://dl.acm.org/conference/kdd/proceedings

Publication series

Name: Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
ISSN (Print): 2154-817X

Conference

Conference: 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD 2024)
Abbreviated title: ACM KDD 2024
Place: Spain
City: Barcelona
Period: 25/08/24 to 29/08/24

Funding

This work was supported in part by the National Natural Science Foundation of China under Grant 62106044, in part by the Natural Science Foundation of Jiangsu Province under Grant BK20210221, and in part by the Hong Kong UGC under grant UGC/FDS11/E02/22.

Research Keywords

  • multi-label learning
  • partial label learning
  • partial multi-label learning

RGC Funding Information

  • RGC-funded
