Noisy Label Removal for Partial Multi-Label Learning
Research output: Chapters, Conference Papers, Creative and Literary Works › RGC 32 - Refereed conference paper (with host publication) › peer-review
Author(s)
Fuchao Yang, Yuheng Jia, Hui Liu et al.
Related Research Unit(s)
Detail(s)
Original language | English |
---|---|
Title of host publication | KDD '24 |
Subtitle of host publication | Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining |
Publisher | Association for Computing Machinery |
Pages | 3724-3735 |
ISBN (print) | 979-8-4007-0490-1 |
Publication status | Published - Aug 2024 |
Publication series
Name | Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining |
---|---|
ISSN (Print) | 2154-817X |
Conference
Title | 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD 2024) |
---|---|
Location | Centre de Convencions Internacional de Barcelona |
Place | Spain |
City | Barcelona |
Period | 25 - 29 August 2024 |
Link(s)
Abstract
This paper addresses the problem of partial multi-label learning (PML), a challenging weakly supervised learning framework in which each sample is associated with a candidate label set comprising both ground-truth labels and noisy labels. We theoretically reveal that an increased number of noisy labels in the candidate label set enlarges the generalization error bound and consequently degrades classification performance. Accordingly, the key to solving PML lies in accurately removing the noisy labels within the candidate label set. To this end, we leverage prior knowledge about the noisy labels in PML: they exist only within the candidate label set and take binary values. Specifically, we propose a constrained regression model that jointly learns a PML classifier and selects the noisy labels, with constraints that strictly enforce the location and value of the noisy labels. Meanwhile, the supervision provided by the candidate label set is unreliable due to the presence of noisy labels, whereas the non-candidate labels of a sample precisely indicate the classes to which the sample does not belong. To aid the selection of noisy labels, we therefore construct a competitive classifier based on the non-candidate labels. The PML classifier and the competitive classifier form a competitive relationship that encourages mutual learning. We formulate the proposed model as a discrete optimization problem to effectively remove the noisy labels and solve it with an alternating algorithm. Extensive experiments on 6 real-world partial multi-label data sets and 7 synthetic data sets, using various evaluation metrics, demonstrate that our method significantly outperforms state-of-the-art PML methods. The code implementation is publicly available at https://github.com/Yangfc-ML/NLR. © 2024 Copyright held by the owner/author(s).
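For readers who want a concrete picture of the alternating scheme sketched in the abstract, the toy example below is a minimal, hypothetical illustration rather than the authors' NLR implementation (which is available at the GitHub link above). It alternates between fitting a simple classifier to the current cleaned label matrix and flagging, within each sample's candidate set only, the lowest-scoring labels as noisy. The ridge-regression classifier, the fixed per-sample noise budget, and the omission of the competitive (non-candidate-label) classifier are all simplifying assumptions for illustration.

```python
# Minimal sketch of the alternating idea described in the abstract (NOT the authors' NLR model):
# (1) fit a linear classifier to the current "cleaned" candidate labels, then
# (2) inside each sample's candidate set only, mark the lowest-scoring labels as noisy
#     (binary removal); non-candidate labels are never modified.
import numpy as np

def remove_noisy_labels(X, Y_candidate, noise_budget=1, ridge=1.0, n_iters=10):
    """X: (n, d) features; Y_candidate: (n, q) binary candidate label matrix.
    Returns (W, Y_clean), where Y_clean zeroes out up to `noise_budget`
    candidate labels per sample that the current classifier scores lowest."""
    n, d = X.shape
    Y_clean = Y_candidate.astype(float).copy()
    for _ in range(n_iters):
        # Step 1: ridge-regression classifier fitted to the current cleaned labels.
        W = np.linalg.solve(X.T @ X + ridge * np.eye(d), X.T @ Y_clean)
        scores = X @ W  # (n, q) label scores
        # Step 2: within each sample's candidate set, drop the lowest-scoring labels.
        Y_clean = Y_candidate.astype(float).copy()
        for i in range(n):
            cand = np.flatnonzero(Y_candidate[i])
            if len(cand) > noise_budget:
                drop = cand[np.argsort(scores[i, cand])[:noise_budget]]
                Y_clean[i, drop] = 0.0
    return W, Y_clean

# Tiny synthetic usage example with randomly injected noisy labels.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 8))
Y_true = (rng.random((50, 4)) < 0.3).astype(int)
Y_candidate = np.clip(Y_true + (rng.random((50, 4)) < 0.2), 0, 1)
W, Y_clean = remove_noisy_labels(X, Y_candidate)
print(Y_clean.shape)
```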
Research Area(s)
- multi-label learning, partial label learning, partial multi-label learning
Citation Format(s)
Noisy Label Removal for Partial Multi-Label Learning. / Yang, Fuchao; Jia, Yuheng; Liu, Hui et al.
KDD '24: Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining. Association for Computing Machinery, 2024. p. 3724-3735 (Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining).