Abstract
This paper addresses the problem of partial multi-label learning (PML), a challenging weakly supervised learning framework in which each sample is associated with a candidate label set comprising both ground-truth labels and noisy labels. We theoretically show that a larger number of noisy labels in the candidate label set enlarges the generalization error bound and consequently degrades classification performance. Accordingly, the key to solving PML lies in accurately removing the noisy labels from the candidate label set. To achieve this, we leverage prior knowledge about the noisy labels in PML: they exist only within the candidate label set and take binary values. Specifically, we propose a constrained regression model that learns a PML classifier and identifies the noisy labels, with constraints that strictly enforce the location and value of the noisy labels. The supervision provided by the candidate label set is unreliable because of the noisy labels, whereas the non-candidate labels of a sample precisely indicate the classes to which the sample does not belong. To aid in identifying noisy labels, we therefore construct a competitive classifier based on the non-candidate labels. The PML classifier and the competitive classifier form a competitive relationship that encourages mutual learning. We formulate the proposed model as a discrete optimization problem to effectively remove the noisy labels, and we solve it using an alternating algorithm. Extensive experiments on 6 real-world partial multi-label data sets and 7 synthetic data sets, using various evaluation metrics, demonstrate that our method significantly outperforms state-of-the-art PML methods. The code implementation is publicly available at https://github.com/Yangfc-ML/NLR. © 2024 Copyright held by the owner/author(s).
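As a rough illustration of the two constraints named in the abstract (noisy labels occur only inside the candidate label set and take binary values), the NumPy sketch below alternates between fitting a regression classifier and updating a binary noise mask. It is not the authors' model: it assumes plain ridge regression and a 0.5 threshold, it omits the competitive classifier entirely, and the names `fit_with_noise_removal`, `lam`, and `n_iters` are hypothetical. The repository linked in the abstract is the authoritative implementation.

```python
import numpy as np


def fit_with_noise_removal(X, Y, lam=1.0, n_iters=10):
    """Alternate between a ridge classifier and a binary noise mask.

    X: (n, d) feature matrix.
    Y: (n, q) candidate label matrix with entries in {0, 1}.
    Returns W (d, q) and N (n, q), where N[i, j] = 1 flags candidate
    label j of sample i as noisy. N is binary and zero wherever Y is zero.
    Illustrative simplification only, not the model from the paper.
    """
    d = X.shape[1]
    N = np.zeros_like(Y, dtype=float)
    # Ridge solver matrix (X^T X + lam * I)^{-1} X^T, reused every iteration.
    A = np.linalg.solve(X.T @ X + lam * np.eye(d), X.T)
    for _ in range(n_iters):
        # Step 1: fit W to the current cleaned labels Y - N.
        W = A @ (Y - N)
        # Step 2: with W fixed, each entry of N can be chosen independently.
        # A candidate label has target 1 - N[i, j], so flagging it as noisy
        # (target 0) gives the smaller squared error when the score < 0.5.
        scores = X @ W
        N_new = ((Y == 1) & (scores < 0.5)).astype(float)
        if np.array_equal(N_new, N):
            break
        N = N_new
    # Final refit so that W matches the returned noise mask.
    W = A @ (Y - N)
    return W, N


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(50, 5))
    Y = (rng.random((50, 4)) < 0.5).astype(float)
    W, N = fit_with_noise_removal(X, Y)
    print("flagged as noisy:", int(N.sum()), "of", int(Y.sum()), "candidates")
```

Because the mask is built from `Y == 1`, a non-candidate label can never be flagged, which mirrors the location constraint; thresholding yields the binary values. Note that this simplified objective has a degenerate fixed point in which every candidate is flagged, which is one reason additional supervision such as the non-candidate labels is useful.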
| Original language | English |
|---|---|
| Title of host publication | KDD '24 |
| Subtitle of host publication | Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining |
| Publisher | Association for Computing Machinery |
| Pages | 3724-3735 |
| ISBN (Print) | 979-8-4007-0490-1 |
| DOIs | |
| Publication status | Published - Aug 2024 |
| Event | 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD 2024) - Centre de Convencions Internacional de Barcelona, Barcelona, Spain. Duration: 25 Aug 2024 → 29 Aug 2024. https://kdd2024.kdd.org/ https://dl.acm.org/conference/kdd/proceedings |
Publication series
| Name | Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining |
|---|---|
| ISSN (Print) | 2154-817X |
Conference
| Conference | 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD 2024) |
|---|---|
| Abbreviated title | ACM KDD 2024 |
| Place | Spain |
| City | Barcelona |
| Period | 25/08/24 → 29/08/24 |
| Internet address | |
Funding
This work was supported in part by the National Natural Science Foundation of China under Grant 62106044, in part by the Natural Science Foundation of Jiangsu Province under Grant BK20210221, and in part by the Hong Kong UGC under Grant UGC/FDS11/E02/22.
Research Keywords
- multi-label learning
- partial label learning
- partial multi-label learning
RGC Funding Information
- RGC-funded
Fingerprint
Dive into the research topics of 'Noisy Label Removal for Partial Multi-Label Learning'. Together they form a unique fingerprint.