Modeling Sequential Annotations for Sequence Labeling With Crowds
Research output: Journal Publications and Reviews (RGC: 21, 22, 62) › 21_Publication in refereed journal › peer-review
Detail(s)
Original language | English |
---|---|
Pages (from-to) | 2335-2345 |
Number of pages | 11 |
Journal / Publication | IEEE Transactions on Cybernetics |
Volume | 53 |
Issue number | 4 |
Online published | 19 Oct 2021 |
Publication status | Published - Apr 2023 |
Abstract
Crowd sequential annotation can be an efficient and cost-effective way to build large datasets for sequence labeling. Unlike the tagging of independent instances, the quality of a crowd-annotated label sequence depends on each annotator's expertise in capturing the internal dependencies among the tokens in the sequence. In this article, we propose modeling sequential annotation for sequence labeling with crowds (SA-SLC). First, a conditional probabilistic model is developed to jointly model sequential data and annotators' expertise, in which a categorical distribution is introduced to estimate the reliability of each annotator in capturing local and nonlocal label dependencies for sequential annotation. Second, to accelerate the marginalization of the proposed model, a valid label sequence inference (VLSE) method is proposed to derive the valid ground-truth label sequences from crowd sequential annotations. VLSE derives possible ground-truth labels at the token level and then prunes subpaths during forward inference for label sequence decoding. This reduces the number of candidate label sequences and improves the quality of the possible ground-truth label sequences. Experimental results on several sequence labeling tasks in natural language processing demonstrate the effectiveness of the proposed model.
© 2021 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.
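The abstract's description of VLSE (tokenwise candidate labels followed by subpath pruning in a forward pass) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state the pruning criterion, so the sketch assumes BIO tagging and treats a subpath as invalid when an `I-X` label does not follow `B-X` or `I-X`. The names `is_valid_transition` and `candidate_sequences` and the toy `crowd` data are hypothetical.

```python
from typing import List, Sequence


def is_valid_transition(prev: str, curr: str) -> bool:
    """Assumed BIO rule (not from the paper): I-X may only follow B-X or I-X."""
    if curr.startswith("I-"):
        entity = curr[2:]
        return prev in (f"B-{entity}", f"I-{entity}")
    return True


def candidate_sequences(annotations: Sequence[Sequence[str]]) -> List[List[str]]:
    """Enumerate label sequences built from tokenwise crowd labels,
    dropping subpaths that break the assumed BIO rule during the forward pass."""
    n_tokens = len(annotations[0])
    if any(len(a) != n_tokens for a in annotations):
        raise ValueError("all annotators must label the same tokens")

    # Tokenwise candidates: every label some annotator assigned to position t.
    per_token = [sorted({a[t] for a in annotations}) for t in range(n_tokens)]

    # Forward pass: extend each surviving subpath by one token at a time.
    paths: List[List[str]] = [[]]
    for labels in per_token:
        extended = []
        for path in paths:
            prev = path[-1] if path else "O"  # treat the sequence start like "O"
            for label in labels:
                if is_valid_transition(prev, label):
                    extended.append(path + [label])
        paths = extended
    return paths


# Hypothetical annotations from three crowd workers for one 3-token sentence.
crowd = [
    ["B-PER", "I-PER", "O"],
    ["B-PER", "O", "O"],
    ["O", "I-PER", "O"],
]
valid = candidate_sequences(crowd)
```

In this toy case the tokenwise candidates allow four label sequences, and the forward pruning discards the one that starts an entity with `I-PER`, leaving three. A probabilistic model would then marginalize over this smaller candidate set rather than over all label sequences.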
Research Area(s)
- Annotations, Crowdsourcing, Data models, Hidden Markov models, Labeling, labeling consistency, nonlocal label dependency, Probabilistic logic, Reliability, sequential annotations, Task analysis
Citation Format(s)
Modeling Sequential Annotations for Sequence Labeling With Crowds. / Lu, Xiaolei; Chow, Tommy W. S.
In: IEEE Transactions on Cybernetics, Vol. 53, No. 4, 04.2023, p. 2335-2345.