Abstract
With the rapid accumulation of text data produced by data-driven techniques, the task of extracting "data annotations" (concise, high-quality data summaries from unstructured raw text) has become increasingly important. Recent advances in weak supervision and crowd-sourcing techniques provide promising solutions for efficiently creating annotations (labels) for large-scale technical text data. However, such annotations may fail in practice because of changes in annotation requirements, application scenarios, and modeling goals, in which case label validation and relabeling by domain experts are required. To address this issue, we present LabelVizier, a human-in-the-loop workflow that incorporates domain knowledge and user-specific requirements to reveal actionable insights into annotation flaws and then produce better-quality labels for large-scale multi-label datasets. We implement our workflow as an interactive notebook to facilitate flexible error profiling, in-depth annotation validation for three error types, and efficient annotation relabeling on different data scales. We evaluated our workflow in assisting the validation and relabeling of technical text annotations with two use cases and four expert reviews. The results show that LabelVizier is applicable in various application scenarios, and that users with different knowledge backgrounds have diverse preferences for how they use the tool. © 2023 IEEE.
| Original language | English |
|---|---|
| Title of host publication | Proceedings - 2023 IEEE 16th Pacific Visualization Symposium |
| Subtitle of host publication | PacificVis 2023 |
| Publisher | IEEE |
| Pages | 167-176 |
| ISBN (Electronic) | 9798350321241 |
| ISBN (Print) | 979-8-3503-2125-8 |
| DOIs | |
| Publication status | Published - 2023 |
| Externally published | Yes |
| Event | 16th IEEE Pacific Visualization Symposium (PacificVis 2023) - Seoul Tourism Plaza, Seoul, Korea, Republic of. Duration: 18 Apr 2023 → 21 Apr 2023. https://pvis2023.github.io/pvis2023/ |
Publication series
| Name | IEEE Pacific Visualization Symposium |
|---|---|
| ISSN (Print) | 2165-8765 |
| ISSN (Electronic) | 2165-8773 |
Conference
| Conference | 16th IEEE Pacific Visualization Symposium (PacificVis 2023) |
|---|---|
| Place | Korea, Republic of |
| City | Seoul |
| Period | 18/04/23 → 21/04/23 |
| Internet address | |
Research Keywords
- Data Annotation
- Model Interpretation
- Technical Language Processing
- Workflow Design
Fingerprint
Dive into the research topics of 'LabelVizier: Interactive Validation and Relabeling for Technical Text Annotations'. Together they form a unique fingerprint.