Enhancing Automatic Readability Assessment with Pre-training and Soft Labels for Ordinal Regression

Research output: Chapters, Conference Papers, Creative and Literary Works; RGC 32 - Refereed conference paper (with host publication); peer-review

4 Scopus Citations

Author(s)

Zeng, Jinshan; Xie, Yudong; Yu, Xianglong et al.

Detail(s)

Original language: English
Title of host publication: Findings of the Association for Computational Linguistics
Subtitle of host publication: EMNLP 2022
Editors: Yoav Goldberg, Zornitsa Kozareva, Yue Zhang
Publisher: Association for Computational Linguistics
Pages: 4586-4597
Publication status: Published - Dec 2022

Conference

Title: 2022 Conference on Empirical Methods in Natural Language Processing (EMNLP 2022)
Location: Hybrid
Place: United Arab Emirates
City: Abu Dhabi
Period: 7 - 11 December 2022

Abstract

The readability assessment task aims to assign a difficulty grade to a text. While neural models have recently demonstrated impressive performance, most do not exploit the ordinal nature of the difficulty grades, and devote little effort to model initialization to facilitate fine-tuning. We address these limitations with soft labels for ordinal regression, and with model pre-training through prediction of pairwise relative text difficulty. We incorporate these two components into a model based on hierarchical attention networks, and evaluate its performance on both English and Chinese datasets. Experimental results show that our proposed model outperforms competitive neural models and statistical classifiers on most datasets.
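The abstract does not give the paper's exact soft-label formulation, but a common way to implement soft labels for ordinal regression is to replace the one-hot target with a distribution whose mass decays with ordinal distance from the true grade, so that misclassifying grade 2 as grade 3 is penalized less than misclassifying it as grade 5. A minimal sketch, assuming a distance-based softmax smoothing (the function name and `temperature` parameter are illustrative, not from the paper):

```python
import numpy as np

def soft_labels(true_grade: int, num_grades: int, temperature: float = 1.0) -> np.ndarray:
    """Build a soft target distribution over ordinal grades.

    Probability mass peaks at the true grade and decays with the
    absolute ordinal distance, controlled by `temperature`.
    """
    grades = np.arange(num_grades)
    # Negative distance as logits: farther grades get lower scores.
    logits = -np.abs(grades - true_grade) / temperature
    exp = np.exp(logits - logits.max())  # subtract max for numerical stability
    return exp / exp.sum()

# Grade 2 of 5: the distribution peaks at index 2 and decays
# symmetrically toward grades 0 and 4.
dist = soft_labels(true_grade=2, num_grades=5)
print(np.round(dist, 3))
```

Training against such a distribution (e.g. with a KL-divergence or cross-entropy loss) lets the model exploit the ordering of the grades rather than treating them as unrelated classes.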

Citation Format(s)

Enhancing Automatic Readability Assessment with Pre-training and Soft Labels for Ordinal Regression. / Zeng, Jinshan; Xie, Yudong; Yu, Xianglong et al.
Findings of the Association for Computational Linguistics: EMNLP 2022. ed. / Yoav Goldberg; Zornitsa Kozareva; Yue Zhang. Association for Computational Linguistics, 2022. p. 4586-4597.

