Abstract
Automatic readability assessment (ARA) predicts how difficult it is for the reader to understand a text. While ARA has traditionally been performed at the passage level, there has been increasing interest in ARA at the sentence level, given its applications in downstream tasks such as text simplification and language exercise generation. Recent research has suggested the effectiveness of hybrid approaches for ARA, but they have yet to be applied at the sentence level. We present the first study that compares neural and hybrid models for sentence-level ARA. We conducted experiments on graded sentences from the Wall Street Journal (WSJ) and a dataset derived from the OneStopEnglish corpus. Experimental results show that both neural and hybrid models outperform traditional classifiers trained on linguistic features. Hybrid models obtained the best accuracy on both datasets, surpassing the previous best result reported on the WSJ dataset by almost 13% absolute.
© 2023 Association for Computational Linguistics
Original language | English |
---|---|
Title of host publication | Proceedings of the 18th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2023) |
Publisher | Association for Computational Linguistics |
Pages | 448-454 |
Publication status | Published - 13 Jul 2023 |
Event | 18th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2023) - Hybrid, Toronto, Canada Duration: 13 Jul 2023 → 13 Jul 2023 https://sig-edu.org/bea/2023 |
Publication series
Name | Proceedings of the Annual Meeting of the Association for Computational Linguistics |
---|---|
ISSN (Print) | 0736-587X |
Conference
Conference | 18th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2023) |
---|---|
Country/Territory | Canada |
City | Toronto |
Period | 13/07/23 → 13/07/23 |