Robustness of Hybrid Models in Cross-domain Readability Assessment

Research output: Chapters, Conference Papers, Creative and Literary Works; RGC 32: Refereed conference paper (with ISBN/ISSN); peer-reviewed


Detail(s)

Original language: English
Title of host publication: Proceedings of the 20th Workshop of the Australasian Language Technology Association
Subtitle of host publication: ALTA 2022
Publisher: Australasian Language Technology Association
Pages: 62-67
Publication status: Published - Dec 2022

Conference

Title: 20th Annual Workshop of the Australasian Language Technology Association (ALTA 2022)
Location: Flinders University
Place: Australia
City: Adelaide
Period: 14 - 16 December 2022

Abstract

Recent studies in automatic readability assessment have shown that hybrid models — models that leverage both linguistically motivated features and neural models — can outperform neural models. However, most evaluations on hybrid models have been based on in-domain data in English. This paper provides further evidence on the contribution of linguistic features by reporting the first direct comparison between hybrid, neural and linguistic models on cross-domain data. In experiments on a Chinese dataset, the hybrid model outperforms the neural model on both in-domain and cross-domain data. Importantly, the hybrid model exhibits much smaller performance degradation in the cross-domain setting, suggesting that the linguistic features are more robust and can better capture salient indicators of text difficulty.
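The core idea of a hybrid model, as described above, is to combine handcrafted linguistic features with a neural text representation before classification. The sketch below illustrates this concatenation step only; the feature set, the stand-in encoder, and all function names are invented for illustration (the paper's actual features and neural model are not specified here), and a real system would use a trained encoder such as BERT.

```python
import numpy as np

# Hypothetical linguistic features: average sentence length, average word
# length, and type-token ratio. These are illustrative, not the paper's set.
def linguistic_features(text: str) -> np.ndarray:
    sentences = [s for s in text.split(".") if s.strip()]
    words = text.split()
    avg_sent_len = len(words) / max(len(sentences), 1)
    avg_word_len = sum(len(w) for w in words) / max(len(words), 1)
    ttr = len(set(w.lower() for w in words)) / max(len(words), 1)
    return np.array([avg_sent_len, avg_word_len, ttr])

# Stand-in for a neural sentence encoder: a fixed random projection of byte
# counts, used only to keep this sketch self-contained and runnable.
def mock_neural_embedding(text: str, dim: int = 8) -> np.ndarray:
    rng = np.random.default_rng(0)
    proj = rng.normal(size=(256, dim))
    counts = np.zeros(256)
    for byte in text.encode("utf-8"):
        counts[byte] += 1
    return counts @ proj

def hybrid_representation(text: str) -> np.ndarray:
    # The hybrid model concatenates both feature sources; a classifier
    # (e.g. a linear layer) would then map this vector to a readability level.
    return np.concatenate([linguistic_features(text), mock_neural_embedding(text)])

vec = hybrid_representation("The cat sat on the mat. It purred.")
print(vec.shape)  # (11,) -- 3 linguistic features + 8 embedding dimensions
```

Because the linguistic features are computed directly from surface properties of the text, they transfer across domains more readily than representations tuned to one corpus, which is the robustness effect the abstract reports.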

Citation Format(s)

Robustness of Hybrid Models in Cross-domain Readability Assessment. / Lim, Ho Hung; Cai, Tianyuan; Lee, John S. Y. et al.

Proceedings of the 20th Workshop of the Australasian Language Technology Association: ALTA 2022. Australasian Language Technology Association, 2022. p. 62-67.
