Robustness of Hybrid Models in Cross-domain Readability Assessment

Research output: Chapters, Conference Papers, Creative and Literary Works › RGC 32 - Refereed conference paper (with host publication) › peer-review


Abstract

Recent studies in automatic readability assessment have shown that hybrid models, which combine linguistically motivated features with neural representations, can outperform purely neural models. However, most evaluations of hybrid models have been based on in-domain English data. This paper provides further evidence on the contribution of linguistic features by reporting the first direct comparison between hybrid, neural and linguistic models on cross-domain data. In experiments on a Chinese dataset, the hybrid model outperforms the neural model on both in-domain and cross-domain data. Importantly, the hybrid model exhibits much smaller performance degradation in the cross-domain setting, suggesting that linguistic features are more robust and better capture salient indicators of text difficulty.
Original language: English
Title of host publication: Proceedings of the 20th Workshop of the Australasian Language Technology Association
Subtitle of host publication: ALTA 2022
Publisher: Australasian Language Technology Association
Pages: 62-67
Publication status: Published - Dec 2022
Event: 20th Annual Workshop of the Australasian Language Technology Association (ALTA 2022) - Flinders University, Adelaide, Australia
Duration: 14 Dec 2022 - 16 Dec 2022
https://alta2022.alta.asn.au/

Publication series

Name: Proceedings of the Australasian Language Technology Workshop
ISSN (Print): 1834-7037

Conference

Conference: 20th Annual Workshop of the Australasian Language Technology Association (ALTA 2022)
Place: Australia
City: Adelaide
Period: 14/12/22 - 16/12/22
