Semantic Modeling for Sentence-level Readability Assessment
Description

Effective communication through text is critical for an information society. Written materials intended for the general public should be easy for the average reader to understand, but often they are not. In principle, automatic readability assessment should help promote the use of clear and plain language, but current readability models do not give sufficient feedback for this purpose. Their feedback typically consists of a single score expressing the school grade or proficiency level a reader needs to understand the document. This document-level score offers no guidance on which sentences, and what aspects of those sentences, are difficult to understand.

To address this need, this project aims to build a readability assessment system that works at the sentence level. Document-level readability models cannot be applied directly to sentences because they rely mostly on surface cues, such as lexical and syntactic features. While these features provide reliable aggregate statistics over a document, they work less well on single sentences, which offer fewer cues for readability prediction. Accurate readability assessment of a sentence likely requires deeper semantic analysis to model its meaning and its cohesion with the context.

We propose a sentence-level readability assessment system that incorporates semantic features, and we evaluate its performance on Chinese documents. To build this system, we exploit a state-of-the-art Chinese semantic parser and a database of semantic verb frames. This project will answer two research questions: (1) whether semantic modeling improves accuracy in readability prediction at the sentence level, and (2) whether it facilitates more efficient and effective text revision. Our findings will have implications for the future design of readability models, as well as for computer-assisted text revision and simplification systems.
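To make the contrast between surface cues and semantic cues concrete, here is a minimal, hypothetical Python sketch of sentence-level feature extraction. The function names and the tiny verb-frame lookup table are illustrative assumptions only; the project itself relies on a Chinese semantic parser and a real verb-frame database, neither of which is modeled here.

```python
# Hypothetical sketch: surface vs. semantic features for one sentence.
# TOY_VERB_FRAMES is a stand-in for a real semantic resource such as the
# verb-frame database mentioned in the project description.

TOY_VERB_FRAMES = {
    "give": 3,  # agent, theme, recipient -> higher argument complexity
    "run": 1,   # agent only
}

def surface_features(tokens):
    """Cues of the kind document-level readability models rely on."""
    avg_len = sum(len(t) for t in tokens) / len(tokens)
    return {"n_tokens": len(tokens), "avg_token_len": avg_len}

def semantic_features(tokens, verb_frames=TOY_VERB_FRAMES):
    """A sentence-level model would add deeper cues, e.g. how many
    arguments the sentence's verbs take (a rough complexity proxy)."""
    arities = [verb_frames[t] for t in tokens if t in verb_frames]
    return {"max_verb_arity": max(arities, default=0)}

def sentence_features(sentence):
    """Combine both feature groups for a single sentence."""
    tokens = sentence.lower().split()
    feats = surface_features(tokens)
    feats.update(semantic_features(tokens))
    return feats
```

For example, `sentence_features("They give the teacher a book")` yields 6 tokens and a maximum verb arity of 3, whereas a sentence with the same length built around a one-argument verb would score lower on the semantic feature while looking identical on the surface ones.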
Effective start/end date: 1/01/21 → …