Using Syntactic Parse Patterns in Grammatical Error Detection


Student thesis: Doctoral Thesis

Awarding Institution
Award date: 8 May 2018


This dissertation explores the use of parse tree patterns in detecting grammatical errors in texts written by English learners. Most approaches to this task to date have relied on lexical and part-of-speech features. Much less explored is the use of parse patterns to capture syntactic structures that are indicative of these errors.

As a preliminary study, we first use hand-crafted parse patterns as features in a statistical model to detect comma splices in texts. Next, we develop a generalized approach that can be applied to different types of errors. Parse patterns indicative of the errors are discovered automatically during training, which eliminates the need to tailor a feature set for each specific error type. Finally, we explore whether syntactic parse patterns can be used to improve the performance of grammatical error detection in neural networks.

Through these experiments, we show that although parser performance degrades on learner texts, parsers can still be useful for identifying grammatical errors if they produce consistent patterns that indicate individual error types. We also present the first study that focuses on the detection of comma splices and sentence fragments, both of which are long-distance errors that are common among English learners. Using syntactic parse patterns as features, our models achieve state-of-the-art results for these two error types, and comparable performance on subject-verb agreement errors.