Detection of non-native sentences using machine-translated training data

Research output: Chapters, Conference Papers, Creative and Literary Works; RGC 32 - Refereed conference paper (with host publication); peer-review

11 Scopus Citations

Author(s)

Lee, John; Zhou, Ming; Liu, Xiaohua

Detail(s)

Original language: English
Title of host publication: NAACL-HLT 2007 - Human Language Technology Conference of the North American Chapter of the Association of Computational Linguistics, Companion Volume: Short Papers
Publisher: Association for Computational Linguistics (ACL)
Pages: 93-96
Publication status: Published - 2007
Externally published: Yes

Publication series

Name: NAACL-HLT 2007 - Human Language Technology Conference of the North American Chapter of the Association of Computational Linguistics, Companion Volume: Short Papers

Conference

Title: 2007 Human Language Technology Conference of the North American Chapter of the Association of Computational Linguistics, NAACL-HLT 2007
Place: United States
City: Rochester
Period: 22 - 27 April 2007

Abstract

Training statistical models to detect non-native sentences requires a large corpus of non-native writing samples, which is often not readily available. This paper examines the extent to which machine-translated (MT) sentences can substitute as training data. Two tasks are examined. For the native vs. non-native classification task, non-native training data yields better performance; for the ranking task, however, models trained with a large, publicly available set of MT data perform as well as those trained with non-native data.


Citation Format(s)

Detection of non-native sentences using machine-translated training data. / Lee, John; Zhou, Ming; Liu, Xiaohua.
NAACL-HLT 2007 - Human Language Technology Conference of the North American Chapter of the Association of Computational Linguistics, Companion Volume: Short Papers. Association for Computational Linguistics (ACL), 2007. p. 93-96 (NAACL-HLT 2007 - Human Language Technology Conference of the North American Chapter of the Association of Computational Linguistics, Companion Volume: Short Papers).
