L1-L2 Parallel Treebank of Learner Chinese: Overused and Underused Syntactic Structures

Research output: Chapters, Conference Papers, Creative and Literary Works; RGC 32 - Refereed conference paper (with host publication); peer-review

1 Citation (Scopus)

Abstract

We present a preliminary analysis of a corpus of texts written by learners of Chinese as a foreign language (CFL), annotated in the form of an L1-L2 parallel dependency treebank. The treebank consists of parse trees of sentences written by CFL learners (“L2 sentences”), parse trees of their target hypotheses (“L1 sentences”), and word alignments between the L1 sentences and the L2 sentences. Currently, the treebank consists of 600 L2 sentences and 697 L1 sentences. We report the syntactic relations most overused and underused by the CFL learners, and discuss the underlying learner errors.
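As a rough illustration of the kind of comparison the abstract describes, the sketch below counts dependency relations in L2 trees and in their L1 target hypotheses and ranks the relations by the difference in relative frequency. The abstract does not say how overuse and underuse are measured, so the relative-frequency difference is an assumption, and the triple format, the relation labels, and the toy sentences are hypothetical and are not taken from the treebank.

```python
from collections import Counter

# Hypothetical toy data: each tree is a list of (dependent, head, relation)
# triples. Tokens and labels are invented for illustration only.
l2_trees = [
    [("w1", "w2", "nsubj"), ("w2", "ROOT", "root"),
     ("w3", "w2", "aux"), ("w4", "w2", "obj")],
]
l1_trees = [
    [("w1", "w2", "nsubj"), ("w2", "ROOT", "root"), ("w4", "w2", "obj")],
]


def relation_counts(trees):
    """Count how often each dependency relation occurs across all trees."""
    return Counter(rel for tree in trees for _, _, rel in tree)


def relative_frequencies(counts):
    """Convert raw counts to proportions of all relations in the sample."""
    total = sum(counts.values()) or 1  # avoid division by zero on empty input
    return {rel: n / total for rel, n in counts.items()}


def usage_differences(l2, l1):
    """Rank relations by (L2 relative frequency - L1 relative frequency).

    Positive values suggest overuse by learners, negative values underuse,
    under the assumed measure described above.
    """
    f2 = relative_frequencies(relation_counts(l2))
    f1 = relative_frequencies(relation_counts(l1))
    rels = set(f2) | set(f1)
    diffs = [(rel, f2.get(rel, 0.0) - f1.get(rel, 0.0)) for rel in rels]
    return sorted(diffs, key=lambda pair: pair[1], reverse=True)


diffs = usage_differences(l2_trees, l1_trees)
for rel, delta in diffs:
    print(f"{rel}\t{delta:+.3f}")
```

In this toy input the `aux` relation appears only in the L2 tree, so it ranks first. A real analysis would also need the word alignment to trace each difference back to a specific learner error, which this sketch does not attempt.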
Original language: English
Title of host publication: LREC 2018, Eleventh International Conference on Language Resources and Evaluation Proceedings
Editors: Nicoletta Calzolari, Khalid Choukri, Christopher Cieri, Thierry Declerck, Sara Goggi, Koiti Hasida, Hitoshi Isahara, Bente Maegaard, Joseph Mariani, Hélène Mazo, Asuncion Moreno, Jan Odijk, Stelios Piperidis, Takenobu Tokunaga
Publisher: European Language Resources Association (ELRA)
Pages: 4106-4110
ISBN (Print): 9791095546009
Publication status: Published - May 2018
Event: Eleventh International Conference on Language Resources and Evaluation (LREC 2018) - Phoenix Seagaia Conference Center, Miyazaki, Japan
Duration: 7 May 2018 to 12 May 2018
http://www.lrec-conf.org/proceedings/lrec2018/index.html
http://www.lrec-conf.org/proceedings/lrec2018/pdf/book_of_abstracts.pdf
http://www.lrec-conf.org/proceedings/lrec2018/summaries/901.html

Publication series

Name: LREC 2018, Eleventh International Conference on Language Resources and Evaluation

Conference

Conference: Eleventh International Conference on Language Resources and Evaluation (LREC 2018)
Abbreviated title: LREC 2018
Place: Japan
City: Miyazaki
Period: 7/05/18 to 12/05/18

Research Keywords

  • learner corpus
  • parallel treebank
  • Chinese as a foreign language
