Abstract
We present a preliminary analysis of a corpus of texts written by learners of Chinese as a foreign language (CFL), annotated in the form of an L1-L2 parallel dependency treebank. The treebank consists of parse trees of sentences written by CFL learners (“L2 sentences”), parse trees of their target hypotheses (“L1 sentences”), and word alignments between the L1 and L2 sentences. The treebank currently contains 600 L2 sentences and 697 L1 sentences. We report the syntactic relations that the CFL learners most overuse and underuse, and discuss the learner errors underlying them.
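One way to picture the overuse/underuse comparison is as a ratio of relative frequencies of dependency relation labels in the L2 trees versus the L1 trees. The sketch below is only an illustration under stated assumptions: the function name `relation_ratios`, the add-one smoothing, the UD-style labels, and the toy label lists are all hypothetical and are not taken from the paper, which may use a different measure.

```python
from collections import Counter


def relation_ratios(l2_relations, l1_relations, smoothing=1.0):
    """Map each relation label to (L2 relative frequency / L1 relative frequency).

    A ratio above 1 suggests overuse by learners, below 1 suggests underuse.
    Additive smoothing (an assumption of this sketch) avoids division by zero
    for labels that occur on only one side.
    """
    l2_counts = Counter(l2_relations)
    l1_counts = Counter(l1_relations)
    labels = set(l2_counts) | set(l1_counts)
    l2_total = sum(l2_counts.values()) + smoothing * len(labels)
    l1_total = sum(l1_counts.values()) + smoothing * len(labels)
    ratios = {}
    for label in labels:
        l2_freq = (l2_counts[label] + smoothing) / l2_total
        l1_freq = (l1_counts[label] + smoothing) / l1_total
        ratios[label] = l2_freq / l1_freq
    return ratios


# Hypothetical toy input: one relation label per dependency arc.
l2 = ["nsubj", "obj", "advmod", "advmod", "advmod", "case"]
l1 = ["nsubj", "obj", "advmod", "case", "case", "mark"]

ratios = relation_ratios(l2, l1)
# Sort from most overused to most underused.
ranked = sorted(ratios.items(), key=lambda kv: kv[1], reverse=True)
for label, ratio in ranked:
    print(f"{label}\t{ratio:.2f}")
```

With real data, the two label lists would be read from the L2 and L1 sides of the treebank; the word alignments could then be used to inspect which aligned words account for a skewed ratio.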
| Original language | English |
|---|---|
| Title of host publication | LREC 2018, Eleventh International Conference on Language Resources and Evaluation Proceedings |
| Editors | Nicoletta Calzolari, Khalid Choukri, Christopher Cieri, Thierry Declerck, Sara Goggi, Koiti Hasida, Hitoshi Isahara, Bente Maegaard, Joseph Mariani, Hélène Mazo, Asuncion Moreno, Jan Odijk, Stelios Piperidis, Takenobu Tokunaga |
| Publisher | European Language Resources Association (ELRA) |
| Pages | 4106-4110 |
| ISBN (Print) | 9791095546009 |
| Publication status | Published - May 2018 |
| Event | Eleventh International Conference on Language Resources and Evaluation (LREC 2018), Phoenix Seagaia Conference Center, Miyazaki, Japan. Duration: 7 May 2018 → 12 May 2018. http://www.lrec-conf.org/proceedings/lrec2018/index.html, http://www.lrec-conf.org/proceedings/lrec2018/pdf/book_of_abstracts.pdf, http://www.lrec-conf.org/proceedings/lrec2018/summaries/901.html |
Publication series
| Name | LREC 2018, Eleventh International Conference on Language Resources and Evaluation |
|---|---|
Conference
| Conference | Eleventh International Conference on Language Resources and Evaluation (LREC 2018) |
|---|---|
| Abbreviated title | LREC 2018 |
| Place | Japan |
| City | Miyazaki |
| Period | 7/05/18 → 12/05/18 |
Research Keywords
- learner corpus
- parallel treebank
- Chinese as a foreign language