Universal Dependencies for Mandarin Chinese

Research output: Journal Publications and Reviews; RGC 21 - Publication in refereed journal; peer-review

Detail(s)

Original language: English
Pages (from-to): 673-710
Journal / Publication: Language Resources and Evaluation
Volume: 57
Issue number: 2
Online published: 24 Nov 2021
Publication status: Published - Jun 2023

Abstract

This article presents a Universal Dependency (UD) annotation scheme for Mandarin Chinese, as well as the current UD Chinese HK treebank. We focus mainly on part-of-speech tags and syntactic relations, and investigate a wide range of phenomena. The main goal is to make transparent the linguistic considerations behind our annotation choices and to show how we aligned these choices with the criteria of Universal Dependencies. The scheme has been developed with reference to two other dependency schemes for this language, namely the Chinese Stanford Dependencies (Chang et al., 2009) and the Chinese Dependency Treebank (HIT-SCIR, 2010), and we provide mappings between our scheme and each of them. The content of the UD Chinese HK treebank is discussed in relation to the other UD treebanks for Chinese, and the inter-annotator agreement on POS and dependency annotation is reported. Our proposed scheme is motivated by reasoned linguistic analysis, is suitable for cross-linguistic comparison, and yields a high level of agreement between annotators.

© The Author(s), under exclusive licence to Springer Nature B.V. 2021, corrected publication 2022

Research Area(s)

  • Chinese, Universal dependencies, Treebank, Annotation scheme

Citation Format(s)

Universal Dependencies for Mandarin Chinese. / Poiret, Rafaël; Wong, Tak-Sum; Lee, John et al.
In: Language Resources and Evaluation, Vol. 57, No. 2, 06.2023, p. 673-710.
