Analyzing who, what, and where in a mediaeval Chinese corpus: A case study on the Chinese Buddhist Canon
Research output: Chapters, Conference Papers, Creative and Literary Works › RGC 12 - Chapter in an edited book (Author) › peer-review
Detail(s)
Original language | English |
---|---|
Title of host publication | Advances in Corpus Applications in Literary and Translation Studies |
Editors | Riccardo Moratto, Defeng Li |
Place of Publication | London |
Publisher | Routledge |
Pages | 81-102 |
ISBN (electronic) | 9781003298328 |
ISBN (print) | 9781032287386, 9781032287409 |
Publication status | Published - 2023 |
Publication series
Name | Routledge Advances in Translation and Interpreting Studies |
---|---|
Abstract
Information extraction from historical text is challenging because of the lack of data to train natural language processing tools. This chapter evaluates the utility of in-domain training data for data-driven profiling of characters, verbs, and toponyms and reports a case study on a corpus of Chinese Buddhist text. As is typical for such a corpus, the Chinese Buddhist Canon has few annotated linguistic resources other than lexica of names, places, and domain-specific terms. We apply a lexicon-based approach for named entity recognition and then report an analysis of the “who,” “what,” and “where” of the Canon: who the characters were, what they did, and where they were. Experimental results also show that even a small amount of word segmentation, part-of-speech, and dependency annotation can improve accuracy in named entity recognition and in extraction of character-verb associations.
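As a rough illustration of what a lexicon-based approach to named entity recognition can look like, the sketch below tags names and places by greedy longest-match lookup against a small dictionary. The abstract does not describe the chapter's matching strategy, so the longest-match rule, the function name `tag_entities`, the label set, and the four-entry toy lexicon are all assumptions made for illustration, not the authors' implementation.

```python
from typing import Dict, List, Tuple

# Toy lexicon: hypothetical entries for illustration only. A real system
# would load a full lexicon of names and places for the corpus.
LEXICON: Dict[str, str] = {
    "舍利弗": "PERSON",
    "阿難": "PERSON",
    "王舍城": "PLACE",
    "舍衛國": "PLACE",
}


def tag_entities(text: str, lexicon: Dict[str, str]) -> List[Tuple[int, int, str, str]]:
    """Return (start, end, surface, label) spans found by greedy longest match.

    Assumption: the text is unsegmented, so matching runs over raw characters.
    """
    max_len = max((len(entry) for entry in lexicon), default=0)
    spans: List[Tuple[int, int, str, str]] = []
    i = 0
    while i < len(text):
        match = None
        # Try the longest candidate first so longer entries win over shorter ones.
        for n in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + n]
            if candidate in lexicon:
                match = (i, i + n, candidate, lexicon[candidate])
                break
        if match is not None:
            spans.append(match)
            i = match[1]  # continue after the matched entity
        else:
            i += 1  # no entry starts here; advance one character
    return spans


if __name__ == "__main__":
    for span in tag_entities("舍利弗在王舍城", LEXICON):
        print(span)
```

Matching over raw characters can fire across true word boundaries, which is one plausible reason why the word segmentation annotation mentioned in the abstract would help; a segmentation-aware variant would restrict candidate spans to token boundaries.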
Citation Format(s)
Analyzing who, what, and where in a mediaeval Chinese corpus: A case study on the Chinese Buddhist Canon. / Wong, Tak-sum; Lee, John Sie Yuen.
Advances in Corpus Applications in Literary and Translation Studies. ed. / Riccardo Moratto; Defeng Li. London: Routledge, 2023. p. 81-102 (Routledge Advances in Translation and Interpreting Studies).