Unsupervised Adverbial Identification in Modern Chinese Literature

Research output: Chapters, Conference Papers, Creative and Literary Works; RGC 32 - Refereed conference paper (with host publication); peer-review

2 Citations (Scopus)

Abstract

In many languages, adverbials can be derived from words of various parts-of-speech. In Chinese, the derivation may be marked either with the standard adverbial marker DI, or the non-standard marker DE. Since DE also serves double duty as the attributive marker, accurate identification of adverbials requires disambiguation of its syntactic role. As parsers are trained predominantly on texts using the standard adverbial marker DI, they often fail to recognize adverbials suffixed with the non-standard DE. This paper addresses this problem with an unsupervised, rule-based approach for adverbial identification that utilizes dependency tree patterns. Experiment results show that this approach outperforms a masked language model baseline. We apply this approach to analyze standard and non-standard adverbial marker usage in modern Chinese literature.
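
The abstract does not spell out the dependency tree patterns themselves, so the following is only a minimal sketch of what a rule of this kind might look like, not the authors' method. It assumes a parsed sentence is available as a list of tokens with universal part-of-speech tags and head indices, that DE (的) attaches to the word it marks, and that the marked word's own head decides the reading: a verbal head suggests an adverbial, a nominal head an attributive. The names `Token`, `classify_de`, and the tag sets are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Token:
    idx: int    # 1-based position in the sentence
    form: str   # surface form
    upos: str   # universal POS tag (assumed tag set)
    head: int   # idx of the syntactic head, 0 for the root


# Assumed groupings, chosen for illustration only.
VERBAL = {"VERB", "ADJ"}
NOMINAL = {"NOUN", "PROPN", "PRON"}


def classify_de(tokens: List[Token], de_idx: int) -> str:
    """Guess the role of the DE token at position de_idx.

    Hypothetical rule: look at the word DE is attached to (its host),
    then at the word that host modifies (the target).
    """
    by_idx = {t.idx: t for t in tokens}
    host = by_idx.get(by_idx[de_idx].head)
    if host is None or host.head == 0:
        return "unknown"  # the host modifies nothing we can inspect
    target = by_idx.get(host.head)
    if target is None:
        return "unknown"
    if target.upos in VERBAL:
        return "adverbial"
    if target.upos in NOMINAL:
        return "attributive"
    return "unknown"


# "他 高兴 的 笑 了": DE marks 高兴, which modifies the verb 笑.
ADVERBIAL_EXAMPLE = [
    Token(1, "他", "PRON", 4),
    Token(2, "高兴", "ADJ", 4),
    Token(3, "的", "PART", 2),
    Token(4, "笑", "VERB", 0),
    Token(5, "了", "PART", 4),
]

# "高兴 的 孩子": DE marks 高兴, which modifies the noun 孩子.
ATTRIBUTIVE_EXAMPLE = [
    Token(1, "高兴", "ADJ", 3),
    Token(2, "的", "PART", 1),
    Token(3, "孩子", "NOUN", 0),
]

print(classify_de(ADVERBIAL_EXAMPLE, 3))    # adverbial
print(classify_de(ATTRIBUTIVE_EXAMPLE, 2))  # attributive
```

The hand-built trees above stand in for parser output. In practice the rule is only as reliable as the parse it reads, which is the difficulty the abstract raises for text that uses the non-standard marker.
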
Original language: English
Title of host publication: Proceedings of the 5th Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature
Editors: Stefania Degaetano-Ortlieb, Anna Kazantseva, Nils Reiter, Stan Szpakowicz
Publisher: Association for Computational Linguistics
Pages: 91-95
Number of pages: 5
ISBN (Print): 9781954085916
Publication status: Published - Nov 2021
Event: 5th Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature (LaTeCHCLfL 2021) - Virtual, Punta Cana, Dominican Republic
Duration: 7 Nov 2021 to 11 Nov 2021
https://aclanthology.org/2021.latechclfl-1
https://sighum.wordpress.com/events/latech-clfl-2021/

Publication series

Name: Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature, LaTeCHCLfL - Co-located with the Conference on Empirical Methods in Natural Language Processing, EMNLP - Proceedings

Conference

Conference: 5th Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature (LaTeCHCLfL 2021)
Abbreviated title: LaTeCH-CLfL 2021
Country/Territory: Dominican Republic
City: Punta Cana
Period: 7/11/21 to 11/11/21