Abstract
To promote efficient learning of Chinese characters, pedagogical materials may present not only a single character, but a set of characters that are related in meaning and in written form. This paper investigates automatic construction of these character sets. The proposed model represents a character as averaged word vectors of common words containing the character. It then identifies sets of characters with high semantic similarity through clustering. Human evaluation shows that this representation outperforms direct use of character embeddings, and that the resulting character sets capture distinct semantic ranges.
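The representation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the word vectors are made-up toy values, the names `WORD_VECTORS`, `char_vector`, and `group_chars` are hypothetical, and the greedy threshold grouping is a simple stand-in for the clustering step, whose algorithm the abstract does not specify. The sketch covers only semantic similarity, not similarity in written form.

```python
import numpy as np

# Toy word vectors (hypothetical values, for illustration only).
# In practice these would come from pretrained word embeddings.
WORD_VECTORS = {
    "河流": np.array([0.9, 0.1, 0.0]),
    "河水": np.array([0.8, 0.2, 0.0]),
    "湖水": np.array([0.85, 0.15, 0.0]),
    "湖泊": np.array([0.9, 0.05, 0.05]),
    "树木": np.array([0.0, 0.9, 0.1]),
    "树林": np.array([0.05, 0.95, 0.0]),
    "森林": np.array([0.0, 1.0, 0.0]),
}


def char_vector(char, word_vectors):
    """Represent a character as the mean vector of the words containing it."""
    vecs = [vec for word, vec in word_vectors.items() if char in word]
    if not vecs:
        raise KeyError(f"no word contains {char!r}")
    return np.mean(vecs, axis=0)


def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


def group_chars(chars, word_vectors, threshold=0.9):
    """Greedy grouping (an assumed stand-in for the paper's clustering):
    a character joins the first set whose first member is similar enough,
    otherwise it starts a new set."""
    vectors = {c: char_vector(c, word_vectors) for c in chars}
    groups = []
    for c in chars:
        for group in groups:
            if cosine(vectors[c], vectors[group[0]]) >= threshold:
                group.append(c)
                break
        else:
            groups.append([c])
    return groups


groups = group_chars(["河", "湖", "树", "林"], WORD_VECTORS)
print(groups)
```

With these toy vectors, the water-related characters and the tree-related characters fall into separate sets. The threshold of 0.9 is arbitrary and would need tuning on real embeddings.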
Original language | English |
---|---|
Title of host publication | Proceedings of the 16th Workshop on Innovative Use of NLP for Building Educational Applications |
Publisher | Association for Computational Linguistics |
Pages | 59–63 |
ISBN (Electronic) | 978-1-954085-11-4 |
ISBN (Print) | 978-1-954085-11-4 |
Publication status | Published - 20 Apr 2021 |
Event | 16th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2021), Virtual. Duration: 20 Apr 2021 → …. Website: https://sig-edu.org/bea/current |
Conference
Conference | 16th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2021) |
---|---|
Abbreviated title | BEA2021 |
Period | 20/04/21 → … |
Publisher's Copyright Statement
- This full text is made available under CC-BY 4.0. https://creativecommons.org/licenses/by/4.0/