Automatic Generation of Vocabulary Lists with Multiword Expressions

John S. Y. Lee, Adilet Uvaliyev

Research output: Chapters, Conference Papers, Creative and Literary Works; RGC 32 - Refereed conference paper (with host publication); peer-review


Abstract

The importance of multiword expressions (MWEs) for language learning is well established. While MWE research has been evaluated on various downstream tasks such as syntactic parsing and machine translation, its applications in computer-assisted language learning have been less explored. This paper investigates the selection of MWEs for graded vocabulary lists. Widely used by language teachers and students, these lists recommend a language acquisition sequence to optimize learning efficiency. We automatically generate these lists using difficulty-graded corpora and MWEs extracted based on semantic compositionality. We evaluate these lists on their ability to facilitate text comprehension for learners. Experimental results show that our proposed method generates higher-quality lists than baselines using collocation measures. © 2023 Association for Computational Linguistics
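For readers unfamiliar with the collocation measures mentioned as baselines, the following is a minimal sketch of one common measure, pointwise mutual information (PMI), computed over adjacent word pairs. The abstract does not say which collocation measures the paper uses, so PMI is an assumption chosen for illustration; the function name `bigram_pmi`, the `min_count` threshold, and the toy corpus are all hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def bigram_pmi(sentences, min_count=2):
    """Score adjacent word pairs by pointwise mutual information (PMI).

    Illustrative only: PMI is one common collocation measure, not
    necessarily the one used in the paper's baselines.
    """
    unigrams = Counter()
    bigrams = Counter()
    for tokens in sentences:
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))

    n_uni = sum(unigrams.values())
    n_bi = sum(bigrams.values())

    scores = {}
    for (w1, w2), count in bigrams.items():
        if count < min_count:
            continue  # skip rare pairs, whose PMI estimates are unreliable
        p_pair = count / n_bi
        p_w1 = unigrams[w1] / n_uni
        p_w2 = unigrams[w2] / n_uni
        scores[(w1, w2)] = math.log2(p_pair / (p_w1 * p_w2))
    return scores


# Hypothetical toy corpus, already tokenized.
toy_corpus = [
    "she will take off her coat".split(),
    "the plane will take off soon".split(),
    "he will take the bus".split(),
]

scores = bigram_pmi(toy_corpus)
for pair, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(pair, round(score, 2))
```

A measure like this ranks word pairs by how much more often they co-occur than chance would predict. It does not look at whether the meaning of the pair can be derived from its parts, which is the property (semantic compositionality) that the abstract says the proposed method relies on.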
Original language: English
Title of host publication: The 19th Workshop on Multiword Expressions (MWE 2023)
Subtitle of host publication: Proceedings of the Workshop
Publisher: Association for Computational Linguistics
Pages: 81–86
ISBN (Electronic): 978-1-959429-59-3
Publication status: Published - 6 May 2023
Event: 19th Workshop on Multiword Expressions (MWE 2023) - Hybrid, Dubrovnik, Croatia
Duration: 6 May 2023 – 6 May 2023
https://multiword.org/mwe2023/

Publication series

Name: Workshop on Multiword Expressions, MWE - Proceedings

Conference

Conference: 19th Workshop on Multiword Expressions (MWE 2023)
Country/Territory: Croatia
City: Dubrovnik
Period: 6/05/23 – 6/05/23

Publisher's Copyright Statement

  • This full text is made available under CC-BY 4.0. https://creativecommons.org/licenses/by/4.0/
