SCA-CLS: A New Semantic-Context-Aware Framework for Community-Oriented Lexical Simplification

Rongying Li, Wenxiu Xie, John Lee, Tianyong Hao*

*Corresponding author for this work

Research output: Chapters, Conference Papers, Creative and Literary Works; RGC 32 - Refereed conference paper (with host publication); peer-review

Abstract

Community-oriented lexical simplification aims to replace complex words in a sentence with simpler, semantically consistent substitutes drawn from a community-specific vocabulary. Most state-of-the-art contextual word embedding models generate substitutes by extracting contextual information about the complex word. Although these models take context into account, they fail to capture the rich semantics of polysemous complex words, which results in many spurious and semantically non-equivalent candidates. This paper therefore proposes a novel Semantic-Context-Aware framework for Community-oriented Lexical Simplification (SCA-CLS), which integrates glosses (sense definitions) into BERT to identify the actual sense of a complex word (especially a polysemous one) in its current context and ranks substitutes by a proposed gloss similarity. In addition, a new complexity feature is proposed to enhance substitute ranking. Experimental results on a Wikipedia dataset show that SCA-CLS outperforms the state-of-the-art Merge-Sort model on both the substitute generation and ranking tasks, indicating its effectiveness for community-oriented lexical simplification.

© 2023, The Author(s), under exclusive license to Springer Nature Switzerland AG.

Original language: English
Title of host publication: Natural Language Processing and Chinese Computing - 12th National CCF Conference, NLPCC 2023, Proceedings, Part I
Editors: Fei Liu, Nan Duan, Qingting Xu, Yu Hong
Publisher: Springer, Cham
Pages: 69-81
ISBN (Electronic): 9783031446931
ISBN (Print): 9783031446924
Publication status: Published - 2023
Event: 12th National CCF Conference on Natural Language Processing and Chinese Computing (NLPCC 2023) - Foshan, China
Duration: 12 Oct 2023 - 15 Oct 2023

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 14302 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 12th National CCF Conference on Natural Language Processing and Chinese Computing (NLPCC 2023)
Country/Territory: China
City: Foshan
Period: 12/10/23 - 15/10/23

Research Keywords

  • BERT
  • Gloss
  • Lexical simplification
  • Ranking
  • Semantic
