中文分词十年又回顾: 2007-2017

Translated title of the contribution: Chinese Word Segmentation: Another Decade Review (2007-2017)

Research output: Chapters, Conference Papers, Creative and Literary Works › RGC 12 - Chapter in an edited book (Author)

Abstract

This paper reviews the development of Chinese word segmentation (CWS) in the most recent decade, 2007-2017. Special attention is paid to the deep learning technologies that have already permeated most areas of natural language processing (NLP). The basic view we have arrived at is that, compared to traditional supervised learning methods, neural-network-based methods have not shown any superior performance. The most critical challenge still lies in balancing the recognition of in-vocabulary (IV) and out-of-vocabulary (OOV) words. However, as neural models have the potential to capture the essential linguistic structure of natural language, we are optimistic that significant progress may arrive in the near future.
Original language: Chinese (Simplified)
Title of host publication: 实证和语料库语言学前沿
Editors: 揭春雨, 刘美君
Publisher: 中国社会科学出版社
Chapter: 5
Pages: 139-162
Number of pages: 24
ISBN (Print): 978-7-5203-2831-9
Publication status: Published - Sept 2018

Research Keywords

  • 中文分词
  • 神经网络
  • Chinese word segmentation
  • neural networks
