Abstract
This paper describes a novel character tagging approach to Chinese word segmentation and named entity recognition (NER) for our participation in Bakeoff-4. It successfully integrates unsupervised segmentation and conditional random field (CRF) learning, using similar character tags and feature templates for both word segmentation and NER. It ranks at the top in all closed tests of word segmentation and gives promising results for all closed and open NER tasks in the Bakeoff. Tag set selection and unsupervised segmentation play a critical role in this success. © IJCNLP 2008. All rights reserved.
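As a rough illustration of what character tagging means for word segmentation, the sketch below converts a segmented sentence into per-character tags and back. The abstract does not state which tag set the paper uses, so the four-tag B/M/E/S scheme here (begin, middle, end, single) is an assumption chosen only because it is a common baseline. The function names `words_to_tags` and `tags_to_words` are hypothetical, and the CRF model and feature templates are not shown.

```python
def words_to_tags(words):
    """Map a segmented sentence to (character, tag) pairs.

    Assumed tag set (not taken from the paper): B = word-initial,
    M = word-internal, E = word-final, S = single-character word.
    """
    pairs = []
    for word in words:
        if not word:
            continue  # ignore empty tokens
        if len(word) == 1:
            pairs.append((word, "S"))
            continue
        pairs.append((word[0], "B"))
        for ch in word[1:-1]:
            pairs.append((ch, "M"))
        pairs.append((word[-1], "E"))
    return pairs


def tags_to_words(pairs):
    """Rebuild words from (character, tag) pairs by cutting after E or S."""
    words, current = [], ""
    for ch, tag in pairs:
        current += ch
        if tag in ("E", "S"):
            words.append(current)
            current = ""
    if current:
        # Tolerate a sequence that ends without a closing E or S tag.
        words.append(current)
    return words


if __name__ == "__main__":
    sentence = ["我们", "在", "海得拉巴"]
    tagged = words_to_tags(sentence)
    print(tagged)
    print(tags_to_words(tagged))
```

Under this framing, a sequence labeller such as a CRF predicts one tag per character, and segmentation is recovered by the decoding step in `tags_to_words`. A richer tag set would change only the mapping in `words_to_tags`.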
Original language | English |
---|---|
Title of host publication | IJCNLP 2008 - Sixth SIGHAN Workshop on Chinese Language Processing - Proceedings of the Workshop |
Publisher | Association for Computational Linguistics |
Pages | 106-111 |
Publication status | Published - 11 Jan 2008 |
Event | 6th SIGHAN Workshop on Chinese Language Processing (SIGHAN 2008), Hyderabad, India. Duration: 11 Jan 2008 → 12 Jan 2008. https://aclanthology.org/volumes/I08-4/ |
Publication series
Name | SIGHAN - SIGHAN Workshop on Chinese Language Processing, co-located with International Joint Conference on Natural Language Processing, IJCNLP |
---|---|
Conference
Conference | 6th SIGHAN Workshop on Chinese Language Processing (SIGHAN 2008) |
---|---|
Abbreviated title | SIGHAN-6 |
Country/Territory | India |
City | Hyderabad |
Period | 11 Jan 2008 → 12 Jan 2008 |