
Transductive HMM based Chinese text chunking

Heng Li, Jonathan J. Webster, Chunyu Kit, Tianshun Yao

    Research output: Chapters, Conference Papers, Creative and Literary Works > RGC 32 - Refereed conference paper (with host publication) > peer-review

    Abstract

    In this paper, we present a novel methodology to enhance Chinese text chunking with the aid of transductive Hidden Markov Models (transductive HMMs, henceforth). We consider chunking as a special tagging problem and attempt to utilize, via a number of transformation functions, as much relevant contextual information as possible for model training. These functions enable the models to make greater use of contextual information while sparing us costly changes to the original training and tagging process. Each function results in an individual model with its own pros and cons. Through a number of experiments, we succeed in integrating the best two models into a significantly better one. We carry out the chunking experiments on the HIT Chinese Treebank corpus. Experimental results show that this is an effective approach, achieving an F score of 82.38%.
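    The "chunking as tagging" view mentioned in the abstract can be illustrated with a minimal sketch. Everything below is an assumption for illustration only and is not taken from the paper: the IOB tag scheme, the POS labels, and the helper names `spans_to_iob`, `pos_with_next`, and `f_score` are hypothetical. The transformation shown (pairing each token's POS tag with the next token's) is just one plausible way a function could fold context into the symbols a standard HMM tagger sees; the paper's actual transformation functions are not described in this record. The F score is computed with the standard balanced formula.

```python
from typing import List, Tuple

Token = Tuple[str, str]          # (word, POS tag); the POS labels are illustrative
Chunk = Tuple[int, int, str]     # (start index, end index exclusive, chunk label)


def spans_to_iob(n_tokens: int, chunks: List[Chunk]) -> List[str]:
    """Recast chunk spans as one tag per token (IOB scheme, assumed here)."""
    tags = ["O"] * n_tokens
    for start, end, label in chunks:
        tags[start] = f"B-{label}"
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"
    return tags


def pos_with_next(tokens: List[Token]) -> List[str]:
    """Hypothetical transformation function: each observation symbol is the
    token's own POS tag joined with the following token's POS tag, so an
    unmodified tagger sees one token of right context."""
    pos = [p for _, p in tokens] + ["</s>"]   # sentence-end padding symbol
    return [f"{pos[i]}|{pos[i + 1]}" for i in range(len(tokens))]


def f_score(precision: float, recall: float) -> float:
    """Balanced F score: harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


tokens = [("我", "r"), ("喜欢", "v"), ("自然", "n"), ("语言", "n")]
chunks = [(0, 1, "NP"), (2, 4, "NP")]

observations = pos_with_next(tokens)          # input symbols for the tagger
targets = spans_to_iob(len(tokens), chunks)   # tags the tagger should predict
```

    Because the transformation only rewrites the input symbols, the training and tagging code of an ordinary HMM tagger would not need to change, which matches the motivation stated in the abstract.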
    Original language: English
    Title of host publication: NLP-KE 2003 - 2003 International Conference on Natural Language Processing and Knowledge Engineering, Proceedings
    Publisher: IEEE
    Pages: 257-262
    ISBN (Print): 0780379020, 9780780379022
    Publication status: Published - 2003
    Event: International Conference on Natural Language Processing and Knowledge Engineering, NLP-KE 2003 - Beijing, China
    Duration: 26 Oct 2003 to 29 Oct 2003

    Conference

    Conference: International Conference on Natural Language Processing and Knowledge Engineering, NLP-KE 2003
    Place: China
    City: Beijing
    Period: 26/10/03 to 29/10/03

    Research Keywords

    • Text chunking
    • Transductive HMM
