Enhanced language modelling with phonologically constrained morphological analysis

A. C. Fang, M. Huckvale

Research output: Journal Publications and Reviews, RGC 22 - Publication in policy or professional journal

Abstract

Phonologically constrained morphological analysis (PCMA) is the decomposition of words into their component morphemes conditioned by both orthography and pronunciation. This article describes PCMA and its application in large-vocabulary continuous speech recognition to enhance recognition performance in some tasks. Our experiments, based on the British National Corpus and the LOB Corpus for training data and WSJCAM0 for test data, show clearly that PCMA leads to smaller lexicon size, smaller language models, superior word lattices and a decrease in word error rates. PCMA seems to show most benefit in open vocabulary tasks, where the productivity of a morph unit lexicon makes a substantial reduction in out-of-vocabulary rates.
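The open-vocabulary effect described in the abstract can be illustrated with a toy sketch. The code below is not the authors' PCMA procedure: it segments on spelling only (the phonological constraint is not modelled), and the lexicons, test tokens, and the helper names `segment` and `oov_rate` are hypothetical. It only shows why a lexicon of morph units can cover words that a whole-word lexicon of similar size misses.

```python
def segment(word, morphs):
    """Split `word` into units from `morphs`, trying longer prefixes first.

    Returns a list of morphs, or None if no full segmentation exists.
    Orthography only: a toy stand-in, not the PCMA algorithm itself.
    """
    if not word:
        return []
    for end in range(len(word), 0, -1):
        piece = word[:end]
        if piece in morphs:
            rest = segment(word[end:], morphs)
            if rest is not None:
                return [piece] + rest
    return None


def oov_rate(tokens, is_covered):
    """Fraction of tokens that the lexicon cannot account for."""
    missing = sum(1 for token in tokens if not is_covered(token))
    return missing / len(tokens)


# Hypothetical lexicons and test data, chosen only for illustration.
word_lexicon = {"walk", "walked", "talk", "jump"}
morph_lexicon = {"walk", "talk", "jump", "ed", "ing", "s"}
test_tokens = ["walked", "talking", "jumps", "jumped", "run"]

word_oov = oov_rate(test_tokens, lambda t: t in word_lexicon)
morph_oov = oov_rate(test_tokens, lambda t: segment(t, morph_lexicon) is not None)

print(f"word-unit OOV rate:  {word_oov:.2f}")
print(f"morph-unit OOV rate: {morph_oov:.2f}")
```

In this toy setting the whole-word lexicon covers only "walked", while the morph lexicon covers every token except "run", because unseen inflected forms can be composed from known stems and affixes.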
Original language: English
Pages (from-to): 1711-1714
Journal: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Volume: 3
Publication status: Published - 2000
Externally published: Yes
Event: 2000 IEEE International Conference on Acoustics, Speech, and Signal Processing - Istanbul, Türkiye
Duration: 5 Jun 2000 to 9 Jun 2000
