On an efficient decomposition of LPC excitation for producing natural sounding speech

Research output: Chapters, Conference Papers, Creative and Literary Works > RGC 32 - Refereed conference paper (with host publication) > peer-review

Detail(s)

Original language: English
Title of host publication: IEEE TENCON'90: Conference proceedings
Publisher: Publ by IEEE
Pages: 329-333
ISBN (print): 879425563
Publication status: Published - 1991

Conference

Title: 1990 IEEE Region 10 Conference on Computer and Communication Systems - IEEE TENCON '90
City: Hong Kong
Period: 24 - 27 September 1990

Abstract

The authors present a new method of modeling the excitation to the linear predictive coding (LPC) synthesis filter at low and medium bit rates. For speech segments with regular patterns, the excitation is composed of two pulse sequences. The first sequence is generated in a way similar to the classical physical model: a glottal filter with thinned coefficients is driven by a set of pitch pulses. Both the glottal function and the pitch pulses are determined using the analysis-by-synthesis technique with the mean square error criterion. The second, auxiliary sequence consists of a few pulses that supplement the first sequence to further reduce the mean square error. For unvoiced speech segments, multipulse excitation alone drives the synthesis filter. In an analysis of real speech, the model achieves a signal-to-noise ratio (SNR) gain of 2-3 dB for voiced segments over multipulse LPC at 0.8-2.5 pulses/ms.
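
For orientation, the multipulse LPC baseline that the abstract compares against can be sketched as a greedy analysis-by-synthesis search under the mean square error criterion. This is a minimal illustration only, not the authors' two-sequence decomposition. The function names, the one-pole example filter, the 8 kHz sampling rate, and the 5 ms frame length are all assumptions made for the example, and perceptual weighting is omitted.

```python
import numpy as np

def synthesis_impulse_response(a, length):
    """Impulse response of the all-pole filter 1 / (1 - sum_k a[k-1] z^-k)."""
    h = np.zeros(length)
    for n in range(length):
        acc = 1.0 if n == 0 else 0.0
        for k, ak in enumerate(a, start=1):
            if n - k >= 0:
                acc += ak * h[n - k]
        h[n] = acc
    return h

def multipulse_search(target, h, num_pulses):
    """Greedily place pulses so the filtered excitation approximates target."""
    n = len(target)
    # Column j holds the filter response to a unit pulse at sample j,
    # truncated to the frame (h must have at least n samples).
    H = np.zeros((n, n))
    for j in range(n):
        H[j:, j] = h[: n - j]
    energy = np.sum(H * H, axis=0)
    excitation = np.zeros(n)
    residual = np.asarray(target, dtype=float).copy()
    for _ in range(num_pulses):
        corr = H.T @ residual
        # Squared-error reduction achieved by the best amplitude at each position.
        reduction = corr**2 / energy
        j = int(np.argmax(reduction))
        amp = corr[j] / energy[j]
        excitation[j] += amp
        residual -= amp * H[:, j]
    return excitation, residual

def snr_db(target, residual):
    """Signal-to-noise ratio of the reconstruction, in dB."""
    return 10.0 * np.log10(np.sum(target**2) / np.sum(residual**2))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    frame = rng.standard_normal(40)               # 5 ms at an assumed 8 kHz
    h = synthesis_impulse_response([0.9], 40)     # hypothetical one-pole filter
    for pulses in (4, 8, 12):                     # roughly 0.8 to 2.4 pulses/ms
        _, res = multipulse_search(frame, h, pulses)
        print(pulses, "pulses:", round(snr_db(frame, res), 2), "dB")
```

Each iteration picks the position whose optimally scaled pulse removes the most squared error, so the residual energy never increases as pulses are added. The SNR figure computed this way is the quantity in which the abstract reports its 2-3 dB gain.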

Citation Format(s)

On an efficient decomposition of LPC excitation for producing natural sounding speech. / Leung, S. H.; Peng, L. F.; Wong, O. Y. et al.
IEEE TENCON'90: Conference proceedings. Publ by IEEE, 1991. p. 329-333.
