Speech bandwidth enhancement using state space speech dynamics

Sheng Yao, Cheung-Fat Chan

Research output: Chapters, Conference Papers, Creative and Literary Works; RGC 32 - Refereed conference paper (with host publication); peer-review

5 Citations (Scopus)

Abstract

Extending narrowband speech (0-4 kHz) to wideband speech (0-8 kHz) has applications in telephone systems and in speech recognition systems where wideband training speech data may not be available. Several methods have been proposed to retrieve the missing high-band information (4-8 kHz) from narrowband speech. Memoryless systems are likely to produce large hissing artifacts, since the mutual information between the low-band (0-4 kHz) and high-band (4-8 kHz) spectra is actually quite low. Generally speaking, bandwidth extension cannot recover the original high-band information, but good approximations with less over-estimation of the high-band energy, which is usually heard as a hissing artifact, can be obtained by considering the neighboring speech frames. In this paper, we propose a new bandwidth extension system with memory that uses a state-space model to capture the long-term speech dynamics. The model parameters can be trained in the maximum likelihood (ML) sense, and the enhancement is obtained via wideband state vector estimation and Kalman filtering. The performance in terms of spectral distortion is shown to be much better than that of memoryless systems and comparable with that of an earlier Continuous Density Hidden Markov Model (CDHMM) system with memory. The new state-space method is inherently sequential and has the advantages of lower processing delay and robustness against block detection errors. © 2006 IEEE.
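
The abstract does not give the model equations, so the following is only a minimal sketch of the general idea, assuming a generic linear-Gaussian state-space model in which a two-element state holds one low-band and one high-band feature and only the low-band feature is observed. The function name `kalman_step` and all parameter values (`A`, `Q`, `H`, `R`, the observation sequence) are illustrative assumptions, not taken from the paper, where the parameters are trained by maximum likelihood.

```python
def kalman_step(x, P, y, A, Q, H, R):
    """One predict/update step of a Kalman filter with a 2-element state
    and a scalar observation. All model parameters are hypothetical."""
    n = 2
    # Predict: x_pred = A x, P_pred = A P A^T + Q
    x_pred = [sum(A[i][k] * x[k] for k in range(n)) for i in range(n)]
    AP = [[sum(A[i][k] * P[k][j] for k in range(n)) for j in range(n)]
          for i in range(n)]
    P_pred = [[sum(AP[i][k] * A[j][k] for k in range(n)) + Q[i][j]
               for j in range(n)] for i in range(n)]
    # Innovation: difference between the observation and its prediction
    innov = y - sum(H[k] * x_pred[k] for k in range(n))
    # Kalman gain: K = P_pred H^T / (H P_pred H^T + R)
    PHt = [sum(P_pred[i][k] * H[k] for k in range(n)) for i in range(n)]
    S = sum(H[k] * PHt[k] for k in range(n)) + R
    K = [PHt[i] / S for i in range(n)]
    # Update: the unobserved (high-band) element is corrected through
    # its correlation with the observed (low-band) element.
    x_new = [x_pred[i] + K[i] * innov for i in range(n)]
    HP = [sum(H[k] * P_pred[k][j] for k in range(n)) for j in range(n)]
    P_new = [[P_pred[i][j] - K[i] * HP[j] for j in range(n)]
             for i in range(n)]
    return x_new, P_new


# Illustrative (assumed) parameters: state = [low-band, high-band] feature.
A = [[0.8, 0.3], [-0.2, 0.9]]     # frame-to-frame dynamics
Q = [[0.01, 0.0], [0.0, 0.01]]    # process noise covariance
H = [1.0, 0.0]                    # only the low-band feature is observed
R = 0.05                          # observation noise variance

x, P = [0.0, 0.0], [[1.0, 0.0], [0.0, 1.0]]
observations = [1.0, 0.95, 0.7, 0.4, 0.1]  # made-up narrowband features
high_band_estimates = []
for y in observations:
    x, P = kalman_step(x, P, y, A, Q, H, R)
    high_band_estimates.append(x[1])
```

Because each estimate depends only on the current observation and the previous state, the loop processes frames one at a time, which is consistent with the sequential, low-delay behavior the abstract attributes to the state-space method.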
Original language: English
Title of host publication: 2006 IEEE International Conference on Acoustics, Speech and Signal Processing
Subtitle of host publication: Proceedings
Publisher: IEEE
Pages: I 489-I 492
Volume: 1
ISBN (Print): 9781424404698
DOIs
Publication status: Published - 2006
Event: 2006 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2006) - Toulouse, France
Duration: 14 May 2006 to 19 May 2006

Publication series

Name
ISSN (Print): 1520-6149
ISSN (Electronic): 2379-190X

Conference

Conference: 2006 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2006)
Place: France
City: Toulouse
Period: 14/05/06 to 19/05/06
