Robust block-based clustering and identification of autoregressive speech parameters based on dynamic state tracking

Ruofei Chen, Cheung-Fat Chan

Research output: Chapters, Conference Papers, Creative and Literary Works > RGC 32 - Refereed conference paper (with host publication) > peer-review

Abstract

In this paper, we propose two block-based clustering and identification algorithms that contribute to robust estimation of autoregressive (AR) speech parameters in noisy environments. Motivated by the observation that the evolution pattern of speech dynamics can be an observable feature that is retained in a series of noisy observations, we incorporate a dynamic state tracking scheme based on the Kalman filter to exploit this additional trajectory information in block-based AR codebook design. The proposed algorithm is designed so that AR blocks with similar clean line spectrum frequency trajectories and similar noisy-to-clean mappings are clustered offline and identified online. It is compared with conventional vector-quantization-based approaches that directly minimize a distortion between AR parameters. Objective assessments based on mean square error and log-spectral distance demonstrate that the proposed algorithm achieves significant improvement over conventional methods in various conditions. © 2012 IEEE.
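
For readers unfamiliar with the two objective measures named in the abstract, the following is a minimal sketch of one common way to compute them for AR parameter sets. The abstract does not state the exact definitions used in the paper, so the formulas, the function names (`ar_power_spectrum_db`, `log_spectral_distance`, `mean_square_error`), the unit gain, and the 256-point frequency grid are all assumptions made for illustration.

```python
import cmath
import math


def ar_power_spectrum_db(a, gain=1.0, n_freqs=256):
    """Power spectrum in dB of the all-pole model gain^2 / |A(e^{jw})|^2.

    `a` holds the AR coefficients a_1..a_p of A(z) = 1 + sum_k a_k z^{-k}.
    The sign convention and the frequency grid are assumptions.
    """
    spectrum = []
    for i in range(n_freqs):
        w = math.pi * i / n_freqs  # uniform grid over [0, pi)
        A = 1.0 + sum(a_k * cmath.exp(-1j * w * (k + 1))
                      for k, a_k in enumerate(a))
        spectrum.append(10.0 * math.log10(gain ** 2 / abs(A) ** 2))
    return spectrum


def log_spectral_distance(a_ref, a_est, n_freqs=256):
    """Root-mean-square difference (in dB) between two AR log spectra."""
    ref = ar_power_spectrum_db(a_ref, n_freqs=n_freqs)
    est = ar_power_spectrum_db(a_est, n_freqs=n_freqs)
    mean_sq = sum((r - e) ** 2 for r, e in zip(ref, est)) / n_freqs
    return math.sqrt(mean_sq)


def mean_square_error(x, y):
    """Mean of squared element-wise differences between two parameter vectors."""
    return sum((xi - yi) ** 2 for xi, yi in zip(x, y)) / len(x)


# Hypothetical first-order AR models: a clean reference and a noisy estimate.
a_clean = [-0.9]
a_noisy = [-0.8]
print(log_spectral_distance(a_clean, a_noisy))
print(mean_square_error(a_clean, a_noisy))
```

Both measures are zero when the estimated parameters equal the reference and grow as the estimate departs from it; the log-spectral distance compares the spectral envelopes implied by the AR models rather than the raw coefficients.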
Original language: English
Title of host publication: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Pages: 4469-4472
Publication status: Published - 2012
Event: 2012 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2012 - Kyoto, Japan
Duration: 25 Mar 2012 to 30 Mar 2012

Publication series

ISSN (Print): 1520-6149

Conference

Conference: 2012 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2012
Place: Japan
City: Kyoto
Period: 25/03/12 to 30/03/12

Research Keywords

  • autoregressive model
  • clustering
  • Kalman filter
  • vector quantization
