TY - GEN
T1 - Experimental study on GMM-based speaker recognition
AU - Ye, Wenxing
AU - Wu, Dapeng
AU - Nucci, Antonio
PY - 2010
Y1 - 2010
AB - Speaker recognition plays a very important role in the field of biometric security. To improve recognition performance, many pattern recognition techniques have been explored in the literature. Among these techniques, the Gaussian Mixture Model (GMM) has proved to be an effective statistical model for speaker recognition and is used in most state-of-the-art speaker recognition systems. The GMM represents the 'voice print' of a speaker by modeling the spectral characteristics of the speaker's speech signals. In this paper, we implement a speaker recognition system that consists of preprocessing, Mel-Frequency Cepstrum Coefficients (MFCCs) based feature extraction, and GMM-based classification. We test our system with the TIDIGITS data set (325 speakers) and our own recordings of more than 200 speakers; our system achieves a 100% correct recognition rate. Moreover, we also test our system in the scenario where training samples are from one language but test samples are from a different language; our system again achieves a 100% correct recognition rate, which indicates that our system is language independent. © 2010 Copyright SPIE - The International Society for Optical Engineering.
KW - GMM
KW - MFCC
KW - Speaker recognition
UR - http://www.scopus.com/inward/record.url?scp=79953695976&partnerID=8YFLogxK
UR - https://www.scopus.com/record/pubmetrics.uri?eid=2-s2.0-79953695976&origin=recordpage
U2 - 10.1117/12.849201
DO - 10.1117/12.849201
M3 - RGC 32 - Refereed conference paper (with host publication)
SN - 9780819481726
VL - 7708
T3 - Proceedings of SPIE - The International Society for Optical Engineering
BT - Mobile Multimedia/Image Processing, Security, and Applications 2010
T2 - Mobile Multimedia/Image Processing, Security, and Applications 2010
Y2 - 5 April 2010 through 6 April 2010
ER -