Abstract
It is well known that visual information such as lip shape and lip movement can indicate what a speaker is saying. In this paper, we present an automatic lipreading system that uses visual information alone to recognize isolated English digits, from 0 to 9. A parameter set of a 14-point ASM lip model is used to describe the outer lip contour. Inner-mouth information, such as the teeth region and the degree of mouth opening, is also extracted. After appropriate normalization, feature vectors containing the normalized outer-lip features, inner-mouth features, and their first-order derivatives are obtained for training the HMM word models. Experiments have been carried out to compare the recognition performance of our visual feature set against traditional visual feature representations. With our representation, we achieve an accuracy of 93% for speaker-dependent recognition and 84% for speaker-independent recognition. A real-time automatic lipreading system has been successfully implemented on a 1.9-GHz PC.
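The feature construction described above (normalized outer-lip and inner-mouth features concatenated with their first-order time derivatives) can be sketched as follows. This is a minimal illustration, not the authors' implementation; the feature dimensions and the use of `np.gradient` for the derivatives are assumptions.

```python
import numpy as np

def build_feature_vectors(outer_lip, inner_mouth):
    """Concatenate per-frame visual features with their first-order derivatives.

    outer_lip:   (T, D1) array of normalized outer-lip (ASM) parameters
    inner_mouth: (T, D2) array of normalized inner-mouth features
                 (e.g. teeth-region and mouth-opening measures)
    Returns a (T, 2*(D1+D2)) array usable as HMM observation vectors.
    """
    static = np.hstack([outer_lip, inner_mouth])  # per-frame static features
    delta = np.gradient(static, axis=0)           # first-order derivative over time
    return np.hstack([static, delta])

# Toy example: 5 frames, 6 hypothetical ASM shape parameters, 2 inner-mouth features
feats = build_feature_vectors(np.random.rand(5, 6), np.random.rand(5, 2))
print(feats.shape)  # (5, 16)
```

Each row of the result would serve as one observation in the HMM word models mentioned in the abstract.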
| Original language | English |
|---|---|
| Journal | Proceedings - IEEE International Symposium on Circuits and Systems |
| Volume | 2 |
| DOIs | |
| Publication status | Published - 2004 |
| Event | 2004 IEEE International Symposium on Circuits and Systems - Proceedings - Vancouver, BC, Canada Duration: 23 May 2004 → 26 May 2004 |