PURE Research Symposium, Spring 2009
VOICE RECOGNITION USING AN HMM BASED DESIGN
Richard Muryanto and Nicholas Corso
Mentored by: Sun Yu
Introduction
• In engineering applications, voice recognition systems have many diverse uses.
• Many schemes exist for implementing voice recognition systems (DTW, HMM, ...).
• In this educational project, we used Hidden Markov Models to implement a real-time voice recognition system.
System Overview
Figure: http://labrosa.ee.columbia.edu/doc/HTKBook21/img15.gif
Hidden Markov Models
• Hidden Markov Models (HMMs) are a way of modeling, in terms of probabilities, systems whose states cannot be directly observed.
• An HMM can be characterized by a few key parameters: the state transition probabilities, the observation probabilities for each state, and the initial state distribution.
Figure: http://www.info.ucl.ac.be/Research/Areas/Images/RT-Pict-HMM.png
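As a rough illustration of the parameters that characterize an HMM, the sketch below stores them as a small Python structure and checks that each probability distribution is valid. The class name `DiscreteHMM`, the two-state toy values, and the use of discrete observation symbols are assumptions made for illustration; a speech system would more likely model continuous MFCC vectors.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DiscreteHMM:
    """Hypothetical container for the parameters (A, B, pi) of a discrete HMM."""
    A: List[List[float]]  # A[i][j] = P(state j at time t+1 | state i at time t)
    B: List[List[float]]  # B[i][k] = P(observing symbol k | state i)
    pi: List[float]       # pi[i]   = P(state i at time 0)

    def validate(self, tol: float = 1e-9) -> None:
        """Raise ValueError if the shapes or probability distributions are invalid."""
        n = len(self.pi)
        if len(self.A) != n or len(self.B) != n:
            raise ValueError("A and B need one row per state")
        if any(len(row) != n for row in self.A):
            raise ValueError("A must be square")
        for dist in [self.pi, *self.A, *self.B]:
            if any(p < 0 for p in dist) or abs(sum(dist) - 1.0) > tol:
                raise ValueError("each distribution must be non-negative and sum to 1")


# Toy two-state, two-symbol model (values are made up for illustration).
toy = DiscreteHMM(
    A=[[0.7, 0.3], [0.4, 0.6]],
    B=[[0.9, 0.1], [0.2, 0.8]],
    pi=[0.6, 0.4],
)
toy.validate()
```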
Hidden Markov Models: Cont.
• Classically, HMMs involve three main problems, each with a standard algorithm: evaluation (forward algorithm), decoding (Viterbi algorithm), and learning (Baum-Welch algorithm).
• For an HMM-based voice recognition system, the Baum-Welch algorithm is pivotal to training the system.
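To make the learning step concrete, here is a minimal sketch of one Baum-Welch re-estimation step for a discrete HMM and a single observation sequence. It is not the project's implementation: the function names, the toy parameters, and the unscaled forward and backward passes (adequate only for short sequences) are all assumptions.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def forward(A: Matrix, B: Matrix, pi: List[float], obs: List[int]) -> Matrix:
    """alpha[t][i] = P(obs[0..t], state i at time t)."""
    n = len(pi)
    alpha = [[pi[i] * B[i][obs[0]] for i in range(n)]]
    for o in obs[1:]:
        prev = alpha[-1]
        alpha.append([sum(prev[i] * A[i][j] for i in range(n)) * B[j][o]
                      for j in range(n)])
    return alpha


def backward(A: Matrix, B: Matrix, obs: List[int]) -> Matrix:
    """beta[t][i] = P(obs[t+1..T-1] | state i at time t)."""
    n = len(A)
    beta = [[1.0] * n]
    for o in reversed(obs[1:]):
        nxt = beta[0]
        beta.insert(0, [sum(A[i][j] * B[j][o] * nxt[j] for j in range(n))
                        for i in range(n)])
    return beta


def baum_welch_step(A: Matrix, B: Matrix, pi: List[float], obs: List[int]
                    ) -> Tuple[Matrix, Matrix, List[float], float]:
    """Return re-estimated (A, B, pi) and the likelihood under the input model."""
    n, m, T = len(pi), len(B[0]), len(obs)
    alpha, beta = forward(A, B, pi, obs), backward(A, B, obs)
    likelihood = sum(alpha[-1])
    # gamma[t][i]: probability of being in state i at time t, given obs.
    gamma = [[alpha[t][i] * beta[t][i] / likelihood for i in range(n)]
             for t in range(T)]
    # xi[t][i][j]: probability of the transition i -> j at time t, given obs.
    xi = [[[alpha[t][i] * A[i][j] * B[j][obs[t + 1]] * beta[t + 1][j] / likelihood
            for j in range(n)] for i in range(n)] for t in range(T - 1)]
    new_pi = gamma[0][:]
    new_A = [[sum(xi[t][i][j] for t in range(T - 1)) /
              sum(gamma[t][i] for t in range(T - 1))
              for j in range(n)] for i in range(n)]
    new_B = [[sum(gamma[t][i] for t in range(T) if obs[t] == k) /
              sum(gamma[t][i] for t in range(T))
              for k in range(m)] for i in range(n)]
    return new_A, new_B, new_pi, likelihood


# Toy model and observation sequence (made-up values).
A = [[0.7, 0.3], [0.4, 0.6]]
B = [[0.9, 0.1], [0.2, 0.8]]
pi = [0.6, 0.4]
obs = [0, 0, 1, 0, 1, 1, 1, 0]

history = []  # likelihood of obs before each update
for _ in range(5):
    A, B, pi, likelihood = baum_welch_step(A, B, pi, obs)
    history.append(likelihood)
```

Each step should leave the likelihood of the training sequence no lower than before, which is the property that makes the algorithm usable for training.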
System Implementation
Training: Pre-recorded Data → VAD → MFCC Feature Extraction → Training HMM (Baum-Welch)
Recognition: Recorded Data → VAD → MFCC → Compute Likelihood → ML Decision → Display Output
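The recognition path (compute likelihood, then ML decision) can be sketched as follows: score the observed sequence against one trained HMM per word and pick the word whose model gives the highest likelihood. The word labels, the toy model values, and the discrete symbols standing in for quantized MFCC features are assumptions for illustration only.

```python
import math
from typing import Dict, List, Tuple

# (A, B, pi) for a discrete HMM, as lists of probabilities.
Model = Tuple[List[List[float]], List[List[float]], List[float]]


def log_likelihood(model: Model, obs: List[int]) -> float:
    """log P(obs | model) via the forward algorithm with per-step scaling."""
    A, B, pi = model
    n = len(pi)
    alpha = [pi[i] * B[i][obs[0]] for i in range(n)]
    log_p = 0.0
    for t in range(len(obs)):
        if t > 0:
            alpha = [sum(alpha[i] * A[i][j] for i in range(n)) * B[j][obs[t]]
                     for j in range(n)]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # the model cannot produce this sequence
        log_p += math.log(scale)
        alpha = [a / scale for a in alpha]  # rescale to avoid underflow
    return log_p


def recognize(models: Dict[str, Model], obs: List[int]) -> str:
    """Maximum-likelihood decision: the word whose model best explains obs."""
    return max(models, key=lambda word: log_likelihood(models[word], obs))


# Hypothetical word models: "yes" mostly emits symbol 0, "no" mostly symbol 1.
transitions = [[0.7, 0.3], [0.4, 0.6]]
models = {
    "yes": (transitions, [[0.9, 0.1], [0.8, 0.2]], [0.6, 0.4]),
    "no": (transitions, [[0.1, 0.9], [0.2, 0.8]], [0.6, 0.4]),
}

print(recognize(models, [0, 0, 0, 1]))
print(recognize(models, [1, 1, 0, 1]))
```

Working in the log domain with scaling keeps the scores stable for long utterances, where raw likelihoods would underflow.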
VAD and MFCC
• Voice Activity Detection (VAD) determines which parts of a voice signal are actual speech data and which are silence.
• The VAD algorithm used here relies on the short-time energy and the zero-crossing rate to decide whether there is voice activity.
• Mel-Frequency Cepstral Coefficients (MFCCs) were used to extract characteristic features from the speech signal.
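The two VAD features named above can be computed per frame as in the sketch below. The frame length, the thresholds, and especially the decision rule (high energy together with a low zero-crossing rate) are assumptions chosen for illustration; the slides do not state how the project combined the two measures.

```python
import math
from typing import List


def frame_signal(signal: List[float], frame_len: int, hop: int) -> List[List[float]]:
    """Split the signal into frames of frame_len samples, hop samples apart."""
    return [signal[s:s + frame_len]
            for s in range(0, len(signal) - frame_len + 1, hop)]


def short_time_energy(frame: List[float]) -> float:
    """Mean squared amplitude of the frame."""
    return sum(x * x for x in frame) / len(frame)


def zero_crossing_rate(frame: List[float]) -> float:
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def detect_voice(signal: List[float], frame_len: int = 80, hop: int = 80,
                 energy_thresh: float = 0.01, zcr_thresh: float = 0.25) -> List[bool]:
    """Assumed rule: a frame is voice if its energy is high and its ZCR is low."""
    return [short_time_energy(f) > energy_thresh and zero_crossing_rate(f) < zcr_thresh
            for f in frame_signal(signal, frame_len, hop)]


# Synthetic test signal at an assumed 8 kHz sampling rate:
# silence, then a 200 Hz tone, then a rapidly alternating noise-like burst.
silence = [0.0] * 80
tone = [0.5 * math.sin(2 * math.pi * 200 * n / 8000) for n in range(80)]
buzz = [0.5 if n % 2 == 0 else -0.5 for n in range(80)]

flags = detect_voice(silence + tone + buzz)
print(flags)  # [False, True, False]
```

Silence fails the energy test, while the alternating burst passes it but is rejected for its high zero-crossing rate, so only the tone frame is flagged.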
Observed Data
Figure: Ability of the System to Recognize Training Data
Possible Extensions
• With more time, the effects of the environment on the recognition rate could be investigated.
• With further investigation, the effects of the parameters in the Baum-Welch algorithm could be explored.
• A larger word set could be implemented.