SPEECH RECOGNITION
Presented to Dr. V. Kepuska
Presented by Lisa & Za
ECE 5526
How does Sphinx3 work?
• Sphinx3 uses HMMs with continuous probability density functions
• Flat initialization gives starting values to the model parameters:
  • Mixture weights: the weight given to each Gaussian in the Gaussian mixture corresponding to a state
  • Transition matrices: the matrices of state transition probabilities
  • Means: the means of all Gaussians
  • Variances: the variances of all Gaussians
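The four parameter sets listed above can be sketched in a few lines of Python. This is only an illustration, not Sphinx3 code: the function name `flat_init`, the left-to-right topology, and the choice of uniform weights plus the global mean and variance of the training features are assumptions commonly made for a flat start.

```python
def flat_init(n_states, n_mix, features):
    """Hypothetical flat start: every state gets identical parameters."""
    n, dim = len(features), len(features[0])
    # Global mean and variance of the training feature vectors (assumed choice).
    g_mean = [sum(f[d] for f in features) / n for d in range(dim)]
    g_var = [sum((f[d] - g_mean[d]) ** 2 for f in features) / n
             for d in range(dim)]
    # Mixture weights: uniform over the Gaussians of each state.
    weights = [[1.0 / n_mix] * n_mix for _ in range(n_states)]
    # Transition matrix: left-to-right, stay or advance with equal probability.
    # The extra last column stands for the exit from the final state.
    trans = [[0.0] * (n_states + 1) for _ in range(n_states)]
    for s in range(n_states):
        trans[s][s] = 0.5
        trans[s][s + 1] = 0.5
    # Means and variances: every Gaussian starts from the global statistics.
    means = [[list(g_mean) for _ in range(n_mix)] for _ in range(n_states)]
    variances = [[list(g_var) for _ in range(n_mix)] for _ in range(n_states)]
    return weights, trans, means, variances


weights, trans, means, variances = flat_init(
    n_states=3, n_mix=2, features=[[1.0, 2.0], [3.0, 6.0]])
```

Because all states start out identical, the re-estimation step is what makes the models differ from one another.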
How does Sphinx3 work?
• Forward-backward re-estimation algorithm (Baum-Welch algorithm)
  • Iterated during training until the likelihood converges
• Untied modeling
  • Trains models for all context-dependent phones (usually triphones) that are seen in the training corpus
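To show what "iterating until the likelihood converges" means, here is a minimal Baum-Welch sketch in plain Python. It is a simplification under stated assumptions: it uses a discrete-observation HMM and a single training sequence, whereas Sphinx3 uses continuous Gaussian-mixture densities, and all names (`forward_backward`, `baum_welch`, `tol`) are illustrative.

```python
import math


def forward_backward(obs, pi, A, B):
    """Scaled forward-backward pass; returns alpha, beta, scale, log-likelihood."""
    n, T = len(pi), len(obs)
    alpha, scale = [], []
    for t in range(T):
        row = [(pi[j] if t == 0 else
                sum(alpha[t - 1][i] * A[i][j] for i in range(n))) * B[j][obs[t]]
               for j in range(n)]
        c = sum(row)
        scale.append(c)
        alpha.append([a / c for a in row])
    beta = [[1.0] * n for _ in range(T)]
    for t in range(T - 2, -1, -1):
        for i in range(n):
            beta[t][i] = sum(A[i][j] * B[j][obs[t + 1]] * beta[t + 1][j]
                             for j in range(n)) / scale[t + 1]
    return alpha, beta, scale, sum(math.log(c) for c in scale)


def baum_welch(obs, pi, A, B, max_iter=50, tol=1e-6):
    """Re-estimate (pi, A, B) until the log-likelihood gain drops below tol."""
    n, T, n_sym = len(pi), len(obs), len(B[0])
    history = []
    for _ in range(max_iter):
        alpha, beta, scale, loglik = forward_backward(obs, pi, A, B)
        history.append(loglik)
        if len(history) > 1 and history[-1] - history[-2] < tol:
            break
        # State posteriors and transition posteriors.
        gamma = [[alpha[t][i] * beta[t][i] for i in range(n)] for t in range(T)]
        xi = [[[alpha[t][i] * A[i][j] * B[j][obs[t + 1]] * beta[t + 1][j]
                / scale[t + 1] for j in range(n)] for i in range(n)]
              for t in range(T - 1)]
        # Re-estimation formulas.
        pi = gamma[0][:]
        A = [[sum(xi[t][i][j] for t in range(T - 1)) /
              sum(gamma[t][i] for t in range(T - 1)) for j in range(n)]
             for i in range(n)]
        B = [[sum(gamma[t][j] for t in range(T) if obs[t] == k) /
              sum(gamma[t][j] for t in range(T)) for k in range(n_sym)]
             for j in range(n)]
    return pi, A, B, history


obs = [0, 0, 1, 0, 1, 1, 1, 0, 0, 1, 1, 1]
pi, A, B, history = baum_welch(
    obs, [0.6, 0.4], [[0.7, 0.3], [0.4, 0.6]], [[0.8, 0.2], [0.3, 0.7]])
```

The `history` list holds the log-likelihood before each update; it should never decrease from one iteration to the next, which is the convergence behavior the slide refers to.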
How does Sphinx3 work?
• Building decision trees
  • Used to decide which of the HMM states of all the triphones (seen and unseen) are similar to each other
• Pruning the decision trees
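One node of such a decision tree can be sketched as follows. The sketch assumes a common approach, not necessarily the exact Sphinx3 procedure: each triphone state is summarized by a one-dimensional Gaussian (count, mean, variance), and the tree picks the yes/no question about the left context that gives the largest log-likelihood gain. The `left-base+right` triphone notation, the question names, and the numbers are all made up for illustration.

```python
import math


def pooled(stats):
    """Merge (count, mean, variance) triples into one Gaussian."""
    n = sum(c for c, _, _ in stats)
    mean = sum(c * m for c, m, _ in stats) / n
    second_moment = sum(c * (v + m * m) for c, m, v in stats) / n
    return n, mean, second_moment - mean * mean


def cluster_loglik(stats):
    """Log-likelihood of the data when modeled by the single pooled Gaussian."""
    n, _, var = pooled(stats)
    return -0.5 * n * (math.log(2 * math.pi * var) + 1.0)


def best_question(states, questions):
    """Return (question name, gain) for the split with the highest gain."""
    parent = cluster_loglik(list(states.values()))
    best = None
    for name, phones in questions.items():
        yes = [s for tri, s in states.items() if tri.split("-")[0] in phones]
        no = [s for tri, s in states.items() if tri.split("-")[0] not in phones]
        if not yes or not no:
            continue  # the question does not split this node
        gain = cluster_loglik(yes) + cluster_loglik(no) - parent
        if best is None or gain > best[1]:
            best = (name, gain)
    return best


# Hypothetical statistics for one state of four triphones of the phone "ae".
states = {
    "m-ae+t": (10, 1.0, 0.5),
    "n-ae+t": (12, 1.1, 0.5),
    "s-ae+t": (8, 3.0, 0.5),
    "z-ae+t": (9, 3.2, 0.5),
}
questions = {"L_nasal": {"m", "n"}, "L_voiced": {"m", "n", "z"}}
name, gain = best_question(states, questions)
```

States that end up in the same leaf are treated as similar and can share parameters. An unseen triphone can still be placed in a leaf, because answering the questions only requires knowing its context phones.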
Our project: Spelling Bees
• Use Sphinx3 to train on the recorded data
• Evaluate the trained models on the test data
Result:
• We used 224 training recordings and 73 test recordings
• The dictionary has 46 words, and 33 phones are used
• 32.7% word error rate and 49.3% sentence error rate
The result:

id: (fash-cen2-fash-b)
Scores: (#C #S #D #I) 3 0 0 0
REF: a m y
HYP: a m y

Speaker sentences 1: moe #utts: 8

id: (moe-m_oses1)
Scores: (#C #S #D #I) 4 0 1 1
REF: * m o s e S
HYP: E m o s e *
Eval: I D

id: (moe-m_oses2)
Scores: (#C #S #D #I) 5 0 0 0
REF: m o s e s
HYP: m o s e s
Eval:
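The score columns count correct (#C), substituted (#S), deleted (#D), and inserted (#I) tokens after aligning the hypothesis with the reference. The sketch below shows one way such counts can be obtained with a standard edit-distance alignment. It is not the scoring tool used for the project, the function names are illustrative, and tokens are compared in lowercase.

```python
def align_counts(ref, hyp):
    """Return (correct, substitutions, deletions, insertions) of a
    minimum-edit-distance alignment between two token lists."""
    # Each cell holds (cost, C, S, D, I) for the best alignment so far.
    prev = [(j, 0, 0, 0, j) for j in range(len(hyp) + 1)]
    for i, r in enumerate(ref, start=1):
        cur = [(i, 0, 0, i, 0)]
        for j, h in enumerate(hyp, start=1):
            c, C, S, D, I = prev[j - 1]
            if r == h:
                diag = (c, C + 1, S, D, I)      # match
            else:
                diag = (c + 1, C, S + 1, D, I)  # substitution
            c, C, S, D, I = prev[j]
            dele = (c + 1, C, S, D + 1, I)      # reference token missing
            c, C, S, D, I = cur[j - 1]
            ins = (c + 1, C, S, D, I + 1)       # extra hypothesis token
            cur.append(min(diag, dele, ins, key=lambda cell: cell[0]))
        prev = cur
    return prev[-1][1:]


def error_rate(ref, hyp):
    """Word error rate: (S + D + I) divided by the reference length."""
    C, S, D, I = align_counts(ref, hyp)
    return (S + D + I) / (C + S + D)


# The moe-m_oses1 utterance from the result above.
counts = align_counts("m o s e s".split(), "e m o s e".split())
rate = error_rate("m o s e s".split(), "e m o s e".split())
```

For this utterance `counts` is `(4, 0, 1, 1)`, matching the Scores line, and `rate` is 0.4 for this single utterance. The overall word error rate is computed the same way, with the counts summed over all test utterances.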
Reference:
• http://www.speech.cs.cmu.edu/sphinxman/fr4.html
• Lecture notes from Speech recognition class
• http://www.ele.uri.edu/~hansenj/projects/ele585/ (makeraw.m, record.m)