
EEL 6586: AUTOMATIC SPEECH PROCESSING Hidden Markov Model Lecture


Presentation Transcript


  1. EEL 6586: AUTOMATIC SPEECH PROCESSING Hidden Markov Model Lecture Mark D. Skowronski Computational Neuro-Engineering Lab University of Florida March 31, 2003

  2. Questions to be Answered • What is a Hidden Markov Model? • How do HMMs work? • How are HMMs applied to automatic speech recognition? • What are the strengths/weaknesses of HMMs?

  3. What is an HMM? A Hidden Markov Model is a piecewise stationary model of a nonstationary signal. • Model parameters: • states -- each represents a region of the signal modeled as stationary • interstate connections -- define the model architecture • pdf estimates (one per state) • Discrete -- codebooks • Continuous -- means, covariance matrices
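
A minimal sketch (in Python/NumPy, not from the slides) of the parameter set that defines a discrete-observation HMM: initial state probabilities, interstate connections, and one discrete pdf per state. The names pi, A, B follow Rabiner's tutorial; the sizes and values are illustrative only.

import numpy as np

N, M = 3, 8                               # 3 states, 8 codebook entries (illustrative)

pi = np.array([1.0, 0.0, 0.0])            # always start in state 0 (left-to-right model)
A = np.array([[0.8, 0.2, 0.0],            # interstate connections: each state may stay put
              [0.0, 0.7, 0.3],            # or step to the next state only
              [0.0, 0.0, 1.0]])
B = np.full((N, M), 1.0 / M)              # per-state discrete pdf over the codebook entries

assert np.allclose(A.sum(axis=1), 1.0) and np.allclose(B.sum(axis=1), 1.0)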

  4. HMM Depiction

  5. PDF Estimation • Discrete • Codebook of feature space cluster centers • Probability for each codebook entry • Continuous • Gaussian mixtures (mean, covariance, mixture weights) • Discriminative estimates (neural networks)
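
As a hedged sketch of the continuous case above, the code below evaluates the emission likelihood of a single state whose pdf is a diagonal-covariance Gaussian mixture; the function name and all values are illustrative, not taken from the lecture.

import numpy as np

def gmm_likelihood(x, weights, means, variances):
    # p(x | state) for a diagonal-covariance Gaussian mixture.
    # x: (D,) feature vector; weights: (K,); means, variances: (K, D).
    diff = x - means                                              # (K, D)
    log_norm = -0.5 * np.log(2.0 * np.pi * variances).sum(axis=1)
    log_comp = log_norm - 0.5 * (diff ** 2 / variances).sum(axis=1)
    return float(np.dot(weights, np.exp(log_comp)))               # mixture-weighted sum

rng = np.random.default_rng(0)
D, K = 39, 4                                  # e.g. 39-d features, 4 mixtures per state
x = rng.standard_normal(D)
weights = np.full(K, 1.0 / K)
means = rng.standard_normal((K, D))
variances = np.ones((K, D))
print(gmm_likelihood(x, weights, means, variances))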

  6. How do HMMs Work? • Three fundamental issues • Training: Baum-Welch algorithm • Scoring (evaluation): Forward algorithm • Optimal path: Viterbi algorithm Complete implementation details: “A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition”, L. R. Rabiner, Proceedings of the IEEE, vol. 77, no. 2, Feb. 1989

  7. HMM Training • Baum-Welch algorithm • Iterative procedure (on-line or batch mode) • Each iteration is guaranteed not to decrease the likelihood of the training data • Estimation may be model-based (maximum likelihood, ML) or discriminative (maximum mutual information, MMI)
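
The sketch below (Python/NumPy, illustrative only) shows one batch re-estimation step of the Baum-Welch algorithm for a discrete-observation HMM, with variable names following Rabiner's tutorial; it omits the per-frame scaling a real implementation needs for long utterances.

import numpy as np

def baum_welch_step(obs, pi, A, B):
    # One re-estimation of (pi, A, B) from a single sequence obs of codebook indices.
    obs = np.asarray(obs)
    T, N = len(obs), A.shape[0]

    alpha = np.zeros((T, N))                      # forward variables
    alpha[0] = pi * B[:, obs[0]]
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]

    beta = np.zeros((T, N))                       # backward variables
    beta[-1] = 1.0
    for t in range(T - 2, -1, -1):
        beta[t] = A @ (B[:, obs[t + 1]] * beta[t + 1])

    gamma = alpha * beta                          # P(state i at time t | obs)
    gamma /= gamma.sum(axis=1, keepdims=True)

    xi = (alpha[:-1, :, None] * A[None] *         # P(i at t, j at t+1 | obs)
          (B[:, obs[1:]].T * beta[1:])[:, None, :])
    xi /= xi.sum(axis=(1, 2), keepdims=True)

    new_pi = gamma[0]
    new_A = xi.sum(axis=0) / gamma[:-1].sum(axis=0)[:, None]
    new_B = np.zeros_like(B)
    for k in range(B.shape[1]):                   # expected codebook-entry counts
        new_B[:, k] = gamma[obs == k].sum(axis=0)
    new_B /= gamma.sum(axis=0)[:, None]
    return new_pi, new_A, new_B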

  8. HMM Evaluation • Forward algorithm • Calculates P(O|λ) summed over ALL valid state sequences • Complexity: • order N²T, ~5000 computations • order 2T·N^T (brute force), ~6E86 computations • N states, T speech frames
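
A minimal sketch of the forward algorithm for a discrete-observation HMM (Python/NumPy, illustrative only): each frame costs one N-by-N matrix-vector product, hence the order-N²T total, versus enumerating all N^T state sequences by brute force.

import numpy as np

def forward(obs, pi, A, B):
    # P(O | lambda): likelihood of the observation sequence obs under the model.
    obs = np.asarray(obs)
    alpha = pi * B[:, obs[0]]                 # initialization
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]         # induction: sum over all predecessor states
    return float(alpha.sum())                 # termination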

  9. Optimal Path • Viterbi algorithm • Determines the single most-likely state sequence for a given model and observation sequence • Dynamic programming solution • Likelihood of Viterbi path can be used for evaluation instead of Forward algorithm
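
A minimal sketch of the Viterbi algorithm for a discrete-observation HMM (Python/NumPy, illustrative only): dynamic programming over the trellis, keeping one back-pointer per state per frame and working in the log domain.

import numpy as np

def viterbi(obs, pi, A, B):
    # Returns the single most-likely state sequence and its log likelihood.
    obs = np.asarray(obs)
    T, N = len(obs), A.shape[0]
    with np.errstate(divide="ignore"):        # log(0) = -inf for forbidden transitions
        log_pi, log_A, log_B = np.log(pi), np.log(A), np.log(B)

    delta = np.zeros((T, N))                  # best log score ending in each state
    psi = np.zeros((T, N), dtype=int)         # back-pointers (best predecessor)
    delta[0] = log_pi + log_B[:, obs[0]]
    for t in range(1, T):
        scores = delta[t - 1][:, None] + log_A          # scores[i, j]: from state i to j
        psi[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0) + log_B[:, obs[t]]

    path = [int(delta[-1].argmax())]          # backtrack from the best final state
    for t in range(T - 1, 0, -1):
        path.append(int(psi[t][path[-1]]))
    return path[::-1], float(delta[-1].max())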

  10. HMMs in ASR • Piecewise stationary model of a nonstationary signal [diagram labeled TRADEOFF]

  11. Typical Implementations • Word models: • 39-dimensional feature vectors • 3-15 states • 1-50 Gaussian mixtures • Diagonal covariance matrices • First-order HMM • Single-step state transitions • Viterbi used for evaluation (speed)

  12. Typical Implementations • Triphones • Phoneme models with left and right context • 3-5 states • Up to 50 mixtures/state • 40K models • 39-dimensional full covariance matrices • Approx 15 billion parameters to estimate • Approx 43,000 hours of speech for training
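
A back-of-the-envelope check of the parameter count, under illustrative assumptions (5 states, 50 mixtures per state, full 39-dimensional covariances, 40K models); with these particular choices the total lands around 8 billion, the same order of magnitude as the figure quoted above, and the exact number depends on the counts actually used.

D = 39
per_gaussian = D + D * (D + 1) // 2 + 1       # mean + full covariance + mixture weight = 820
per_state = 50 * per_gaussian                 # up to 50 mixtures per state
per_model = 5 * per_state                     # 3-5 states; take 5
total = 40_000 * per_model                    # ~40K triphone models
print(f"{total / 1e9:.1f} billion parameters")   # ~8.2 billion with these choices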

  13. Implementation Issues • Same number of states for each word model? • Underflow of evaluation probabilities? • Full/Diagonal covariance matrices?
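
One common answer to the underflow question above (a sketch under the same discrete-observation assumptions as earlier, not the lecture's own code) is to scale the forward variables at every frame and accumulate log P(O|λ) instead of P(O|λ):

import numpy as np

def forward_log_likelihood(obs, pi, A, B):
    # Scaled forward recursion: returns log P(O | lambda) without underflow.
    obs = np.asarray(obs)
    alpha = pi * B[:, obs[0]]
    c = alpha.sum()                           # per-frame scale factor
    log_p, alpha = np.log(c), alpha / c
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]
        c = alpha.sum()
        log_p += np.log(c)
        alpha /= c                            # keep alpha well inside floating-point range
    return float(log_p)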

  14. HMM Limitations • Piecewise stationary assumption • Diphthongs • Tonal languages • Phonetic information in transitions • i.i.d. assumption • Slow articulators • Temporal information • No modeling beyond a 100 ms time frame • Data intensive

  15. Download Slides www.cnel.ufl.edu/~markskow/papers/hmm.ppt
