Discriminative Classifiers vs Hidden Markov Models

CHAPTER 8: DISCRIMINATIVE CLASSIFIERS, HIDDEN MARKOV MODELS

Generative vs. Discriminative

The Perceptron Model

Example: Spam

Binary Decision Rule

Online Perceptron Training
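
A minimal sketch of the binary decision rule and online training, assuming the standard mistake-driven perceptron update (add the feature vector on a missed positive, subtract it on a missed negative). The function names, the feature layout, and the toy spam data are hypothetical, chosen only for illustration.

```python
def predict(weights, features):
    """Binary decision rule: +1 if the score w . f(x) is positive, else -1."""
    score = sum(w * f for w, f in zip(weights, features))
    return 1 if score > 0 else -1


def train_perceptron(examples, num_features, max_epochs=100):
    """Online training: visit examples one at a time, update only on mistakes."""
    weights = [0.0] * num_features
    for _ in range(max_epochs):
        mistakes = 0
        for features, label in examples:
            if predict(weights, features) != label:
                # Mistake: add the features for a missed +1, subtract for a missed -1.
                weights = [w + label * f for w, f in zip(weights, features)]
                mistakes += 1
        if mistakes == 0:
            # A full pass without mistakes: the training set is separated.
            break
    return weights


# Hypothetical toy data: features are (bias, count of "free", count of "meeting");
# label +1 = spam, -1 = not spam.
toy_examples = [
    ([1.0, 2.0, 0.0], 1),
    ([1.0, 1.0, 0.0], 1),
    ([1.0, 0.0, 2.0], -1),
    ([1.0, 0.0, 1.0], -1),
]
weights = train_perceptron(toy_examples, num_features=3)
```

On this separable toy set the loop stops after a pass with no mistakes. The `max_epochs` cap is there because on non-separable data the updates would otherwise never settle.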

Perceptron Training Illustration

Properties of Perceptrons

Issues with Perceptrons

Reasoning over Time
• Often, we want to reason about a sequence of observations
  ◦ Speech recognition
  ◦ Robot localization
  ◦ User attention
• Need to introduce time into our models
• Basic approach: hidden Markov models (HMMs)
• More general: dynamic Bayes’ nets

Markov Models

Conditional Independence

Weather Example

Mini-Forward Algorithm

Example

Stationary Distributions
• If we simulate the chain long enough:
  ◦ What happens?
  ◦ Uncertainty accumulates
  ◦ Eventually, we have no idea what the state is!
• Stationary distributions:
  ◦ For most chains, the distribution we end up in is independent of the initial distribution
  ◦ Called the stationary distribution of the chain
  ◦ Usually, can only predict a short time out
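
The convergence claim can be checked numerically with a short sketch that repeatedly pushes a distribution through the chain's transition model. The two-state weather chain and its probabilities below are hypothetical, not the numbers from the slides.

```python
def forward_step(dist, transition):
    """One update: P(X_{t+1} = j) = sum_i P(X_t = i) * P(X_{t+1} = j | X_t = i)."""
    n = len(dist)
    return [sum(dist[i] * transition[i][j] for i in range(n)) for j in range(n)]


def simulate(initial, transition, steps):
    """Push the initial distribution through the chain for the given number of steps."""
    dist = list(initial)
    for _ in range(steps):
        dist = forward_step(dist, transition)
    return dist


# Hypothetical chain over (sun, rain); row i holds P(next state | current state i).
transition = [
    [0.9, 0.1],
    [0.3, 0.7],
]
from_sun = simulate([1.0, 0.0], transition, steps=100)
from_rain = simulate([0.0, 1.0], transition, steps=100)
```

For this chain both starting points end up at approximately (0.75, 0.25), so after enough steps the prediction no longer depends on the initial state.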

Example: Web Link Analysis

Mini-Viterbi Algorithm

Hidden Markov Models

Example

Conditional Independence

HMM Applications

Forward Algorithm

Viterbi Algorithm

Viterbi Example

Viterbi Properties
• Designed for computing the most likely hidden state sequence given a sequence of observations in hidden Markov models
• Two passes: a forward pass computes, for each state at each time step, the probability of the best path ending there (storing back pointers), and a backward pass follows the back pointers to reconstruct the most likely sequence
• What’s the time complexity?
• O(d^2 n), for d hidden states and n observations. Why is this exciting? Brute-force search would have to score all d^n candidate state sequences
• There are many extensions of the basic Viterbi algorithm, developed for other models with similar local structure: syntactic parsing, for instance
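
The two passes can be sketched as follows. This is a minimal illustration, not the slides' own code: the dictionary-based model layout and the umbrella-style toy HMM with its probabilities are assumptions made for the example.

```python
def viterbi(observations, states, initial, transition, emission):
    """Return (path, probability) of the most likely hidden state sequence."""
    # Forward pass: best[t][s] is the probability of the best path ending in s at time t.
    best = [{s: initial[s] * emission[s][observations[0]] for s in states}]
    back = [{}]
    for t in range(1, len(observations)):
        best.append({})
        back.append({})
        for s in states:
            # Maximize over the d possible predecessors of s.
            prev, score = max(
                ((p, best[t - 1][p] * transition[p][s]) for p in states),
                key=lambda pair: pair[1],
            )
            best[t][s] = score * emission[s][observations[t]]
            back[t][s] = prev
    # Backward pass: follow back pointers from the best final state.
    last = max(states, key=lambda s: best[-1][s])
    path = [last]
    for t in range(len(observations) - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path, best[-1][last]


# Hypothetical toy HMM: hidden weather, observed umbrella use.
states = ["sun", "rain"]
initial = {"sun": 0.5, "rain": 0.5}
transition = {
    "sun": {"sun": 0.7, "rain": 0.3},
    "rain": {"sun": 0.3, "rain": 0.7},
}
emission = {
    "sun": {"umbrella": 0.2, "none": 0.8},
    "rain": {"umbrella": 0.9, "none": 0.1},
}
path, prob = viterbi(["umbrella", "umbrella", "none"], states, initial, transition, emission)
```

Each of the n time steps loops over d states and, for each, over d predecessors, which is where the d^2 n cost comes from. Real implementations usually work with log probabilities to avoid underflow on long sequences; plain products are used here only to keep the sketch short.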

Speech in an Hour

HMMs for Speech

HMMs for Continuous Obs.?
• Before: discrete, finite set of observations
• Now: spectral feature vectors are real-valued!
• Solution 1: discretization
• Solution 2: continuous emissions models
  ◦ Gaussians
  ◦ Multivariate Gaussians
  ◦ Mixtures of Multivariate Gaussians
• A state is progressively:
  ◦ Context independent subphone (~3 per phone)
  ◦ Context dependent phone (=triphones)
  ◦ State-tying of CD phone
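
As an illustration of Solution 2, the sketch below evaluates an emission density for a real-valued feature vector under a mixture of Gaussians. It assumes diagonal covariances (each dimension independent within a component), a common simplification that the slides do not state; all names and parameter values are hypothetical.

```python
import math


def gaussian_pdf(x, mean, variance):
    """Density of a one-dimensional Gaussian at x."""
    return math.exp(-((x - mean) ** 2) / (2.0 * variance)) / math.sqrt(2.0 * math.pi * variance)


def diag_gaussian_pdf(vector, means, variances):
    """Multivariate Gaussian with diagonal covariance (assumption): product of 1-D densities."""
    density = 1.0
    for x, m, v in zip(vector, means, variances):
        density *= gaussian_pdf(x, m, v)
    return density


def mixture_pdf(vector, components):
    """Mixture density; components is a list of (weight, means, variances), weights sum to 1."""
    return sum(w * diag_gaussian_pdf(vector, m, v) for w, m, v in components)


# Hypothetical two-component emission model for one HMM state over 2-D features.
components = [
    (0.5, [0.0, 0.0], [1.0, 1.0]),
    (0.5, [3.0, 3.0], [1.0, 1.0]),
]
density = mixture_pdf([0.0, 0.0], components)
```

In an HMM this density takes the place of the discrete emission table entry P(observation | state) inside the forward and Viterbi recurrences.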

ASR Lexicon: Markov Models

Viterbi with 2 Words + Unif. LM

Conclusion
• Perceptron
  ◦ A discriminative model, an alternative to generative models like Naïve Bayes
  ◦ Simple classification rule, based on a weight vector
  ◦ Simple online learning algorithm, guaranteed to converge if the training set is linearly separable
• Hidden Markov Models
  ◦ A special kind of Bayesian network designed for reasoning about sequences of hidden states
  ◦ Polynomial-time inference for the most likely state sequence (Viterbi) and marginalization (Forward-Backward)
  ◦ Many applications
