Natural Language Processing. Spring 2007 V. “Juggy” Jagannathan. Course Book. Foundations of Statistical Natural Language Processing. By Christopher Manning & Hinrich Schütze. Chapter 9. Markov Models March 5, 2007. Markov models. Markov assumption
Natural Language Processing Spring 2007 V. “Juggy” Jagannathan
Course Book Foundations of Statistical Natural Language Processing By Christopher Manning & Hinrich Schütze
Chapter 9 Markov Models March 5, 2007
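Markov models
• Markov assumption
• Suppose $X = (X_1, \ldots, X_T)$ is a sequence of random variables taking values in some finite set $S = \{s_1, \ldots, s_N\}$. The Markov properties are:
• Limited horizon: $P(X_{t+1} = s_k \mid X_1, \ldots, X_t) = P(X_{t+1} = s_k \mid X_t)$, i.e. the value at time $t+1$ depends only on the value at time $t$
• Time invariant (stationary): $P(X_{t+1} = s_k \mid X_t) = P(X_2 = s_k \mid X_1)$, i.e. the transition probabilities do not change with $t$
• Stochastic transition matrix $A$: $a_{ij} = P(X_{t+1} = s_j \mid X_t = s_i)$, where $a_{ij} \ge 0$ for all $i, j$ and $\sum_{j=1}^{N} a_{ij} = 1$ for all $i$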
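A minimal sketch of these properties in code. The two-state chain and its numbers are hypothetical and chosen only for illustration. The sketch checks that a matrix is stochastic and uses the limited-horizon property to score a state sequence as $P(X_1, \ldots, X_T) = \pi_{X_1} \prod_t a_{X_t X_{t+1}}$.

```python
def is_stochastic(A, tol=1e-9):
    """True if every entry is non-negative and every row sums to 1."""
    return all(
        all(p >= 0 for p in row) and abs(sum(row) - 1.0) <= tol
        for row in A
    )


def sequence_probability(pi, A, states):
    """P(X_1..X_T) for a first-order, time-invariant Markov chain.

    pi[i]   : probability of starting in state i
    A[i][j] : probability of moving from state i to state j
    states  : list of state indices, e.g. [0, 0, 1]
    """
    prob = pi[states[0]]
    for prev, nxt in zip(states, states[1:]):
        prob *= A[prev][nxt]  # limited horizon: only the previous state matters
    return prob


# Hypothetical two-state chain (illustrative numbers only).
pi = [1.0, 0.0]
A = [[0.7, 0.3],
     [0.5, 0.5]]

p = sequence_probability(pi, A, [0, 0, 1])  # 1.0 * 0.7 * 0.3
```

Because the chain is time invariant, the same matrix `A` is reused at every step.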
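Hidden Markov Model Example
Probability of observing {lem, ice-t} given that the machine starts in CP:
$0.3 \times 0.7 \times 0.1 + 0.3 \times 0.3 \times 0.7 = 0.021 + 0.063 = 0.084$
The first term is the path that emits lem in CP and stays in CP to emit ice-t; the second is the path that emits lem in CP and moves to IP to emit ice-t.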
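The sum above can be checked by brute force over every hidden path. The parameter tables below are an assumption: they are meant to match the course book's soft drink machine figure, which is not reproduced in these notes, and only the factors 0.3, 0.7 and 0.1 are confirmed by the slide itself. The function name `observation_probability` is illustrative.

```python
from itertools import product

STATES = ("CP", "IP")

# Assumed parameters (textbook soft drink machine; not shown in these notes).
TRANS = {
    "CP": {"CP": 0.7, "IP": 0.3},
    "IP": {"CP": 0.5, "IP": 0.5},
}
EMIT = {
    "CP": {"cola": 0.6, "ice_t": 0.1, "lem": 0.3},
    "IP": {"cola": 0.1, "ice_t": 0.7, "lem": 0.2},
}


def observation_probability(start, observations):
    """Sum the probability of `observations` over all hidden state paths
    that begin in `start`. Each state emits one symbol."""
    total = 0.0
    for rest in product(STATES, repeat=len(observations) - 1):
        path = (start,) + rest
        prob = EMIT[path[0]][observations[0]]
        for prev, cur, obs in zip(path, path[1:], observations[1:]):
            prob *= TRANS[prev][cur] * EMIT[cur][obs]
        total += prob
    return total


# Paths CP->CP (0.3*0.7*0.1) and CP->IP (0.3*0.3*0.7).
p = observation_probability("CP", ["lem", "ice_t"])
```

Enumeration costs $N^{T-1}$ paths, which is why the forward procedure is preferred for anything longer than a toy sequence.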
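Why use HMMs?
• Underlying (hidden) events generate the observable surface events
• E.g. predicting the weather from the dampness of seaweed
• http://www.comp.leeds.ac.uk/roger/HiddenMarkovModels/html_dev/main.html
• Linear interpolation in n-gram models: $P_{li}(w_n \mid w_{n-1}, w_{n-2}) = \lambda_1 P_1(w_n) + \lambda_2 P_2(w_n \mid w_{n-1}) + \lambda_3 P_3(w_n \mid w_{n-1}, w_{n-2})$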
See the notes by David Meir Blei [UC Berkeley]: http://www-nlp.stanford.edu/fsnlp/hmm-chap/blei-hmm-ch9.ppt (slides 1-13)
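Forward Procedure
In the course book's notation, with model $\mu = (A, B, \Pi)$ and $\alpha_i(t) = P(o_1 \ldots o_{t-1}, X_t = i \mid \mu)$:
Initialization: $\alpha_i(1) = \pi_i$, $1 \le i \le N$
Induction: $\alpha_j(t+1) = \sum_{i=1}^{N} \alpha_i(t)\, a_{ij}\, b_{ij o_t}$, $1 \le t \le T$, $1 \le j \le N$
Total computation: $P(O \mid \mu) = \sum_{i=1}^{N} \alpha_i(T+1)$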
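A minimal sketch of the forward procedure, under two assumptions: the emission depends only on the state being left (so $b_{ij o_t}$ reduces to `B[i][o]`), and the soft drink machine parameters are the textbook's values, which do not appear in these notes. The recursion used is $\alpha_i(1) = \pi_i$, $\alpha_j(t+1) = \sum_i \alpha_i(t)\, a_{ij}\, b_{i o_t}$, and $P(O) = \sum_i \alpha_i(T+1)$.

```python
def forward(pi, A, B, obs):
    """Return [alpha(1), ..., alpha(T+1)], each a list over states.

    pi[i]   : initial state probability
    A[i][j] : transition probability i -> j
    B[i][o] : probability that state i emits symbol o (assumed to be
              independent of the destination state)
    """
    n = len(pi)
    alpha = [list(pi)]  # initialization: alpha_i(1) = pi_i
    for o in obs:       # induction over t = 1..T
        prev = alpha[-1]
        alpha.append([
            sum(prev[i] * A[i][j] * B[i][o] for i in range(n))
            for j in range(n)
        ])
    return alpha


# Assumed parameters: state 0 = CP, state 1 = IP, machine starts in CP.
pi = [1.0, 0.0]
A = [[0.7, 0.3],
     [0.5, 0.5]]
B = [{"cola": 0.6, "ice_t": 0.1, "lem": 0.3},
     {"cola": 0.1, "ice_t": 0.7, "lem": 0.2}]

alpha = forward(pi, A, B, ["lem", "ice_t"])
likelihood = sum(alpha[-1])  # total: sum over alpha_i(T+1)
```

Each step costs $O(N^2)$, so the whole computation is $O(N^2 T)$ instead of exponential in $T$.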
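Backward Procedure
In the course book's notation, with $\beta_i(t) = P(o_t \ldots o_T \mid X_t = i, \mu)$:
Initialization: $\beta_i(T+1) = 1$, $1 \le i \le N$
Induction: $\beta_i(t) = \sum_{j=1}^{N} a_{ij}\, b_{ij o_t}\, \beta_j(t+1)$, $1 \le t \le T$, $1 \le i \le N$
Total computation: $P(O \mid \mu) = \sum_{i=1}^{N} \pi_i\, \beta_i(1)$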
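The backward procedure can be sketched the same way, again assuming that the emission depends only on the state being left and that the soft drink machine uses the textbook's parameter values (not given in these notes). The recursion used is $\beta_i(T+1) = 1$, $\beta_i(t) = \sum_j a_{ij}\, b_{i o_t}\, \beta_j(t+1)$, and $P(O) = \sum_i \pi_i\, \beta_i(1)$.

```python
def backward(A, B, obs):
    """Return [beta(1), ..., beta(T+1)], each a list over states.

    A[i][j] : transition probability i -> j
    B[i][o] : probability that state i emits symbol o (assumed to be
              independent of the destination state)
    """
    n = len(A)
    beta = [[1.0] * n]        # initialization: beta_i(T+1) = 1
    for o in reversed(obs):   # induction from t = T down to t = 1
        nxt = beta[0]
        beta.insert(0, [
            sum(A[i][j] * B[i][o] * nxt[j] for j in range(n))
            for i in range(n)
        ])
    return beta


# Assumed parameters: state 0 = CP, state 1 = IP, machine starts in CP.
pi = [1.0, 0.0]
A = [[0.7, 0.3],
     [0.5, 0.5]]
B = [{"cola": 0.6, "ice_t": 0.1, "lem": 0.3},
     {"cola": 0.1, "ice_t": 0.7, "lem": 0.2}]

beta = backward(A, B, ["lem", "ice_t"])
likelihood = sum(pi[i] * beta[0][i] for i in range(len(pi)))
```

The result agrees with the value 0.084 worked out by hand in the Hidden Markov Model example.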
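Finding the best state sequence
To determine the state sequence that best explains the observations, let:
$\gamma_i(t) = P(X_t = i \mid O, \mu) = \dfrac{\alpha_i(t)\, \beta_i(t)}{\sum_{j=1}^{N} \alpha_j(t)\, \beta_j(t)}$
where $\alpha$ and $\beta$ are the forward and backward variables. Individually, the most likely state is:
$\hat{X}_t = \arg\max_{1 \le i \le N} \gamma_i(t)$, $1 \le t \le T+1$
This choice maximizes the expected number of correctly guessed states, but it does not give the most likely state sequence: picking each state on its own can produce a sequence that is very unlikely as a whole.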
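Finding the best state sequence: Viterbi algorithm
Store the most probable path that leads to a given node, with $\delta_j(t) = \max_{X_1 \ldots X_{t-1}} P(X_1 \ldots X_{t-1}, o_1 \ldots o_{t-1}, X_t = j \mid \mu)$
Initialization: $\delta_j(1) = \pi_j$, $1 \le j \le N$
Induction: $\delta_j(t+1) = \max_{1 \le i \le N} \delta_i(t)\, a_{ij}\, b_{ij o_t}$, $1 \le j \le N$
Store backtrace: $\psi_j(t+1) = \arg\max_{1 \le i \le N} \delta_i(t)\, a_{ij}\, b_{ij o_t}$, $1 \le j \le N$
Termination and path readout: $\hat{X}_{T+1} = \arg\max_{1 \le i \le N} \delta_i(T+1)$, $\hat{X}_t = \psi_{\hat{X}_{t+1}}(t+1)$, $P(\hat{X}) = \max_{1 \le i \le N} \delta_i(T+1)$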
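A minimal Viterbi sketch under the same assumptions as the other examples for this chapter: emissions depend only on the state being left, and the soft drink machine parameters are the textbook's values, which are not shown in these notes. The recursion is the forward recursion with the sum replaced by a max, plus a table of back-pointers.

```python
def viterbi(pi, A, B, obs):
    """Return (best_path, probability); best_path has len(obs) + 1 states.

    pi[i]   : initial state probability
    A[i][j] : transition probability i -> j
    B[i][o] : probability that state i emits symbol o (assumed to be
              independent of the destination state)
    """
    n = len(pi)
    delta = [list(pi)]  # initialization: delta_j(1) = pi_j
    psi = []            # psi[t][j] = best predecessor of state j
    for o in obs:
        prev = delta[-1]
        row, back = [], []
        for j in range(n):
            scores = [prev[i] * A[i][j] * B[i][o] for i in range(n)]
            best = max(range(n), key=scores.__getitem__)
            row.append(scores[best])  # induction
            back.append(best)         # store backtrace
        delta.append(row)
        psi.append(back)

    # Termination: best final state, then follow the back-pointers.
    last = max(range(n), key=delta[-1].__getitem__)
    path = [last]
    for back in reversed(psi):
        path.append(back[path[-1]])
    path.reverse()
    return path, delta[-1][last]


# Assumed parameters: state 0 = CP, state 1 = IP, machine starts in CP.
pi = [1.0, 0.0]
A = [[0.7, 0.3],
     [0.5, 0.5]]
B = [{"cola": 0.6, "ice_t": 0.1, "lem": 0.3},
     {"cola": 0.1, "ice_t": 0.7, "lem": 0.2}]

path, prob = viterbi(pi, A, B, ["lem", "ice_t", "cola"])
```

For this input the best path is CP, IP, CP, CP: each state emits the symbol for which it has a high emission probability. Ties are broken in favour of the lower state index, which is an arbitrary choice of this sketch.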
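Parameter Estimation
Probability of traversing the arc from state $i$ to state $j$ at time $t$, given the observation sequence $O$:
$p_t(i, j) = P(X_t = i, X_{t+1} = j \mid O, \mu) = \dfrac{\alpha_i(t)\, a_{ij}\, b_{ij o_t}\, \beta_j(t+1)}{\sum_{m=1}^{N} \alpha_m(t)\, \beta_m(t)}$
where $\alpha$ and $\beta$ are the forward and backward variables.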
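A sketch of how the arc probabilities could be computed from forward and backward passes. As in the other examples, it assumes that emissions depend only on the state being left and that the soft drink machine uses the textbook's parameter values, which are not given in these notes. The function names are illustrative.

```python
def forward(pi, A, B, obs):
    """alpha(1..T+1): alpha_j(t+1) = sum_i alpha_i(t) * a_ij * b_i(o_t)."""
    n = len(pi)
    alpha = [list(pi)]
    for o in obs:
        prev = alpha[-1]
        alpha.append([sum(prev[i] * A[i][j] * B[i][o] for i in range(n))
                      for j in range(n)])
    return alpha


def backward(A, B, obs):
    """beta(1..T+1): beta_i(t) = sum_j a_ij * b_i(o_t) * beta_j(t+1)."""
    n = len(A)
    beta = [[1.0] * n]
    for o in reversed(obs):
        nxt = beta[0]
        beta.insert(0, [sum(A[i][j] * B[i][o] * nxt[j] for j in range(n))
                        for i in range(n)])
    return beta


def arc_posteriors(pi, A, B, obs):
    """arcs[t][i][j] = P(X_t = i, X_t+1 = j | O), with t counted from 0."""
    n = len(pi)
    alpha = forward(pi, A, B, obs)
    beta = backward(A, B, obs)
    likelihood = sum(alpha[-1])  # P(O), the normalizing constant
    return [
        [[alpha[t][i] * A[i][j] * B[i][o] * beta[t + 1][j] / likelihood
          for j in range(n)]
         for i in range(n)]
        for t, o in enumerate(obs)
    ]


# Assumed parameters: state 0 = CP, state 1 = IP, machine starts in CP.
pi = [1.0, 0.0]
A = [[0.7, 0.3],
     [0.5, 0.5]]
B = [{"cola": 0.6, "ice_t": 0.1, "lem": 0.3},
     {"cola": 0.1, "ice_t": 0.7, "lem": 0.2}]

arcs = arc_posteriors(pi, A, B, ["lem", "ice_t"])
```

At each time step the entries of `arcs[t]` sum to 1. Summing `arcs[t][i][j]` over `t` gives the expected number of transitions from state `i` to state `j`, which is the quantity a re-estimation step would normalize to update `A`.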