Natural Language Processing

Presentation Transcript


  1. Natural Language Processing Spring 2007 V. “Juggy” Jagannathan

  2. Course Book Foundations of Statistical Natural Language Processing By Christopher Manning & Hinrich Schütze

  3. Chapter 9 Markov Models March 5, 2007

  4. Markov models
  • Markov assumption
  • Suppose X = (X1, …, XT) is a sequence of random variables taking values in some finite set S = {s1, …, sN}. The Markov properties are:
  • Limited horizon: P(Xt+1 = sk | X1, …, Xt) = P(Xt+1 = sk | Xt), i.e. the value at t+1 depends only on the value at t
  • Time invariant (stationary): these conditional probabilities do not change over time
  • Stochastic transition matrix A: aij = P(Xt+1 = sj | Xt = si), where aij ≥ 0 and Σj aij = 1 for every i
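
A minimal sketch (Python) of a Markov chain with a stochastic transition matrix; the two states and all numeric values here are illustrative, not taken from the slides:

```python
# Minimal Markov chain sketch: a stochastic transition matrix A and the
# probability of a state sequence under the limited-horizon property.
# States and numbers are illustrative only.

states = ["s1", "s2"]
A = {
    "s1": {"s1": 0.8, "s2": 0.2},   # a_ij = P(X_{t+1} = s_j | X_t = s_i)
    "s2": {"s1": 0.4, "s2": 0.6},
}
pi = {"s1": 0.5, "s2": 0.5}         # initial state distribution

# Every row of A must sum to 1 (stochastic matrix constraint).
assert all(abs(sum(A[i].values()) - 1.0) < 1e-9 for i in states)

def sequence_probability(seq):
    """P(X1, ..., XT) = pi(X1) * product of a_{X_t X_{t+1}}."""
    p = pi[seq[0]]
    for prev, nxt in zip(seq, seq[1:]):
        p *= A[prev][nxt]           # limited horizon: depends only on prev
    return p

print(sequence_probability(["s1", "s1", "s2"]))  # 0.5 * 0.8 * 0.2 = 0.08
```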

  5. Markov model example

  6. Hidden Markov Model Example Probability of observing {lem, ice_t} given that the machine starts in CP: 0.3×0.7×0.1 + 0.3×0.3×0.7 = 0.021 + 0.063 = 0.084
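
A short sketch that reproduces the 0.084 by summing over the hidden state sequences. Only the factors appearing in the slide's arithmetic are confirmed by the transcript; the remaining entries (the IP transition row and the cola/lem outputs) are assumed values in the spirit of the textbook's soft drink machine example and do not affect this particular result:

```python
# Enumerate hidden state sequences for the lemonade / iced tea example.
# Entries not used in the slide's arithmetic are assumed for completeness.
from itertools import product

states = ["CP", "IP"]
trans = {"CP": {"CP": 0.7, "IP": 0.3},
         "IP": {"CP": 0.5, "IP": 0.5}}          # IP row: assumed values
emit = {"CP": {"cola": 0.6, "ice_t": 0.1, "lem": 0.3},
        "IP": {"cola": 0.1, "ice_t": 0.7, "lem": 0.2}}

def prob_observations(obs, start="CP"):
    """Sum over all hidden state sequences that begin in `start`."""
    total = 0.0
    for rest in product(states, repeat=len(obs) - 1):
        path = (start,) + rest
        p = emit[path[0]][obs[0]]
        for t in range(1, len(obs)):
            p *= trans[path[t - 1]][path[t]] * emit[path[t]][obs[t]]
        total += p
    return total

print(prob_observations(["lem", "ice_t"]))  # 0.3*0.7*0.1 + 0.3*0.3*0.7 = 0.084
```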

  7. Why use HMMs?
  • Underlying (hidden) events generate the surface observable events
  • E.g. predicting the weather from the observed dampness of seaweed
  • http://www.comp.leeds.ac.uk/roger/HiddenMarkovModels/html_dev/main.html
  • Linear interpolation in n-gram models: P(wn | wn-2, wn-1) = λ1 P1(wn) + λ2 P2(wn | wn-1) + λ3 P3(wn | wn-2, wn-1), with λ1 + λ2 + λ3 = 1; the interpolation itself can be encoded as an HMM (see the sketch below)
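
A small sketch of the interpolated trigram estimate just stated; the λ weights and the toy probability tables are made up for illustration:

```python
# Linear interpolation of unigram, bigram, and trigram estimates.
# The lambda weights and the toy tables below are illustrative only.

lambdas = (0.2, 0.3, 0.5)            # must sum to 1

def interpolated_prob(w, w1, w2, p_uni, p_bi, p_tri):
    """P(w | w2, w1) = l1*P1(w) + l2*P2(w | w1) + l3*P3(w | w2, w1)."""
    l1, l2, l3 = lambdas
    return (l1 * p_uni.get(w, 0.0)
            + l2 * p_bi.get((w1, w), 0.0)
            + l3 * p_tri.get((w2, w1, w), 0.0))

# Toy probability tables (made up for illustration).
p_uni = {"house": 0.01}
p_bi = {("the", "house"): 0.2}
p_tri = {("in", "the", "house"): 0.5}
print(interpolated_prob("house", "the", "in", p_uni, p_bi, p_tri))
# 0.2*0.01 + 0.3*0.2 + 0.5*0.5 = 0.312
```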

  8. Look at the notes from David Meir Blei [UC Berkeley]: http://www-nlp.stanford.edu/fsnlp/hmm-chap/blei-hmm-ch9.ppt, slides 1-13

  9. (Observed states)

  10. Forward Procedure

  11. Forward Procedure
  Forward variable: αi(t) = P(o1 … ot-1, Xt = i | μ)
  Initialization: αi(1) = πi, 1 ≤ i ≤ N
  Induction: αj(t+1) = Σi αi(t) aij bij(ot), 1 ≤ t ≤ T, where bij(k) is the probability of emitting k on the transition i → j
  Total computation: P(O | μ) = Σi αi(T+1), requiring on the order of N²T multiplications
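
A sketch of the forward procedure in Python, reusing the states, trans, and emit tables from the soft drink sketch above; in that example the outputs depend only on the originating state, so bij(o) reduces to emit[i][o]. The start-in-CP condition is encoded as an assumed initial distribution pi:

```python
# Forward procedure sketch; reuses `states`, `trans`, `emit` from above.
pi = {"CP": 1.0, "IP": 0.0}          # start in CP, as in the example

def forward(obs):
    """alpha[t][i] corresponds to alpha_i(t+1) in the slide notation."""
    T = len(obs)
    alpha = [{i: 0.0 for i in states} for _ in range(T + 1)]
    for i in states:                 # initialization: alpha_i(1) = pi_i
        alpha[0][i] = pi[i]
    for t in range(T):               # induction over the observations
        for j in states:
            alpha[t + 1][j] = sum(alpha[t][i] * trans[i][j] * emit[i][obs[t]]
                                  for i in states)
    return alpha

alpha = forward(["lem", "ice_t"])
print(sum(alpha[-1].values()))       # P(O | mu) = 0.084, as computed by hand
```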

  12. Backward Procedure
  Backward variable: βi(t) = P(ot … oT | Xt = i, μ)
  Initialization: βi(T+1) = 1, 1 ≤ i ≤ N
  Induction: βi(t) = Σj aij bij(ot) βj(t+1), 1 ≤ t ≤ T
  Total computation: P(O | μ) = Σi πi βi(1), again on the order of N²T multiplications
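
A matching backward-procedure sketch, again reusing the states, trans, emit, and pi tables from the sketches above:

```python
# Backward procedure sketch; reuses `states`, `trans`, `emit`, `pi` from above.
def backward(obs):
    """beta[t][i] corresponds to beta_i(t+1) in the slide notation."""
    T = len(obs)
    beta = [{i: 0.0 for i in states} for _ in range(T + 1)]
    for i in states:                 # initialization: beta_i(T+1) = 1
        beta[T][i] = 1.0
    for t in range(T - 1, -1, -1):   # induction, backwards in time
        for i in states:
            beta[t][i] = sum(trans[i][j] * emit[i][obs[t]] * beta[t + 1][j]
                             for j in states)
    return beta

beta = backward(["lem", "ice_t"])
print(sum(pi[i] * beta[0][i] for i in states))  # P(O | mu) = 0.084 again
```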

  13. Combining both – forward and backward P(O, Xt = i | μ) = αi(t) βi(t), so P(O | μ) = Σi αi(t) βi(t) for any t, 1 ≤ t ≤ T+1
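
A quick numeric check of that identity, using the alpha and beta tables computed in the sketches above:

```python
# The product alpha_i(t) * beta_i(t), summed over states, equals P(O | mu)
# at every time step; reuses `alpha`, `beta`, `states` from above.
for t in range(len(alpha)):
    print(t + 1, sum(alpha[t][i] * beta[t][i] for i in states))  # all 0.084
```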

  14. Finding the best state sequence
  To determine the state sequence that best explains the observations, let γi(t) = P(Xt = i | O, μ) = αi(t) βi(t) / Σj αj(t) βj(t)
  Individually, the most likely state at time t is the i that maximizes γi(t)
  This approach, however, does not correctly estimate the most likely state sequence as a whole
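
A sketch of these per-time posteriors, reusing the alpha and beta tables from above; stringing together the per-time winners need not give the best overall sequence:

```python
# Posterior state probabilities gamma_i(t); reuses `alpha`, `beta`, `states`.
def gamma(alpha, beta, t):
    """gamma_i(t): probability of being in state i at time t, given O."""
    denom = sum(alpha[t][j] * beta[t][j] for j in states)
    return {i: alpha[t][i] * beta[t][i] / denom for i in states}

for t in range(len(alpha)):
    g = gamma(alpha, beta, t)
    print(t + 1, max(g, key=g.get), g)   # individually most likely state
```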

  15. Finding the best state sequence: Viterbi algorithm
  Store the most probable path that leads to a given node: δj(t) = probability of the most probable path ending in state j at time t
  Initialization: δj(1) = πj, 1 ≤ j ≤ N
  Induction: δj(t+1) = maxi δi(t) aij bij(ot)
  Store backtrace: ψj(t+1) = argmaxi δi(t) aij bij(ot)
  Backtrace: start from the state maximizing δi(T+1) and follow the ψ pointers backwards to read off the best state sequence
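
A Viterbi sketch under the same assumptions as the forward sketch (the states, trans, emit, and pi tables are reused from above):

```python
# Viterbi sketch; reuses `states`, `trans`, `emit`, `pi` from above.
def viterbi(obs):
    """Return the most probable hidden state sequence and its probability."""
    T = len(obs)
    delta = [{i: 0.0 for i in states} for _ in range(T + 1)]
    psi = [{i: None for i in states} for _ in range(T + 1)]
    for i in states:                               # initialization
        delta[0][i] = pi[i]
    for t in range(T):                             # induction with backpointers
        for j in states:
            best = max(states,
                       key=lambda i: delta[t][i] * trans[i][j] * emit[i][obs[t]])
            delta[t + 1][j] = delta[t][best] * trans[best][j] * emit[best][obs[t]]
            psi[t + 1][j] = best
    last = max(states, key=lambda i: delta[T][i])  # best final state
    path = [last]
    for t in range(T, 0, -1):                      # follow backpointers
        path.append(psi[t][path[-1]])
    return list(reversed(path)), delta[T][last]

print(viterbi(["lem", "ice_t"]))   # best state path over times 1 .. T+1
```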

  16. Parameter Estimation Given an observation sequence O, find the model μ = (A, B, π) that maximizes P(O | μ); this is done iteratively with the Baum-Welch (forward-backward) algorithm

  17. Parameter Estimation Probability of traversing the arc i → j at time t, given the observation sequence O: pt(i, j) = P(Xt = i, Xt+1 = j | O, μ) = αi(t) aij bij(ot) βj(t+1) / Σm αm(t) βm(t)
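
A sketch of this arc-traversal probability, reusing alpha, beta, trans, and emit from the sketches above:

```python
# p_t(i, j): probability of using arc i -> j at time t, given O.
# Reuses `states`, `trans`, `emit` and the alpha/beta tables from above.
def arc_prob(obs, alpha, beta, t, i, j):
    """p_t(i, j) = P(X_t = i, X_{t+1} = j | O), with 0-based t."""
    denom = sum(alpha[t][m] * beta[t][m] for m in states)   # = P(O | mu)
    return alpha[t][i] * trans[i][j] * emit[i][obs[t]] * beta[t + 1][j] / denom

obs = ["lem", "ice_t"]
print(arc_prob(obs, alpha, beta, 0, "CP", "IP"))   # 1*0.3*0.3*0.7 / 0.084 = 0.75
```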

  18. Parameter Estimation Re-estimation: π̂i = γi(1), âij = Σt pt(i, j) / Σt γi(t), b̂ij(k) = Σ{t: ot = k} pt(i, j) / Σt pt(i, j), where γi(t) = Σj pt(i, j); iterating these updates never decreases P(O | μ)
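
A compact sketch of one such re-estimation step, stitched together from the helper functions above and using the simplification that outputs depend only on the originating state; a practical implementation would also work in log space and guard against zero denominators:

```python
# One Baum-Welch re-estimation step (sketch); reuses forward, backward,
# gamma, arc_prob, `states`, and `emit` defined in the sketches above.
def reestimate(obs):
    a, b = forward(obs), backward(obs)
    T = len(obs)
    g = [gamma(a, b, t) for t in range(T + 1)]                 # gamma_i(t)
    p = [{(i, j): arc_prob(obs, a, b, t, i, j)                 # p_t(i, j)
          for i in states for j in states} for t in range(T)]
    new_pi = {i: g[0][i] for i in states}
    new_trans = {i: {j: sum(p[t][(i, j)] for t in range(T)) /
                        sum(g[t][i] for t in range(T))
                     for j in states} for i in states}
    new_emit = {i: {k: sum(p[t][(i, j)] for t in range(T)
                           for j in states if obs[t] == k) /
                       sum(p[t][(i, j)] for t in range(T) for j in states)
                    for k in emit[i]} for i in states}
    return new_pi, new_trans, new_emit

print(reestimate(["lem", "ice_t"]))
```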
