
Ch 13. Sequential Data (1/2) Pattern Recognition and Machine Learning, C. M. Bishop, 2006.






Presentation Transcript


1. Ch 13. Sequential Data (1/2), Pattern Recognition and Machine Learning, C. M. Bishop, 2006. Summarized by Kim Jin-young, Biointelligence Laboratory, Seoul National University, http://bi.snu.ac.kr/

2. Contents
• 13.1 Markov Models
• 13.2 Hidden Markov Models
  • 13.2.1 Maximum likelihood for the HMM
  • 13.2.2 The forward-backward algorithm
  • 13.2.3 The sum-product algorithm for the HMM
  • 13.2.4 Scaling factors
  • 13.2.5 The Viterbi algorithm
  • 13.2.6 Extensions of the HMM
(C) 2007, SNU Biointelligence Lab, http://bi.snu.ac.kr/

3. Sequential Data
• Successive observations are dependent, so the i.i.d. assumption does not hold
  • Examples: weather data, DNA, characters in a sentence
• Sequential distributions: stationary vs. nonstationary
• Markov models: no latent variables
• State space models
  • Hidden Markov model (discrete latent variables)
  • Linear dynamical systems

4. Markov Models
• Markov chain
• State space model (free of a Markov assumption of any fixed order, at the cost of a reasonable number of extra parameters)
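The joint distribution of a first-order Markov chain factorizes as p(z1) Πn=2..N p(zn | zn-1). A minimal sketch of this factorization (the transition matrix, initial distribution, and function name are invented for illustration, not taken from the slides):

```python
import numpy as np

# Toy two-state chain (assumed values for illustration only).
A = np.array([[0.7, 0.3],   # P(next state | current state = 0)
              [0.4, 0.6]])  # P(next state | current state = 1)
pi = np.array([0.5, 0.5])   # initial state distribution

def chain_prob(states):
    """Joint probability p(z_1) * prod_n p(z_n | z_{n-1}) of a state sequence."""
    p = pi[states[0]]
    for prev, cur in zip(states, states[1:]):
        p *= A[prev, cur]
    return p

print(chain_prob([0, 0, 1]))  # 0.5 * 0.7 * 0.3 = 0.105
```

A higher-order chain would condition on more previous states, at the price of exponentially more parameters, which is what motivates the state space models mentioned above.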

5. Hidden Markov Model (overview)
• Introduces discrete latent variables (based on prior knowledge)
• Examples: coin toss, urn and ball
• Conditional random field (CRF)
  • An MRF globally conditioned on the observation sequence X
  • Relaxes the independence assumptions made by the HMM

6. Hidden Markov Model (example)
• Lattice (trellis) representation
• Left-to-right HMM, e.g. for handwriting recognition

7. Hidden Markov Model
• Given the observations X, latent variables Z, and model parameters θ = {π, A, φ}, the joint distribution for the HMM factorizes as p(X, Z | θ) = p(z1 | π) [ Πn=2..N p(zn | zn-1, A) ] Πn=1..N p(xn | zn, φ)
• Its factors are the initial latent node distribution p(z1 | π), the conditional distribution among latent variables p(zn | zn-1, A), and the emission probability p(xn | zn, φ)
• Notation: K is the number of states and N the total number of time steps; z(n-1),j z(n),k = 1 indicates a transition from state j at time n-1 to state k at time n
(C) 2007, SNU Biointelligence Lab, http://bi.snu.ac.kr/
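The factorization above can be sketched directly in code: one initial-state factor, one transition factor per step, and one emission factor per observation. The toy parameter values and the function name are assumptions for illustration:

```python
import numpy as np

# Assumed toy HMM parameters (2 states, 2 observation symbols).
pi = np.array([0.6, 0.4])            # initial distribution p(z1 | pi)
A = np.array([[0.7, 0.3],
              [0.2, 0.8]])           # transitions A[j, k] = p(z_n=k | z_{n-1}=j)
B = np.array([[0.9, 0.1],
              [0.3, 0.7]])           # emissions B[k, x] = p(x_n=x | z_n=k)

def joint_prob(states, obs):
    """p(X, Z | theta) = p(z1|pi) * prod p(z_n|z_{n-1},A) * prod p(x_n|z_n,B)."""
    p = pi[states[0]] * B[states[0], obs[0]]
    for n in range(1, len(states)):
        p *= A[states[n - 1], states[n]] * B[states[n], obs[n]]
    return p

print(joint_prob([0, 1], [0, 1]))  # 0.6 * 0.9 * 0.3 * 0.7 = 0.1134
```

Summing this joint probability over all K^N latent paths gives the likelihood p(X | θ), which is exactly what the forward-backward recursions on a later slide compute efficiently.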

8. EM Revisited (slide by Ho-sik Seok)
• General EM maximizes the log likelihood function
• Given a joint distribution p(X, Z | Θ) over observed variables X and latent variables Z, governed by parameters Θ:
  1. Choose an initial setting for the parameters Θold
  2. E step: evaluate p(Z | X, Θold)
  3. M step: evaluate Θnew = argmaxΘ Q(Θ, Θold), where Q(Θ, Θold) = ΣZ p(Z | X, Θold) ln p(X, Z | Θ)
  4. If the convergence criterion is not satisfied, set Θold ← Θnew and return to the E step

9. Estimation of HMM Parameters (using maximum likelihood)
• The likelihood function p(X | θ) = ΣZ p(X, Z | θ) is obtained by marginalizing the joint distribution over the latent variables Z
• Direct maximization is intractable, so the EM algorithm is used
  • E step: evaluate the posterior over the latent variables (marginalization over Z)
  • M step: maximize Q(θ, θold) with respect to θ to obtain the updated parameters
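Given the E-step responsibilities γ(zn) = p(zn | X) and ξ(zn-1, zn) = p(zn-1, zn | X) (computed by the forward-backward algorithm of the next slide), the M-step updates are closed-form normalized counts. A sketch, where the function name and the toy responsibility values are assumptions for illustration:

```python
import numpy as np

def m_step(gamma, xi, obs, num_symbols):
    """M-step re-estimation from E-step responsibilities:
    gamma[n, k] = p(z_n = k | X), xi[n, j, k] = p(z_n = j, z_{n+1} = k | X)."""
    pi_new = gamma[0] / gamma[0].sum()
    # Expected transition counts j -> k, normalized over k.
    counts = xi.sum(axis=0)
    A_new = counts / counts.sum(axis=1, keepdims=True)
    # Expected emission counts: accumulate gamma[n] into column obs[n].
    K = gamma.shape[1]
    B_new = np.zeros((K, num_symbols))
    for n, x in enumerate(obs):
        B_new[:, x] += gamma[n]
    B_new /= B_new.sum(axis=1, keepdims=True)
    return pi_new, A_new, B_new

# Toy responsibilities for N = 3, K = 2 (invented, but mutually consistent).
gamma = np.array([[0.5, 0.5], [0.2, 0.8], [0.6, 0.4]])
xi = np.array([[[0.1, 0.4], [0.1, 0.4]],
               [[0.2, 0.0], [0.4, 0.4]]])
obs = [0, 1, 0]
pi_new, A_new, B_new = m_step(gamma, xi, obs, 2)
```

Each update is just "expected count divided by expected total", which is why the result is guaranteed to be a valid set of probability distributions.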

10. Forward-backward Algorithm (probability of observation)
• Probability for a single latent variable: γ(zn) = p(zn | X) = α(zn) β(zn) / p(X)
• α and β are defined recursively: α(zn) = p(xn | zn) Σzn-1 α(zn-1) p(zn | zn-1), and β(zn) = Σzn+1 β(zn+1) p(xn+1 | zn+1) p(zn+1 | zn)
• Used for evaluating the probability of the observation sequence: p(X) = Σzn α(zn) β(zn), for any choice of n
(C) 2007, SNU Biointelligence Lab, http://bi.snu.ac.kr/
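The α and β recursions can be sketched as below. This is the unscaled version, so it underflows for long sequences; a practical implementation would use the scaling factors of Section 13.2.4. The toy parameters are invented for illustration:

```python
import numpy as np

def forward_backward(obs, pi, A, B):
    """Unscaled alpha/beta recursions; returns alpha, beta, and p(X)."""
    N, K = len(obs), len(pi)
    alpha = np.zeros((N, K))
    beta = np.zeros((N, K))
    # Forward pass: alpha(z_n) = p(x_n|z_n) * sum_{z_{n-1}} alpha(z_{n-1}) p(z_n|z_{n-1})
    alpha[0] = pi * B[:, obs[0]]
    for n in range(1, N):
        alpha[n] = B[:, obs[n]] * (alpha[n - 1] @ A)
    # Backward pass: beta(z_n) = sum_{z_{n+1}} beta(z_{n+1}) p(x_{n+1}|z_{n+1}) p(z_{n+1}|z_n)
    beta[-1] = 1.0
    for n in range(N - 2, -1, -1):
        beta[n] = A @ (B[:, obs[n + 1]] * beta[n + 1])
    return alpha, beta, alpha[-1].sum()

# Assumed toy parameters (same 2-state layout as the earlier examples).
pi = np.array([0.6, 0.4])
A = np.array([[0.7, 0.3], [0.2, 0.8]])
B = np.array([[0.9, 0.1], [0.3, 0.7]])
obs = [0, 1, 0]
alpha, beta, lik = forward_backward(obs, pi, A, B)
```

A useful sanity check is that Σzn α(zn) β(zn) gives the same value of p(X) at every time step n.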

11. Sum-product Algorithm (probability of observation)
• Factor graph representation of the HMM
• Running sum-product message passing on the factor graph gives the same result as the forward-backward algorithm

12. The Viterbi Algorithm (most likely state sequence)
• Obtained from the max-sum algorithm applied to the HMM
• Computes the joint probability of the most probable path of latent states, then backtracks to recover the path itself
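The max-sum recursion can be sketched in log space as follows: at each step keep, for every state, the score of the best path ending there plus a backpointer, then backtrack from the best final state. The toy parameters are invented for illustration:

```python
import numpy as np

def viterbi(obs, pi, A, B):
    """Most probable latent path via max-sum (Viterbi) in log space."""
    N, K = len(obs), len(pi)
    logd = np.log(pi) + np.log(B[:, obs[0]])   # best log-score per state
    back = np.zeros((N, K), dtype=int)          # backpointers
    for n in range(1, N):
        scores = logd[:, None] + np.log(A)      # scores[j, k]: come from j, go to k
        back[n] = scores.argmax(axis=0)
        logd = scores.max(axis=0) + np.log(B[:, obs[n]])
    # Backtrack from the best final state.
    path = [int(logd.argmax())]
    for n in range(N - 1, 0, -1):
        path.append(int(back[n][path[-1]]))
    return path[::-1], float(logd.max())

# Assumed toy parameters (2 states, 2 symbols).
pi = np.array([0.6, 0.4])
A = np.array([[0.7, 0.3], [0.4, 0.6]])
B = np.array([[0.9, 0.1], [0.2, 0.8]])
path, logp = viterbi([0, 0, 1], pi, A, B)
print(path)  # [0, 0, 1]
```

Working in log space replaces products with sums, which avoids the same underflow problem that the scaling factors address for forward-backward.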

13. References
• HMM: L. R. Rabiner, "A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition"
• CRF introduction: http://www.inference.phy.cam.ac.uk/hmw26/papers/crf_intro.pdf
