
EE3J2 Data Mining Lecture 15: Hidden Markov Models Martin Russell


Presentation Transcript


  1. EE3J2 Data Mining
Lecture 15: Hidden Markov Models
Martin Russell

  2. Objectives
• Hidden Markov models (HMMs)
• Viterbi decoding
• HMM training

  3. Dynamic Programming Distance Calculation
• Calculate the DP distance d(S, Q) for each sequence S in the corpus
• Sequence retrieval using DP
[Diagram: a corpus of sequential data (AAGDTDTDTDD, AABBCBDAAAAAAA, BABABABBCCDF, GGGGDDGDGDGDGDTDTD, DGDGDGDGD, AABCDTAABCDTAABCDTAAB, CDCDCDTGGG, GGAACDTGGGGGAAA, …) is matched against a 'query' sequence Q such as …BBCCDDDGDGDGDCDTCDTTDCCC…]
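As a concrete sketch of this step, the following is a minimal Levenshtein-style DP distance with a brute-force retrieval loop. The function names and the unit substitution/insertion/deletion costs are illustrative assumptions, not the lecture's exact formulation:

```python
def dp_distance(s, q, sub=1, ins=1, dele=1):
    """DP (edit) distance d(S, Q) between two symbol sequences."""
    m, n = len(s), len(q)
    # d[i][j] = distance between the first i symbols of s and the first j of q
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i * dele
    for j in range(n + 1):
        d[0][j] = j * ins
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if s[i - 1] == q[j - 1] else sub
            d[i][j] = min(d[i - 1][j - 1] + cost,  # substitution / match
                          d[i][j - 1] + ins,       # insertion
                          d[i - 1][j] + dele)      # deletion
    return d[m][n]

def retrieve(corpus, q):
    """Return the corpus sequence closest to the query q under d(S, Q)."""
    return min(corpus, key=lambda s: dp_distance(s, q))
```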

  4. Limitations of 'template matching'
• This type of analysis is sometimes referred to as template matching
• The 'templates' are the sequences in the corpus
• Each template can be thought of as representing a 'class'
• The problem is to determine which class best fits the query
• Performance will depend on precisely which template is used to represent the class

  5. Alternative path shapes
• The basic units of path considered so far are substitution, insertion and deletion
• Other path shapes are possible and may have advantages
[Diagram: the basic substitution, insertion and deletion path shapes, followed by alternative shapes for each]

  6. Example

  7. Hidden Markov Models (HMMs)
• One solution is to replace the individual template sequence with an 'average' sequence
• But what is an 'average' sequence?
• One answer is to use a type of statistical model called a Hidden Markov Model

  8. HMMs
• Suppose the following sequences are in the same class: ABC, YBBC, ABXC, AZ
• Compute alignments:
[Diagram: the sequences YBBC, ABXC and AZ each aligned symbol-by-symbol against ABC]

  9. Finite State Network Representation
• The sequence consists of 3 'states'
• The first state is 'realised' as A (twice) or Y (once)
• The second state is 'realised' as B (three times) or X (once)
• The second state can be repeated or deleted
• The third state is 'realised' as C (twice) or Z (once)
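The state emission probabilities on the next slides come straight from these counts. A small sketch, using the counts as given on the slide (relative frequencies of the symbols aligned to each state):

```python
from collections import Counter

# Symbols aligned to each of the three states, as counted on the slide:
# state 1: A twice, Y once; state 2: B three times, X once; state 3: C twice, Z once.
aligned = [["A", "A", "Y"], ["B", "B", "B", "X"], ["C", "C", "Z"]]

emission = [
    {sym: count / sum(c.values()) for sym, count in c.items()}
    for c in map(Counter, aligned)
]
print(emission)
# approximately: [{'A': 0.67, 'Y': 0.33}, {'B': 0.75, 'X': 0.25}, {'C': 0.67, 'Z': 0.33}]
```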

  10. Network representation
• Directed graph representation
• Each state is associated with a set of probabilities
• These are called the 'state emission' probabilities

  11. Hidden Markov Model (HMM)
Basic rule for drawing transition networks: connect state j to state k if a_jk > 0, where a_jk = Prob(state k follows state j).
[Diagram: a 3-state transition network labelled with the transition probabilities 0.5, 0.67, 1, 0.5, 1 and 0.33]
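This rule is easy to apply programmatically. A minimal sketch, assuming the transition matrix A is given as a nested list or array (the example numbers are assumptions, not the exact values from the slide's diagram):

```python
def network_edges(A):
    """Edges of the transition network: connect state j to state k iff A[j][k] > 0."""
    N = len(A)
    return [(j, k) for j in range(N) for k in range(N) if A[j][k] > 0]

# Illustrative 3-state left-to-right model:
A = [[0.0, 0.67, 0.33],   # state 1 moves to state 2, or skips it
     [0.0, 0.50, 0.50],   # state 2 repeats or moves on
     [0.0, 0.00, 1.00]]   # state 3 is final
print(network_edges(A))   # [(0, 1), (0, 2), (1, 1), (1, 2), (2, 2)]
```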

  12. Formal Definition
A Hidden Markov Model (HMM) for the symbols 1, 2, …, K consists of:
• A number of states N
• An N × N state transition probability matrix A
• For each state j, a set of state emission probabilities p_j(1), …, p_j(K), where p_j(k) is the probability that symbol k occurs in state j
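A minimal container matching this definition, with illustrative numbers loosely based on the 3-state model built from the alignments above (the transition probabilities, the class name and the symbol indexing are assumptions):

```python
import numpy as np

class HMM:
    """N states, an N x N transition matrix A, per-state emission probabilities B."""
    def __init__(self, A, B, pi):
        self.A = np.asarray(A)    # A[j, k] = Prob(state k follows state j)
        self.B = np.asarray(B)    # B[j, s] = p_j(s), prob. that state j emits symbol s
        self.pi = np.asarray(pi)  # initial state distribution

# Symbol indices: 0=A, 1=B, 2=C, 3=X, 4=Y, 5=Z
model = HMM(
    A=[[0.0, 0.67, 0.33],   # state 1 -> state 2, or skip state 2
       [0.0, 0.50, 0.50],   # state 2 repeats or moves on
       [0.0, 0.00, 1.00]],  # state 3 is final
    B=[[0.67, 0.00, 0.00, 0.00, 0.33, 0.00],   # state 1 emits A or Y
       [0.00, 0.75, 0.00, 0.25, 0.00, 0.00],   # state 2 emits B or X
       [0.00, 0.00, 0.67, 0.00, 0.00, 0.33]],  # state 3 emits C or Z
    pi=[1.0, 0.0, 0.0],
)
```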

  13. Alignment paths for HMMs
• For HMMs, alignment paths are called state sequences
[Diagram: a state sequence aligning an observed symbol sequence against the model's states]

  14. The optimal state sequence
• Let M be an HMM and s a sequence
• The probability on the previous slide depends on the state sequence x and the model, so we write it as p(s, x | M)
• By analogy with dynamic programming, the optimal state sequence x* is the sequence such that p(s, x* | M) = max_x p(s, x | M)

  15. Computing the optimal state sequence: the 'state-symbol' trellis
Rule: connect state j at symbol m with state k at symbol m+1 if a_jk > 0
[Diagram: the state-symbol trellis for an example sequence]

  16. More examples

  17. Dynamic Programming, a.k.a. Viterbi Decoding
[Diagram: the Viterbi computation carried out on a state-symbol trellis for an example sequence]
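A minimal sketch of Viterbi decoding for the discrete HMM of slide 12. The function name and array conventions are assumptions, and real implementations usually work with log probabilities to avoid numerical underflow:

```python
import numpy as np

def viterbi(A, B, pi, obs):
    """Optimal state sequence and its joint probability p(obs, path | M)."""
    T, N = len(obs), len(pi)
    delta = np.zeros((T, N))           # best score of any path ending in each state
    psi = np.zeros((T, N), dtype=int)  # back-pointers for the trace-back
    delta[0] = pi * B[:, obs[0]]
    for t in range(1, T):
        scores = delta[t - 1][:, None] * A  # scores[j, k]: come from j, go to k
        psi[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0) * B[:, obs[t]]
    # Trace back from the best final state
    path = [int(delta[-1].argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(psi[t, path[-1]]))
    return path[::-1], float(delta[-1].max())
```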

  18. Viterbi Decoding
• Calculate p(Q|M) for each HMM M in the corpus
• Sequence retrieval using HMMs
[Diagram: a corpus of pre-built HMMs is scored against a 'query' sequence Q such as …BBCCDDDGDGDGDCDTCDTTDCCC…]
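Retrieval then reduces to scoring the query against every model and taking the best. A sketch reusing the `viterbi` function from the previous slide; note that the Viterbi score (the probability of the single best path) is used here as a stand-in for p(Q|M), which strictly sums over all paths:

```python
def retrieve(models, q):
    """Return the name of the HMM that best explains the query sequence q.

    `models` maps a class name to a (A, B, pi) tuple of arrays; the Viterbi
    score approximates p(Q|M).
    """
    return max(models, key=lambda name: viterbi(*models[name], q)[1])
```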

  19.–22. [Diagrams only; slide content not recoverable]

  23. HMM Construction
• Suppose we have a set of HMMs, each representing a different class (e.g. a protein sequence)
• Given an unknown sequence s, use Viterbi decoding to compare s with each HMM and compute p(s | M) for each model M
• But how do we obtain the HMMs in the first place?

  24. HMM training
• Given a set of example sequences S, an HMM M can be built such that p(S|M) is locally maximised
• The procedure is as follows:
• Obtain an initial estimate of a suitable model M0
• Apply an algorithm – the 'Baum-Welch' algorithm – to obtain a new model M1 such that p(S|M1) ≥ p(S|M0)
• Repeat to produce a sequence of HMMs M0, M1, …, Mn with: p(S|M0) ≤ p(S|M1) ≤ p(S|M2) ≤ … ≤ p(S|Mn)
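A minimal sketch of this loop for a discrete HMM and a single training sequence. The vectorised re-estimation below is a standard textbook formulation, not the lecture's code; practical implementations use scaling or log arithmetic to avoid underflow and pool statistics over many training sequences:

```python
import numpy as np

def forward(A, B, pi, obs):
    # alpha[t, j] = p(o_1 .. o_t, state at time t = j | M)
    T, N = len(obs), len(pi)
    alpha = np.zeros((T, N))
    alpha[0] = pi * B[:, obs[0]]
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]
    return alpha

def backward(A, B, obs):
    # beta[t, j] = p(o_{t+1} .. o_T | state at time t = j, M)
    T, N = len(obs), A.shape[0]
    beta = np.zeros((T, N))
    beta[-1] = 1.0
    for t in range(T - 2, -1, -1):
        beta[t] = A @ (B[:, obs[t + 1]] * beta[t + 1])
    return beta

def baum_welch_step(A, B, pi, obs):
    """One re-estimation step; returns the new model and p(obs | old model)."""
    obs = np.asarray(obs)
    alpha, beta = forward(A, B, pi, obs), backward(A, B, obs)
    likelihood = alpha[-1].sum()        # p(obs | M)
    gamma = alpha * beta / likelihood   # gamma[t, j] = p(state_t = j | obs, M)
    # xi[t, j, k] = p(state_t = j, state_{t+1} = k | obs, M)
    xi = (alpha[:-1, :, None] * A[None, :, :]
          * (B[:, obs[1:]].T * beta[1:])[:, None, :]) / likelihood
    new_A = xi.sum(axis=0) / gamma[:-1].sum(axis=0)[:, None]
    new_B = np.stack([gamma[obs == k].sum(axis=0) for k in range(B.shape[1])],
                     axis=1) / gamma.sum(axis=0)[:, None]
    return new_A, new_B, gamma[0], likelihood

def train(A, B, pi, obs, iters=20):
    """Iterate re-estimation; p(obs | M_i) is non-decreasing across iterations."""
    for _ in range(iters):
        A, B, pi, likelihood = baum_welch_step(A, B, pi, obs)
    return A, B, pi
```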

  25. Local optimality
[Diagram: p(S|M) plotted against the model sequence M0, M1, …, Mn; training climbs to a local maximum of p(S|M), which need not be the global maximum]

  26. Summary
• Hidden Markov Models
• Importance of HMMs for sequence matching
• Viterbi decoding
• HMM training
