Learning HMM parameters

Learning HMM parameters. Sushmita Roy BMI/CS 576 www.biostat.wisc.edu/bmi576/ sroy@biostat.wisc.edu Oct 31st, 2013. Recall the three questions in HMMs. Given a sequence of observations, how likely is it that an HMM generated it? Forward algorithm


Presentation Transcript


  1. Learning HMM parameters Sushmita Roy BMI/CS 576 www.biostat.wisc.edu/bmi576/ sroy@biostat.wisc.edu Oct 31st, 2013

  2. Recall the three questions in HMMs • Given a sequence of observations, how likely is it that an HMM generated it? • Forward algorithm • What is the most likely sequence of states that generated a sequence of observations? • Viterbi • How can we learn an HMM from a set of sequences? • Forward-backward or Baum-Welch (an EM algorithm)

  3. Learning HMMs from data • Parameter estimation • If we knew the state sequence it would be easy to estimate the parameters • But we need to work with hidden state sequences • Use “expected” counts of state transitions

  4. Learning without hidden information • Learning is simple if we know the correct path for each sequence in our training set [Figure: HMM with begin state 0, states 1 to 4, and end state 5; the sequence C A G T is shown with its known path 0 2 2 4 4 5] • Estimate parameters by counting the number of times each parameter is used across the training set

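  5. Learning without hidden information • Transition probabilities: the number of transitions from k to l (k, l are states), divided by the total number of transitions out of k • Emission probabilities: the number of times b is emitted from k, divided by the total number of emissions from k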
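
The counting estimate can be sketched in a few lines of Python. This is a minimal illustration rather than the course's code: the function name `estimate_from_paths`, the `"end"` label, and the two labelled training sequences are all made up for the example.

```python
from collections import Counter, defaultdict

def estimate_from_paths(sequences, paths, begin=0, end="end"):
    """Estimate HMM parameters by counting, given the true path of every sequence."""
    trans = defaultdict(Counter)  # trans[k][l]: number of transitions from k to l
    emit = defaultdict(Counter)   # emit[k][b]: number of times b is emitted from k
    for x, path in zip(sequences, paths):
        full = [begin] + list(path) + [end]
        for k, l in zip(full, full[1:]):
            trans[k][l] += 1
        for k, b in zip(path, x):
            emit[k][b] += 1
    # Normalise each row of counts so that it sums to one.
    a = {k: {l: n / sum(row.values()) for l, n in row.items()} for k, row in trans.items()}
    e = {k: {b: n / sum(row.values()) for b, n in row.items()} for k, row in emit.items()}
    return a, e

# Hypothetical training set: two sequences whose state paths are known.
a, e = estimate_from_paths(["CAGT", "CAT"], [[1, 1, 2, 2], [1, 2, 2]])
print(a[1])  # estimated transition probabilities out of state 1
print(e[2])  # estimated emission probabilities of state 2
```

Symbols or transitions that never occur in the training set get no entry here, which is one reason pseudocounts are commonly added in practice.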

  6. Learning with hidden information • If we don’t know the correct path for each sequence in our training set, consider all possible paths for the sequence [Figure: the same HMM with begin state 0, states 1 to 4, and end state 5; the sequence C A G T is shown with an unknown path 0 ? ? ? ? 5] • Estimate parameters through a procedure that counts the expected number of times each parameter is used across the training set

  7. The Baum-Welch algorithm • Also known as the forward-backward algorithm • An Expectation Maximization algorithm • Expectation: Estimate the “expected” number of times there are transitions and emissions (using current values of parameters) • Maximization: Estimate parameters given hidden variables • Hidden variables are the state transition and emission counts

  8. The expectation step • We need to know the probability of the t-th symbol being produced by state k, given sequence x (the posterior probability of state k at time t) • We also need to know the probability of the t-th and (t+1)-th symbols being produced by states k and l, respectively, given sequence x • Given these, we can compute our expected counts for state transitions and character emissions

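  9. Computing the posterior probability of state k at time t • We will do this in a somewhat indirect manner • First we compute the probability of the entire observed sequence with the t-th symbol being generated by state k • This joint probability is the product of two terms: f_k(t) from the forward algorithm and b_k(t) from the backward algorithm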

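  10. Computing the posterior probability of state k at time t (cont) • If we can compute the joint probability of the sequence x and state k at time t • How can we get the posterior probability of state k at time t given x? • Divide by the probability of x, which the forward step provides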
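
The slide's equations did not survive in this transcript. In standard notation the decomposition reads as below. The symbol \pi_t for the hidden state at position t and T for the length of x are assumptions made here; f_k(t) and b_k(t) are the forward and backward values named on the slides.

```latex
\begin{align*}
P(\pi_t = k, x) &= P(x_1 \dots x_t, \pi_t = k)\; P(x_{t+1} \dots x_T \mid \pi_t = k)
                 = f_k(t)\, b_k(t) \\
P(\pi_t = k \mid x) &= \frac{P(\pi_t = k, x)}{P(x)} = \frac{f_k(t)\, b_k(t)}{P(x)}
\end{align*}
```

The factorization in the first line holds because, given the state at position t, the symbols after t are independent of the symbols up to t. P(x) is the value returned by the termination step of the forward algorithm.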

  11. The backward algorithm • The backward algorithm gives us b_k(t), the probability of observing the rest of x, given that we’re in state k after t characters [Figure: example HMM with begin state 0, emitting states 1 to 4, and end state 5, each emitting state with its own emission probabilities over A, C, G, T, shown with the sequence C A G T]

  12. Steps of the backward algorithm • Initialization (t=T) • Recursion (t=T-1 to 1) • Termination
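
The three steps can be sketched as follows. This is a minimal illustration under stated assumptions, not the lecture's code: the function names are hypothetical, positions are 0-based in the code (so `b[t]` holds the slide's b_k(t+1)), and the toy HMM (silent begin state 0, emitting states 1 and 2, silent end state 3) is an assumed example.

```python
def backward(x, a, e, states, end):
    """Return b, where b[t][k] is the probability of the rest of x given state k at position t."""
    T = len(x)
    b = [dict() for _ in range(T)]
    # Initialization (t = T): only the transition into the end state remains.
    for k in states:
        b[T - 1][k] = a[k].get(end, 0.0)
    # Recursion (t = T-1 down to 1): sum over every possible next state l.
    for t in range(T - 2, -1, -1):
        for k in states:
            b[t][k] = sum(a[k].get(l, 0.0) * e[l][x[t + 1]] * b[t + 1][l] for l in states)
    return b

def sequence_probability(x, a, e, states, begin, b):
    """Termination: P(x) is the sum over l of a[begin][l] * e[l][x_1] * b_l(1)."""
    return sum(a[begin].get(l, 0.0) * e[l][x[0]] * b[0][l] for l in states)

# Assumed toy HMM; missing entries in `a` mean the transition is not allowed.
a = {0: {1: 1.0}, 1: {1: 0.2, 2: 0.8}, 2: {2: 0.1, 3: 0.9}}
e = {1: {"A": 0.4, "C": 0.1, "G": 0.1, "T": 0.4},
     2: {"A": 0.1, "C": 0.4, "G": 0.4, "T": 0.1}}

b = backward("TAG", a, e, states=[1, 2], end=3)
px = sequence_probability("TAG", a, e, [1, 2], 0, b)
print(b)
print(px)
```

The termination step should yield the same P(x) as the forward algorithm, which makes a convenient consistency check.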

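  13. Computing the probability of the t-th and (t+1)-th symbols being produced by states k and l, given x • This is obtained from the forward value of state k at t and the backward value of state l at t+1, linked by the transition from k to l and the emission of the (t+1)-th symbol from l, and normalized by the probability of x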
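
Written out, the quantity on this slide takes the following standard form. The symbols a_{kl} (transition probability from k to l), e_l(c) (probability that state l emits c), and \pi_t (hidden state at position t) are assumed notation, since the slide's own equation is missing from the transcript.

```latex
\[
P(\pi_t = k, \pi_{t+1} = l \mid x)
  = \frac{f_k(t)\; a_{kl}\; e_l(x_{t+1})\; b_l(t+1)}{P(x)}
\]
```

Reading left to right: f_k(t) accounts for the first t symbols ending in state k, a_{kl} e_l(x_{t+1}) for the step into l that emits the next symbol, and b_l(t+1) for everything after position t+1.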

  14. Putting it all together • We need the expected number of times c is emitted by state k • And the expected number of times k transitions to l • Both counts are sums over all training sequences
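
A standard way to write these two expected counts is shown below. The names E_k(c) and A_{kl} for the counts, the superscript j for the j-th training sequence, and a_{kl}, e_l(c) for the current parameters are assumptions made here, not notation confirmed by the slides.

```latex
\begin{align*}
E_k(c) &= \sum_{j} \frac{1}{P(x^j)} \sum_{t \,:\, x^j_t = c} f^j_k(t)\, b^j_k(t) \\
A_{kl} &= \sum_{j} \frac{1}{P(x^j)} \sum_{t} f^j_k(t)\; a_{kl}\; e_l(x^j_{t+1})\; b^j_l(t+1)
\end{align*}
```

Each inner sum adds up posterior probabilities over the positions of one sequence. Transitions out of the begin state and into the end state are handled analogously at the sequence boundaries.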

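  15. The maximization step • Estimate new emission parameters by normalizing the expected emission counts of each state • Estimate new transition parameters by normalizing the expected transition counts out of each state • Just like in the simple case, but typically we’ll do some “smoothing” (e.g. add pseudocounts)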
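
As a rough sketch of the normalization with smoothing, the helper below turns a table of expected counts into probabilities. The name `normalize_counts`, the single shared pseudocount, and the count values are hypothetical choices for illustration.

```python
def normalize_counts(counts, pseudocount=0.0):
    """Convert {state: {outcome: expected_count}} into {state: {outcome: probability}}."""
    probs = {}
    for k, row in counts.items():
        # Every outcome in the row receives the same pseudocount before normalizing.
        total = sum(row.values()) + pseudocount * len(row)
        probs[k] = {o: (n + pseudocount) / total for o, n in row.items()}
    return probs

# Hypothetical expected emission counts for one state.
expected = {1: {"A": 3.0, "C": 0.0, "G": 1.0, "T": 0.0}}
smoothed = normalize_counts(expected, pseudocount=1.0)
print(smoothed[1])
```

With a pseudocount of 1, symbols that were never observed (C and T here) still receive non-zero probability, which keeps later iterations from being locked out of them.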

  16. The Baum-Welch algorithm • initialize the parameters of the HMM • iterate until convergence • initialize the expected emission and transition counts with pseudocounts • E-step: for each training set sequence j = 1…n • calculate the forward values for sequence j • calculate the backward values for sequence j • add the contribution of sequence j to the expected counts • M-step: update the HMM parameters using the expected counts
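
One pass of the loop above might look like the sketch below. It is not the lecture's implementation: the function names, the dictionary layout, the 0-based positions, the silent begin and end states, and the toy HMM with training sequences TAG and ACG are assumptions. It uses unscaled probabilities, so it is only suitable for short sequences.

```python
import math

def forward(x, a, e, states, begin, end):
    f = [{k: a[begin].get(k, 0.0) * e[k][x[0]] for k in states}]
    for t in range(1, len(x)):
        f.append({l: e[l][x[t]] * sum(f[t - 1][k] * a[k].get(l, 0.0) for k in states)
                  for l in states})
    return f, sum(f[-1][k] * a[k].get(end, 0.0) for k in states)

def backward(x, a, e, states, end):
    b = [None] * len(x)
    b[-1] = {k: a[k].get(end, 0.0) for k in states}
    for t in range(len(x) - 2, -1, -1):
        b[t] = {k: sum(a[k].get(l, 0.0) * e[l][x[t + 1]] * b[t + 1][l] for l in states)
                for k in states}
    return b

def normalise(table, old):
    new = {}
    for k, row in table.items():
        total = sum(row.values())
        # Keep the old row if a state collected no expected counts at all.
        new[k] = {o: n / total for o, n in row.items()} if total > 0 else dict(old[k])
    return new

def baum_welch_iteration(seqs, a, e, states, begin, end, emit_pseudo=1.0, trans_pseudo=0.0):
    """One E-step and M-step; returns (new_a, new_e, log-likelihood under the old parameters)."""
    # Initialise expected counts with pseudocounts (only for transitions the topology allows).
    A = {k: {l: trans_pseudo for l in a[k]} for k in a}
    E = {k: {c: emit_pseudo for c in e[k]} for k in states}
    loglik = 0.0
    for x in seqs:  # E-step: add the contribution of each non-empty sequence
        f, px = forward(x, a, e, states, begin, end)
        b = backward(x, a, e, states, end)
        loglik += math.log(px)
        last = len(x) - 1
        for l in a[begin]:
            A[begin][l] += a[begin][l] * e[l][x[0]] * b[0][l] / px
        for t, c in enumerate(x):
            for k in states:
                E[k][c] += f[t][k] * b[t][k] / px
                for l in a[k]:
                    if l == end:
                        if t == last:
                            A[k][l] += f[t][k] * a[k][l] / px
                    elif t < last:
                        A[k][l] += f[t][k] * a[k][l] * e[l][x[t + 1]] * b[t + 1][l] / px
    # M-step: normalise the expected counts into new parameters.
    return normalise(A, a), normalise(E, e), loglik

# Assumed toy HMM: begin state 0, emitting states 1 and 2, end state 3.
a = {0: {1: 1.0}, 1: {1: 0.2, 2: 0.8}, 2: {2: 0.1, 3: 0.9}}
e = {1: {"A": 0.4, "C": 0.1, "G": 0.1, "T": 0.4},
     2: {"A": 0.1, "C": 0.4, "G": 0.4, "T": 0.1}}
new_a, new_e, loglik = baum_welch_iteration(["TAG", "ACG"], a, e, [1, 2], begin=0, end=3)
print(new_a)
print(new_e)
```

To iterate until convergence, call `baum_welch_iteration` repeatedly with the returned parameters and stop when the log-likelihood changes by less than a chosen tolerance. For longer sequences, scaled or log-space forward and backward values would be needed to avoid underflow.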

  17. Baum-Welch algorithm example • given • the HMM with the parameters initialized as shown [Figure: HMM with begin state 0, emitting states 1 and 2, and end state 3; emission tables A 0.4 C 0.1 G 0.1 T 0.4 and A 0.1 C 0.4 G 0.4 T 0.1; transition probabilities 1.0, 0.2, 0.8, 0.1, 0.9] • the training sequences TAG, ACG • we’ll work through one iteration of Baum-Welch

  18. Baum-Welch example (cont) • Determining the forward values for TAG • Here we compute just the values that are needed for computing successive values • For example, there is no point in calculating f_1(3) • In a similar way, we also compute forward values for ACG
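
The forward values can be reproduced with a short sketch. The numeric tables on the slides did not survive extraction, so the HMM below is a reading of the flattened figure and should be treated as an assumption: begin state 0 moves to state 1 with probability 1.0, state 1 loops with 0.2 and moves to state 2 with 0.8, and state 2 loops with 0.1 and moves to end state 3 with 0.9. The function name and 0-based positions are also choices made here.

```python
def forward(x, a, e, states, begin, end):
    """Return (f, px); f[t][k] is the probability of x[0..t] with state k emitting x[t]."""
    # Initialization: leave the begin state and emit the first symbol.
    f = [{k: a[begin].get(k, 0.0) * e[k][x[0]] for k in states}]
    # Recursion: extend every partial path by one transition and one emission.
    for t in range(1, len(x)):
        prev = f[-1]
        f.append({l: e[l][x[t]] * sum(prev[k] * a[k].get(l, 0.0) for k in states)
                  for l in states})
    # Termination: move from the last emitting state into the end state.
    px = sum(f[-1][k] * a[k].get(end, 0.0) for k in states)
    return f, px

# Assumed parameters (see the caveat above); missing entries mean "not allowed".
a = {0: {1: 1.0}, 1: {1: 0.2, 2: 0.8}, 2: {2: 0.1, 3: 0.9}}
e = {1: {"A": 0.4, "C": 0.1, "G": 0.1, "T": 0.4},
     2: {"A": 0.1, "C": 0.4, "G": 0.4, "T": 0.1}}

for x in ("TAG", "ACG"):
    f, px = forward(x, a, e, states=[1, 2], begin=0, end=3)
    print(x, f, px)
```

Unlike the slide, this sketch fills in every f_k(t), including values such as f_1(3). Under the assumed topology, state 1 has no transition to the end state, so that value never contributes to P(x).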

  19. Baum-Welch example (cont) • Determining the backward values for TAG • Again, here we compute just the values that are needed • In a similar way, we also compute backward values for ACG

  20. Baum-Welch example (cont) • determining the expected emission counts for state 1: each count is the sum of the contribution of TAG, the contribution of ACG, and a pseudocount • note that the forward/backward values in the two contributions differ; in each they are computed for the sequence associated with that contribution

  21. Baum-Welch example (cont) • determining the expected transition counts for state 1 (not using pseudocounts): each count is the sum of the contribution of TAG and the contribution of ACG • in a similar way, we also determine the expected emission/transition counts for state 2

  22. Baum-Welch example (cont) • determining probabilities for state 1

  23. Summary • Three problems in HMMs • Probability of an observed sequence • Forward algorithm • Most likely path for an observed sequence • Viterbi • Can be used for segmentation of observed sequence • Parameter estimation • Baum-Welch • The backward algorithm is used to compute a quantity needed to estimate the posterior of a state given the entire observed sequence