
Class Hidden Markov Models


Presentation Transcript


  1. Class Hidden Markov Models

  2. Markov Chains
  • First-order Markov process
  • a set of states, S_i
  • transition probabilities, a_ij = P(S_j | S_i)
  • the probability of each state depends only on the previous state
  • trivial constraints (the transitions out of each state must be true probabilities: a_ij >= 0 and Σ_j a_ij = 1)
  [Figure: four-state diagram over the DNA bases A, C, G, T with transition arrows labeled a_AA, a_AC, a_AG, a_AT, ...]

  3. Markov Chains
  • Probabilities
  • for a sequence of observed states O = (s_0, s_1, ..., s_T),
    P(O) = P(s_T | s_{T-1}) P(s_{T-1} | s_{T-2}) ... P(s_1 | s_0) P(s_0)
  • in general, for the Markov chain,
    P(O) = P(s_0) Π_{t=1..T} a_{s_{t-1} s_t}
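A minimal Python sketch of this calculation; the transition matrix and initial distribution below are illustrative placeholders, not values from the slides:

```python
import numpy as np

# Hypothetical first-order Markov chain over DNA bases (values are illustrative).
states = "ACGT"
idx = {c: i for i, c in enumerate(states)}

# a[i, j] = P(next state = j | current state = i); each row sums to 1.
a = np.array([
    [0.30, 0.20, 0.30, 0.20],
    [0.25, 0.25, 0.25, 0.25],
    [0.20, 0.30, 0.30, 0.20],
    [0.25, 0.20, 0.25, 0.30],
])
start = np.array([0.25, 0.25, 0.25, 0.25])  # P(s_0)

def log_prob(seq):
    """log P(O) = log P(s_0) + sum_t log a[s_{t-1}, s_t]."""
    lp = np.log(start[idx[seq[0]]])
    for prev, cur in zip(seq, seq[1:]):
        lp += np.log(a[idx[prev], idx[cur]])
    return lp

print(log_prob("ACGTACGGT"))
```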

  4. Markov Chains
  • What is it good for?
  • model comparison – for a given model (set of transition probabilities), what is the probability of seeing the observed data?
  • the best model (maximum likelihood model) can be found by simply counting the observed transitions in a sufficiently large set of data
  • generating random sequences according to specific background models
  • birth/death processes – probability of extinction
  • branching processes (trees)
  • Markov chain Monte Carlo (MCMC)
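A short sketch of the "count the observed transitions" idea; the pseudocount used to avoid zero probabilities is an assumption of this example, not something the slide specifies:

```python
import numpy as np

states = "ACGT"
idx = {c: i for i, c in enumerate(states)}

def fit_markov_chain(sequences, pseudocount=1.0):
    """Maximum-likelihood transition matrix from observed transition counts.

    The small pseudocount keeps unseen transitions from getting probability
    zero (an assumption for this sketch).
    """
    counts = np.full((4, 4), pseudocount)
    for seq in sequences:
        for prev, cur in zip(seq, seq[1:]):
            counts[idx[prev], idx[cur]] += 1
    # Normalize each row so the outgoing transitions are true probabilities.
    return counts / counts.sum(axis=1, keepdims=True)

a_hat = fit_markov_chain(["ACGTACGT", "GGCATTACG", "TTGCA"])
print(np.round(a_hat, 3))
```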

  5. Hidden Markov Model
  • A more complicated model
  • let each state emit a character, c_k, according to a set of emission probabilities, b_ik, where b_ik = P(c_k | S_i)
  • for a set of observed characters O = (o_0, ..., o_T) and states S = (s_0, ..., s_T), the joint probability is
    P(O, S) = P(s_0) b_{s_0}(o_0) Π_{t=1..T} a_{s_{t-1} s_t} b_{s_t}(o_t)

  6. Hidden Markov Model
  • Two-state model for sequences
  • maybe S0 is exon and S1 is intron
  • what if you have a set of observed characters, but you want to know what the state is (or the most likely state)?
  • the state information is the hidden part of the hidden Markov model
  [Figure: two-state model with emission probabilities
    S0: P(A) = 0.148, P(C) = 0.334, P(G) = 0.365, P(T) = 0.154
    S1: P(A) = 0.262, P(C) = 0.247, P(G) = 0.238, P(T) = 0.253
   and an example sequence CGCTTAGCTATCGCATTCGGCTACGATCTAGCTACGTAGCTATGCCGATGCATTATTACGCGATCTCGATCG
   with a hidden state (S0 or S1) underlying each observed character, e.g. C G C T T A over S0 S0 S0 S1 S1 S1.]
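A small sketch of the joint probability P(O, S) for this two-state model; the emission probabilities are taken from the slide, while the transition and initial probabilities are illustrative placeholders:

```python
import numpy as np

# Emission probabilities from the two-state slide (S0 ~ "exon", S1 ~ "intron").
b = np.array([
    [0.148, 0.334, 0.365, 0.154],   # S0: P(A), P(C), P(G), P(T)
    [0.262, 0.247, 0.238, 0.253],   # S1
])
# Transition and initial probabilities are NOT given on the slide;
# these are placeholders for illustration only.
a = np.array([[0.9, 0.1],
              [0.1, 0.9]])
omega = np.array([0.5, 0.5])
idx = {c: i for i, c in enumerate("ACGT")}

def log_joint(obs, path):
    """log P(O, S) = log omega_{s0} b_{s0}(o0) + sum_t log a_{s_{t-1} s_t} b_{s_t}(o_t)."""
    lp = np.log(omega[path[0]]) + np.log(b[path[0], idx[obs[0]]])
    for t in range(1, len(obs)):
        lp += np.log(a[path[t - 1], path[t]]) + np.log(b[path[t], idx[obs[t]]])
    return lp

# The slide's example: characters C G C T T A over states S0 S0 S0 S1 S1 S1.
print(log_joint("CGCTTA", [0, 0, 0, 1, 1, 1]))
```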

  7. Hidden Markov Model
  • Rabiner's three basic problems
  • what is the probability of an observed sequence, O, given a model? (evaluation)
    joint probability of observations and state sequence – P(O, S | θ)
    as useful as a Markov chain
  • what is the optimal sequence of states that "explains" the observed data? (decoding)
    what is the optimality criterion?
  • how can one adjust the model parameters to maximize the probability of the observed data given the model? (learning)

  8. Hidden Markov Model
  • Evaluation
  • for a model with parameters θ = (Q, A, B, ω), what is P(O | Q, A, B, ω), or more simply, P(O | θ)?
  • model parameters:
    Q, a set of states
    A, the state transition matrix
    B, the emission probabilities
    ω, an initial probability distribution
  • observations O = (o_0, ..., o_T)

  9. Hidden Markov Model
  • Evaluation
  • assume the observations are independent given the states:
    P(O | π, θ) = Π_{t=0..T} b_{π_t}(o_t)
  • the probability of a particular state sequence (or path), π, is
    P(π | θ) = ω_{π_0} Π_{t=1..T} a_{π_{t-1} π_t}
  • so P(O | θ) = Σ_π P(O | π, θ) P(π | θ)
  • there are N^T state sequences and O(T) calculations per sequence, so the brute-force complexity is O(TN^T)
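For very short sequences the brute-force sum over all N^T paths can be written out directly, which makes the O(TN^T) cost concrete; the parameters below are the same illustrative placeholders used above:

```python
import itertools
import numpy as np

# Tiny illustrative HMM (numbers are placeholders, not from the slides).
a = np.array([[0.9, 0.1], [0.1, 0.9]])          # transitions
b = np.array([[0.148, 0.334, 0.365, 0.154],     # emissions, rows = states
              [0.262, 0.247, 0.238, 0.253]])
omega = np.array([0.5, 0.5])
idx = {c: i for i, c in enumerate("ACGT")}

def brute_force_likelihood(obs):
    """P(O | theta) = sum over all N^T paths of P(O | path, theta) P(path | theta).

    Only feasible for very short sequences; the forward algorithm computes the
    same quantity in O(N^2 T).
    """
    o = [idx[c] for c in obs]
    total = 0.0
    for path in itertools.product(range(2), repeat=len(o)):
        p = omega[path[0]] * b[path[0], o[0]]
        for t in range(1, len(o)):
            p *= a[path[t - 1], path[t]] * b[path[t], o[t]]
        total += p
    return total

print(brute_force_likelihood("CGCTTA"))
```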

  10. Hidden Markov Model
  • Forward algorithm
  • α_t(i) = P(o_0, ..., o_t, s_t = S_i | θ) is the probability of observing the partial sequence (o_0, ..., o_t) and being in state S_i at time t
  • initialization: α_0(i) = ω_i b_i(o_0)
  • recursion: α_{t+1}(j) = [ Σ_i α_t(i) a_ij ] b_j(o_{t+1})
  • termination: P(O | θ) = Σ_i α_T(i)
  • complexity O(N^2 T)
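A sketch of the forward recursion as written above, reusing the illustrative two-state parameters:

```python
import numpy as np

def forward(obs, a, b, omega):
    """Forward algorithm: alpha[t, i] = P(o_0..o_t, s_t = i | theta).

    a: (N, N) transition matrix, b: (N, M) emission matrix,
    omega: (N,) initial distribution, obs: list of symbol indices.
    Returns alpha and P(O | theta); cost is O(N^2 T) rather than O(TN^T).
    """
    T, N = len(obs), a.shape[0]
    alpha = np.zeros((T, N))
    alpha[0] = omega * b[:, obs[0]]                   # initialization
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ a) * b[:, obs[t]]  # recursion
    return alpha, alpha[-1].sum()                     # termination

# Example with the illustrative two-state parameters used earlier.
a = np.array([[0.9, 0.1], [0.1, 0.9]])
b = np.array([[0.148, 0.334, 0.365, 0.154],
              [0.262, 0.247, 0.238, 0.253]])
omega = np.array([0.5, 0.5])
idx = {c: i for i, c in enumerate("ACGT")}
alpha, likelihood = forward([idx[c] for c in "CGCTTA"], a, b, omega)
print(likelihood)
```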

  11. Hidden Markov Model
  • Backward algorithm
  • almost the same as the forward algorithm
  • β_t(i) = P(o_{t+1}, ..., o_T | s_t = S_i, θ) is the probability of observing the partial sequence (o_{t+1}, ..., o_T) given state S_i at time t
  • recursion: β_t(i) = Σ_j a_ij b_j(o_{t+1}) β_{t+1}(j)
  • with initial condition β_T(i) = 1
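A matching sketch of the backward recursion:

```python
import numpy as np

def backward(obs, a, b):
    """Backward algorithm: beta[t, i] = P(o_{t+1}..o_T | s_t = i, theta).

    Same inputs as the forward sketch above. As a sanity check,
    sum_i omega_i * b[i, obs[0]] * beta[0, i] equals P(O | theta)
    from the forward pass.
    """
    T, N = len(obs), a.shape[0]
    beta = np.zeros((T, N))
    beta[-1] = 1.0                                    # initial condition beta_T(i) = 1
    for t in range(T - 2, -1, -1):
        beta[t] = a @ (b[:, obs[t + 1]] * beta[t + 1])
    return beta
```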

  12. Hidden Markov Model
  • Problem 2: Decoding
  • what is the optimal sequence of states that "explains" the observed data?
  • optimality criteria:
    the path π that maximizes the correct number of individual states, i.e., the path where the states are individually most likely
    the most probable single path, maximizing P(π | O, θ), or equivalently P(π, O | θ)
  • Viterbi algorithm
  • the optimal path is the path that maximizes P(π, O | θ); let δ_t(i) be the probability of the highest-probability path ending in state i at time t
    δ_0(i) = ω_i b_i(o_0)
    δ_{t+1}(j) = [ max_i δ_t(i) a_ij ] b_j(o_{t+1})
  • keep track of the argument that maximizes δ_{t+1}(j) at each position t (a traceback pointer ψ_{t+1}(j) = argmax_i δ_t(i) a_ij)
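A sketch of Viterbi decoding in log space, keeping the traceback pointers described above:

```python
import numpy as np

def viterbi(obs, a, b, omega):
    """Viterbi decoding: the single most probable state path, in log space."""
    T, N = len(obs), a.shape[0]
    log_a, log_b = np.log(a), np.log(b)
    delta = np.zeros((T, N))           # delta[t, j] = best log-prob of a path ending in state j
    psi = np.zeros((T, N), dtype=int)  # psi[t, j] = best previous state (traceback pointer)
    delta[0] = np.log(omega) + log_b[:, obs[0]]
    for t in range(1, T):
        scores = delta[t - 1][:, None] + log_a        # scores[i, j] = delta[t-1, i] + log a_ij
        psi[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0) + log_b[:, obs[t]]
    # Backtrack from the best final state.
    path = [int(delta[-1].argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(psi[t, path[-1]]))
    return path[::-1], delta[-1].max()
```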

  13. Hidden Markov Model
  • Problem 3: Learning
  • find the parameters, θ, that maximize P(O | θ)
  • no analytical solution; requires an iterative solution (the Baum-Welch algorithm)
  • 1. start from an initial model θ
  • 2. compute updated parameters θ' based on θ and the observations O
  • 3. if the improvement in likelihood is below a chosen threshold, stop
  • 4. else, accept θ' as the new θ and go to 2
  • with the Baum-Welch algorithm the likelihood is proven to be greater than or equal to its previous value at each step

  14. Hidden Markov Model
  • Training
  • need to update the transition probabilities
  • the probability of being in state i at time t and state j at time t+1 is
    ξ_t(i, j) = P(s_t = i, s_{t+1} = j | O, θ) = α_t(i) a_ij b_j(o_{t+1}) β_{t+1}(j) / P(O | θ)
  • the probability of being in state i at time t, given the observed sequence O, is
    γ_t(i) = P(s_t = i | O, θ) = α_t(i) β_t(i) / P(O | θ)
  • or, in terms of ξ, γ_t(i) = Σ_j ξ_t(i, j)

  15. Hidden Markov Model
  • Training
  • Derived quantities:
    expected number of times state i is used: Σ_t γ_t(i)
    expected number of transitions from state i to state j: Σ_t ξ_t(i, j)
  • Baum-Welch parameter updates:
    ω_i = γ_0(i), the probability of starting in state i
    a_ij = Σ_t ξ_t(i, j) / Σ_t γ_t(i)
    b_i(c_k) = Σ_{t : o_t = c_k} γ_t(i) / Σ_t γ_t(i)
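A sketch of one Baum-Welch re-estimation step built from the ξ and γ quantities above, assuming a single training sequence for simplicity:

```python
import numpy as np

def baum_welch_step(obs, a, b, omega):
    """One Baum-Welch (EM) re-estimation step for a single observed sequence.

    obs is a list of symbol indices; a, b, omega are the current parameters.
    Returns updated (a, b, omega) and the current log-likelihood.
    """
    T, N = len(obs), a.shape[0]
    # Forward and backward passes (same recursions as the earlier sketches).
    alpha = np.zeros((T, N)); beta = np.zeros((T, N))
    alpha[0] = omega * b[:, obs[0]]
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ a) * b[:, obs[t]]
    beta[-1] = 1.0
    for t in range(T - 2, -1, -1):
        beta[t] = a @ (b[:, obs[t + 1]] * beta[t + 1])
    p_obs = alpha[-1].sum()
    # gamma[t, i] = P(s_t = i | O); xi[t, i, j] = P(s_t = i, s_{t+1} = j | O).
    gamma = alpha * beta / p_obs
    xi = (alpha[:-1, :, None] * a[None, :, :] *
          (b[:, obs[1:]].T * beta[1:])[:, None, :]) / p_obs
    # Parameter updates from the expected counts.
    new_omega = gamma[0]
    new_a = xi.sum(axis=0) / gamma[:-1].sum(axis=0)[:, None]
    new_b = np.zeros_like(b)
    for k in range(b.shape[1]):
        mask = (np.array(obs) == k)
        new_b[:, k] = gamma[mask].sum(axis=0) / gamma.sum(axis=0)
    return new_a, new_b, new_omega, np.log(p_obs)
```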

  16. Class Hidden Markov Model
  • Why?
  • the basic HMM has no way to include the known states of classified training data in the optimization; it is generally trained in an unsupervised learning approach
  • HMMs are not discriminative models
  • How do you discriminate?
  • separate the training data into classes and train one model per class
  • for any set of observations, compare the probabilities of the observed data given each model and choose the best model (a likelihood ratio, for example)
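A sketch of this likelihood-ratio style discrimination, assuming two already-trained models (each a transition matrix, emission matrix, and initial distribution); the scaled forward pass is used here only to avoid numerical underflow on long sequences:

```python
import numpy as np

def log_forward(obs, a, b, omega):
    """log P(O | theta) via a scaled forward pass."""
    alpha = omega * b[:, obs[0]]
    log_p = np.log(alpha.sum())
    alpha = alpha / alpha.sum()
    for t in range(1, len(obs)):
        alpha = (alpha @ a) * b[:, obs[t]]
        log_p += np.log(alpha.sum())
        alpha = alpha / alpha.sum()
    return log_p

def classify(obs, model_exon, model_intron):
    """Choose the class whose HMM gives the observations the higher likelihood.

    Each model is a hypothetical (a, b, omega) tuple; the returned difference
    is the log-likelihood ratio.
    """
    ll_exon = log_forward(obs, *model_exon)
    ll_intron = log_forward(obs, *model_intron)
    return ("exon" if ll_exon > ll_intron else "intron"), ll_exon - ll_intron
```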

  17. Class Hidden Markov Model
  • Hidden Markov Models for Labeled Sequences, Anders Krogh, ISMB 1994
  • assume you have a sequence of observed symbols, s = (s_0, s_1, ..., s_T), and a sequence of (observed) classes, c = (c_0, c_1, ..., c_T)
  • each state, at time t, emits an observed symbol and an observed class (label); the classes are treated similarly to the emission probabilities: each state has a probability of emitting character a and class label x
  • most often a state will emit a single class, i.e., for two states S_x and S_y, P(x | S_x) = 1 and P(y | S_x) = 0, while P(x | S_y) = 0 and P(y | S_y) = 1
  • what is the probability of the class labels given this model? P(c | s, θ) = P(c, s | θ) / P(s | θ)
  • P(s | θ) is the basic HMM calculation described earlier and is solved using the forward/backward algorithm
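A sketch of a label-restricted forward pass for the special case the slide describes, where each state emits exactly one class; the name state_class is a hypothetical mapping from state index to its class label, and the assumption is that restricting the sum to label-consistent paths yields the numerator P(c, s | θ):

```python
import numpy as np

def constrained_forward(obs, labels, a, b, omega, state_class):
    """Forward pass restricted to paths consistent with the observed class labels.

    obs: symbol indices; labels: class label at each position;
    state_class[i]: the single class emitted by state i (special case only).
    Returns the restricted sum over paths; dividing by the unconstrained
    forward probability gives P(c | s, theta).
    """
    T, N = len(obs), a.shape[0]
    # mask[t, i] = 1 if state i is allowed at time t (its class matches the label).
    mask = np.array([[1.0 if state_class[i] == labels[t] else 0.0
                      for i in range(N)] for t in range(T)])
    alpha = omega * b[:, obs[0]] * mask[0]
    for t in range(1, T):
        alpha = (alpha @ a) * b[:, obs[t]] * mask[t]
    return alpha.sum()
```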

  18. Class Hidden Markov Model
  • In the general case, multiple paths through the model can give the same labeling.
  • In the special case where each state emits only a single class label, you can use Viterbi to find the most probable class labeling of the sequence s.
  • Maximum likelihood parameter estimates

  19. Class Hidden Markov Model

  20. Class Hidden Markov Model
  • Calculating the gradient

  21. Class Hidden Markov Model
  • Calculating the gradient
  • basically the same as
  • the overall likelihood and optimization

  22. Class Hidden Markov Model
  • Overall gradient of the likelihood
  • Intuitively:
    permissible paths – the state paths consistent with the labeling given in φ
    m_k is the number of times θ_k is used in the permissible paths
    computed with a slightly modified forward/backward in which α_t and β_t are zero except along the permissible paths
    n_k is the number of times θ_k is used in all possible paths
    computed with the standard forward/backward algorithm
  • a straightforward application of EM (Baum-Welch) is likely to give negative probabilities (eq. 16)
  • Krogh proposes an iterative method, including multiple training sequences, μ

  23. References
  • To learn about HMMs:
  • Rabiner L. A tutorial on hidden Markov models and selected applications in speech recognition. Proc. IEEE 77, 257-286, 1989.
    This is the ultimate origin; everything comes back to here, including the basic nomenclature and the definition of the three problems.
  • Durbin R, Eddy S, Krogh A, Mitchison G. Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids. Cambridge University Press, 1998.
    Detailed applications to sequence alignment, phylogenetic trees, RNA folding, and other biological models.
  • Mann T. Numerically stable hidden Markov model implementation. http://bozeman.genome.washington.edu/compbio/mbt599_2006/hmm_scaling_revised.pdf, 2006.
    Detailed pseudocode for implementing HMM calculations in log space to avoid numerical underflow. Very useful if you are writing your own code.
  • De Fonzo V, Aluffi-Pentini F, Parisi V. Hidden Markov Models in Bioinformatics. Current Bioinformatics 2, 49-61, 2007.
    A more recent review that gives a clear outline of the algorithms and many references to more recent applications in computational biology.
  • Kanungo T. Hidden Markov Models. http://www.kanungo.com/software/hmmtut.pdf.
  • Kanungo T. UMDHMM: hidden Markov model toolkit. In "Extended Finite State Models of Language", Kornai A (ed), Cambridge University Press, 1999. Software download: http://www.kanungo.com/software/software.html#umdhmm
