Viterbi Sequences and Polytopes Eric Kuo Dept. Computer Science UC Berkeley
Problem • Given a sequence of n states from a Markov chain • For which Markov chains would that sequence be the path of highest probability?
Viterbi Path • The Viterbi path of length n of a Markov chain is a path of n transitions with the greatest probability • Path S is a Viterbi path iff for all other paths T of length n, Pr[S] ≥ Pr[T]
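For small chains the definition can be checked by brute force. The sketch below is illustrative and not taken from the slides: the names `path_probability` and `viterbi_paths`, the explicit initial distribution `init`, and the tolerance `tol` are assumptions, and a path is represented as a string of `length` states.

```python
from itertools import product

def path_probability(path, init, trans):
    """Probability of a state sequence: init[first state] times each transition probability."""
    p = init[path[0]]
    for a, b in zip(path, path[1:]):
        p *= trans[a][b]
    return p

def viterbi_paths(num_states, length, init, trans, tol=1e-12):
    """Return every state sequence of the given length whose probability is maximal."""
    paths = list(product(range(num_states), repeat=length))
    probs = [path_probability(p, init, trans) for p in paths]
    best = max(probs)
    return ["".join(map(str, p)) for p, q in zip(paths, probs) if best - q <= tol]

# Hypothetical 2-state chain that always starts in state 0 and prefers to alternate.
init = [1.0, 0.0]
trans = [[0.2, 0.8],
         [0.9, 0.1]]
print(viterbi_paths(2, 4, init, trans))  # ['0101']
```

Enumeration costs k^length evaluations, so this is only a way to experiment with the definition on tiny examples.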
Viterbi Sequences • Two paths are equivalent if they have an equal set of transitions (e.g. 0100, 0010) • A path P is a Viterbi sequence if there exists a Markov chain whose Viterbi paths are either P or equivalent to P
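The equivalence test can be sketched in a few lines. This assumes that "equal set of transitions" means each transition type occurs the same number of times in both paths (a multiset comparison); the function names are hypothetical.

```python
from collections import Counter

def transition_counts(path):
    """Count how often each transition (a, b) occurs along the path."""
    return Counter(zip(path, path[1:]))

def equivalent(p, q):
    """Two paths are treated as equivalent when their transition counts agree."""
    return transition_counts(p) == transition_counts(q)

print(equivalent("0100", "0010"))  # True: both use 00, 01, 10 once each
print(equivalent("0011", "0001"))  # False: 0011 uses 11, 0001 uses 00 twice
```

Equivalent paths have the same product of transition probabilities under every Markov chain, which is why they are grouped together.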
Why is 0011 not a Viterbi Sequence? • If p00 > p11 then Pr[0001] > Pr[0011] • If p00 < p11 then Pr[0111] > Pr[0011] • 0011 can be a Viterbi path only if p00 = p11, but then Pr[0111] = Pr[0011] = Pr[0001], so it ties with paths that are not equivalent to it
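The comparison can be written out explicitly. In the worked step below, \pi_0 (the probability of starting in state 0) is notation assumed here, not used in the slides, and p_{00}, p_{01}, p_{11} are assumed positive.

```latex
\begin{align*}
\Pr[0001] &= \pi_0\, p_{00}^2\, p_{01}, \\
\Pr[0011] &= \pi_0\, p_{00}\, p_{01}\, p_{11}, \\
\Pr[0111] &= \pi_0\, p_{01}\, p_{11}^2, \\
\frac{\Pr[0001]}{\Pr[0011]} &= \frac{p_{00}}{p_{11}}, \qquad
\frac{\Pr[0111]}{\Pr[0011]} = \frac{p_{11}}{p_{00}}.
\end{align*}
```

Unless p_{00} = p_{11}, one of the two ratios exceeds 1, so 0011 is beaten by 0001 or by 0111.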
Agenda • Properties and Structure of Viterbi Sequences • Bound, periodicity of number of Viterbi Sequences • Partitioning the Space of Markov Chains according to their Viterbi sequences • Viterbi Polytopes
Viterbi Sequence Properties • Within a Viterbi sequence: • The subsequences of length n beginning with state A, ending with state B are all equivalent • If A…A and B…B are two subsequences of equal length, then those subsequences are equivalent
Viterbi Sequence Structure • Each Viterbi sequence can be rearranged so that it can be divided into three parts: • Prefix • Periodic interior • Suffix • 02111111120 • 01212121120
Periodic Interior • If S is a Viterbi sequence of length k(mk+1), then S can be rearranged to have a periodic interior that lasts for at least m+1 periods.
Prefixes & Suffixes • If a Viterbi sequence has an arbitrarily long uninterrupted periodic section with period p, then the prefix has at most kp transitions, and the suffix has at most kp transitions • However, their combined lengths may not exceed kp+k–2p
Bound on Viterbi Sequences • The number of Viterbi sequences of length n is bounded, even as n tends to infinity • Bound:
Periodicity of number of Viterbi Sequences • Let Vk(n) be the number of k-state Viterbi sequences of length n • Let M = LCM(1, 2, ..., k-1, k) • When n > M + k², Vk(n+M) ≥ Vk(n) • The sequence {Vk(n)} (k fixed, n > 0) is eventually periodic with period M
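The period M and the length threshold are easy to tabulate for small k. The sketch below assumes the threshold reads n > M + k² (k squared); the function names are hypothetical.

```python
from functools import reduce
from math import gcd

def lcm_up_to(k):
    """M = LCM(1, 2, ..., k)."""
    return reduce(lambda a, b: a * b // gcd(a, b), range(1, k + 1), 1)

def periodicity_threshold(k):
    """Value M + k^2; the inequality on the slide is stated for n above it."""
    return lcm_up_to(k) + k * k

for k in (2, 3, 4):
    print(k, lcm_up_to(k), periodicity_threshold(k))
# 2 2 6
# 3 6 15
# 4 12 28
```

For k = 3 this gives M = 6, which matches the six polytope structures reported for 3-state chains, one per sequence length modulo 6.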
Proof that Vk(n+M) ≥ Vk(n) • Let S be a length n Viterbi sequence, and PS be a length M subsequence of the interior of S • Pr[S] > Pr[T] and Pr[PS] ≥ Pr[PT] • If we extend the interiors of S and T by length M to form S’ and T’, then Pr[S’] > Pr[T’] • S’ is a Viterbi sequence of length n+M
Space of k-state Markov Chains • Each dimension represents transition probability (or initial distribution) • Two Markov chains are in the same region if they have the same set of Viterbi paths • Regions with positive measure correspond to Viterbi sequences
Subspaces of 2-state MCs, odd number of transitions • [Figure: two panels, "Initial state=0" and "Initial state=1", each with axes p00 and p10 and origin (0,0); regions labeled 010100, 101010, 010101, 000000, 100000, 110101, 011111, 111111]
Linearized Space • Each dimension represents logarithm of probability (or weight) of transition • Remove condition that probabilities sum to 1 • Minimum weight sequences are Viterbi sequences • Introduces new Viterbi sequences such as 001
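A small experiment shows how 001 appears once the sum-to-1 condition is dropped. The sketch below assumes each weight plays the role of a negative log-probability, so that minimum weight corresponds to maximum probability; the weight matrix and function names are hypothetical.

```python
from itertools import product

def sequence_weight(seq, weight):
    """Total weight of a state sequence: the sum of its transition weights."""
    return sum(weight[a][b] for a, b in zip(seq, seq[1:]))

def min_weight_sequences(num_states, length, weight):
    """Return every state sequence of the given length with minimum total weight."""
    seqs = list(product(range(num_states), repeat=length))
    totals = [sequence_weight(s, weight) for s in seqs]
    best = min(totals)
    return ["".join(map(str, s)) for s, t in zip(seqs, totals) if t == best]

# Hypothetical weights: weight[a][b] is the cost of transition a -> b.
weight = [[2, 1],
          [3, 3]]
print(min_weight_sequences(2, 3, weight))  # ['001']
```

For 001 to beat 000, 010 and 011, the weights must satisfy w01 < w00 < w10 and w00 < w11. With probabilities this would need p01 > p00 > p10 and p00 > p11, which forces p00 < 1/2 and then p10 + p11 < 1, so no Markov chain realizes it.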
Viterbi Polytopes • Each coordinate represents number of transitions of type 00, 01, 10, etc. • 011100101 corresponds to (1,3,2,2) • Plot each point representing a sequence of length n, and take convex hull to form the polytope • Consider only sequences beginning with state 0
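The coordinates can be computed directly from a sequence. The sketch below assumes the coordinate order (00, 01, 10, 11), which reproduces the example above, and uses hypothetical function names. It only generates the points; taking the convex hull is left to a dedicated tool.

```python
from itertools import product

def count_vector(seq, num_states=2):
    """Number of transitions of each type, ordered 00, 01, 10, 11 for two states."""
    counts = [0] * (num_states * num_states)
    for a, b in zip(seq, seq[1:]):
        counts[int(a) * num_states + int(b)] += 1
    return tuple(counts)

def candidate_points(length, num_states=2):
    """Distinct count vectors of all sequences of the given length that begin with state 0."""
    points = set()
    for tail in product(range(num_states), repeat=length - 1):
        seq = "0" + "".join(map(str, tail))
        points.add(count_vector(seq, num_states))
    return points

print(count_vector("011100101"))  # (1, 3, 2, 2)
```

Equivalent sequences map to the same point, so the polytope is built from equivalence classes rather than from individual sequences.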
Viterbi Polytopes • Vertices of polytope represent the Viterbi sequences • The polytope is the dual to the linearized space of Markov chains • If two vertices share an edge in the polytope, then the corresponding Viterbi regions share a boundary
2-State Viterbi Polytope • [Figure: polytope with vertices labeled 011111, 011110, 000000, 000001, 010110, 010100, 010101]
3-State Viterbi Polytopes • Eight-dimensional polytope • Eight possible periods: 00000, 11111, 22222, 01010, 02020, 121212, 0120120, 0210210 • Six different polytope structures arise, one for each sequence length modulo 6
Four-state Viterbi Polytopes • 1789 vertices, if 3 divides n; otherwise 1777 vertices • Enumerated with polymake • Needed to use fact that the sequences that end with same state form a face of the polytope
Open Topics • Viterbi cycles (instead of paths) • Tighter bound for number of k-state Viterbi sequences of length n • More efficient algorithms for enumerating Viterbi sequences, or determining whether a given sequence can be a Viterbi sequence
2-State Viterbi Sequences • Length 8: 00000000, 01010100, 01010101, 01111111, 11111111, 10101011, 10101010, 10000000 • Length 9: 000000000, 010101010, 010101011, 011111111, 111111111, 101010101, 101010100, 100000000
Space of 2-State Markov Chains • Sequences with odd number of transitions • [Figure: space with axes p0, p10, p00]
Subspaces of 2-state MCs, even number of transitions • [Figure: two panels, "Initial state=0" and "Initial state=1", each with axes p00 and p10 and origin (0,0); regions labeled 10100, 01010, 10101, 00000, 10000, 01011, 01111, 11111]
Space of 2-State Markov Chains • For sequences with even number of transitions • [Figure: space with axes p0, p10, p00]