Algebraic Statistics for Computational Biology, Lior Pachter and Bernd Sturmfels
Ch. 5: Parametric Inference, R. Mihaescu
Presentation: Angelina Vidali
Algebraic & Geometric Algorithms in Molecular Biology
Instructor: I. Emiris
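Tropical arithmetic
The tropical semiring $(\mathbb{R} \cup \{\infty\}, \oplus, \odot)$, with $x \oplus y = \min(x, y)$ and $x \odot y = x + y$, is a convenient algebraic structure for stating dynamic programming algorithms.
The polytope algebra $(\mathcal{P}_d, \oplus, \odot)$ is its natural higher-dimensional generalization:
• $P \oplus Q = \operatorname{conv}(P \cup Q)$ (convex hull)
• $P \odot Q = P + Q$ (Minkowski sum)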
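To make the link to dynamic programming concrete, here is a minimal Python sketch of the two tropical operations and of a matrix product in which (+, ×) is replaced by (min, +). The helper names `t_add`, `t_mul`, `t_matmul` and the edge weights in `W` are illustrative assumptions, not taken from the slides.

```python
import math

INF = math.inf  # tropical additive identity: min(x, INF) == x


def t_add(x, y):
    """Tropical sum: x (+) y = min(x, y)."""
    return min(x, y)


def t_mul(x, y):
    """Tropical product: x (.) y = x + y."""
    return x + y


def t_matmul(A, B):
    """Matrix product with ordinary (+, *) replaced by (min, +)."""
    rows, inner, cols = len(A), len(B), len(B[0])
    C = [[INF] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            for k in range(inner):
                C[i][j] = t_add(C[i][j], t_mul(A[i][k], B[k][j]))
    return C


# Hypothetical edge weights of a 3-node directed graph (INF = no edge,
# 0 on the diagonal = staying put costs nothing).
W = [[0, 4, 1],
     [INF, 0, INF],
     [INF, 2, 0]]

# Entry (i, j) is the cheapest route from i to j using at most two edges.
W2 = t_matmul(W, W)
```

Here `W2[0][1]` is 3, because the route 0 → 2 → 1 (cost 1 + 2) beats the direct edge of weight 4. Repeated tropical squaring of this kind is the shortest-path recursion written as linear algebra.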
Inference
• Observation $\sigma_1, \dots, \sigma_n$: known biological data.
• From the observed random variables $Y_1 = \sigma_1, \dots, Y_n = \sigma_n$ we want to infer values for the hidden random variables $X_1, \dots, X_m$: unknown biological data. For example: how do two sequences align?
• MAP estimation: given an observation $\sigma_1, \dots, \sigma_n$, which is the most probable explanation $X_1 = h_1, \dots, X_m = h_m$?
• Model parameters give the transition probabilities $p_{h\sigma}$ from hidden state $h$ to observed state $\sigma$.
Hidden Markov Model (HMM)
Observation: $\sigma_1, \dots, \sigma_n$.
• We can efficiently compute the marginal probabilities.
• We want to compute an explanation for the observation: the sequence $h_1, \dots, h_m$ which yields the maximum a posteriori (MAP) probability, $\hat{h} = \operatorname{argmax}_{h} \Pr(X = h \mid Y = \sigma)$.
Computation of the marginal probabilities
• Markov chain: independent probabilities.
• $p_\sigma$ has a decomposition which gives the "Forward algorithm".
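The decomposition can be sketched in Python as follows. The sketch assumes that `T[g][h]` holds the hidden-to-hidden transition probabilities (written $p'_{gh}$ here), that `E[h][s]` holds the hidden-to-observed probabilities $p_{hs}$, and that an explicit initial distribution `init` is given. The function name and all numeric values are made up for illustration.

```python
def forward(obs, T, E, init):
    """Marginal probability p_sigma of the observation `obs`.

    Pushing the sum over hidden paths inside the product gives one
    vector update per observed symbol instead of a sum over all paths.
    """
    n_states = len(T)
    # alpha[h]: probability of the observed prefix, ending in hidden state h
    alpha = [init[h] * E[h][obs[0]] for h in range(n_states)]
    for s in obs[1:]:
        alpha = [
            sum(alpha[g] * T[g][h] for g in range(n_states)) * E[h][s]
            for h in range(n_states)
        ]
    return sum(alpha)


# Hypothetical two-state model over a binary alphabet.
T = [[0.7, 0.3], [0.4, 0.6]]   # hidden -> hidden
E = [[0.9, 0.1], [0.2, 0.8]]   # hidden -> observed
init = [0.5, 0.5]              # assumed initial distribution

p_sigma = forward([0, 1, 1], T, E, init)
```

The loop does a number of operations linear in the length of the observation, whereas the naive sum ranges over every hidden sequence.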
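Viterbi algorithm
• Tropicalization of the problem of computing $p_\sigma$: $u_{ij} = -\log(p'_{ij})$, $v_{ij} = -\log(p_{ij})$.
• We can now efficiently find an explanation $h_1, \dots, h_m$ for the observation $\sigma_1, \dots, \sigma_n$ using the same recursion, with sums replaced by minima and products by sums.
• It is again the Forward algorithm.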
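A minimal sketch of this tropicalized recursion in Python, under the same assumptions as before: `T` stands for the $p'_{ij}$, `E` for the $p_{ij}$, `init` is an assumed initial distribution, and every probability is strictly positive so that the logarithms exist. The model values are hypothetical.

```python
import math


def viterbi(obs, T, E, init):
    """Most probable hidden path, via the forward recursion in (min, +)."""
    n = len(T)
    u = [[-math.log(x) for x in row] for row in T]  # u_ij = -log p'_ij
    v = [[-math.log(x) for x in row] for row in E]  # v_ij = -log p_ij
    cost = [-math.log(init[h]) + v[h][obs[0]] for h in range(n)]
    back = []  # back[t][h]: best predecessor of state h at step t + 1
    for s in obs[1:]:
        step, new_cost = [], []
        for h in range(n):
            g = min(range(n), key=lambda g: cost[g] + u[g][h])
            step.append(g)
            new_cost.append(cost[g] + u[g][h] + v[h][s])
        back.append(step)
        cost = new_cost
    # Trace the minimizing choices backwards to recover the explanation.
    h = min(range(n), key=lambda h: cost[h])
    best = cost[h]
    path = [h]
    for step in reversed(back):
        h = step[h]
        path.append(h)
    path.reverse()
    return path, math.exp(-best)


# Hypothetical two-state model over a binary alphabet.
T = [[0.7, 0.3], [0.4, 0.6]]
E = [[0.9, 0.1], [0.2, 0.8]]
init = [0.5, 0.5]

path, prob = viterbi([0, 1, 1], T, E, init)
```

Compared with the forward algorithm, only the arithmetic changes: `sum` becomes `min` and `*` becomes `+`. The back-pointers are the extra bookkeeping needed to report the path and not just its weight.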
Pair Hidden Markov Model (pHMM)
The algebraic statistical model for sequence alignment, known as the pair hidden Markov model, is the image of a map defined in terms of $A_{n,m}$, the set of all alignments of the sequences $\sigma^1$, $\sigma^2$.
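Example: n = 5, m = 4, with $\sigma^1$ = gttta and $\sigma^2$ = gtgc. One alignment (matches marked with *):
gttta-
gt--gc
**
[Figure: alignment graph, with the letters of gttta along one axis and those of gtgc along the other]
• The Needleman-Wunsch algorithm for finding the shortest path in the alignment graph is the tropicalization of the pair hidden Markov model for sequence alignment.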
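The shortest-path view can be sketched in Python as a (min, +) recursion over the alignment graph. The unit mismatch and gap costs are an assumption chosen for illustration (the slides do not state edge weights), and the function name is hypothetical.

```python
def needleman_wunsch(a, b, mismatch=1, gap=1):
    """Cheapest path through the alignment graph of a and b, with traceback."""
    n, m = len(a), len(b)
    # D[i][j]: cost of the cheapest path from node (0, 0) to node (i, j)
    D = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        D[i][0] = i * gap
    for j in range(1, m + 1):
        D[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = 0 if a[i - 1] == b[j - 1] else mismatch
            D[i][j] = min(D[i - 1][j - 1] + sub,   # diagonal edge
                          D[i - 1][j] + gap,       # gap in b
                          D[i][j - 1] + gap)       # gap in a
    # Walk back from (n, m) along edges that realize the minimum.
    top, bottom = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = 0 if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and D[i][j] == D[i - 1][j - 1] + sub:
            top.append(a[i - 1]); bottom.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and D[i][j] == D[i - 1][j] + gap:
            top.append(a[i - 1]); bottom.append("-"); i -= 1
        else:
            top.append("-"); bottom.append(b[j - 1]); j -= 1
    return D[n][m], "".join(reversed(top)), "".join(reversed(bottom))


cost, row1, row2 = needleman_wunsch("gttta", "gtgc")
```

With these unit costs the alignment shown on the slide costs 4 (two gaps, one mismatch, one more gap), while the recursion finds an alignment of cost 3. Which alignment is optimal depends on the weights, which is the dependence that parametric inference studies.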
The polytope propagation algorithm
• The tropical sum-product algorithm, stated in a general fashion.
• f is the density function for a statistical model. Among the monomials of f, find the one that attains the maximum value.
Solution:
• Tropicalization: $w_i = -\log p_i$
• Computation in the polytope algebra
Tropicalization: $w_i = -\log p_i$
Density function for a statistical model: $f(p_1, p_2) = p_1^3 + p_1^2 p_2^2 + p_1 p_2^2 + p_1 + p_2^4$
• Find an explanation: find the index j of the monomial with maximal value.
• Equivalently, find the index j of the monomial that minimizes the function $e_j \cdot w$, where $e_j$ is the exponent vector of the j-th monomial.
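The equivalence of the two formulations can be checked numerically. The sketch below uses the exponent vectors of the example polynomial f. The function names and the sample parameter values are illustrative assumptions.

```python
import math

# Exponent vectors e_j of f(p1, p2) = p1^3 + p1^2 p2^2 + p1 p2^2 + p1 + p2^4
EXPONENTS = [(3, 0), (2, 2), (1, 2), (1, 0), (0, 4)]


def best_monomial_direct(p):
    """Index j of the monomial p1^e1 * p2^e2 with the largest value."""
    return max(
        range(len(EXPONENTS)),
        key=lambda j: p[0] ** EXPONENTS[j][0] * p[1] ** EXPONENTS[j][1],
    )


def best_monomial_tropical(p):
    """Index j minimizing the linear form e_j . w, with w_i = -log p_i."""
    w = [-math.log(x) for x in p]
    return min(
        range(len(EXPONENTS)),
        key=lambda j: EXPONENTS[j][0] * w[0] + EXPONENTS[j][1] * w[1],
    )


# Hypothetical parameter choices (positive, so the logarithm is defined).
samples = [(0.5, 0.9), (0.9, 0.2), (2.0, 0.5)]
answers = [(best_monomial_direct(p), best_monomial_tropical(p)) for p in samples]
```

Since $-\log$ is strictly decreasing and turns products into sums, the maximizing monomial and the minimizing exponent vector always agree. For the samples above, the winners are $p_2^4$, $p_1$ and $p_1^3$ respectively.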
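Explanations are vertices of the Newton polytope of f
$f(p_1, p_2) = p_1^3 + p_1^2 p_2^2 + p_1 p_2^2 + p_1 + p_2^4$
We plot a point for each exponent vector of a monomial: (3, 0), (2, 2), (1, 2), (1, 0), (0, 4). The Newton polytope is their convex hull.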
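As a small illustration of computing in the polytope algebra, the following Python sketch evaluates the example f with each parameter replaced by a unit vector, sums replaced by convex hulls of unions, and products by Minkowski sums. It is restricted to two dimensions and uses Andrew's monotone chain for the hull. All function names are assumptions made for this sketch, not the chapter's implementation.

```python
def hull(points):
    """Vertices of the convex hull of 2D points (Andrew's monotone chain)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def p_add(P, Q):
    """Polytope sum: convex hull of the union."""
    return hull(P + Q)


def p_mul(P, Q):
    """Polytope product: Minkowski sum."""
    return hull([(a[0] + b[0], a[1] + b[1]) for a in P for b in Q])


def p_pow(P, k):
    """k-fold polytope product; the origin is the multiplicative identity."""
    R = [(0, 0)]
    for _ in range(k):
        R = p_mul(R, P)
    return R


# Each parameter becomes a one-point polytope: p1 -> (1, 0), p2 -> (0, 1).
P1, P2 = [(1, 0)], [(0, 1)]

# f = p1^3 + p1^2 p2^2 + p1 p2^2 + p1 + p2^4, evaluated in the polytope algebra.
terms = [
    p_pow(P1, 3),
    p_mul(p_pow(P1, 2), p_pow(P2, 2)),
    p_mul(P1, p_pow(P2, 2)),
    P1,
    p_pow(P2, 4),
]
newton = []
for t in terms:
    newton = p_add(newton, t)
```

The result has the four vertices (0, 4), (1, 0), (3, 0) and (2, 2). The exponent vector (1, 2) lies in the interior, so the monomial $p_1 p_2^2$ is never the unique explanation for any choice of parameters.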
Normal fan
• The normal fan partitions the parameter space into regions such that the explanation(s) for all sets of parameters in a given region are given by the polytope vertex (or face) associated with that region.
Parametric MAP estimation problem
• Local: given a choice of parameters, determine the set of all parameters with the same MAP estimate.
• Solution: computation of the normal cone of the Newton polytope.
• Global: asks for a partition of the space of parameters such that any two parameters lie in the same part if and only if they yield the same MAP estimate.
• Solution: computation of the normal fan of the Newton polytope.
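For a two-dimensional Newton polytope, both problems can be sketched in a few lines of Python. The sketch assumes the minimization convention $e_j \cdot w$ with $w_i = -\log p_i$ and takes as input the vertices of the Newton polytope of the example polynomial $f(p_1, p_2) = p_1^3 + p_1^2 p_2^2 + p_1 p_2^2 + p_1 + p_2^4$, listed counter-clockwise. The function names are hypothetical.

```python
# Vertices of the Newton polytope of the example f, counter-clockwise.
VERTICES = [(0, 4), (1, 0), (3, 0), (2, 2)]


def dot(a, b):
    return a[0] * b[0] + a[1] * b[1]


def normal_cone(k):
    """Edge directions d at vertex k; the cone is {w : d . w >= 0 for both}.

    For a convex polygon, a vertex minimizes e . w as soon as it is no
    worse than its two neighbours, so two inequalities are enough.
    """
    v = VERTICES[k]
    prev = VERTICES[k - 1]
    nxt = VERTICES[(k + 1) % len(VERTICES)]
    return [(prev[0] - v[0], prev[1] - v[1]), (nxt[0] - v[0], nxt[1] - v[1])]


def in_cone(k, w):
    """Local problem: does w give vertex k as an explanation?"""
    return all(dot(d, w) >= 0 for d in normal_cone(k))


def explanation(w):
    """Index of a vertex minimizing e . w."""
    return min(range(len(VERTICES)), key=lambda k: dot(VERTICES[k], w))


# Global problem: the normal fan is the collection of all vertex cones.
normal_fan = {VERTICES[k]: normal_cone(k) for k in range(len(VERTICES))}
```

Two weight vectors yield the same MAP estimate exactly when they lie in the interior of the same cone of `normal_fan`. Weights on a shared boundary ray correspond to an edge of the polytope, where two explanations tie.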