Probabilistic Approaches to Phylogeny Wouter Van Gool & Thomas Jellema
Probabilistic Approaches to Phylogeny Contents • Introduction/Overview Wouter • Probabilistic Models of Evolution Wouter • Calculating the Likelihood Wouter • Pause • Evolution Demo Thomas • Using the likelihood for inference Thomas • Phylogeny Demo Thomas • Summary/Conclusion Thomas • Questions
8.1 Introduction Goal: • Formulate probabilistic models for phylogeny • Infer trees from sets of sequences Aim of probability-based phylogeny: Rank trees according to - likelihood P(data | tree) - posterior probability P(tree | data)
8.1 Introduction Compute the probability of a set of data given a tree: P(x* | T, t*) x*: set of n sequences xj (j=1…n) T: tree with n leaves, with sequence j at leaf j t*: edge lengths of the tree
8.1 Introduction Example
8.2 Probabilistic Models of Evolution Given the sequences at the leaves x1…xn: • Pick a model of evolution: P(x | y, t), P(x) • Enumerate all possible tree topologies with n leaves • For each T, maximize the likelihood P(x1…xn | T, t) over all possible edge lengths t • Pick the T and t that have the largest probability
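The enumeration step above is what makes the procedure expensive. As a rough illustration, the sketch below counts bifurcating topologies using the standard double-factorial formulas, (2n-5)!! for unrooted and (2n-3)!! for rooted trees with n labelled leaves. The function names are illustrative and not taken from the slides.

```python
def num_unrooted_topologies(n: int) -> int:
    """Count unrooted bifurcating topologies with n labelled leaves: (2n-5)!!"""
    if n < 3:
        raise ValueError("need at least 3 leaves for an unrooted tree")
    count = 1
    for k in range(3, 2 * n - 4, 2):  # odd factors 3, 5, ..., 2n-5
        count *= k
    return count


def num_rooted_topologies(n: int) -> int:
    """Count rooted bifurcating topologies with n labelled leaves: (2n-3)!!"""
    if n < 2:
        raise ValueError("need at least 2 leaves for a rooted tree")
    count = 1
    for k in range(3, 2 * n - 2, 2):  # odd factors 3, 5, ..., 2n-3
        count *= k
    return count


for n in (4, 6, 8, 10):
    print(n, num_unrooted_topologies(n), num_rooted_topologies(n))
```

The counts grow faster than exponentially, so exhaustive enumeration is only feasible for a handful of sequences.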
8.2 Probabilistic Models of Evolution Simplifying Assumptions: • Single-base substitutions only, so ungapped alignments only • Each base evolves independently, with the same model of evolution based on a substitution matrix
8.2 Probabilistic Models of Evolution Substitution Matrix for Phylogeny Many important families of substitution matrices are multiplicative: S(t)S(s) = S(t+s) Substitution matrices used in Phylogeny: • Jukes & Cantor Model [1969] • Kimura DNA Model [1980] • PAM Matrix [1978]
8.2 Probabilistic Models of Evolution Jukes-Cantor Model
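The formula on this slide did not survive extraction. As a sketch, the standard Jukes-Cantor matrix has diagonal entries (1 + 3e^(-4αt))/4 and off-diagonal entries (1 - e^(-4αt))/4. The code below builds it and checks numerically that it is multiplicative. The rate parameter name `alpha` and the helper `matmul` are assumptions made for illustration.

```python
import math

BASES = "ACGT"


def jukes_cantor(t, alpha=1.0):
    """Return S(t) as a 4x4 list; entry [i][j] is P(BASES[j] | BASES[i], t)."""
    decay = math.exp(-4.0 * alpha * t)
    same = 0.25 * (1.0 + 3.0 * decay)  # base is unchanged after time t
    diff = 0.25 * (1.0 - decay)        # base became one specific other base
    return [[same if i == j else diff for j in range(4)] for i in range(4)]


def matmul(A, B):
    """Plain matrix product for small square matrices."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


# Multiplicativity check: S(t) S(s) should equal S(t + s).
product = matmul(jukes_cantor(0.1), jukes_cantor(0.25))
direct = jukes_cantor(0.35)
max_gap = max(abs(product[i][j] - direct[i][j])
              for i in range(4) for j in range(4))
print(f"largest difference between S(t)S(s) and S(t+s): {max_gap:.2e}")
```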
8.2 Probabilistic Models of Evolution Kimura DNA model
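The Kimura model distinguishes transitions (A↔G, C↔T) from transversions. The slide's own formula is missing, so the sketch below uses the usual two-parameter form, assuming a transition rate `alpha` and a transversion rate `beta` (both names are assumptions). It checks that the rows are probability distributions and that the family is multiplicative.

```python
import math

BASES = "ACGT"
PURINES = {"A", "G"}


def kimura(t, alpha, beta):
    """Return S(t) for the two-parameter Kimura model as a 4x4 list."""
    # Probability of each specific transversion (purine <-> pyrimidine).
    trv = 0.25 * (1.0 - math.exp(-4.0 * beta * t))
    # Probability of the transition (A <-> G or C <-> T).
    trs = 0.25 * (1.0 + math.exp(-4.0 * beta * t)
                  - 2.0 * math.exp(-2.0 * (alpha + beta) * t))
    same = 1.0 - trs - 2.0 * trv

    def entry(a, b):
        if a == b:
            return same
        if (a in PURINES) == (b in PURINES):
            return trs
        return trv

    return [[entry(a, b) for b in BASES] for a in BASES]


def matmul(A, B):
    """Plain matrix product for small square matrices."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


product = matmul(kimura(0.2, 2.0, 0.5), kimura(0.3, 2.0, 0.5))
direct = kimura(0.5, 2.0, 0.5)
max_gap = max(abs(product[i][j] - direct[i][j])
              for i in range(4) for j in range(4))
print(f"largest difference between S(t)S(s) and S(t+s): {max_gap:.2e}")
```

With `alpha == beta` the transition and transversion probabilities coincide and the matrix reduces to the Jukes-Cantor form.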
8.2 Probabilistic Models of Evolution PAM matrix model
8.3 Calculating the likelihood for ungapped alignments Example: The likelihood of two nucleotide sequences
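For two sequences joined at a root, the likelihood of each site sums over the unknown root character a: the sum of P(a) P(x1 | a, t1) P(x2 | a, t2), multiplied across sites. The sketch below assumes the Jukes-Cantor model with a uniform root distribution. The function names and the example sequences and edge lengths are illustrative only.

```python
import math

BASES = "ACGT"


def jc_prob(x, y, t, alpha=1.0):
    """P(x | y, t) under Jukes-Cantor: y evolves into x over time t."""
    decay = math.exp(-4.0 * alpha * t)
    return 0.25 * (1.0 + 3.0 * decay) if x == y else 0.25 * (1.0 - decay)


def pair_likelihood(x1, x2, t1, t2):
    """P(x1, x2 | T, t1, t2) for two ungapped sequences of equal length."""
    likelihood = 1.0
    for c1, c2 in zip(x1, x2):
        # Sum over the unobserved character a at the root, with P(a) = 1/4.
        site = sum(0.25 * jc_prob(c1, a, t1) * jc_prob(c2, a, t2)
                   for a in BASES)
        likelihood *= site
    return likelihood


print(pair_likelihood("AATC", "AATT", 0.1, 0.2))
```

Because Jukes-Cantor is multiplicative and reversible, the result depends only on t1 + t2, so the same value is obtained from the product over sites of P(x1) P(x2 | x1, t1 + t2).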
8.3 Calculating the likelihood for ungapped alignments Likelihood for general case, where node α(i) is the ancestor of node i. A fixed set of edge lengths t1…t2n-1 and a topology T are required
8.3 Calculating the likelihood for ungapped alignments Felsenstein’s recursive algorithm Define a table of probabilities Fk,a for each site u, all tree nodes k and all input characters a: Fk,a = probability of the leaves in the subtree below node k at site u, given that the character at site u of node k is a
8.3 Calculating the likelihood for ungapped alignments Felsenstein’s recursive algorithm
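A minimal sketch of the recursion, assuming the Jukes-Cantor model and a uniform root distribution. The tree encoding (a leaf is a sequence name, an internal node is a tuple `(left, t_left, right, t_right)`) and all function names are choices made for this illustration, not something the slides specify.

```python
import math

BASES = "ACGT"


def jc_prob(x, y, t, alpha=1.0):
    """P(x | y, t) under Jukes-Cantor."""
    decay = math.exp(-4.0 * alpha * t)
    return 0.25 * (1.0 + 3.0 * decay) if x == y else 0.25 * (1.0 - decay)


def conditional_likelihoods(node, site, seqs):
    """Return {a: F[k][a]}: probability of the leaves below node k given a at k."""
    if isinstance(node, str):  # leaf: only the observed character is possible
        observed = seqs[node][site]
        return {a: 1.0 if a == observed else 0.0 for a in BASES}
    left, t_left, right, t_right = node
    f_left = conditional_likelihoods(left, site, seqs)
    f_right = conditional_likelihoods(right, site, seqs)
    return {
        a: sum(jc_prob(b, a, t_left) * f_left[b] for b in BASES)
           * sum(jc_prob(c, a, t_right) * f_right[c] for c in BASES)
        for a in BASES
    }


def tree_likelihood(tree, seqs):
    """Multiply the per-site likelihoods, summing over the root character."""
    n_sites = len(next(iter(seqs.values())))
    likelihood = 1.0
    for u in range(n_sites):
        f_root = conditional_likelihoods(tree, u, seqs)
        likelihood *= sum(0.25 * f_root[a] for a in BASES)
    return likelihood


# Hypothetical data and edge lengths, for illustration only.
seqs = {"x1": "AATC", "x2": "AATT", "x3": "TTAA"}
tree_a = (("x1", 0.1, "x2", 0.2), 0.3, "x3", 0.4)
tree_b = (("x1", 0.1, "x2", 0.2), 0.5, "x3", 0.2)  # root moved along one edge
print(tree_likelihood(tree_a, seqs), tree_likelihood(tree_b, seqs))
```

The two trees differ only in where the root sits on the edge leading to x3, so under this model they should give the same likelihood.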
8.3 Calculating the likelihood for ungapped alignments Likelihood for general case Overall algorithm: • Enumerate each tree topology T • Enumerate sets of edge lengths t (using some n-dimensional optimisation technique) • Run Felsenstein’s recursive algorithm for each site u and multiply the likelihoods • Return the best T and t
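The optimisation over edge lengths can be illustrated on the smallest case: two sequences and a single total edge length t. The sketch below swaps in a plain one-dimensional grid search for the n-dimensional optimiser mentioned above, assumes the Jukes-Cantor model, and compares the result with the closed-form Jukes-Cantor estimate. The sequences, grid and function names are hypothetical.

```python
import math


def log_likelihood(x1, x2, t, alpha=1.0):
    """Log-likelihood of two aligned sequences separated by total time t."""
    decay = math.exp(-4.0 * alpha * t)
    same = 0.25 * (1.0 + 3.0 * decay)
    diff = 0.25 * (1.0 - decay)
    # Each site contributes P(x1_u) * P(x2_u | x1_u, t) with P(x1_u) = 1/4.
    return sum(math.log(0.25 * (same if a == b else diff))
               for a, b in zip(x1, x2))


def best_edge_length(x1, x2, grid):
    """Return the grid value of t with the highest log-likelihood."""
    return max(grid, key=lambda t: log_likelihood(x1, x2, t))


x1 = "AATCGGCA"
x2 = "AATTGGCT"
grid = [i / 1000.0 for i in range(1, 2001)]  # t = 0.001, 0.002, ..., 2.0
t_hat = best_edge_length(x1, x2, grid)

# Closed-form estimate from the fraction p of differing sites (alpha = 1).
p = sum(a != b for a, b in zip(x1, x2)) / len(x1)
closed_form = -math.log(1.0 - 4.0 * p / 3.0) / 4.0
print(t_hat, closed_form)
```

For real trees the same idea applies to every edge at once, which is why a multi-dimensional optimiser is needed and why the procedure is costly.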
8.3 Calculating the likelihood for ungapped alignments Reversibility & independence of root position • The score of the optimal tree is independent of the root position if and only if: - the substitution matrix is multiplicative - the substitution matrix is reversible • A substitution matrix is reversible if for all a, b and t: P(b | a, t) P(a) = P(a | b, t) P(b)
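The reversibility condition is a detailed-balance check and is easy to test numerically. The sketch below checks it for a Jukes-Cantor matrix with a uniform distribution, and for a hypothetical three-state cyclic matrix that fails the condition. The function name, tolerance and the cyclic example are assumptions for illustration.

```python
import math


def is_reversible(S, q, tol=1e-12):
    """Check P(b | a, t) q[a] == P(a | b, t) q[b] for all pairs (a, b).

    S[a][b] holds P(b | a, t) and q[a] holds the probability of character a.
    """
    n = len(q)
    return all(abs(S[a][b] * q[a] - S[b][a] * q[b]) <= tol
               for a in range(n) for b in range(n))


# Jukes-Cantor matrix at t = 0.3 (alpha = 1) with uniform base frequencies.
decay = math.exp(-4.0 * 0.3)
same, diff = 0.25 * (1.0 + 3.0 * decay), 0.25 * (1.0 - decay)
jc = [[same if i == j else diff for j in range(4)] for i in range(4)]

# Hypothetical cyclic model: state 0 -> 1 -> 2 -> 0, never backwards.
cyclic = [[0.8, 0.2, 0.0],
          [0.0, 0.8, 0.2],
          [0.2, 0.0, 0.8]]

print(is_reversible(jc, [0.25] * 4))
print(is_reversible(cyclic, [1.0 / 3.0] * 3))
```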
8.4 Using the likelihood for inference Maximum likelihood: • The best tree “could be” the tree that maximises the likelihood • Computationally demanding
8.4 Using the likelihood for inference Sampling from the posterior distribution: • We use Bayes’ rule to compute the posterior probability • This is the probability of a model given the data
8.4 Using the likelihood for inference Example
Model name | Prior chance of model | Data
Model 1 | 10% | 100% A
Model 2 | 40% | 50% A, 50% B
Model 3 | 50% | 100% B
8.4 Using the likelihood for inference Sampling from the posterior distribution: • We use Bayes’ rule to compute the posterior probability • This is the probability of a model given the data • Example: after observing A, P(Model 1 | A) = 10/30 ≈ 33/100
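The three-model example can be worked through with Bayes' rule directly. The sketch below reads the priors as percentages (10%, 40%, 50%) and the data column as the probability that each model produces A or B. The dictionary layout and function name are illustrative.

```python
def posterior(priors, likelihoods):
    """Bayes' rule: P(model | data) = P(data | model) P(model) / P(data)."""
    joint = {m: priors[m] * likelihoods[m] for m in priors}
    evidence = sum(joint.values())  # P(data), summed over all models
    return {m: joint[m] / evidence for m in joint}


priors = {"Model 1": 0.10, "Model 2": 0.40, "Model 3": 0.50}
prob_of_A = {"Model 1": 1.0, "Model 2": 0.5, "Model 3": 0.0}
prob_of_B = {"Model 1": 0.0, "Model 2": 0.5, "Model 3": 1.0}

post_A = posterior(priors, prob_of_A)
post_B = posterior(priors, prob_of_B)
for model in priors:
    print(model, round(post_A[model], 3), round(post_B[model], 3))
```

After observing A, Model 1 has posterior 0.10 / (0.10 + 0.20) = 1/3, Model 2 has 2/3, and Model 3 is ruled out.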
8.4 Using the likelihood for inference Metropolis algorithm • It samples from the trees with probabilities given by their posterior distribution. • It is a sampling procedure that generates a sequence of trees, each from the previous one.
8.4 Using the likelihood for inference Metropolis algorithm
8.4 Using the likelihood for inference A proposal distribution [Figure: tree with nodes numbered 1 to 8; axes: time from root, order of traversal]
8.4 Using the likelihood for inference Metropolis algorithm
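A minimal sketch of the Metropolis procedure on a toy problem: three hypothetical "trees" with unnormalised posterior weights, and a symmetric proposal that picks one of the other trees uniformly. A proposal is accepted with probability min(1, P(proposed) / P(current)). The weights, names and proposal scheme are assumptions for illustration and stand in for the tree proposal distribution described in the slides.

```python
import random
from collections import Counter


def metropolis(weights, n_steps, rng):
    """Sample states with frequencies proportional to their weights."""
    states = list(weights)
    current = states[0]  # the starting state must have a positive weight
    samples = []
    for _ in range(n_steps):
        # Symmetric proposal: any other state, chosen uniformly.
        proposal = rng.choice([s for s in states if s != current])
        accept_prob = min(1.0, weights[proposal] / weights[current])
        if rng.random() < accept_prob:
            current = proposal
        samples.append(current)  # a rejected move repeats the current state
    return samples


# Hypothetical unnormalised posterior weights for three trees.
weights = {"T1": 1.0, "T2": 2.0, "T3": 7.0}
samples = metropolis(weights, 50_000, random.Random(0))
counts = Counter(samples)
for tree in weights:
    print(tree, counts[tree] / len(samples))
```

Only ratios of posterior probabilities are needed, so the normalising constant P(data) never has to be computed.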
8.4 Using the likelihood for inference Other phylogenetic uses of sampling [Figure: trees built up step by step, with the sequences AAAA, AATC, AATT, TCAA and TTAA at the nodes]
8.4 Using the likelihood for inference Other phylogenetic uses of sampling • Inferring the history of populations • Probability density of a coalescence in time • Probability of a coalescence between any pair
8.4 Using the likelihood for inference Inferring the history of populations • When the value of n is large and the value of p is close to 0, the binomial distribution with parameters n and p can be approximated by a Poisson distribution with mean n*p • Here x = 1 • The probability of a coalescence at the end of the period tk • The total probability of the tree
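The Poisson approximation used here can be checked numerically. The sketch below compares the binomial and Poisson probabilities of exactly one event (x = 1) for a large n and small p. The specific values of n and p are arbitrary illustrations, not quantities from the slides.

```python
import math


def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n independent trials."""
    return math.comb(n, x) * p ** x * (1.0 - p) ** (n - x)


def poisson_pmf(x, mean):
    """Probability of exactly x events for a Poisson distribution."""
    return math.exp(-mean) * mean ** x / math.factorial(x)


n, p, x = 1000, 0.001, 1
exact = binomial_pmf(x, n, p)
approx = poisson_pmf(x, n * p)
print(f"binomial: {exact:.6f}  poisson: {approx:.6f}")
```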
8.4 Using the likelihood for inference The bootstrap • The bootstrap can give an approximation to the posterior. • It takes too much labour, so it is an unattractive alternative to sampling. • The bootstrap is probably more useful for non-probabilistic tree building methods.
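For reference, one bootstrap replicate is made by resampling alignment columns with replacement. A tree is then rebuilt from every replicate, which is where the labour comes from. The sketch below shows only the resampling step. The function name and the toy alignment are hypothetical.

```python
import random


def bootstrap_alignment(seqs, rng):
    """Resample the columns of an ungapped alignment with replacement."""
    n_sites = len(seqs[0])
    columns = [rng.randrange(n_sites) for _ in range(n_sites)]
    # Apply the same column choice to every sequence to keep sites aligned.
    return ["".join(seq[c] for c in columns) for seq in seqs]


alignment = ["AATC", "AATT", "TTAA", "TCAA"]  # toy data for illustration
rng = random.Random(1)
replicates = [bootstrap_alignment(alignment, rng) for _ in range(3)]
for replicate in replicates:
    print(replicate)
```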
Probabilistic Approaches to Phylogeny Conclusion • The methods presented today can be used to find the most probable tree. • Most of these methods are computationally demanding. • More realistic evolutionary models will be explained on Thursday.
Probabilistic Approaches to Phylogeny Questions?