Maximum likelihood (cont.)
Midterm • Relatedness is not merely sharing a common ancestor but sharing a relatively recent common ancestor (CA) • Homoplasy includes convergence and/or reversal • PTP looks at tree length; g1 looks at skew in the distribution of tree lengths • Both introgression and LGT result in a dominant and a minor history
Midterm • The ILD test looks at the sum of the lengths (or likelihoods) of the optimal trees from each partition • Topology tests evaluate whether the data support one tree better than another. They can be used to evaluate support for a clade or to assess discordance, but those are applications
Midterm • Parsimony criterion: Tree that can explain the data with the lowest number of character state changes (weighted by the inferred evidential value of each character state transition) • Assumptions: Single; independence; branch lengths short and fairly even
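As a rough illustration of the parsimony criterion, the sketch below counts the minimum number of state changes per character with the Fitch algorithm and sums them over characters. It assumes equal weights for all changes (the weighting mentioned on the slide is left out), rooted binary trees written as nested tuples, and a made-up four-taxon alignment; the function names are hypothetical.

```python
def fitch(tree, states):
    """Return (state set, change count) for one character on a rooted binary tree.

    tree: a leaf name (str) or a 2-tuple of subtrees.
    states: dict mapping leaf name -> observed state.
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left_set, left_cost = fitch(tree[0], states)
    right_set, right_cost = fitch(tree[1], states)
    shared = left_set & right_set
    if shared:
        # Children agree on at least one state: no extra change needed here.
        return shared, left_cost + right_cost
    # Children disagree: take the union and count one change.
    return left_set | right_set, left_cost + right_cost + 1


def parsimony_length(tree, alignment):
    """Sum the Fitch change counts over all characters (alignment columns)."""
    n_sites = len(next(iter(alignment.values())))
    return sum(
        fitch(tree, {taxon: seq[i] for taxon, seq in alignment.items()})[1]
        for i in range(n_sites)
    )


# Made-up data for illustration only.
alignment = {"A": "AAG", "B": "AAG", "C": "GAT", "D": "GAT"}
tree_1 = (("A", "B"), ("C", "D"))
tree_2 = (("A", "C"), ("B", "D"))

for name, tree in [("tree_1", tree_1), ("tree_2", tree_2)]:
    print(name, parsimony_length(tree, alignment))
```

Under the parsimony criterion, the tree with the smaller total is preferred.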
[Figure: tree of A, B, and C with labels Split 1, Split 2, A-B coalescence, and AB-C coalescence] The cause is the retention of a polymorphism; it does not depend on gene duplication
Relationship among models [Figure: diagram relating the substitution models; labels not recoverable]
Site-to-site rate heterogeneity • Two main methods: • Some proportion of invariant sites • Rates distributed as a discrete approximation to a gamma distribution • Each uses one parameter: • I = proportion of invariant sites • α = shape parameter of the gamma distribution
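To make the discrete gamma approximation concrete, here is a minimal sketch using only the Python standard library. It assumes the common convention (not stated on the slide) of a gamma distribution with mean 1 (shape α, rate α), cut into equal-probability categories, each represented by its mean rate. The helper names are hypothetical, and the incomplete gamma function is computed with a simple power series rather than a library routine.

```python
import math


def reg_lower_gamma(a, x):
    """Regularized lower incomplete gamma P(a, x), computed by its power series."""
    if x <= 0.0:
        return 0.0
    term = 1.0 / a
    total = term
    n = 1
    while term > 1e-16 * total and n < 10000:
        term *= x / (a + n)
        total += term
        n += 1
    return total * math.exp(a * math.log(x) - x - math.lgamma(a))


def gamma_quantile(p, shape, rate):
    """Quantile of a gamma(shape, rate) distribution, found by bisection."""
    lo, hi = 0.0, 1.0
    while reg_lower_gamma(shape, rate * hi) < p:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if reg_lower_gamma(shape, rate * mid) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def discrete_gamma_rates(alpha, k=4):
    """Mean rate of each of k equal-probability categories (overall mean 1)."""
    cuts = [0.0] + [gamma_quantile(i / k, alpha, alpha) for i in range(1, k)]
    # For a mean-1 gamma, the share of the mean lying below a cut c
    # is P(alpha + 1, alpha * c).
    shares = [reg_lower_gamma(alpha + 1.0, alpha * c) for c in cuts] + [1.0]
    return [k * (shares[i + 1] - shares[i]) for i in range(k)]


print(discrete_gamma_rates(0.5, 4))
```

Small α gives strongly unequal category rates (most sites slow, a few fast); large α gives rates close to 1 in every category.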
Nested models • Simpler models have fewer parameters than more complex models • Two models are “nested” if the simpler model is a special case of the more complex one, obtained by constraining some of its parameters (so all parameters in the simpler model are also in the more complex model) • Nested: GTR-HKY; HKY-JC; GTR-JC • Not nested: F81-K2P; JC+I-HKY
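One way to think about nesting for the models named on this slide is to describe each model by which freedoms it allows. The sketch below is a simplification that covers only these standard models: base frequencies are either fixed equal or free, substitution types fall into 1, 2 (transitions vs. transversions), or 6 classes, and +I and +Γ are optional. The `Model` class and `is_nested` function are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    free_freqs: bool     # base frequencies estimated (True) or fixed equal (False)
    rate_classes: int    # 1 = one rate, 2 = ts/tv, 6 = all six exchangeabilities
    inv: bool = False    # +I (proportion of invariant sites)
    gamma: bool = False  # +Γ (gamma-distributed rates)

    @property
    def k(self):
        """Number of free substitution-model parameters."""
        return 3 * self.free_freqs + (self.rate_classes - 1) + self.inv + self.gamma


def is_nested(simple, general):
    """True if every freedom in `simple` is also available in `general`."""
    return (
        simple != general
        and simple.free_freqs <= general.free_freqs
        and simple.rate_classes <= general.rate_classes
        and simple.inv <= general.inv
        and simple.gamma <= general.gamma
    )


MODELS = {
    "JC": Model(False, 1),
    "K2P": Model(False, 2),
    "F81": Model(True, 1),
    "HKY": Model(True, 2),
    "GTR": Model(True, 6),
    "JC+I": Model(False, 1, inv=True),
}

for simple, general in [("HKY", "GTR"), ("JC", "HKY"), ("F81", "K2P"), ("JC+I", "HKY")]:
    print(simple, "within", general, is_nested(MODELS[simple], MODELS[general]))
```

Two models can differ in parameter count without being nested: F81 has more parameters than K2P, yet neither is a special case of the other.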
Which of these pairs are nested? • HKY-K2P • GTR+Γ-GTR • HKY+I-HKY+Γ • HKY+I-HKY
In the case of nested models • Log-likelihood under the simpler model = Ls • Log-likelihood under the complex model = Lc • It will always be the case that Lc ≥ Ls • But how much better can we explain the data under the more complex model?
Log-likelihood ratios • The likelihood ratio statistic is LR = 2(Lc - Ls) • For nested models, when the simpler model is true, LR is approximately distributed as χ² with as many degrees of freedom as the number of extra parameters in the more complex model (kc - ks)
Hierarchical LR tests • If the LR is significant under a chi-square then favor the more complex model • If the LR is not significant then stick with the simpler model
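A single step of a hierarchical LR test can be sketched as follows. The log-likelihoods and the 0.05 threshold are made-up illustration values, the function names are hypothetical, and the chi-square tail probability is computed with a closed form that holds for integer degrees of freedom so that only the standard library is needed.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square with integer df >= 1."""
    if x <= 0.0:
        return 1.0
    y = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-y) * sum_{i < df/2} y^i / i!
        term, total = 1.0, 1.0
        for i in range(1, df // 2):
            term *= y / i
            total += term
        return math.exp(-y) * total
    # Odd df: start from df = 1 and add one term per extra 2 df.
    total = math.erfc(math.sqrt(y))
    for j in range(df // 2):
        total += math.exp(-y) * y ** (j + 0.5) / math.gamma(j + 1.5)
    return total


def likelihood_ratio_test(ln_l_simple, k_simple, ln_l_complex, k_complex):
    """Return (LR statistic, degrees of freedom, p-value) for nested models."""
    lr = 2.0 * (ln_l_complex - ln_l_simple)
    df = k_complex - k_simple
    return lr, df, chi2_sf(lr, df)


# Hypothetical fits: HKY (4 parameters) vs. GTR (8 parameters) on the same tree.
lr, df, p = likelihood_ratio_test(-2500.0, 4, -2490.0, 8)
choice = "more complex model" if p < 0.05 else "simpler model"
print(lr, df, p, choice)
```

In a hierarchical scheme this comparison is repeated along a chain of nested models, keeping the simpler model whenever the LR is not significant.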
Akaike Information Criterion • Another approach to choosing among models • Can be used even among non-nested models • AIC = -2 ln L + 2k • L = maximized likelihood of the model • k = number of parameters in model • Pick the model with the lowest AIC
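The AIC comparison is simple enough to sketch directly. The log-likelihoods below are invented for illustration, the parameter counts cover only the substitution model (branch lengths are assumed to be shared because all fits use the same tree), and the function names are hypothetical.

```python
def aic(ln_l, k):
    """Akaike Information Criterion: -2 ln L + 2k."""
    return -2.0 * ln_l + 2 * k


def best_by_aic(fits):
    """fits maps model name -> (ln L, k); return the name with the lowest AIC."""
    return min(fits, key=lambda name: aic(*fits[name]))


# Hypothetical fits of three models to the same data and tree.
fits = {"JC": (-2530.0, 0), "HKY": (-2500.0, 4), "GTR": (-2498.5, 8)}

for name, (ln_l, k) in fits.items():
    print(name, aic(ln_l, k))
print("preferred:", best_by_aic(fits))
```

The 2k term penalizes extra parameters, so a more complex model is preferred only when its gain in log-likelihood exceeds the number of parameters it adds.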
Relationship between MP and ML • One argument: MP is inherently nonparametric, so no direct comparison is possible • Another: MP is an ML model that makes particular assumptions
Parsimony-like likelihood model (see Lewis 1998 for more) • Estimate branch lengths independently for each character (a VERY complex model) • Use only the maximum likelihood ancestral states rather than summing over all possible ancestral states
Why use MP • The model is less realistic, but: • We can do more thorough searches and data exploration (computational efficiency) • Robust results will usually still be supported
Why use ML • The model is explicit • We can statistically compare alternative models of molecular evolution • We can conduct parametric statistical tests
Likelihood-based topology tests • Kishino-Hasegawa test • Likelihood ratio test of zero-length branches
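As a rough sketch of the Kishino-Hasegawa idea, the code below uses the normal-approximation variant: it takes per-site log-likelihoods of the same alignment under two trees, sums the site-wise differences, and compares that sum with its estimated standard error. It assumes both trees were specified in advance (not chosen because one is the ML tree). The site log-likelihoods are made-up numbers, and `kh_test` is a hypothetical name.

```python
import math


def kh_test(site_lnl_1, site_lnl_2):
    """Return (delta, z, two-sided p) for the difference in log-likelihood.

    site_lnl_1, site_lnl_2: per-site log-likelihoods under tree 1 and tree 2.
    """
    d = [a - b for a, b in zip(site_lnl_1, site_lnl_2)]
    n = len(d)
    delta = sum(d)  # total log-likelihood difference between the trees
    mean = delta / n
    # Variance of the sum, estimated from the site-to-site variation.
    var_delta = n / (n - 1) * sum((x - mean) ** 2 for x in d)
    z = delta / math.sqrt(var_delta)
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail probability
    return delta, z, p


# Hypothetical per-site log-likelihoods for a six-site alignment.
tree_1_sites = [-3.1, -2.0, -4.5, -1.2, -2.8, -3.3]
tree_2_sites = [-3.4, -2.0, -5.1, -1.1, -3.0, -3.9]
print(kh_test(tree_1_sites, tree_2_sites))
```

A large p-value means the data do not clearly favor one tree over the other.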