Comparative methods: Using trees to study evolution
Some uses for phylogenies • Character evolution • Ancestral states • Trends and biases • Correlations among characters • Molecular evolution • Evidence of selection • “Key innovations” • Diversification rate
Why reconstruct character evolution? • Can evaluate homology • Can determine character-state polarity • Can evaluate the “selective regime” when a character evolved
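Was the ancestor bird pollinated when red flowers evolved? Look at the pollinators. [Figure: tree showing the bee-to-bird pollination shift] Adaptation supported
Alternative result [Figure: tree showing the bee-to-bird pollination shift] Not an adaptation
A third possibility [Figure: tree showing the bee-to-bird pollination shift] Consistent with adaptation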
Why reconstruct character evolution? • Can evaluate homology • Can determine character-state polarity • Can evaluate the “selective regime” when a character evolved • Can recreate ancestral genes/proteins
Dinosaur Rhodopsin • Chang et al. (MBE 2002)
Character optimization using parsimony • Pick the reconstruction that minimizes the “cost” • What do you do if there is more than one most-parsimonious reconstruction? • ACCTRAN/DELTRAN • Consider all • What character-state weights should you use?
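The cost-minimizing step can be illustrated with a small Sankoff-style dynamic program. This is a minimal sketch, not the software used for the slides: the three-taxon tree, the tip states, and the function name `sankoff` are all hypothetical, and the character is assumed to be binary (0 = absent, 1 = present).

```python
INF = float("inf")

def sankoff(tree, tip_states, cost):
    """Return (minimum total cost, [cost if root is 0, cost if root is 1]).

    tree: nested tuples of tip names, e.g. (("A", "B"), "C")
    tip_states: dict mapping tip name -> 0 or 1
    cost[i][j]: cost of a change from state i to state j on one branch
    """
    def down(node):
        if isinstance(node, str):  # tip: only the observed state is allowed
            return [0 if s == tip_states[node] else INF for s in (0, 1)]
        child_costs = [down(child) for child in node]
        totals = []
        for parent_state in (0, 1):
            total = 0
            for cc in child_costs:
                # cheapest way to explain this child given the parent state
                total += min(cost[parent_state][s] + cc[s] for s in (0, 1))
            totals.append(total)
        return totals

    root_costs = down(tree)
    return min(root_costs), root_costs

# Hypothetical example: A and B have the trait, C lacks it.
tree = (("A", "B"), "C")
states = {"A": 1, "B": 1, "C": 0}

equal = [[0, 1], [1, 0]]        # gains and losses cost the same
costly_gain = [[0, 2], [1, 0]]  # a gain (0->1) costs twice a loss (1->0)
costly_loss = [[0, 1], [2, 0]]  # a loss costs twice a gain

for name, cost in [("equal", equal), ("costly gain", costly_gain),
                   ("costly loss", costly_loss)]:
    print(name, sankoff(tree, states, cost))
```

With equal weights both root states give the same cost, so there are two most-parsimonious reconstructions. Making gains costlier favors a root in state 1 followed by a loss, and making losses costlier favors a root in state 0 followed by a gain: the chosen weights drive the answer.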
Cost-change graph (Ree and Donoghue 1998: Syst. Biol. 47:582-588)
What gain:loss weight to use? • If you believe gains are more common (hence weighted less) you will find more gains (and vice versa) • So how can you use a tree to establish if there is a gain:loss bias?
Wing loss and re-evolution? • Whiting et al. (Nature 2003)
A likelihood approach • Developed (in parallel) by Mark Pagel and Brook Milligan in 1994 • Continuous-time Markov model • Select the rates of gain (0->1) and loss (1->0) that maximize the likelihood of the data given a sample tree (and branch lengths)
Transition rate matrix [Table: rows are the “From” state, columns are the “To” state]
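For a binary character, a rate matrix of this kind can be written out explicitly. The labeling here is an assumption, since the slide's own entries are not available: q_1 is taken to be the rate of gains (0->1) and q_2 the rate of losses (1->0). Rows are the "From" state and columns the "To" state.

```latex
Q =
\begin{pmatrix}
 -q_1 & q_1 \\
 q_2 & -q_2
\end{pmatrix},
\qquad
P(t) = e^{Qt}, \quad
\begin{aligned}
P_{01}(t) &= \frac{q_1}{q_1 + q_2}\left(1 - e^{-(q_1 + q_2)\,t}\right), \\
P_{10}(t) &= \frac{q_2}{q_1 + q_2}\left(1 - e^{-(q_1 + q_2)\,t}\right), \\
P_{00}(t) &= 1 - P_{01}(t), \qquad P_{11}(t) = 1 - P_{10}(t).
\end{aligned}
```

P_{ij}(t) is the probability that a branch of length t that starts in state i ends in state j, which is how branch lengths enter the likelihood.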
Logic • Calculate the likelihood of the data for a given value of q1 and q2 • Modify q1 and q2 to find a pair of values that maximizes the probability of the data
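The logic above can be sketched in a few lines. This is an illustration under stated assumptions rather than the method's reference implementation: the tree, branch lengths, and tip states are hypothetical, the two root states are weighted equally, and a coarse grid search stands in for a proper numerical optimizer. The function names are made up for this example.

```python
import math

def transition_probs(gain, loss, t):
    """P[i][j] = Pr(state j at the end of a branch of length t | state i at its start)."""
    total = gain + loss
    decay = math.exp(-total * t)
    pi0, pi1 = loss / total, gain / total  # stationary frequencies of 0 and 1
    return [[pi0 + pi1 * decay, pi1 * (1 - decay)],
            [pi0 * (1 - decay), pi1 + pi0 * decay]]

def partials(node, tips, gain, loss):
    """Likelihood of the tips below `node`, conditional on each state at `node`.

    A node is (label, branch_length); label is a tip name or a list of child nodes.
    """
    label, _ = node
    if isinstance(label, str):
        return [1.0 if tips[label] == s else 0.0 for s in (0, 1)]
    result = [1.0, 1.0]
    for child in label:
        below = partials(child, tips, gain, loss)
        p = transition_probs(gain, loss, child[1])
        for s in (0, 1):
            # sum over both possible states at the child's end of the branch
            result[s] *= p[s][0] * below[0] + p[s][1] * below[1]
    return result

def log_likelihood(tree, tips, gain, loss, root_prior=(0.5, 0.5)):
    """Log-likelihood summed over all ancestral states (equal root weights assumed)."""
    l0, l1 = partials(tree, tips, gain, loss)
    return math.log(root_prior[0] * l0 + root_prior[1] * l1)

# Hypothetical tree ((A:1.0, B:1.0):0.5, C:1.5) with hypothetical tip states.
tree = ([([("A", 1.0), ("B", 1.0)], 0.5), ("C", 1.5)], 0.0)
tips = {"A": 1, "B": 1, "C": 0}

# Try many (gain, loss) pairs and keep the pair with the highest likelihood.
grid = [0.05 * k for k in range(1, 61)]
best_lnl, best_gain, best_loss = max(
    (log_likelihood(tree, tips, g, l), g, l) for g in grid for l in grid
)
print(best_lnl, best_gain, best_loss)
```

Because `partials` sums over both states at every internal node, no single ancestral reconstruction has to be chosen before the rates are estimated.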
Probabilities summed across all possible ancestral states [Figure: tree with tip states 1 1 0 0 0 1 1 1 0 0; the nine remaining node labels are all 0]
How much of the likelihood contributed by each state at each node
Are gain and loss rates different? • Likelihood ratio test • Model 1: gains and losses free to vary independently • Model 2: gains and losses equal • How many degrees of freedom?
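A minimal sketch of this test follows. The two-rate model has two free parameters and the equal-rates model has one, so the test has one degree of freedom. The log-likelihood values below are hypothetical, and the chi-square reference distribution is the usual large-sample approximation.

```python
import math

def lrt_one_df(lnl_free, lnl_equal):
    """Likelihood ratio test of two rates (free) against one rate (gains = losses).

    Returns (test statistic, approximate p-value from a chi-square with 1 df).
    """
    stat = max(2.0 * (lnl_free - lnl_equal), 0.0)
    # Survival function of a chi-square with 1 df, written with erfc.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value

# Hypothetical maximized log-likelihoods for the two models.
stat, p_value = lrt_one_df(lnl_free=-10.0, lnl_equal=-12.5)
print(stat, p_value)
```

A small p-value indicates that allowing gains and losses to differ fits the data better than chance alone would explain.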
The likelihood method • Provides a method for using the data to evaluate gain:loss bias • Takes account of branch lengths • Still sensitive to taxon sampling
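Suppose this taxon contains 5000 species [Figure: tree with tip states 1 1 0 0 0 1 1 1 0 0, one taxon highlighted] Suggests that the rate of losses is low
Suppose this taxon contains 5000 species [Figure: tree with tip states 1 1 0 0 0 1 1 1 0 0, a different taxon highlighted] Suggests that the rate of gains is low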
Correlated evolution • Look at pairs of traits (where one trait can be an environment) • Body size and range size • Warning coloration and gregariousness • Fleshy fruit and dioecy • Do these traits evolve non-independently?
Causes of non-independence • Developmental “connectedness” • Adaptation (Correlated evolution has been claimed to be the best evidence for evolution by natural selection)
Non-phylogenetic (“tip”) method • Count species • Do a chi-square test
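The tip method amounts to a chi-square test on a 2x2 table of species counts. The sketch below uses hypothetical counts and applies no continuity correction. It treats every species as an independent observation, and that is the assumption the tree-based methods call into question.

```python
import math

def chi_square_2x2(table):
    """Pearson chi-square test of independence for a 2x2 table of counts.

    table = [[a, b], [c, d]]; returns (statistic, p-value with 1 df).
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = [a + b, c + d]
    col_totals = [a + c, b + d]
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    # Survival function of a chi-square with 1 df.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value

# Hypothetical species counts: rows are trait 1 (state A, state B),
# columns are trait 2 (state X, state Y).
counts = [[10, 20],
          [20, 10]]
print(chi_square_2x2(counts))
```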
Hypothetical tree [Figure: tree with tip states. Eyes: g b g g b b. Fur: d d p p d p. The labels 150 and 100 also appear on the figure]
Proposed solutions for discrete characters • Do a chi-square test of changes rather than tip-states (various approaches) - Ridley; Sillen-Tullberg • Use a Monte Carlo approach to ask if changes of the dependent variable are biased relative to expectations from changes placed on the tree at random - W. Maddison
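The Monte Carlo idea can be sketched as follows. This is a deliberately simplified stand-in for W. Maddison's test, not the test itself: it assumes each change falls on a distinct branch, that every branch is equally likely to receive a change, and that branches have already been flagged according to the state of the independent character. The branch flags and counts are hypothetical.

```python
import random

def concentrated_changes_p(flagged, n_changes, n_observed, n_sims=10000, seed=0):
    """Estimate how often random placement puts at least `n_observed` of
    `n_changes` changes on flagged branches.

    flagged: list of booleans, one per branch (True = the independent
             character is in the state of interest on that branch)
    """
    rng = random.Random(seed)
    branches = range(len(flagged))
    hits = 0
    for _ in range(n_sims):
        chosen = rng.sample(branches, n_changes)  # distinct branches, uniform
        if sum(flagged[b] for b in chosen) >= n_observed:
            hits += 1
    return hits / n_sims

# Hypothetical tree with 20 branches, 6 of them flagged; 4 changes in the
# dependent character were reconstructed, 3 of them on flagged branches.
flags = [True] * 6 + [False] * 14
print(concentrated_changes_p(flags, n_changes=4, n_observed=3))
```

The returned proportion plays the role of the probability that this pattern, or a more extreme one, arises without the independent character affecting where changes occur.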
Maddison test Probability that this pattern or a more extreme pattern could arise without fruit type affecting seed number is ca. 8%.
Problems with the Maddison test • Requires one to define dependent and independent characters • Does not take account of branch lengths • Very sensitive to inclusion/exclusion of species
Procedure • Estimate the set of rates in the q-matrix (transitions among the four combinations of two binary characters) that maximizes the likelihood of the data, and calculate that likelihood • Constrain the matrix so that it represents independence (q12 = q34; q13 = q24; q21 = q43; q31 = q42) and repeat the calculation • Use a likelihood ratio test to evaluate significance
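The constraint step and the final test can be sketched like this. Several things are assumptions made for illustration: the state numbering (1 = (0,0), 2 = (0,1), 3 = (1,0), 4 = (1,1), which is consistent with the constraints listed above), the convention that both characters never change in the same instant, the rate values, the log-likelihoods, and the use of a chi-square approximation with 4 degrees of freedom (8 free rates against 4).

```python
import math

def build_q(rates):
    """Build the 4x4 rate matrix from the eight single-character transition rates.

    rates: dict with keys q12, q13, q21, q24, q31, q34, q42, q43.
    Dual transitions (q14, q23, q32, q41) are left at zero.
    """
    q = [[0.0] * 4 for _ in range(4)]
    for name, value in rates.items():
        i, j = int(name[1]) - 1, int(name[2]) - 1
        q[i][j] = value
    for i in range(4):
        q[i][i] = -sum(q[i][j] for j in range(4) if j != i)  # rows sum to zero
    return q

def constrain_independent(rates):
    """Tie the rates so each character changes at the same rate
    whatever the state of the other character."""
    tied = dict(rates)
    tied["q34"] = tied["q12"]
    tied["q24"] = tied["q13"]
    tied["q43"] = tied["q21"]
    tied["q42"] = tied["q31"]
    return tied

def lrt_four_df(lnl_dependent, lnl_independent):
    """Likelihood ratio statistic and chi-square (4 df) p-value."""
    stat = max(2.0 * (lnl_dependent - lnl_independent), 0.0)
    p_value = math.exp(-stat / 2.0) * (1.0 + stat / 2.0)  # closed form for 4 df
    return stat, p_value

# Hypothetical rate estimates for the dependent (eight-rate) model.
free_rates = {"q12": 0.2, "q13": 0.1, "q21": 0.3, "q24": 0.9,
              "q31": 0.4, "q34": 0.8, "q42": 0.1, "q43": 0.2}
q_dependent = build_q(free_rates)
q_independent = build_q(constrain_independent(free_rates))

# Hypothetical maximized log-likelihoods under each model.
print(lrt_four_df(lnl_dependent=-20.0, lnl_independent=-26.0))
```

In a real analysis each matrix would be re-estimated by maximizing the likelihood on the tree; here the constraint is only applied to fixed values to show which rates are tied.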
Issues to consider • Rejection of independence does not tell you what kind of non-independence you have • You need reasonable branch lengths • Sampling matters (though perhaps less than with parsimony)