1. Bayesian Island Biogeography
Markov model of biogeographic processes shared across groups
Group-specific dispersal rates
Molecular data for each group, allowing us to integrate over phylogenetic uncertainty
Separate strict clock tree for each group
Using MrBayes 4
2. General Time Reversible (GTR) model
[Figure: rate matrix with axes labelled "from" and "to"]
Stationary state frequency of state i
Exchangeability rate for states i and j
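The slide's matrix did not survive extraction, so here is a minimal sketch of how a GTR rate matrix is usually assembled from the two ingredients it names. It assumes the standard parameterization, in which the rate from state i to state j is the exchangeability of i and j times the stationary frequency of j. The names `pi`, `r` and `gtr_rate_matrix` are illustrative and do not come from the slides.

```python
import numpy as np

def gtr_rate_matrix(pi, r):
    """Return Q with q_ij = r_ij * pi_j for i != j and rows summing to zero.

    pi : stationary state frequencies (assumed to sum to 1)
    r  : symmetric matrix of exchangeability rates (diagonal ignored)
    """
    pi = np.asarray(pi, dtype=float)
    r = np.asarray(r, dtype=float)
    if not np.allclose(r, r.T):
        raise ValueError("exchangeability rates must be symmetric")
    q = r * pi[np.newaxis, :]            # row = from state, column = to state
    np.fill_diagonal(q, 0.0)             # clear the diagonal before summing rows
    np.fill_diagonal(q, -q.sum(axis=1))  # diagonal makes each row sum to zero
    return q
```

Because the exchangeabilities are symmetric, the process is time reversible and `pi` is its stationary distribution, which can be checked numerically with `pi @ q`.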
3. GTR island biogeography model
[Figure: rate matrix with axes labelled "from" and "to"]
Relative carrying capacity of island i
Relative dispersal rate between islands i and j
4. Dispersal from island i to island j
Relative carrying capacity of source island
Relative dispersal rate between the islands
Relative carrying capacity of receiving island
Number of species in the system
Group-specific dispersal rate
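The slide lists five factors but the formula itself is missing from the transcript. The sketch below assumes they combine as a simple product, which is how the analogous GTR quantities combine. Treat the product form, the function name and the argument names as assumptions rather than as the authors' exact expression.

```python
def expected_dispersal_rate(n_species, pi, r, m, i, j):
    """Dispersal rate from island i to island j under the assumed product form.

    n_species : number of species in the system
    pi        : relative carrying capacities of the islands
    r         : symmetric matrix of relative dispersal rates between islands
    m         : group-specific dispersal rate
    """
    if i == j:
        raise ValueError("source and receiving island must differ")
    return n_species * pi[i] * r[i][j] * pi[j] * m
```

Under this form the rate is symmetric in i and j whenever the relative dispersal rates are symmetric, mirroring the time reversibility of the GTR substitution model.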
5. Island GTR model
[Figure: four islands labelled A, B, C, D]
All carrying capacities unequal
All dispersal rates unequal
6. Island GTR step model
[Figure: four islands labelled A, B, C, D]
All carrying capacities unequal
All dispersal rates unequal between adjacent islands, zero between nonadjacent islands
7. Island JC step model
[Figure: four islands labelled A, B, C, D]
All carrying capacities equal
Dispersal rates equal between adjacent islands, zero between nonadjacent islands
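The GTR, GTR step and JC step models differ only in which parameters are tied together or set to zero. The sketch below expresses them as constraints on a capacity vector and a dispersal-rate matrix. It assumes the islands lie along a chain, so island k is adjacent only to islands k-1 and k+1; the adjacency used in the slide figures is not recoverable from the transcript. All names are illustrative.

```python
import numpy as np

def adjacency_chain(n):
    """Boolean matrix: True where two islands are neighbours on a chain (assumed layout)."""
    idx = np.arange(n)
    return np.abs(idx[:, None] - idx[None, :]) == 1

def island_model(n, capacities=None, rates=None, step=False):
    """Return (relative carrying capacities, relative dispersal rates).

    capacities=None -> all carrying capacities equal (JC-like)
    rates=None      -> all dispersal rates equal (JC-like)
    step=True       -> dispersal rates forced to zero between nonadjacent islands
    """
    pi = np.ones(n) if capacities is None else np.asarray(capacities, dtype=float)
    pi = pi / pi.sum()                     # capacities are relative, so normalise
    r = np.ones((n, n)) if rates is None else np.array(rates, dtype=float)
    np.fill_diagonal(r, 0.0)               # no dispersal from an island to itself
    if step:
        r = np.where(adjacency_chain(n), r, 0.0)
    return pi, r
```

For example, `island_model(4, step=True)` gives the JC step model for four islands, while passing free `capacities` and `rates` with `step=False` gives the full GTR model.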
8. The Canary Islands
9. Canary Islands 3-island model
Areas: Western, Central, Eastern, Mainland
10. Data Set
9 groups, 575 species in total
Three island groups corresponding to the three main magmatism periods
Outside area: Iberia, North Africa, Macaronesia
Six different models: JC, JC step, equalin, equalin step, GTR, GTR step
11. Dolichoiulus (Diplopoda)
46 endemic species
Dolichoiulus (Diplopoda, Julida, Julidae, Pachyulinae)
mtDNA: COI, cytB, 16S
12. Model
DNA data 1: GTR1, µ1, m1, T1
DNA data 2: GTR2, µ2, m2, T2
DNA data 3: GTR3, µ3, m3, T3
IM distribution
IM: island model
µi: mutation rate
mi: dispersal rate
13. Canary Islands – GTR model
[Figure: areas labelled A, B, C, D; Western, Central, Eastern, Mainland]
14. More models…
Mixture models
Hidden Markov Models (HMM)
Model jumping
Dirichlet Process Mixture Models
Indel Model
[Figure: posterior probability plotted against number of substitution types (K)]
Model averaging (reversible jump MCMC) over all possible submodels of the GTR model
Huelsenbeck et al., 2004, MBE
16. Bayesian Caveats
Prior probability distributions: use common sense and examine the posterior distributions of all parameters
Assessing convergence: always compare the results of independent analyses started from different, randomly chosen trees
Model choice: never fix parameters according to a model testing program like ModelTest
Model adequacy: test model adequacy if possible
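One common way to compare independent analyses is to check whether they agree on clade (split) frequencies. The sketch below averages, over splits, the standard deviation of each split's frequency across runs. It is in the spirit of the split-frequency diagnostics reported by phylogenetic MCMC software but does not claim to reproduce any program's exact calculation. The `min_freq` cutoff, the use of the sample standard deviation and all names are assumptions.

```python
from statistics import stdev

def average_split_sd(runs, min_freq=0.1):
    """Average standard deviation of split frequencies across independent runs.

    runs     : list of dicts mapping a split (e.g. a frozenset of taxon names)
               to its sampled frequency in that run; absent splits count as 0
    min_freq : ignore splits whose frequency is below this in every run (assumed cutoff)
    """
    if len(runs) < 2:
        raise ValueError("need at least two independent runs")
    splits = set().union(*(run.keys() for run in runs))
    sds = []
    for split in splits:
        freqs = [run.get(split, 0.0) for run in runs]
        if max(freqs) >= min_freq:
            sds.append(stdev(freqs))  # sample standard deviation across runs
    return sum(sds) / len(sds) if sds else 0.0
```

Values close to zero indicate that the runs, although started from different trees, sampled the same splits at similar frequencies. Larger values suggest the chains have not converged on the same posterior.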
17. Bayesian Critiques
We can use simple methods with massive amounts of data
[Figure: probability correct plotted against number of sites, for a consistent method and an inconsistent method]
As data accumulates, problems with a method will become more serious
19. Bayesian Critiques
We can use simple methods with massive amounts of data
Current evolutionary models are too simple; it is better to use methods not based on model assumptions
[Figure: model space for a data set with 100 taxa and 1000 sites, ordered by number of parameters. Labels: JC, GTR+I+Γ, strict clock models, standard non-clock models, UCM, NCM, Goldman, parsimony-like models, super model]
Super model could potentially account for:
Evolution at the codon, amino acid and nucleotide levels
Insertion and deletion
Correlation across sites
Process heterogeneity across sites and across tree
Rate heterogeneity across tree
3D structure
21. Bayesian Critiques
We can use simple methods with massive amounts of data
Current evolutionary models are too simple; it is better to use methods not based on model assumptions
Bayesian inference is statistically inconsistent
22. Bayesian Critiques
We can use simple methods with massive amounts of data
Current evolutionary models are too simple; it is better to use methods not based on model assumptions
Bayesian inference is statistically inconsistent
Bayesian posterior probabilities are biased (slight underestimates) even when the model is correct
Model is correct (Huelsenbeck and Rannala, Syst Bio, 2004)
Uniform(0,10) prior probability on branch lengths
A given difference in LnL translates to strong preference for the best tree
The same difference in LnL translates to weaker preference for the best tree
Why Bayesian posteriors have been reported as being biased under the correct model
25. Bayesian Critiques
We can use simple methods with massive amounts of data
Current evolutionary models are too simple; it is better to use methods not based on model assumptions
Bayesian inference is statistically inconsistent
Bayesian posterior probabilities are biased (slight underestimates) even when the model is correct
Bayesian inference is sensitive to model misspecification; non-parametric methods are more robust
Is Bayesian inference overly sensitive to model misspecification?
Model is overparameterized (Huelsenbeck and Rannala, Syst Bio, 2004)
Model is oversimplified (Huelsenbeck and Rannala, Syst Bio, 2004)
Maximum likelihood is less sensitive to model misspecification than Bayesian inference
FALSE. There are actually reasons to believe that the opposite is true because of the need to estimate nuisance parameters in the likelihood approach. Nevertheless, the differences between the two approaches are going to be small. Both are parametric approaches and rely on the model being reasonable.
Nonparametric bootstrapping can fix the model sensitivity of the maximum likelihood approach
[Figure: non-parametric bootstrapping. Original characters, pseudosampling 100-1000 times, ML analysis of each pseudoreplicate, estimate of uncertainty]
MAYBE. Nonparametric bootstrapping can reduce the confidence in incorrect groups when there is process heterogeneity across sites. It cannot fix problems with dependence among sites or process heterogeneity across the tree. Nonparametric bootstrapping can be applied both to maximum likelihood and Bayesian inference.
Nonindependence affects the bootstrap
[Figure: probability that a well-supported (> 80 % BP) MP tree is correct]
When sites are linked, the probability that a well-supported tree is incorrect increases from 4 % to 25 %
Galtier 2004, Syst Bio
35. “Fixing” Bayesian Clade PPs
Put more prior probability on very short interior branches: short branches become unsupported
Put prior weight on polytomies: short branches become unsupported
Apply the non-parametric bootstrap
Use a better model
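The non-parametric bootstrap mentioned above resamples alignment columns (characters) with replacement to build pseudoreplicate data sets of the same size, each of which is then analysed separately. A minimal sketch of the resampling step follows, assuming the alignment is held as a dict of equal-length sequence strings. The function name and the data layout are illustrative, and the tree inference step is left out.

```python
import random

def bootstrap_alignment(alignment, rng):
    """Return one pseudoreplicate: columns drawn with replacement from the original.

    alignment : dict mapping taxon name -> aligned sequence (all the same length)
    rng       : a random.Random instance, so that replicates are reproducible
    """
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all sequences must have the same length")
    n_sites = lengths.pop()
    # The same column indices are used for every taxon, so sites stay intact.
    cols = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {taxon: "".join(seq[c] for c in cols) for taxon, seq in alignment.items()}
```

Repeating this 100 to 1000 times and analysing each replicate gives the bootstrap estimate of uncertainty. Because whole columns are drawn independently, the procedure still assumes that sites are independent, which is why linkage among sites undermines it.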
36. Bayesian Software
Model testing: ModelTest, MrModelTest, MrAIC
Convergence diagnostics: AWTY, Tracer
Phylogenetic inference: MrBayes, BEAST, BayesPhylogenies, PhyloBayes, Phycas, BAMBE
Specialized inference: PHASE, BAliPhy, BayesTraits, Badger
Tree drawing: TreeView, FigTree