Practical Session: Bayesian evolutionary analysis by sampling trees (BEAST). Rebecca R. Gray, Ph.D. Department of Pathology University of Florida. BEAST: is a cross-platform program for Bayesian MCMC analysis of molecular sequences
Practical Session: Bayesian evolutionary analysis by sampling trees (BEAST) Rebecca R. Gray, Ph.D. Department of Pathology University of Florida
BEAST: • is a cross-platform program for Bayesian MCMC analysis of molecular sequences • entirely orientated towards rooted, time-measured phylogenies inferred using strict or relaxed molecular clock models • can be used as a method of reconstructing phylogenies, but is also a framework for testing evolutionary hypotheses without conditioning on a single tree topology • uses MCMC to average over tree space, so that each tree is weighted proportional to its posterior probability
Citations • The recommended citation for this program is: • Drummond AJ, Rambaut A (2007) "BEAST: Bayesian evolutionary analysis by sampling trees." BMC Evolutionary Biology 7:214 • To cite the relaxed clock model in BEAST: • Drummond AJ, Ho SYW, Phillips MJ & Rambaut A (2006) PLoS Biology 4, e88 • To cite the Bayesian Skyline model in BEAST: • Drummond AJ, Rambaut A, Shapiro B & Pybus OG (2005) Mol Biol Evol 22, 1185-1192 • The original MCMC paper was: • Drummond AJ, Nicholls GK, Rodrigo AG & Solomon W (2002) Genetics 161, 1307-1320
Basic Pipeline • 1) setting up the XML file (BEAUti) • 2) running the XML file (BEAST) • 3) evaluating the performance of the run (Tracer) • 4) comparing models, obtaining estimates of parameters (Tracer) • 5) summarizing the tree distribution (TreeAnnotator) • 6) viewing the MCC tree (FigTree)
Downloading programs • http://beast.bio.ed.ac.uk/Main_Page • Download contains BEAUti, BEAST, TreeAnnotator • http://beast.bio.ed.ac.uk/Tracer • http://beast.bio.ed.ac.uk/FigTree
Epidemiology of Rift Valley fever (RVF) • The virus was first identified in 1931 in the Rift Valley of Kenya • Mosquito vector, primarily infects livestock • In 1997–1998, a major outbreak occurred in Kenya, Somalia and the United Republic of Tanzania • In September 2000, cases were confirmed in Saudi Arabia and Yemen (first reported occurrence of the disease outside the African continent)
Setting up the XML file in BEAUti • Requires a NEXUS file • Helpful to include the sampling date in each sample name • Use the finest date resolution available • The GUI allows basic selection of parameters • The XML file can be manually edited to test specific hypotheses or tweak the run
BEAUti practical • Import alignment (g_63.nex) • Tip dates – use tip dates, guess dates (years since some time in the past) • Site models – use GTR + G, empirical base frequencies • Test hypothesis of strict vs. relaxed molecular clock • Trees – coalescent tree prior – constant size • 5 × 10^7 generations
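The "guess dates" step reads a sampling date out of each taxon name. A minimal Python sketch of that idea follows. It assumes, hypothetically, that every name ends in an underscore followed by a four-digit year. The taxon names and the helpers `parse_year` and `tip_heights` are illustrative only and are not taken from g_63.nex or from BEAUti itself.

```python
import re

def parse_year(taxon_name):
    """Return the trailing four-digit year of a taxon name (assumed naming scheme)."""
    match = re.search(r"_(\d{4})$", taxon_name)
    if match is None:
        raise ValueError(f"no trailing year in {taxon_name!r}")
    return int(match.group(1))

def tip_heights(taxon_names):
    """Map each taxon to its age in years before the most recent sample."""
    years = {name: parse_year(name) for name in taxon_names}
    newest = max(years.values())
    return {name: newest - year for name, year in years.items()}

# Hypothetical taxon names, for illustration only.
names = ["KEN_a_1998", "SAU_b_2000", "EGY_c_1977"]
heights = tip_heights(names)
print(heights)
```

Checking a few parsed dates by hand before running the analysis is a cheap way to catch names that do not follow the expected pattern.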
BEAST • Open the XML file with a text editor • Run in BEAST • Check mixing of the MCMC chain • Open S log files in Tracer • Open L and G2 log files • What can we do about the trace?
Proper mixing • First step – run the chain longer • Open L200 files • Other steps to try: • Overparameterization – reduce complexity • Temporal/phylogenetic signal • Priors are inappropriate
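One common numerical check on mixing is the effective sample size (ESS) of a logged parameter, which Tracer reports for each trace. The sketch below is a crude estimator and is not Tracer's exact algorithm: it divides the number of samples by an autocorrelation time summed up to the first non-positive lag. The two traces are synthetic (an assumption for illustration, not BEAST output): one of independent draws and one of a strongly autocorrelated chain.

```python
import random

def effective_sample_size(trace):
    """Crude ESS: n / (1 + 2 * sum of leading positive autocorrelations)."""
    n = len(trace)
    mean = sum(trace) / n
    centered = [x - mean for x in trace]
    variance = sum(c * c for c in centered) / n
    if variance == 0:
        return float(n)
    tau = 1.0
    for lag in range(1, n // 2):
        acf = sum(centered[i] * centered[i + lag] for i in range(n - lag)) / (n * variance)
        if acf <= 0:  # truncate at the first non-positive autocorrelation
            break
        tau += 2.0 * acf
    return n / tau

random.seed(1)
# Synthetic traces: independent draws versus a sticky AR(1) chain.
well_mixed = [random.gauss(0, 1) for _ in range(2000)]
sticky = [0.0]
for _ in range(1999):
    sticky.append(0.95 * sticky[-1] + random.gauss(0, 1))

ess_good = effective_sample_size(well_mixed)
ess_bad = effective_sample_size(sticky)
print(round(ess_good), round(ess_bad))
```

Both traces contain 2000 samples, but the autocorrelated one carries far fewer effectively independent draws, which is why running the chain longer is the first remedy to try.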
Model testing • Bayes factors: • Compare estimates of the marginal likelihoods of the models of interest • 2 × (ln marginal likelihood of model 1 – ln marginal likelihood of model 2) • > 10: strong support for the alternative (more complex) model • Strict clock vs. relaxed clock • Also consider the coefficient of variation
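The Bayes factor statistic on this slide is simple arithmetic once the two ln marginal likelihoods have been estimated. A minimal sketch, with the caveat that the two input values are hypothetical numbers chosen for illustration and are not results from this practical:

```python
def two_ln_bayes_factor(ln_ml_model1, ln_ml_model2):
    """Return 2 * (ln marginal likelihood of model 1 - ln marginal likelihood of model 2)."""
    return 2.0 * (ln_ml_model1 - ln_ml_model2)

# Hypothetical ln marginal likelihoods, not results from the practical.
ln_ml_relaxed = -5230.4  # more complex model
ln_ml_strict = -5241.9   # simpler model

statistic = two_ln_bayes_factor(ln_ml_relaxed, ln_ml_strict)
strong_support = statistic > 10  # threshold given on the slide
print(f"2 ln BF = {statistic:.1f}; strong support for relaxed clock: {strong_support}")
```

Because the inputs are on the natural-log scale, a difference of only a few log units between the models already produces a large statistic.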
Summarizing the tree • TreeAnnotator • Burn-in 10% (501 samples) • Keep median heights • MCC tree • Visualizing the tree: FigTree • Posterior probabilities for branches • Median heights for clades of interest
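The burn-in count can be worked out from the chain length and the logging interval. The sketch below assumes a chain of 5 × 10^7 states and, hypothetically, a tree logged every 10,000 states plus the initial state. The interval is not stated in the slides, and rounding the 10% up is a choice made here that happens to reproduce the 501 quoted above.

```python
def burn_in_count(n_samples, percent=10):
    """Number of leading samples to discard, rounded up (integer arithmetic)."""
    return -(-n_samples * percent // 100)

chain_length = 50_000_000  # 5 x 10^7 states
log_every = 10_000         # assumed logging interval, not given in the slides

n_samples = chain_length // log_every + 1  # +1 for the sample logged at state 0
discard = burn_in_count(n_samples)
kept = n_samples - discard
print(n_samples, discard, kept)
```

If the trees were logged at a different interval, the same 10% would correspond to a different number of samples, so the count should be recomputed from the actual tree file.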
Advanced analyses • Different coalescent priors • Parametric models (exponential, logistic) • Bayesian skyline plots • Phylogeography • Lemey et al., 2009, PLoS Computational Biology • Site-specific rates of variation
[Figure: Change in effective population size over time; y-axis: Log10 Ne]
[Figure: Bayesian genealogy of the G gene; annotation: 1916 (1868-1942)]
Additional resources • Tutorials on the BEAST website, Google group • 16th International BioInformatics Workshop on Virus Evolution and Molecular Epidemiology • Johns Hopkins University, Baltimore • 29 August - 03 September 2010, Bethesda, USA • http://www.rega.kuleuven.be/cev/workshop/