
Correlating traits with phylogenies




  1. Correlating traits with phylogenies Using BaTS

  2. Phylogeny and trait values • A phylogeny describes a hypothesis about the evolutionary relationship between individuals sampled from a population • Discrete character traits of interest can be mapped onto the phylogeny • A significant association between a particular trait value and its distribution on a phylogeny indicates a potential causative relationship


  6. Phylogeny and trait values • Often, the phylogeny-trait relationship does not appear unequivocal by eye: an analytical framework may be needed. [Figure: three example trees, labelled "clear association", "no association" and "????"]

  7. Phylogeny and trait values: the null hypothesis • The null hypothesis under test is one of random phylogeny-trait association; that is, that “No single tip bearing a given character trait is any more likely to share that trait with adjoining taxa than we would expect due to chance”

  8. An example • Salemi et al. (2005)*: Dataset of HIV sequences sampled from CNS tissues post mortem • Analysis by Slatkin-Maddison (1989) method, reanalyzed in BaTS** • Compartmentalization by tissue type: circulating viral populations defined by location in the body *Salemi et al. (2005) J. Virol. 79(17): 11343-11352. **Parker, Rambaut & Pybus (2008) MEEGID 8(3): 239-246.

  9. Available methods • Non-phylogenetic: ANOVA • Ignores shared ancestry • Phylogenetic: • Single tree mapping • Slatkin-Maddison & AI • BaTS

  10. Methods: Single-tree mapping • Method: • Map traits onto a tree • Look for correlation • Pros: • Fast • Simple • Cons: • No indication of significance • Statistically weak (high Type II error) • Conditional on a single topology

  11. Methods: Slatkin-Maddison & AI • Method: • Map traits onto a tree by parsimony & count migration events (Slatkin-Maddison) or measure ‘association index’ within clades recursively (AI) • Compare observed value with a null (expected) value obtained by bootstrapping • Pros: • Still reasonably fast • Indication of significance • Cons: • Still conditional on a single topology
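The Slatkin-Maddison idea on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the BaTS implementation: the nested-tuple tree, the tip names, the "brain"/"blood" trait values and the helpers `fitch_changes`, `tip_names` and `null_changes` are all hypothetical, and the null distribution here is built by shuffling trait labels across tips, which is one simple way to realise a random phylogeny-trait association.

```python
import random

def fitch_changes(tree, states):
    """Fitch parsimony on a rooted binary tree.

    `tree` is a tip name (str) or a 2-tuple of subtrees; `states` maps
    tip name -> trait value. Returns (state set at this node, minimum
    number of state changes in the subtree).
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    lset, lcount = fitch_changes(tree[0], states)
    rset, rcount = fitch_changes(tree[1], states)
    common = lset & rset
    if common:
        return common, lcount + rcount
    # Disjoint child sets imply one extra change ("migration event").
    return lset | rset, lcount + rcount + 1

def tip_names(tree):
    """List tip names in left-to-right order."""
    if isinstance(tree, str):
        return [tree]
    return tip_names(tree[0]) + tip_names(tree[1])

def null_changes(tree, states, reps, seed=0):
    """Change counts after shuffling trait labels across the tips."""
    rng = random.Random(seed)
    tips = tip_names(tree)
    labels = [states[t] for t in tips]
    counts = []
    for _ in range(reps):
        rng.shuffle(labels)
        counts.append(fitch_changes(tree, dict(zip(tips, labels)))[1])
    return counts

# Hypothetical 8-tip tree with perfectly clustered traits.
tree = ((("a", "b"), ("c", "d")), (("e", "f"), ("g", "h")))
states = {t: "brain" for t in "abcd"}
states.update({t: "blood" for t in "efgh"})

observed = fitch_changes(tree, states)[1]
null = null_changes(tree, states, reps=200)
# Fewer changes means tighter clustering, so count null values
# at least as small as the observed one.
p_value = sum(n <= observed for n in null) / len(null)
print(observed, p_value)
```

Note that this sketch still conditions on a single topology, which is the limitation the slide lists under "Cons".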

  12. Methods: BaTS • Method: • See below(!) • Pros: • Indication of significance • Statistically powerful and Type I error is correct • Accounts for phylogenetic uncertainty • Cons: • Requires Bayesian MCMC sequence analysis • Slower

  13. BaTS: under the bonnet • Use a posterior distribution of phylogenies from Bayesian MCMC analysis • Calculates migrations, AI and a variety of other measures of association • Both observed and expected (null) values’ posterior distributions sampled • Significance obtained by comparing observed vs. expected
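The outline on this slide can be illustrated with a small Python sketch. It is written under stated assumptions and is not the BaTS code: the three hard-coded trees stand in for a posterior sample, the only statistic computed is a Fitch parsimony count, and the way the p-value is derived (proportion of null replicates whose mean is at least as small as the observed mean) is an assumption, since the slides do not spell out the exact calculation. The names `parsimony_score` and `association_test` are hypothetical.

```python
import random
from statistics import mean

def parsimony_score(tree, states):
    """Fitch count of state changes; tree is a tip name or a 2-tuple of subtrees."""
    def walk(node):
        if isinstance(node, str):
            return {states[node]}, 0
        (ls, lc), (rs, rc) = walk(node[0]), walk(node[1])
        both = ls & rs
        return (both, lc + rc) if both else (ls | rs, lc + rc + 1)
    return walk(tree)[1]

def association_test(trees, states, reps, seed=1):
    """Compare the observed statistic, averaged over all trees, with a
    null obtained by shuffling trait labels across tips `reps` times."""
    rng = random.Random(seed)
    tips = sorted(states)
    obs_mean = mean(parsimony_score(t, states) for t in trees)
    null_means = []
    for _ in range(reps):
        labels = [states[t] for t in tips]
        rng.shuffle(labels)
        shuffled = dict(zip(tips, labels))
        # The same shuffled labelling is scored on every sampled tree.
        null_means.append(mean(parsimony_score(t, shuffled) for t in trees))
    # Assumed one-sided test: fewer changes means stronger association.
    p = sum(n <= obs_mean for n in null_means) / reps
    return obs_mean, mean(null_means), p

# Hypothetical stand-in for a posterior sample: three topologies that
# differ in detail but all separate tips a-d from tips e-h.
trees = [
    ((("a", "b"), ("c", "d")), (("e", "f"), ("g", "h"))),
    ((("a", "c"), ("b", "d")), (("e", "g"), ("f", "h"))),
    (((("a", "b"), "c"), "d"), ((("e", "f"), "g"), "h")),
]
states = {t: "X" for t in "abcd"}
states.update({t: "Y" for t in "efgh"})

obs_mean, null_mean, p = association_test(trees, states, reps=100)
print(obs_mean, null_mean, p)
```

Averaging over many sampled trees, rather than scoring one tree, is what lets the test account for phylogenetic uncertainty.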

  14. BaTS: analysis workflow • Preparation: • Sequence alignment • Bayesian MCMC phylogeny reconstruction (BEAST, MrBayes) to obtain posterior distribution of trees (PST) • Taxa in PST marked up with discrete traits • BaTS analysis • Interpretation

  15. Workflow: Preparation (i) • Sequence alignment: • CLUSTAL, BioEdit, Se-Al • Bayesian MCMC analysis: • MrBayes, BEAST • Taxa marked-up with traits

  16. Workflow: Preparation (ii) • Taxa marked-up with traits: Typical NEXUS format:

  17. Workflow: Preparation (iii) • Taxa marked-up with traits: a) Declare the ‘states’ block: begin states; b) Assign a trait to each taxon in the order that they appear in the original #NEXUS file. c) Close the ‘states’ block. d) Omit ‘translate’ and ‘taxa’ blocks.

  18. Workflow: BaTS analysis • To use BaTS from the command line, type: java -jar BaTS_beta_build2.jar [single|batch] <treefile_name> <reps> <states> • Where: • single or batch asks BaTS to analyse either a single input file or a whole directory (batch analysis) • <treefile_name> is the name and full location of the treefile or directory to be analysed • <reps> is the number (an integer > 1, typically at least 100) of state randomizations to perform to yield a null distribution • <states> is the number of different states seen
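When BaTS runs are scripted, it can help to assemble and check the argument list before launching Java. The sketch below only builds the command described on this slide; the helper name `bats_command` is hypothetical, and the checks simply mirror the constraints stated above (mode is single or batch, reps is an integer greater than 1).

```python
def bats_command(mode, treefile, reps, states, jar="BaTS_beta_build2.jar"):
    """Build the BaTS argument list: [single|batch] <treefile_name> <reps> <states>."""
    if mode not in ("single", "batch"):
        raise ValueError("mode must be 'single' or 'batch'")
    if not isinstance(reps, int) or reps <= 1:
        raise ValueError("reps must be an integer greater than 1")
    if not isinstance(states, int) or states < 1:
        raise ValueError("states must be a positive integer")
    return ["java", "-jar", jar, mode, str(treefile), str(reps), str(states)]

cmd = bats_command("single", "example.trees", 100, 7)
print(" ".join(cmd))
# prints: java -jar BaTS_beta_build2.jar single example.trees 100 7
```

The list can then be handed to `subprocess.run(cmd, check=True)` on a machine where Java and the jar file are available.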

  19. The analysis • 30 trees were detected in the input file • Output: statistics, one per line, tabulated • (housekeeping and debugging messages) • The ‘MC…’ statistics are reported in the order in which they occur in the input file
      C:\joeWork\apps\BaTS\BaTS_beta_build2\BaTS_beta_build2>java -jar BaTS_beta_build2.jar single example.trees 100 7
      Performing single analysis.
      File: example.trees
      Null replicates: 100
      Maximum number of discrete character states: 7
      analysing... 30 trees, with 7 states
      analysing observed (using obs state data)
      30 29
      30 29
      30 29
      30 29
      Statistic     observed mean        lower 95% CI        upper 95% CI        null mean            lower 95% CI         upper 95% CI         significance
      AI            1.5555052757263184   1.1128820180892944  2.160351037979126   12.03488540649414    11.475320040039      12.6391201928711     0.0
      PS            18.5                 17.0                20.0                80.7713394165039     77.86666870117188    83.56666564941406    0.0
      MC (state 0)  12.633333206176758   9.0                 16.0                1.7496669292449951   1.399999976158142    2.1666667461395264   0.009999990463256836
      MC (state 1)  19.0                 19.0                19.0                1.7480005025863647   1.3333333730697632   2.0999999046325684   0.009999990463256836
      MC (state 2)  12.666666984558105   12.0                13.0                1.77991247559        1.33333697632        2.200000047683716    0.009999990463256836
      MC (state 3)  8.566666603088379    3.0                 11.0                1.66733866943        1.2333333492279053   2.133333444595337    0.009999990463256836
      MC (state 4)  11.0                 11.0                11.0                1.5526663064956665   1.1666666269302368   2.0999999046325684   0.009999990463256836
      MC (state 5)  3.433333396911621    2.0                 6.0                 1.4840000867843628   1.100000023841858    2.0333333015441895   0.009999990463256836
      MC (state 6)  5.066666603088379    5.0                 6.0                 1.2973339557647705   1.0333333015441895   1.600000023841858    0.009999990463256836
      done
      Done.

  20. Workflow: Interpretation: the null hypothesis • The null hypothesis under test is one of random phylogeny-trait association; that is, that “No single tip bearing a given character trait is any more likely to share that trait with adjoining taxa than we would expect due to chance”

  21. Workflow: Interpretation The statistics: • For MC, larger observed values indicate increased phylogeny-trait association; for AI and PS, smaller observed values (relative to the null) do • Significance indicated by p-value • In addition, observed posterior values are informative for some statistics: • PS: indicates migration events between trait values • MC(trait value): indicates number of taxa in largest clade monophyletic for that trait value
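To make the MC definition concrete, here is a minimal Python sketch that finds the size of the largest clade whose tips all carry one trait value. It assumes a rooted binary tree written as nested tuples and treats a single tip as a clade of size 1; the tree, the trait values and the function name `max_monophyletic_clade` are hypothetical, and this is an illustration of the definition on the slide rather than the BaTS code.

```python
def max_monophyletic_clade(tree, states, value):
    """Number of tips in the largest clade in which every tip carries `value`."""
    best = 0

    def walk(node):
        # Returns (number of tips below node, True if all of them carry value).
        nonlocal best
        if isinstance(node, str):
            size, pure = 1, states[node] == value
        else:
            lsize, lpure = walk(node[0])
            rsize, rpure = walk(node[1])
            size, pure = lsize + rsize, lpure and rpure
        if pure:
            best = max(best, size)
        return size, pure

    walk(tree)
    return best

# Hypothetical example: the left clade is all "brain"; the right clade is mixed.
tree = ((("a", "b"), ("c", "d")), (("e", "f"), ("g", "h")))
states = {"a": "brain", "b": "brain", "c": "brain", "d": "brain",
          "e": "blood", "f": "brain", "g": "blood", "h": "blood"}

print(max_monophyletic_clade(tree, states, "brain"))  # prints 4
print(max_monophyletic_clade(tree, states, "blood"))  # prints 2
```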

  22. FAQs / common pitfalls • Java 1.5 or higher is required. See java.sun.com for more. • Large datasets can be slow, so down-sample input tree files (uniformly, not randomly) where necessary. A down-sampled file is also a quick way to check that BaTS input files are marked up correctly. • A shortage of RAM (memory) can slow the analysis; use the -Xmx switch to raise the maximum memory available to Java* • Check input file mark-up carefully if in doubt. *See more: http://edocs.bea.com/wls/docs70/perform/JVMTuning.html
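Down-sampling "uniformly, not randomly" can be read as keeping evenly spaced trees along the MCMC sample. The sketch below shows that on a plain Python list; it assumes the trees have already been read from the tree file into a list (the file parsing is not shown), and the helper name `thin_uniformly` is hypothetical.

```python
def thin_uniformly(items, keep):
    """Return `keep` evenly spaced items from `items`, preserving their order."""
    if keep < 1:
        raise ValueError("keep must be at least 1")
    if keep >= len(items):
        return list(items)
    step = len(items) / keep
    # int(i * step) is always below len(items) because i < keep.
    return [items[int(i * step)] for i in range(keep)]

# Stand-in for 30 sampled trees, represented here by their index.
sampled = list(range(30))
print(thin_uniformly(sampled, 10))
# prints: [0, 3, 6, 9, 12, 15, 18, 21, 24, 27]
```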

  23. Author contact: Joe Parker Department of Zoology Oxford University, UK OX1 3PS joe@kitserve.org.uk http://evolve.zoo.ox.ac.uk
