Biological inferences from barcoding data Timothy G. Barraclough

Presentation Transcript


  1. Biological inferences from barcoding data. Timothy G. Barraclough. Establishing a standard DNA barcode for land plants.

  2. Describing and explaining biological diversity. Traditional taxonomy: slow and subjective.

  3. Describing and explaining biological diversity. Traditional taxonomy: slow and subjective. Evolutionary methods: model systems.

  4. Describing and explaining biological diversity. Traditional taxonomy: slow and subjective. Evolutionary methods: model systems. Barcoding data: large samples within and between species.

  5. Describing and explaining biological diversity. Traditional taxonomy: slow and subjective. Evolutionary methods: model systems. Barcoding data. Pro: large samples within and between species. Con: single marker; lacking conceptual basis; biological relevance?

  6. Analysing barcoding data. Empirical approaches: thresholds; pairwise distances; accuracies. OK for species I.D. but limited for evolutionary inference. Assumes prior knowledge of species.

  7. Analysing barcoding data. Empirical approaches: thresholds; pairwise distances; accuracies. OK for species I.D. but limited for evolutionary inference. Assumes prior knowledge of species. Population genetics approaches: statistical tests of predicted signatures of no gene flow between populations.
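The empirical approach on this slide can be illustrated with a minimal sketch: compute uncorrected pairwise distances (p-distances) between aligned barcode sequences and flag pairs that fall below a fixed threshold as putatively conspecific. The toy sequences, the 0.15 threshold and the function names are illustrative assumptions, not values from the talk.

```python
from itertools import combinations

def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences.

    Sites with a gap ('-') or ambiguous base ('N') in either sequence are skipped.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    diffs = 0
    for x, y in zip(a.upper(), b.upper()):
        if x in "-N" or y in "-N":
            continue
        compared += 1
        if x != y:
            diffs += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return diffs / compared

def pairs_below_threshold(seqs, threshold):
    """Return (name1, name2, distance) for every pair closer than the threshold."""
    out = []
    for (n1, s1), (n2, s2) in combinations(sorted(seqs.items()), 2):
        d = p_distance(s1, s2)
        if d < threshold:
            out.append((n1, n2, d))
    return out

# Hypothetical toy alignment (names and sequences are made up).
toy = {
    "ind_A1": "ACGTACGTAC",
    "ind_A2": "ACGTACGTAT",
    "ind_B1": "TCGAACGGAT",
}
close_pairs = pairs_below_threshold(toy, threshold=0.15)
```

The threshold has to be chosen in advance, which is the sense in which this kind of analysis assumes prior knowledge of species.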

  8. Population genetics approaches. Pros: biological inference, large body of theory.

  9. Population genetics approaches. Pros: biological inference, large body of theory. Cons: assume neutral coalescence; prior informal species limits; single marker (methods developed for multi-locus data); computationally intensive.

  10. E.g. Rivacindela tiger beetles on salt lakes in Australia: sequence 5 individuals per morphotype per salt lake for mtDNA. Pons, J. et al. In press. Systematic Biology.

  11. Genetic signatures of species/speciation (signature, establishment time, data needed)
      1. Allele frequencies. Establishment time: <0.5N but*. Data needed: prior groups.
      2. Fixed differences. Data needed: prior groups.
      3. Monophyly. Data needed: prior groups.
      4. Genealogical concordance. Data needed: 2 or more unlinked markers.
      5. Clusters. Establishment time: >1N. Data needed: 1 marker.

  12. Likelihood method testing for significant clusters. Among-species branching. Within-species branching.

  13. Among-species branching = speciation rate, extinction rate, how they vary over time; sampling, reconstruction biases.

  14. Within-species branching = coalescence: population size, demographic and selective history; sampling/artefacts?

  15. Birth-death branching models. [Figure: log(number of lineages) against relative time since root node, with waiting intervals x1, x2, x3.] Barraclough, T.G. and Nee, S. 2001. Trends Ecol Evol. 16:391-399

  16. Among-species branching, Yule model: $\mathrm{Lik}(x) = \lambda n e^{-\lambda n x}$, where x is the waiting interval, n is the number of lineages during the interval, and $\lambda$ is the per-lineage branching rate.
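As a minimal sketch of the Yule likelihood on this slide, the log-likelihood of a set of waiting intervals is the sum of log(λn) − λnx over intervals, and the maximum-likelihood estimate of λ has a closed form (number of intervals divided by total lineage-time). The function names and the toy intervals below are illustrative assumptions.

```python
import math

def yule_log_likelihood(intervals, lam):
    """Log-likelihood of waiting intervals under a pure-birth (Yule) model.

    intervals: list of (n, x) pairs, with n lineages present during a wait of length x.
    lam: per-lineage branching rate.
    """
    ll = 0.0
    for n, x in intervals:
        rate = lam * n              # total branching rate while n lineages exist
        ll += math.log(rate) - rate * x
    return ll

def yule_mle(intervals):
    """Closed-form MLE of the branching rate: events / total lineage-time."""
    lineage_time = sum(n * x for n, x in intervals)
    return len(intervals) / lineage_time

# Hypothetical waiting intervals (n lineages, interval length).
toy_intervals = [(2, 0.5), (3, 0.2), (4, 0.1)]
lam_hat = yule_mle(toy_intervals)
```

Evaluating `yule_log_likelihood` at `lam_hat` and at nearby values is a quick way to check that the estimate sits at the peak of the likelihood.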

  17. Coalescent theory E.g. Human demographic and selective history Kingman, Hudson, etc. etc.

  18. Within species branching, neutral coalescent

  19. Among-species branching: λ1. Within-species branching: λ2. Likelihood method testing for significant clusters => compare with no-threshold, single-entity model.
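The comparison against a single-entity model can be sketched as a likelihood ratio. The example below is a deliberately simplified stand-in, not the model from the talk: it ignores which lineages belong to which species and the coalescent form of within-species branching, and it simply asks whether two per-lineage rates (one before and one after a split point) fit the waiting intervals better than one rate. All names and the toy data are assumptions.

```python
import math

def max_log_likelihood(intervals):
    """Maximised Yule log-likelihood for (n, x) intervals, using lam = m / sum(n * x)."""
    m = len(intervals)
    lam = m / sum(n * x for n, x in intervals)
    return sum(math.log(lam * n) - lam * n * x for n, x in intervals)

def best_threshold(intervals):
    """Scan every split point and return (k, likelihood-ratio statistic).

    intervals are ordered from root to tips; intervals[:k] share one rate
    and intervals[k:] share another. The statistic is 2 * (lnL_two_rate - lnL_one_rate).
    """
    null_ll = max_log_likelihood(intervals)
    best_k, best_ll = None, -math.inf
    for k in range(1, len(intervals)):
        ll = max_log_likelihood(intervals[:k]) + max_log_likelihood(intervals[k:])
        if ll > best_ll:
            best_k, best_ll = k, ll
    return best_k, 2.0 * (best_ll - null_ll)

# Hypothetical data: slow branching for the four oldest intervals, then a tenfold faster rate.
toy = [(n, 1.0 / n) for n in range(2, 6)] + [(n, 0.1 / n) for n in range(6, 11)]
k, lr_stat = best_threshold(toy)
```

A large statistic favours the two-rate model. Because the split point is itself optimised, comparing the statistic with a chi-square distribution is only approximate.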

  20. Complication 1: how to account for the infinite range of possible models without fitting and testing all of them? Solution: add two scaling parameters, optimized to accommodate a large range of specific models.

  21. Generalized Yule model: $\mathrm{Lik}(t) = \lambda n^{p} e^{-\lambda n^{p} t}$, where t is the waiting interval.
      Among species: p = 1, constant speciation rate, no extinction; p > 1, constant background extinction or recent burst of speciation; p < 1, slowdown model or incomplete sample of species.
      Within species: p = 2, neutral coalescent; p > 2, declining populations, recent selective sweep; p < 2, growing populations or balancing selection.
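A minimal sketch of fitting the generalized model: for a fixed scaling exponent p the rate λ has a closed-form estimate (number of intervals divided by the sum of n^p · t), so p can be profiled over a grid. The grid, the function names and the toy intervals are illustrative assumptions; the original implementation was written in R and may optimise differently.

```python
import math

def generalized_yule_log_likelihood(intervals, lam, p):
    """Sum of log(lam * n**p) - lam * n**p * t over (n, t) waiting intervals."""
    ll = 0.0
    for n, t in intervals:
        rate = lam * n ** p
        ll += math.log(rate) - rate * t
    return ll

def fit_generalized_yule(intervals, p_grid):
    """Profile likelihood over p; returns (lam_hat, p_hat, log-likelihood)."""
    best = None
    for p in p_grid:
        # For fixed p the MLE of lam is events / sum(n**p * t).
        lam = len(intervals) / sum(n ** p * t for n, t in intervals)
        ll = generalized_yule_log_likelihood(intervals, lam, p)
        if best is None or ll > best[2]:
            best = (lam, p, ll)
    return best

# Hypothetical intervals set to their expected length under p = 2, lam = 1.
toy = [(n, 1.0 / n ** 2) for n in range(2, 11)]
lam_hat, p_hat, ll_hat = fit_generalized_yule(toy, [0.5, 1.0, 1.5, 2.0, 2.5, 3.0])
```

With p fixed at 1 the function reduces to the plain Yule likelihood, and p near 2 corresponds to the neutral-coalescent case listed on the slide.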

  22. Complication 2: allow for a mixture of processes at different times, since the most recent speciation event could post-date the oldest within-species branch. Solution: likelihoods under a mixed model.

  23. Model: conclusions • General likelihood model for a set of within-species branching processes linked by between-species branching (written in the R statistical programming language) • Define or optimise species nodes • Estimate key parameters, e.g. changes through time • Hypothesis testing • Confidence intervals

  24. Examples of use Australian tiger beetles Ancient asexual rotifers, bdelloids Barcoding, e.g. plants

  25. Rivacindela tiger beetles on salt lakes Sampled 5 individuals per morphotype per salt lake

  26. mtDNA tree, 468 individuals, 47 ‘species’. Joan Pons, Jesus Gomez-Zurita, Anabela Cardoso, Daniel Duran, William Sumlin, Alfried Vogler

  27. Method and number of species
      1. Allele frequencies (Fst): 51
      2. Fixed differences (PAA): 46
      3. Monophyly (Wiens-Penkrot): 47

  28. Likelihood method: 48 species (+3/-1). Missed embedded species. Recovered single individuals.

  29. Assumes same population parameters for each species • Repeated, allowing them to vary across species with three categories of values: significantly better fit • Parameter values suggest: • Deficit of recent coalescent events across species: growing populations, past bottleneck • Surprisingly constant levels of variation across species: bottleneck again? Aridification • Speeding up of apparent speciation rate towards the present

  30. Current work: optimisation of species nodes without assuming a threshold. The model does not assume a threshold, but a threshold is the easiest way to optimise. Computationally intensive…

  31. Rotifers: significant fit to transition model; 282 clusters (C.I. 273-294), P << 0.0001

  32. Barcoding: could use approach to delimit species, e.g. marine bacteria, viruses, ericoid mycorrhiza. Probability of a sequence belonging to “species” X, or probability of not belonging to any existing species (repeat across bootstrap/Bayes trees). Global success of barcoding? Incomplete samples, low speciation v. N.

  33. How many ambiguous species? Clade of 100 species of annual plants. Average effective population sizes of N. Speciation rate of lambda per species per Myr. Tmrca = 1N => more recent speciation events are ambiguous w.r.t. plastid DNA. To have fewer than 5 ambiguous sister pairs: lambda < 0.05 Myr^-1 [N = 1 million]; lambda < 5 Myr^-1 [N = 10000].
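The figures on this slide can be reproduced with a back-of-the-envelope calculation, sketched below under stated assumptions: one generation per year (annual plants), an ambiguity window equal to Tmrca = 1N generations, and an expected number of speciation events inside that window of roughly (number of species) × lambda × (window length). This is a reconstruction of the arithmetic, not necessarily the derivation used in the talk, and the function name is made up.

```python
def max_speciation_rate(n_species, max_ambiguous, eff_pop_size, gen_time_years=1.0):
    """Speciation rate (per species per Myr) at which the expected number of
    ambiguous sister pairs equals max_ambiguous.

    Assumptions (not from the source): the ambiguity window is 1N generations,
    and events in the window are approximated by n_species * rate * window.
    """
    window_myr = eff_pop_size * gen_time_years / 1e6   # Tmrca expressed in Myr
    return max_ambiguous / (n_species * window_myr)

rate_large_n = max_speciation_rate(100, 5, 1_000_000)   # N = 1 million
rate_small_n = max_speciation_rate(100, 5, 10_000)      # N = 10000
```

Under these assumptions the two calls give 0.05 and 5 per Myr, matching the limits quoted on the slide: a hundredfold smaller population shortens the ambiguity window a hundredfold and so tolerates a hundredfold higher speciation rate.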

  34. Conclusions: can use barcode-type data to delimit species [limitations]. Can use framework to assess, predict, quantify errors for barcode approaches. Multiple unlinked markers, RI, morphology.

  35. Acknowledgements: Mark Chase, Robyn Cowan, Alfried Vogler, Sean Nee, Elisabeth Herniou. NERC, Royal Society, Sloan and Moore Foundations, CBOL.
