Quantifying uncertainty in species discovery with approximate Bayesian computation (ABC): single samples and recent radiations. Mike Hickerson, Chris Meyer, Craig Moritz (University of California, Berkeley; Museum of Vertebrate Zoology)
Outline: Introduction (species discovery); Potential problems (simulations); Potential problems (empirical data); Potential statistical solutions
Match new specimen’s DNA “barcode” to voucher specimens with barcodes in database
Proposed genetic thresholds for discovery (comparing the sample to its closest sister taxon in the reference database): 1. Hebert’s 10X rule: between-species divergence must be > 10 times the average within-species divergence. 2. Reciprocal monophyly.
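The 10X rule can be sketched in a few lines. This is a minimal illustration, assuming uncorrected p-distances on aligned sequences and a single query specimen, so that within-species divergence comes only from the reference species. The function names are hypothetical and are not taken from any barcoding tool.

```python
from itertools import combinations


def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def mean_within(reference_seqs):
    """Average pairwise divergence among reference-species sequences."""
    pairs = list(combinations(reference_seqs, 2))
    if not pairs:
        raise ValueError("need at least two reference sequences")
    return sum(p_distance(a, b) for a, b in pairs) / len(pairs)


def mean_between(query, reference_seqs):
    """Average divergence between the query and each reference sequence."""
    return sum(p_distance(query, s) for s in reference_seqs) / len(reference_seqs)


def passes_10x_rule(query, reference_seqs, factor=10.0):
    """True if between-species divergence exceeds factor x within-species divergence."""
    return mean_between(query, reference_seqs) > factor * mean_within(reference_seqs)
```

With only a few reference specimens the within-species average is itself a noisy estimate, which is one reason a fixed multiplier can over- or under-discover.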
Noisy Problem: species tree ≠ gene tree; usually a “near miss”. [Figure: species tree for Species A, Species B, and Species C, with 4 sampled individuals]
Doubly Noisy Problem: the genetic threshold (mtDNA barcode locus) may be not sensitive enough (under-discovery) or too sensitive (over-discovery), while the species delimitation criteria are themselves a moving target (mental construct?). Are the two equal?
Joint Simulation Exploration: DNA-barcode gene (mtDNA, CO1, 690 bp) under a coalescent model; simple BDM (Bateson-Dobzhansky-Muller) model of reproductive isolation. Problematic parameter space? Potential statistical solutions?
BDM Model (Bateson-Dobzhansky-Muller). [Figure: two-locus genotypes A,b; a,b; A,B; a,B, labelled OK or Bad] Neutral and divergent selection (Gavrilets 2004). Speciation events: Poisson process.
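If speciation events follow a Poisson process, the waiting time to the first event is exponentially distributed. The sketch below illustrates only that relationship. The rate value used in the usage note is a hypothetical placeholder, not a parameter from the talk, and the function names are invented for illustration.

```python
import math
import random


def prob_speciated_by(t, rate):
    """P(at least one speciation event by time t) under a Poisson process."""
    return 1.0 - math.exp(-rate * t)


def draw_speciation_time(rate, rng):
    """Waiting time (in generations) to the first event; exponential with mean 1/rate."""
    return rng.expovariate(rate)
```

For example, with an assumed rate of 1e-5 events per generation, `prob_speciated_by(1e5, 1e-5)` equals 1 - e^-1, and repeated calls to `draw_speciation_time(1e-5, random.Random(1))` average close to 100,000 generations.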
[Figure: BDM loci and barcode locus (mtDNA); divergence time (generations); island/continent (peripatric) scenario]
[Figure: reciprocal monophyly threshold] Hickerson et al. 2006 (in press; Systematic Biology)
[Figure: 10X threshold; region labelled “Not Species”] Coyne and Orr 1997
[Figure: region labelled “Not Species”] Coyne and Orr 1997
Coyne and Orr 1997; Presgraves 2002; Mendelson 2003; Bolnick and Near 2005; Zigler et al. 2005; Sasa et al. 1998
Move beyond “Yes/No” answers (Nielsen and Matz 2005): Bayesian posterior probabilities with ABC. Answers with quantified uncertainty; very fast (< 30 seconds per query); flexible (parameter threshold, model, and prior change according to taxonomic group). [Figure: axes Migration and Isolation time; annotation “= moderate support for new species”]
Prior, parameter threshold, and operative model are adjustable as appropriate for the particular taxonomic group, e.g. Mymarommatid wasps (10 rare living-fossil species) vs. African cichlids (recent radiation).
Ongoing Work: extension of the msBayes software pipeline; determining appropriate priors, thresholds, and models. Testing: simulated data (Yule model; stochastic speciation/extinction); empirical data (Chris Meyer; marine taxa).
Simulated data (Yule model): speciation and extinction follow a random birth/death process. [Figure: tree through time showing speciation and extinction events]
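A random birth/death process of this kind can be simulated event by event. The following is a minimal sketch that tracks only the number of lineages through time, not the tree itself. The per-lineage rates are hypothetical inputs, and this is not the simulator used for the talk.

```python
import random


def simulate_birth_death(birth, death, t_max, rng, n0=1):
    """Return [(time, n_lineages), ...] for a birth/death process up to t_max.

    birth and death are per-lineage event rates; death=0 gives a pure-birth
    (Yule) process. The simulation stops at t_max or at total extinction.
    """
    if birth < 0 or death < 0 or birth + death <= 0:
        raise ValueError("rates must be non-negative and not both zero")
    t, n = 0.0, n0
    history = [(t, n)]
    while n > 0:
        # Waiting time to the next event across all n lineages.
        t += rng.expovariate(n * (birth + death))
        if t > t_max:
            break
        # The event is a speciation with probability birth / (birth + death).
        if rng.random() < birth / (birth + death):
            n += 1
        else:
            n -= 1
        history.append((t, n))
    return history
```

Calling `simulate_birth_death(1.0, 0.0, 3.0, random.Random(0))` gives a pure-birth trajectory; adding a non-zero `death` rate lets lineages go extinct, which is what produces "orphan" species with no surviving close sister.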
Test: what % of sisters and orphans are detected as new species “discoveries”? Test data: 1. Closest divergence times (sisters and orphans). 2. Population sizes: Gamma distributed, 50K-2.5M. 3. Single specimens from “new” species; 3, 5, 10, 20, and 40 specimens from reference species. [Figure labels: Orphan; Sister-pair]
Yule model: 100 lineages per clade. Empirical data (cowries): 135 lineages. Is it a new species? A function of the posterior probability of divergence time and gene flow. [Figure labels: Discovery?; Reference Species]
msBayes software pipeline: simulate 1,000,000 draws from the model (flexible, pre-simulated prior); run ABC against the observed data, accepting 0.2%; obtain the posterior probability surface in roughly under 1 minute.
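The simulate-then-accept idea behind this pipeline can be shown with a toy rejection sampler. The sketch below is not msBayes code. It assumes a single summary statistic (the number of differences between one query and one reference sequence), uniform priors, and a hypothetical mutation rate. Only the 690 bp length, the 50K-2.5M population-size range, and the 0.2% acceptance fraction come from the slides.

```python
import math
import random

MU = 1e-8                   # assumed mutation rate per site per generation (hypothetical)
SEQ_LEN = 690               # barcode length in bp, from the slides
T_MAX = 2e6                 # assumed upper bound on divergence time, in generations
N_MIN, N_MAX = 5e4, 2.5e6   # population-size range from the slides (uniform here for simplicity)


def poisson(lam, rng):
    """Knuth's multiplication method; adequate for the modest means used here."""
    limit, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


def simulate_differences(T, N, rng):
    """Toy simulator: differences between one query and one reference sequence."""
    coal = rng.expovariate(1.0 / N)   # ancestral coalescence time, mean N generations
    return poisson(2.0 * MU * SEQ_LEN * (T + coal), rng)


def abc_rejection(observed, n_draws, accept_frac, rng):
    """Draw (T, N) from the prior, simulate, and keep the closest accept_frac of draws."""
    draws = []
    for _ in range(n_draws):
        T = rng.uniform(0.0, T_MAX)
        N = rng.uniform(N_MIN, N_MAX)
        stat = simulate_differences(T, N, rng)
        draws.append((abs(stat - observed), T, N))
    draws.sort(key=lambda d: d[0])
    n_keep = max(1, round(n_draws * accept_frac))
    return [(T, N) for _, T, N in draws[:n_keep]]
```

Because the prior draws do not depend on the observed data, they can be simulated once and reused for every query; only the sort-and-accept step has to be repeated, which is what makes a per-query answer fast.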
Approximate Bayesian Computation (ABC). [Figure: prior and posterior distributions]
Parameter threshold? [Figure: prior and posterior distributions]
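Given accepted ABC draws, a parameter threshold turns the posterior into a probability statement: the fraction of posterior samples beyond the threshold. A minimal sketch follows, assuming the samples are divergence times in generations. The function name and any threshold value are illustrative assumptions, and as the slides note, the threshold would vary by taxonomic group.

```python
def prob_above_threshold(posterior_samples, threshold):
    """Estimate P(parameter > threshold | data) from accepted ABC draws."""
    if not posterior_samples:
        raise ValueError("need at least one posterior sample")
    return sum(x > threshold for x in posterior_samples) / len(posterior_samples)
```

Applying the same function to draws from the prior gives the prior probability of exceeding the threshold, so the two values can be compared directly.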
M1 = yes, new species; M2 = no, same old species. $$\text{Bayes Factor} = \frac{f(M_1 \mid \text{Data}) \,/\, f(M_2 \mid \text{Data})}{\text{prior}(M_1) \,/\, \text{prior}(M_2)}$$ A way to compare the evidence for these 2 discrete models.
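The Bayes factor on this slide is the posterior odds of M1 against M2 divided by their prior odds. In an ABC setting, the posterior odds can be approximated by the ratio of accepted draws that came from each model. The sketch below assumes exactly that approximation; the function name and the default prior of 0.5 are illustrative.

```python
import math


def bayes_factor(n_accepted_m1, n_accepted_m2, prior_m1=0.5):
    """Posterior odds (from accepted-draw counts) divided by prior odds."""
    if not 0.0 < prior_m1 < 1.0:
        raise ValueError("prior_m1 must lie strictly between 0 and 1")
    if n_accepted_m2 == 0:
        return math.inf   # no accepted draws support M2
    posterior_odds = n_accepted_m1 / n_accepted_m2
    prior_odds = prior_m1 / (1.0 - prior_m1)
    return posterior_odds / prior_odds
```

With equal priors the Bayes factor is simply the ratio of accepted counts; an unequal prior rescales it so that only the evidence contributed by the data remains.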
Very Near Future: 1. Better priors: species divergence time AND intra-species coalescence. 2. Incorporate migration. Hierarchical model: new-species status is a hyper-parameter (Yes/No) with its own hyper-prior; Yes uses Prior(T, N), No uses Prior(N, T=0).
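One way to read the hierarchical model is as a two-stage prior draw: first draw the new-species status from the hyper-prior, then draw (T, N) from the prior that status selects. The sketch below is an assumption-laden illustration. The uniform priors, their bounds, and the hyper-prior probability `p_new` are hypothetical; only the Yes/No structure and the T = 0 constraint come from the slide.

```python
import random


def draw_from_hierarchical_prior(rng, p_new=0.5, t_max=2e6, n_min=5e4, n_max=2.5e6):
    """Return (is_new, T, N) for one draw from the two-stage prior."""
    is_new = rng.random() < p_new              # hyper-parameter: new-species status
    N = rng.uniform(n_min, n_max)              # population size is drawn in both cases
    T = rng.uniform(0.0, t_max) if is_new else 0.0   # "No" fixes divergence time at 0
    return is_new, T, N
```

After ABC, the fraction of accepted draws with `is_new` set to True would estimate the posterior probability that the query is a new species.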
ACKNOWLEDGEMENTS. Discussion: C. Moritz, C. Meyer, T. Mendelson, K. Zigler, N. Rosenberg, J. Degnan. Coauthors: C. Meyer, C. Moritz. CPU resources: J. McGuire, Museum of Vertebrate Zoology. Funding: NSF, DIMACS.