Tracking the genetic legacy of past human populations through the grid

Tracking the genetic legacy of past human populations through the grid. Nicolas Ray, University of Bern / CMPG (University of Geneva & UNEP/GRID-Europe). ECSAC09, Veli Lošinj, August 26th 2009. Adapted from Cavalli-Sforza & Feldman, 2003. Human migrations. [12,000]. [55,000].

Presentation Transcript


  1. Tracking the genetic legacy of past human populations through the grid. Nicolas Ray, University of Bern / CMPG (University of Geneva & UNEP/GRID-Europe). ECSAC09, Veli Lošinj, August 26th 2009

  2. Human migrations of Homo sapiens sapiens (map adapted from Cavalli-Sforza & Feldman, 2003; map labels: [12,000], [55,000])

  3. Why aim at a good demographic model?

  4. 1. Better understand human evolution
        • Origin of modern humans (when, where, how many?)
        • Relationship with other members of the Homo genus
     2. Distinguish between the effects of demography and those of selection (biomedical applications)

  5. Observed patterns of genetic diversity in contemporary populations
     • A complex past demography: fluctuation in effective pop. size, substructure, migrations
     • Gene-specific factors: mutations, recombination, selection

  6. Semi-spatial approach

  7. Statistical Evaluation of Alternative Models of Human Evolution. Nelson Fagundes, Nicolas Ray, Mark Beaumont, Samuel Neuenschwander, Francisco Salzano, Sandro Bonatto, and Laurent Excoffier. 2007. PNAS, 104: 17614-17619
     • 50 loci in non-genic regions (Chen and Li, 2001)
     • About 500 bp each, 24,425 bp in total
     • 30 individuals: 10 Africans, 8 Asians, 12 Amerindians
     • Chimpanzee sequenced to estimate mutation rates, assuming a 6 My divergence time

  8. Models (figure: schematic population trees through time for AF, AS and AM). Model labels: ASIG, ASEG, AFRIG, AFREG, MREBIG, MREBEG, MRE1S, MRE2S. Model families: African replacement, Assimilation, Multiregional evolution

  9. Model parameters and priors

  10. Simulations: coalescence theory (diagram labels: Asia, Africa, Americas, time)
      • A retrospective model of population genetics
      • Traces all copies of a gene in a sample from a population to a single ancestral copy shared by all members, the most recent common ancestor (MRCA)
      • Assumes no recombination, no selection
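The coalescent idea on this slide can be illustrated with a minimal sketch. It is not the simulator used in the talk: it assumes a single, constant-size, haploid population (the names `simulate_coalescent_times` and `expected_tmrca` are illustrative), with no recombination and no selection, and only draws the waiting times between successive coalescence events back to the MRCA.

```python
import random

def simulate_coalescent_times(n_samples, pop_size, rng):
    """Waiting times (in generations) between successive coalescence events.

    Assumption: one constant-size haploid population of pop_size gene copies,
    no recombination, no selection (standard Kingman approximation).
    """
    times = []
    k = n_samples
    while k > 1:
        # With k lineages there are k*(k-1)/2 pairs, each coalescing at
        # rate 1/pop_size per generation.
        rate = k * (k - 1) / 2 / pop_size
        times.append(rng.expovariate(rate))
        k -= 1
    return times

def expected_tmrca(n_samples, pop_size):
    """Expected time to the MRCA under the same assumptions: 2N(1 - 1/n)."""
    return 2 * pop_size * (1 - 1 / n_samples)

if __name__ == "__main__":
    rng = random.Random(2009)
    waits = simulate_coalescent_times(30, 10_000, rng)
    print("simulated TMRCA:", round(sum(waits)), "generations")
    print("expected TMRCA: ", round(expected_tmrca(30, 10_000)), "generations")
```

A single replicate varies widely around the expectation, which is one reason many simulated genealogies are needed per parameter set.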

  11. Summary statistics
      • Within population: S, π
      • Between populations: pairwise FST, global FST
      • Globally: S, π
      Diagram labels: mutation model, mutation, simulated genealogy; sequences ACCTAGTACAATCGGTAATGCCATTGGT, TCCTTGTA…ATTGGT, ACCGAGTA…GTTGGT
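As a small illustration of how such summary statistics can be computed from aligned sequences, here is a sketch. It assumes that S is the number of segregating sites and that the diversity statistic is the mean number of pairwise differences. The FST function uses one common estimator (1 minus within-population over between-population diversity), which is not necessarily the one used in the talk. The toy sequences are made up for the example.

```python
from itertools import combinations

def count_differences(a, b):
    """Number of positions at which two aligned sequences differ."""
    return sum(x != y for x, y in zip(a, b))

def segregating_sites(seqs):
    """S: number of alignment columns with more than one allele."""
    return sum(1 for column in zip(*seqs) if len(set(column)) > 1)

def nucleotide_diversity(seqs):
    """Mean number of pairwise differences among the sequences."""
    pairs = list(combinations(seqs, 2))
    return sum(count_differences(a, b) for a, b in pairs) / len(pairs)

def pairwise_fst(pop1, pop2):
    """FST estimated as 1 - (mean within-pop diversity) / (between-pop diversity)."""
    within = (nucleotide_diversity(pop1) + nucleotide_diversity(pop2)) / 2
    between = sum(count_differences(a, b) for a in pop1 for b in pop2) / (len(pop1) * len(pop2))
    return 1 - within / between if between > 0 else 0.0

if __name__ == "__main__":
    pop1 = ["ACCTAGTA", "TCCTTGTA"]   # toy data, not from the study
    pop2 = ["ACCGAGTA", "ACCGAGTA"]
    print(segregating_sites(pop1 + pop2))   # global S
    print(nucleotide_diversity(pop1))       # within-population diversity
    print(pairwise_fst(pop1, pop2))         # between-population differentiation
```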

  12. Approximate Bayesian Computations (ABC)
      The rejection-sampling approach:
      • Calculate summary statistics (S) for the observed data set
      • Draw parameter values φ’ from prior distributions, and use them to simulate data
      • Calculate summary statistics (S’) on the simulated data set and compare them to the observations: δ = ||S - S’|| (Euclidean distance)
      • Accept φ’ if δ is arbitrarily small, otherwise reject the sample
      The ABC approach (Beaumont et al. 2002):
      • Modification: a local regression is added within the set of accepted φ’ values
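The rejection-sampling steps listed on this slide can be sketched as follows. This is a toy, not the study's pipeline: `simulate_summary` is a hypothetical stand-in for the demographic and genetic simulator (it draws Gaussian data with mean φ’), and the uniform prior and tolerance are arbitrary assumptions. The local-regression adjustment of Beaumont et al. (2002) is not included.

```python
import math
import random

def simulate_summary(phi, rng, n_obs=50):
    """Toy simulator (assumption): mean and sd of n_obs Gaussian draws with mean phi."""
    data = [rng.gauss(phi, 1.0) for _ in range(n_obs)]
    mean = sum(data) / n_obs
    sd = math.sqrt(sum((x - mean) ** 2 for x in data) / (n_obs - 1))
    return (mean, sd)

def abc_rejection(s_obs, n_draws, tolerance, rng, prior=(0.0, 10.0)):
    """Keep the prior draws whose simulated summaries fall within `tolerance` of s_obs."""
    accepted = []
    for _ in range(n_draws):
        phi = rng.uniform(*prior)              # draw phi' from the prior
        s_sim = simulate_summary(phi, rng)     # simulate data and compute S'
        delta = math.dist(s_obs, s_sim)        # Euclidean distance ||S - S'||
        if delta <= tolerance:                 # accept if close enough, else reject
            accepted.append(phi)
    return accepted

if __name__ == "__main__":
    rng = random.Random(42)
    posterior_sample = abc_rejection((4.0, 1.0), 5000, 0.3, rng)
    print(len(posterior_sample), "accepted draws")
    print("posterior mean estimate:", sum(posterior_sample) / len(posterior_sample))
```

Most draws are rejected, so the number of simulations needed grows quickly as the tolerance shrinks.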

  13. Neuenschwander (2006)

  14. Computational issues Computer clusters 1-10 mio. UBELIX (>500 nodes) Zooblythii (~40 nodes)

  15. For ABC, 5 million demographic simulations are necessary to obtain robust parameter estimates. Each demographic simulation is followed by n genetic simulations (n = number of loci). Example: 8 simple models, 50 loci, 30 individuals: 2 CPU-years

  16. Relative probabilities of models of human evolution (figure). Model families: African replacement 0.781, Assimilation 0.001, Multiregional evolution 0.218. Values shown beside the models ASIG, ASEG, AFRIG, AFREG (in slide order): 0.091, 0.909, 0.042, 0.958. Values shown beside the models MREBIG, MREBEG, MRE1S, MRE2S (in slide order): 0.422, 0.048, 0.069, 0.461

  17. AFREG model (AF, AS, AM)
      • Out-of-Africa time: 51.1 Kya (40.1 – 70.9)
      • Americas colonization time: 10.3 Kya (7.6 – 15.9)
      • Speciation time: 142 Kya (104 – 186)

  18. Fully spatial approach

  19. A complex demography (map adapted from Cavalli-Sforza & Feldman, 2003; map labels: [10,000], [55,000])
      • demographic and spatial expansions
      • population bottlenecks
      • secondary contacts
      • population isolation
      • fast migration events

  20. From environment to demography: carrying capacity (low to high). Spatial resolution: 100 km

  21. From environment to demography: friction (low to high)

  22. Demographic simulations: stepping-stone model (cellular automata). Diagram labels: cell or deme, pop. size, time
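A stepping-stone demographic simulation on a grid of demes can be sketched as a simple cellular automaton. The details below are assumptions made for illustration and are not taken from the slides: logistic growth with rate `r` towards each deme's carrying capacity, followed by emigration of a fraction `m` split equally among the four nearest neighbours. Friction is not modelled.

```python
def step(grid, capacity, r=0.5, m=0.2):
    """One generation: logistic growth in each deme, then nearest-neighbour migration."""
    rows, cols = len(grid), len(grid[0])
    # Growth phase: each deme grows logistically towards its own carrying capacity.
    grown = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            n, k = grid[i][j], capacity[i][j]
            if k > 0:  # demes with zero capacity cannot sustain a population
                grown[i][j] = max(0.0, n + r * n * (1 - n / k))
    # Migration phase: a fraction m leaves each deme, split among in-grid neighbours.
    new = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            nbrs = [(i + di, j + dj)
                    for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1))
                    if 0 <= i + di < rows and 0 <= j + dj < cols]
            emigrants = m * grown[i][j] if nbrs else 0.0
            new[i][j] += grown[i][j] - emigrants
            for a, b in nbrs:
                new[a][b] += emigrants / len(nbrs)
    return new

def simulate(grid, capacity, generations, r=0.5, m=0.2):
    """Iterate step() and return the final grid of deme sizes."""
    for _ in range(generations):
        grid = step(grid, capacity, r, m)
    return grid

if __name__ == "__main__":
    capacity = [[100.0] * 5 for _ in range(5)]
    start = [[0.0] * 5 for _ in range(5)]
    start[2][2] = 10.0   # a single founding deme in the centre
    for row in simulate(start, capacity, 20):
        print(" ".join(f"{n:6.1f}" for n in row))
```

Replacing the uniform `capacity` grid with values derived from an environmental map is what ties the demography to the landscape in this kind of model.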

  23. SPLATCHE: SPatiaL And Temporal Coalescences in Heterogeneous Environment (http://cmpg.unibe.ch/software/splatche)

  24. Vegetation maps. Present potential vegetation: expert system, taking into account altitudes. Vegetation at the Last Glacial Maximum. Ray and Adams. 2001. Internet Archaeology 11

  25. Demography and spatial expansion: population density

  26. Dynamic vegetation: intermediate, PP, LGM

  27. Genetic simulations

  28. Computational issues
      • A fully spatially-explicit model using 500 loci in 800 individuals: ≈ 10 CPU-years
      • Adding long-distance dispersal: ≈ 20 CPU-years

  29. SPLATCHE on the grid
      • early 2005: joined the Biomed VO of the EGEE project
      • mid 2005: tested on GILDA test bed, and deployed on the Grid
      • since late 2005: testing and improvement
      • since mid 2006: production mode and optimization

  30. Use of SPLATCHE on the grid (diagram labels: N simulations, GRID, statistical tools, posterior distribution of demographic/genetic parameters of interest)
      • Independent simulations:
      • the more CPUs, the better
      • job failures are not that bad
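Because the simulations are independent, a failed job only costs one result. The sketch below illustrates that pattern on a local thread pool. It does not use any grid middleware: `run_simulation` is a hypothetical placeholder job with an artificial failure rate, and the worker count is arbitrary.

```python
import random
from concurrent.futures import ThreadPoolExecutor, as_completed

def run_simulation(job_id, failure_rate=0.1):
    """Hypothetical stand-in for one independent simulation job."""
    rng = random.Random(job_id)   # per-job seed: jobs share no state
    if rng.random() < failure_rate:
        raise RuntimeError(f"job {job_id} failed")
    return job_id, rng.uniform(0.0, 1.0)   # placeholder result

def run_batch(n_jobs, max_workers=8):
    """Run n_jobs independent jobs; keep the successes and count the failures."""
    results, failed = {}, 0
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(run_simulation, i) for i in range(n_jobs)]
        for future in as_completed(futures):
            try:
                job_id, value = future.result()
                results[job_id] = value
            except RuntimeError:
                failed += 1   # a lost job only shrinks the sample slightly
    return results, failed

if __name__ == "__main__":
    results, failed = run_batch(200)
    print(len(results), "results collected,", failed, "jobs failed")
```

The surviving results can be passed to the statistical step as they are, or the failed job identifiers can simply be resubmitted.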

  31. Optimizations (diagram labels: GRID, 5 million simulations)
      • Submission time: multi-threaded application using up to 30 RBs (used for the WISDOM project)
      • Fetching time of job outputs: in-house multi-threaded solution for checking status and getting outputs
      • Reduction of the number of simulations (Daniel Wegmann): by MCMC. Promising results (~10 times fewer simulations)

  32. Geographic origin of human dispersal Ray et al. (2005) Genome Research

  33. Mutations surfing during a range expansion

  34. Mutations surfing during a range expansion
      • Some mutations can travel with the wave of advance
      • New mutations can reach high frequencies
      • More pronounced in small populations
      Klopfstein, Currat and Excoffier (2006) MBE 23(3): 482-490

  35. Selection? Currat, Excoffier, Maddison, Otto, Ray, Whitlock and Yeaman (2006) Science 313:172a. (2005) Science 509 (5741)

  36. Interactions among populations: interaction between modern humans and Neanderthals in Europe. Currat & Excoffier (2004), PLoS Biol.

  37. Cane toad invasion in Australia. Estoup, A., Baird, S. J. E., Ray, N., Currat, M., Cornuet, J.-M., Santos, F., Beaumont, M. A. and L. Excoffier. Combining genetic, historical and geographic data to reconstruct the dynamics of the bioinvasion of cane toad Bufo marinus. In prep.

  38. Take-home message

  39. Thank you!
