Tracking the genetic legacy of past human populations through the grid. Nicolas Ray, University of Geneva & UNEP/GRID-Europe. Swiss Grid Day, Bern, November 26th 2009
Human migrations (map adapted from Cavalli-Sforza & Feldman, 2003). Map labels: [12,000], [55,000], Homo sapiens sapiens
Why aim for a good demographic model?
• 1. Better understand human evolution
  • Origin of modern humans (when, where, how many?)
  • Relationship with other members of the Homo genus
• 2. Distinguish the effects of demography from those of selection (biomedical applications)
Observed patterns of genetic diversity in contemporary populations are shaped by:
• A complex past demography: fluctuations in effective population size, substructure, migrations
• Gene-specific factors: mutation, recombination, selection
A complex demography (map adapted from Cavalli-Sforza & Feldman, 2003). Map labels: [10,000], [55,000]
• demographic and spatial expansions
• population bottlenecks
• secondary contacts
• population isolation
• fast migration events
SPLATCHE: SPatiaL And Temporal Coalescences in Heterogeneous Environment (http://cmpg.unibe.ch/software/splatche)
From environment to demography: carrying capacity (map legend: low to high). Spatial resolution: 100 km
From environment to demography: friction (map legend: low to high)
Demographic simulations: stepping-stone model (cellular automata). Figure labels: cell or deme; population size plotted against time
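The idea behind the three slides above (carrying capacity, friction, stepping-stone demes) can be illustrated with a small sketch. This is not SPLATCHE's code: the logistic growth rule, the inverse-friction weighting of migrants, and the names `logistic_growth`, `step`, `r` and `m` are all assumptions chosen for illustration.

```python
def logistic_growth(n, r, k):
    """One generation of logistic growth toward carrying capacity k (assumed rule)."""
    if k <= 0:
        return 0.0  # uninhabitable cell
    return max(0.0, n + r * n * (k - n) / k)


def step(density, capacity, friction, r=0.5, m=0.2):
    """Advance every deme by one generation: growth, then migration.

    density, capacity, friction are 2D lists of the same shape.
    A fraction m of each deme emigrates to its (up to) 4 nearest
    neighbours, split in proportion to 1/friction of the target cell
    (hypothetical weighting; friction values must be > 0).
    """
    rows, cols = len(density), len(density[0])
    new = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            grown = logistic_growth(density[i][j], r, capacity[i][j])
            neighbours = [
                (i + di, j + dj)
                for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1))
                if 0 <= i + di < rows and 0 <= j + dj < cols
            ]
            weights = [1.0 / friction[a][b] for a, b in neighbours]
            total = sum(weights)
            emigrants = m * grown if total > 0 else 0.0
            new[i][j] += grown - emigrants
            for (a, b), w in zip(neighbours, weights):
                new[a][b] += emigrants * w / total
    return new


# Toy 3x3 world: uniform capacity and friction, one founding deme in the centre.
capacity = [[1000.0] * 3 for _ in range(3)]
friction = [[1.0] * 3 for _ in range(3)]
density = [[0.0] * 3 for _ in range(3)]
density[1][1] = 100.0
for _ in range(10):
    density = step(density, capacity, friction)
```

Repeating `step` spreads the population outward from the founding deme, which is the spatial expansion the next slide refers to.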
Demography and spatial expansion Population density
Summary statistics
• Within population: S, π
• Between populations: pairwise FST, global FST
• Globally: S, π
Figure labels: mutation model; mutations on the simulated genealogy; example sequences ACCTAGTACAATCGGTAATGCCATTGGT, TCCTTGTA…ATTGGT, ACCGAGTA…GTTGGT
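As a rough illustration of the statistics listed above, the sketch below computes S (number of segregating sites), π (mean number of pairwise differences) and a pairwise FST from aligned sequences. The FST formula used here is one common sequence-based estimator, 1 - (mean within-population π) / (mean between-population differences); the slides do not say which estimator was actually used, and the toy sequences are invented.

```python
from itertools import combinations


def segregating_sites(seqs):
    """S: number of alignment columns with more than one allele."""
    return sum(1 for column in zip(*seqs) if len(set(column)) > 1)


def pairwise_differences(a, b):
    """Number of positions at which two aligned sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def nucleotide_diversity(seqs):
    """pi: mean number of pairwise differences within a sample."""
    pairs = list(combinations(seqs, 2))
    if not pairs:
        return 0.0
    return sum(pairwise_differences(a, b) for a, b in pairs) / len(pairs)


def pairwise_fst(pop1, pop2):
    """FST as 1 - pi_within / pi_between (assumed estimator, see lead-in)."""
    pi_within = (nucleotide_diversity(pop1) + nucleotide_diversity(pop2)) / 2
    between = [pairwise_differences(a, b) for a in pop1 for b in pop2]
    pi_between = sum(between) / len(between)
    return 0.0 if pi_between == 0 else 1.0 - pi_within / pi_between


# Invented toy samples from two populations (aligned, equal length).
pop_a = ["AAAA", "AAAT", "AATT"]
pop_b = ["CAAA", "CAAA", "CAAT"]
print(segregating_sites(pop_a), nucleotide_diversity(pop_a))
print(pairwise_fst(pop_a, pop_b))
```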
Approximate Bayesian Computations (ABC)
• Computational issues: 1-10 million simulations
• Computer clusters: UBELIX (>500 nodes), Zooblythii (~40 nodes)
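To show why ABC needs so many simulations, here is a minimal rejection-ABC sketch: draw parameters from the prior, simulate summary statistics, and keep only the draws whose statistics are closest to the observed ones. The Gaussian `toy_simulate` is a hypothetical stand-in for a full spatial simulation, and `abc_rejection` and its arguments are names invented for this example.

```python
import random


def toy_simulate(theta, rng, n=50):
    """Hypothetical simulator: the summary statistic is the mean of n noisy draws."""
    return sum(rng.gauss(theta, 1.0) for _ in range(n)) / n


def abc_rejection(observed, simulate, prior_sample, distance, n_sims, n_keep, rng):
    """Keep the n_keep prior draws whose simulated statistic is closest to observed."""
    draws = []
    for _ in range(n_sims):
        theta = prior_sample(rng)          # 1. draw a parameter from the prior
        stat = simulate(theta, rng)        # 2. simulate data, reduce to a statistic
        draws.append((distance(stat, observed), theta))
    draws.sort(key=lambda pair: pair[0])   # 3. rank by distance to the observation
    return [theta for _, theta in draws[:n_keep]]


rng = random.Random(42)
posterior = abc_rejection(
    observed=3.0,
    simulate=toy_simulate,
    prior_sample=lambda r: r.uniform(0.0, 10.0),
    distance=lambda a, b: abs(a - b),
    n_sims=2000,
    n_keep=100,
    rng=rng,
)
print(sum(posterior) / len(posterior))  # approximate posterior mean
```

Only a small fraction of the simulations is retained, so the total number of runs grows quickly with the number of parameters and statistics.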
Computational issues
• A fully spatially explicit model using 500 loci in 800 individuals: 10 CPU-years
• Adding long-distance dispersal: 20 CPU-years
SPLATCHE on the grid
• early 2005: joined the Biomed VO of the EGEE project
• mid 2005: tested on the GILDA test bed, and deployed on the Grid
• since mid 2006: production mode and optimization
Use of SPLATCHE on the grid
Diagram labels: N simulations; GRID; statistical tools; posterior distribution of demographic/genetic parameters of interest
• Independent simulations:
  • the more CPUs, the better
  • job failures are not that bad
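The two bullet points above can be sketched locally: because every simulation is independent, jobs can be farmed out to any number of workers, and a lost job only shrinks the sample. The thread pool below is a stand-in for grid submission, and `run_simulation`, `run_batch` and the artificial failure rule are assumptions made for this example.

```python
import random
from concurrent.futures import ThreadPoolExecutor, as_completed


def run_simulation(job_id):
    """Hypothetical independent job: draw a parameter, return a simulated statistic."""
    if job_id % 7 == 0:
        # Artificial failure rule, standing in for a job lost on the grid.
        raise RuntimeError(f"job {job_id} lost")
    rng = random.Random(job_id)  # one independent random stream per job
    theta = rng.uniform(0.0, 10.0)
    return job_id, theta, rng.gauss(theta, 1.0)


def run_batch(job_ids, max_workers=8):
    """Run all jobs concurrently, keep the successes, record the failures."""
    results, failed = [], []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(run_simulation, j): j for j in job_ids}
        for future in as_completed(futures):
            try:
                results.append(future.result())
            except RuntimeError:
                failed.append(futures[future])  # no retry needed, just fewer draws
    return results, failed


results, failed = run_batch(range(1, 101))
print(len(results), "simulations kept,", len(failed), "jobs lost")
```

Dropping failed jobs is harmless here only under the assumption that failures are unrelated to the parameter values being simulated; otherwise the retained sample would be biased.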
GRID optimizations (5 million simulations)
• Submission time: multi-threaded application using up to 30 RBs (used for the WISDOM project)
• Fetching time of job outputs: in-house multi-threaded solution for checking status and getting outputs
• Reduction of the number of simulations (Daniel Wegmann): by MCMC. Promising results (~50 times fewer simulations)
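The slides do not describe the MCMC scheme itself. As a rough illustration of how a Markov chain can cut the number of simulations, the sketch below follows the generic likelihood-free MCMC idea: propose a parameter near the current one, simulate, and move only if the simulated statistic falls within a tolerance of the observation. The uniform prior, symmetric Gaussian proposal, `toy_simulate` and all parameter names are assumptions, not the author's method.

```python
import random


def toy_simulate(theta, rng, n=50):
    """Hypothetical simulator: the summary statistic is the mean of n noisy draws."""
    return sum(rng.gauss(theta, 1.0) for _ in range(n)) / n


def abc_mcmc(observed, simulate, start, n_steps, epsilon, step_sd, lower, upper, rng):
    """Likelihood-free MCMC with a uniform prior on [lower, upper].

    With a uniform prior and a symmetric proposal, a move is accepted
    whenever the proposal lies inside the prior support and its
    simulated statistic is within epsilon of the observation.
    """
    theta = start
    chain = [theta]
    for _ in range(n_steps):
        proposal = theta + rng.gauss(0.0, step_sd)
        if lower <= proposal <= upper:
            if abs(simulate(proposal, rng) - observed) <= epsilon:
                theta = proposal
        chain.append(theta)  # rejected proposals repeat the current value
    return chain


rng = random.Random(7)
chain = abc_mcmc(
    observed=3.0, simulate=toy_simulate, start=3.0,
    n_steps=2000, epsilon=0.2, step_sd=0.5,
    lower=0.0, upper=10.0, rng=rng,
)
print(sum(chain) / len(chain))  # approximate posterior mean
```

Because proposals stay close to parameter values that already fit the data, far fewer simulations are spent in regions of the prior that would be rejected anyway.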
Geographic origin of human dispersal Ray et al. (2005) Genome Research
Interactions among populations: interaction between modern humans and Neanderthals in Europe. Currat & Excoffier (2004), PLoS Biol.
Cane toad invasion in Australia Estoup, A., Baird, S. J. E., Ray, N., Currat, M., Cornuet, J.-M., Santos, F., Beaumont, M. A. and L. Excoffier. Combining genetic, historical and geographic data to reconstruct the dynamics of the bioinvasion of cane toad Bufo marinus. Submitted