Molecular population genetics of adaptation from recurrent beneficial mutation. Joachim Hermisson and Pleuni Pennings, LMU Munich. How can genetic variation be maintained in a population in the face of positive selection? Selective sweep with recombination.
Molecular population genetics of adaptation from recurrent beneficial mutation Joachim Hermisson and Pleuni Pennings, LMU Munich
How can genetic variation be maintained in a population in the face of positive selection?
Recurrent mutation. Classical view: • Adaptive substitutions occur from a single mutational origin. • What happens if the same beneficial allele occurs recurrently in a population?
[Figure: Soft sweep from recurrent mutation. Frequency (0 to 1) versus time.]
Is recurrent mutation relevant? • What is the probability of a soft sweep under recurrent mutation? • What is the impact on patterns of neutral polymorphism?
Model • Haploid population of constant size Ne • At the selected locus: recurrent mutation at rate u to a beneficial allele (or a class of equivalent alleles) with selective advantage s • Scaled values: θ = 2Ne u, α = 2Ne s, R = 2Ne r • Generation update: Wright-Fisher model (fitness-weighted multinomial sampling)
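The model above can be sketched as a short forward simulation. This is a minimal illustration, not the authors' simulation code: the function name, the default parameter values, and the order of mutation before selection within a generation are assumptions. With only two allelic classes, the fitness-weighted multinomial sampling reduces to a binomial draw.

```python
import random

def wright_fisher_sweep(n_e=1000, s=0.1, u=1e-4, max_gens=100_000, seed=1):
    """Simulate the beneficial-allele frequency in a haploid Wright-Fisher model.

    Hypothetical helper for illustration. Returns the frequency path
    x_0, x_1, ... up to fixation (or until max_gens generations have passed).
    """
    rng = random.Random(seed)
    x = 0.0
    path = [x]
    for _ in range(max_gens):
        # Recurrent mutation: each ancestral individual mutates with probability u.
        x_mut = x + u * (1.0 - x)
        # Fitness-weighted sampling probability (beneficial fitness 1 + s).
        p = x_mut * (1.0 + s) / (1.0 + s * x_mut)
        # Binomial draw of the next generation, done as n_e Bernoulli trials.
        count = sum(rng.random() < p for _ in range(n_e))
        x = count / n_e
        path.append(x)
        if count == n_e:
            break
    return path
```

With the defaults, θ = 2Ne u = 0.2 and α = 2Ne s = 200. Calling `wright_fisher_sweep()` returns one stochastic path xt from 0 to fixation.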
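Coalescent view: Genealogy of a sample from a linked locus • What can happen one generation back in time? [Figure: n lines traced from generation t + 1 back to generation t; labels xt and 1 - xt]
Coalescent view: Coalescence of two lines • Rate per generation: [equation] [Figure: generations t + 1 and t; labels xt and 1 - xt]
Coalescent view: Recombination • Rate per generation: [equation] [Figure: generations t + 1 and t; labels xt and 1 - xt]
Coalescent view: New mutation at selected site • Rate per generation: [equation] [Figure: generations t + 1 and t; labels xt and 1 - xt]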
Coalescent view. Problem: the rates for • coalescence • recombination • beneficial mutation depend on the frequency x of the selected allele, which follows a stochastic path
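For orientation, the three per-generation rates can be written out as follows. The rate equations are not preserved in this transcript, so these are the standard leading-order forms for lines in the beneficial background of a haploid Wright-Fisher population, stated here as an assumption rather than as the slides' own expressions. Here xt is the frequency of the beneficial allele and θ, R are the scaled values from the model slide.

```latex
\begin{align}
\Pr(\text{coalescence of a given pair of lines})
  &\approx \frac{1}{N_e\,x_t}, \\
\Pr(\text{recombination of a given line onto the ancestral background})
  &\approx r\,(1 - x_t) = \frac{R\,(1 - x_t)}{2N_e}, \\
\Pr(\text{new beneficial mutation on a given line})
  &\approx \frac{u\,(1 - x_t)}{x_t} = \frac{\theta\,(1 - x_t)}{2N_e\,x_t}.
\end{align}
```

For n lines, the coalescence rate is multiplied by n(n - 1)/2 and the other two rates by n. All three depend on xt, which is the problem stated above.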
Coalescent view: Classic case: coalescence and recombination • Probability for multiple haplotypes in a sample after a sweep due to recombination: [equation] (Higher orders: Etheridge, Pfaffelhuber, Wakolbinger) • Small for large α (strong selection makes broad sweep patterns)
Coalescent view: Coalescence and mutation, sample of size 2 • Probability for coalescence before mutation (single haplotype): [equation]
Coalescent view: Coalescence and mutation, sample of size 2 • Probability for single or multiple haplotypes: [equation] • T1: average time to the first coalescence or mutation event
Coalescent view: Coalescence and mutation, sample of size 2 • Sampling at time of fixation: 0 < T1 < Tfix
Coalescent view: Coalescence and mutation, sample of size 2 • General: sampling Tobs generations after fixation: [equation] • The extra factor can be ignored for Tobs << Ne
Coalescent view: Coalescence and mutation, sample of size 2 • Sampling at time of fixation: 0 < T1 < Tfix • Tfix / Ne ≈ 4 log(α) / α, where α = 2Ne s (scaled selection strength)
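To get a feel for the magnitude of Tfix / Ne, the approximation above can be evaluated for a few selection strengths. This is a small sketch; the function name is hypothetical and "log" is assumed to be the natural logarithm.

```python
import math

def scaled_fixation_time(alpha):
    """Return the approximation T_fix / N_e = 4 ln(alpha) / alpha.

    Assumes the natural logarithm; alpha = 2 N_e s is the scaled selection strength.
    """
    return 4.0 * math.log(alpha) / alpha

# Tabulate the approximation for a range of selection strengths.
for alpha in (100, 500, 1000, 10000):
    print(f"alpha = {alpha:>6}: T_fix / N_e = {scaled_fixation_time(alpha):.4f}")
```

For α = 500 the value is about 0.05, so the fixation time is already a small fraction of Ne generations.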
Coalescent view: Coalescence and mutation, sample of size 2 • Simulation results (θ = 0.4) [Figure]
Coalescent view: Coalescence and mutation, sample of size 2 • For α > 500: Tfix / Ne << 1, thus [equation] • Corresponds to the approximation: [equation]
Coalescent view: Coalescence and mutation, sample of size n • Continuous time and time rescaling: neutral coalescent!
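The time rescaling can be sketched as follows. The derivation is not preserved in this transcript, so this is a reconstruction under stated assumptions: a pair of lines in the beneficial background coalesces at rate 1/(Ne xt) per generation, and each line traces back to a new mutation at rate θ(1 - xt)/(2 Ne xt). Measuring time in units of τ removes xt from the coalescence rate, and dropping the factor (1 - xt) removes it from the mutation rate as well.

```latex
\begin{align}
d\tau &= \frac{dt}{N_e\,x_t}
  && \text{(rescaled time)} \\
\text{coalescence rate for } n \text{ lines} &= \binom{n}{2}
  && \text{(per unit of } \tau \text{)} \\
\text{mutation rate for } n \text{ lines} &= \frac{n\,\theta}{2}\,(1 - x_t) \approx \frac{n\,\theta}{2}
  && \text{(per unit of } \tau \text{)} \\
\Pr(\text{next event is a coalescence}) &\approx \frac{n - 1}{n - 1 + \theta}
\end{align}
```

These are the rates of the neutral coalescent with mutation at rate θ/2 per line.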
Coalescent view: Coalescence and mutation, sample of size n • Problem is independent of the path xt and all selection parameters • Coalescent of the infinite alleles model • Forward in time: “Hoppe urn” or Yule process with immigration • The sampling distribution of ancestral haplotypes can be approximated by the distribution of family sizes in a Hoppe urn or a Yule process with immigration • Solved problem
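The Hoppe urn mentioned above is easy to simulate directly. The sketch below is illustrative only: the function names are hypothetical, and it uses the textbook urn rule in which the (i + 1)-th ball founds a new family with probability θ / (θ + i) and otherwise copies a ball already in the urn. Families stand for ancestral haplotypes (independent mutational origins).

```python
import random

def hoppe_urn_family_sizes(n, theta, rng):
    """Draw family sizes for n balls from a Hoppe urn with weight theta > 0."""
    sizes = []
    for i in range(n):
        # With i balls in the urn, the next ball founds a new family with
        # probability theta / (theta + i); the first ball always does.
        if rng.random() < theta / (theta + i):
            sizes.append(1)
        else:
            # Copy a uniformly chosen existing ball, which selects a family
            # with probability proportional to its current size.
            pick = rng.randrange(i)
            for j, size in enumerate(sizes):
                if pick < size:
                    sizes[j] += 1
                    break
                pick -= size
    return sorted(sizes, reverse=True)

def estimate_soft_sweep_probability(n, theta, replicates=20_000, seed=0):
    """Estimate the probability of more than one family in a sample of size n."""
    rng = random.Random(seed)
    soft = sum(len(hoppe_urn_family_sizes(n, theta, rng)) > 1
               for _ in range(replicates))
    return soft / replicates
```

For example, `hoppe_urn_family_sizes(20, 0.4, random.Random(1))` returns one random configuration of haplotype counts that sums to 20.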
Results: Ewens sampling formula • Probability for k haplotypes, occurring n1, …, nk times in a sample of size n: [equation]
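The formula itself is not preserved in this transcript. For reference, one standard way to write the Ewens sampling formula for k haplotypes labeled in random order is given below; whether the slide used exactly this form is an assumption.

```latex
\begin{equation}
\Pr(n_1, \dots, n_k \mid n, \theta)
  = \frac{n!}{k!\; n_1 n_2 \cdots n_k}\,
    \frac{\theta^{k}}{\theta\,(\theta + 1) \cdots (\theta + n - 1)},
\qquad n_1 + \dots + n_k = n .
\end{equation}
```

Setting k = 1 and n1 = n gives the probability of a single ancestral haplotype, (n - 1)! divided by the product (θ + 1)(θ + 2)⋯(θ + n - 1).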
Results: Ewens sampling formula • Probability for more than one ancestral haplotype in a sample (“soft sweep”): [equation]
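The soft-sweep probability under the Ewens approximation can be computed in a few lines. This sketch assumes the standard result that a sample of size n has a single haplotype with probability equal to the product of i / (i + θ) for i = 1, …, n - 1. The function names are hypothetical. The second function builds the full distribution of the number of haplotypes by adding one line at a time, as in the Hoppe urn.

```python
def soft_sweep_probability(n, theta):
    """Probability of more than one ancestral haplotype in a sample of size n."""
    p_single = 1.0
    for i in range(1, n):
        # The (i + 1)-th line joins an existing haplotype with prob. i / (i + theta).
        p_single *= i / (i + theta)
    return 1.0 - p_single

def haplotype_count_distribution(n, theta):
    """Return a list p where p[k] is the probability of exactly k haplotypes."""
    p = [0.0, 1.0]  # a sample of size 1 always has exactly one haplotype
    for i in range(1, n):
        new = [0.0] * (len(p) + 1)
        for k in range(1, len(p)):
            new[k] += p[k] * i / (i + theta)          # joins an existing haplotype
            new[k + 1] += p[k] * theta / (i + theta)  # founds a new haplotype
        p = new
    return p
```

For n = 20, `soft_sweep_probability(20, 1.0)` gives 0.95 and `soft_sweep_probability(20, 0.05)` gives about 0.16.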
Results: Probability of a soft sweep [Figure: Ewens approximation, sample size n = 20. Stacked bars (0% to 100%) showing the probability of 1, 2, 3, 4, and >4 haplotypes for θ = 0.004, 0.04, 0.4, 1, and 4.]
Results: Probability of a soft sweep [Figure: Simulation (2Ne s = 10000, n = 20). Stacked bars (0% to 100%) showing the probability of 1, 2, 3, 4, and >4 haplotypes for θ = 0.004, 0.04, 0.4, 1, and 4.] • Probability for multiple haplotypes: > 5% for θ > 0.01, > 95% for θ > 1
Results: Frequency of major haplotype [Figure: sample size 10. Probability (0 to 0.5) of major haplotype frequencies 5/10 to 9/10; simulations for α = 100, α = 1000, and α = 10000 compared with the prediction.]
When should we expect soft sweeps? Multiple haplotypes due to recurrent beneficial mutations • Strong dependence on the mutation rate • More than 5% for θ > 0.01 • E.g. African D. melanogaster: θ ≈ 0.05 (Li and Stephan 2006) • About 16% of all single-site adaptations “soft” • Particularly relevant for • Large populations (e.g. bacteria) • Adaptive (partial) loss-of-function mutations
Soft sweeps in data? • Drosophila • Schlenke and Begun (Genetics 2005): LD pattern at 3 immunity receptor genes in Californian D. simulans • Humans • Multiple origin of FY-0 Duffy allele (loss of function) • Plasmodium • Multiple origins of pyrimethamine resistance mutations
Generality of the result: migration instead of mutation • Beneficial alleles enter by recurrent migration at rate M = 2Ne m from a genetically diverged source population • Coalescent analysis with migration rate: [equation] • Directly proportional to coalescence rate (no factor 1 - xt) • Approximation holds exactly in this case
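The proportionality can be made explicit under stated assumptions. The migration rate equation is not preserved in this transcript; the form below assumes that a fraction m of the population is replaced by migrants each generation, all of them carrying the beneficial allele, and that pairs of lines in the beneficial background coalesce at rate 1/(Ne xt).

```latex
\begin{align}
\Pr(\text{migration on a given line}) &\approx \frac{m}{x_t} = \frac{M}{2N_e\,x_t}, \\
\frac{\text{migration rate for } n \text{ lines}}{\text{coalescence rate for } n \text{ lines}}
  &= \frac{n\,M / (2N_e\,x_t)}{\binom{n}{2} / (N_e\,x_t)} = \frac{M}{n - 1}.
\end{align}
```

The ratio no longer contains xt, so under these assumptions the Hoppe urn description applies with θ replaced by M.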
Generality of the result: migration instead of mutation [Figure: comparison of migration (M = 0.4) and mutation (θ = 0.4); axis label α.]