Lecture 25 : Tests of Neutrality

Lecture 25 : Tests of Neutrality. April 14, 2014. Last time: human origins; Out of Africa hypothesis; Neanderthal and Denisovan genomes; introgression into humans; signatures of selection. Today: sequence data and quantification of variation; infinite sites model; nucleotide diversity (π).


Presentation Transcript


  1. Lecture 25 : Tests of Neutrality April 14, 2014

  2. Last Time • Human origins • Out of Africa hypothesis • Neanderthal and Denisovan genomes • Introgression into humans • Signatures of selection

  3. Today • Sequence data and quantification of variation • Infinite sites model • Nucleotide diversity (π) • Sequence-based tests of neutrality • Ewens-Watterson Test • Tajima’s D • Hudson-Kreitman-Aguade • Synonymous versus Nonsynonymous substitutions • McDonald-Kreitman

  4. The main power of neutral theory is that it provides a theoretical expectation for genetic variation in the absence of selection.

  5. Equilibrium Heterozygosity under the Infinite Alleles Model (IAM) • Frequencies of individual alleles are constantly changing • Balance between loss and gain is maintained • 4N_eμ >> 1: mutation predominates, new mutants persist, H is high • 4N_eμ << 1: drift dominates, new mutants are quickly eliminated, H is low
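
The balance between mutation and drift has a standard closed form under the infinite alleles model, H = 4N_eμ / (4N_eμ + 1). This is the textbook result rather than something stated on the slide, and the function name below is only illustrative. A minimal sketch of how H moves between the two regimes:

```python
def equilibrium_heterozygosity(ne: float, mu: float) -> float:
    """Expected equilibrium heterozygosity under the infinite alleles model.

    Uses the standard result H = theta / (theta + 1) with theta = 4 * Ne * mu.
    """
    theta = 4.0 * ne * mu
    return theta / (theta + 1.0)


if __name__ == "__main__":
    mu = 1e-5  # assumed per-generation mutation rate, for illustration only
    for ne in (100, 10_000, 1_000_000):
        h = equilibrium_heterozygosity(ne, mu)
        print(f"Ne={ne:>9,}  4*Ne*mu={4 * ne * mu:8.3f}  H={h:.4f}")
```

When 4N_eμ is far below 1 the result is close to 4N_eμ itself (drift dominates); when it is far above 1 the result approaches 1 (mutation dominates).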

  6. Effects of Population Size on Expected Heterozygosity Under Infinite Alleles Model (μ = 10^-5) • Rapid approach to equilibrium in small populations • Higher heterozygosity with less drift
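
Both bullets can be checked numerically with the usual homozygosity recursion for the infinite alleles model, F' = (1 - μ)^2 [1/(2N) + (1 - 1/(2N)) F]. The sketch below assumes that recursion, an idealized population of constant size N, and a starting population with no variation (F = 1); all names are illustrative.

```python
def homozygosity_step(f: float, n: int, mu: float) -> float:
    """One generation of drift plus mutation acting on homozygosity F."""
    return (1.0 - mu) ** 2 * (1.0 / (2 * n) + (1.0 - 1.0 / (2 * n)) * f)


def equilibrium_h(n: int, mu: float) -> float:
    """Heterozygosity at the fixed point of the recursion above."""
    a = (1.0 - mu) ** 2
    f_eq = (a / (2 * n)) / (1.0 - a * (1.0 - 1.0 / (2 * n)))
    return 1.0 - f_eq


def generations_to_fraction(n: int, mu: float = 1e-5,
                            fraction: float = 0.95,
                            max_gen: int = 10_000_000) -> int:
    """Generations needed for H to reach `fraction` of its equilibrium value."""
    target = fraction * equilibrium_h(n, mu)
    f, t = 1.0, 0  # start with no variation: F = 1, H = 0
    while 1.0 - f < target and t < max_gen:
        f = homozygosity_step(f, n, mu)
        t += 1
    return t


if __name__ == "__main__":
    for n in (100, 1_000, 10_000):
        print(n, round(equilibrium_h(n, 1e-5), 4), generations_to_fraction(n))
```

Smaller populations reach their (lower) equilibrium in fewer generations, while larger populations settle at a higher heterozygosity but take longer to get there.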

  7. Fate of Alleles in Mutation-Drift Balance [Figure labels: generations from birth to fixation; time between fixation events] • Time to fixation of a new mutation is much longer than time to loss

  8. Fate of Alleles in Mutation-Drift-Selection Balance [Figure panels: Purifying Selection; Neutrality; Balancing Selection/Overdominance] Which case will have the most alleles on average at any given time? What will this depend upon? Highest H_E?

  9. Assume you take a sample of 100 alleles from a large (but finite) population in mutation-drift equilibrium. What is the expected distribution of allele frequencies in your sample under neutrality and the Infinite Alleles Model? [Figure: three candidate histograms, A, B, and C; x-axis: Number of Observations of Allele; y-axis: Number of Alleles]

  10. Allele Frequency Distributions [Figure: black bars, predicted from neutral theory; white bars, observed (hypothetical); Hartl and Clark 2007] • Neutral theory allows a prediction of the frequency distribution of alleles through the process of birth and demise of alleles through time • Comparison of observed to expected distribution provides evidence of departure from the Infinite Alleles Model • Depends on f, effective population size, and mutation rate

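  11. Ewens Sampling Formula • Population mutation rate, an index of variability of the population: $\theta = 4N_e\mu$ • Probability that the next sampled allele is new, given i alleles already sampled: $\theta/(\theta+i)$ • Probability of sampling a new allele on the first sample: $\theta/(\theta+0) = 1$ • Probability of observing a new allele after sampling one allele: $\theta/(\theta+1)$ • Probability of sampling a new allele on the third and fourth samples: $\theta/(\theta+2)$ and $\theta/(\theta+3)$ • Expected number of different alleles (k) in a sample of 2N alleles: $E(k) = 1 + \frac{\theta}{\theta+1} + \frac{\theta}{\theta+2} + \cdots + \frac{\theta}{\theta+2N-1}$ • Example: expected number of alleles in a sample of 4: $E(k) = 1 + \frac{\theta}{\theta+1} + \frac{\theta}{\theta+2} + \frac{\theta}{\theta+3}$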
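
The expected number of alleles is simply a sum of the "new allele" probabilities over successive draws. The sketch below assumes the standard form in which, after i alleles have been sampled, the next one is new with probability θ/(θ+i); the function names are illustrative.

```python
def prob_new_allele(theta: float, already_sampled: int) -> float:
    """Probability that the next sampled allele is of a type not yet seen."""
    return theta / (theta + already_sampled)


def expected_num_alleles(theta: float, sample_size: int) -> float:
    """Expected number of distinct alleles in a sample of `sample_size` alleles."""
    return sum(prob_new_allele(theta, i) for i in range(sample_size))


if __name__ == "__main__":
    # Sample of 4 alleles with theta = 1: 1 + 1/2 + 1/3 + 1/4
    print(expected_num_alleles(1.0, 4))
    # Very small theta gives about one allele; very large theta gives
    # nearly one allele per sampled chromosome.
    print(expected_num_alleles(1e-6, 20), expected_num_alleles(1e6, 20))
```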

  12. Ewens Sampling Formula • Predicts number of different alleles that should be observed in a given sample size if neutrality prevails under Infinite Alleles Model • Small θ: E(n) approaches 1 • Large θ: E(n) approaches 2N • θ can be predicted from number of observed alleles for given sample size • Can also predict expected homozygosity (f_e) under this model, where E(n) is the expected number of different alleles in a sample of N diploid individuals, and θ = 4N_eμ.

  13. Ewens-Watterson Test • Compares expected homozygosity under the neutral model to expected homozygosity under Hardy-Weinberg equilibrium using observed allele frequencies • Comparison of allele frequency distributions • f_e comes from infinite alleles model simulations and can be found in tables for given sample sizes and observed allele numbers

  14. Ewens-Watterson Test Example • Drosophila pseudoobscura collected from winery • Xanthine dehydrogenase alleles • 15 alleles observed in 89 chromosomes • f_HW = 0.366 • Generated f_e by simulation: mean 0.168 • How would you interpret this result? (Hartl and Clark 2007)
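
One way to generate the neutral distribution of homozygosity by simulation is sketched below. It is an assumption about how such a simulation could be done, not the procedure behind the slide's numbers: allele counts are drawn with a Hoppe urn (a new allele appears with probability θ/(θ+i) on draw i), and only samples with the observed number of alleles are kept. Because the number of alleles is a sufficient statistic for θ under this model, the value of θ used only affects how many samples are rejected. All names and the choice θ = 5 are illustrative.

```python
import random


def hoppe_urn_sample(n, theta, rng):
    """Allele counts for n sampled alleles under the neutral infinite alleles model."""
    counts = []
    for i in range(n):
        if rng.random() < theta / (theta + i):
            counts.append(1)  # draw i introduces a new allele
        else:
            # copy an existing allele, chosen in proportion to its current count
            r = rng.randrange(i)
            cumulative = 0
            for j, c in enumerate(counts):
                cumulative += c
                if r < cumulative:
                    counts[j] += 1
                    break
    return counts


def homozygosity(counts):
    """Sum of squared allele frequencies."""
    n = sum(counts)
    return sum((c / n) ** 2 for c in counts)


def simulate_neutral_homozygosity(n, k, theta, reps, seed=0):
    """Homozygosity values for neutral samples of n alleles that contain exactly k types."""
    rng = random.Random(seed)
    values = []
    while len(values) < reps:
        counts = hoppe_urn_sample(n, theta, rng)
        if len(counts) == k:  # condition on the observed number of alleles
            values.append(homozygosity(counts))
    return values


if __name__ == "__main__":
    sims = simulate_neutral_homozygosity(n=89, k=15, theta=5.0, reps=200)
    observed = 0.366
    mean_f = sum(sims) / len(sims)
    tail = sum(v >= observed for v in sims) / len(sims)
    print(f"simulated mean homozygosity: {mean_f:.3f}")
    print(f"fraction of simulations >= observed: {tail:.3f}")
```

The tail fraction plays the role of an empirical p-value for an excess of homozygosity relative to neutral expectation.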

  15. Most Loci Look Neutral According to Ewens-Watterson Test [Figure axis: Expected Homozygosity f_e; Hartl and Clark 2007]

  16. DNA Sequence Polymorphisms • DNA sequence is ultimate view of standing genetic variation: no hidden alleles • Is this really true? • What about back mutation? • Signatures of past evolution are contained in DNA sequence • Neutral theory presents null model • Departures due to: • Selection • Demographic events • Bottlenecks, founder effects • Population admixture

  17. Sequence Alignment • Necessary first step for comparing sequences within and between species • Many different algorithms • Tradeoff of speed and accuracy

  18. Quantifying Divergence of Sequences • Nucleotide diversity (π) is average number of pairwise differences between sequences: $\pi = \frac{N}{N-1}\sum_{i}\sum_{j} p_i p_j \pi_{ij}$, where N is number of sequences in sample, p_i and p_j are frequency of sequences i and j in the sample, and π_ij is the proportion of sites that differ between sequences i and j
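
For a sample in which every sequence is listed once, nucleotide diversity reduces to the mean proportion of differing sites over all pairs of sequences. The sketch below computes it that way; the three toy sequences are invented for illustration and are not the ones on the slides.

```python
from itertools import combinations


def proportion_different(a: str, b: str) -> float:
    """Proportion of aligned sites at which two sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def nucleotide_diversity(seqs) -> float:
    """Mean per-site pairwise difference over all pairs of sampled sequences."""
    n = len(seqs)
    if n < 2:
        raise ValueError("need at least two sequences")
    total = sum(proportion_different(a, b) for a, b in combinations(seqs, 2))
    return total / (n * (n - 1) / 2)


if __name__ == "__main__":
    # Hypothetical aligned haplotypes: 1, 1, and 2 pairwise differences.
    toy = ["ACGTACGTAC", "ACGTACGTAT", "ACGAACGTAC"]
    print(nucleotide_diversity(toy))  # per-site value; multiply by 1000 for per kb
```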

  19. Sample Calculation of π [Figure: aligned sequences A, B, and C, with site positions marked from 5 to 35] • A->B, 1 difference • A->C, 1 difference • B->C, 2 differences • On average, there are 18.67 polymorphisms per kb between pairs of haplotypes in the population

  20. Tajima’s D Statistic • Infinite Sites Model: each new mutation affects a new site in a sequence • Expected number of polymorphic sites in all sequences: $E(S) = a_1 m \theta$, where n is number of different sequences compared, m is length of sequence, θ is the per-site population mutation rate, and $a_1 = \sum_{i=1}^{n-1} \frac{1}{i}$
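
Under the infinite sites model, the number of polymorphic sites gives a second estimate of θ (Watterson's estimator): divide S by the harmonic-number term a_1 and, for a per-site value, by the sequence length. A minimal sketch, with invented toy sequences and illustrative function names:

```python
def segregating_sites(seqs) -> int:
    """Number of alignment columns with more than one distinct state."""
    return sum(len(set(column)) > 1 for column in zip(*seqs))


def harmonic_a1(n: int) -> float:
    """a_1 = 1 + 1/2 + ... + 1/(n - 1) for a sample of n sequences."""
    return sum(1.0 / i for i in range(1, n))


def watterson_theta(seqs) -> float:
    """Per-site estimate of theta from the number of segregating sites."""
    n = len(seqs)
    m = len(seqs[0])
    return segregating_sites(seqs) / (harmonic_a1(n) * m)


if __name__ == "__main__":
    # Hypothetical aligned haplotypes with two polymorphic sites.
    toy = ["ACGTACGTAC", "ACGTACGTAT", "ACGAACGTAC"]
    print(segregating_sites(toy), watterson_theta(toy))
```

Columns containing gaps or ambiguous bases would need to be filtered first; this sketch treats every distinct character as a distinct state.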

  21. Sample Calculation of S [Figure: aligned sequences A, B, and C, with site positions marked from 5 to 35] • Two polymorphic sites: S = 2

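  22. Tajima’s D Statistic • Two different ways of estimating same parameter: $\hat{\theta}_\pi = \pi$ and $\hat{\theta}_S = \frac{S}{a_1 m}$, where $a_1 = \sum_{i=1}^{n-1} \frac{1}{i}$, n is number of sequences, and m is length of sequence • Deviation of these two indicates deviation from neutral expectations: $d = \hat{\theta}_\pi - \hat{\theta}_S$ and $D = \frac{d}{\sqrt{V(d)}}$, where V(d) is variance of d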
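
A minimal sketch of the statistic, assuming the variance constants from Tajima (1989) and working with whole-sequence counts rather than per-site values (the mean number of pairwise differences and the number of segregating sites). The function name and argument order are illustrative.

```python
import math


def tajimas_d(n: int, seg_sites: int, mean_pairwise_diff: float) -> float:
    """Tajima's D from sample size, segregating sites, and mean pairwise differences."""
    if n < 4:
        raise ValueError("the variance term vanishes for n < 4; D is undefined")
    if seg_sites < 1:
        raise ValueError("D is undefined when there are no segregating sites")
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    d = mean_pairwise_diff - seg_sites / a1  # difference of the two theta estimates
    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return d / math.sqrt(variance)


if __name__ == "__main__":
    # Hypothetical sample: 10 sequences, 10 segregating sites.
    for k in (2.0, 3.5, 5.0):
        print(k, round(tajimas_d(10, 10, k), 3))
```

With S held fixed, a low mean pairwise difference gives a negative D (excess of rare variants) and a high one gives a positive D (excess of intermediate-frequency variants).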

  23. Tajima’s D Expectations • D=0: Neutrality • D>0 • Balancing Selection: Divergence of alleles (π) increases OR • Bottleneck: S decreases • D<0 • Purifying or Positive Selection: Divergence of alleles decreases OR • Population expansion: Many low frequency alleles cause low average divergence

  24. Balancing Selection • Should increase nucleotide diversity (π) • Decreases polymorphic sites (S) initially • D>0 [Figure legend: ‘balanced’ mutation; neutral mutation] Slide adapted from Yoav Gilad

  25. Recent Bottleneck • Rare alleles are lost • Polymorphic sites (S) more severely affected than nucleotide diversity (π) • D>0 [Figure label: standard neutral model]

  26. Positive Selection and Purifying Selection [Figure: time course labeled sweep and recovery, with curves for S and π; legend: advantageous mutation, neutral mutation] • Should decrease both nucleotide diversity (π) and polymorphic sites (S) initially • S recovers due to mutation • π recovers slowly: insensitive to rare alleles • D<0 Slide adapted from Yoav Gilad

  27. Rapid Population Growth will also result in an excess of rare alleles even for neutral loci [Figure panels over time: standard neutral model (often two main haplotypes, some rare alleles); rapid population size increase (most alleles are rare)] • Most alleles are rare • Nucleotide diversity (π) depressed • Polymorphic sites (S) unchanged or even enhanced: 4N_eμ is large • D<0 Slide adapted from Yoav Gilad

  28. How do we distinguish these two forms of divergence (selection vs demography)?

  29. Hudson-Kreitman-Aguade Test • Divergence between species should be of same magnitude as variation within species • Provides a correction factor for mutation rates at different sites • Complex goodness of fit test • Perform test for loci under selection and supposedly neutral loci

  30. Hudson-Kreitman-Aguade (HKA) test • Neutral Locus: polymorphism 8, divergence 20 • Test Locus A: polymorphism 3, divergence 8 • 8/20 ≈ 3/8 • Polymorphism: Variation within species • Divergence: Variation between species Slide adapted from Yoav Gilad

  31. Hudson-Kreitman-Aguade (HKA) test • Neutral Locus: polymorphism 8, divergence 20 • Test Locus B: polymorphism 3, divergence 19 • 8/20 >> 3/19 • Conclusion: polymorphism lower than expected in Test Locus B: Selective sweep? Slide adapted from Yoav Gilad
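
The comparison of polymorphism-to-divergence ratios can be roughed out with an ordinary 2x2 chi-square statistic on the counts. This is a deliberate simplification and not the HKA test itself, which fits a model with locus-specific mutation rates and a shared divergence time; the function names are illustrative.

```python
def polymorphism_divergence_ratio(polymorphism: int, divergence: int) -> float:
    """Within-species variation relative to between-species divergence."""
    return polymorphism / divergence


def chi_square_2x2(a: int, b: int, c: int, d: int) -> float:
    """Pearson chi-square statistic for the table [[a, b], [c, d]] (1 degree of freedom)."""
    observed = ((a, b), (c, d))
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    total = a + b + c + d
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed[i][j] - expected) ** 2 / expected
    return stat


if __name__ == "__main__":
    neutral = (8, 20)  # (polymorphism, divergence) at the neutral locus
    for name, test in (("Test Locus A", (3, 8)), ("Test Locus B", (3, 19))):
        ratio = polymorphism_divergence_ratio(*test)
        stat = chi_square_2x2(*neutral, *test)
        print(f"{name}: ratio={ratio:.3f}, chi-square={stat:.3f}")
```

With counts this small the approximation is crude, so the statistic is better read as a ranking of how far each test locus departs from the neutral locus than as a formal test.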

  32. http://www.nsf.gov/news/mmg/media/images/corn-and-teosinte_h1.jpg

  33. [Figure panels: Teosinte; Maize; Maize w/TBR mutation] http://www.nsf.gov/news/mmg/media/images/corn-and-teosinte_h1.jpg Mauricio 2001; Nature Reviews Genetics 2, 376

  34. HKA Example: Teosinte Branched • Lab exercise: test Teosinte-Branched gene for signature of purifying selection in maize compared to teosinte relative • Compare to patterns of polymorphism and diversity in Alcohol Dehydrogenase gene
