1 / 36

Lecture 4: Testing for Departures from Hardy-Weinberg Equilibrium

Lecture 4: Testing for Departures from Hardy-Weinberg Equilibrium. January 17, 2014. Last Time. Introduction to statistical distributions Estimating allele frequencies Introduction to Hardy-Weinberg Equilibrium. Today. Hardy-Weinberg Equilibrium Continued

caspar
Download Presentation

Lecture 4: Testing for Departures from Hardy-Weinberg Equilibrium

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Lecture 4: Testing for Departures from Hardy-Weinberg Equilibrium January 17, 2014

  2. Last Time • Introduction to statistical distributions • Estimating allele frequencies • Introduction to Hardy-Weinberg Equilibrium

  3. Today • Hardy-Weinberg Equilibrium Continued • Using Hardy-Weinberg: Estimating allele frequencies for dominant loci • Hypothesis testing

  4. What is a Population? • Operational definition: an assemblage of individuals • Population genetics definition: a collection of randomly mating individuals • Why does this matter?

  5. Frequency of A1A1 (P) Frequency of A1A2 (H) Frequency of A2A2 (Q) Hardy-Weinberg Law • Hardy and Weinberg came up with this simultaneously in 1908 • After one generation of random mating, single-locus genotype frequencies can be represented by a binomial (with 2 alleles) or a multinomial function of allele frequencies

  6. Hardy-Weinberg Equilibrium • After one generation of random mating, genotype frequencies remain constant, as long as allele frequencies remain constant • Provides a convenient Neutral Model to test for departures from assumptions • Allows genotype frequencies to be represented by allele frequencies: simplification of calculations

  7. New Notation Genotype Frequency AA P Aa H aa Q Allele Frequency A p a q

  8. Genotypes: Phenotypes: Alleles: : 4 : 10 : 6 : 4 : 10 : 6 : A2=14 : A1=26 How does Hardy-Weinberg Work? • Reproduction is a sampling process • Example: Mountain Laurel at Cooper’s Rock Red Flowers: 5000 Pink Flowers: 3000 White Flowers: 2000 A1A1 A1A2 A2A2 Frequency of A1 = p = 0.65 Frequency of A2 = q = 0.35 What are expected numbers of phenotypes and genotypes in a sample of 20 trees? What are expected frequencies of alleles in pollen and ovules?

  9. What will be the genotype and phenotype frequencies in the next generation? What assumptions must we make?

  10. Hardy-Weinberg Assumptions • Diploid • Large population • Random Mating: equal probability of mating among genotypes • No mutation • No gene flow • Equal allele frequencies between sexes • Nonoverlapping generations

  11. Graphical Representation of Hardy-Weinberg Law (p+q)2 =p2 + 2pq + q2 = 1

  12. Relationship Between Allele Frequencies and Genotype Frequencies under Hardy-Weinberg

  13. A(p) a(q) AA (p2) Aa (pq) A (p) aA (qp) aa (q2) a (q) Hardy-Weinberg Law and Probability p2 + 2pq + q2 = 1

  14. Alleles occur in gamete pool at same frequency as in adults • Probability of two alleles coming together to form a zygote is A B U Pollen Gametes Ovule Gametes What about a 3-Allele System? • Equilibrium established with ONE GENERATION of random mating • Genotype frequencies remain stable as long as allele frequencies remain stable • Remember assumptions! A1 (p) A2 (q) A3 (r) A1A1 = p2 A1A2 = 2pq A1 (p) A2A3 = 2qr A1A3 = 2pr A2 (q) A2A2 = q2 A3A3 = r2 A3 (r) From Neal, D. 2004. Introduction to Population Biology.

  15. Genotype Frequencies Under Hardy-Weinberg • Frequency of heterozygotes is maximum at intermediate allele frequencies

  16. At extreme allele frequencies, most copies of the minor allele are in heterozygotes, not homozygotes Recessive alleles are “hidden” from selection

  17. Frequencies of genotypes can be predicted from allele frequencies following one generation of random mating • Allele frequencies remain constant. • Why?

  18. Derivation of Hardy-Weinberg from Genotype Frequencies Aa x Aa

  19. Derivation of Hardy-Weinberg from Genotype Frequencies Total 2pq q2 1 p2

  20. Codominant Codominant locus locus Dominant locus Dominant locus - - A1A1 A1A2 A2A2 A2A2 A1A1 A1A2 + + Assumes Hardy-Weinberg Equilibrium! How do we estimate genotype frequencies for dominant loci? • First, get genotype frequency for recessive homozygote • frequency of A2A2 = Z=

  21. Example of calculating allele and genotype frequencies for dominant loci • Linanthus parryi is a desert annual with white and blue flower morphs, controlled by a single locus with two alleles • Blue is dominant to white: Blue Flowers: 750 White Flowers: 250 B1B1 andB1B2 B2B2 • Calculate p, q, X, Y, and Z

  22. Is this population in Hardy-Weinberg Equilibrium?

  23. Codominant Codominant locus locus Dominant locus Dominant locus - - A1A1 A1A2 A2A2 A2A2 A1A1 A1A2 + + Variance of Allele Frequency under Dominance • Frequency of dominant allele cannot be directly estimated from phenotypes (A1A1 is identical to A1A2) • Frequency of dominant allele (p) is estimated from frequency of recessive (q) • Variance of this estimate is therefore • Not the same as V(q)!

  24. Formula for binomial variance Derivation of Variance for Dominant Biallelic Locus • By definition: Variance of allele frequency for recessive allele at dominant locus

  25. Maximum Variance, Codominance Maximum Variance, Dominance p = 0.5 p = 0.125 Comparison of codominant and dominant variances Variance of allele frequency for codominant locus Variance of allele frequency for recessive allele at dominant locus Variance of q Variance of q Allele Frequency (q) Allele Frequency (q) Errors in genotype frequency estimates magnified at low allele frequencies

  26. Testing for Departures from Hardy-Weinberg Equilibrium

  27. Hypothesis Testing: Frequentist Approach • Define a null hypothesis, H0: • The probability of getting heads on each flip of a coin is p = 0.5 • Find the probability distribution for observing data under the null hypothesis (use binomial probablity distribution here) • Calculate the p-value, which is the probability of observing a result as extreme or more extreme if the null hypothesis is correct. • Reject the null hypothesis if the p-value is smaller than an arbitrarily chosen level of Type I statistical error (i.e., the probability of rejecting H0, when it is actually correct).

  28. Departures from Hardy-Weinberg • Chi-Square test is simplest (frequentist) way to detect departures from Hardy-Weinberg • Compare calculated Chi-Square value versus “critical value” to determine if a significant departure is supported by the data

  29. Meaning of P-value • Probability of a Chi-square value of the calculated magnitude or greater if the null hypothesis is true • Critical values are not magical numbers • Important to state hypotheses correctly • Interpret results within parameters of test • p<0.05: The null hypothesis of no significant departure from Hardy-Weinberg equilibrium is rejected.

  30. Alternatives to Chi-Square Calculation • If expected numbers are very small (less than 5), Chi-square distribution is not accurate • Exact tests are required if small numbers of expected genotypes are observed • Essentially a sample-point method based on permutations • Sample space is too large to sample exhaustively • Take a random sample of all possible outcomes • Determine if observed values are extreme compared to simulated values • Fisher’s Exact Test in lab next time

  31. Expected Heterozygosity If a population is in Hardy-Weinberg Equilibrium, the probability of sampling a heterozygous individual at a particular locus is the Expected Heterozygosity: • 2pq • for 2-allele, 1 locus system • OR • 1-(p2 + q2) or 1-Σ(expected homozygosity) • more general: what’s left over after calculating expected homozygosity Homozygosity is overestimated at small sample sizes. Must apply correction factor: Correction for bias in parameter estimates by small sample size

  32. Maximum Expected Heterozygosity • Expected heterozygosity is maximized when all allele frequencies are equal • Approaches 1 when number of alleles = number of chromosomes • Applying small sample correction factor: Also see Example 2.11 in Hedrick text

  33. Observed Heterozygosity • Proportion of individuals in a population that are heterozygous for a particular locus: Where Nij is the number of diploid individuals with genotype AiAj, and i ≠ j, And Hij is frequency of heterozygotes with those alleles • Difference between observed and expected heterozygosity will become very important soon • This is NOT how we test for departures from Hardy-Weinberg equilibrium!

  34. Alleles per Locus • Na: Number of alleles per locus • Ne: Effective number of alleles per locus If all alleles occurred at equal frequencies, this is the number of alleles that would result in the same expected heterozygosity as that observed in the population

  35. Calculate He, Na and Ne Example: Assay two microsatellite loci for WVU football team (N=50) Locus A Locus B

  36. Measures of Diversity are a Function of Populations and Locus Characteristics Assuming you assay the same samples, order the following markers by increasing average expected values of Ne and HE: RAPD SSR Allozyme

More Related