
Topic #7 Single-Locus Association Studies: Case-Control Studies

Topic #7 Single-Locus Association Studies: Case-Control Studies. University of Wisconsin Genetic Analysis Workshop June 2011. Outline. Case-Control Study: Two-allele, single locus model Alternative Tests for Association Quantitative Outcomes: Two-allele, single locus model


Presentation Transcript


  1. Topic #7 Single-Locus Association Studies: Case-Control Studies. University of Wisconsin Genetic Analysis Workshop, June 2011

  2. Outline • Case-Control Study: • Two-allele, single locus model • Alternative Tests for Association • Quantitative Outcomes: • Two-allele, single locus model • Alternative Tests • Multiple Testing (Topic #8):

  3. Basic Setup

  4. Case-Control Example (Poirier et al. 1993)

  5. Tests (Models) of Association • Genotype: Distribution of the 3 genotypes differs between the two groups (unstructured alternative); standard χ² on 2 df • Recessive: Relative frequency of A1A1 differs between the two groups; χ² on 1 df • Dominant: Relative frequency of A2A2 differs between the two groups; χ² on 1 df
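The three tests in this slide are all Pearson χ² tests on the same 2 x 3 genotype table, either used whole or collapsed to 2 x 2. The sketch below illustrates that idea with the standard library only. The genotype counts are HYPOTHETICAL (the Poirier et al. table is not reproduced in this transcript), and the helper names `pearson_chi2` and `chi2_pvalue` are made up for illustration.

```python
import math

def pearson_chi2(table):
    """Pearson chi-square statistic and df for an r x c table of counts."""
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    n = sum(row_tot)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            exp = row_tot[i] * col_tot[j] / n  # expected count under independence
            chi2 += (obs - exp) ** 2 / exp
    df = (len(table) - 1) * (len(col_tot) - 1)
    return chi2, df

def chi2_pvalue(chi2, df):
    """Upper-tail p-value; closed forms exist for df = 1 and df = 2."""
    if df == 1:
        return math.erfc(math.sqrt(chi2 / 2.0))
    if df == 2:
        return math.exp(-chi2 / 2.0)
    raise ValueError("this sketch only handles df = 1 or 2")

# HYPOTHETICAL counts, ordered (A1A1, A1A2, A2A2); not the data from the slides.
cases = [20, 80, 100]
controls = [5, 55, 140]

tables = {
    # Full 2 x 3 table: unstructured alternative, 2 df.
    "genotype": [cases, controls],
    # A1A1 versus everything else, 1 df.
    "recessive": [[cases[0], cases[1] + cases[2]],
                  [controls[0], controls[1] + controls[2]]],
    # A2A2 versus everything else, 1 df.
    "dominant": [[cases[0] + cases[1], cases[2]],
                 [controls[0] + controls[1], controls[2]]],
}

results = {}
for name, table in tables.items():
    chi2, df = pearson_chi2(table)
    results[name] = (chi2, df, chi2_pvalue(chi2, df))
    print(f"{name:9s} chi2({df} df) = {chi2:.2f}, p = {results[name][2]:.2g}")
```

The sketch does not guard against empty rows or columns, which would give a zero expected count and a division error.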

  6. Case-Control Example: Genotype Test • Test #1: Compare genotype frequencies in cases and controls • Test: χ²(2 df) = 27.7; p < 10⁻⁵

  7. Case-Control Example: Recessive Test • Test #2: Rate of E4/E4 in cases and controls • Test: χ²(1 df) = 5.46; p = .019

  8. Case-Control Example: Dominant Test • Test #3: Compare rate of +/+ in cases and controls • Test: χ²(1 df) = 27.3; p < 10⁻⁶

  9. Simple Association: More Tests • Trend (Cochran-Armitage): Regress proportion of cases on # of risk alleles (here E4) • Allele Test: Count alleles rather than individuals (assumes HWE) • Case-only design: Test whether cases are in HWE • Logistic Model

  10. Trend Test (Cochran & Armitage) • Test #4: Cochran-Armitage Trend Test • Test: χ²(1 df) = 25.3; p < 10⁻⁵
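As a rough illustration of the trend test, the sketch below computes the Cochran-Armitage χ² in its "N times squared correlation" form, where the correlation is between case status and the number of risk alleles (scores 0, 1, 2). The counts are HYPOTHETICAL, not the data behind the slide, and `trend_chi2` is an illustrative name. Some software uses N - 1 in place of N.

```python
import math

def trend_chi2(cases, controls, scores=(0, 1, 2)):
    """Cochran-Armitage trend chi-square (1 df), N * r^2 form.

    cases[i] and controls[i] are counts of people carrying scores[i] risk alleles.
    """
    totals = [r + s for r, s in zip(cases, controls)]
    n = sum(totals)
    n_cases = sum(cases)
    sum_xr = sum(x * r for x, r in zip(scores, cases))        # score sum in cases
    sum_xn = sum(x * t for x, t in zip(scores, totals))       # score sum overall
    sum_x2n = sum(x * x * t for x, t in zip(scores, totals))  # squared-score sum
    num = n * (n * sum_xr - n_cases * sum_xn) ** 2
    den = n_cases * (n - n_cases) * (n * sum_x2n - sum_xn ** 2)
    return num / den

# HYPOTHETICAL counts by number of risk alleles (0, 1, 2).
cases = [100, 80, 20]
controls = [140, 55, 5]

chi2 = trend_chi2(cases, controls)
p_value = math.erfc(math.sqrt(chi2 / 2.0))  # upper tail of chi-square, 1 df
print(f"trend chi2(1 df) = {chi2:.2f}, p = {p_value:.2g}")
```

With only two score categories the statistic reduces to the ordinary 2 x 2 Pearson χ², which is a convenient sanity check.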

  11. Allele Test • Test #5: Allele Frequency Comparison • Test: χ²(1 df) = 26.8; p < 10⁻⁶ • Reference: Sasieni, P. D. (1997). From genotypes to genes: Doubling the sample size. Biometrics, 53(4), 1253-1261.
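The allele test treats each person as two independent alleles, which is why it leans on HWE. A minimal sketch, assuming HYPOTHETICAL genotype counts and an illustrative function name `allele_test`:

```python
import math

def allele_test(cases, controls):
    """Allelic chi-square (1 df), p-value, and odds ratio from genotype counts.

    Each argument is (n_A1A1, n_A1A2, n_A2A2); every person contributes 2 alleles.
    """
    def allele_counts(g):
        # (number of A1 alleles, number of A2 alleles)
        return 2 * g[0] + g[1], 2 * g[2] + g[1]

    a, b = allele_counts(cases)      # A1, A2 alleles among cases
    c, d = allele_counts(controls)   # A1, A2 alleles among controls
    n = a + b + c + d                # total alleles = 2 x number of people
    # Shortcut form of the Pearson chi-square for a 2 x 2 table.
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    odds_ratio = (a * d) / (b * c)
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value, odds_ratio

# HYPOTHETICAL counts, ordered (A1A1, A1A2, A2A2).
chi2, p_value, odds_ratio = allele_test([20, 80, 100], [5, 55, 140])
print(f"allele chi2(1 df) = {chi2:.2f}, p = {p_value:.2g}, OR = {odds_ratio:.2f}")
```

The effective sample size in this table is the number of alleles, twice the number of individuals, which is the "doubling" that Sasieni (1997) examines.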

  12. Case-Only Design: Test for HWE • Test #6: Departure from HWE • Test: χ²(1 df) = 0.4; p = .52 (little power under a multiplicative model)
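A goodness-of-fit χ² against Hardy-Weinberg proportions can be sketched as follows. The allele frequency is estimated from the same genotype counts, which leaves 1 df. The counts are HYPOTHETICAL and `hwe_chi2` is an illustrative name.

```python
import math

def hwe_chi2(n11, n12, n22):
    """Chi-square (1 df) for departure from Hardy-Weinberg proportions."""
    n = n11 + n12 + n22
    p = (2 * n11 + n12) / (2 * n)  # estimated frequency of allele A1
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n11, n12, n22)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    p_value = math.erfc(math.sqrt(chi2 / 2.0))  # upper tail, 1 df
    return chi2, p_value

# HYPOTHETICAL case genotype counts (A1A1, A1A2, A2A2).
chi2, p_value = hwe_chi2(20, 80, 100)
print(f"HWE chi2(1 df) = {chi2:.3f}, p = {p_value:.2f}")
```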

  13. Summary of 6 Tests

  14. And there are more: A 7th (and later 8th!) test • Log-additive/logistic:

  15. Advantage of Logistic Framework • Can easily accommodate covariates • Can accommodate alternative models (e.g., dominance or recessive models) with dummy variables • Test of H₀: βᵢ = 0 is very nearly the same as the allele test
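To make the log-additive model concrete, here is a minimal sketch that fits logit P(case) = b0 + b1 x (number of risk alleles) by Newton-Raphson on grouped counts, using only the standard library. It is not how plink or any particular package implements logistic regression. The counts are HYPOTHETICAL and `fit_log_additive` is an illustrative name. Covariates or dominance dummy variables would add columns to the design and are left out here.

```python
import math

def fit_log_additive(cases, totals, scores=(0, 1, 2), n_iter=25):
    """Fit logit P(case) = b0 + b1 * score on grouped data by Newton-Raphson.

    Returns (b0, b1, se_b1). exp(b1) is the per-allele odds ratio.
    """
    b0 = b1 = 0.0
    for _ in range(n_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, r, n in zip(scores, cases, totals):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            w = n * p * (1.0 - p)
            g0 += r - n * p          # score (gradient) for the intercept
            g1 += x * (r - n * p)    # score (gradient) for the slope
            h00 += w                 # information matrix entries
            h01 += x * w
            h11 += x * x * w
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    # Information from the final pass; at convergence the last step is negligible.
    se_b1 = math.sqrt(h00 / det)
    return b0, b1, se_b1

# HYPOTHETICAL counts by number of risk alleles (0, 1, 2).
cases = [100, 80, 20]
controls = [140, 55, 5]
totals = [r + s for r, s in zip(cases, controls)]

b0, b1, se_b1 = fit_log_additive(cases, totals)
wald_chi2 = (b1 / se_b1) ** 2
print(f"per-allele OR = {math.exp(b1):.2f}, Wald chi2(1 df) = {wald_chi2:.2f}")
```

The Wald χ² from this fit can be compared with the allele and trend statistics computed on the same counts to see how closely the tests agree.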

  16. Genetic Determinants of Human Ageing and Longevity Project • Aim: • Identify genetic variants associated with extreme longevity • Basic Design: • 1200 cases (1905) and 800 controls (MADT) • Candidate-gene approach: 168 genes • Genotyping: • 1536 SNPs using Illumina’s Golden Gate Array

  17. Genes/SNPs Used in Workshop

  18. Summary of plink Data Cleaning of GCLC_Clean • Start: 1200 cases, 800 controls, 13 SNPs • Eliminate: • 293 individuals (203 cases/90 controls) with > 10% missing • 1 SNP eliminated because > 10% missing • 1 SNP failed HWE at p < .001 • 1 SNP eliminated due to low MAF • Final sample: 997 cases, 710 controls, and 13 SNPs in GCLC

  19. plink Implementation of Association Tests • Basic association test (allelic): plink --file gclc_clean --assoc (generates plink.assoc)

  20. plink Association Output (plink.assoc)

      CHR  SNP         BP        A1  F_A      F_U      A2  CHISQ    P          OR
      6    rs7742367   53469235  G   0.169    0.1472   A   2.906    0.08826    1.178
      6    rs670548    53474948  G   0.3507   0.39     A   5.504    0.01898    0.8447
      6    rs661603    53478066  G   0.4626   0.4085   A   9.752    0.001791   1.246
      6    rs16883912  53481730  A   0.1093   0.09437  G   1.988    0.1585     1.177
      6    rs572496    53485578  A   0.5005   0.4458   G   9.952    0.001607   1.246
      6    rs617066    53491877  A   0.3296   0.2648   G   16.52    4.82e-005  1.365
      6    rs2100375   53493434  A   0.3539   0.3077   G   7.936    0.004846   1.232
      6    rs531557    53497954  T   0.4769   0.433    A   6.421    0.01128    1.194
      6    rs16883966  53505685  G   0.05308  0.0346   A   6.506    0.01075    1.564
      6    rs4712035   53509062  C   0.1745   0.1711   G   0.06685  0.796      1.024
      6    rs2397147   53509546  G   0.432    0.388    A   6.608    0.01015    1.2
      6    rs534957    53514310  G   0.3258   0.338    C   0.5596   0.4544     0.9464
      6    rs675908    53521259  G   0.3246   0.3383   A   0.7009   0.4025     0.94

      Highlighted on the slide: SNPs nominally significant at p < .05
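The OR and P columns in the plink.assoc listing can be recovered from the other columns, which is a useful way to check one's reading of the output. The sketch below assumes F_A and F_U are the A1 allele frequencies in cases and controls and that CHISQ is a 1 df statistic. It recomputes three rows copied from the listing; small differences from plink's values reflect rounding of the printed inputs.

```python
import math

def odds_ratio(f_a, f_u):
    """Allelic odds ratio from the A1 frequency in cases (F_A) and controls (F_U)."""
    return (f_a / (1.0 - f_a)) / (f_u / (1.0 - f_u))

def p_from_chi2_1df(chisq):
    """Upper-tail p-value of a chi-square statistic on 1 df."""
    return math.erfc(math.sqrt(chisq / 2.0))

# (SNP, F_A, F_U, CHISQ) copied from the plink.assoc listing.
rows = [
    ("rs617066", 0.3296, 0.2648, 16.52),
    ("rs661603", 0.4626, 0.4085, 9.752),
    ("rs4712035", 0.1745, 0.1711, 0.06685),
]

for snp, f_a, f_u, chisq in rows:
    print(f"{snp:10s} OR = {odds_ratio(f_a, f_u):.3f}  p = {p_from_chi2_1df(chisq):.3g}")
```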

  21. plink Implementation of Association Tests • Basic association test (allelic): plink --file gclc_clean --assoc (generates plink.assoc) • Genetic model based tests (genotype, trend, domin, recess): plink --file gclc_clean --model (generates plink.model)

  22. Association ‘Model’ Tests for 13 GCLC SNPs • Highlighted in red: nominally significant at p < .05 • Highlighted in blue: significant after Bonferroni correction, p < .004 (i.e., .05/13)
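The Bonferroni threshold is simply the nominal level divided by the number of tests. As an illustration, the sketch below applies it to the allelic p-values from the plink.assoc listing in slide 20. These are not the model-test p-values in this slide's table, which the transcript does not reproduce.

```python
# Allelic-test p-values copied from the plink.assoc listing (slide 20).
pvals = {
    "rs7742367": 0.08826, "rs670548": 0.01898, "rs661603": 0.001791,
    "rs16883912": 0.1585, "rs572496": 0.001607, "rs617066": 4.82e-05,
    "rs2100375": 0.004846, "rs531557": 0.01128, "rs16883966": 0.01075,
    "rs4712035": 0.796, "rs2397147": 0.01015, "rs534957": 0.4544,
    "rs675908": 0.4025,
}

alpha = 0.05
threshold = alpha / len(pvals)  # Bonferroni: .05 / 13

nominal = [snp for snp, p in pvals.items() if p < alpha]
bonferroni = [snp for snp, p in pvals.items() if p < threshold]

print(f"Bonferroni threshold = {threshold:.4f}")
print(f"nominally significant ({len(nominal)}): {nominal}")
print(f"significant after Bonferroni ({len(bonferroni)}): {bonferroni}")
```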

  23. Low Frequency SNPs • Within the 13 GCLC SNPs, rs16883966 had MAF < .05 (.049 in Danish 1905 and .037 in MADT) • For this SNP, the test statistic could not be computed for the Genotype, Dominant, and Recessive models because of low cell frequencies (Exp < .05)

  24. plink Implementation of Association Tests • Basic association test (allelic): plink --file gclc_clean --assoc (generates plink.assoc) • Genetic model based tests (genotype, trend, domin, recess): plink --file gclc_clean --model (generates plink.model) • Fisher exact test (the 8th!): plink --file gclc_clean --fisher (generates plink.fisher) • Logistic: plink --file gclc_clean --logistic (generates plink.logistic)

  25. Association Tests/ORs

  26. Plot of GCLC SNPs’ –log(p)

  27. Single-locus Quantitative Model

  28. Reparameterized Single-locus Model

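  29. Genotypic Values • Genotypes: A1A1, A1A2, A2A2 • Genotypic means: u11, u12, u22

  30. Genotypic Values • Genotypes: A1A1, A1A2, A2A2 • Genotypic means: u11, u12, u22 • Reparameterized values: -a, d, a

  31. Genotypic Values • Genotypes: A1A1, A1A2, A2A2 • Genotypic means: u11, u12, u22 • Reparameterized values: -a, d, a • d is the dominance parameter; when d = 0, the locus is additive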

  32. Additive Genetic Variance

  33. Additive Genetic Variance Note: d contributes to additive variance whenever q is not equal to .5

  34. Dominance Genetic Variance Note: There is dominance variance only when d is not 0
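The variance formulas on these slides did not survive in the transcript, so the sketch below reconstructs the decomposition numerically rather than quoting them. It rests on an assumed coding: genotypic values -a, d, +a for 0, 1, 2 copies of the allele with frequency q, with genotype frequencies in HWE. Additive variance is taken as the variance explained by regressing genotypic value on allele count, and dominance variance as the residual. Under this coding the results match the textbook expressions 2pq[a + d(p - q)]² and (2pqd)², with p = 1 - q. The function name is illustrative.

```python
def variance_components(q, a, d):
    """Regression slope, additive variance, and dominance variance under HWE.

    Assumed coding: values -a, d, +a for 0, 1, 2 copies of the allele
    whose frequency is q (0 < q < 1).
    """
    p = 1.0 - q
    freqs = [p * p, 2 * p * q, q * q]
    values = [-a, d, a]
    counts = [0, 1, 2]

    mean_g = sum(f * g for f, g in zip(freqs, values))
    mean_x = sum(f * x for f, x in zip(freqs, counts))
    var_g = sum(f * (g - mean_g) ** 2 for f, g in zip(freqs, values))
    var_x = sum(f * (x - mean_x) ** 2 for f, x in zip(freqs, counts))
    cov_xg = sum(f * (x - mean_x) * (g - mean_g)
                 for f, x, g in zip(freqs, counts, values))

    slope = cov_xg / var_x          # average effect of an allele substitution
    v_add = slope ** 2 * var_x      # variance explained by the regression line
    v_dom = var_g - v_add           # residual (dominance) variance
    return slope, v_add, v_dom

for q, d in [(0.5, 0.0), (0.5, 0.5), (0.5, 1.0), (0.2, 1.0)]:
    slope, v_add, v_dom = variance_components(q, a=1.0, d=d)
    print(f"q = {q}, d = {d}: slope = {slope:.3f}, VA = {v_add:.4f}, VD = {v_dom:.4f}")
```

At q = .5 the slope equals a whatever the value of d, and d enters the slope only when q differs from .5, in line with the notes on the two variance slides.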

  35. Some Examples (all q = .5)

  36. Complete Additivity • Slope of regression line = a • Additive genetic variance = regression variance

  37. Some Examples (all q = .5)

  38. Partial Dominance • Slope of regression line = a • Additive genetic variance = regression variance • Dominance variance = residual variance

  39. Some Examples (all q = .5)

  40. Complete Dominance • Slope of regression line = a • Additive genetic variance = regression variance • Dominance variance = residual variance

  41. Some Examples

  42. Some Examples

  43. Some Examples

  44. Some Conclusions • Dominance effects contribute to additive genetic variance • Even with complete Mendelian dominance, additive variance typically exceeds dominance variance (exception would be overdominance)

  45. Power Calculation in Quanto for Quantitative Trait • In a study of 1000 unrelated individuals, what is our power to detect a single-locus effect? • Strength of genetic effect (R²g)? • Risk allele frequency?

  46. Quanto G Power Calculation • Outcome/Design: • Continuous, Independent Individuals • Hypothesis: • Gene Only • Gene: • Allele Frequency .10 to .90 by .20 • Additive model • Outcome Model: • R²g = .001 to .019 by .002 • Power: • Sample Size = 1000 to 1000 by 0 • Type I error rate = .05, two-sided • Calculate:
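For a rough feel of what such a calculation returns, the sketch below uses a normal approximation to the power of a two-sided 1 df test when the locus explains a fraction R²g of the trait variance. This is not Quanto's algorithm, and `approx_power` is an illustrative name. The noncentrality is taken as N x R²g / (1 - R²g), so under an additive model the allele frequency does not enter once R²g is fixed.

```python
import math
from statistics import NormalDist

def approx_power(n, r2, alpha=0.05):
    """Normal approximation to two-sided power for a 1 df test.

    n  : number of unrelated individuals
    r2 : fraction of trait variance explained by the locus (R2g)
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1.0 - alpha / 2.0)
    shift = math.sqrt(n * r2 / (1.0 - r2))  # expected |test statistic|
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)

# Same grid as the Quanto setup: R2g = .001 to .019 by .002, N = 1000.
for k in range(10):
    r2 = 0.001 + 0.002 * k
    print(f"R2g = {r2:.3f}  approximate power = {approx_power(1000, r2):.2f}")
```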

  47. Computed Power for N = 1000 (Minor Allele = Risk Allele) • % Variance Accounted For

  48. Association with a Quantitative Phenotype • Genotype: 10 SNP markers in the COMT gene, including rs4680 • Sample: 7235 participants in MCTFR longitudinal research • Phenotype: General externalizing composite (overall mean ~ 0.0, SD ~ .36) • Command: plink --bfile comt --pheno ext.dat --mpheno 2 --missing-phenotype -99.0 --assoc --qt-means

  49. Output: plink.qassoc

      CHR  SNP        BP        NMISS  BETA       SE        R2          T        P
      22   rs4646312  18328337  7233   -0.003598  0.006141  4.747e-005  -0.5859  0.558
      22   rs165656   18328863  7232   -0.01252   0.005983  0.0006056   -2.093   0.03637
      22   rs165722   18329013  7235   -0.01346   0.005974  0.0007017   -2.254   0.02424
      22   rs2239393  18330428  7233   -0.003556  0.006125  4.662e-005  -0.5806  0.5615
      22   500437     18330763  7232   -0.004062  0.006127  6.079e-005  -0.663   0.5074
      22   rs4680     18331271  7234   -0.01358   0.005973  0.0007139   -2.273   0.02305
      22   rs4646316  18332132  7235   -0.002434  0.007201  1.58e-005   -0.3381  0.7353
      22   rs165774   18332561  7235   0.009351   0.006543  0.0002823   1.429    0.153
      22   rs174699   18334458  7235   -0.0124    0.01288   0.0001281   -0.9626  0.3358
      22   rs165599   18336781  7233   -0.004997  0.006435  8.337e-005  -0.7765  0.4375

      Highlighted on the slide: SNPs nominally significant at p < .05
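The columns of plink.qassoc are tied together by simple regression identities, which the sketch below checks on the three nominally significant rows: T = BETA / SE, T² = R2 x (NMISS - 2) / (1 - R2), and P is close to the two-sided normal tail of T because the degrees of freedom are large. The normal tail stands in for the t distribution here, so the recomputed p-values are slightly smaller than plink's.

```python
from statistics import NormalDist

# (SNP, NMISS, BETA, SE, R2, T, P) copied from the plink.qassoc listing.
rows = [
    ("rs165656", 7232, -0.01252, 0.005983, 0.0006056, -2.093, 0.03637),
    ("rs165722", 7235, -0.01346, 0.005974, 0.0007017, -2.254, 0.02424),
    ("rs4680", 7234, -0.01358, 0.005973, 0.0007139, -2.273, 0.02305),
]

def recompute(nmiss, beta, se, r2):
    """Recover T, T squared, and an approximate p-value from the other columns."""
    t_from_beta = beta / se
    t_sq_from_r2 = r2 * (nmiss - 2) / (1.0 - r2)
    p_normal = 2.0 * NormalDist().cdf(-abs(t_from_beta))  # large-df approximation
    return t_from_beta, t_sq_from_r2, p_normal

for snp, nmiss, beta, se, r2, t, p in rows:
    t_hat, t_sq, p_hat = recompute(nmiss, beta, se, r2)
    print(f"{snp:9s} T = {t_hat:.3f} (plink {t}), "
          f"T^2 from R2 = {t_sq:.3f}, p = {p_hat:.4f} (plink {p})")
```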

  50. Output: plink.qassoc.means (rs4680)
