Understanding Genetic Association Studies for Epidemiologists

Learn the analysis of genetic association studies, measures of association, quality control, and more in quantitative genetics. Discover how to detect false positives and correct for them. Explore models of genetic transmission and gene interactions.

Presentation Transcript


  1. Genetics for Epidemiologists. Lecture 5: Analysis of Genetic Association Studies. Teri A. Manolio, M.D., Ph.D., Director, Office of Population Genomics, and Senior Advisor to the Director, NHGRI, for Population Genomics. National Human Genome Research Institute, National Institutes of Health, U.S. Department of Health and Human Services

  2. Topics to be Covered • Discrete traits and quantitative traits • Measures of association • Detecting/correcting for false positives • Genotyping quality control • Quantile-quantile (Q-Q) plots • Odds ratios: allelic and genotypic • Models of genetic transmission • Interactions: gene-gene, gene-environment

  3. Larson, G. The Complete Far Side. 2003.

  4. Quantitative Genetics “…concerned with the inheritance of those differences between individuals that are of degree rather than of kind…” Falconer and Mackay, Quantitative Genetics 1996.

  8. Inheritance Models in Single Gene Trait (figure: alleles A and a)

  9. Inheritance Models in Single Gene Trait

  10. Inheritance Models in Single Gene Trait

  11. Inheritance Models in Single Gene Trait

  12. Inheritance Models in Single Gene Trait

  13. Inheritance Models in Single Gene Trait

  14. Inheritance Models in Quantitative Trait (figure: allele A, x increase in height; allele a, x decrease in height)

  15. Inheritance Models in Quantitative Trait

  16. Inheritance Models in Quantitative Trait

  17. Inheritance Models in Quantitative Trait

  18. Inheritance Models in Quantitative Trait

  19. Inheritance Models in Quantitative Trait

  20. Quantitative Traits with Published GWA Studies (16 - 34) • QT interval • Lipids and lipoproteins • Memory • Nicotine dependence • ORMDL3 expression • YKL-40 levels • Obesity, BMI, waist • Insulin resistance • Height • Bone mineral density • F-cell distribution • Fetal hemoglobin levels • C-Reactive protein • 18 groups of Framingham traits • Pigmentation • Uric Acid Levels • Recombination Rate

  21. Association of Alleles and Genotypes of rs1333049 ('3049) with Myocardial Infarction Samani N et al, N Engl J Med 2007; 357:443-453.

  22. Association of Alleles and Genotypes of rs1333049 ('3049) with Myocardial Infarction Samani N et al, N Engl J Med 2007; 357:443-453.
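
The allelic and genotypic odds ratios named in the slide titles above can be sketched as follows. This is a minimal illustration, not the analysis from Samani et al.: the genotype counts are hypothetical, "A" is assumed to be the risk allele, and the function names are invented for this example.

```python
def allelic_odds_ratio(case_counts, control_counts):
    """Allelic OR from genotype counts given as (AA, Aa, aa); 'A' is the assumed risk allele."""
    case_A = 2 * case_counts[0] + case_counts[1]
    case_a = 2 * case_counts[2] + case_counts[1]
    ctrl_A = 2 * control_counts[0] + control_counts[1]
    ctrl_a = 2 * control_counts[2] + control_counts[1]
    # Odds of carrying A among case chromosomes divided by the same odds among controls.
    return (case_A * ctrl_a) / (case_a * ctrl_A)


def genotypic_odds_ratios(case_counts, control_counts):
    """ORs for Aa and for AA, each compared with the aa reference genotype."""
    case_AA, case_Aa, case_aa = case_counts
    ctrl_AA, ctrl_Aa, ctrl_aa = control_counts
    or_het = (case_Aa * ctrl_aa) / (case_aa * ctrl_Aa)
    or_hom = (case_AA * ctrl_aa) / (case_aa * ctrl_AA)
    return or_het, or_hom


# Hypothetical (AA, Aa, aa) counts, NOT taken from the cited study.
cases = (300, 500, 200)
controls = (200, 500, 300)

print("allelic OR:", allelic_odds_ratio(cases, controls))
print("genotypic ORs (Aa vs aa, AA vs aa):", genotypic_odds_ratios(cases, controls))
```

The allelic OR treats each chromosome as an observation, while the genotypic ORs compare individuals, so the two need not agree unless the effect is roughly multiplicative per allele.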

  23. -Log10 P Values for SNP Associations with Myocardial Infarction Samani N et al, N Engl J Med 2007; 357:443-453.

  24. Genome-Wide Scan for Type 2 Diabetes in a Scandinavian Cohort http://www.broad.mit.edu/diabetes/scandinavs/type2.html

  25. GWA Study of Serum Uric Acid Levels • Linear regression of inverse normalized levels against number of alleles • Additive model • Sex, age, age² as covariates Li S et al, PLoS Genet 2007; 3:e194.
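
A rough sketch of the approach on this slide: rank-based inverse normal transformation of the trait, then least-squares regression on the number of alleles (additive model). Several things here are assumptions rather than details from Li et al.: the rank offset of 0.5, the handling of ties, the toy data, and the function names. The covariates (sex, age, age²) are omitted to keep the example within the standard library.

```python
from statistics import NormalDist, mean


def inverse_normal_transform(values):
    """Map values to standard normal quantiles by rank (ties ranked in input order)."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    transformed = [0.0] * n
    for rank, idx in enumerate(order, start=1):
        # (rank - 0.5) / n is one common offset; other conventions exist.
        transformed[idx] = NormalDist().inv_cdf((rank - 0.5) / n)
    return transformed


def additive_fit(allele_counts, phenotype):
    """Least-squares slope and intercept of phenotype on allele count (0, 1 or 2)."""
    x_bar, y_bar = mean(allele_counts), mean(phenotype)
    sxx = sum((x - x_bar) ** 2 for x in allele_counts)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(allele_counts, phenotype))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


# Hypothetical data: allele counts and serum uric acid levels for ten people.
genotypes = [0, 0, 1, 1, 1, 2, 2, 0, 1, 2]
uric_acid = [4.1, 4.5, 5.0, 5.2, 4.9, 6.1, 5.8, 4.3, 5.5, 6.4]

z = inverse_normal_transform(uric_acid)
beta, intercept = additive_fit(genotypes, z)
print("per-allele effect on the transformed scale:", beta)
```

In the additive model the slope is the expected change in the transformed trait for each additional copy of the allele.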

  26. Association of rs6855911 and Uric Acid Levels Li S et al, PLoS Genet 2007; 3:e194.

  27. Association Methods for Quantitative Traits • Linear regression of multivariable adjusted residual against number of alleles (Kathiresan, Nat Genet 2008; 40:189-97) • Linear regression of log transformed or centralized BMI against genotype (Frayling, Science 2007; 316:889-94) • Variance components based Z-score analysis of quantile normalized height (Sanna, Nat Genet 2008; 40:198-203)

  28. Ways of Dealing with Multiple Testing • Control family-wise error rate (FWER): Bonferroni (α′ = α/n) or Šidák (α′ = 1 − (1 − α)^(1/n)) • False discovery rate: proportion of significant associations that are actually false positives • False positive report probability: probability that the null hypothesis is true, given a statistically significant finding • Bayes factor analysis: avoids the need to assess genome-wide error rates but requires specifying a reasonable alternative model Hoggart CJ et al, Genet Epidemiol 2008; 32:179-85.
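
The corrections listed on this slide can be written out in a few lines. The sketch below computes the Bonferroni and Šidák per-test thresholds, applies the Benjamini-Hochberg step-up procedure (one common way to control the false discovery rate; the slide does not name a specific procedure), and evaluates the false positive report probability from an assumed prior and power. The number of tests, the p-values, the prior and the power are all hypothetical.

```python
import math


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level alpha / n."""
    return alpha / n_tests


def sidak_threshold(alpha, n_tests):
    """Per-test level 1 - (1 - alpha)^(1/n), computed stably for large n."""
    return -math.expm1(math.log1p(-alpha) / n_tests)


def benjamini_hochberg(p_values, q):
    """Indices of tests declared significant at FDR level q (step-up procedure)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= q * rank / m:
            cutoff_rank = rank  # keep the largest rank that meets its threshold
    return sorted(order[:cutoff_rank])


def false_positive_report_probability(alpha, power, prior):
    """P(null is true | significant), given the prior probability of a true association."""
    false_pos = alpha * (1.0 - prior)
    true_pos = power * prior
    return false_pos / (false_pos + true_pos)


n_snps = 500_000  # hypothetical number of SNPs tested
print("Bonferroni:", bonferroni_threshold(0.05, n_snps))
print("Sidak:     ", sidak_threshold(0.05, n_snps))

p_vals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
print("BH significant indices:", benjamini_hochberg(p_vals, q=0.05))
print("FPRP:", false_positive_report_probability(alpha=0.05, power=0.8, prior=0.01))
```

For large n the Bonferroni and Šidák thresholds are nearly identical, with Šidák always slightly less conservative.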

  29. Larson, G. The Complete Far Side. 2003.

  30. Quality Control of SNP Genotyping: Samples • Identity with forensic markers (Identifiler) • Blind duplicates • Gender checks • Cryptic relatedness or unsuspected twinning • Degradation/fragmentation • Call rate (> 80-90%) • Heterozygosity: outliers • Plate/batch calling effects Chanock et al, Nature 2007; Manolio et al Nat Genet 2007

  31. Quality Control of SNP Genotyping: SNPs • Duplicate concordance (CEPH samples) • Mendelian errors (typically < 1) • Hardy-Weinberg errors (often P > 10⁻⁵) • Heterozygosity (outliers) • Call rate (typically > 98%) • Minor allele frequency (often > 1%) • Validation of most critical results on independent genotyping platform Chanock et al, Nature 2007; Manolio et al Nat Genet 2007
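
Two of the SNP-level filters above, call rate and minor allele frequency, are simple to compute. The sketch below assumes genotypes are coded as the count of one allele (0, 1 or 2) with `None` for a failed call. The coding and function names are illustrative choices, and the default thresholds mirror the "typically > 98%" and "often > 1%" values on the slide.

```python
def snp_call_rate(genotypes):
    """Fraction of samples with a non-missing genotype for this SNP."""
    called = [g for g in genotypes if g is not None]
    return len(called) / len(genotypes)


def minor_allele_frequency(genotypes):
    """Frequency of the less common allele among called genotypes (0.0 if none called)."""
    called = [g for g in genotypes if g is not None]
    if not called:
        return 0.0
    freq = sum(called) / (2 * len(called))  # two alleles per called sample
    return min(freq, 1.0 - freq)


def passes_snp_qc(genotypes, min_call_rate=0.98, min_maf=0.01):
    """Apply both filters; the thresholds are study-specific choices."""
    return (snp_call_rate(genotypes) > min_call_rate
            and minor_allele_frequency(genotypes) > min_maf)


# Hypothetical SNPs across 100, 100 and 1000 samples.
good_snp = [0] * 90 + [1] * 9 + [None]
low_call_snp = [0] * 50 + [1] * 45 + [None] * 5
rare_snp = [0] * 999 + [1]

for name, snp in [("good", good_snp), ("low call", low_call_snp), ("rare", rare_snp)]:
    print(name, snp_call_rate(snp), minor_allele_frequency(snp), passes_snp_qc(snp))
```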

  32. Hardy-Weinberg Equilibrium • The occurrences of the two alleles of a SNP in the same individual are independent events • Ideal conditions: random mating; no selection (equal survival); no migration; no mutation; no inbreeding; large population size; gene frequencies equal in males and females; … • If alleles A and a of SNP rs1234 have frequencies p and 1−p, the expected frequencies of the three genotypes are: Freq AA = p², Freq Aa = 2p(1−p), Freq aa = (1−p)² After G. Thomas, NCI
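
As a worked illustration of these expected frequencies, the sketch below estimates p from observed genotype counts, derives the expected counts, and compares observed with expected using a one-degree-of-freedom chi-square test. The choice of test is an assumption for this example (exact tests are often preferred when counts are small), and the genotype counts are hypothetical.

```python
import math


def hwe_expected_counts(n_AA, n_Aa, n_aa):
    """Expected genotype counts under Hardy-Weinberg equilibrium."""
    n = n_AA + n_Aa + n_aa
    p = (2 * n_AA + n_Aa) / (2 * n)  # estimated frequency of allele A
    return n * p * p, n * 2 * p * (1 - p), n * (1 - p) ** 2


def hwe_chi_square(n_AA, n_Aa, n_aa):
    """Chi-square statistic and p-value (1 df); assumes both alleles are observed."""
    observed = (n_AA, n_Aa, n_aa)
    expected = hwe_expected_counts(n_AA, n_Aa, n_aa)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Upper tail of a chi-square with 1 df equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts: the first set matches HWE with p = 0.6, the second lacks heterozygotes.
print(hwe_chi_square(360, 480, 160))
print(hwe_chi_square(400, 400, 200))
```

A SNP like the second one, with far fewer heterozygotes than expected, would fall below a threshold such as 10⁻⁵ and be flagged for possible genotyping error.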

  33. Coverage, Call Rates, and Concordance of Perlegen and Affymetrix Platforms on HapMap Phase II GAIN Collaborative Group, Nat Genet 2007; 39:1045-51.

  34. Sample and SNP QC Metrics for Affymetrix 5.0 and 6.0 Platforms in GAIN Courtesy, J Paschall, NCBI

  35. Sample and SNP QC Metrics for Affymetrix 5.0 and 6.0 Platforms in GAIN Courtesy, J Paschall, NCBI

  36. Sample Heterozygosity in GAIN Courtesy, J Paschall, NCBI

  37. Sample Heterozygosity in GAIN Courtesy, J Paschall, NCBI
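
Per-sample heterozygosity, as plotted in the slides above, is the fraction of a sample's called genotypes that are heterozygous. The sketch below flags samples whose heterozygosity lies far from the cohort mean. The cutoff of 3 standard deviations, the genotype coding (0, 1, 2, `None` for missing) and the sample IDs are assumptions for illustration, not values from GAIN.

```python
from statistics import mean, stdev


def sample_heterozygosity(genotypes):
    """Fraction of called genotypes that are heterozygous (coded 1)."""
    called = [g for g in genotypes if g is not None]
    return sum(1 for g in called if g == 1) / len(called)


def heterozygosity_outliers(het_by_sample, n_sd=3.0):
    """Sample IDs whose heterozygosity is more than n_sd standard deviations from the mean."""
    values = list(het_by_sample.values())
    mu, sd = mean(values), stdev(values)
    return sorted(s for s, h in het_by_sample.items() if abs(h - mu) > n_sd * sd)


print(sample_heterozygosity([0, 1, 1, 2, None, 1]))

# Hypothetical cohort: 19 samples near 0.30 and one sample with excess heterozygosity.
het = {f"S{i:02d}": 0.30 + 0.001 * (i % 3) for i in range(19)}
het["S19"] = 0.45
print(heterozygosity_outliers(het))
```

Excess heterozygosity can point to sample contamination and a deficit to inbreeding or poor DNA quality, so flagged samples are usually inspected rather than dropped automatically.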

  38. Signal Intensity Plots for rs10801532 in AREDS http://www.ncbi.nlm.nih.gov/sites/entrez

  39. Signal Intensity Plots for rs4639796 in AREDS http://www.ncbi.nlm.nih.gov/sites/entrez

  40. Signal Intensity Plots for rs534399 in AREDS http://www.ncbi.nlm.nih.gov/sites/entrez

  41. Signal Intensity Plots for rs572515 in AREDS http://www.ncbi.nlm.nih.gov/sites/entrez

  42. Signal Intensity Plots for CD44 SNP rs9666607 Clayton DG et al, Nat Genet 2005; 37:1243-1246.

  43. Principal Component Analysis of Structured Population: First to Third Components Courtesy, G. Thomas, NCI

  44. Principal Component Analysis of Structured Population: Fourth and Fifth Components Courtesy, G. Thomas, NCI

  45. Influence of Relatedness on Principal Component Analysis Courtesy, G. Thomas, NCI

  46. Principal Component Analysis of Structured Population: Fourth and Fifth Components Courtesy, G. Thomas, NCI

  47. Principal Component Analysis of Structured Population: Fourth and Fifth Components Courtesy, G. Thomas, NCI

  48. Summary Points: Genotyping Quality Control • Sample checks for identity, gender error, cryptic relatedness • Sample handling differences can introduce artifacts but probably can be adjusted for • Association analysis is often quickest way to find genotyping errors • Low MAF SNPs are most difficult to call • Inspection of genotyping cluster plots is crucial!

  49. Quantile-Quantile Plot for Test Statistics: 390 Breast Cancer Cases, 364 Controls; 205,586 SNPs; λ = 1.03 Easton D et al, Nature 2007; 447:1087-1093.
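
The λ reported on this slide is the genomic inflation factor. One common definition, assumed here (the slide does not state how it was computed), is the median observed chi-square statistic (1 df) divided by the median expected under the null. The sketch below computes λ from p-values and also returns the expected and observed −log10(p) pairs that a Q-Q plot would display. The p-values are synthetic.

```python
import math
from statistics import NormalDist, median


def p_to_chi2_1df(p):
    """Chi-square (1 df) statistic with upper-tail probability p, for 0 < p <= 1."""
    z = NormalDist().inv_cdf(p / 2.0)  # two-sided normal quantile
    return z * z


def genomic_inflation_factor(p_values):
    """Median observed chi-square divided by the null median (about 0.455)."""
    null_median = p_to_chi2_1df(0.5)
    return median(p_to_chi2_1df(p) for p in p_values) / null_median


def qq_points(p_values):
    """(expected, observed) pairs of -log10(p), sorted from least to most significant."""
    n = len(p_values)
    observed = sorted(-math.log10(p) for p in p_values)
    expected = sorted(-math.log10((i - 0.5) / n) for i in range(1, n + 1))
    return list(zip(expected, observed))


# Synthetic p-values: an evenly spaced grid mimics the null; halving them mimics inflation.
null_p = [(i + 0.5) / 1000 for i in range(1000)]
inflated_p = [p * 0.5 for p in null_p]

print("lambda under the null:", genomic_inflation_factor(null_p))
print("lambda when inflated: ", genomic_inflation_factor(inflated_p))
```

A λ close to 1, such as the 1.03 on the slide, suggests little overall inflation of test statistics from population structure or genotyping artifacts.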

  50. Observed and Expected Associations after Stage 2 of Breast Cancer GWA Easton D et al, Nature 2007; 447:1087-93.
