Genetics for Epidemiologists Lecture 4: Genetic Association Studies

National Human Genome Research Institute. Genetics for Epidemiologists, Lecture 4: Genetic Association Studies. U.S. Department of Health and Human Services, National Institutes of Health, National Human Genome Research Institute.

Presentation Transcript


  1. Genetics for Epidemiologists, Lecture 4: Genetic Association Studies. Teri A. Manolio, M.D., Ph.D., Director, Office of Population Genomics, and Senior Advisor to the Director, NHGRI, for Population Genomics. U.S. Department of Health and Human Services, National Institutes of Health, National Human Genome Research Institute

  2. Topics to be Covered • Case-control and cohort studies in genomic research • Candidate gene studies • Genome-wide association studies • Randomized/experimental designs

  3. Collins FS, Nature 2004; 429:475-77.

  4. Desirable Characteristics of Large US Cohort Study • Large sample size • Full representation of US minority groups • Broad range of ages • Broad range of genetic backgrounds and environmental exposures • Family-based recruitment for at least part of cohort to control for population stratification • Broad array of clinical and laboratory data, regular follow up for events, additional exposure assessment After Collins FS, Nature 2004; 429:475-77.

  5. Desirable Characteristics of Large US Cohort Study (continued) • Technologically advanced dietary, lifestyle, and environmental exposure data • Collection and storage of biological specimens • Sophisticated data management system • Broad access to materials and data • Goals should not be “hypothesis-limited” • Comprehensive community engagement from the outset • State of the art (?dynamic) consent to allow multiple uses of data and regular feedback After Collins FS, Nature 2004; 429:475-77.

  6. Larson, G. The Complete Far Side. 2003.

  7. Manolio TA et al. Nat Rev Genet 2006; 7:812-820.

  8. Willett WC et al. Nature 2007; 445:257-258.

  9. Collins FS et al. Nature 2007; 445:259.

  10. Pros and Cons of Case-Control Studies Advantages • May be the only way to study rare diseases or those of long latency • Existing records can occasionally be used if risk factor data collected independent of disease status • Can study multiple etiologic factors simultaneously • May be less time-consuming and expensive • If assumptions met, inferences are reliable

  11. Pros and Cons of Case-Control Studies Disadvantages • Relies on recall or records for information on past exposures; validation can be difficult or impossible • Selection of appropriate comparison group may be difficult • Multiple biases may give spurious evidence of association between risk factor and disease • Usually cannot study rare exposures • Temporal relationship between exposure and disease can be difficult to determine

  12. “But,” they say, “This Is Genetics!” (you dumb epidemiologist) “This Is Different!” • Genes are measured the same way in cases and controls • Information on key exposure is easy to validate • No recall or reporting involved • Temporal relationship between genes and disease is a piece of cake “BUT,” I SAY, • Bias-free ascertainment of cases and controls is still a major concern; cases in most clinical series are unlikely to be representative • Assessment of risk modifiers or gene-environment interactions is likely to be incomplete or flawed

  13. Appreciation of Weaknesses of Case-Control Studies http://www.mainlesson.com/display.php3?author=treadwell&book=primer&story=chickenlittle Larson, G. The Complete Far Side. 2003.

  14. Candidate Genes

  15. Genetic Studies in Unrelated Individuals (pre-2005): Candidate Gene Studies • Goal: characterize candidate genes and variants related to disease • Not typically intended to “find genes,” generally begun after disease-related variants identified • Assess generalizability of family-based observations (genetic heterogeneity) • Assess importance of allelic variation at population level (PAR, penetrance) • Identify modification of genetic association by environmental factors (GxE interaction)
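
The slide above lists population attributable risk (PAR) as one measure of a variant's importance at the population level. As a minimal sketch, assuming Levin's formula is the intended definition (the slide does not say which one is meant), PAR can be computed from the prevalence of the risk genotype and its relative risk. The function name and the input values are hypothetical and chosen only for illustration.

```python
def population_attributable_risk(exposure_prevalence: float, relative_risk: float) -> float:
    """Levin's formula: PAR = p(RR - 1) / (1 + p(RR - 1)).

    exposure_prevalence: proportion of the population carrying the risk genotype.
    relative_risk: risk in carriers divided by risk in non-carriers.
    """
    if not 0.0 <= exposure_prevalence <= 1.0:
        raise ValueError("exposure_prevalence must be between 0 and 1")
    if relative_risk <= 0.0:
        raise ValueError("relative_risk must be positive")
    excess = exposure_prevalence * (relative_risk - 1.0)
    return excess / (1.0 + excess)


# Hypothetical inputs: 30% of the population carries the risk genotype, RR = 1.5.
par = population_attributable_risk(0.30, 1.5)
print(f"PAR = {par:.3f}")
```

A common genotype with a modest relative risk can therefore account for a larger share of disease in the population than a rare genotype with a large relative risk.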

  16. Population Studies of Genetic Variants: Angiotensin I-Converting Enzyme (ACE) Larsen: Williams Textbook of Endocrinology, 10th ed., 2003

  17. ACE Gene Identification • cDNA sequence determined for human testicular ACE, identical from residue 27 to C terminus to C-terminal domain of endothelial ACE (Ehlers et al, PNAS 1989) • Assigned to chromosome 17q23 by in situ hybridization (Mattei et al, Cytogenet Cell Genet 1989) • Linked to elevated blood pressure in rat model of hypertension (Jacob et al, Cell 1991) • Mapped to human chromosome 17q22-q24 (Jeunemaitre et al, Nat Genet 1992)

  18. ACE Gene Polymorphisms • Insertion/deletion polymorphism identified through restriction fragment length polymorphism (RFLP) analysis • Two alleles result from 250-bp insertion in intron 16; allele frequencies = 0.41 for I allele and 0.59 for D allele • Accounted for 47% of variance in serum ACE in 80 subjects Rigat et al, J Clin Invest 1990; 86:1343-46.
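
To make the allele frequencies above concrete, the sketch below converts them into expected genotype frequencies. It assumes Hardy-Weinberg proportions (random mating, no selection), which the slide does not state, so the output is an illustration and not a result reported by Rigat et al. The function name is hypothetical.

```python
def hardy_weinberg_genotype_frequencies(freq_i: float, freq_d: float) -> dict:
    """Expected II / ID / DD genotype frequencies under Hardy-Weinberg proportions."""
    if abs(freq_i + freq_d - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    return {
        "II": freq_i ** 2,          # two copies of the insertion allele
        "ID": 2 * freq_i * freq_d,  # heterozygotes
        "DD": freq_d ** 2,          # two copies of the deletion allele
    }


# Allele frequencies quoted on the slide: I = 0.41, D = 0.59.
expected = hardy_weinberg_genotype_frequencies(0.41, 0.59)
for genotype, freq in expected.items():
    print(f"{genotype}: {freq:.4f}")
```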

  19. Nature 1992; 359:641-44.

  20. Frequency of ACE Genotypes in 1,304 MI Cases and Controls. OR = 1.34, p = 0.007. Counts as labeled in the figure: 104, 309, 197, 200, 390, 104. Cambien et al, Nature 1992; 359:641-44.
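
The odds ratio quoted on the slide above comes from a case-control comparison of genotype counts. A minimal sketch of that calculation follows, using a 2x2 table and Woolf's log-based confidence interval. The counts in the example are hypothetical: the transcript does not say which figure label belongs to which genotype or group, so this does not attempt to reproduce the published OR of 1.34.

```python
import math


def odds_ratio_with_ci(a: int, b: int, c: int, d: int, z: float = 1.96):
    """Odds ratio and Woolf confidence interval for a 2x2 table.

    a: cases with the risk genotype      b: cases without it
    c: controls with the risk genotype   d: controls without it
    z: normal quantile (1.96 gives an approximate 95% interval)
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive")
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, not taken from Cambien et al.
or_value, ci_low, ci_high = odds_ratio_with_ci(a=200, b=400, c=150, d=450)
print(f"OR = {or_value:.2f} [{ci_low:.2f}, {ci_high:.2f}]")
```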

  21. Frequency of ACE Genotypes in 1,304 MI Cases and Controls. Panels: Low Risk, High Risk. OR = 3.2 [1.7, 5.9]; OR = 1.1 [0.9, 1.5]. Counts as labeled in the figure: 41, 154, 372, 38, 46, 159, 390, 143. Cambien et al, Nature 1992; 359:641-44.

  22. Age-Adjusted Odds on Hypertension by ACE ID/DD Genotype and Sex after O’Donnell C et al, Circulation 1998; 97:1766-72.

  23. Number of New, Significant Gene-Disease Associations by Year, 1984 - 2000 Hirschhorn J et al, Genet Med 2002; 4:45-61.

  24. Of 600 Gene-Disease Associations, Only 6 Significant in > 75% of Identified Studies Hirschhorn J et al, Genet Med 2002; 4:45-61.

  25. Reports For and Against Associations of Variants with Carotid Atherosclerosis Manolio et al, ATVB 2004; 24:1567-1577.

  26. Summary Points: Candidate Gene Studies • Initial enthusiasm markedly damped by failure to replicate • Can probably find study or story that will fit almost any candidate to any disease/trait • Understanding of genome structure and function, and of pathophysiologic mechanisms, just too preliminary to project more than a handful of plausible candidates at present

  27. Larson, G. The Complete Far Side. 2003.

  28. Figure panels, in chronological order: 2005; 2006; 2007 first quarter; 2007 second quarter; 2007 third quarter; 2007 fourth quarter; 2008 first quarter. Manolio et al., J Clin Invest 2008; 118:1590-605.

  29. 2007: The Year of GWA Studies Pennisi E, Science 2007; 318:1842-43.

  30. Diseases and Traits with Published GWA Studies (n = 53, 5/9/08) • Macular Degeneration • Exfoliation Glaucoma • Lung Cancer • Prostate Cancer • Breast Cancer • Colorectal Cancer • Neuroblastoma • Crohn’s Disease • Celiac Disease • Gallstones • Irritable Bowel Syndrome • QT Prolongation • Coronary Disease • Stroke • Hypertension • Atrial Fibrillation/Flutter • Coronary Spasm • Lipids and Lipoproteins • Parkinson Disease • Amyotrophic Lat. Sclerosis • Multiple Sclerosis • Prog. Supranuclear Palsy • MS Interferon-β Response • Alzheimer’s Disease • Cognitive Ability • Memory • Restless Legs Syndrome • Nicotine Dependence • Methamphetamine Depend. • Neuroticism • Schizophrenia • Bipolar Disorder • Family Chaos • Rheumatoid Arthritis • Systemic Lupus Erythematosus • Psoriasis • HIV Viral Setpoint • Childhood Asthma • Type 1 Diabetes • Type 2 Diabetes • Diabetic Nephropathy • End-Stage Renal Disease • Obesity, BMI, Waist, IR • Height • Osteoporosis • F-Cell Distribution • Fetal Hgb Levels • C-Reactive Protein • 18 groups of Framingham Traits • Pigmentation • Uric Acid Levels • Recombination Rate

  31. NHGRI Catalog of GWA Studies: http://www.genome.gov/gwastudies/

  32. NHGRI Catalog of GWA Studies: http://www.genome.gov/gwastudies/ • First author/Date/Journal/Study • Disease/Trait • Initial Sample Size • Replication Sample Size • Region • Gene • Strongest SNP – Risk Allele • Risk Allele Frequency in Controls • P-value • OR per copy [95% CI] • Platform and SNPs passing QC

  33. NHGRI Catalog of GWA Studies: http://www.genome.gov/gwastudies/ • First author/Date/Journal/Study • Disease/Trait • Initial Sample Size • Replication Sample Size • Region • Gene • Strongest SNP – Risk Allele • Risk Allele Frequency in Controls • P-value • OR per copy [95% CI] • Platform and SNPs passing QC

  34. NHGRI Catalog of GWA Studies: http://www.genome.gov/gwastudies/

  35. What is a Genome-Wide Association Study? • Method for interrogating all 10 million variable points across human genome • Variation inherited in groups, or blocks, so not all 10 million points have to be tested • Blocks are shorter (so need to test more points) the less closely people are related • Technology now allows studies in unrelated persons, assuming 5,000 – 10,000 base pair lengths in common (300,000 – 1,000,000 markers)

  36. Association of Alleles and Genotypes of rs1333049 (‘3049) with Myocardial Infarction Samani N et al, N Engl J Med 2007; 357:443-453.

  37. Nicotine Dependence among Smokers Bierut LJ et al, Hum Molec Genet 2007; 16:24-35.

  38. P Values of GWA Scan for Age-Related Macular Degeneration Klein et al, Science 2005; 308:385-389.

  39. Genome-Wide Scan for QTc Interval Arking D et al, Nat Genet 2006; 38:644-651.

  40. Genome-Wide Scan for Type 2 Diabetes in a Scandinavian Cohort http://www.broad.mit.edu/diabetes/scandinavs/type2.html

  41. Wellcome Trust Genome-Wide Association Study of Seven Common Diseases WTCCC, Nature 2007; 447:661-678.

  42. “There have been few, if any, similar bursts of discovery in the history of medical research…” Hunter DJ and Kraft P, N Engl J Med 2007; 357:436-439.

  43. Lessons Learned from Initial GWA Studies

  44. Unique Aspects of GWA Studies • Permit examination of inherited genetic variability at unprecedented level of resolution • Permit "agnostic" genome-wide evaluation • Once genome measured, can be related to any trait • Most robust associations in GWA studies have not been with genes previously suspected of association with the disease • Some associations in regions not even known to harbor genes “The chief strength of the new approach also contains its chief problem: with more than 500,000 comparisons per study, the potential for false positive results is unprecedented.” Hunter DJ and Kraft P, N Engl J Med 2007; 357:436-439.
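
The scale of the false-positive problem in the quotation above can be shown with a short calculation. This sketch assumes every tested SNP is truly unassociated and that the tests are independent; neither holds exactly for real GWA data, where nearby SNPs are correlated. The threshold of 0.05 is a conventional value chosen here for illustration, and the function names are hypothetical.

```python
import math


def expected_false_positives(n_tests: int, alpha: float) -> float:
    """Expected number of tests with p < alpha when every null hypothesis is true."""
    return n_tests * alpha


def prob_at_least_one_false_positive(n_tests: int, alpha: float) -> float:
    """1 - (1 - alpha)^n for independent tests, computed stably for large n."""
    return -math.expm1(n_tests * math.log1p(-alpha))


n_comparisons = 500_000  # "more than 500,000 comparisons per study"
print(expected_false_positives(n_comparisons, 0.05))
print(prob_at_least_one_false_positive(n_comparisons, 0.05))
```

At an uncorrected threshold of 0.05, roughly 25,000 SNPs would be expected to pass by chance alone under these assumptions.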

  45. Larson, G. The Complete Far Side. 2003.

  46. Ways of Dealing with Multiple Testing • Bonferroni correction: most common, typically p < 10⁻⁷ or 10⁻⁸ • False discovery rate: proportion of significant associations that are actually false positives • False positive report probability: probability that the null hypothesis is true, given a statistically significant finding • Replication, replication, replication
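
The three statistical approaches on the slide above can each be sketched in a few lines. The slide names the methods but not specific procedures, so the choices here are assumptions: the Benjamini-Hochberg step-up procedure stands in for false discovery rate control, and the false positive report probability uses the form alpha(1 - prior) / (alpha(1 - prior) + power x prior). All p-values, priors, and power values are hypothetical.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold that keeps the family-wise error rate at alpha."""
    return alpha / n_tests


def benjamini_hochberg(p_values, q: float = 0.05):
    """Indices of tests declared significant with the false discovery rate held at q."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        # Keep the largest rank whose p-value falls under its step-up threshold.
        if p_values[idx] <= q * rank / m:
            cutoff_rank = rank
    return sorted(order[:cutoff_rank])


def false_positive_report_probability(alpha: float, power: float, prior: float) -> float:
    """P(null is true | significant result) for a given prior probability of true association."""
    false_pos = alpha * (1.0 - prior)
    true_pos = power * prior
    return false_pos / (false_pos + true_pos)


print(bonferroni_threshold(0.05, 500_000))    # about 1e-7 for 500,000 SNPs
print(bonferroni_threshold(0.05, 1_000_000))  # about 5e-8 for 1,000,000 SNPs

p_vals = [0.041, 0.001, 0.212, 0.008, 0.039, 0.06, 0.074, 0.205, 0.042, 0.216]
print(benjamini_hochberg(p_vals, q=0.05))

# Hypothetical prior of 1 in 1,000 that a given SNP is truly associated, 80% power.
print(false_positive_report_probability(alpha=0.05, power=0.8, prior=0.001))
print(false_positive_report_probability(alpha=1e-7, power=0.8, prior=0.001))
```

With a low prior, a result significant at 0.05 is still very likely to be a false positive, while one significant at a genome-wide threshold is not, which is consistent with the stringent p-values the slide cites.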

  47. Chanock S, Manolio T, et al, Nature 2007; 447:655-660.

  48. Replication Strategy for Prostate Cancer Study in CGEMS • Initial Study: 1,150 cases / 1,150 controls; >500,000 Tag SNPs • Replication Study #1: 3,000 cases / 3,000 controls; ~24,000 SNPs • Replication Study #2: 2,400 cases / 2,400 controls; ~1,500 SNPs • 200+ New ht-SNPs • Replication Study #3: 2,500 cases / 2,500 controls; 25-50 Loci Hoover R, Epidemiology 2007; 18:13-17.
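
One way to read the staged design above is in terms of genotyping volume. The sketch below counts genotype calls (subjects times SNPs) for the first three stages and compares the total with typing the full initial SNP set in all of those subjects. The slide gives approximate SNP counts (">500,000", "~24,000", "~1,500"), which are treated as exact here, and Replication Study #3 is left out because its SNP count is not clearly stated. The slide itself does not give the rationale for staging, so this comparison is an illustration only.

```python
# (stage name, cases, controls, SNPs genotyped), using the approximate figures from the slide.
stages = [
    ("Initial Study", 1150, 1150, 500_000),
    ("Replication Study #1", 3000, 3000, 24_000),
    ("Replication Study #2", 2400, 2400, 1_500),
]

# Staged design: each stage types only the SNPs carried forward to it.
staged_calls = sum((cases + controls) * snps for _, cases, controls, snps in stages)

# Comparison: type the full initial SNP set in every subject from all three stages.
total_subjects = sum(cases + controls for _, cases, controls, _ in stages)
single_stage_calls = total_subjects * stages[0][3]

print(f"staged genotype calls:       {staged_calls:,}")
print(f"single-stage genotype calls: {single_stage_calls:,}")
print(f"ratio: {staged_calls / single_stage_calls:.2f}")
```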

  49. Replication, Replication, Replication Initial study: Sufficient description to permit replication • Sources of cases and controls • Participation rates and flow chart of selection • Methods for assessing affected status • Standard “Table 1” including rates of missing data • Assessment of population heterogeneity • Genotyping methods and QC metrics Replication study: • Similar population, similar phenotype • Same genetic model, same SNP, same direction • Adequately powered to detect postulated effect Chanock S, Manolio T, et al, Nature 2007; 447:655-660.

  50. Replication Strategy in Easton Breast Cancer Study • ABCFS • BCST • COPS • GENICA • HBCS • HBCP • TBCS • KConFab/AOCS • KBCP • LUMCBCS • MCBCS • MCCS • MEC-W • MEC-J • NHS • PBCS • RBCS • SASBAC • SEARCH2 • SEARCH3 • SBCP • SBCS • CNIOBCS • USRT Easton et al, Nature 2007; 447:1087-93.
