An Environment-Wide Association Study (EWAS) on Type 2 Diabetes Mellitus. Chirag J. Patel et al., PLoS One, May 2010.
First, some context...
• Hypothesis-driven vs. data-driven research
• Tension between these two forms of research crosses all scientific disciplines
• In recent years, there has been an explosion of data-driven research. Why?
(Kell, 2003)
[Figure: Computational speed over time (Kurzweil, 2010)]
[Figure: Genome sequencing cost (NHGRI, 2012)]
[Figure: The Age of Big Data (Lohr, 2012)]
Genome-Wide Association Study (GWAS)
• Typically a case-control study design
• Examines associations of single-nucleotide polymorphisms (SNPs) with disease state
• NCBI’s SNP database lists 187,852,828 SNPs identified in the human genome (June 2012)
• A GWAS typically examines hundreds of thousands of SNPs using DNA microarrays
(Bush et al., PLOS Computational Biology, 2012)
[Figure: GWAS of systemic sclerosis (Radstake et al., 2010)]
[Figure: T2D prevalence in the US. Estimated prevalence (%) by age and year (CDC, 2011)]
[Figure: T2D incidence in the US. Estimated incidence (per 1,000 people) by year, 1980 to 2010, for ages 18-44, 45-64, and 65-79 (CDC, 2012)]
Introduction
• Type 2 diabetes (T2D) has a complex etiology involving genetics, lifestyle, and environment
• GWAS have identified multiple SNPs associated with T2D, but these do not explain T2D trends
• Standard environmental epidemiology approaches are limited by their narrow focus
• Patel et al. propose the first “Environment-Wide Association Study” (EWAS) to examine T2D using a large, nationally representative dataset
Methods
• Combined four NHANES datasets (1999-2006)
• Rich cross-sectional data on demographics, chemical toxicants, pollutants, allergens, nutrients, fasting blood sugar, and self-reported medical history
• With NHANES survey weighting, results can be generalized to the US population
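To illustrate why survey weighting matters for generalizing to the US population, here is a minimal sketch of a weighted prevalence point estimate. The flags and weights are hypothetical, and a real NHANES analysis would also use the survey's strata and sampling units for variance estimation, which this sketch omits.

```python
def weighted_prevalence(case_flags, weights):
    """Share of the total survey weight carried by cases (point estimate only)."""
    total_weight = sum(weights)
    case_weight = sum(w for flag, w in zip(case_flags, weights) if flag)
    return case_weight / total_weight

# Hypothetical participants: case status and the number of people each represents.
case_flags = [True, False, False, True]
weights = [1000.0, 3000.0, 4000.0, 2000.0]

unweighted = sum(case_flags) / len(case_flags)
weighted = weighted_prevalence(case_flags, weights)
print(unweighted, weighted)
```

In this toy sample half of the participants are cases, but they carry less than half of the total weight, so the weighted estimate is lower than the unweighted one.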
Methods: Environment Scan
• Omitted environmental factors with low variability (>90% of observations below the detection limit). Also omitted factors affecting only specific subsets of the population
• Across all four NHANES cohorts: 543 environmental factors
• 266 unique factors in total, with 157 factors found in more than one cohort
• Log-transformed factors when necessary. Used z-score transformations to allow comparisons between factors
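The transformation step can be sketched as follows. This is an illustrative, unweighted version with hypothetical values; the function name `log_zscore` is an assumption, and the slides do not say whether the authors used survey-weighted means and standard deviations.

```python
import math
import statistics

def log_zscore(values, log_transform=True):
    """Optionally log-transform, then standardize to mean 0 and sample SD 1."""
    xs = [math.log(v) for v in values] if log_transform else list(values)
    mean = statistics.fmean(xs)
    sd = statistics.stdev(xs)  # sample standard deviation (n - 1 denominator)
    return [(x - mean) / sd for x in xs]

# Hypothetical right-skewed concentrations (arbitrary units).
concentrations = [1.0, 2.0, 4.0, 8.0, 16.0]
z = log_zscore(concentrations)
print(z)
```

After this step, a one-unit change means "one standard deviation" for every factor, so effect sizes for factors measured in different units can be compared directly.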
Methods: Case definition
• Based on ADA guidelines: fasting blood glucose level ≥ 126 mg/dL
• Did not distinguish T1D from T2D
• Did not consider medication use or medical history
(ADA, 2009)
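The case definition reduces to a single threshold on fasting glucose. A minimal sketch, in which the function name and the handling of missing values are assumptions rather than details from the study:

```python
FASTING_GLUCOSE_CUTOFF_MG_DL = 126.0  # threshold stated on the slide (ADA guidelines)

def is_case(fasting_glucose_mg_dl):
    """Return True/False for case status, or None if glucose is missing."""
    if fasting_glucose_mg_dl is None:
        return None  # assumption: missing measurements are left unclassified
    return fasting_glucose_mg_dl >= FASTING_GLUCOSE_CUTOFF_MG_DL

# Hypothetical fasting glucose values in mg/dL.
readings = [98.0, 126.0, 140.5, None]
print([is_case(r) for r in readings])
```

Because the rule uses only the glucose measurement, a person with diagnosed and well-controlled diabetes would be classified as a non-case, which is consistent with the slide's note that medication use and medical history were not considered.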
Methods: Primary Analysis
• Logistic regression (accounting for NHANES weighting) to estimate associations of 266 unique environmental factors with case status
• Estimated prevalence odds ratios
• Ran regressions for each individual NHANES cohort and for the data of all cohorts combined
• Covariates: age, sex, BMI, ethnicity, and income/poverty ratio
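A logistic regression coefficient converts to an odds ratio by exponentiation. The sketch below assumes a coefficient and standard error have already been obtained from a fitted model; the numeric values are hypothetical, and in a survey-weighted analysis the standard error would need to come from a design-based variance estimate.

```python
import math

def odds_ratio_with_ci(beta, se, z_crit=1.96):
    """Convert a log-odds coefficient and its standard error to an OR and 95% CI."""
    odds_ratio = math.exp(beta)
    lower = math.exp(beta - z_crit * se)
    upper = math.exp(beta + z_crit * se)
    return odds_ratio, lower, upper

# Hypothetical coefficient for a z-scored exposure, adjusted for covariates.
beta, se = 0.5, 0.1
odds_ratio, lower, upper = odds_ratio_with_ci(beta, se)
print(f"OR per 1 SD = {odds_ratio:.2f} (95% CI {lower:.2f}-{upper:.2f})")
```

With a z-scored exposure, the odds ratio is interpreted per one standard deviation increase in the (possibly log-transformed) factor.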
Methods: False Discovery Rate (FDR)
• Accounted for multiple hypothesis testing
• FDR = proportion of "discoveries" (significant results) that are actually false positives
• Less stringent than Bonferroni correction
Methods: False Discovery Rate (FDR)
Alpha level: α = 5 false discoveries / 100 total tests = 0.05
FDR: FDR = 5 false discoveries / 100 significant results = 0.05
(Shaffer, 1995)
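To make the contrast with Bonferroni concrete, here is a sketch comparing it with the Benjamini-Hochberg step-up procedure, a standard way to control the FDR. The slides do not state how the authors estimated the FDR, so Benjamini-Hochberg is used here only as an illustration, and the p-values are hypothetical.

```python
def bonferroni_reject(pvalues, alpha=0.05):
    """Reject a test only if its p-value is at most alpha divided by the number of tests."""
    m = len(pvalues)
    return [p <= alpha / m for p in pvalues]

def benjamini_hochberg_reject(pvalues, q=0.05):
    """Reject the k smallest p-values, where k is the largest rank with p_(k) <= (k/m) * q."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    k_max = 0
    for rank, i in enumerate(order, start=1):
        if pvalues[i] <= rank / m * q:
            k_max = rank
    reject = [False] * m
    for rank, i in enumerate(order, start=1):
        if rank <= k_max:
            reject[i] = True
    return reject

# Hypothetical p-values from ten association tests.
pvalues = [0.001, 0.008, 0.012, 0.018, 0.030, 0.200, 0.500, 0.700, 0.900, 0.950]
n_bonferroni = sum(bonferroni_reject(pvalues))
n_bh = sum(benjamini_hochberg_reject(pvalues))
print(n_bonferroni, n_bh)
```

On these inputs the FDR-controlling procedure flags more tests than Bonferroni, which is the sense in which FDR control is the less stringent criterion.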
Methods: Primary Analysis
First phase: used a two-sided alpha level of 0.02 to select factors associated with T2D in individual NHANES cohorts
Second phase: determined how many of the 37 factors selected in the first phase were associated with T2D in two or more cohorts (two-sided alpha level of 0.02)
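The two-phase selection can be sketched as a simple filter over per-cohort p-values. The factor names and p-values below are hypothetical, and the sketch checks only significance, not whether the direction of association agrees across cohorts.

```python
ALPHA = 0.02  # two-sided alpha level stated on the slide

# Hypothetical p-values: factor -> {NHANES cohort: p-value}.
pvalues = {
    "factor_A": {"1999-2000": 0.010, "2001-2002": 0.300, "2003-2004": 0.015},
    "factor_B": {"1999-2000": 0.005, "2001-2002": 0.400},
    "factor_C": {"2003-2004": 0.250, "2005-2006": 0.600},
}

def significant_cohorts(by_cohort, alpha=ALPHA):
    """Cohorts in which the factor's p-value falls below alpha."""
    return [cohort for cohort, p in by_cohort.items() if p < alpha]

# Phase 1: significant in at least one cohort.
phase_one = {f for f, by_cohort in pvalues.items()
             if len(significant_cohorts(by_cohort)) >= 1}

# Phase 2: of those, significant in two or more cohorts.
phase_two = {f for f in phase_one
             if len(significant_cohorts(pvalues[f])) >= 2}

print(sorted(phase_one), sorted(phase_two))
```

Requiring significance in a second, independent cohort acts as a replication step, which is why the overall false discovery rate after the second phase can be much lower than after the first.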
Methods: secondary/sensitivity analyses
• Reverse causality test: re-ran the analysis only among people who did not report a doctor's diagnosis of T2D
• Lipophilic chemicals: adjusted for total triglycerides and cholesterol
• Recent diet: adjusted for diet and supplement use
Results: first phase
• Identified 37 unique factors (FDR = 10-30%)
  • Dioxins
  • Furans
  • Heavy metals
  • Nutrients/vitamins
  • Organochlorine pesticides
  • Polychlorinated biphenyls
  • Viruses
Results: second phase
• Identified 5 unique factors (overall FDR = 2%)
  • Cis-β-carotene
  • Trans-β-carotene
  • γ-tocopherol
  • Heptachlor epoxide
  • PCB170: 2,2',3,3',4,4',5-heptachlorobiphenyl
[Figure: Results: reverse causality? Prevalence OR (95% CI)]
[Figure: Results: confounding by lipid levels? Prevalence OR (95% CI)]
[Figure: Results: adjusting for diet/supplements? Prevalence OR (95% CI)]
Discussion
• EWAS confirmed previous findings (carotenes and PCB) and provided novel associations (heptachlor epoxide and γ-tocopherol)
• Limitations and strengths?
• Dawning of the age of “enviromics”?
• Next steps?
  • e.g., cumulative exposure?