Molecular & Genetic Epi 217
Association Studies: Indirect
John Witte
Homework, Question 4: Haplotypes
ID     MTHFR_C677T   MTHFR_A1298C   Haplotypes?
959    CC            AA             C-A / C-A
1044   CC            AC             C-A / C-C
147    CT            AA             C-A / T-A
123    CT            AC             C-A / T-C or C-C / T-A
• Genotypes 677TT and 1298CC never observed together: suggests most probable haplotype, and potential selection or chance.
• Rare variants: not necessarily lethal, especially those that are associated with late onset diseases.
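The haplotype column in the homework table can be reproduced by enumerating every way of splitting the unphased genotypes across two chromosomes. The following is a minimal sketch; the function name `compatible_haplotype_pairs` and the two-character genotype encoding are assumptions made for illustration, not part of the course material.

```python
from itertools import product

def compatible_haplotype_pairs(genotypes):
    """Enumerate unordered haplotype pairs consistent with unphased genotypes.

    genotypes: list of two-character strings, one per SNP, e.g. ["CT", "AC"].
    Returns a sorted list of (hap1, hap2) tuples.
    """
    pairs = set()
    # For each SNP, choose which of its two alleles sits on the first chromosome.
    for choice in product((0, 1), repeat=len(genotypes)):
        hap1 = "-".join(g[c] for g, c in zip(genotypes, choice))
        hap2 = "-".join(g[1 - c] for g, c in zip(genotypes, choice))
        # Sort within the pair so that (a, b) and (b, a) count once.
        pairs.add(tuple(sorted((hap1, hap2))))
    return sorted(pairs)

# Only the double heterozygote is ambiguous.
print(compatible_haplotype_pairs(["CC", "AC"]))  # [('C-A', 'C-C')]
print(compatible_haplotype_pairs(["CT", "AC"]))  # [('C-A', 'T-C'), ('C-C', 'T-A')]
```

Individuals heterozygous at no more than one SNP have a single compatible pair; phase is ambiguous only when two or more SNPs are heterozygous.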
3 SNPs in the TAS2R38 Gene
Possible haplotypes: PAV, PAI, PVV, PVI, AAV, AAI, AVV, AVI
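Three biallelic sites give 2 × 2 × 2 = 8 possible haplotypes. A short sketch of that enumeration, assuming the per-site alleles are P/A, A/V and V/I as the listing on the slide suggests:

```python
from itertools import product

# Alleles (amino acids) at the three TAS2R38 sites, in slide order.
sites = [("P", "A"), ("A", "V"), ("V", "I")]

# Every combination of one allele per site is a possible haplotype.
haplotypes = ["".join(h) for h in product(*sites)]
print(len(haplotypes), haplotypes)
```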
TAS2R38: 3 SNPs form Haplotypes
• PAV: Taster
• AVI: Non-taster
Too many MTHFR SNPs. Solution: Tag SNP Selection
[Figure: six SNPs (1: A/T, 2: G/A, 3: G/C, 4: T/C, 5: G/C, 6: A/C) shown across haplotypes, with three groups of SNPs in high r²]
• SNPs are correlated (aka Linkage Disequilibrium)
• Pairwise tagging: SNP 1, SNP 3, SNP 6 (3 tags in total)
• Test for association: SNP 1, SNP 3, SNP 6
Carlson et al. (2004) AJHG 74:106
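Pairwise tagging is often implemented greedily: repeatedly pick the SNP that captures the most still-untagged SNPs above an r² threshold. The sketch below illustrates that idea under stated assumptions. The r² matrix is invented for illustration (it is not read off the slide's figure), the 0.8 threshold and the tie-breaking rule are assumptions, and the function name `greedy_pairwise_tags` is hypothetical.

```python
def greedy_pairwise_tags(snps, r2, threshold=0.8):
    """Greedy pairwise tag selection.

    snps: list of SNP labels.
    r2: symmetric matrix (list of lists) of pairwise r2, indexed like snps.
    Returns the tag SNPs in the order they were chosen.
    """
    # captures[s] = SNPs that s would tag (itself plus any SNP with r2 > threshold).
    captures = {
        a: {b for j, b in enumerate(snps) if i == j or r2[i][j] > threshold}
        for i, a in enumerate(snps)
    }
    untagged = set(snps)
    tags = []
    while untagged:
        # Pick the SNP capturing the most untagged SNPs; ties go to the first listed.
        best = max(snps, key=lambda s: len(captures[s] & untagged))
        tags.append(best)
        untagged -= captures[best]
    return tags

# Hypothetical r2 values: SNPs 1, 2, 4 are correlated; 3 and 5 are correlated; 6 is alone.
snps = [1, 2, 3, 4, 5, 6]
r2 = [
    [1.0, 0.9, 0.1, 0.9, 0.1, 0.1],
    [0.9, 1.0, 0.1, 0.9, 0.1, 0.1],
    [0.1, 0.1, 1.0, 0.1, 0.95, 0.1],
    [0.9, 0.9, 0.1, 1.0, 0.1, 0.1],
    [0.1, 0.1, 0.95, 0.1, 1.0, 0.1],
    [0.1, 0.1, 0.1, 0.1, 0.1, 1.0],
]
print(greedy_pairwise_tags(snps, r2))  # [1, 3, 6]
```

With this invented matrix the procedure selects three tags, so only those three SNPs would need to be genotyped and tested.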
Common Measures of Coverage
• Threshold Measures
  • e.g., 73% of SNPs in the complete set are in LD with at least one SNP in the genotyping set at r² > 0.8
• Average Measures
  • e.g., Average maximum r² = 0.84
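Both measures can be computed from one list: for each SNP in the complete set, its maximum r² with any SNP in the genotyping set. A minimal sketch, with a hypothetical function name and made-up input values:

```python
def coverage_measures(max_r2, threshold=0.8):
    """Return (threshold coverage, average maximum r2).

    max_r2: for each SNP in the complete set, its highest r2 with any
            genotyped SNP (a genotyped SNP has max r2 = 1.0 with itself).
    """
    covered = sum(1 for value in max_r2 if value > threshold)
    return covered / len(max_r2), sum(max_r2) / len(max_r2)

# Hypothetical values for four SNPs.
threshold_cov, average_max = coverage_measures([1.0, 1.0, 0.9, 0.6])
print(f"threshold coverage = {threshold_cov:.2f}, average max r2 = {average_max:.3f}")
```

The two measures can disagree: a panel can have a high average maximum r² while still leaving a sizeable fraction of SNPs below the threshold.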
Coverage and Sample Size
• Sample size required for Direct Association, n
• Sample size for Indirect Association, n* = n / r²
• For r² = 0.8, increase is 25%
• For r² = 0.5, increase is 100%
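The two percentages follow directly from n* = n / r². A short sketch of the arithmetic; the function name and the example n = 1000 are illustrative assumptions:

```python
def indirect_sample_size(n_direct, r2):
    """Sample size needed when testing a tag SNP in LD (r2) with the causal SNP."""
    if not 0 < r2 <= 1:
        raise ValueError("r2 must be in (0, 1]")
    return n_direct / r2

n = 1000  # hypothetical sample size for direct association
for r2 in (0.8, 0.5):
    n_star = indirect_sample_size(n, r2)
    increase = 100 * (n_star / n - 1)
    print(f"r2 = {r2}: n* = {n_star:.0f} (increase of {increase:.0f}%)")
```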
Tag SNPs Database Resources
• http://www.hapmap.org
• http://gvs.gs.washington.edu/GVS/index.jsp
HapMap
• Re-sequencing to discover millions of additional SNPs; deposited to dbSNP.
• SNPs from dbSNP were genotyped
• Looked for 1 SNP every 5 kb
• SNP Validation
  • Polymorphic
  • Frequency
• Haplotype and Linkage Disequilibrium Estimation
• LD tagging SNPs
HapMap Phase III Populations
• ASW: African ancestry in Southwest USA
• CEU: Utah residents with Northern and Western European ancestry from the CEPH collection
• CHB: Han Chinese in Beijing, China
• CHD: Chinese in Metropolitan Denver, Colorado
• GIH: Gujarati Indians in Houston, Texas
• JPT: Japanese in Tokyo, Japan
• LWK: Luhya in Webuye, Kenya
• MEX: Mexican ancestry in Los Angeles, California
• MKK: Maasai in Kinyawa, Kenya
• TSI: Toscani in Italia
• YRI: Yoruba in Ibadan, Nigeria
Tag SNPs: HapMap & Haploview http://www.broad.mit.edu/mpg/haploview/
Tag SNPs: HapMap Summary
• Identified 33 common MTHFR SNPs (MAF > 5%) among Caucasians
• Forced in 3 potentially functional/previously associated SNPs
• Identified tags based on pairwise tagging
• 15 tag SNPs could capture all 33 MTHFR SNPs (mean r² = 0.97)
• Note: number of SNPs required varies from gene to gene and from population to population
One- and Two-Stage GWA Designs
[Figure: one-stage design, with SNPs 1,2,3,...,M genotyped in samples 1,2,3,...,N; two-stage design, with the grid split into Stage 1 samples and Stage 2 markers]
[Figure: one-stage design (SNPs by samples) compared with two-stage designs (Stage 1, Stage 2) under joint analysis and under replication-based analysis]
Multistage Designs
• Joint analysis has more power than replication-based analysis
• p-value threshold in Stage 1 must be liberal
• Multistage designs lower cost; they do not gain power
• http://www.sph.umich.edu/csg/abecasis/CaTS/index.html
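The cost saving comes from genotyping fewer marker-by-sample combinations. The sketch below counts genotypes under a common two-stage layout: all markers on a fraction of the samples in Stage 1, then only the promising markers on the remaining samples in Stage 2. The parameter names and the example numbers are assumptions for illustration; they do not come from the slides or from the CaTS tool.

```python
def genotype_counts(n_samples, n_markers, frac_samples_stage1, frac_markers_stage2):
    """Return (one-stage genotypes, two-stage genotypes)."""
    one_stage = n_samples * n_markers
    # Stage 1: every marker on a fraction of the samples.
    stage1 = frac_samples_stage1 * n_samples * n_markers
    # Stage 2: only markers passing the liberal Stage 1 threshold, on the remaining samples.
    stage2 = (1 - frac_samples_stage1) * n_samples * frac_markers_stage2 * n_markers
    return one_stage, stage1 + stage2

# Hypothetical study: 2,000 samples, 500,000 SNPs, 30% of samples in Stage 1,
# 1% of SNPs carried forward to Stage 2.
one, two = genotype_counts(2000, 500_000, 0.3, 0.01)
print(f"two-stage genotyping is {two / one:.1%} of one-stage")
```

The count says nothing about power, which depends on the Stage 1 threshold and on whether the stages are analyzed jointly.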
Complex diseases: Many causes = many causal pathways!
[Figure: causal diagram linking genetic susceptibility, physical activity, diet, obesity, hyperlipidemia, diabetes, hypertension, atherosclerosis, vulnerable plaques, and MI]
Pathways
• Many websites / companies provide ‘dynamic’ graphic models of molecular and biochemical pathways.
• Example: BioCarta: http://www.biocarta.com/
• May be interested in potential joint and/or interaction effects of multiple genes in one pathway.
Interactions
• “The interdependent operation of two or more causes to produce or prevent an effect”
• “Differences in the effects of one or more factors according to the level of the remaining factor(s)”
(Last, 2001)
Why look for interactions?
• Improve detection of genetic (& environmental) risks.
• Understand etiology/biology
• New hypotheses?
• Diagnostics
• Prevention and interventions
Dilution of effects
[Figure: Gene A, overall OR = 1.5, stratified by micronutrient X, environmental exposure Y, other gene Z, and drinker status]
• Within particular subgroups, effect of gene may be quite high or low
Statistical vs. Biological Interactions
• Not identical.
• One hypothesizes biological interaction
• But ‘tests’ for statistical interaction
• Does statistical evidence support our biological hypothesis?
Multiplicative vs. Additive Interactions
• Multiplicative “effect” (ORs, RRs): (2.8/2.0) / (1.4/1.0) = 1.0
• Multiplicative interaction (ORs, RRs): (7.8/2.0) / (1.4/1.0) = 2.8
• Departure from 1 is a multiplicative interaction
• Additive “effect”: RER = (OR(E,G)-1) / ((OR(E,g)-1) + (OR(e,G)-1)) = (2.4-1) / ((2.0-1) + (1.4-1)) = 1.0
• RER = relative excess risk
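The slide's arithmetic can be checked with two small functions. This is a sketch using the slide's example odds ratios (OR(E,g) = 2.0, OR(e,G) = 1.4, reference OR(e,g) = 1.0); the function names are hypothetical, and the RER formula is the one written on the slide.

```python
def multiplicative_interaction(or_EG, or_Eg, or_eG, or_eg=1.0):
    """Ratio of the effect of E among G carriers to its effect among non-carriers.

    A value of 1 means no departure from a multiplicative model.
    """
    return (or_EG / or_Eg) / (or_eG / or_eg)

def relative_excess_risk(or_EG, or_Eg, or_eG):
    """RER as defined on the slide: joint excess risk over the sum of the separate excesses.

    A value of 1 means no departure from an additive model.
    """
    return (or_EG - 1) / ((or_Eg - 1) + (or_eG - 1))

print(round(multiplicative_interaction(2.8, 2.0, 1.4), 1))  # 1.0: purely multiplicative
print(round(multiplicative_interaction(7.8, 2.0, 1.4), 1))  # 2.8: multiplicative interaction
print(round(relative_excess_risk(2.4, 2.0, 1.4), 1))        # 1.0: purely additive
```

The same pair of single-factor odds ratios implies a joint OR of 2.8 under a multiplicative model and 2.4 under an additive one, so whether an "interaction" is present depends on the scale chosen.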
Two possible causal pathways: additive and multiplicative interaction for colorectal cancer
• Additive interaction: G1 and E5: independent risk factors
• Multiplicative interaction: G2 and E2: work through same pathway
• If factors are not known to act independently, use multiplicative.
Brennan, P. Carcinogenesis 2002 23:381-387
Analysis of Multiple Genes
• Joint / Additive
• Multiplicative
• Increasing complexity
More Complex Modeling
• Multifactor-dimensionality reduction (Moore & Williams, Ann Med 2002)
• Logic regression (Kooperberg & Ruczinski, Genetic Epi 2005)
• Multi-loci analysis (Marchini, Donnelly, Cardon, Nat Genet 2005)
• Bayesian epistasis association mapping (Zhang & Liu, Nat Genet 2007)