
Genome-wide Association Studies

Genome-wide Association Studies. John S. Witte. Candidate Gene or GWAS. Association Studies. Hirschhorn & Daly, Nat Rev Genet 2005. Affymetrix Array. Genome-wide Association Studies. Altshuler & Clark, Science 2005. Genome-wide Association Studies (GWAS). # Markers. # Samples.


Presentation Transcript


  1. Genome-wide Association Studies John S. Witte

  2. Candidate Gene or GWAS Association Studies Hirschhorn & Daly, Nat Rev Genet 2005

  3. Affymetrix Array Genome-wide Association Studies Altshuler & Clark, Science 2005

  4. Genome-wide Association Studies (GWAS)

  5. GWAS+ Strategy. Figure axes: # markers, # samples, time. Discovery: multi-stage GWAS+. Clarification: sequencing+. Confirmation / characterization: follow-up genotyping+.

  6. GWAS+ Strategy. Figure axes: # markers, # samples, time. Discovery: multi-stage GWAS+. Clarification: sequencing+. Confirmation / characterization: follow-up genotyping+.

  7. One- and Two-Stage GWA Designs. Figure: a one-stage design and a two-stage design, each drawn as a grid of SNPs (1, 2, 3, …, M) by samples (1, 2, 3, …, N). In the two-stage design the grid is divided into Stage 1 (labelled by samples) and Stage 2 (labelled by markers).

  8. One-Stage Design vs. Two-Stage Design. Figure: SNPs-by-samples grids for the one-stage design and for the two-stage design, the latter shown twice (Stage 1 and Stage 2): once for joint analysis and once for replication-based analysis.

  9. Multistage Designs • Joint analysis has more power than replication-based analysis. • The Stage 1 p-value threshold must be liberal. • Multistage designs lower cost; they do not gain power. • http://www.sph.umich.edu/csg/abecasis/CaTS/index.html

  10. QC Steps • Filter SNPs and Individuals • MAF, Low call rates • Test for HWE among controls & within ethnic groups. Use conservative alpha-level • Check for relatedness. Identity-by-state calculations.
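
The per-SNP filters listed on this slide can be sketched roughly as follows. This is a minimal illustration, not the presenter's pipeline: the function names, the genotype coding (0/1/2 copies of one allele, `None` for a missing call), and the thresholds (`maf_min`, `call_rate_min`, `hwe_alpha`) are all assumptions chosen for the example. The HWE check uses a 1-df chi-square test computed among controls only, with a deliberately small alpha.

```python
import math

def hwe_chi2_pvalue(n_aa, n_ab, n_bb):
    """1-df chi-square test of Hardy-Weinberg equilibrium from genotype counts."""
    n = n_aa + n_ab + n_bb
    if n == 0:
        return float("nan")
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    if p in (0.0, 1.0):
        return 1.0  # monomorphic SNP: no departure from HWE can be measured
    expected = (n * p * p, 2 * n * p * (1 - p), n * (1 - p) ** 2)
    observed = (n_aa, n_ab, n_bb)
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Upper tail of a chi-square with 1 df equals erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(stat / 2))

def snp_passes_qc(genotypes, is_control,
                  maf_min=0.01, call_rate_min=0.95, hwe_alpha=1e-6):
    """genotypes: per-sample allele counts (0, 1, 2) or None if missing.
    is_control: per-sample booleans. All thresholds are illustrative."""
    called = [g for g in genotypes if g is not None]
    if not called:
        return False
    call_rate = len(called) / len(genotypes)
    freq = sum(called) / (2 * len(called))
    maf = min(freq, 1 - freq)
    # HWE is tested among controls only.
    controls = [g for g, c in zip(genotypes, is_control)
                if c and g is not None]
    hwe_p = hwe_chi2_pvalue(controls.count(2), controls.count(1),
                            controls.count(0))
    # A NaN p-value (no controls genotyped) does not fail the HWE filter here.
    return (call_rate >= call_rate_min and maf >= maf_min
            and not hwe_p < hwe_alpha)
```

Filtering individuals (low per-sample call rate, relatedness via identity-by-state) would follow the same pattern along the other axis of the genotype matrix and is not shown.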

  11. Analysis of GWAS • Most common approach: look at each SNP one-at-a-time. • Possibly add in multi-marker information. • Further investigate / report top SNPs only. • Or backwards replication… P-values

  12. GWAS Analysis • Most commonly trend test. • Log additive model, logistic regression. • Adjust for potential population stratification.
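
As an illustration of the single-SNP trend test mentioned above, here is a minimal sketch of the Cochran-Armitage trend test with additive weights (0, 1, 2). The function name and the input layout (genotype counts ordered by number of risk alleles) are assumptions for the example; it does not adjust for population stratification, which in practice is usually handled by adding covariates to a logistic regression.

```python
import math

def trend_test(cases, controls, weights=(0, 1, 2)):
    """Cochran-Armitage trend test for one SNP.

    cases, controls: genotype counts ordered by allele count (0, 1, 2).
    Returns (chi-square statistic with 1 df, p-value).
    """
    R, S = sum(cases), sum(controls)
    N = R + S
    col = [r + s for r, s in zip(cases, controls)]  # genotype totals
    T = sum(t * (r * S - s * R)
            for t, r, s in zip(weights, cases, controls))
    var = sum(t * t * n * (N - n) for t, n in zip(weights, col))
    var -= 2 * sum(weights[i] * weights[j] * col[i] * col[j]
                   for i in range(len(col))
                   for j in range(i + 1, len(col)))
    var *= R * S / N
    if var == 0:
        # Monomorphic SNP or only one phenotype group: test undefined.
        return float("nan"), float("nan")
    chi2 = T * T / var
    # Upper tail of a chi-square with 1 df equals erfc(sqrt(x / 2)).
    return chi2, math.erfc(math.sqrt(chi2 / 2))
```

For example, `trend_test((10, 20, 30), (30, 20, 10))` compares 60 cases and 60 controls whose genotype distributions are shifted in opposite directions.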

  13. Quantile-Quantile (QQ) Plot
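
A QQ plot compares the observed GWAS p-values with those expected under the null hypothesis of no association. The sketch below computes the plotted points and, as a common companion summary that the slide itself does not mention, the genomic inflation factor (median observed 1-df chi-square divided by the null median). Function names and the plotting positions `(i + 0.5) / n` are assumptions for the example; p-values must lie in (0, 1].

```python
import math
from statistics import NormalDist, median

def qq_points(pvalues):
    """Return (expected, observed) -log10(p) pairs, smallest p first."""
    obs = sorted(pvalues)
    n = len(obs)
    return [(-math.log10((i + 0.5) / n), -math.log10(p))
            for i, p in enumerate(obs)]

def genomic_inflation(pvalues):
    """Median observed 1-df chi-square over the median expected under the null."""
    nd = NormalDist()
    # A two-sided p-value maps to a 1-df chi-square via z = inv_cdf(p / 2).
    chi2 = [nd.inv_cdf(p / 2) ** 2 for p in pvalues]
    null_median = nd.inv_cdf(0.25) ** 2  # about 0.455
    return median(chi2) / null_median
```

Points lying along the diagonal, and an inflation factor near 1, suggest little systematic inflation; a marked upward departure across the whole range can indicate population stratification or other bias.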

  14. Example: GWAS of Prostate Cancer chromosome http://cgems.cancer.gov Multiple prostate cancer loci on 8q24 Witte, Nat Genet 2007

  15. Prostate Cancer Replications Witte, Nat Rev Genet 2009 Modest ORs

  16. Prostate Cancer Replications Witte, Nat Rev Genet 2009 Modest ORs

  17. SNPs Missed in Replication? 24,223 smallest P-value! Witte, Nat Rev Genet, 2009

  18. Prostate Cancer www.genome.gov/gwastudies Manolio et al. J Clin Invest 2008

  19. Population Attributable Risks for GWAS Smoking & lung cancer BRCA1 & Breast cancer Jorgenson & Witte, 2009

  20. Limitations of GWAS • Not very predictive. Example: AUC for breast cancer risk: Gail model = 58%, SNPs = 58.9%, Gail + SNPs = 61.8% (Wacholder et al. NEJM 2010). Witte, Nat Rev Genet 2009

  21. Limitations of GWAS • Not very predictive • Explain little heritability • Focus on common variation • Many associated variants are not causal

  22. Where’s the Heritability? Common disease rare variant (CDRV) hypothesis: diseases due to multiple rare variants with intermediate penetrances (allelic heterogeneity) Many more of these? See: NEJM, April 30, 2009 McCarthy et al., 2008

  23. Will GWAS results explain more heritability? • Possibly, if… • Causal SNPs not yet detected due to power / practical issues (e.g., not yet included in replication studies). • Stronger effects for causal SNPs: Associated SNP may only serve as a marker for multiple different causal SNPs.

  24. Imputation of SNP Genotypes • Estimate unmeasured or missing genotypes. • Based on measured SNPs and external info (e.g., haplotype structure of HapMap). • Increase GWAS power. • Allow for combining data across different platforms (e.g., Affy & Illumina) (for replication / meta-analysis).

  25. Imputation Example Study Sample HapMap/ 1K genomes Gonçalo Abecasis

  26. Identify Match with Reference Gonçalo Abecasis

  27. Phase chromosomes, impute missing genotypes Gonçalo Abecasis http://www.sph.umich.edu/csg/abecasis/MACH

  28. Imputation Application: TCF7L2 gene region & T2D from the WTCCC data. Figure: observed genotypes in black, imputed genotypes in red; x-axis: chromosomal position. Marchini, Nature Genetics 2007 http://www.stats.ox.ac.uk/~marchini/#software

  29. Genome-wide Sequence Studies • Trade-off between number of samples, depth, and genomic coverage. Gonçalo Abecasis

  30. Near-term Design Choices • For example, between: • Sequencing few subjects with extreme phenotypes: • e.g., 200 cases, 200 controls, 4x coverage. Then follow-up in larger population. • 10M SNP chip based on 1,000 genomes. • 5K cases, 5K controls. • Which design will work best…?

  31. Polygenic Models: Many weak associations combine to risk? Large number of SNPs (m). Score model: $x_j = \sum_{i=1}^{m} \ln(\mathrm{OR}_i) \times \mathrm{SNP}_{ij}$, where ln(ORi) = ‘score’ for SNPi from ‘discovery’ sample and SNPij = # of alleles (0,1,2) for SNPi, person j in ‘validation’ sample. Is xj associated with disease? ISC / Purcell et al. Nature 2009
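
The score model on this slide can be sketched as a weighted allele count: each person's score is the sum over SNPs of ln(OR) from the discovery sample times that person's allele count in the validation sample. The function name and data layout below are assumptions for illustration; selecting which SNPs enter the score (for example by a discovery p-value threshold or LD filtering) is not shown.

```python
import math

def polygenic_scores(odds_ratios, allele_counts):
    """Compute x_j = sum_i ln(OR_i) * SNP_ij for each person j.

    odds_ratios: per-SNP odds ratios estimated in the discovery sample.
    allele_counts: one list per validation-sample person, holding the
        allele count (0, 1, or 2) at each SNP, in the same SNP order.
    """
    weights = [math.log(or_i) for or_i in odds_ratios]
    return [sum(w * g for w, g in zip(weights, person))
            for person in allele_counts]
```

The resulting scores would then be tested for association with disease status in the validation sample, for instance as a predictor in a logistic regression.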

  32. Application of Model Purcell / ISC et al. Nature 2009

  33. Application to CGEMs PCa GWAS Witte & Hoffman 2010 • 1,172 cases, 1,157 controls from PLCO Trial • Oversampled more aggressive cases. • Illumina 550K array. • PCa & stratified by disease aggressiveness. • Split into halves, resampling: • one as ‘discovery’ sample; • other as ‘validation’. • LD filter: r2 = 0.5.

  34. Results for Prostate Cancer

  35. Common Polygenic Model for Prostate and Breast Cancer? • CGEMs GWAS data on prostate and breast cancer. • Use one cancer as ‘discovery’ sample, the other as ‘validation’. Nat Rev Cancer 2010;10:205-212

  36. Results for PCa & BrCa

  37. Complex diseases: Many causes = many causal pathways! Figure nodes: genetic susceptibility, physical activity, diet, obesity, hyperlipidemia, diabetes, hypertension, atherosclerosis, vulnerable plaques, MI.

  38. Pathways • Many websites / companies provide ‘dynamic’ graphic models of molecular and biochemical pathways. • Example: BioCarta: http://www.biocarta.com/ • May be interested in potential joint and/or interaction effects of multiple genes in one pathway.

  39. Systems Biology: Moving Beyond Genome • Transcriptome: all messenger RNA molecules (‘transcripts’). • Proteome: all proteins in cell or organism. • Metabolome: all metabolites in a biological organism (end products of its gene expression).
