
Contemporary Research in Human Genomics

Genetics, Ethics and the Law, May 29-31, 2009. Josyf Mychaleckyj, D.Phil., Center for Public Health Genomics, University of Virginia. Today we’ll review: Genome Wide Association Studies (GWAS), Copy Number Variants (CNVs), Medical Resequencing.


Presentation Transcript


  1. Genetics, Ethics and the Law, May 29-31, 2009 • Josyf Mychaleckyj, D.Phil., Center for Public Health Genomics, University of Virginia • Contemporary Research in Human Genomics

  2. Today we’ll review… • Genome Wide Association Studies (GWAS) • Copy Number Variants (CNVs) • Medical Resequencing • Direct-to-Consumer Services (DTC)

  3. Genome Wide Association Studies (GWAS)

  4. Single Nucleotide Polymorphisms: SNPs (‘SNiPs’) • Chromosome #1: A C C G T G T G T C • Chromosome #2: A C C G C G T G T C • T and C are the 2 different alleles for this SNP • Mutation = rare variant • Polymorphism = frequent variant (> 1% prevalence)

  5. Each person carries pairs of chromosomes, with a separate allele at the SNP position on each chromosome • 3 possible SNP genotypes and their frequencies: • Homozygote A/A: f(AA) • Heterozygote A/G: f(AG) • Homozygote G/G: f(GG) • f(AA) + f(AG) + f(GG) = 1
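
The bookkeeping on this slide can be sketched in a few lines of Python. This is only an illustration: the genotype counts and the function name are hypothetical, not taken from the presentation.

```python
from collections import Counter

def genotype_and_allele_freqs(genotypes):
    """Return (genotype frequencies, allele frequencies) for one biallelic SNP.

    `genotypes` is a list of two-letter strings such as "AA", "AG", "GG";
    "GA" is treated the same as "AG".
    """
    # Sort the two alleles so that "GA" and "AG" count as one genotype.
    counts = Counter("".join(sorted(g)) for g in genotypes)
    n = sum(counts.values())
    geno_freqs = {g: c / n for g, c in counts.items()}

    # Each person contributes two alleles, one per chromosome.
    allele_counts = Counter(a for g in genotypes for a in g)
    allele_freqs = {a: c / (2 * n) for a, c in allele_counts.items()}
    return geno_freqs, allele_freqs

# Hypothetical sample: 50 AA, 40 AG and 10 GG individuals.
sample = ["AA"] * 50 + ["AG"] * 40 + ["GG"] * 10
geno, allele = genotype_and_allele_freqs(sample)
print(geno)    # genotype frequencies, which sum to 1
print(allele)  # allele frequencies, which also sum to 1
```

The allele frequency follows from the genotype frequencies: f(A) = f(AA) + f(AG)/2.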

  6. Case-Control Association Study • Cases = clinical disease • Controls = disease free • e.g. blue allele frequency: 0.48 (48%) in cases, 0.41 (41%) in controls
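
A minimal sketch of how such a frequency difference might be tested, assuming the 0.48 and 0.41 are allele frequencies in cases and controls respectively. The sample sizes (1,000 cases and 1,000 controls) and the function name are assumptions for illustration; the slide gives no sample sizes.

```python
import math

def allelic_association(freq_cases, freq_controls, n_cases, n_controls):
    """Allelic odds ratio and 1-df chi-square test for one SNP.

    Frequencies are for the allele of interest; each person carries two
    alleles, so the allele counts are based on 2 * n chromosomes.
    """
    a = freq_cases * 2 * n_cases              # allele of interest, cases
    b = (1 - freq_cases) * 2 * n_cases        # other allele, cases
    c = freq_controls * 2 * n_controls        # allele of interest, controls
    d = (1 - freq_controls) * 2 * n_controls  # other allele, controls

    odds_ratio = (a * d) / (b * c)

    # Pearson chi-square for a 2x2 table (no continuity correction).
    total = a + b + c + d
    chi2 = total * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

    # Upper-tail p-value of a chi-square with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return odds_ratio, chi2, p_value

# Frequencies from the slide; the sample sizes are assumed.
or_, chi2, p = allelic_association(0.48, 0.41, 1000, 1000)
print(f"OR = {or_:.2f}, chi2 = {chi2:.1f}, p = {p:.2g}")
```

Counting alleles rather than people treats the two chromosomes of each person as independent, which is a simplification; genotype-based tests are a common alternative.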

  7. Quantitative Trait Locus (QTL) Association Study

  8. Genome Wide Association Study • SNPs are the most common type of human genome variant by number (10-15 million) • Stable, easy to assay, accurately genotyped • Able to multiplex 1000s of SNPs into the same assay • Illumina 1M-Duo • Affymetrix Human 6.0: 906,000 SNPs, 946,000 probes for CNV

  9. GWAS • SNPs are present in genes (and can affect proteins), but since coding sequence is ~2% of the genome, the vast majority of human SNPs lie outside coding exons (in introns or between genes) • Genotype a dense map of SNPs across all chromosomes of the human genome • Studies with 500,000 SNPs are becoming routine and 1 million SNP panels are available • Do not have to test all 10M SNPs because of SNP-SNP correlations (linkage disequilibrium)
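
The SNP-SNP correlation mentioned in the last bullet is commonly summarized as r². A small sketch, with hypothetical haplotype and allele frequencies and a hypothetical function name:

```python
def ld_r_squared(p_ab, p_a, p_b):
    """r^2 between two biallelic SNPs.

    p_ab: frequency of the haplotype carrying allele A at SNP 1 and B at SNP 2
    p_a, p_b: frequencies of allele A at SNP 1 and allele B at SNP 2
    """
    d = p_ab - p_a * p_b  # disequilibrium coefficient D
    return d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))

# Hypothetical frequencies.
r2_linked = ld_r_squared(0.30, 0.30, 0.30)       # A and B always travel together
r2_independent = ld_r_squared(0.09, 0.30, 0.30)  # A and B are independent
print(r2_linked, r2_independent)
```

When r² between two SNPs is close to 1, genotyping one of them captures nearly all the information in the other, which is why a panel of 500,000 to 1 million SNPs can stand in for a much larger set.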

  10. GWAS approach • Does not assume prior knowledge of genes or biology • Hardy J, Singleton A. N Engl J Med. 2009 Apr 23;360(17):175

  11. Genome wide Association Analysis of Coronary Artery Disease, NEJM 2007

  12. But Common Diseases are Complex • Clinical monogenic disease: HFE C282Y (rs1800562); protein VPPGEEQRYT[C/Y]QVEHPGLD; DNA GGGGAAGAGCAGAGATATACGT[A/G]CCAGGTGGAGCACCCAGGCCTG; P(Hemochromatosis+ | CC homozygote) ~ 60-100% • Clinical complex disease • Diagram labels: Gene 1 to Gene 5, Environment 1 to Environment 3, OR

  13. Monogenic vs Complex Disease (Monogenic | Complex) • 1 or small # of genes | Many genes • Often etiologic (severe phenotype) | Susceptibility / molecular pathology (?) • Highly penetrant | Modest penetrance • High odds ratio | Modest/low odds ratio • Strong selection => low frequency/rare | Weak/no selection => high frequency/common • Coding sequence | Non-coding/regulation (?)

  14. What Are GWAS Finding? • Detected variants are typically common (allele frequency >10%) • Low genotype risk: odds ratios of 1.1-1.5 • Small sibling relative risk • Causal variants have not been mapped: function is unknown and major signals occur in non-coding regions • Penetrance model not well known

  15. Example: Crohn Disease • NOD2 was the first susceptibility gene for Crohn disease • SNP: rs17221417 • Genotype relative risk (GRR): heterozygote = 1.29, homozygote = 1.92 • Allele frequency 0.287 • Sibling risk ratio = 1.02 • Familial risk in NOD2 has been estimated at 1.19-1.49 but varies with population • Lewis, J Med Genet 2007; Economou, Am J Gastroenterol 2004
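
The link between the genotype relative risks, the allele frequency and the small sibling risk ratio can be sketched as follows. This assumes Hardy-Weinberg proportions and uses the classical single-locus variance decomposition, λs = 1 + (VA/2 + VD/4)/K²; the slide does not say how its 1.02 was derived, so this is one plausible route, not necessarily the authors' calculation. The function name is hypothetical.

```python
def sibling_risk_ratio(p, grr_het, grr_hom):
    """Single-locus sibling relative risk from genotype relative risks.

    Assumes Hardy-Weinberg proportions and uses
    lambda_s = 1 + (VA / 2 + VD / 4) / K^2, with risks expressed
    relative to the genotype carrying no risk allele.
    """
    q = 1 - p
    f0, f1, f2 = 1.0, grr_het, grr_hom             # risk for 0, 1, 2 risk alleles
    k = q * q * f0 + 2 * p * q * f1 + p * p * f2   # mean relative risk
    alpha = p * (f2 - f1) + q * (f1 - f0)          # average effect of substitution
    va = 2 * p * q * alpha ** 2                    # additive variance
    vd = (p * q * (f2 - 2 * f1 + f0)) ** 2         # dominance variance
    return 1 + (va / 2 + vd / 4) / k ** 2

lam = sibling_risk_ratio(p=0.287, grr_het=1.29, grr_hom=1.92)
print(round(lam, 2))
```

With the slide's inputs the result is close to the quoted sibling risk ratio, which illustrates how a common variant with a genotype relative risk near 2 still contributes very little to familial clustering.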

  16. More than 200 GWAS published as of December 2008 • Hindorff, PNAS 2009

  17. Genome-wide association study identifies eight loci associated with blood pressure • Nature Genetics 41, 666-676 (2009) • Published online: 10 May 2009

  18. The GWAS conundrum: Little variance/risk is explained by GWAS alleles • Obesity: FTO and MC4R explain <2% of variance • Lipids: 30 gene loci; proportion of variance explained in each trait: 9.3% for HDL cholesterol, 7.7% for LDL cholesterol, 7.4% for triglycerides • Diabetes: 18 replicated loci; combined sibling relative risk ~1.07

  19. Example: Height • Highly heritable (heritability ~0.8) • Combined sample of ~63,000 • 54 validated variants in multiple genes • Each locus explains ~0.3% - 0.5% of the phenotypic variance • Total variance explained < 5% overall
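
The gap between heritability and variance explained is simple arithmetic on the two figures from this slide. A short sketch (variable names are illustrative):

```python
heritability = 0.8         # share of phenotypic variance that is heritable
explained_by_gwas = 0.05   # upper bound on variance explained by the 54 variants

# Share of the heritable variance accounted for by the GWAS variants.
share_of_heritability = explained_by_gwas / heritability

# Heritable variance still unaccounted for, as a share of total variance.
unexplained = heritability - explained_by_gwas

print(f"at most {share_of_heritability:.2%} of the heritability is explained")
print(f"at least {unexplained:.0%} of phenotypic variance is heritable but unexplained")
```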

  20. What are we missing? • Population differences • Alleles with small effect sizes • Copy number variants • Rare variants • Epigenetic effects

  21. Genotype and phenotype datasets made available as rapidly as possible to a wide range of scientific investigators • Grantees are expected to develop a sharing plan consistent with the GWAS policy • Plan should include data submission to the NIH GWAS data repository (dbGaP) (http://grants.nih.gov/grants/guide/notice-files/NOT-OD-07-088.html) • Pezzolesi et al, Diabetes 2009

  22. http://www.ncbi.nlm.nih.gov/gap

  23. NIH GWAS Data Sharing Issues • Sharing of individual genotype & phenotype data with any approved researcher worldwide (*Public access to genetic summary statistics) • Review by a central NIH data use committee (DUC) not constituted by the study • Informed consent templates for new GWAS • ‘Retrofitting’ existing cohorts to conform to NIH Policy – adequacy of consents • Data sharing clauses • Use of data for research purposes not intended or foreseen • Ancestry, ethnic origins – harm to community http://grants.nih.gov/grants/gwas/

  24. Example results for one SNP (figure: allele frequency axis from 0.0 to 1.0 showing the Reference Sample, the Mixture and the Personal Genome; label: more likely to be in mixture) • Summing over all SNPs, one can infer with very high confidence whether the person (or a close relative) is more likely to be in the mixture than in a reference sample • PLoS Genetics Aug 2008
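
The idea of comparing a personal genome to a mixture and to a reference sample, SNP by SNP, can be illustrated with a toy simulation. For each SNP it computes D = |Y - Ref| - |Y - Mix| and then a t-statistic over all SNPs. This is a sketch under simplifying assumptions (unlinked SNPs, equal-sized mixture and reference, made-up parameters and function names) and is not claimed to reproduce the cited paper's exact procedure.

```python
import random
import statistics

def membership_t(person, mixture_freq, reference_freq):
    """T statistic over SNPs of D = |Y - Ref| - |Y - Mix|.

    person: per-SNP allele fraction of the individual (0, 0.5 or 1)
    mixture_freq, reference_freq: per-SNP allele frequencies
    Large positive values suggest the person is closer to the mixture.
    """
    d = [abs(y - r) - abs(y - m)
         for y, m, r in zip(person, mixture_freq, reference_freq)]
    return statistics.mean(d) / (statistics.stdev(d) / len(d) ** 0.5)

def simulate(n_snps=10000, n_people=20, seed=1):
    """Toy simulation with unlinked SNPs; all parameters are made up."""
    rng = random.Random(seed)
    pop = [rng.uniform(0.1, 0.9) for _ in range(n_snps)]

    def draw_person():
        # Allele fraction = (number of copies of the allele) / 2.
        return [((rng.random() < p) + (rng.random() < p)) / 2 for p in pop]

    def pooled_freq(people):
        return [sum(col) / len(people) for col in zip(*people)]

    mixture = [draw_person() for _ in range(n_people)]
    reference = [draw_person() for _ in range(n_people)]
    outsider = draw_person()

    mix_freq, ref_freq = pooled_freq(mixture), pooled_freq(reference)
    t_in = membership_t(mixture[0], mix_freq, ref_freq)   # contributed DNA
    t_out = membership_t(outsider, mix_freq, ref_freq)    # did not contribute
    return t_in, t_out

t_in, t_out = simulate()
print(f"in mixture: T = {t_in:.1f}; not in mixture: T = {t_out:.1f}")
```

The per-SNP signal is tiny, but it accumulates across thousands of SNPs, which is why publicly released summary statistics raise a re-identification concern.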

  25. Copy Number Variants (CNVs)

  26. Copy Number Variants • Submicroscopic structural genome rearrangements (cf. cytogenetics, FISH) • ~ 10 – 10,000 base pairs in length • Insertions, deletions, duplications (2+ copies), inversions • Copy number variant or polymorphism: a more common CNV (> 1% frequency) is a copy number polymorphism (CNP) • Common feature of the genome • Assay using genome wide SNP or CNV arrays • Electronic FISH study

  27. Copy number variants (CNVs) The Copy Number Variation (CNV) Project http://www.sanger.ac.uk/humgen/cnv/

  28. ~11kb deletion on chromosome 8 revealed by ultra-high resolution CGH. Blue lines: individuals with two copies. Red line: individual with zero copies. Points are SNPs or probes from GWAS Array The Copy Number Variation (CNV) Project http://www.sanger.ac.uk/humgen/cnv/

  29. Location and frequency of CNVs in the genome Nature. 2006 Nov 23;444(7118):444-54

  30. Medical Resequencing: Next Generation Sequencing (NGS)

  31. Public Reference Human Genome Sequence (2001, 2004) is Haploid and Chimeric • DNA Library 1, Individual 1 • DNA Library 2, Individual 2 • DNA Library 3, Individual 3

  32. Next Generation Sequencing (NGS) enables Diploid Sequencing of an individual • Hundreds of millions of small random sequence ‘reads’ • Positions of variants: SNPs, CNVs, etc.

  33. Mapping of Individual Variants (SNPs, CNVs) • N = 1 individual • (Figure: shotgun reads aligned against the reference genome; reference bases T and C; read bases A G T G A G T G T G A G)

  34. Mapping of Individual Variants • Random reads from diploid genome sequencing • Align random shotgun reads from single individual diploid library & look for high quality mismatches • Find heterozygous positions • Medical Sequencing (to determine disease risk profile) • Incorporation of sequence and variants in the Medical Record
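
The "align reads and look for high quality mismatches" step can be sketched with a toy pileup. This assumes the reads are already aligned without gaps and ignores base qualities; the reference, reads, thresholds and function name are all hypothetical, and real variant callers are considerably more elaborate.

```python
from collections import Counter, defaultdict

def call_heterozygous(reference, reads, min_depth=4, min_fraction=0.25):
    """Return {position: (allele1, allele2)} for likely heterozygous sites.

    reads: list of (start, sequence) pairs already aligned to `reference`
    (0-based start, no gaps). A site is called heterozygous when two
    different bases each make up at least `min_fraction` of the reads
    covering it and the depth is at least `min_depth`.
    """
    pileup = defaultdict(Counter)
    for start, seq in reads:
        for offset, base in enumerate(seq):
            pos = start + offset
            if 0 <= pos < len(reference):
                pileup[pos][base] += 1

    calls = {}
    for pos, counts in sorted(pileup.items()):
        depth = sum(counts.values())
        if depth < min_depth:
            continue
        top_two = counts.most_common(2)
        if len(top_two) == 2 and all(c / depth >= min_fraction for _, c in top_two):
            calls[pos] = tuple(sorted(base for base, _ in top_two))
    return calls

# Hypothetical reference and reads; position 4 (0-based) carries both T and C.
reference = "ACCGTGTGTC"
reads = [(0, "ACCGTGTG"), (1, "CCGCGTGT"), (2, "CGTGTGTC"),
         (0, "ACCGCGTG"), (3, "GCGTGTC"), (2, "CGTGTGT")]
het_sites = call_heterozygous(reference, reads)
print(het_sites)
```

Requiring a minimum depth and a minimum fraction for each allele is a crude guard against sequencing errors being mistaken for a second allele.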

  35. ABBA00000000

  36. ‘Project Jim’ • 1.3 percent of Watson’s genome did not match the existing reference genome • > 600,000 novel SNPs • < 68,000 insertions and deletions compared to the reference sequence, 3 bp to 7 kb in length • Bio-IT World June 2007

  37. NGS of Diploid Genomes • 5 completely sequenced as of May 2009: J. Craig Venter; James Watson; Yoruban (West Africa, HGVS); Chinese (YH); Korean (SJK, May 2009) • Levy et al, PLoS Biology, 2007

  38. Scientific American 2006

  39. 2008: Announcement of the $5,000 Genome

  40. Direct-to-Consumer Services

  41. Bio-IT World November 2008

  42. Rival genetic tests leave buyers confused: firms that offer to predict your risk of disease give worryingly varied results • Nic Fleming (September 7, 2008)

  43. Different Companies produce differing assessments of risk • Different genetic variants reviewed and included – threshold for inclusion • Level of expertise in companies to review literature • Different statistical models for risk prediction – no ‘right’ answer • How frequently updated – new findings in literature
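
A small sketch of how the first bullet alone can produce diverging results: two hypothetical companies apply the same simple multiplicative model to different variant panels. Every SNP name, relative risk and genotype here is made up for illustration, and the multiplicative model is only one of several that a company might use.

```python
from math import prod

def combined_relative_risk(genotype, panel):
    """Multiply per-allele relative risks over the SNPs a company tests.

    genotype: {snp_id: number of risk alleles carried (0, 1 or 2)}
    panel: {snp_id: per-allele relative risk used by the company}
    """
    return prod(rr ** genotype.get(snp, 0) for snp, rr in panel.items())

# All SNP names, relative risks and genotypes below are made up.
customer = {"snp1": 2, "snp2": 1, "snp3": 0, "snp4": 1}
company_a = {"snp1": 1.2, "snp2": 1.3, "snp3": 1.1}
company_b = {"snp1": 1.2, "snp4": 0.7}

risk_a = combined_relative_risk(customer, company_a)
risk_b = combined_relative_risk(customer, company_b)
print(risk_a, risk_b)
```

The same customer is reported as being at clearly elevated risk by one panel and at roughly average risk by the other, without either calculation being wrong on its own terms.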
