
Human Genetic Variation


cargan




Presentation Transcript


  1. Human Genetic Variation: Genetics of Complex Diseases

  2. The Human Genome Project - Goals • Determine the sequences of the 3 billion base pairs that make up human DNA

  3. The Human Genome Project - Goals • Determine the sequences of the 3 billion chemical base pairs that make up human DNA • Improve tools for data analysis

  4. The Human Genome Project “What we are announcing today is that we have reached a milestone…that is, covering the genome in…a working draft of the human sequence.” “But our work previously has shown… that having one genetic code is important, but it's not all that useful.” “I would be willing to make a prediction that within 10 years, we will have the potential of offering any of you the opportunity to find out what particular genetic conditions you may be at increased risk for…” Washington, DC, June 26, 2000

  5. The Vision of Personalized Medicine Genetic and epigenetic variants, together with measurable environmental and behavioral factors, would be used for personalized diagnosis and treatment.

  6. Example: Warfarin An anticoagulant drug, useful in the prevention of thrombosis.

  7. Example: Warfarin • Warfarin was originally used as rat poison. • The optimal dose varies across the population. • Genetic variants (VKORC1 and CYP2C9) account for part of the variation in the personalized optimal dose.

  8. Association Studies Studying complex diseases by comparing cases to controls

  9. Where should we look first? SNP = Single Nucleotide Polymorphism. person 1: ….AAGCTAAATTTG…. person 2: ….AAGCTAAGTTTG…. person 3: ….AAGCTAAGTTTG…. person 4: ….AAGCTAAATTTG…. person 5: ….AAGCTAAGTTTG…. Most common SNPs have only two possible alleles.
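To make the SNP definition concrete, here is a minimal Python sketch that scans the five sequences from the slide column by column and reports every position where more than one allele appears. The function name `find_snps` is an illustrative choice, not something from the presentation.

```python
from collections import Counter

# The five sequence fragments shown on the slide (persons 1 to 5).
sequences = [
    "AAGCTAAATTTG",
    "AAGCTAAGTTTG",
    "AAGCTAAGTTTG",
    "AAGCTAAATTTG",
    "AAGCTAAGTTTG",
]

def find_snps(seqs):
    """Return {position: Counter of alleles} for every column with more than one allele."""
    snps = {}
    # zip(*seqs) yields one tuple per aligned position across all individuals.
    for pos, column in enumerate(zip(*seqs)):
        counts = Counter(column)
        if len(counts) > 1:
            snps[pos] = counts
    return snps

snps = find_snps(sequences)
for pos, counts in snps.items():
    print(f"position {pos} (0-based): {dict(counts)}")
```

In this fragment only one position varies, and it carries exactly two alleles (A and G), which is the bi-allelic case the slide describes.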

  10. Associated SNP Disease Association Studies Cases: AGAGCAGTCGACAGGTATAGCCTACATGAGATCGACATGAGATCGGTAGAGCCGTGAGATCGACATGATAGCC AGAGCCGTCGACATGTATAGTCTACATGAGATCGACATGAGATCGGTAGAGCAGTGAGATCGACATGATAGTC AGAGCAGTCGACAGGTATAGTCTACATGAGATCGACATGAGATCGGTAGAGCCGTGAGATCGACATGATAGCC AGAGCAGTCGACAGGTATAGCCTACATGAGATCAACATGAGATCGGTAGAGCAGTGAGATCGACATGATAGCC AGAGCCGTCGACATGTATAGCCTACATGAGATCGACATGAGATCGGTAGAGCCGTGAGATCAACATGATAGCC AGAGCCGTCGACATGTATAGCCTACATGAGATCGACATGAGATCGGTAGAGCAGTGAGATCAACATGATAGCC AGAGCCGTCGACAGGTATAGCCTACATGAGATCGACATGAGATCGGTAGAGCAGTGAGATCAACATGATAGTC AGAGCAGTCGACAGGTATAGCCTACATGAGATCGACATGAGATCTGTAGAGCCGTGAGATCGACATGATAGCC Associated SNP Controls: AGAGCAGTCGACATGTATAGTCTACATGAGATCGACATGAGATCGGTAGAGCAGTGAGATCAACATGATAGCC AGAGCAGTCGACATGTATAGTCTACATGAGATCAACATGAGATCTGTAGAGCCGTGAGATCGACATGATAGCC AGAGCAGTCGACATGTATAGCCTACATGAGATCGACATGAGATCTGTAGAGCCGTGAGATCAACATGATAGCC AGAGCCGTCGACAGGTATAGCCTACATGAGATCGACATGAGATCTGTAGAGCCGTGAGATCGACATGATAGTC AGAGCCGTCGACAGGTATAGTCTACATGAGATCGACATGAGATCTGTAGAGCCGTGAGATCAACATGATAGCC AGAGCAGTCGACAGGTATAGTCTACATGAGATCGACATGAGATCTGTAGAGCAGTGAGATCGACATGATAGCC AGAGCCGTCGACAGGTATAGCCTACATGAGATCGACATGAGATCTGTAGAGCCGTGAGATCGACATGATAGCC AGAGCCGTCGACAGGTATAGTCTACATGAGATCAACATGAGATCTGTAGAGCAGTGAGATCGACATGATAGTC

  11. Genotyping technology AGACTAACC…. ACGAATCCT…. GGACTTACC…. GCACAACCT…. GGGATTAAC.… DNA

  12. Genotyping technologies • The cost of genotyping has dropped dramatically in the last decade. • Genotyping one SNP in one individual cost more than $1 at the beginning of the decade. • The price is now about 0.03 cents. • Exponential improvement: the number of genotypes per dollar doubles roughly every 10 months. • This is faster than Moore's law, which doubles every 18 months.
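The 10-month doubling figure can be checked against the two prices on the slide. The sketch below assumes the starting price is exactly $1 (the slide says "> $1") and that "the last decade" means exactly 120 months; both are simplifying assumptions.

```python
import math

start_price = 1.00    # dollars per genotype at the start of the decade (assumed exactly $1)
end_price = 0.0003    # 0.03 cents expressed in dollars
months = 120          # assumption: the decade is taken as exactly 10 years

fold_drop = start_price / end_price          # how many times cheaper genotyping became
halvings = math.log2(fold_drop)              # number of times the price halved
months_per_halving = months / halvings       # equivalently, months per doubling of genotypes per dollar

print(f"price dropped {fold_drop:.0f}-fold, i.e. {halvings:.1f} halvings")
print(f"one halving roughly every {months_per_halving:.1f} months")
```

Under these assumptions the price halves a little more often than once every 11 months, which is consistent with the slide's "every 10 months" and clearly faster than an 18-month doubling period.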

  13. Public Genotype Data Growth (timeline axis: 2001, 2002, 2003, 2004, 2005, 2007) • Daly et al., Nature Genetics: 103 SNPs, 40,000 genotypes • Gabriel et al., Science: 3,000 SNPs, 400,000 genotypes • TSC Data, Nucleic Acids Research: 35,000 SNPs, 4,500,000 genotypes • Perlegen Data, Science: 1,570,000 SNPs, 100,000,000 genotypes • NCBI dbSNP, Genome Research: 3,000,000 SNPs, 286,000,000 genotypes • HapMap Phase 2: 5,000,000+ SNPs, 600,000,000+ genotypes

  14. Association Studies Genetic variants such as Single Nucleotide Polymorphisms (SNPs) are tested for association with the trait.

  15. Published Genome-Wide Associations: through 6/2009, 439 published GWA at p < 5 × 10^-8. Source: NHGRI GWA Catalog, www.genome.gov/GWAStudies

  16. Preliminary Definitions • SNP – single nucleotide polymorphism. A genetic variant at which different individuals may carry different alleles. • Most SNPs are bi-allelic: only two alleles are observed in the population. • Risk allele – the allele that is more common in cases than in controls (denoted R) • Nonrisk allele – the allele that is more common in controls than in cases (denoted N)
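The R/N labelling can be sketched in a few lines of Python. The allele counts below are hypothetical numbers chosen for illustration, and `risk_allele` is an assumed helper name; neither comes from the presentation.

```python
def allele_frequency(counts, allele):
    """Fraction of all observed alleles in a group that are `allele`."""
    return counts[allele] / sum(counts.values())

def risk_allele(case_counts, control_counts):
    """Return (R, N) for a bi-allelic SNP.

    R is the allele whose frequency is higher in cases than in controls.
    If the frequencies are identical the labelling is arbitrary.
    """
    a, b = sorted(case_counts)  # the two allele symbols, e.g. "A" and "G"
    if allele_frequency(case_counts, a) > allele_frequency(control_counts, a):
        return a, b
    return b, a

# Hypothetical allele counts (two alleles per person, 100 cases and 100 controls).
case_counts = {"A": 120, "G": 80}
control_counts = {"A": 90, "G": 110}

R, N = risk_allele(case_counts, control_counts)
print(f"risk allele R = {R}, nonrisk allele N = {N}")
```

Here A has frequency 0.60 in cases and 0.45 in controls, so A is labelled R and G is labelled N.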

  17. Other Structural Variants Inversion Deletion Copy number variant

  18. Chance or Real Association? Cases: AGAGCAGTCGACAGGTATAGCCTACATGAGATCGACATGAGATCGGTAGAGCCGTGAGATCGACATGATAGCC AGAGCCGTCGACATGTATAGTCTACATGAGATCGACATGAGATCGGTAGAGCAGTGAGATCGACATGATAGTC AGAGCAGTCGACAGGTATAGTCTACATGAGATCGACATGAGATCGGTAGAGCCGTGAGATCGACATGATAGCC AGAGCAGTCGACAGGTATAGCCTACATGAGATCAACATGAGATCGGTAGAGCAGTGAGATCGACATGATAGCC AGAGCCGTCGACATGTATAGCCTACATGAGATCGACATGAGATCGGTAGAGCCGTGAGATCAACATGATAGCC AGAGCCGTCGACATGTATAGCCTACATGAGATCGACATGAGATCGGTAGAGCAGTGAGATCAACATGATAGCC AGAGCCGTCGACAGGTATAGCCTACATGAGATCGACATGAGATCGGTAGAGCAGTGAGATCAACATGATAGTC AGAGCAGTCGACAGGTATAGCCTACATGAGATCGACATGAGATCTGTAGAGCCGTGAGATCGACATGATAGCC Associated SNP Controls: AGAGCAGTCGACATGTATAGTCTACATGAGATCGACATGAGATCGGTAGAGCAGTGAGATCAACATGATAGCC AGAGCAGTCGACATGTATAGTCTACATGAGATCAACATGAGATCTGTAGAGCCGTGAGATCGACATGATAGCC AGAGCAGTCGACATGTATAGCCTACATGAGATCGACATGAGATCTGTAGAGCCGTGAGATCAACATGATAGCC AGAGCCGTCGACAGGTATAGCCTACATGAGATCGACATGAGATCTGTAGAGCCGTGAGATCGACATGATAGTC AGAGCCGTCGACAGGTATAGTCTACATGAGATCGACATGAGATCTGTAGAGCCGTGAGATCAACATGATAGCC AGAGCAGTCGACAGGTATAGTCTACATGAGATCGACATGAGATCTGTAGAGCAGTGAGATCGACATGATAGCC AGAGCCGTCGACAGGTATAGCCTACATGAGATCGACATGAGATCTGTAGAGCCGTGAGATCGACATGATAGCC AGAGCCGTCGACAGGTATAGTCTACATGAGATCAACATGAGATCTGTAGAGCAGTGAGATCGACATGATAGTC

  19. Hypothesis testing • We want to distinguish between two hypotheses: • Null hypothesis: the allele frequency in the cases and the controls is the same (the SNP has nothing to do with the disease). • Alternative hypothesis: the allele frequency in the cases and in the controls is different (the SNP is correlated with the disease). • Intuitively, we want to ask how likely the observed data would be if the null hypothesis were true.

  20. How does it work? • For every SNP we can construct a contingency table: • From the table we construct a statistic T. • The probability, under the null hypothesis, of getting T or a larger value is the p-value.
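The table and the formula for the statistic did not survive in this transcript. As a sketch only, the code below assumes a 2×2 table of allele counts (risk/nonrisk by case/control) and Pearson's chi-square statistic, which is a common choice for this test; the counts are hypothetical.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic for the 2x2 table

                   R allele   N allele
        cases         a          b
        controls      c          d
    """
    n = a + b + c + d
    # Closed form of sum((observed - expected)^2 / expected) for a 2x2 table.
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

# Hypothetical allele counts, not taken from the slides.
T = chi_square_2x2(a=60, b=40, c=40, d=60)
print(f"T = {T}")

# A table with identical allele frequencies in both groups gives T = 0.
T_null = chi_square_2x2(a=50, b=50, c=50, d=50)
print(f"T under identical frequencies = {T_null}")
```

The larger the difference in allele frequency between cases and controls, the larger T becomes, and the smaller the corresponding p-value.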

  21. Example: • For every SNP we can construct a contingency table: T = 0.02. The p-value is 0.8875 (roughly an 89% chance of getting T ≥ 0.02 under the null hypothesis).

  22. Example: • For every SNP we can construct a contingency table: T = 11.11. The p-value is low: about 0.001 = 10^-3.

  23. Example: • For every SNP we can construct a contingency table: T = 83.33. The p-value is extremely low: about 10^-19.
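The three T values in the examples map to the quoted p-values if T is assumed to follow a chi-square distribution with one degree of freedom under the null hypothesis. That distribution is an assumption here (the slides do not name it), but it reproduces the quoted numbers. The sketch uses only the standard library: for one degree of freedom the tail probability is erfc(sqrt(T/2)).

```python
import math

def chi2_tail_1df(t):
    """P(X >= t) for a chi-square variable X with 1 degree of freedom.

    X is the square of a standard normal Z, so
    P(X >= t) = P(|Z| >= sqrt(t)) = erfc(sqrt(t / 2)).
    """
    return math.erfc(math.sqrt(t / 2))

# T values taken from the three example slides.
p_values = {t: chi2_tail_1df(t) for t in (0.02, 11.11, 83.33)}
for t, p in p_values.items():
    print(f"T = {t:>5}: p = {p:.3g}")
```

The first value comes out at about 0.8875, the second just under 0.001, and the third on the order of 10^-19 to 10^-20, in line with the slides.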

  24. Results: Manhattan Plots

  25. Challenges

  26. Challenge 1: Correcting for multiple testing • In a typical Genome-Wide Association Study (GWAS), we test millions of SNPs. • If we set the p-value threshold for each test to 0.05, then by chance alone we will “find” about 5% of the SNPs to be associated with the disease. • This needs to be corrected. Different statistical methods are used.
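The slide does not say which correction is used. One simple and widely used option is the Bonferroni correction, which divides the threshold by the number of tests. The sketch below assumes one million tested SNPs (an illustrative figure) and applies the corrected threshold to the three example p-values from the earlier slides.

```python
num_snps = 1_000_000   # assumed number of SNPs tested
alpha = 0.05           # per-test threshold from the slide

# With no correction, about alpha * num_snps null SNPs pass by chance.
expected_false_positives = alpha * num_snps

# Bonferroni correction (an assumed choice, one of several possible methods).
bonferroni_threshold = alpha / num_snps

def significant(p_values, threshold):
    """Keep only the p-values that fall below the given threshold."""
    return [p for p in p_values if p < threshold]

example_p_values = [0.8875, 1e-3, 1e-19]   # the three examples from the slides
hits = significant(example_p_values, bonferroni_threshold)

print(f"expected chance findings without correction: {expected_false_positives:.0f}")
print(f"Bonferroni threshold: {bonferroni_threshold:.0e}")
print(f"p-values still significant: {hits}")
```

With one million tests the corrected threshold is 5 × 10^-8, the same cutoff quoted for the catalog of published genome-wide associations. Of the three examples, only the p-value of 10^-19 survives; the 10^-3 result would be dismissed as a likely chance finding.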

  27. Challenge 2: Correcting genotyping errors • How can we detect genotyping errors? • Test each SNP for deviation from Hardy-Weinberg equilibrium. • If we have mother-father-child trios, we can check Mendelian consistency.
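Both checks can be sketched briefly. The Hardy-Weinberg test below compares observed genotype counts with the counts expected from the allele frequencies, using a chi-square test with one degree of freedom; the Mendelian check verifies that a child could have received one allele from each parent. Function names, genotype encoding (two-letter strings), and counts are illustrative assumptions.

```python
import math

def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square test for Hardy-Weinberg equilibrium at a bi-allelic SNP.

    Assumes both alleles are present (otherwise an expected count is zero).
    Returns (statistic, p_value) with 1 degree of freedom.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)   # frequency of allele A
    q = 1 - p                         # frequency of allele B
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    t = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return t, math.erfc(math.sqrt(t / 2))

def mendelian_consistent(mother, father, child):
    """True if the child's genotype has one allele from each parent."""
    c1, c2 = child
    return (c1 in mother and c2 in father) or (c2 in mother and c1 in father)

# Hypothetical genotype counts: the first fits HWE, the second has no heterozygotes.
print(hwe_chi_square(25, 50, 25))
print(hwe_chi_square(50, 0, 50))

# Hypothetical trios.
print(mendelian_consistent("AG", "GG", "AG"))
print(mendelian_consistent("AA", "AA", "AG"))
```

A SNP with a very small Hardy-Weinberg p-value, or with many Mendelian inconsistencies across trios, is a candidate for a genotyping problem, although true biological causes can also produce deviations.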
