
Big Data Opportunities and Challenges in Human Disease Genetics & Genomics

Manolis Kellis. Broad Institute of MIT and Harvard. MIT Computer Science & Artificial Intelligence Laboratory.

Presentation Transcript


  1. Big Data Opportunities and Challenges in Human Disease Genetics & Genomics
     Manolis Kellis
     Broad Institute of MIT and Harvard
     MIT Computer Science & Artificial Intelligence Laboratory

  2. Big data Opportunities & Challenges in human disease genetics & genomics
     • The goal: Mechanistic basis of human disease
       • Epigenomics: Enhancers, networks, regulators, motifs
       • Genetics: GWAS, QTLs, molecular epidemiology
     • The challenges / opportunities:
       • Effects are very small, huge number of hypotheses
       • Much larger cohorts are needed, consent limitations
       • Technologies for privacy vs. excuse for data hoarding
     • Overcoming the challenges:
       • Case study: Schizophrenia, Alzheimer’s
       • Collaboration & sharing: personal & technological

  3. Bridging the knowledge gap from genetics to disease
     [Figure: path from genetic variant (CATGACTG vs. CATGCCTG) to disease. Labels: control regions (chromatin states: promoter, enhancer, insulator, silencer); circuitry (tissue, cell type, target genes: protein, miRNA, ncRNA); tissues (heart, retina, cortex, lung, blood, skin, nerve); intermediate effects (lipids, tension, eye drusen, metabolism, drug response); factors; environment; example gene TIMP3]
     • Requires: systematic understanding of genome function

  4. The most complete map of human gene regulation
     • 2.3M regulatory elements across 127 tissue/cell types
     • High-resolution map of individual regulatory motifs
     • Circuitry: regulators → regions → motifs → target genes

  5. Non-coding variants lie in tissue-specific regulatory regions
     • Yield new insights on relevant tissues and pathways
     • Enable linking non-coding elements to relevant target genes
     • Provide a mechanistic basis for developing therapeutics

  6. Control regions harbor 1000s of weak-effect disease SNPs
     • GWAS top hits explain only a small fraction of trait heritability
     • Functional enrichments extend well past genome-wide significance

  7. Bayesian integration of weak effects → disease modules
     [Figure legend: poorly ranked SNP nearby; highly ranked SNP nearby; disease gene; genetic association; disease SNP]
     • MAZ: no direct association, but clusters with many T1D hits
     • MAZ is indeed a known regulator of insulin expression

  8. Brain methylation changes in Alzheimer’s patients
     [Figure labels: MAP (Memory and Aging Project) + ROS (Religious Orders Study); dorsolateral PFC; genotype (1M SNPs x 700 ind.); methylation (450k probes x 700 ind.); reference chromatin states]
     • Variation in methylation patterns largely genotype driven
     • Global signature of repression in 1000s of regulatory regions: hypermethylation, enhancer states, brain regulator targets

  9. Big data Opportunities & Challenges in human disease genetics & genomics
     • The goal: Mechanistic basis of human disease
       • Epigenomics: Enhancers, networks, regulators, motifs
       • Genetics: GWAS, QTLs, molecular epidemiology
     • The challenges / opportunities:
       • Effects are very small, huge number of hypotheses
       • Much larger cohorts are needed, consent limitations
       • Technologies for privacy vs. excuse for data hoarding
     • Overcoming the challenges:
       • Case study: Schizophrenia, Alzheimer’s
       • Collaboration & sharing: personal & technological

  10. Big data Opportunities & Challenges in human disease genetics & genomics
      • The goal: Mechanistic basis of human disease
        • Epigenomics: Enhancers, networks, regulators, motifs
        • Genetics: GWAS, QTLs, molecular epidemiology
      • The challenges / opportunities:
        • Effects are very small, huge number of hypotheses
        • Much larger cohorts are needed, consent limitations
        • Technologies for privacy vs. excuse for data hoarding
      • Overcoming the challenges:
        • Case study: Schizophrenia, Alzheimer’s
        • Collaboration & sharing: personal & technological

  11. Scaling of QTL discovery power w/ sample size
      • Number of meQTLs continues to increase linearly
      • Weak-effect meQTLs: median R² < 0.1 after 400 individuals
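A rough way to see why QTL discovery keeps growing with cohort size is to compute the power of a single association test as a function of sample size and variance explained (R²). The sketch below is not the analysis behind slide 11: it assumes a two-sided 1-degree-of-freedom test under a normal approximation, and the function name `qtl_power`, the significance level `alpha`, and the grid of sample sizes and R² values are all illustrative assumptions.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def qtl_power(n_individuals, r2, alpha):
    """Approximate power of a two-sided 1-df association test.

    Assumption: the test statistic is roughly normal with mean
    sqrt(n * r2 / (1 - r2)) under the alternative. This is a textbook
    approximation, not the method used in the presentation.
    """
    shift = sqrt(n_individuals * r2 / (1.0 - r2))
    z_crit = -_STD_NORMAL.inv_cdf(alpha / 2.0)  # two-sided critical value
    upper_tail = 1.0 - _STD_NORMAL.cdf(z_crit - shift)
    lower_tail = _STD_NORMAL.cdf(-z_crit - shift)
    return upper_tail + lower_tail


if __name__ == "__main__":
    alpha = 5e-8  # hypothetical per-test threshold, chosen for illustration
    for n in (100, 200, 400, 700):
        row = ", ".join(
            f"R2={r2:.2f}: {qtl_power(n, r2, alpha):.3f}"
            for r2 in (0.02, 0.05, 0.10)
        )
        print(f"n={n:4d}  {row}")
```

Under these assumptions, effects with small R² stay nearly undetectable at a few hundred individuals, which is consistent with the slide's point that newly discovered meQTLs at larger sample sizes are the weak-effect ones.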

  12. Inflection point in complex trait GWAS
      [Figure labels, with approximate sample sizes:]
      • Incl. replication (~100K)
      • Freeze May 2013 (~80K)
      • Freeze Jan. 2013 (~70K)
      • WCPG Hamburg 2012 (~65K)
      • Incl. SWE + CLOZUK (~60K)

  13. Schizophrenia GWAS: Number of significant loci
      • 3,500 cases → 0 loci
      • 10,000 cases → 5 loci
      • 35,000 cases → 62 loci!

  14. Similar inflection point found in every complex trait!
      • Same story in:
        • Type 1 diabetes
        • Type 2 diabetes
        • Serum cholesterol level
        • Every common chronic disease
      [Figure axis: significantly associated regions (p < 5e-08)]
      Larger samples lead to new biological insights
      • Proof that schizophrenia is a heritable, medical disorder
      • Genetic architecture similar to non-brain diseases and traits
      • Many genes → recognition of key pathways and processes
        • Voltage-gated calcium channels (CACNA1C, CACNA1D, CACNA1I, CACNB2)
        • Proteins interacting with FMRP, fragile X gene
        • Neuron organization: postsynaptic density, dendritic spine heads
        • Enhancers: brain (angular gyrus, inferior temporal lobe), immune
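The p < 5e-08 cutoff on slide 14 is the conventional genome-wide significance threshold, commonly explained as a Bonferroni correction of a 0.05 family-wise error rate over roughly one million independent common-variant tests. The sketch below shows that arithmetic and applies the cutoff to a few p-values. The function names and the SNP identifiers and p-values are made up for illustration and are not taken from the presentation.

```python
def bonferroni_threshold(family_alpha, n_tests):
    """Per-test significance level that keeps the family-wise error rate at family_alpha."""
    return family_alpha / n_tests


# Conventional rationale: 0.05 spread over ~1 million independent tests.
GENOME_WIDE = bonferroni_threshold(0.05, 1_000_000)


def genome_wide_hits(p_values, threshold=GENOME_WIDE):
    """Return the subset of {snp_id: p_value} that passes the threshold."""
    return {snp: p for snp, p in p_values.items() if p < threshold}


if __name__ == "__main__":
    # Hypothetical p-values, invented for this example only.
    toy_p_values = {"snp_a": 3e-9, "snp_b": 4e-7, "snp_c": 0.02}
    print(f"threshold = {GENOME_WIDE:.1e}")
    print("hits:", sorted(genome_wide_hits(toy_p_values)))
```

A variant such as the hypothetical `snp_b` is suggestive but falls short of the cutoff, which is the kind of signal that larger cohorts can push past the threshold.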

  15. Big data Opportunities & Challenges in human disease genetics & genomics
      • The goal: Mechanistic basis of human disease
        • Epigenomics: Enhancers, networks, regulators, motifs
        • Genetics: GWAS, QTLs, molecular epidemiology
      • The challenges / opportunities:
        • Effects are very small, huge number of hypotheses
        • Much larger cohorts are needed, consent limitations
        • Technologies for privacy vs. excuse for data hoarding
      • Overcoming the challenges:
        • Collaboration, consortia, sharing of datasets
        • Case study: Schizophrenia, Alzheimer’s
