
Manolis Kellis: Research synopsis


Presentation Transcript


  1. Manolis Kellis: Research synopsis
     • Why biology in a computer science group?
     • Big biological questions:
       • Interpreting the human genome.
       • Revealing the logic of gene regulation.
       • Principles of evolutionary change.
     • Underlying computational techniques:
       • Comparative genomics: evolutionary signatures
       • Regulatory genomics: motifs, networks, models
       • Epigenomics: chromatin states, dynamics, disease
       • Phylogenomics: evolution at the genome scale
     • Defining characteristics of research program:
       • Genome-wide rules, exploit nature of problems, interdisciplinary collaborations, biology impact
     Brief overview: one slide for each vignette

  2. (1) Comparative genomics: evolutionary signatures
     • Protein-coding signatures
       • 1000s of new coding exons
       • Translational readthrough
       • Overlapping constraints
     • Non-coding RNA signatures
       • Novel structural families
       • Targeting, editing, stability
       • Structures in coding exons
     • microRNA signatures
       • Novel/expanded miR families
       • miR/miR* arm cooperation
       • Sense/anti-sense switches
     • Regulatory motif signatures
       • Systematic motif discovery
       • Regulatory motif instances
       • TF/miRNA target networks
       • Single binding-site resolution

  3. (2) Regulatory genomics: circuits, predictive models
     • ENCODE/modENCODE
       • 4-year effort, dozens of experimental labs
       • Integrative analysis
       • Systematic genome annotation
       • Flagship NIH project
       • Initial annotation of the non-coding genome, from 20% to 70%
       • Systems biology for an animal genome possible for the first time
       • Students and postdocs are co-first authors, leadership roles
     • Predictive models of gene regulation
       • Infer networks
       • Predict function
       • Predict regulators
       • Predict gene expression

  4. (3) Phylogenomics: Bayesian gene-tree reconstruction
     • Generative model: two components of gene evolution
       1. Family rate Fj ~ gamma(α, β): selective pressures on gene function
       2. Species-specific rates Si ~ normal(μi, σi): population dynamics of the species
     • New phylogenomic pipeline: Bayesian formulation over branch lengths l, topology T, and reconciliation R, given alignment data D and species-level parameters θ
       • Sequence likelihood: HKY model (traditional)
       • Branch length prior: learned Fj, Si distributions
       • Topology prior: birth-death process
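
The slide names three terms (sequence likelihood, branch length prior, topology prior) without writing out how they combine. One plausible reading, offered as a sketch rather than the authors' exact formulation, is a posterior that factors into those three terms. The conditioning shown for each factor is an assumption, and the branch lengths are written as l:

```latex
\[
P(l, T, R \mid D, \theta) \;\propto\;
\underbrace{P(D \mid l, T)}_{\text{sequence likelihood}}\;
\underbrace{P(l \mid T, R, \theta)}_{\text{branch length prior}}\;
\underbrace{P(T, R \mid \theta)}_{\text{topology prior}}
\]
```

Under this reading, the HKY model supplies the first factor, the learned Fj and Si rate distributions supply the second, and the birth-death process supplies the third.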

  5. Vignette: Epigenomics
     Jason Ernst, Pouya Kheradpour
     Ernst and Kellis, Nature Biotech, 2010
     Ernst, Kheradpour et al., Nature, 2011 (in press)

  6. Epigenomics and ‘chromatin state’ signatures
     • Learn de novo combinations of chromatin marks
     • Reveal functional elements
     • Use for genome annotation
     • Use for studying dynamics across many cell types
     [Figure: DNA and histone tails carrying chromatin ‘marks’; state classes shown are promoter states, transcribed states, active intergenic, and repressed]

  7. ChromHMM: learning ‘hidden’ chromatin states
     • Observed chromatin marks (K4me3, K4me1, K36me3, K27ac in the example) are called in 200bp intervals, based on a Poisson distribution
     • Each interval is assigned its most likely hidden state (5 2 1 3 5 5 6 6 6 6 4 6 in the example)
     • Each state: vector of emissions, vector of transitions
     • All probabilities are learned de novo from chromatin data alone (Baum-Welch, a.k.a. EM)
     [Figure: example region of DNA spanning a transcription start site, an enhancer, and a transcribed region; six states (1 to 6), each listed with its high-probability chromatin marks at emission probabilities of 0.7 to 0.9]
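
To make the "most likely hidden state" step concrete, here is a minimal Viterbi decoding sketch for an HMM whose emissions are independent present/absent calls per mark in each 200bp interval. It is not ChromHMM's implementation: the two states, the two marks, and every probability below are hypothetical values set by hand, whereas the slide says the real parameters are learned with Baum-Welch.

```python
import math


def viterbi(obs, start, trans, emit):
    """Return the most likely state path for a sequence of binary mark vectors.

    obs   : list of tuples, one per interval, with 1/0 per mark (present/absent)
    start : start[k] = P(first state is k)
    trans : trans[j][k] = P(next state is k | current state is j)
    emit  : emit[k][m] = P(mark m is present | state k), strictly between 0 and 1
    """
    n_states = len(start)

    def log_emit(k, o):
        # Marks are treated as independent Bernoulli variables within a state.
        return sum(math.log(p if x else 1.0 - p) for p, x in zip(emit[k], o))

    score = [math.log(start[k]) + log_emit(k, obs[0]) for k in range(n_states)]
    back = []  # back[t][k] = best predecessor of state k at step t + 1
    for o in obs[1:]:
        prev, ptr, score = score, [], []
        for k in range(n_states):
            best = max(range(n_states),
                       key=lambda j: prev[j] + math.log(trans[j][k]))
            ptr.append(best)
            score.append(prev[best] + math.log(trans[best][k]) + log_emit(k, o))
        back.append(ptr)

    # Trace back from the best final state.
    state = max(range(n_states), key=lambda k: score[k])
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    return path[::-1]


# Hypothetical toy model: state 0 is promoter-like, state 1 is transcribed-like.
# Mark order in each observation: (K4me3, K36me3).
start = [0.5, 0.5]
trans = [[0.8, 0.2], [0.2, 0.8]]
emit = [[0.9, 0.05], [0.05, 0.9]]
obs = [(1, 0), (1, 0), (0, 1), (0, 1)]

path = viterbi(obs, start, trans, emit)
print(path)  # → [0, 0, 1, 1]
```

Working in log space avoids numerical underflow on long chromosomes, which is why the emission and transition probabilities must stay strictly inside (0, 1).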

  8. Chromatin state dynamics across nine cell types
     • State definitions are cell-type invariant
       • Same combinations consistently found
     • State locations are cell-type specific
       • Can study pair-wise or multi-way changes

  9. Multi-cell activity profiles and their correlations
     • Chromatin state & gene expression → link enhancers and target genes
     • TF motif enrichment & TF expression → reveal activators / repressors
     [Figure: per-cell-type profiles of gene expression (ON/OFF), chromatin states (active enhancer/repressed), active TF motif enrichment (motif enrichment/motif depletion), TF regulator expression (TF on/TF off), and dip-aligned motif biases (motif aligned/flat profile)]

  10. Coordinated activity reveals enhancer links
     • Enhancer networks: Regulator → enhancer → target gene
     • Ex1: Oct4 predicted activator of embryonic stem (ES) cells
     • Ex2: Ets activator of GM/HUVEC (but not either one alone)
     [Figure: predicted regulators, enhancer activity, and gene activity, with activity signatures for each TF]
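
One simple way to picture "coordinated activity" is to correlate each enhancer's activity profile across cell types with each gene's expression profile and keep the strongly correlated pairs. The sketch below does only that; the 0.8 threshold, the names `enhA`, `geneX`, and `geneY`, and all values are hypothetical, and a real analysis would presumably also restrict candidate pairs by genomic distance.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def link_enhancers(enhancer_activity, gene_expression, threshold=0.8):
    """Return (enhancer, gene, r) for every pair whose profiles correlate at or above threshold."""
    links = []
    for enhancer, activity in enhancer_activity.items():
        for gene, expression in gene_expression.items():
            r = pearson(activity, expression)
            if r >= threshold:
                links.append((enhancer, gene, r))
    return links


# Hypothetical profiles across nine cell types (1 = enhancer state called active).
enhancer_activity = {"enhA": [1, 1, 0, 0, 0, 0, 0, 0, 0]}
gene_expression = {
    "geneX": [9.0, 8.0, 1.0, 1.5, 0.5, 1.0, 1.2, 0.8, 1.0],
    "geneY": [1.0, 1.0, 6.0, 5.0, 1.0, 1.0, 1.0, 1.0, 1.0],
}

links = link_enhancers(enhancer_activity, gene_expression)
linked_pairs = [(enhancer, gene) for enhancer, gene, _ in links]
print(linked_pairs)  # → [('enhA', 'geneX')]
```

The same correlation idea extends to the regulator step on the slide, by comparing a TF's expression profile with the activity of enhancers that contain its motif.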

  11. Revisiting disease-associated variants
     • Disease-associated SNPs enriched for enhancers in relevant cell types
     • E.g. lupus SNP in GM enhancer disrupts Ets1 predicted activator
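
The slide does not say how enrichment was measured. As an illustration only, a common choice is a fold enrichment together with a one-sided hypergeometric test on the overlap between disease-associated SNPs and enhancer annotations. The function name and all counts below are hypothetical.

```python
from math import comb


def enhancer_enrichment(n_snps, n_in_enhancers, n_disease, n_disease_in_enhancers):
    """Fold enrichment and one-sided hypergeometric p-value for SNP/enhancer overlap.

    n_snps                 : all SNPs considered (the background)
    n_in_enhancers         : background SNPs falling in enhancers of one cell type
    n_disease              : disease-associated SNPs
    n_disease_in_enhancers : disease-associated SNPs falling in those enhancers
    """
    observed_fraction = n_disease_in_enhancers / n_disease
    expected_fraction = n_in_enhancers / n_snps
    fold = observed_fraction / expected_fraction

    # P(overlap >= observed) when n_disease SNPs are drawn without replacement.
    max_overlap = min(n_disease, n_in_enhancers)
    tail = sum(
        comb(n_in_enhancers, k) * comb(n_snps - n_in_enhancers, n_disease - k)
        for k in range(n_disease_in_enhancers, max_overlap + 1)
    )
    p_value = tail / comb(n_snps, n_disease)
    return fold, p_value


# Hypothetical counts: 10% of background SNPs lie in enhancers, 40% of disease SNPs do.
fold, p_value = enhancer_enrichment(1000, 100, 20, 8)
print(f"fold enrichment = {fold:.1f}, p = {p_value:.2e}")
```

Running the test separately for each cell type's enhancer set is what would let the "relevant cell types" stand out, since only those sets should show a high fold enrichment and a small p-value.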

  12. Contributions
     We aim to further our understanding of the human genome by computational integration of large-scale functional and comparative genomics datasets.
     • We use comparative genomics of multiple related species to recognize evolutionary signatures of protein-coding genes, RNA structures, microRNAs, regulatory motifs, and individual regulatory elements.
     • We use combinations of epigenetic modifications to define chromatin states associated with distinct functions, including promoter, enhancer, transcribed, and repressed regions, each with distinct functional properties.
     • We develop phylogenomic methods to study differences between species and to uncover evolutionary mechanisms for the emergence of new gene functions.
     Our methods have led to numerous new insights on diverse regulatory mechanisms, uncovered evolutionary principles, and provided mechanistic insights for previously uncharacterized disease-associated SNPs.
     [Figure: publication venues, including Science, Nature, Nature Biotech, Nature Gen, PLoS Genetics, Genes & Development, MBE, Genome Research, WBpress, PLoS Comp. Bio., PNAS, BioChem, BMC Evo. Bio., ACM TKDD, RECOMB, and J. Comp. Bio., plus work in review]
