
Genome-wide Discovery and Characterization of Chromatin States

Discover the landscape of chromatin states using ChromHMM software applied to diverse datasets from different cell types and genomic features. Learn about chromatin marks, models, and integrative analysis with ENCODE data.



Presentation Transcript


  1. Genome-wide Discovery and Characterization of Chromatin States. Jason Ernst, Kellis Lab / Bernstein ENCODE group, CSAIL MIT and Broad Institute

  2. Outline • Overview of ChromHMM for discovering chromatin states → Application to 41 chromatin marks in CD4+ T cells • Extending ChromHMM to multiple cell types → Application to Bernstein ENCODE Data • Extending ChromHMM to integrate diverse data types → Application to ENCODE Consortium Data

  3. Outline • Overview of ChromHMM for discovering chromatin states → Application to 41 chromatin marks in CD4+ T cells • Extending ChromHMM to multiple cell types → Application to Bernstein ENCODE Data • Extending ChromHMM to integrate diverse data types → Application to ENCODE Consortium Data

  4. Cartoon Illustration of ChromHMM. [Figure: a stretch of DNA containing an enhancer, a transcription start site, and a transcribed region, divided into 200bp intervals. Rows show the observed chromatin marks in each interval (e.g. H3K4me3, H3K4me1, H3K27ac, H3K36me3), the most likely hidden state for each interval (states 1 to 6), and the high-probability chromatin marks in each state with their probabilities (0.7 to 0.9).] All probabilities are learned from the data.
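The cartoon on slide 4 can be sketched as a small hidden Markov model. The block below is an illustration only, not ChromHMM's own code: it assumes each mark is binarized to present/absent per 200bp interval, that marks are emitted independently given the state, and that the most likely state path is found with Viterbi decoding. The two states, two marks, and all probabilities are hypothetical toy values, not learned parameters.

```python
import math

# Hypothetical toy model (illustration only): 2 states, 2 binarized marks.
MARKS = ["H3K4me3", "H3K36me3"]
EMIT = [            # EMIT[s][m] = P(mark m present | state s)
    [0.9, 0.1],     # state 0: promoter-like
    [0.1, 0.8],     # state 1: transcribed-like
]
TRANS = [[0.8, 0.2], [0.1, 0.9]]   # TRANS[r][s] = P(next state s | current state r)
INIT = [0.5, 0.5]                  # initial state probabilities


def log_emission(state, obs):
    """Log-probability of a 0/1 mark vector under independent Bernoulli emissions."""
    total = 0.0
    for p, present in zip(EMIT[state], obs):
        total += math.log(p if present else 1.0 - p)
    return total


def viterbi(observations):
    """Return the most likely state sequence for a list of 0/1 mark vectors."""
    n_states = len(INIT)
    score = [math.log(INIT[s]) + log_emission(s, observations[0])
             for s in range(n_states)]
    back = []  # back[t][s] = best predecessor of state s at step t + 1
    for obs in observations[1:]:
        new_score, pointers = [], []
        for s in range(n_states):
            best_prev = max(range(n_states),
                            key=lambda r: score[r] + math.log(TRANS[r][s]))
            pointers.append(best_prev)
            new_score.append(score[best_prev] + math.log(TRANS[best_prev][s])
                             + log_emission(s, obs))
        score = new_score
        back.append(pointers)
    # Trace back from the best final state.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


# Five 200bp intervals: (H3K4me3 present?, H3K36me3 present?)
bins = [(1, 0), (1, 0), (0, 1), (0, 1), (0, 1)]
path = viterbi(bins)
print(path)  # [0, 0, 1, 1, 1]
```

In a real analysis the emission and transition probabilities would be learned from the data rather than fixed by hand, as the slide states.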

  5. Data sources: chromatin marks from Barski et al., Cell 2007 and Wang et al., Nature Genetics 2008; DNaseI hypersensitivity from Boyle et al., Cell 2008; TF binding enrichment computed from 14 published TF binding experiments; expression data from Su et al., PNAS 2005

  6. Transition Matrix. [Figure: axes are State From and State To.]

  7. Core Promoter States. Different promoter states show distinct functional enrichment. [Figure: Proportion of State versus Distance to TSS.]

  8. Transcribed Regions. [Figure panels: Enrichment (decr.) and Enrichment (incr.), plotted against Percentage of Gene Length and against Position relative to exon start.]

  9. Intergenic Active Regions. [Figure: axis labels are Fold Enrichment, Percentage of Gene Length, and Relative to TSS.]

  10. Large-scale repressed and repetitive regions. Specific repeat elements are enriched for specific states. [Figure: transition matrix snapshot; axes are State From and State To, values are probability of transition.]

  11. Marks that have been profiled in several ENCODE cell types by the Broad Institute (PI: Bernstein)

  12. Outline • Overview of ChromHMM for discovering chromatin states → Application to 41 chromatin marks in CD4+ T cells • Extending ChromHMM to multiple cell types → Application to Bernstein ENCODE Data • Extending ChromHMM to integrate diverse data types → Application to ENCODE Consortium Data

  13. 10-state model learned from three ENCODE cell lines (K562, HUVEC, and NHEK data). Enrichments shown for HUVEC (other cell types similar).

  14. Comparing chromatin states across cell types. The CTCF island state (State 9) is highly stable across cell types. [Figure: pairwise state fold enrichments among HUVEC, NHEK, and K562; proportion of genome.]
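The slide does not define "pairwise state fold enrichment". One common reading, assumed here, is the observed number of bins assigned state a in one cell type and state b in another, divided by the number expected if the two assignments were independent. The sketch below follows that assumption; the function name and the toy state tracks are hypothetical.

```python
from collections import Counter


def pairwise_fold_enrichment(states_a, states_b):
    """Return {(sa, sb): observed / expected co-occurrence} for two aligned state tracks.

    Assumption: both tracks label the same genomic bins, in the same order.
    """
    if len(states_a) != len(states_b):
        raise ValueError("tracks must cover the same bins")
    n = len(states_a)
    count_a = Counter(states_a)
    count_b = Counter(states_b)
    joint = Counter(zip(states_a, states_b))
    enrichment = {}
    for (sa, sb), observed in joint.items():
        # Expected count if state assignments in the two cell types were independent.
        expected = count_a[sa] * count_b[sb] / n
        enrichment[(sa, sb)] = observed / expected
    return enrichment


# Hypothetical state labels for eight bins in two cell types.
huvec = [1, 1, 9, 7, 7, 7, 9, 1]
nhek = [1, 7, 9, 7, 7, 1, 9, 1]
fe = pairwise_fold_enrichment(huvec, nhek)
print(fe[(9, 9)])  # 4.0
```

A state that is stable across cell types shows a high value on the diagonal, such as the (9, 9) entry in this toy example.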

  15. Comparing chromatin states across cell types. GO enrichment for TSSs in the active promoter state (1) in NHEK and the unmodified state (7) in HUVEC. [Figure: panels for K562, HUVEC, and NHEK.]

  16. Extending ChromHMM to Multiple Cell Types. Three options: Concatenate Genomes (the K562 genome followed by the Gm12878 genome, each with features H3K4me3 and H3K27me3), Stack Features (one genome with features H3K4me3 in K562, H3K4me3 in Gm12878, H3K27me3 in K562, and H3K27me3 in Gm12878), and Independent Models (a separate model per cell type, each with features H3K4me3 and H3K27me3).
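The three layouts on slide 16 differ only in how the input matrix is arranged before a model is learned. The sketch below builds each layout from the same toy data; it shows the data arrangement, not ChromHMM's input format or training procedure. The binarized values and function names are hypothetical.

```python
# Hypothetical binarized tracks: one 0/1 value per 200bp bin, per mark, per cell type.
data = {
    "K562": {"H3K4me3": [1, 0, 0], "H3K27me3": [0, 1, 1]},
    "Gm12878": {"H3K4me3": [1, 1, 0], "H3K27me3": [0, 0, 1]},
}
MARKS = ["H3K4me3", "H3K27me3"]


def concatenate_genomes(data):
    """One shared model: rows are (cell type, bin) pairs, columns are marks."""
    rows = []
    for tracks in data.values():
        n_bins = len(tracks[MARKS[0]])
        for i in range(n_bins):
            rows.append(tuple(tracks[m][i] for m in MARKS))
    return rows


def stack_features(data):
    """One joint model: rows are bins, columns are (mark, cell type) pairs."""
    cells = list(data)
    n_bins = len(data[cells[0]][MARKS[0]])
    return [tuple(data[c][m][i] for m in MARKS for c in cells)
            for i in range(n_bins)]


def independent_models(data):
    """One model per cell type: a separate bins-by-marks matrix for each."""
    return {
        cell: [tuple(tracks[m][i] for m in MARKS)
               for i in range(len(tracks[MARKS[0]]))]
        for cell, tracks in data.items()
    }


concatenated = concatenate_genomes(data)  # 6 rows x 2 columns
stacked = stack_features(data)            # 3 rows x 4 columns
separate = independent_models(data)       # two 3 x 2 matrices
print(stacked)  # [(1, 1, 0, 0), (0, 1, 1, 0), (0, 0, 1, 1)]
```

Concatenation keeps one set of state definitions for every cell type, so state labels are directly comparable. Stacking lets a single state describe a combination of marks across cell types at the same position.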

  17. Outline • Overview of ChromHMM for discovering chromatin states → Application to 41 chromatin marks in CD4+ T cells • Extending ChromHMM to multiple cell types → Application to Bernstein ENCODE Data • Extending ChromHMM to integrate diverse data types → Application to ENCODE Consortium Data

  18. Integrating Diverse Data with ChromHMM. [Figure: a stretch of DNA containing an enhancer, a transcription start site, and a transcribed region, divided into 200bp intervals. Rows show the observed chromatin marks (e.g. H3K4me3, H3K4me1, H3K27ac, H3K36me3), additional genomic datasets (e.g. cMyc, DNaseI), the most likely hidden state for each interval (states 1 to 6), and the high-probability marks and datasets in each state with their probabilities (0.7 to 0.9).]

  19. Integrative Analysis on ENCODE Consortium Data

  20. State 21 Concentrated near Transcription Termination Sites. [Figure: distribution relative to nearest RefSeq TTS.]

  21. State 0 Highly Specific to TSS. [Figure: distribution relative to nearest RefSeq TSS.]
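A distribution "relative to the nearest TSS" (or TTS) can be sketched by taking every 200bp bin assigned to a state, measuring the signed distance from its center to the closest annotated site, and counting bins per distance bucket. The block below is a minimal sketch under simplifying assumptions: one chromosome, a non-empty site list, and no strand handling. Function names and coordinates are hypothetical.

```python
from bisect import bisect_left
from collections import Counter


def nearest_signed_distance(pos, sorted_sites):
    """Signed distance (pos - site) to the closest site in a sorted, non-empty list."""
    i = bisect_left(sorted_sites, pos)
    # Only the neighbors on either side of the insertion point can be the closest.
    candidates = sorted_sites[max(i - 1, 0): i + 1]
    nearest = min(candidates, key=lambda s: abs(pos - s))
    return pos - nearest


def distance_histogram(bin_starts, sites, bin_size=200):
    """Count state-assigned bins by distance bucket relative to the nearest site.

    Assumption: strand is ignored, so negative means a lower coordinate than the site.
    """
    sites = sorted(sites)
    hist = Counter()
    for start in bin_starts:
        center = start + bin_size // 2
        distance = nearest_signed_distance(center, sites)
        hist[(distance // bin_size) * bin_size] += 1  # left edge of the bucket
    return hist


# Hypothetical TSS coordinates and start positions of bins assigned to one state.
tss = [1000, 5000]
state_bins = [800, 1000, 4600]
hist = distance_histogram(state_bins, tss)
print(sorted(hist.items()))  # [(-400, 1), (-200, 1), (0, 1)]
```

A state that is highly specific to the TSS would concentrate its counts in the buckets closest to zero.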

  22-25. TSS of Genes of Distinct Function Enriched in Different States (four slides with the same title)

  26. Summary • ChromHMM learns de novo chromatin states from a large number of chromatin marks • Applications: single cell type, lots of marks → chromatin states; multiple cell types → chromatin dynamics; diverse input tracks → data integration • Going forward: integration with regulatory motifs; sequence determinants of chromatin / expression

  27. Acknowledgements. Members of the Kellis Lab and Bernstein ENCODE group: Manolis Kellis, Bradley Bernstein, Chuck Epstein, Pouya Kheradpour, Michael Lin, Tarjei Mikkelsen, Noam Shoresh. Funding: NIH/NHGRI

  28. Questions?
