Inferring Transcriptional Regulation Using Transcriptomics. Carsten O. Daub, September 1st, 2014. StratCan Summer School 2014, Vår Gård, Saltsjöbaden.
Overview – Levels of Regulation • Genome: SNPs; DNA modifications (e.g. methylation); structural alterations (e.g. genomic rearrangements) • Transcriptome: transcription factors, enhancers/insulators; promoter; RNA splicing; miRNA; posttranscriptional modifications (e.g. RNA editing); 3D structure of the genome • Protein: translation; posttranslational modifications • Metabolites
Central Dogma of Molecular Biology [Figure: DNA → transcription → RNA (including non-coding RNA) → translation → protein] Francis Crick, 1958
What is the transcriptome? • The ensemble of all expressed RNA • Protein coding genes • Non-protein coding genes
How is the Transcriptome regulated? • Via Promoter • Transcription factors • enhancers • insulators • RNA splicing • miRNA • Posttranscriptional modifications (e.g. RNA editing) • 3D structure of the genome
Transcription [Figure: RNA polymerase (Pol) on a DNA strand labelled 5’ and 3’] • The principle: DNA is copied into RNA by the RNA polymerase (Pol) • Transcription initiation is more complex in eukaryotes than in prokaryotes • In eukaryotes, several different factors are necessary for transcription from an RNA polymerase II promoter.
Initiation • Promoter clearance • Pol II stalling • Elongation • Termination (Figures from http://en.wikipedia.org/wiki/Transcription_(genetics))
Transcription Model [Figure: Pol on a DNA strand labelled 5’ and 3’] • Transcription produces the pre-mRNA (precursor) • Capping • Splicing • Polyadenylation • Result: mRNA with a poly(A) tail (AAAAAAAAAAA)
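As a toy illustration of the processing steps listed above, the sketch below turns a pre-mRNA string into a mature mRNA string. Everything in it is hypothetical: the function name `process_pre_mrna`, the textual `cap-` marker, the example sequence and the exon coordinates are made up for illustration, and real processing is of course not a string operation.

```python
def process_pre_mrna(pre_mrna, exons, poly_a_length=11):
    """Return a toy mature mRNA built from a pre-mRNA string.

    exons: list of (start, end) pairs, 0-based and half-open, in 5'->3' order.
    poly_a_length: arbitrary tail length chosen for the example.
    """
    # Splicing: keep the exons, drop everything between them (the introns).
    spliced = "".join(pre_mrna[start:end] for start, end in exons)
    # Capping is represented by a text marker at the 5' end,
    # polyadenylation by a run of A at the 3' end.
    return "cap-" + spliced + "A" * poly_a_length


# Hypothetical pre-mRNA: exon "AUG", intron "GUAAGUCAG", exon "GCUUAA".
pre = "AUGGUAAGUCAGGCUUAA"
print(process_pre_mrna(pre, [(0, 3), (12, 18)]))
```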
Transcription Factor (TF) Binding • TFs bind to specific sites in the DNA • Sets of TFs can function as cis-regulatory modules (CRM) Nature Reviews Genetics 5, 276-287 (April 2004)
Specific TF Binding • Transcription factors bind to specific DNA sequences • Databases of TF binding sequence motifs: JASPAR, TRANSFAC [Figure: IRF8 binding motif; IRF8 bound to DNA]
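Binding motifs from such databases are commonly represented as position-specific count matrices and used to scan a sequence for candidate sites. The sketch below shows the idea under stated assumptions: the 4-bp count matrix is invented (it is not the real IRF8 motif), the pseudocount and uniform background are arbitrary choices, and only the forward strand is scanned.

```python
import math

# Hypothetical count matrix for a 4-bp motif with consensus AGCA.
COUNTS = {
    "A": [8, 0, 0, 9],
    "C": [1, 0, 10, 0],
    "G": [1, 10, 0, 0],
    "T": [0, 0, 0, 1],
}


def log_odds_matrix(counts, pseudocount=0.5, background=0.25):
    """Convert counts to log2 odds against a uniform background."""
    width = len(counts["A"])
    matrix = []
    for i in range(width):
        total = sum(counts[b][i] for b in "ACGT") + 4 * pseudocount
        matrix.append({
            b: math.log2((counts[b][i] + pseudocount) / total / background)
            for b in "ACGT"
        })
    return matrix


def scan(sequence, matrix):
    """Score every window of the sequence (forward strand, A/C/G/T only)."""
    width = len(matrix)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = sum(matrix[i][base] for i, base in enumerate(window))
        hits.append((start, window, score))
    return hits


def best_hit(sequence, matrix):
    """Return the (start, window, score) tuple with the highest score."""
    return max(scan(sequence, matrix), key=lambda hit: hit[2])


pwm = log_odds_matrix(COUNTS)
print(best_hit("TTAGCATT", pwm))
```

In practice a score threshold decides which windows count as predicted sites; the threshold is a further assumption that the slides do not specify.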
Promoter Region [Figure: regions upstream of the transcription start site (TSS)] • Distal promoter [-10k, -250] • Proximal promoter [-250, -34] • Core promoter [-34, -1]
Promoter Region • Core promoter – the minimal portion of the promoter required to properly initiate transcription: contains the transcription start site (TSS); extends to approximately -34; a binding site for RNA polymerase; general transcription factor binding sites • Proximal promoter – the proximal sequence upstream of the gene that tends to contain primary regulatory elements: extends to approximately -250; specific transcription factor binding sites • Distal promoter – the distal sequence upstream of the gene that may contain additional regulatory elements, often with a weaker influence than the proximal promoter: anything further upstream (but not an enhancer or other regulatory region whose influence is position- and orientation-independent); specific transcription factor binding sites
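The approximate boundaries above can be turned into a small lookup. This is a sketch only: the function name is hypothetical, and because the slide's intervals share their end points (-34 and -250), the code assumes that a shared boundary belongs to the region closer to the TSS.

```python
def promoter_region(offset):
    """Classify a position given as an offset relative to the TSS.

    Negative offsets are upstream. Boundaries follow the approximate
    values on the slide; shared end points (-34, -250) are assigned to
    the region nearer the TSS, which is an assumption of this sketch.
    """
    if offset > -1:
        return "at or downstream of TSS"
    if offset >= -34:
        return "core"
    if offset >= -250:
        return "proximal"
    if offset >= -10_000:
        return "distal"
    return "outside promoter window"


for pos in (-10, -100, -5000, -20000, 12):
    print(pos, promoter_region(pos))
```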
Transcription in eukaryotes • In eukaryotes, several different factors are necessary for transcription from an RNA polymerase II promoter.
Identifying the TF regulators • How much is each TF binding site used? Inputs: observed expression of all genes; predicted site count • Motif Activity Response Analysis (MARA)
FANTOM4 – A Systems Approach • Monoblast-like THP-1 cells were stimulated by PMA to differentiate them into monocyte-like cells • 10 time point samples were collected during differentiation: 0, 1, 2, 4, 6, 12, 24, 48, 72 and 96 hours after PMA [Figure labels: Replicates RIKEN1, RIKEN3, RIKEN5, RIKEN6; Microarray check; Not good; Deep CAGE; TF qRT-PCR; Illumina (47K probes); miRNA microarray; 10 time points]
Cap Analysis of Gene Expression (CAGE) [Figure, based on [1]: CAGE library preparation → sequencing → CAGE data digital processing → tag cluster (TC)] [1] Carninci, P. et al. Genome-wide analysis of mammalian promoter architecture and evolution. Nature Genetics 38, 626–35 (2006)
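The digital processing step ends in tag clusters. A minimal sketch of that grouping is shown below, assuming each CAGE tag has already been mapped and reduced to its 5' end as a `(chromosome, strand, position)` tuple. The `max_gap` threshold and the function name are assumptions of this sketch, not the clustering rule of the cited paper.

```python
from collections import Counter


def tag_clusters(tag_starts, max_gap=20):
    """Group CAGE tag 5' ends into clusters on the same chromosome and strand.

    tag_starts: iterable of (chrom, strand, position) tuples, one per tag.
    max_gap: largest distance in bp between neighbouring positions that
             still joins them into one cluster (hypothetical default).
    """
    counts = Counter(tag_starts)
    clusters = []
    current = None
    # Sorting the tuples orders by chromosome, then strand, then position.
    for (chrom, strand, pos), n in sorted(counts.items()):
        same_locus = (
            current is not None
            and current["chrom"] == chrom
            and current["strand"] == strand
            and pos - current["end"] <= max_gap
        )
        if same_locus:
            current["end"] = pos
            current["tags"] += n
            if n > current["peak_count"]:
                current["peak"], current["peak_count"] = pos, n
        else:
            current = {
                "chrom": chrom, "strand": strand,
                "start": pos, "end": pos,
                "tags": n, "peak": pos, "peak_count": n,
            }
            clusters.append(current)
    return clusters


tags = [("chr1", "+", 100)] * 3 + [("chr1", "+", 105)] \
    + [("chr1", "+", 500)] * 2 + [("chr1", "-", 102)]
for tc in tag_clusters(tags):
    print(tc)
```

The tag count per cluster then serves as the expression level of the corresponding promoter.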
CAGE identifies the active set of promoters • Alternative promoter usage for PTPN6 [Figure: THP-1 promoter vs. HeLa promoter] Slide modified from Alistair Forrest. Kanamori-Katayama, Itoh, Kawaji et al. 2011 Genome Research. “Unamplified cap analysis of gene expression on a single-molecule sequencer”
Transcriptional Regulation • A. TFBS prediction: predicted binding sites for TF A in the CAGE promoters of genes B, C and D • B. Co-expression: number of CAGE tags in each promoter over the time course (0h to 96h), compared with the expression of TF A (×: average expression) • Total score = TFBS prediction × co-expression • Total score: TF A → promoter B: High; TF A → promoter C: High; TF A → promoter D: Low
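The total score is the product of the two lines of evidence. The sketch below assumes that co-expression is measured as the Pearson correlation between the TF's expression and the promoter's CAGE tag counts over the time course; the slide does not name the measure, so that choice, the function names and all numbers are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equally long series (0.0 if one is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0
    return cov / (sd_x * sd_y)


def total_score(tfbs_score, tf_expression, promoter_expression):
    """Total score = TFBS prediction score x co-expression."""
    return tfbs_score * pearson(tf_expression, promoter_expression)


# Made-up time courses (e.g. four of the time points between 0h and 96h).
tf_a = [1.0, 2.0, 3.0, 4.0]
promoter_b = [2.0, 4.0, 6.0, 8.0]   # follows TF A
promoter_d = [5.0, 5.0, 5.0, 5.0]   # flat, no co-expression

print(total_score(0.9, tf_a, promoter_b))
print(total_score(0.9, tf_a, promoter_d))
```

A strong site prediction therefore contributes nothing if the promoter does not follow the TF over time, and vice versa.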
Motif Activity Response Analysis – MARA [Figure: motif sites (m1 to m5) on Promoter1, Promoter2, ..., PromoterX in the genome; expression e_ps] • Reaction efficiency • Number of possible binding sites • Degree of conservation of the motif • Chromatin status • Effective concentration • THP-1 cells are a monoblastic leukemia cell line which upon PMA treatment can differentiate into an adherent monocyte-like cell (CD14+, CSF1R+) Suzuki, Forrest, van Nimwegen et al. Nature Genetics 2009, 41:5
Motif Activity Response Analysis • How much is each binding site used? Inputs: observed expression of all promoters over time; predicted site count Suzuki, Forrest, van Nimwegen et al. Nature Genetics 2009, 41:5
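One way to read the two inputs above is as a linear model: the expression e_ps of promoter p in sample s is approximated by a promoter-specific offset plus the sum over motifs m of (predicted site count N_pm × motif activity A_ms). The sketch below fits the activities by ordinary least squares in plain Python. It is a simplified stand-in and not the published procedure, which is more elaborate; the function names and the toy numbers are hypothetical, and sample-specific offsets are left out.

```python
def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting.

    Assumes a is square and non-singular.
    """
    n = len(a)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def motif_activities(site_counts, expression):
    """Least-squares motif activities.

    site_counts[p][m]: predicted number of sites of motif m in promoter p.
    expression[p][s]:  observed expression of promoter p in sample s.
    Returns activities[m][s], relative to each motif's mean over samples.
    """
    n_motifs = len(site_counts[0])
    n_samples = len(expression[0])
    # Subtract each promoter's mean to remove the promoter-specific offset.
    centred = [[v - sum(row) / len(row) for v in row] for row in expression]
    # Normal equations: (N^T N) a_s = N^T e_s, solved for every sample s.
    ntn = [[sum(row[i] * row[j] for row in site_counts)
            for j in range(n_motifs)] for i in range(n_motifs)]
    activities = [[0.0] * n_samples for _ in range(n_motifs)]
    for s in range(n_samples):
        nte = [sum(row[i] * e[s] for row, e in zip(site_counts, centred))
               for i in range(n_motifs)]
        for m, value in enumerate(solve(ntn, nte)):
            activities[m][s] = value
    return activities


# Toy data: 4 promoters, 2 motifs, 3 time points.
N = [[1, 0], [0, 1], [1, 1], [2, 0]]
E = [[6, 5, 4], [1, 3, 5], [-1, 0, 1], [3, 1, -1]]
for motif, row in enumerate(motif_activities(N, E)):
    print("motif", motif, [round(v, 3) for v in row])
```

A motif whose activity rises over the time course points to a TF whose sites are being used more as differentiation proceeds.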
Nat Genet. 2009 May;41(5):553-62.
Enhancers • Enhancers are sequence motifs • They bind factors (proteins) that participate in the transcription initiation complex • Enhancers can be many kb away from the TSS • Insulators act in a similar way, but repress expression • Is an enhancer a gene?
Enhancer RNA • ENCODE reported (Nature, 489(7414), 101–108): enhancers identified by co-occurrence of H3K27ac and H3K4me1 ChIP-Seq data, centred on P300 binding sites, in HeLa cells • Enhancers make non-coding RNA (Nature 465, 173–174 (2010)) • Kim, T. K. et al. Widespread transcription at neuronal activity-regulated enhancers. Nature 465, 182–187 (2010)
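Co-occurrence of two histone marks can be expressed as an interval intersection of their ChIP-Seq peak lists. The sketch below assumes peaks are sorted, non-overlapping, half-open `(start, end)` intervals on a single chromosome; the coordinates are made up, and the centring on P300 binding sites mentioned above is not modelled.

```python
def intersect(peaks_a, peaks_b):
    """Return the regions covered by both peak lists.

    Both inputs: sorted, non-overlapping (start, end) half-open intervals
    on the same chromosome.
    """
    i = j = 0
    shared = []
    while i < len(peaks_a) and j < len(peaks_b):
        start = max(peaks_a[i][0], peaks_b[j][0])
        end = min(peaks_a[i][1], peaks_b[j][1])
        if start < end:
            shared.append((start, end))
        # Advance whichever interval ends first.
        if peaks_a[i][1] < peaks_b[j][1]:
            i += 1
        else:
            j += 1
    return shared


# Hypothetical peak calls for the two marks.
h3k27ac = [(100, 300), (1000, 1200)]
h3k4me1 = [(250, 500), (900, 1100), (2000, 2100)]
print(intersect(h3k27ac, h3k4me1))
```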
Djebali, S., Davis, C. A., Merkel, A., Dobin, A., Lassmann, T., Mortazavi, A., et al. (2012). Landscape of transcription in human cells. Nature, 489(7414), 101–108. doi:10.1038/nature11233
RNA splicing in cancer http://en.wikipedia.org/wiki/RNA_splicing
Example: Melanoma Transcriptome • discovery of aberrations that contribute to carcinogenesis • characterize the spectrum of cancer-associated mRNA alterations through integration of transcriptomic and structural genomic data • 11 novel melanoma gene fusions produced by underlying genomic rearrangements • 12 novel readthrough transcripts Genome Res. 2010 Apr;20(4):413-27
Melanoma Transcriptome: Gene Fusion Connecting genes located on different chromosomes!
Gene fusions are ‘private’ • The same gene fusion was not observed in two melanoma patients (10 samples total) • Gene fusions in melanoma might not be the cancer-causing events but rather their consequences
Chromosome Structure Ref: http://www.sequentiabiotech.com/
http://en.wikipedia.org/wiki/Chromosome_conformation_capture
Mouse ES cells Dixon, J. R., Selvaraj, S., Yue, F., Kim, A., Li, Y., Shen, Y., et al. (2012). Topological domains in mammalian genomes identified by analysis of chromatin interactions. Nature. doi:10.1038/nature11082
Remote ER-α chromatin binding sites are anchored at gene promoters through long-range chromatin interactions • suggesting that ER-α functions by extensive chromatin looping to bring genes together for coordinated transcriptional regulation Nature. 2009 Nov 5;462(7269):58-64
Polymerase II Stalling [Figure: genes classified as stalled, active, or no binding] Nature Genetics 39, 1512–1516 (2007) • Pol II ChIP-chip in Drosophila embryos • Stalled genes are highly enriched in developmental control genes
From observations to mechanisms • Observations => Biomarkers