Genome Biology and Biotechnology 9. The localizome Prof. M. Zabeau Department of Plant Systems Biology Flanders Interuniversity Institute for Biotechnology (VIB) University of Gent International course 2005
Summary • DNA localizome or DNA interactome • Genome-wide mapping of DNA binding proteins • Transcription factor binding sites • Localization of replication origins • Protein localizome • High throughput localization of proteins in cellular compartments
Functional Maps or “-omes” [Figure: matrix of genes or proteins (1 to n) against “conditions”, listing the functional maps: ORFeome (genes), Phenome (mutational phenotypes), Transcriptome (expression profiles), DNA interactome (protein-DNA interactions), Localizome (cellular, tissue location), Interactome (protein interactions), Proteome (proteins)] After: Vidal M., Cell, 104, 333 (2001)
Genome-wide Analysis of Regulatory Sequences • Gene expression is regulated by transcription factors selectively binding to regulatory regions • Protein–DNA interactions involve sequence-specific recognition • Other factors, such as chromatin structure, may be involved • Sequence-specific DNA-binding proteins from eukaryotes generally • recognize degenerate motifs of 5–10 base pairs • Consequently, potential recognition sequences for transcription factors occur frequently throughout the genome • Genome-wide surveys of in vivo DNA binding proteins • provide a platform to determine which of these potential sites are actually bound in vivo
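The point that short motifs occur frequently by chance can be made concrete with a back-of-the-envelope calculation. The sketch below assumes a random sequence with equal base frequencies, exact (non-degenerate) matching, and a round 12 Mb figure for the yeast genome; none of these numbers come from the slides, and degenerate positions would raise the counts further.

```python
def expected_motif_hits(genome_bp: int, motif_len: int, both_strands: bool = True) -> float:
    """Expected number of chance matches of one exact k-mer.

    Assumes independent bases with equal frequencies (1/4 each),
    which is a simplification of any real genome.
    """
    positions = genome_bp - motif_len + 1   # possible start positions
    strands = 2 if both_strands else 1      # a site can lie on either strand
    return strands * positions / 4 ** motif_len


GENOME_BP = 12_000_000  # assumed round figure for the yeast genome size

for k in (5, 6, 8, 10):
    hits = expected_motif_hits(GENOME_BP, k)
    print(f"{k}-bp motif: about {hits:,.0f} chance matches")
```

Even a fully specified 10-bp site is expected to appear by chance, so sequence matches alone cannot identify the sites a factor actually occupies.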
Genome-wide Analysis of Regulatory Sequences • Methods combine • Large-scale analysis of in vivo protein–DNA crosslinking • Microarray technology • ChIP-on-chip • Chromatin Immuno-Precipitation on DNA chips Reprinted from: Biggin M., Nature Genet. 28, 303 (2001)
Genome-Wide Location and Function of DNA Binding Proteins • Paper presents • proof of principle for microarray-based approaches to determine the genome-wide location of DNA-bound proteins • Study of the binding sites of two well-known gene-specific transcription activators in yeast: Gal4 and Ste12 • Combines data from • in vivo DNA binding analysis with • expression analysis • to identify genes whose expression is directly controlled by these transcription factors Ren et al., Science, 290, 2306 (2000)
Chromatin Immuno-Precipitation (ChIP) Procedure • Cells are fixed with formaldehyde, harvested, and sonicated • DNA fragments cross-linked to a protein of interest are enriched by immunoprecipitation with a specific antibody • Immunoprecipitated DNA is amplified and labeled with the fluorescent dye Cy5 • Control DNA not enriched by immunoprecipitation is amplified and labeled with a different fluorophore, Cy3 • DNAs are mixed and hybridized to a microarray of intergenic sequences • The relative binding of the protein of interest to each sequence is calculated from the IP-enriched/unenriched ratio of fluorescence from 3 experiments Reprinted from: Ren et al., Science, 290, 2306 (2000)
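The final step, turning Cy5 and Cy3 intensities from three experiments into one binding score per spot, might be sketched as follows. This is a simplified illustration that averages log2 ratios with a small pseudocount; it does not reproduce the statistical treatment used in the paper, and the spot names and intensities are made up.

```python
import math
from statistics import mean


def log2_enrichment(cy5: float, cy3: float, pseudocount: float = 1.0) -> float:
    """log2 of the IP-enriched (Cy5) over unenriched (Cy3) intensity.

    The pseudocount is an assumption of this sketch; it avoids
    division by zero for empty spots.
    """
    return math.log2((cy5 + pseudocount) / (cy3 + pseudocount))


def mean_enrichment(replicates):
    """Average the log2 ratio of one spot over replicate experiments.

    replicates: list of (cy5, cy3) intensity pairs.
    """
    return mean(log2_enrichment(cy5, cy3) for cy5, cy3 in replicates)


# Hypothetical intensities for two intergenic spots, three experiments each.
spots = {
    "intergenic_A": [(5200.0, 610.0), (4800.0, 590.0), (5100.0, 640.0)],
    "intergenic_B": [(700.0, 680.0), (650.0, 720.0), (690.0, 700.0)],
}

for name, reps in spots.items():
    print(name, round(mean_enrichment(reps), 2))
```

Averaging in log space treats a twofold enrichment and a twofold depletion symmetrically, which a plain mean of ratios would not.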
Modified Chromatin Immuno-Precipitation (ChIP) Procedure • Close-up of a scanned image of a microarray containing 6361 intergenic region DNA fragments of the yeast genome • ChIP-enriched DNA fragment Reprinted from: Ren et al., Science, 290, 2306 (2000)
Proof of concept: Gal4 transcription factor • Identification of sites bound by the transcriptional activator Gal4 in the yeast genome and genes induced by galactose • Gal4 activates genes necessary for galactose metabolism • The best characterized transcription factor in yeast • 10 genes were bound by Gal4 and induced in galactose • 7 genes in the Gal pathway, previously reported to be regulated by Gal4 • 3 novel genes: MTH1, PCL10, and FUR4 Reprinted from: Ren et al., Science, 290, 2306 (2000)
Genome-wide location of Gal4 protein • Genes whose promoter regions are bound by Gal4 and whose expression levels were induced at least twofold by galactose Reprinted from: Ren et al., Science, 290, 2306 (2000)
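The selection rule on this slide (promoter bound, and expression induced at least twofold) amounts to intersecting two data sets. A minimal sketch, with hypothetical fold-change values and a hypothetical `GENE_X` added only to show a gene that fails the filter:

```python
def direct_targets(bound_genes, expression_fold, min_fold=2.0):
    """Genes that are both bound by the factor and induced at least min_fold.

    bound_genes: set of gene names whose promoter region is bound.
    expression_fold: dict mapping gene name to induction ratio.
    Genes without expression data are treated as not induced.
    """
    return sorted(
        gene for gene in bound_genes
        if expression_fold.get(gene, 0.0) >= min_fold
    )


# Illustrative inputs only: the fold changes below are invented.
bound = {"MTH1", "PCL10", "FUR4", "GENE_X"}
fold_change = {"MTH1": 3.1, "PCL10": 2.4, "FUR4": 2.0, "GENE_X": 1.2}

print(direct_targets(bound, fold_change))
```

Binding without induction (as for `GENE_X` here) is kept out of the target list, which is what distinguishes directly controlled genes from sites that are merely occupied.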
Role of Gal4 in Galactose-dependent Cellular Regulation • The identification of MTH1, PCL10, and FUR4 as Gal4-regulated genes explains how regulation of several different metabolic pathways can be coordinated • Fur4: increases intracellular pools of uracil • Mth1: reduces levels of glucose transporter • [Figure: Fur4, Pcl10 and Mth1 placed in the galactose-dependent regulatory network] Reprinted from: Ren et al., Science, 290, 2306 (2000)
Conclusions • The genes whose expression is controlled directly by transcriptional activators in vivo • Are identified by a combination of genome-wide location and expression analysis • Genome-wide location analysis provides information • On the binding sites at which proteins reside in the genome under in vivo conditions
Genomic Binding Sites of the Yeast Cell-cycle Transcription Factors SBF and MBF • Paper presents • The use of ChIP and DNA microarrays to define the genomic binding sites of the SBF and MBF transcription factors in vivo • The SBF and MBF transcription factors are active in the initiation of the cell division cycle (G1/S) in yeast • A few target genes of SBF and MBF are known but the precise roles of these two transcription factors are unknown • The two transcription factors are heterodimers containing the same Swi6 subunit and a DNA binding subunit • MBF is a heterodimer of Mbp1 and Swi6 • SBF is a heterodimer of Swi4 and Swi6 Iyer et al., Nature 409: 533 (2001)
Genomic targets of SBF and MBF Reprinted from: Iyer et al., Nature 409: 533 (2001)
In Vivo Targets of SBF and MBF • The ChIP experiments identified • 163 possible targets of SBF • 87 possible targets of MBF • 43 possible targets of both factors • Support for the possible in vivo targets • Most of the genes downstream of the putative binding sites peak in G1/S • Target genes are highly enriched for functions related to DNA replication, budding and the cell cycle • In vivo binding sites are highly enriched for sequences matching the defined consensus binding sites Reprinted from: Iyer et al., Nature 409: 533 (2001)
Expression Profiles of SBF and MBF Targets • Transcriptome data for synchronized cell cultures Reprinted from: Iyer et al., Nature 409: 533 (2001)
Expression Profiles of SBF and MBF Targets • Why are two different transcription factors used to mediate identical transcriptional programmes during the cell-division cycle in yeast? • A possible answer is suggested by differences in the functions of the genes that they regulate • Many of the targets of SBF have roles in cell-wall biogenesis and budding • 25% of the MBF target genes have known roles in DNA replication, recombination and repair • The results support a model in which • SBF is the principal controller of membrane and cell-wall formation • MBF primarily controls DNA replication • The need for DNA replication and membrane / cell-wall biogenesis may be different in the mitotic and meiotic cell cycle Reprinted from: Iyer et al., Nature 409: 533 (2001)
A high-resolution map of active promoters in the human genome Kim et al., Nature 436: 876-880 (2005) • Paper presents • a genome-wide map of active promoters in human fibroblast cells • determined by experimentally locating the sites of RNA polymerase II preinitiation complex (PIC) binding • map defines 10,567 active promoters corresponding to • 6,763 known genes • >1,196 un-annotated transcriptional units • Global view of functional relationships in human cells between • transcriptional machinery • chromatin structure • gene expression
Identification of active promoters in the human genome • Microarrays cover • All non-repeat DNA at 100 bp resolution • Pol II preinitiation complex (PIC) comprises • RNA polymerase II • transcription factor IID • general transcription factors • ChIP of PIC-bound DNA • monoclonal antibody against the TAF1 subunit of the complex (TBP-associated factor 1) Reprinted from: Kim et al., Nature 436: 876-880 (2005)
Results from TFIID ChIP-on-chip analysis Reprinted from: Kim et al., Nature 436: 876-880 (2005)
Characterization of active promoters • Matched the 12,150 TFIID-binding sites to • the 5' end of known transcripts in transcript databases • 87% of the PIC-binding sites were within 2.5 kb of annotated 5' ends of known messenger RNAs • 8,960 promoters were mapped • within annotated boundaries of 6,763 known genes in the EnsEMBL gene set Reprinted from: Kim et al., Nature 436: 876-880 (2005)
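Matching binding sites to annotated 5' ends is essentially a nearest-neighbour lookup on sorted coordinates. The sketch below assumes a single chromosome, integer base-pair coordinates, and a non-empty annotation list; the 2.5 kb window is the one quoted on the slide, while the function names and example positions are hypothetical.

```python
from bisect import bisect_left


def nearest_tss_distance(site: int, sorted_tss: list) -> int:
    """Distance from a binding site to the closest annotated 5' end.

    sorted_tss must be sorted in ascending order and non-empty.
    """
    i = bisect_left(sorted_tss, site)
    # Only the neighbours on either side of the insertion point can be closest.
    candidates = sorted_tss[max(i - 1, 0): i + 1]
    return min(abs(site - tss) for tss in candidates)


def fraction_near_tss(sites, tss_positions, window: int = 2500) -> float:
    """Fraction of binding sites lying within `window` bp of a 5' end."""
    sorted_tss = sorted(tss_positions)
    near = sum(1 for s in sites if nearest_tss_distance(s, sorted_tss) <= window)
    return near / len(sites)


# Made-up coordinates on one chromosome.
binding_sites = [1000, 10000, 50000]
annotated_5prime_ends = [1500, 12000]

print(fraction_near_tss(binding_sites, annotated_5prime_ends))
```

Sites that fall outside the window, like the third one here, are the candidates for promoters of un-annotated transcripts.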
The chromatin-modification features of the active promoters • Validation of active promoters • ChIP-on-chip using an anti-RNAP antibody • ChIP-on-chip analysis using • anti-acetylated histone H3 (AcH3) antibodies • anti-dimethylated lysine 4 on histone H3 (MeH3K4) antibodies • known epigenetic markers of active genes Reprinted from: Kim et al., Nature 436: 876-880 (2005)
TFIID, RNAP, AcH3 and MeH3K4 profiles on the promoter of the RPS24 gene Reprinted from: Kim et al., Nature 436: 876-880 (2005)
Additional findings • Promoters of non-coding transcripts • Are very similar to promoters of protein coding genes • Promoters of novel genes • An estimated 13% of human genes remain to be annotated in the genome • Clustering of active promoters • co-regulated genes tend to be organized into coordinately regulated domains • Genes using multiple promoters Reprinted from: Kim et al., Nature 436: 876-880 (2005)
Multiple promoters in human genes • WEE1 gene locus • Two different transcripts with alternative 5' ends • Encoding different proteins • Two different TFIID-binding sites: two promoters • Differential transcription during the cell cycle Reprinted from: Kim et al., Nature 436: 876-880 (2005)
The transcriptome of a cell line • Functional relationship between transcription machinery and gene expression • correlated genome-wide expression profiles with PIC promoter occupancy • Four general classes of promoters • Actively transcribed genes • Weakly expressed genes • Weakly PIC bound genes • Inactive genes Reprinted from: Kim et al., Nature 436: 876-880 (2005)
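The four classes follow from crossing two yes/no measurements per gene: whether the promoter is PIC-bound and whether a transcript is detected. The sketch below is one reading of the slide labels (bound but no transcript taken as "weakly expressed", transcript but no detected binding taken as "weakly PIC bound"); that mapping is an assumption, and the example calls are invented.

```python
from collections import Counter


def promoter_class(pic_bound: bool, expressed: bool) -> str:
    """Assign one of the four promoter classes from two boolean calls.

    The mapping of the two mixed cases onto the slide's labels is an
    interpretation made for this sketch.
    """
    if pic_bound and expressed:
        return "actively transcribed"
    if pic_bound:
        return "weakly expressed"      # PIC detected, transcript not detected
    if expressed:
        return "weakly PIC bound"      # transcript detected, PIC not detected
    return "inactive"


def tally_classes(genes) -> Counter:
    """Count genes per class; genes is an iterable of (pic_bound, expressed)."""
    return Counter(promoter_class(bound, expr) for bound, expr in genes)


# Hypothetical calls for five genes.
calls = [(True, True), (True, False), (False, True), (False, False), (True, True)]
print(tally_classes(calls))
```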
Genome-Wide Distribution of ORC and MCM Proteins in yeast: High-Resolution Mapping of Replication Origins • Paper presents • Genome-wide location analysis to map the DNA replication origins in the 16 yeast chromosomes by determining the binding sites of prereplicative complex proteins Wyrick et al., Science, 294, 2357 (2001)
Chromosome Replication In Eukaryotic Cells • Chromosome replication • initiates from origins of replication distributed along chromosomes • Origins of replication comprise autonomously replicating sequences (ARS) • ARS contain an 11-bp ARS consensus sequence (ACS) • Essential for replication initiation • Recognized by the Origin Recognition Complex (ORC) • The majority of sequence matches to the ACS in the genome do not have ARS activity • Prereplicative complexes at replication origins comprise • Origin Recognition Complex (ORC) proteins • Minichromosome Maintenance (MCM) proteins Reprinted from: Wyrick et al., Science, 294, 2357 (2001)
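To see why ACS matches alone are a poor predictor of origins, one can scan a sequence for the consensus on both strands. The sketch below uses the commonly cited 11-bp form WTTTAYRTTTW (W = A/T, Y = T/C, R = A/G); that pattern is written from general knowledge and is an assumption here, since the slides do not spell out the consensus. The test sequence is made up.

```python
import re

# Lookahead so that overlapping matches are all reported.
ACS_PATTERN = re.compile(r"(?=([AT]TTTA[TC][AG]TTT[AT]))")
ACS_LENGTH = 11
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def acs_matches(seq: str):
    """Return (start, strand) for every ACS match on either strand.

    start is the 0-based leftmost coordinate of the match on the
    given (forward) sequence, regardless of strand.
    """
    seq = seq.upper()
    hits = [(m.start(), "+") for m in ACS_PATTERN.finditer(seq)]
    rc = reverse_complement(seq)
    hits += [
        (len(seq) - m.start() - ACS_LENGTH, "-")
        for m in ACS_PATTERN.finditer(rc)
    ]
    return sorted(hits)


print(acs_matches("GGGATTTATATTTACCC"))
```

Run over a whole genome, a scan like this returns many more candidate sites than there are functional origins, which is why the paper maps ORC and MCM binding directly.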
Prereplicative Complexes At Origins Of Replication Reprinted from: Stillman, Science, 294, 2301(2001)
ORC- and MCM-binding sites compared with known ARSs • High degree of correlation between MCM and ORC binding sites and known ARSs • Correct identification of 88% of known ARSs • The method can accurately identify the position of ARSs to a resolution of 1 kb or less Reprinted from: Wyrick et al., Science, 294, 2357 (2001)
Genome-wide Location Of Potential Replication Origins • Identification of 429 potential origins across the entire genome Reprinted from: Wyrick et al., Science, 294, 2357 (2001)
Conclusions • The ChIP-based method identified the majority of origins found in the analysis of genome-wide replication timing in yeast • and provides direct, high-resolution mapping of potential origins • Similar approaches identified origins in other organisms • For example: Coordination of replication and transcription along a Drosophila chromosome • MacAlpine et al., Genes & Dev. 18: 3094-3105 (2004) Reprinted from: Wyrick et al., Science, 294, 2357 (2001)
Functional Maps or “-omes” [Figure repeated from the start of the lecture, introducing the Localizome (cellular, tissue location) map] After: Vidal M., Cell, 104, 333 (2001)
Global analysis of protein localization in budding yeast Huh et al., Nature 425, 686-691 (2003) • Paper presents • An approach to define the organization of proteins in the context of cellular compartments involving • the construction and analysis of a collection of yeast strains expressing full-length, chromosomally tagged green fluorescent protein fusion proteins
Experimental Strategy • Systematic tagging of yeast ORFs with green fluorescent protein (GFP) • GFP is fused to the carboxy terminus of each ORF • Full-length fusion proteins are expressed from their native promoters and chromosomal location • The collection of yeast strains expressing GFP fusions was analyzed by • fluorescence microscopy to determine the primary subcellular localization of the fusion proteins • Defines 12 categories • co-localization with red fluorescent protein (RFP) markers to refine the subcellular localization • Defines 11 additional categories Reprinted from: Huh et al., Nature 425, 686-691 (2003)
Construction of GFP fusion proteins • For each ORF a pair of PCR primers was designed • Homologous to the chromosomal insertion site • Matching a GFP-selectable marker construct • Yeast was transformed with the PCR products to generate • Strains expressing chromosomally tagged ORFs Reprinted from: Huh et al., Nature 425, 686-691 (2003)
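The primer design described here can be sketched as string slicing around the stop codon: each primer carries a stretch homologous to the chromosomal insertion site followed by a stretch that anneals to the GFP-marker cassette. The 40-nt homology length, the function name, and the cassette annealing sequences below are all hypothetical placeholders, not values taken from the paper.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def tagging_primers(locus: str, orf_end: int, fwd_anneal: str,
                    rev_anneal: str, homology: int = 40):
    """Design primers for an in-frame C-terminal tag.

    locus: coding-strand sequence around the ORF.
    orf_end: 0-based index just past the stop codon.
    fwd_anneal / rev_anneal: cassette-annealing 3' ends (placeholders).
    The forward primer ends just before the stop codon, so the tag is
    fused to the last codon; the reverse primer matches the region
    immediately downstream of the stop codon.
    """
    stop_start = orf_end - 3
    if stop_start < homology or orf_end + homology > len(locus):
        raise ValueError("not enough flanking sequence for the homology arms")
    upstream = locus[stop_start - homology: stop_start]
    downstream = locus[orf_end: orf_end + homology]
    forward = upstream + fwd_anneal
    reverse = reverse_complement(downstream) + rev_anneal
    return forward, reverse


# Toy locus: 10 nt upstream, a short ORF, stop codon, 10 nt downstream.
toy_locus = "A" * 10 + "ATG" + "GCT" * 5 + "TAA" + "C" * 10
print(tagging_primers(toy_locus, orf_end=31, fwd_anneal="ACGT",
                      rev_anneal="TTGG", homology=6))
```

Because the homology arms direct integration to the native locus, the fusion stays under its own promoter, which is the property the experimental strategy relies on.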
Representative GFP Images • Nucleus • ER • Nuclear periphery • Bud neck • Lipid particle • Mitochondrion Reprinted from: Huh et al., Nature 425, 686-691 (2003)
GFP and RFP Co-localization Images • Nucleolar marker Reprinted from: Huh et al., Nature 425, 686-691 (2003)
Global results • 22 categories • Constructed ~6,000 ORF-GFP fusions • 4,156 had localizable GFP signals (~75% of the yeast proteome) • Good concordance with data from earlier studies • indicating that the GFP tag generally does not alter localization • Localized 70% of the previously unlocalized proteins • Major compartments: cytoplasm (30%) and the nucleus (25%) • 20 other compartments: 44% of the proteins • Most of the proteins can be located in discrete cellular compartments Reprinted from: Huh et al., Nature 425, 686-691 (2003)
The proteome of the nucleolus • Detected 164 proteins in the nucleolus • Plus 45 identified in other studies • Data are consistent with MS analysis of human nucleolar proteins • Allows identification of yeast-human orthologs Reprinted from: Huh et al., Nature 425, 686-691 (2003)
Transcriptional co-regulation and subcellular localization are correlated [Figure: subcellular localization plotted against 33 transcription modules of co-regulated genes] Reprinted from: Huh et al., Nature 425, 686-691 (2003)
Conclusion • The high-resolution, high-coverage localization data set • represents 75% of the yeast proteome • classified into 22 distinct subcellular localization categories • Analysis of these proteins • in the context of transcriptional, genetic, and protein–protein interaction data • provides a comprehensive view of interactions within and between organelles in eukaryotic cells • helps reveal the logic of transcriptional co-regulation Reprinted from: Huh et al., Nature 425, 686-691 (2003)
Recommended reading • DNA-interactome • Genome-Wide Location of DNA Binding Proteins • Ren et al., Science, 290, 2306 (2000) • Map of active promoters in the human genome • Kim et al., Nature 436: 876-880 (2005) • Global analysis of protein localization in yeast • Huh et al., Nature 425, 686-691 (2003)
Further reading • Genome-Wide Location of DNA Binding Proteins • Genomic Binding Sites of the Yeast Cell-cycle Transcription Factors SBF and MBF • Iyer et al., Nature 409: 533 (2001) • High-Resolution Mapping of Replication Origins • Wyrick et al., Science, 294, 2357 (2001)