350 likes | 538 Views
Protein Networks / Protein Complexes. Protein networks could be defined in a number of ways (1) Co-regulated expression of genes/proteins (2) Proteins participating in the same metabolic pathways (3) Proteins sharing substrates (4) Proteins that are co-localized
E N D
Protein Networks / Protein Complexes • Protein networks could be defined in a number of ways • (1) Co-regulated expression of genes/proteins • (2) Proteins participating in the same metabolic pathways • (3) Proteins sharing substrates • (4) Proteins that are co-localized • (5) Proteins that form permanent supracomplexes = „protein machines“ • (6) Proteins that bind eachother transiently • (signal transduction, bioenergetics ... ) • In the next weeks, we will consider direct interactions (5) and (6). Bioinformatics III
Structural Techniques Russell et al. Curr. Opin. Struct. Biol. 14, 313 (2004) Bioinformatics III
Potential pitfalls Russell et al. Curr. Opin. Struct. Biol. 14, 313 (2004) Bioinformatics III
Hybrid models: docking X-ray structures into EM maps Russell et al. Curr. Opin. Struct. Biol. 14, 313 (2004) Bioinformatics III
A biological cell: a large construction site? In a biological cell there are many tasks that need to be executed in a timely and precise manner. Job office publishes lists (DNA) of people looking for jobs (protein). Managers from the personnel office (DNA-transcription factors) recruit (express) proteins. Workers (proteins) need to get to their working places (localization). During work they get energy from drinking beer (ATP). All steps depend on interaction of proteins with DNA or with other proteins! Bioinformatics III
1 Protein-Protein Complexes It has been realized for quite some time that cells don‘t work by random diffusion of proteins, but require a delicate structural organization into large protein complexes. Which complexes do you know? Bioinformatics III
RNA Polymerase II RNA polymerase II is the central enzyme of gene expression and synthesizes all messenger RNA in eukaryotes. Cramer et al., Science 288, 640 (2000) Bioinformatics III
RNA processing: splicesome Structure of a cellular editor that "cuts and pastes" the first draft of RNA straight after it is formed from its DNA template. It has two distinct, unequal halves surrounding a tunnel. The larger part appears to contain proteins and the short segments of RNA, while the smaller half is made up of proteins alone. On one side, the tunnel opens up into a cavity, which the researchers think functions as a holding space for the fragile RNA waiting to be processed in the tunnel itself. Profs. Ruth and Joseph Sperling http://www.weizmann.ac.il/ Bioinformatics III
Protein synthesis: ribosome The ribosome is a complex subcellular particle composed of protein and RNA. It is the site of protein synthesis, http://www.millerandlevine.com/chapter/12/cryo-em.html Model of a ribosome with a newly manufactured protein (multicolored beads) exiting on the right. Bioinformatics III
Signal recognition particle Cotranslational translocation of proteins across or into membranes is a vital process in all kingdoms of life. It requires that the translating ribosome be targeted to the membrane by the signal recognition particle (SRP), an evolutionarily conserved ribonucleoprotein particle. SRP recognizes signal sequences of nascent protein chains emerging from the ribosome. Subsequent binding of SRP leads to a pause in peptide elongation and to the ribosome docking to the membrane-bound SRP receptor. SRP shows 3 main activities in the process of cotranslational targeting: first, it binds to signal sequences emerging from the translating ribosome; second, it pauses peptide elongation; and third, it promotes protein translocation by docking to the membrane-bound SRP receptor and transferring the ribosome nascent chain complex (RNC) to the protein-conducting channel. 40S small ribosomal subunit (yellow) 60S large ribosomal subunit (blue), P-site tRNA (green), SRP (red). Halic et al. Nature 427, 808 (2004) Bioinformatics III
Nuclear Pore Complex A three-dimensional image of the nuclear pore complex (NPC), revealed by electron microscopy. A-B The NPC in yeast. Figure A shows the NPC seen from the cytoplasm while figure B displays a side view. C-D The NPC in vertebrate (Xenopus). http://www.nobel.se/medicine/educational/dna/a/transport/ncp_em1.html Three-Dimensional Architecture of the Isolated Yeast Nuclear Pore Complex: Functional and Evolutionary Implications, Qing Yang, Michael P. Rout and Christopher W. Akey. Molecular Cell, 1:223-234, 1998 NPC is a 50-100 MDa protein assembly that regulates and controls trafficking of macromolecules through the nuclear envelope. Bioinformatics III
GroEL: a chaperone to assist misfolded proteins Schematic Diagram of GroEL Functional States (a) Nonnative polypeptide substrate (wavy black line) binds to an open GroEL ring. (b) ATP binding to GroEL alters its conformation, weakens the binding of substrate, and permits the binding of GroES to the ATP-bound ring. (c) The substrate is released from its binding sites and trapped inside the cavity formed by GroES binding. (d) Following encapsulation, the substrate folds in the cavity and ATP is hydrolysed. (e) After hydrolysis in the upper, GroES-bound ring, ATP and a second nonnative polypeptide bind to the lower ring, discharging ligands from the upper ring and initiating new GroES binding to the lower ring (f) to form a new folding active complex on the lower ring and complete the cycle. http://people.cryst.bbk.ac.uk/~ubcg16z/chaperone.html Ransom et al., Cell 107, 869 (2001) Bioinformatics III
Arp2/3 complex The seven-subunit Arp2/3 complex choreographs the formation of branched actin networks at the leading edge of migrating cells. (A) Model of actin filament branches mediated by Acanthamoeba Arp2/3 complex.(D) Density representations of the models of actin-bound (green) and the free, WA-activated (as shown in Fig. 1D, gray) Arp2/3 complex. Volkmann et al., Science 293, 2456 (2001) Bioinformatics III
proteasome The proteasome is the central enzyme of non-lysosomal proteindegradation. It is involved in the degradation of misfolded proteins as well as in the degradation and processing of short lived regulatory proteins.The 20S Proteasome degrades completely unfoleded proteins into peptides with a narrow length distribution of 7 to 13 amino acids. http://www.biochem.mpg.de/xray/projects/hubome/images/rpr.gif Löwe, J., Stock, D., Jap, B., Zwickl, P., Baumeister, W. and Huber, R. (1995). Crystal structure of the 20S proteasome from the archaeon T. acidophilum at 3.4 Å resolution. Science268, 533-539. Bioinformatics III
Energy conversion: Photosynthetic Unit • Other large complexes: • - Apoptosome • Thermosome • Transcriptome Structure suggested by force field based molecular docking. http://www.ks.uiuc.edu/Research/vmd/gallery Bioinformatics III
icosahedral pyruvate dehydrogenase complex: a multifunctional catalytic machine Model for active-site coupling in the E1E2 complex. 3 E1 tetramers (purple) are shown located above the corresponding trimer of E2 catalytic domains in the icosahedral core. Three full-length E2 molecules are shown, colored red, green and yellow. The lipoyl domain of each E2 molecule shuttles between the active sites of E1 and those of E2. The lipoyl domain of the red E2 is shown attached to an E1 active site. The yellow and green lipoyl domains of the other E2 molecules are shown in intermediate positions in the annular region between the core and the outer E1 layer. Selected E1 and E2 active sites are shown as white ovals, although the lipoyl domain can reach additional sites in the complex. Milne et al., EMBO J. 21, 5587 (2002) Bioinformatics III
Apoptosome (A) Top view of the apoptosome along the 7-fold symmetry axis. (B) Details of the spoke. (C) A side view of the apoptosome reveals the unusual axial ratio of this particle. The scale bar is 100 Å. (D) An oblique bottom view shows the puckered shape of the particle. The arms are bent at an elbow (see asterisk) located proximal to the hub. Acehan et al. Mol. Cell 9, 423 (2002) Apoptosis is the dominant form of programmed cell death during embryonic development and normal tissue turnover. In addition, apoptosis is upregulated in diseases such as AIDS, and neurodegenerative disorders, while it is downregulated in certain cancers. In apoptosis, death signals are transduced by biochemical pathways to activate caspases, a group of proteases that utilize cysteine at their active sites to cleave specific proteins at aspartate residues. The proteolysis of these critical proteins then initiates cellular events that include chromatin degradation into nucleosomes and organelle destruction. These steps prepare apoptotic cells for phagocytosis and result in the efficient recycling of biochemical resources. In many cases, apoptotic signals are transmitted to mitochondria, which act as integrators of cell death because both effector and regulatory molecules converge at this organelle. Apoptosis mediated by mitochondria requires the release of cytochrome c into the cytosol through a process that may involve the formation of specific pores or rupture of the outer membrane. Cytochrome c binds to Apaf-1 and in the presence of dATP/ATP promotes assembly of the apoptosome. This large protein complex then binds and activates procaspase-9. Bioinformatics III
Future? • Structural genomics (X-ray) may soon generate enough templates of individal folds. • Structural genomics may be expanded to protein complexes. • Interactions between proteins of the same fold tend to be similar when the sequence identity is above approximately 30% (Aloy et al.). • Hybrid modelling of X-ray/EM will not be able to answer all questions • problem of induced fit • transient complexes cannot be addressed by these techniques • Essential to combine large variety of hybrid + complementary methods • Russell et al. Curr. Opin. Struct. Biol. 14, 313 (2004) Bioinformatics III
2 Information on protein-protein networks Bioinformatics III
2. Yeast 2-Hybrid Screen Data on protein-protein interactions from Yeast 2-Hybrid Screen. One role of bioinformatics is to sort the data. Bioinformatics III
Protein cluster in yeast Cluster-algorithm generates one large cluster for proteins interacting with each other based on binding data of yeast proteins. Schwikowski, Uetz, Fields, Nature Biotech. 18, 1257 (2001) Bioinformatics III
Annotation of function After functional annotation: connect clusters of interacting proteins. Schwikowski, Uetz, Fields, Nature Biotech. 18, 1257 (2001) Bioinformatics III
Annotation of localization Schwikowski, Uetz, Fields, Nature Biotech. 18, 1257 (2001) Bioinformatics III
3 Systematic identification of protein complexes Bioinformatics III
Systematic identication of large protein complexes Yeast 2-Hybrid-method can only identify binary complexes. Cellzome company: attach additional protein P to particular protein Pi , P binds to matrix of purification column. yieldsPi and proteins Pk bound to Pi . Identify proteins by mass spectro- metry (MALDI- TOF). Gavin et al. Nature 415, 141 (2002) Bioinformatics III
Analyis of protein complexes in yeast (S. cerevisae) Identify proteins by scanning yeast protein database for protein composed of fragments of suitable mass. Here, the identified proteins are listed according to their localization (a). (b) lists the number of proteins per complex. Gavin et al. Nature 415, 141 (2002) Bioinformatics III
Validation of methodology Check of the method: can the same complex be obtained for different choice of attachment point (tag protein attached to different coponents of complex)? Yes (see gel). Method allows to identify components of complex, not the binding interfaces. Better for identification of interfaces: Yeast 2-hybrid screen (binary interactions). 3D models of complexes are important to develop inhibitors. • theoretical methods (docking) • electron tomography Gavin et al. Nature 415, 141 (2002) Bioinformatics III
Network of protein complexes? Service function of Bioinformatics: catalog such data and prepare for analysis ... allowing to formulate new models and concepts (biology!). If results are very important don‘t wait for some biologist to interpret your data. You may want to get the credit yourself. Modularity = Formation of separated Islands ?? Gavin et al. Nature 415, 141 (2002) Bioinformatics III
4 Aim: generate structures of protein complexes • Experiment • Start from 232 purified complexes from TAP strategy. • Select 102 that gave samples most promising for EM from analysis of gels and protein concentrations. • Take EM images. • Theory • Make list of components. • Assign known structures of individual proteins. • Assign templates of complexes • If complex structure available for this pair • if complex structure available for homologous protein • if complex structure available for structurally similar protein (SCOP) Bettina Böttcher (EM) Rob Russell (Bioinformatics) Bioinformatics III
How transferable are interactions? interaction similariy (iRMSD) vs. % sequence identity for all the available pairs of interacting domains with known 3D structure. Curve shows 80% percentile (i.e. 80% of the data lies below the curve), and points below the line (iRMSD = 10 Å) are similar in interaction. Aloy et al. Science, 303, 2026 (2004) Bioinformatics III
Bioinformatics Strategy Illustration of the methods and concepts used. How predictions are made within complexes (circles) and between them (cross-talk). Bottom right shows two binary interactions combined into a three-component model Aloy et al. Science, 303, 2026 (2004) Bioinformatics III
3SOM algorithm: vector-based circumference superimposition A 2D variant of the 3D vector-based surface superimposition that is central to the 3SOM algorithm. For each tested voxel a on the circumference of the target, a vector va is calculated that approximates the normal vector orthogonal to the tangent line in a and with origin in a. Vector va is superimposed on each vector vb that is associated with a voxel b on the circumference of the template. The goodness-of-fit of the transformation in question is assessed by measuring the circumference overlap, the fraction of target circumference voxels that is projected onto (or near) the template circumference (triangles). In 3D, a rotational degree of freedom is left around the superimposed vectors, which is sampled in rotational steps of 9°. Ceulemans, Russell J. Mol. Biol., 338, 783 (2004) Bioinformatics III
Successful models of yeast complexes (A) Exosome model on PNPase fit into EM map. (B) RNA polymerase II with RPB4 (green)/RPB7 (red) built on Methanococcus jannaschii equivalents, and SPT5/pol II (cyan) built with IF5A. (C and D) Views of CCT (gold) and phosphoducin 2/VID27 (red) fit into EM map. (E) Micrograph of POP complex, with particle types highlighted. (F) Ski complex built by combination of two complexes. Aloy et al. Science, 303, 2026 (2004) Bioinformatics III
Cross talk between complexes (Top) Triangles show components with at least one modelable structure and interaction; squares, structure only; circles, others. Lines show predicted interactions: thick lines imply a conserved interaction interface; red, those supported by experiment. (Bottom) Expanded view of cross-talk between transcription complexes built on by a combination of two complexes. Aloy et al. Science, 303, 2026 (2004) Bioinformatics III
Summary A combination of 3D structure and protein-interaction data can already provide a partial view of complex cellular structures. The structure-based network derived from cross-talk between complexes provides a more realistic picture than those derived blindly from interaction data, because it suggests molecular details for how they are mediated. Of course, the picture is still far from complete and there are numerous new challenges. The structure-based network derived here provides a useful initial framework for further studies. Its beauty is that the whole is greater than the sum of its parts: Each new structure can help to understand multiple interactions. The complex predictions and the associated network will thus improve exponentially as the numbers of structures and interactions increase, providing an ever more complete molecular anatomy of the cell. Aloy et al. Science, 303, 2026 (2004) Bioinformatics III