SP4: In silico methods
Partners: 16A EMBL (Russell, Bork); 1 CRG (Serrano); 5 NKI (Perrakis); 10 HU (Margalit); 12 CCNet; 17 IRB (Aloy); 3A Paris-Sud (Janin)
SP4 In silico methods
• WP4.1: Target identification & annotation. Partners: EMBL-Bork/Russell, HU, CCNet, IRB
• WP4.2: Complex modeling. Partners: EMBL-Russell, IRB, Gif, CRG
• WP4.3: Interface to the scientific community & scientific data management. Partners: NKI, EMBL-Russell, CCNet
WP4.1: Target identification & annotation
Partners: EMBL-HD, HU, CCN, IRB
Activities in:
• Interaction prediction (HU, EMBL-Bork)
• Complex prediction & ranking, the ‘list of 20’ (IRB)
• Complex visualisation (CCNet/EMBL)
• Data gathering, e.g. protein-chemical interactions (CCNet)
• Gel processing (EMBL)
Complex database, web interface & wiki (Matthew Betts, EMBL)
The list of 20 (Aloy group, IRB)
Experimental tests on the 20 (Aloy, Seraphin, van Tilbeurgh & Dziembowski groups)
WP4.2: Complex modeling
Partners: EMBL-HD, IRB, Gif, CRG
Activities in:
• Complex modelling
• Automated procedures (EMBL-Russell/IRB)
• Interaction prediction via structure (EMBL-Russell/IRB)
• New methods for modelling
• FoldX (CRG/EMBL)
• High-throughput docking (IRB)
• Analyses, individual models (Everybody)
Building a complex from pieces
For 636 complexes in yeast:
• 3505: proteins modelable
• 419: complexes, single subunit models
• 224: 2+ subunit models
• 122: 3+ subunit models
Aloy et al., Curr. Opin. Struct. Biol., 2005.
Example: MCM complex model agrees with existing EM data (Damien Devos, EMBL)
eIF2/eIF2B complex
[Figure: sub-complexes; subunit labels SUI2, SUI3, SUI4, GCD1, GCD2, GCD6, GCD7, GCN3]
Preliminary reconstructions (Bettina Boettcher); Intact complex MS (Carol Robinson); Modelling (Damien Devos)
Defining new interfaces: 424 candidate interfaces to date
[Figure: known complexes (Complex 1, 2 and 3) are superimposed to propose a new complex; a common shape denotes a similar fold]
Example: Transcription factor SPX dimer
New interface: E. coli dimer in one protein, forms nice interface in B. subtilis; good evidence from other sources (Myco TAP)
[Figure labels: Complex 1, Complex 2, Complex 3]
• Domain 1, 1z3eA.c.47.1.12-1-trans3 (chain A): Transcriptional regulator SPX (B. subtilis)
• Domain 2, 1z3eA.c.47.1.12-1-trans4 (chain B): Transcriptional regulator SPX (B. subtilis)
• Domain 3, 1z3eB.a.60.3.1-1-trans1p (chain C): RNA polymerase alpha (B. subtilis)
• Domain 4, 1z3eB.a.60.3.1-1-trans2p (chain D): RNA polymerase alpha (B. subtilis)
• Domain 5, 1lb2E.a.60.3.1-1-trans1 (chain E): RNA polymerase alpha (E. coli)
• Domain 6, 1lb2B.a.60.3.1-1-trans2 (chain F): RNA polymerase alpha (E. coli)
Enabling/disabling loops can predict multimerization state
• Monomer: S. cerevisiae guanylate kinase
• Homodimer: E. coli guanylate kinase (enabling loop marked)
[Alignment figure: E. coli, V. cholerae, yeast, mouse, pig and bovine guanylate kinase, grouped into homodimers and monomers]
When modelling fails – docking?
• The Aloy group (IRB) is currently running many tens of thousands of docking experiments on MareNostrum, the largest supercomputer in Europe
• The aim is to identify promising docking candidates to help model key interactions of interest
Modelling versus docking
• We can model an interaction structure if a previously determined structure contains parts homologous to the two interacting proteins
• We can predict an interaction structure by docking if we have structures or models for parts of the interacting proteins
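The two conditions above can be read as a simple decision rule. The following is a minimal sketch only: the `PairEvidence` fields and the `choose_strategy` function are hypothetical names introduced for illustration, and preferring modelling over docking when both are possible is an assumption based on the "When modelling fails" slide, not a documented pipeline.

```python
from dataclasses import dataclass


@dataclass
class PairEvidence:
    """Structural evidence available for one candidate protein pair (hypothetical fields)."""
    has_complex_template: bool      # a solved complex has parts homologous to both proteins
    a_has_structure_or_model: bool  # structure or model exists for part of protein A
    b_has_structure_or_model: bool  # structure or model exists for part of protein B


def choose_strategy(ev: PairEvidence) -> str:
    """Pick how to obtain an interaction structure for the pair."""
    # Assumption: homology modelling is tried first, docking is the fallback.
    if ev.has_complex_template:
        return "homology modelling"
    if ev.a_has_structure_or_model and ev.b_has_structure_or_model:
        return "docking"
    return "neither"


if __name__ == "__main__":
    print(choose_strategy(PairEvidence(False, True, True)))
```

A pair with no template but with structures or models for both partners falls through to docking, which is the case the large-scale docking effort targets.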
Large-scale Docking: 36 million possible yeast protein interactions
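The slide does not say how the 36 million figure was obtained. One possible reading, assuming a proteome of roughly 6,000 yeast proteins (a number not given in the slides), is that it counts all ordered pairs, self-pairs included. The sketch below shows that arithmetic next to the count of distinct unordered pairs; `count_pairs` is a hypothetical helper.

```python
def count_pairs(n_proteins: int) -> dict:
    """Count candidate protein pairs for a proteome of n_proteins."""
    return {
        # every (A, B) combination, including A with itself and both orders
        "ordered_with_self": n_proteins * n_proteins,
        # distinct heteromeric pairs, order ignored
        "unordered_distinct": n_proteins * (n_proteins - 1) // 2,
    }


if __name__ == "__main__":
    # 6000 is an assumed proteome size, not a value from the slides
    print(count_pairs(6000))
```

Under that assumption the ordered count is exactly 36,000,000, while the number of distinct pairs is about half of that.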
Large-scale Docking
[Figure: unrefined versus refined docking solutions]
Using FoldX to assess docked or modelled interactions
Good interaction, but many clashes: the model is not so good but could be rescued by backbone moves/further docking
WP4.3: Interface to the scientific community and scientific data management
Partners: EMBL-HD, NKI, [CCN]
Activities in:
• Web site maintenance
• New data (copy number, structural annotation)
• Various optimisation
• YeastWiz
• Complex target DB
• Resting period for software development
• Needs data. Listen to Tassos.
WP4.3: Web site (Matthew Betts, EMBL)
Interactions of known structure
Yeast Wiz (CCNet): www.3drepertoire.org/yeastwiz
• Windows XP
• Linux
• Manual (rather beta)
• Accounts enabled Monday for everybody
• Suggestions for new data promising
Matthew Betts (EMBL), Tomasz Ignasiak (CCNet)
Mediator complex in yeastwiz (7 proteins from yesterday – two clicks)
Current data contributions EMBL HU STRING (interactions) SMART (orthologues) 3D Interaction predictions Orthology Models Etc. Dom-dom profiles Context interactions Enabling loops 3DR CCN-DB IRB Protein-protein (>10 sources, manual) Protein-chemical (Manual, TM) Docking solutions List of 20
WP4.3: Interface to the scientific community and scientific data management: Wiki and TargetDB
The exosome: Damien Devos (with Carol Robinson)
Analysis of a “gold set” of 61 models of known interactions by FoldX
• Easy case: good interaction energy, few clashes. Good model
• Bad interaction, loads of clashes, interpenetrating mainchains. Bad model
• Bad interaction, many clashes, but the model could be rescued by some backbone moves/further docking
• Good interaction, many clashes, interpenetrating mainchains, gaps in the structure. Bad model
Analysis of a “gold set” of 61 models of known interactions by FoldX
• Easy case: good interaction energy, few clashes: good model
• Good interaction, many clashes:
  - interpenetrating mainchains, gaps in the structure: bad model
  - mainchains too close over a large region, but this can be solved by backbone moves/further docking
• Bad interaction, many clashes:
  - interpenetrating mainchains, gaps in the structure: bad model
  - mainchains too close over a large region, but this can be solved by backbone moves/further docking (could improve the model?)
The magnitude of the local clashes correlates with whether a model can be rescued (mild clashes on a lot of residues), but there are still exceptions. Could we really skip a visualization step?
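The cases listed above amount to a small triage scheme. The sketch below encodes it under stated assumptions: the three boolean inputs are judgements the user derives from FoldX output and visual inspection (no FoldX API is called or implied), and `Verdict` and `triage_model` are hypothetical names. The combination of bad interaction energy with few clashes is not covered by the slide, so the sketch flags it for inspection instead of guessing.

```python
from enum import Enum


class Verdict(Enum):
    GOOD = "good model"
    RESCUABLE = "could be rescued by backbone moves / further docking"
    BAD = "bad model"
    INSPECT = "case not covered by the slide; inspect visually"


def triage_model(good_energy: bool, many_clashes: bool,
                 interpenetrating_or_gaps: bool) -> Verdict:
    """Classify a modelled or docked interaction following the slide's cases."""
    if not many_clashes:
        # Easy case: good interaction energy and few clashes
        return Verdict.GOOD if good_energy else Verdict.INSPECT
    if interpenetrating_or_gaps:
        # Interpenetrating mainchains or gaps: bad model, whatever the energy
        return Verdict.BAD
    # Mainchains merely too close over a large region: potentially rescuable
    return Verdict.RESCUABLE


if __name__ == "__main__":
    print(triage_model(good_energy=True, many_clashes=True,
                       interpenetrating_or_gaps=False).value)
```

Because the slide notes exceptions to the clash-magnitude rule, a scheme like this would at best prioritise models for visual inspection, not replace it.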
From protein-protein interactions to domain-domain interactions and back
Hanah Margalit, The Hebrew University of Jerusalem
Modularity in protein-protein interactions
[Figure: domain pairs and protein-protein interactions, with fine tuners and Yes/No outcomes]
What are the fine tuners of domain-domain recognition?
• Positive dataset: reliable protein-protein interactions
• Negative dataset: reliable pairs of proteins that do not interact
Homodimers and monomers provide an ideal dataset
• Homodimers: co-localized, co-expressed, interact
• Monomers: co-localized, co-expressed, do not interact
Domains that mediate homodimerization are also found in monomers
Database of 50 homodimers/monomers with the same domain for which structural data is available
Different fine-tuners determine the self-interaction potential of domains
[Figure: homodimers versus monomers; labelled fine-tuners: interface residue substitutions, phosphorylations (P)]
Enabling loops mediate homodimerization
• Monomer: S. cerevisiae guanylate kinase
• Homodimer: E. coli guanylate kinase (enabling loop marked)
[Alignment figure: E. coli, V. cholerae, yeast, mouse, pig and bovine guanylate kinase, grouped into homodimers and monomers]
Disabling loops prevent homodimerization
• Homodimer: bovine inositol monophosphatase
• Monomer: bovine inositol polyphosphate 1-phosphatase
[Figure labels: DL (disabling loop), Monomers, Homodimers]
Loop profiles: presence AND absence are informative
[Figure: a multiple-sequence alignment with the locations of potential loops; labels: protein 1, protein 2, homodimer, monomer, enabling loop, disabling loop]
The ‘core set’: 64/73 are consistent (88%, p-value ≤ 3.2 × 10⁻⁶)
Test set: 80 proteins with documented oligomeric state based on experimental data
                          Experimental oligomeric state
Loop profile prediction   dimer    monomer
monomer                   3        9
dimer                     63       5
72/80 are consistent (90%, p-value ≤ 5 × 10⁻⁶)
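The consistency figure follows directly from the four counts: the agreeing cells are 63 (predicted dimer, experimental dimer) and 9 (predicted monomer, experimental monomer). The sketch below reproduces that arithmetic. The slides do not state which statistical test gave the reported p-value, so the one-sided Fisher exact test included here is an assumption used only to show how such a value could be computed; `consistency` and `fisher_one_sided` are hypothetical helper names.

```python
from math import comb


def consistency(agree_dimer: int, agree_monomer: int, total: int) -> float:
    """Fraction of proteins whose loop-profile prediction matches experiment."""
    return (agree_dimer + agree_monomer) / total


def fisher_one_sided(a: int, b: int, c: int, d: int) -> float:
    """One-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Returns P(X >= a) for the top-left cell with all margins fixed.
    Assumption: the slides do not say which test was actually used.
    """
    row1, col1, n = a + b, a + c, a + b + c + d
    upper = min(row1, col1)
    favourable = sum(comb(col1, k) * comb(n - col1, row1 - k)
                     for k in range(a, upper + 1))
    return favourable / comb(n, row1)


if __name__ == "__main__":
    # Rows: predicted dimer, predicted monomer; columns: experimental dimer, monomer
    print(f"test set consistency: {consistency(63, 9, 80):.2f}")
    print(f"core set consistency: {64 / 73:.2f}")
    print(f"one-sided Fisher p:   {fisher_one_sided(63, 5, 3, 9):.2e}")
```

The table is unbalanced (66 experimental dimers against 14 monomers), so the per-class view matters: 9 of 14 monomers and 63 of 66 dimers are predicted correctly.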
Large-scale prediction of domain-domain interaction
Example: pfkB carbohydrate kinase domain proteins (monomer, homodimer)
[Figure labels: core, test, DL (disabling loop), EL (enabling loop), Monomer, Homodimer, 31 monomers, >1000 predictable, 108 homodimers]
Boundary loops: some enabling/disabling loops are located outside domain boundaries
Dominance of disabling over enabling loops
Metallo-beta-lactamase domain: ccrA (B. fragilis), RNase Z (B. subtilis)
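Taken together, the loop slides suggest a prediction rule in which a disabling loop overrides any enabling loop. The sketch below is one possible encoding, not the authors' method: `LoopProfile`, `predict_state` and the loop identifiers are hypothetical, and treating a single enabling loop as sufficient for homodimerization is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoopProfile:
    """Loops known for a domain family, identified by arbitrary labels."""
    enabling: frozenset = frozenset()
    disabling: frozenset = frozenset()


def predict_state(profile: LoopProfile, loops_present: set) -> str:
    """Predict 'homodimer' or 'monomer' from which profile loops a protein has."""
    if not profile.enabling and not profile.disabling:
        raise ValueError("profile defines no loops; nothing to predict from")
    # Disabling loops dominate: any one present predicts a monomer.
    if profile.disabling & loops_present:
        return "monomer"
    if profile.enabling:
        # Assumption: one enabling loop is enough; its absence is informative too.
        return "homodimer" if profile.enabling & loops_present else "monomer"
    # Family defined only by disabling loops, and none is present.
    return "homodimer"


if __name__ == "__main__":
    family = LoopProfile(enabling=frozenset({"EL1"}), disabling=frozenset({"DL1"}))
    print(predict_state(family, {"EL1", "DL1"}))
```

With both loop types present the sketch predicts a monomer, which mirrors the dominance stated on the slide.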
Summary
1. Enabling/disabling loops are newly discovered fine-tuners of domain-domain interaction
2. Their presence/absence is highly conserved in evolution, implying that prevention of unwanted interactions is an evolutionary constraint
3. Prediction of the self-interaction potential of domains from loop profiles is highly accurate (~90%)
Extension of the analysis
• Homodimers of multi-domain proteins
• Heterodimers of proteins with self-interacting domains
• Heterodimers of proteins with different domains