Structural prediction of protein assemblies. Guilhem FAURE. Supervisor: Raphaël Guérois. Molecular Assemblies and Signaling, Structural Biology and Radiobiology Lab, iBiTecS – URA CNRS 2096 - CEA Saclay
Experimental insights into the protein interaction space? High-throughput approaches; high-resolution approaches; macromolecules in cellulo. Large-scale vision; synergies/competitions; molecular vision.
Translate each node of the interaction networks into a 3D structure? Experimental structures, homology models? How to model the structure of protein/domain assemblies?
How to predict protein assemblies? • Surface complementarity • Physico-chemical features • Evolution data. Pipeline: 10^4 decoys → filters → ~10 decoys → 1 most likely model.
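The funnel on this slide (many decoys, a filtering stage, a short list, one model) can be sketched as two successive selections over a scored list. This is a minimal sketch under assumptions: the `Decoy` record, its `energy` and `evo_score` fields, and the ordering of the two criteria are hypothetical illustrations, not the method used in the talk.

```python
from dataclasses import dataclass


@dataclass
class Decoy:
    name: str
    energy: float      # hypothetical docking energy, lower is better
    evo_score: float   # hypothetical evolution-based score, higher is better


def select_model(decoys, keep=10):
    """Shortlist the `keep` decoys with the best evolution score,
    then return the shortlist and its lowest-energy member."""
    shortlist = sorted(decoys, key=lambda d: d.evo_score, reverse=True)[:keep]
    best = min(shortlist, key=lambda d: d.energy)
    return shortlist, best


# Toy data only: 100 synthetic decoys with made-up scores.
decoys = [Decoy(f"d{i}", energy=-float(i % 7), evo_score=float(i % 5))
          for i in range(100)]
shortlist, best = select_model(decoys, keep=10)
print(len(shortlist), best.name)
```

In practice the filter stage would combine several criteria (surface complementarity, physico-chemical features, evolution data); a single score is used here only to keep the sketch short.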
Thesis goals: use evolution data to predict protein assemblies. How to characterize evolution? Conservation? Coevolution? Which type of data to analyse? How to use evolution to predict? Pipeline: 10^4 decoys → filters → ~10 decoys → 1 most likely model.
Can conservation guide the prediction of protein assemblies? Interface conservation: does a complex A-B imply a conserved AB interface? Ratio of conserved residues that are part of a given interface: ~30%. [Figure: histogram of % of complexes vs. % of all conserved residues found in the interface.] Conservation lacks the specificity needed for prediction.
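The ratio quoted on this slide is the share of a protein's conserved residues that fall inside the interface. A minimal sketch of that arithmetic, assuming residues are identified by hypothetical integer positions and that the conserved set and interface set have already been computed by some other means:

```python
def conserved_interface_ratio(conserved, interface):
    """Fraction of conserved residues that also belong to the interface."""
    conserved = set(conserved)
    if not conserved:
        return 0.0
    return len(conserved & set(interface)) / len(conserved)


# Made-up residue numbers, for illustration only.
conserved = {12, 15, 40, 41, 77, 90, 102, 130, 131, 150}
interface = {40, 41, 42, 77, 200}
ratio = conserved_interface_ratio(conserved, interface)
print(f"{100 * ratio:.0f}% of conserved residues are in the interface")
```

A low ratio means most conserved residues sit elsewhere (for example in the protein core), which is why conservation on its own points to the interface only weakly.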
Evolutionary rates as relevant interface signals? Lif1 (S. cerevisiae) and XRCC4 (H. sapiens): low sequence identity. Nej1 (S. cerevisiae) and Cernunnos (H. sapiens): low sequence identity. X-ray structures known at 2.4 Å and 2.3 Å.
Evolutionary rates as relevant interface signals? An example from the DNA repair interaction network: Lif1 (S. cerevisiae) / XRCC4 (H. sapiens), BRCT of the DNA ligase, Nej1 (S. cerevisiae) / Cernunnos (H. sapiens). [Figure: interaction network with unknown interfaces marked "?"; conservation scale.]
An example of prediction with XRCC4-Cernunnos, exploiting evolution and energy calculations. Step 1: filter solutions using evolutionary rates. Step 2: local perturbations, optimisation of the interactions … search for funnels. [Plot: Rosetta score (min vs all), interface energy from -10 to -30, against iRMS from 2 to 14.] Coll. JB Charbonnier (LBSR). G. Faure in Malivert et al., JBC (2010)
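The "search for funnels" in Step 2 asks whether the lowest-energy decoys cluster at low iRMS around the best-scoring structure. One crude way to test this is sketched below; the function name, the `top_n` and `max_irms` thresholds, and the data are all hypothetical and are not taken from the talk or from Rosetta.

```python
def looks_like_funnel(points, top_n=5, max_irms=2.0):
    """points: list of (irms_to_best, score) pairs, lower score is better.
    Returns True when the top_n lowest-score decoys all lie within
    max_irms of the reference structure."""
    lowest = sorted(points, key=lambda p: p[1])[:top_n]
    return all(irms <= max_irms for irms, _ in lowest)


# Toy landscapes with made-up values.
funnel_like = [(0.0, -30), (0.8, -28), (1.5, -27), (1.2, -26), (1.9, -25),
               (8.0, -15), (12.0, -12)]
flat = [(0.0, -30), (9.0, -29), (4.0, -28), (13.0, -27), (1.0, -26),
        (6.0, -20)]
print(looks_like_funnel(funnel_like), looks_like_funnel(flat))
```

A real analysis would inspect the whole score-versus-iRMS plot rather than a fixed cutoff; the thresholds here only make the idea concrete.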
The model provides valuable information. Interface mutations can be designed to study the complex. The model can guide the resolution of the X-ray structure. But without biochemical information about the BRCT domain, prediction is hard. Mutual information is needed: coevolution / coadaptation.
How can deleterious mutations at the interface be tolerated? Complementary interactions (charge compensation, polar interactions, apolar interactions …). [Figure labels: S. cerevisiae, higher eukaryotes, deleterious mutation.] • Neighbouring positions can buffer the loss of complementarity • Other mechanisms of co-evolution? • How to account for structural plasticity? Madaoui & Guerois, PNAS 2008
How to study coevolution: the concept of interology. Same ancestor = homologs: same evolutionary profile + same fold. Same interaction involving the same partners = INTEROLOGS: same interface.
How to build an interolog database? • Extracting and cleaning heterocomplexes • True heteromers • Biological interfaces • … Redundancy treatment: 2500 non-redundant interfaces, 2500 groups of interologs, 350 groups of structural interologs. G. Faure et al., in prep.
How to explore coevolution? A PyMOL plugin to visualize structures and alignments. Data and querying server at http://biodev.extra.cea.fr/lbsr/
Conclusion & Perspectives • Conservation cannot be used to predict protein assemblies • Building a large database • Large spectrum of sequence divergence • Explore structural plasticity at complex interfaces • while increasing sequence divergence • Test our ability to reproduce this plasticity • Analyze the evolution of hot-spot regions. Benchmark to assess how far structural models can be used in modelling protein complexes. Development of a statistical potential that takes evolution data into account.
InterEvol: automatic and self-updating interface database for extracting structural and evolutionary information. XXX heteromeric complexes; clustering into families & superfamilies; biological vs non-biological interfaces; redundancy filters; XXX non-redundant interfaces; XXX structural interologs. Tools: HHsearch, Matras, NoXclass. Coupled alignments of orthologous sequences for both partners. PyMOL plugin for interface coevolution visualisation. Querying server at http://biodev.extra.cea.fr/lbsr/
How to study coevolution? Querying server at http://biodev.extra.cea.fr/lbsr/
How to find coevolution? An interolog structural databank (350 groups of interologs): same fold + same evolutionary profile + same interaction area. G. Faure et al., in prep.
How to predict protein assemblies with coevolution? Multi-body potential. Interologs database (350 groups of interologs); Interface database (2500 interfaces); InterAlign database (2500 alignments). Exploring base; learning base.
Which evolutionary signals at protein surfaces can be captured to identify the interaction sites? Conservation analyses. [Figure: HSM3, RPN1, RPT1, RPT5, RPT2; conservation score scale.]
What fraction of conserved residues is part of the interface? [Figure: Protein A, Protein B and the AB interface; histogram of % of complexes vs. % of all conserved residues.] Evolutionary rates do not provide mutual information between interacting surfaces … • How to account for co-evolution or co-adaptation? • Can this help to better predict molecular assemblies?
Co-adaptation involves not only pairs of residues but also groups of structural neighbours. Structural neighbours may compensate for loss of complementarity. [Figure: A/B complex shown in views rotated by 90°, with positions i, j, k on Protein A and Protein B; residues classed as hydrophobic, polar, acidic, basic; alignment rows Human, Mouse, Fish, …, Yeast.] Madaoui & Guerois, PNAS 2008
Co-variation analyses at the interface of intra-molecular domain-domain interactions. [Figure: Partner A / Protein A and Partner B / Protein B with the AB interface; alignment rows Human, Mouse, Fish, …, Yeast.]
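Co-variation between two interface positions can be quantified from paired alignment columns, one column per partner with one row per species. The sketch below uses the information-theoretic mutual information between two columns as one common choice; it is an assumption that this matches the measure intended in the talk, and the columns are made-up toy data.

```python
from collections import Counter
from math import log2


def mutual_information(col_a, col_b):
    """Mutual information (in bits) between two alignment columns.
    col_a[i] and col_b[i] must come from the same species."""
    if len(col_a) != len(col_b):
        raise ValueError("columns must have the same number of rows")
    n = len(col_a)
    if n == 0:
        return 0.0
    count_a = Counter(col_a)
    count_b = Counter(col_b)
    count_ab = Counter(zip(col_a, col_b))
    mi = 0.0
    for (a, b), c in count_ab.items():
        # p(a,b) * log2( p(a,b) / (p(a) * p(b)) )
        mi += (c / n) * log2(c * n / (count_a[a] * count_b[b]))
    return mi


# Toy columns: a charge swap on one partner mirrored on the other.
covarying = mutual_information("DDKKDK", "KKDDKD")
independent = mutual_information("DDKK", "KDKD")
print(covarying, independent)
```

Pairwise scores like this one ignore the compensation by groups of structural neighbours described on the previous slide, and they are sensitive to small alignments and phylogenetic bias.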
An example of prediction exploiting evolution: DNA repair complex (non-homologous end joining). Conserved residues (both partners); docking under constraints with HADDOCK (Bonvin’s group). Coll. JB Charbonnier (LBSR). G. Faure in Malivert et al., JBC (2010)
The evolutionary dimension should provide key information to exploit interaction data from a structural perspective.
Two major issues: the difficulty of identifying orthologs; how to characterize selection pressure at the interface.
InterEvol: the R-evolutionary databank. A non-redundant heterodimer structure databank (2300 structures). Study the contact statistics at the interface. [Placeholder: graph of the distribution of transient vs permanent complexes and of interface size.] G. Faure et al., in prep. (1) Krissinel and K. Henrick
InterEvol: the R-evolutionary databank. An interolog structural databank (350 structures): complexes A-B and A'-B' with the same fold + same evolutionary profile. [Note: add the % identity.] G. Faure et al., in prep.
InterEvol: the R-evolutionary databank. An interolog sequence databank (2300 alignments): starting from the initial structure, sequences are retrieved with PSI-BLAST at no less than 30% identity … G. Faure et al., in prep.
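The 30% identity threshold on this slide can be applied as a simple filter once each hit is aligned to the query. This is a minimal sketch, assuming pre-aligned sequences of equal length with "-" as the gap character and identity computed over ungapped columns only; identity conventions differ between tools, and the sequences below are invented.

```python
def percent_identity(a, b):
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    return 100.0 * sum(x == y for x, y in pairs) / len(pairs)


def filter_hits(query, hits, min_identity=30.0):
    """Keep the aligned hits whose identity to the query reaches the threshold."""
    return [h for h in hits if percent_identity(query, h) >= min_identity]


# Invented aligned sequences, for illustration only.
query = "MKTAYIAKQR"
hits = ["MKTAYIAKQR", "MKSAY-AKQR", "GLSPW-HEDN"]
kept = filter_hits(query, hits)
print(kept)
```

The same function can be reused to annotate pairs of interologs with their % identity.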
InterEvol: the R-evolutionary databank. Built up in successive steps: PISA (1) (PDB complex assemblies); cleaned true heteromers; non-redundant PDB structure databank; non-redundant heterodimer databank; SCOTCHAlign databank; interolog databank. G. Faure et al., in prep. (1) Krissinel and K. Henrick
Through multidimensional data: InterEvolVisu. [Placeholder: screenshot of the plugin on an example.] G. Faure et al., in prep.
Conclusions & Perspectives • Build a statistical multi-body potential from structure and sequence data • Understand the selection pressure at the interface with interologs • Build a complete docking method that automates each step
What fraction of conserved residues is part of the interface? Conservation analyses at the interface of intra-molecular domain-domain interactions. [Figure: histogram of % of complexes vs. % of all conserved residues.] Several approaches have combined conservation with other structure and sequence features to identify potential binding patches (ProMate (Neuvirth, JMB, 2004), PINUP (Liang et al., NAR, 2006), SPPIDER (Porollo, Proteins, 2007)), but they provide no mutual information.
Relationships between sequence divergence and conservation of the binding mode. [Figure: human AB complex (A + B) and yeast A'B' complex (A' + B'), more than ~30% identity.] Two homologous complexes (more than ~30% identity) generally interact in a similar manner. Evolution data give information about structural assemblies. Aloy & Russell, JMB 2003