Integration of evolutionary data in the prediction of structure and function. Alfonso Valencia, CNB-CSIC
Figure: Rcc1 and Ran (by J.A. G-Ranea)
Figure: Cluster of solutions 1, Cluster of solutions 2, Solution 3, Solution 4 (by J.A. G-Ranea)
Figure: Ras, Ral, Rho; Rcc1, Ran (by J.A. G-Ranea)
Azuma et al., J. Mol. Biol. 1999
Mapping of mutants (side view). Panel: Model. Labels: GDP, Mg++; residues D44, H78, D128, E157, R206, H270, H304, H410. Green: Km; red: kcat.
Model and X-ray. Panels: Model, Complex. Labels: Ran, Rcc1, SO4, Mg++, GDP.
Docking. Labels: 3, 13, 59, 122; Ran (SO4) with Rcc1; Ran (GDP-Mg++) with Rcc1. Complex: Complex/Docking superposition. Model: GRAMM results, Model/Docking superposition.
Mapping of mutants (side view). Panels: Complex (Model on Complex superposition), Model. Labels: GDP, Mg++; residues D44, H78, D128, E157, R206, H270, H304, H410. Green: Km; red: kcat.
Mapping of tree determinants. Panels: Model, Complex.
Functional specificity in the Ras superfamily. Labels: Ral, Rlip (coiled-coil); Ras, Ras Binding Domain folding. By J.A. Ranea.
Tree determinants. SequenceSpace (Casari, Sander, Valencia, Nature Struct. Biol. 1995). The space of sequences: Ras, Ral, Rho. 36 and 37: main tree-determinant positions. Yeast two-hybrid experiments (Bauer et al., JBC 1999). By J.A. G-Ranea.
DnaK, Actin, Hsc70, MreB, Hexokinase, FtsA
Mirror trees and in silico two-hybrid. Input: multiple sequence alignments (MSA) of prot. a and prot. b across org. 1 to org. 5. Mirror trees: reduced MSAs and implicit trees; protein distance matrices (d1, d2); r: similarity between a and b trees. In silico two-hybrid: reduced MSAs; intra- and inter-protein correlated mutations (intra-protein: Caa, Cbb; inter-protein: Cab); distributions of correlation values (0.0 to +1.0); interaction index between a and b.
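The mirror-tree part of the schematic above can be sketched in a few lines. This is a minimal illustration, not the published implementation: it assumes that the distance between two aligned sequences is simply the fraction of differing positions, and that r is the Pearson correlation between the two sets of pairwise distances. The function names and the toy alignments are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pairwise_distances(msa):
    """Fraction of differing positions for every pair of aligned sequences.

    `msa` maps an organism name to its aligned sequence (all equal length).
    Assumption: a plain p-distance stands in for the real distance measure.
    """
    dist = {}
    for a, b in combinations(sorted(msa), 2):
        diffs = sum(1 for x, y in zip(msa[a], msa[b]) if x != y)
        dist[(a, b)] = diffs / len(msa[a])
    return dist


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either list is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def mirror_tree_r(msa_a, msa_b):
    """Similarity r between the implicit trees of proteins a and b."""
    # Reduce both alignments to the organisms present in both.
    shared = sorted(set(msa_a) & set(msa_b))
    da = pairwise_distances({o: msa_a[o] for o in shared})
    db = pairwise_distances({o: msa_b[o] for o in shared})
    pairs = sorted(da)
    return pearson([da[p] for p in pairs], [db[p] for p in pairs])


# Toy alignments (invented): both proteins diverge in the same pattern.
msa_a = {"org1": "AAAA", "org2": "AAAT", "org3": "AATT", "org4": "TTTT"}
msa_b = {"org1": "GGGGGGGG", "org2": "GGGGGGCC",
         "org3": "GGGGCCCC", "org4": "CCCCCCCC", "org5": "GCGCGCGC"}
r = mirror_tree_r(msa_a, msa_b)
```

Here `org5` is dropped because it is missing from `msa_a`, and the remaining distance matrices are identical, so `r` comes out at 1 up to floating-point error. A high r only suggests co-evolution; it is not by itself evidence of physical interaction.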
A dimerization model for FtsA. Löwe et al., Nature 2000. Carettoni et al. (2002). Phage-display and correlated mutations identify an essential region of subdomain 1C involved in homodimerization of Escherichia coli FtsA. Proteins.
Information extracted from multiple sequence alignments (Ras, Ral, Rho): conserved positions, tree-determinant positions, correlated mutations.
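To make the first two kinds of signal concrete, here is a small sketch that labels alignment columns. It assumes a deliberately strict definition: a column is "conserved" if every sequence shares one residue, and "tree-determinant" if each subfamily is internally invariant but every subfamily has a different residue. This is a simplification for illustration and does not reproduce the SequenceSpace method cited earlier; the function name and the toy sequences are hypothetical.

```python
def classify_columns(families):
    """Label each alignment column given sequences grouped by subfamily.

    `families` maps a subfamily name to a list of aligned sequences.
    All sequences are assumed to have the same length.
    """
    length = len(next(iter(families.values()))[0])
    labels = []
    for i in range(length):
        # Set of residues seen at column i within each subfamily.
        per_family = {name: {seq[i] for seq in seqs}
                      for name, seqs in families.items()}
        if any(len(residues) != 1 for residues in per_family.values()):
            labels.append("variable")
            continue
        distinct = {next(iter(residues)) for residues in per_family.values()}
        if len(distinct) == 1:
            labels.append("conserved")
        elif len(distinct) == len(families):
            labels.append("tree-determinant")
        else:
            labels.append("partly specific")
    return labels


# Toy sequences (invented, not real Ras/Ral/Rho fragments).
families = {
    "Ras": ["GKDE", "GKDA"],
    "Ral": ["GKEE", "GKEL"],
    "Rho": ["GKNE", "GRNV"],
}
labels = classify_columns(families)
# labels == ["conserved", "variable", "tree-determinant", "variable"]
```

Real alignments are noisier than this, so a practical version would tolerate some variation within a subfamily rather than demand strict invariance.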
Prediction of interaction regions. Schematic: 3D structure; surface patch (aa. 5, 6, 10, 20, 22, 30); multiple sequence alignment columns 5, 6, 10, 20, 22, 30 (A V . . W Y); sequence profiles; output: Interface / No. Fariselli et al. (2002). Prediction of protein-protein interaction sites with neural networks. Eur. J. Biochem.
Prediction of protein interaction sites with neural networks. Goal: predict surface residues in contact with another protein. Method: feed-forward neural network trained with standard back-propagation. Output layer: single neuron representing contact / no contact. Hidden layer: 4 nodes. Input layer: (1) patches in protein structures (sets of exposed neighbour residues), as an 11-residue window: central surface residue + 10 patch neighbours; (2) evolutionary information: each residue coded as a vector corresponding to frequencies in the MSA. Result: 73% average accuracy of correctly predicted interacting residues. Fariselli, Pazos, Valencia, Casadio, Eur. J. Biochem. 2002.
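The input encoding and network shape described above can be sketched as follows. Several details are assumptions made for illustration: each residue is coded as a 20-value frequency vector over the standard amino acids (giving 11 x 20 = 220 inputs), sigmoid units are used throughout, and the weights are random and untrained. Back-propagation training is omitted, and all function names and the toy alignment are hypothetical.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # assumed 20-letter alphabet


def column_frequencies(column):
    """Frequency of each amino acid in one MSA column (gaps ignored)."""
    residues = [c for c in column if c in AMINO_ACIDS]
    if not residues:
        return [0.0] * len(AMINO_ACIDS)
    return [residues.count(a) / len(residues) for a in AMINO_ACIDS]


def encode_patch(msa, patch_positions):
    """Concatenate frequency vectors for the central residue and its neighbours."""
    vector = []
    for pos in patch_positions:
        vector.extend(column_frequencies([seq[pos] for seq in msa]))
    return vector


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def init_network(n_inputs, n_hidden=4, seed=0):
    """Small random weights; the last weight of each unit is its bias."""
    rng = random.Random(seed)
    hidden = [[rng.uniform(-0.1, 0.1) for _ in range(n_inputs + 1)]
              for _ in range(n_hidden)]
    output = [rng.uniform(-0.1, 0.1) for _ in range(n_hidden + 1)]
    return hidden, output


def forward(network, x):
    """One forward pass: 4 hidden units, then a single contact/no-contact unit."""
    hidden, output = network
    h = [sigmoid(w[-1] + sum(wi * xi for wi, xi in zip(w, x))) for w in hidden]
    return sigmoid(output[-1] + sum(wi * hi for wi, hi in zip(output, h)))


# Toy alignment (invented) and a patch: central position 5 plus 10 neighbours.
msa = ["ACDEFGHIKLMNPQ", "ACDEFGHIKLMNPQ", "TCDEFGHIKLMNPQ"]
patch = [5, 0, 1, 2, 3, 4, 6, 7, 8, 9, 10]
x = encode_patch(msa, patch)
score = forward(init_network(len(x)), x)
```

With untrained weights `score` carries no meaning; the point is only the data flow from MSA columns to a 220-value input vector to a single output between 0 and 1. In the real setting the patch neighbours come from the 3D structure, not from sequence order.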
Results of the predictions of protein interaction sites. Fariselli, Pazos, Valencia, Casadio, Eur. J. Biochem. 2002.
NN predictions of interaction regions (panels A, B). Carettoni et al. (2002). Phage-display and correlated mutations identify an essential region of subdomain 1C involved in homodimerization of Escherichia coli FtsA. Proteins.
Carnitine acyltransferases. Panels: Carnitine Palmitoyltransferase I, Carnitine Octanoyltransferase. Labels: H327 (catalytic), H473 (catalytic), A332, A478; Palmitoyl-CoA, Octanoyl-CoA, Malonyl-CoA. Alignment: 2dubE/CPTI/COT/CACP. Morillas et al. 2001. J. Biol. Chem. 276, 45001-8; 2001 in the press; and 2002 in the press. P. Gomez-Puertas / F. G. Hegardt lab.
www.pdg.cnb.uam.es Protein Design Group CNB-CSIC Miguel A. Peñalva CIB-CSIC Fausto Hegardt U. Barcelona Søren Brunak DTU, Copenhagen SH3 B. Distel, Amsterdam Juan Carlos Sanchez TFs in Aspergillus Manuela Helmer-Citterich U. Tor Vergata, SANITAS Miguel Vicente CNB-CSIC Amalia Muñoz binding specificity system Paulino Gómez Puertas Bacterial Cell Division Protein Structure Prediction Rita Casadio Piero Fariselli U. Bologna Osvaldo Graña Threading / servers David de Juan Protein interactions Manuel Gómez Bacterial Cell division II EC V FP Ramón Roca Homology modeling RAS E. Laue, Cambridge F. Wittinghofer, MPI Michael Tress Threading Javier Guijarro Interaction networks Burkhard Rost Columbia U., NY Juan Antonio G. Ranea ras signalling system Luis Sánchez-Pulido Sequence analysis REGIA J. Paz-Ares, CNB-CSIC Genome analysis Damien Devos Function prediction at genomic scale José María Fernández Database design Christos Ouzounis EBI-EMBL, Cambridge Ramón Alonso Arabidopsis T. factors Protein Design Systems Julio Collado-Vides UNAM, Cuernavaca. Armando Amat System management TEMBLOR EBI EC V FP Federico Abascal Function prediction Information Extraction Antoine de Daruvar Lion, Heidelberg. CICYT biotech Roderic Guigo IMIM, Barna Florencio Pazos Interaction networks Christian Blaschke Information extraction Centro de Astrobiología DNA arrays Joaquin Dopazo Javier Herrero CNIO María González Genomics, Ecology CICYT FEDER Juan Carlos Oliveros Array data management Eduardo Leon Detection of protein names ORIEL EMBO EC V FP Ugo Bastolla Protein stability Keith Harshman, Carlos Martínez DIO-CNB ALMA Bioinformática S.L. Andrés Moya, Roeland van Ham U. Valencia Robert Hoffmann I.E. Biomedicine Javier Tamames Genome analysis Prof. Hans Robert Kalbitzer U. Regensburg Enrique Merino Centro Biotecnología UNAM, Cuernavaca Bioinformatics support Information extraction Visitors
Text mining. Biological examples. Protein interactions.