Makromolekulak_2010_12_07 Simon István. Prion protein. Bound IUP structures. Tcf3. p27Kip1. IA3. FnBP. Amino acid compositions. Short, long, N-terminal and C-terminal segments each have a different amino acid composition. Radivojac et al. Protein Sci. 2004;13:71-80.
Bound IUP structures Tcf3 p27Kip1 IA3 FnBP
Amino acid compositions: Short, long, N-terminal and C-terminal segments each have a different amino acid composition. Radivojac et al. Protein Sci. 2004;13:71-80.
Prediction of protein disorder from the amino acid sequence
• 1. Dunker
• order promoting: W, C, F, I, Y, V, L, N
• disorder promoting: K, E, P, S, Q, G, R, A
• 2. Uversky
• high net charge / low average hydrophobicity
• Machine learning algorithms (SVM, NN)
• Datasets
• PDB for ordered
• short and long disorder
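The two composition-based ideas on this slide can be illustrated with a few lines of Python. This is only a sketch of the bookkeeping, not a disorder predictor: the function names are hypothetical, the residue classes are copied from the slide, and the charge model (K and R as +1, D and E as -1, histidine ignored) is an assumption made here for illustration.

```python
from collections import Counter

# Residue classes as listed on the slide (Dunker).
ORDER_PROMOTING = set("WCFIYVLN")
DISORDER_PROMOTING = set("KEPSQGRA")


def class_fractions(sequence):
    """Return (order_fraction, disorder_fraction) of a protein sequence."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    counts = Counter(seq)
    order = sum(counts[a] for a in ORDER_PROMOTING)
    disorder = sum(counts[a] for a in DISORDER_PROMOTING)
    return order / len(seq), disorder / len(seq)


def mean_net_charge(sequence):
    """Absolute net charge per residue.

    Assumption: K and R count as +1, D and E as -1, everything else as 0.
    """
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    counts = Counter(seq)
    net = (counts["K"] + counts["R"]) - (counts["D"] + counts["E"])
    return abs(net) / len(seq)


if __name__ == "__main__":
    # Proline/alanine-rich fragment taken from the p53 sequence shown later.
    print(class_fractions("PAPAAPTPAAPAPA"))
    print(mean_net_charge("PAPAAPTPAAPAPA"))
```

The Uversky criterion also needs an average hydrophobicity, which requires a hydrophobicity scale that the slide does not specify, so it is left out here.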
Estimation of pairwise energies from amino acid compositions. To take into account that the contribution of amino acid i depends on its interaction partners, we need a quadratic form in the amino acid composition. The connection between composition and energy is encoded by the 20×20 energy predictor matrix Pij.
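A quadratic form in the composition can be written as the double sum over n_i * Pij * n_j, where n_i is the frequency of amino acid i. The sketch below computes that sum in plain Python. The per-residue normalization, the function names and the ordering of the amino acids are assumptions, and the matrix used in the example is a placeholder, not the fitted energy predictor matrix.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # assumed ordering of the 20 rows/columns


def composition(sequence):
    """Amino acid frequencies n_i, in the order of AMINO_ACIDS."""
    seq = [a for a in sequence.upper() if a in AMINO_ACIDS]
    if not seq:
        raise ValueError("sequence contains no standard amino acids")
    return [seq.count(a) / len(seq) for a in AMINO_ACIDS]


def estimated_energy(sequence, P):
    """Quadratic form: sum over i, j of n_i * P[i][j] * n_j."""
    n = composition(sequence)
    size = len(AMINO_ACIDS)
    return sum(n[i] * P[i][j] * n[j] for i in range(size) for j in range(size))


if __name__ == "__main__":
    # Placeholder matrix (-1 on the diagonal, 0 elsewhere); NOT the real Pij.
    placeholder = [[-1.0 if i == j else 0.0 for j in range(20)] for i in range(20)]
    print(estimated_energy("AC", placeholder))
```

With the real matrix, the off-diagonal terms are what make the contribution of amino acid i depend on which partners are present in the sequence.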
Estimated energies correlate with calculated energies. Correlation coefficient: 0.74
Estimated pairwise energies of globular proteins and IUPs (plot labels: IUPs, Glob)
IUPred: http://iupred.enzim.hu P53 Tumor antigen
IUPs: high frequency in proteomes (panels: yeast, coli)
Networks: Barabási-Albert, Erdős-Rényi. The yeast interactome
Hub proteins contain more disordered regions in all four genomes
Distinct interfaces of disordered proteins
• More hydrophobic
• More residue-residue contacts
• Fewer segments
Lack of segmentation of the interfaces of IUPs (panel labels: Glob, IUPs)
LM – average disorder profiles local drop in disorder
Predicting protein disorder - IUPred
• Basic idea: If a residue is surrounded by other residues such that they cannot form enough favorable contacts, it will not adopt a well-defined structure; it will be disordered.
• The algorithm:
• Sequence: …..QSDPSVEPPLSQETFSDL WKLLPENNVLSPLPSQAMDDLMLSP D DIEQWFTEDPGPDEAPRMPEAAPRVA PAPAAPTPAAPAPA…..
• Amino acid composition of the environment: A – 10%, C – 0%, D – 12%, E – 10%, F – 2%, etc.
• Estimate the interaction energy between the residue and its sequential environment.
• Decide the probability of the residue being disordered based on this.
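The first step of the algorithm, collecting the amino acid composition of a residue's sequential environment, might look like the sketch below. The function name and the `half_window` parameter are hypothetical; the slide does not state how wide the environment is, so the default of 25 residues on each side is purely an assumption.

```python
def environment_composition(sequence, index, half_window=25):
    """Composition of the residues around position `index` (0-based).

    The residue itself is excluded. `half_window` is an assumed parameter:
    the number of neighbours considered on each side.
    """
    if not 0 <= index < len(sequence):
        raise IndexError("index outside the sequence")
    start = max(0, index - half_window)
    end = min(len(sequence), index + half_window + 1)
    neighbours = sequence[start:index] + sequence[index + 1:end]
    if not neighbours:
        return {}
    return {a: neighbours.count(a) / len(neighbours) for a in sorted(set(neighbours))}


if __name__ == "__main__":
    # Fragment of the p53 sequence shown on the slide, centred on a D residue.
    fragment = "WKLLPENNVLSPLPSQAMDDLMLSPDDIEQWFTEDPGPDEAPRMPEAAPRVA"
    for aa, freq in environment_composition(fragment, 25).items():
        print(f"{aa}: {freq:.0%}")
```

The resulting dictionary plays the role of the "A – 10%, C – 0%, …" table on the slide.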
Predicting protein disorder - IUPred
• Back to p53, residue D:
• Amino acid composition of the environment of residue D: A – 10%, C – 0%, D – 12%, E – 10%, F – 2%, etc.
• Interaction energies: E = 1.16 * 0.10 + (-0.82) * 0 + …
• The predicted interaction energy: E = 1.138
• Probability that the residue is disordered: 97%
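The energy on this slide is a weighted sum: each entry of the residue's row in the energy matrix is multiplied by the frequency of the corresponding amino acid in the environment. The sketch below shows only that arithmetic. The function name is hypothetical, and only the two terms that are legible on the slide (1.16 for A, -0.82 for C) are used, so the result is a partial sum and not the full value of 1.138.

```python
def interaction_energy(energy_row, composition):
    """Sum of energy_row[a] * composition[a] over the amino acids in energy_row."""
    return sum(energy_row[a] * composition.get(a, 0.0) for a in energy_row)


# Only the first two terms are given on the slide; the rest is elided there.
row_for_d = {"A": 1.16, "C": -0.82}
environment = {"A": 0.10, "C": 0.0}

partial_energy = interaction_energy(row_for_d, environment)

if __name__ == "__main__":
    print(f"partial sum of the first two terms: {partial_energy:.3f}")
```

How the energy is turned into the 97% disorder probability is not described on the slide, so that step is not sketched here.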
Predicting binding sites - ANCHOR
• 3 – Interaction with globular proteins
We consider the average amino acid composition of a globular dataset instead of the residue's own environment:
• Own environment: A – 10%, C – 0%, D – 12%, E – 10%, F – 2%, etc.
• Globular dataset: A – 7.67%, C – 2.43%, D – 4.92%, E – 5.43%, F – 3.19%, etc. (composition calculated on a large globular dataset)
The energy gained in this way: (equation not preserved)
Predicting binding sites - ANCHOR
• Example: N-terminal p53. Contains three binding sites:
• MDM2: 17-27
• RPA70N: 33-56
• RNAPII: 45-58
The three quantities are combined optimally to best distinguish binding sites: P = p1*Saverage + p2*Eint + p3*Egain. This is converted into a p-value (probability of the residue forming a disordered binding site).
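The combination step is a plain weighted sum of the three quantities. The sketch below shows it together with a small check of which of the listed p53 binding sites overlap. The function names are hypothetical, the weights p1, p2 and p3 are left as parameters because their fitted values are not given on the slide, and the conversion to a p-value is not shown for the same reason.

```python
def combined_score(s_average, e_int, e_gain, p1, p2, p3):
    """Linear combination P = p1*Saverage + p2*Eint + p3*Egain."""
    return p1 * s_average + p2 * e_int + p3 * e_gain


def overlaps(site_a, site_b):
    """True if two inclusive residue ranges (start, end) share a residue."""
    return site_a[0] <= site_b[1] and site_b[0] <= site_a[1]


# Binding sites of the N-terminal region of p53, as listed on the slide.
P53_SITES = {"MDM2": (17, 27), "RPA70N": (33, 56), "RNAPII": (45, 58)}

if __name__ == "__main__":
    names = list(P53_SITES)
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            print(first, second, overlaps(P53_SITES[first], P53_SITES[second]))
```

With these ranges only RPA70N and RNAPII overlap, so a per-residue score has to cover residues 45-56 for both partners.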
LONGER DISORDERED CHAINS: p27 – CDK2-CyclinA
The predictor was also tested on binding regions longer than 30 amino acids. An example is human p27, which consists of a strongly interacting N-terminal domain (D1, residues 25-36), followed by a linker helix (LH domain, residues 38-60) and a D2 domain (residues 62-90) containing 3 regions of strong interaction. Figure 4 shows the prediction results and their mapping onto the crystal structure of the complex. The prediction score (shown in blue) correlates with the number of atomic contacts (shown in green), indicating that the prediction identifies only the strongly interacting regions.
Figure 4: Prediction output for human p27 (upper) and the identified regions mapped to the structure (lower). The CDK2-Cyclin complex is shown in blue; p27 is shown in yellow, with the identified regions in red. PDB code: 1jsu
Application: Segmented binding
• Example: human p27
• Inhibitor of the CDK2-CyclinA complex.
• 3 domains become ordered during binding:
• D1 binds strongly
• LH forms a helix, binds weakly and steers the third domain into place
• D2 binds strongly but not evenly; it contains 3 subdomains that give the majority of the binding energy
• We are able to identify strongly interacting regions separately
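Identifying strongly interacting regions separately amounts to cutting a per-residue score profile into contiguous stretches that stay above a cutoff. The sketch below shows one simple way to do that. The function name, the cutoff of 0.5 and the `min_length` parameter are assumptions for illustration, and the example scores are made up rather than taken from the p27 prediction.

```python
def regions_above(scores, threshold=0.5, min_length=1):
    """Return contiguous regions with score >= threshold.

    Regions are reported as 1-based inclusive (start, end) residue numbers.
    """
    regions = []
    start = None
    for i, score in enumerate(scores):
        if score >= threshold and start is None:
            start = i  # a new region opens here
        elif score < threshold and start is not None:
            if i - start >= min_length:
                regions.append((start + 1, i))
            start = None
    # Close a region that runs to the end of the sequence.
    if start is not None and len(scores) - start >= min_length:
        regions.append((start + 1, len(scores)))
    return regions


if __name__ == "__main__":
    made_up_profile = [0.1, 0.9, 0.8, 0.2, 0.7]  # illustrative values only
    print(regions_above(made_up_profile))
```

Raising `min_length` filters out isolated single-residue peaks, which is one way to keep only longer segments such as the subdomains of D2.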
"Unknown" sequence – predictions: PSIPRED, ANCHOR
"Unknown" sequence – predictions
Our model: disordered parts; DNA-binding, globular domain; binding site, α-helical; binding site, partially α-helical; binding site, no structural info
Reality (p53): DNA-binding, globular domain; MDM2 binding site; RPA70N and RNAPII binding sites (overlapping); tetramerization region, α-helical; regulatory binding site, 4 partners (different conformations)