- 2011- 3D Structures of Biological Macromolecules Exercise: Structural Comparison of Proteins. Jürgen Sühnel jsuehnel@imb-jena.de. Leibniz Institute for Age Research, Fritz Lipmann Institute, Jena Centre for Bioinformatics Jena / Germany.
Supplementary Material: http://www.fli-leibniz.de/www_bioc/3D/
Quantitative Structural Comparison of Protein Structures
Aim: Generate a set of superposed three-dimensional coordinates for each input structure in such a way that the root-mean-square deviation (RMSD) for all atom pairs, or for selected subsets of atom pairs, is minimal.
Procedures: The most basic approach to the alignment of 3D structures requires a precalculated sequence alignment as input. An especially simple situation occurs if multiple conformations of the same protein are compared. In this case no alignment is necessary, since the sequences are the same. This method traditionally uses a simple least-squares fitting algorithm, in which the optimal rotations and translations are found by minimizing the sum of the squared distances between equivalent atoms in the structures. Algorithms based on multidimensional rotations and modified quaternions have been developed to identify topological relationships between protein structures without the need for a predetermined alignment.
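The least-squares fit described above can be sketched for the simple case where the atom pairs are already known (for example, two conformations of the same protein). The sketch below uses Horn's closed-form quaternion formulation, which is an assumption on my part and not necessarily the variant used by any server mentioned in these slides. The function names `centroid` and `superpose`, the shifted power iteration used to find the dominant eigenvector, and the fixed starting vector are all illustrative choices. It assumes at least three non-collinear atom pairs.

```python
import math

def centroid(coords):
    """Arithmetic mean of a list of (x, y, z) tuples."""
    n = len(coords)
    return tuple(sum(c[k] for c in coords) / n for k in range(3))

def superpose(mobile, target, iterations=1000):
    """Rotate and translate `mobile` onto `target` in the least-squares sense.

    Both arguments are equally long lists of (x, y, z) tuples whose i-th
    entries form an equivalent atom pair. Returns the fitted coordinates.
    """
    cm, ct = centroid(mobile), centroid(target)
    p = [tuple(a[k] - cm[k] for k in range(3)) for a in mobile]
    q = [tuple(b[k] - ct[k] for k in range(3)) for b in target]

    # Correlation sums S[j][k] = sum_i p_i[j] * q_i[k]
    s = [[sum(pi[j] * qi[k] for pi, qi in zip(p, q)) for k in range(3)]
         for j in range(3)]
    (sxx, sxy, sxz), (syx, syy, syz), (szx, szy, szz) = s

    # Symmetric 4x4 matrix whose dominant eigenvector is the optimal
    # unit quaternion (q0, qx, qy, qz).
    n = [
        [sxx + syy + szz, syz - szy, szx - sxz, sxy - syx],
        [syz - szy, sxx - syy - szz, sxy + syx, szx + sxz],
        [szx - sxz, sxy + syx, -sxx + syy - szz, syz + szy],
        [sxy - syx, szx + sxz, syz + szy, -sxx - syy + szz],
    ]

    # Shifted power iteration: adding the Frobenius norm to the diagonal
    # makes every eigenvalue non-negative, so the iteration converges to
    # the eigenvector of the largest eigenvalue of n.
    shift = math.sqrt(sum(v * v for row in n for v in row))
    quat = [1.0, 0.7, 0.4, 0.1]  # arbitrary generic starting vector
    for _ in range(iterations):
        nxt = [sum(n[r][c] * quat[c] for c in range(4)) + shift * quat[r]
               for r in range(4)]
        norm = math.sqrt(sum(v * v for v in nxt))
        quat = [v / norm for v in nxt]
    q0, qx, qy, qz = quat

    # Rotation matrix corresponding to the unit quaternion.
    rot = [
        [q0*q0 + qx*qx - qy*qy - qz*qz, 2*(qx*qy - q0*qz), 2*(qx*qz + q0*qy)],
        [2*(qy*qx + q0*qz), q0*q0 - qx*qx + qy*qy - qz*qz, 2*(qy*qz - q0*qx)],
        [2*(qz*qx - q0*qy), 2*(qz*qy + q0*qx), q0*q0 - qx*qx - qy*qy + qz*qz],
    ]
    return [tuple(sum(rot[r][k] * pi[k] for k in range(3)) + ct[r]
                  for r in range(3)) for pi in p]

# Made-up test data: the target rotated by 90 degrees about z and shifted.
target = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (0.0, 0.0, 3.0)]
mobile = [(-y + 5.0, x - 2.0, z + 1.0) for x, y, z in target]
fitted = superpose(mobile, target)
```

A production implementation would normally use a library eigenvalue or SVD routine instead of power iteration, which converges slowly when the two largest eigenvalues are close.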
Quantitative Structural Comparison of Protein Structures Root Mean Square Deviation • The RMSD is a measure to quantify structural similarity • Requires 2 superimposed structures (designated here as “a” & “b”) • N = number of atoms being compared
$$\mathrm{RMSD} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left[(x_{ai}-x_{bi})^2+(y_{ai}-y_{bi})^2+(z_{ai}-z_{bi})^2\right]}$$
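As a minimal sketch, the RMSD of two already superimposed structures "a" and "b" can be computed directly from the definition. The function name `rmsd` and the tuple-based coordinate format are illustrative assumptions, and the coordinates in the example are made up.

```python
import math

def rmsd(coords_a, coords_b):
    """RMSD between two superimposed, equally long lists of (x, y, z) tuples."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))

# Two atoms; the second one is displaced by 2 Å along z.
a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
b = [(0.0, 0.0, 0.0), (1.0, 0.0, 2.0)]
value = rmsd(a, b)  # sqrt(4 / 2)
```

Note that the function does not superimpose anything itself: it returns the minimal RMSD only if the coordinates have been fitted beforehand.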
Comparing Protein Structures • Two steps: • Identification of a set of related atom pairs • Superposition with minimum RMSD value
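The first step, identifying related atom pairs, can be sketched for the case where a pairwise sequence alignment is available. The helper below is hypothetical: it assumes the two aligned sequences are given as equally long strings with "-" as the gap character, and it returns the residue index pairs (counted in the ungapped sequences) that sit in the same alignment column. The Cα atoms of those residues would then be the pairs passed to the superposition step.

```python
def aligned_pairs(aligned_a, aligned_b, gap="-"):
    """Return (i, j) residue index pairs for columns where neither row has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    pairs = []
    i = j = 0  # running residue indices in the ungapped sequences
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a != gap and res_b != gap:
            pairs.append((i, j))
        if res_a != gap:
            i += 1
        if res_b != gap:
            j += 1
    return pairs

# Made-up toy alignment with one gap in each sequence.
pairs = aligned_pairs("AC-DE", "A-GDE")
print(pairs)
```

Columns containing a gap in either sequence contribute no atom pair, so insertions and deletions are excluded from the fit.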
Comparing Protein Structures http://www.ruppweb.org/xray/comp/suptext.htm
Comparing Protein Structures http://wishart.biology.ualberta.ca/SuperPose/
Comparing Protein Structures http://www.ebi.ac.uk/Tools/dalilite/
Comparing Protein Structures http://cl.sdsc.edu/
Comparing Protein Structures – SuperPose Server
Beginning with an input PDB file or set of files, SuperPose first extracts the sequences of all chains in the file(s). Each sequence pair is then aligned using a Needleman–Wunsch pairwise alignment algorithm. If the pairwise sequence identity falls below the default threshold (25%), SuperPose determines the secondary structure using VADAR (volume, area, dihedral angle reporter) and performs a secondary structure alignment using a modified Needleman–Wunsch algorithm.
After the sequence or secondary structure alignment is complete, SuperPose generates a difference distance (DD) matrix between aligned alpha carbon atoms. A difference distance matrix can be generated by first calculating the distances between all pairs of Cα atoms in one molecule to generate an initial distance matrix. A second pairwise distance matrix is generated for the second molecule and, for equivalent/aligned Cα atoms, the two matrices are subtracted from one another, yielding the DD matrix. From the DD matrix it is possible to quantitatively assess the structural similarity/dissimilarity between two structures. In fact, the difference distance method is particularly good at detecting domain or hinge motions in proteins. SuperPose analyzes the DD matrices and identifies the largest contiguous domain between the two molecules that exhibits <2.0 Å difference.
From the information derived from the sequence alignment and DD comparison, the program then makes a decision regarding which regions should be superimposed and which atoms should be counted in calculating the RMSD. This information is then fed into the quaternion superposition algorithm and the RMSD calculation subroutine. The quaternion superposition program is written in C and is based on both Kearsley's method and the PDBSUP Fortran program developed by Rupp and Parkin. Quaternions were developed by W. Hamilton (the mathematician/physicist) in 1843 as a convenient way to parameterize rotations in a simple algebraic fashion. Because algebraic expressions are more rapidly calculable than trigonometric expressions using computers, the quaternion approach is exceedingly fast.
SuperPose can calculate both pairwise and multiple structure superpositions (using standard hierarchical methods) and can generate a variety of RMSD values for alpha carbons, backbone atoms, heavy atoms and all atoms (average and pairwise). When identical sequences are compared, SuperPose also generates ‘per residue’ RMSD tables and plots to allow users to identify, assess and view individual residue displacements.
http://wishart.biology.ualberta.ca/SuperPose/
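The difference distance matrix described above can be sketched as follows. This is an illustration of the general idea only, not SuperPose's code: the function names are hypothetical, the coordinates are made up, and the helper that lists pairs exceeding the 2.0 Å cutoff is a simple stand-in for the server's domain analysis. The input lists are assumed to contain the Cα coordinates of already aligned residues in the same order.

```python
import math

def distance_matrix(coords):
    """All pairwise distances within one structure."""
    return [[math.dist(a, b) for b in coords] for a in coords]

def difference_distance_matrix(coords_a, coords_b):
    """Element-wise difference of the two intramolecular distance matrices."""
    if len(coords_a) != len(coords_b):
        raise ValueError("structures must contain the same number of aligned atoms")
    da, db = distance_matrix(coords_a), distance_matrix(coords_b)
    return [[x - y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(da, db)]

def changed_pairs(dd, cutoff=2.0):
    """Index pairs (i < j) whose internal distance differs by more than cutoff."""
    n = len(dd)
    return [(i, j) for i in range(n) for j in range(i + 1, n)
            if abs(dd[i][j]) > cutoff]

# Toy example: four atoms in a line; in structure b the last atom
# swings sideways around atom 2, mimicking a hinge motion.
struct_a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
struct_b = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (7.6, 3.8, 0.0)]
dd = difference_distance_matrix(struct_a, struct_b)
print(changed_pairs(dd))
```

Because only intramolecular distances are compared, the DD matrix needs no prior superposition. In the toy example the pairs involving the last atom stand out, while the rigid part (atoms 0 to 2) shows zero difference.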
Examples • Identical sequence but different structure: Calmodulin: 1A29 vs. 1CLL (open and closed form) • Similar structure but slightly different sequence length: Thioredoxin: 3TRX vs. 2TRX_a • Similar structure but extremely different sequence: Thioredoxin/Glutaredoxin: 3TRX vs. 3GRX_1
Dalton The unified atomic mass unit (symbol: u) or Dalton (symbol: Da) is a unit that is used for indicating mass on an atomic or molecular scale. It is defined as one twelfth of the rest mass of an unbound atom of carbon-12 in its nuclear and electronic ground state, and has a value of 1.660538782(83)×10⁻²⁷ kg. One Da is approximately equal to the mass of 1 proton or 1 neutron.
Dalton Glycine residue (C2H3ON, i.e. glycine as incorporated in a peptide chain): M = 2*12 + 3*1 + 16 + 14 = 57 Da
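The same arithmetic can be written as a short sketch that sums nominal (integer) atomic masses for a simple molecular formula. The `NOMINAL_MASS` table and the function name `nominal_mass` are illustrative assumptions. The parser handles only flat formulas such as "C2H3ON", without brackets or charges.

```python
import re

# Nominal integer masses in Da, as used in the slide's calculation.
NOMINAL_MASS = {"C": 12, "H": 1, "N": 14, "O": 16, "S": 32}

def nominal_mass(formula):
    """Sum nominal atomic masses for a flat formula such as 'C2H3ON'."""
    total = 0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += NOMINAL_MASS[element] * (int(count) if count else 1)
    return total

print(nominal_mass("C2H3ON"))   # glycine residue, as on the slide
print(nominal_mass("C2H5NO2"))  # free glycine (residue plus one water)
```

The two values differ by 18 Da, the nominal mass of the water molecule released when the peptide bond forms.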
Glutaredoxin InterPro