A Pair-Based Approach to Structural Homology Using Quaternion SLERP Averaging and Local Rotations. Lawrence C. Andrews * and Herbert J. Bernstein #. * Micro Encoder Inc., Kirkland, WA, USA, larrya@microen.com # Dowling College, Oakdale, NY, USA, yaya@dowling.edu
Work supported in part by grants from NIGMS and DOE.
Introduction

We present an atom-pair-based alternative to the Kabsch algorithm [Kabsch 1978] for measuring structural homology between commensurate molecular fragments. Our approach reproduces the results of the Kabsch algorithm for fragments that can be mapped into each other by a single rigid body motion. In addition, our pair-based algorithm is able to illuminate cases in which different rigid body motions are required to bring various substructures into alignment.

The Kabsch algorithm is the gold standard for measuring structural homology between two molecular fragments with the same number of atoms and the same connectivity. It first centers each of the two fragments on its centroid and then computes the covariance matrix of the first fragment against the second.

Our pair-based approach takes pairs of atoms, one atom from the first fragment and the matching atom from the second fragment, and computes the plane that perpendicularly bisects the line segment between them. Taking an appropriate sampling of pairs of pairs then yields an axis and, most importantly, an angle of rotation from the intersections of the bisecting planes of the pairs of atoms.
Each axis plus an angle determines a quaternion. If the quaternions are averaged using Spherical Linear Interpolation (SLERP), the result closely agrees with the result of the Kabsch algorithm.

As noted by Ye and Godzik [Ye Godzik 2003], “When flexible molecules in different conformations are compared to each other as rigid bodies, even strong structural similarities can be missed and significant errors in alignments can occur because such algorithm compensate global rearrangements with local alignment shifts.” Their “FATCAT” algorithm works in terms of rigid body movements of fragments. Our pair-based algorithm can be used in a similar manner, but it depends neither on the identification of fragments nor on the identification of translations. Instead, it goes directly to the identification of the rotations needed.
The Kabsch Algorithm

Start with two commensurate molecules, M1 = {m1,i} and M2 = {m2,i}, with the same number of atoms (or residues) in the same order.

Goal: Find a linear transformation, T, such that T(M1) is as close to M2 as possible.

Method: Translate both structures to their centers of mass, then compute the rotation matrix R that minimizes the RMSD between corresponding coordinates from the covariance matrix A of the two centered coordinate sets. For invertible A, the rotation matrix may be computed directly from A. For the general case, a singular value decomposition of A is commonly used, possibly still needing a change from a left-handed to a right-handed system.

See http://boscoh.com/protein/rmsd-root-mean-square-deviation. See [Kavraki 2007] for a general discussion of alternatives for measuring the distances between molecules.
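For reference, one common way of writing the Kabsch steps is sketched below. The notation is an assumption rather than the authors' own: P and Q are the N x 3 matrices whose rows are the centered coordinates of M1 and M2, vectors are treated as columns so that R p_i approximates q_i, and U, Σ, V and d are introduced only for this sketch.

```latex
% Assumed notation: rows of P, Q are centered coordinates p_i^T, q_i^T.
\begin{aligned}
A &= P^{\mathsf T} Q = \sum_{i=1}^{N} p_i\, q_i^{\mathsf T}
  && \text{(covariance matrix)} \\
R &= \left(A^{\mathsf T} A\right)^{1/2} A^{-1}
  && \text{(direct form, proper rotation when } \det A > 0\text{)} \\
A &= U \Sigma V^{\mathsf T}, \qquad
d = \operatorname{sign}\det\!\left(V U^{\mathsf T}\right)
  && \text{(singular value decomposition)} \\
R &= V \operatorname{diag}(1,\, 1,\, d)\, U^{\mathsf T}
  && \text{(general case, right-handed)} \\
\mathrm{RMSD} &= \sqrt{\tfrac{1}{N} \sum_{i=1}^{N} \lVert R\, p_i - q_i \rVert^{2}}
\end{aligned}
```

The factor d is what handles the change from a left-handed to a right-handed system: when det(V Uᵀ) is negative, the unconstrained optimum is a reflection, and flipping the sign of the last singular direction restores a proper rotation.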
Characteristics of the Kabsch Algorithm

- Fast – linear in the number of atoms
- Global – does the entire molecule in one calculation
- Accurate – gives the best fit translation and rotation

But:

- Opaque – no residue-by-residue analysis
- Insensitive to local features – averages them out
- Unable to work with incommensurate structures and substructures

Solution:

- Fit locally and assemble the overall rotation from local rotations
- See the fit while tracing chains
Local Rotation for 2 Pairs of Matching Atoms

[Figure: two pairs of matching atoms, their bisecting planes, and the resulting axis of rotation. Labels:]

- Each pair of atoms defines a bisecting plane.
- The intersection of the bisecting planes defines the direction of the axis of rotation.
- The actual axis of rotation passes through the common origin, parallel to the intersection of the bisecting planes.
- There is a common origin for all rotations.
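The construction in the figure labels can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the RasMol code: the function names are hypothetical, all coordinates are taken relative to the common origin, and the angle is measured as the signed angle about the axis that carries the first atom of pair 1 onto its partner (the slides do not spell out how the angle is extracted). The normal of each bisecting plane is the vector joining the two atoms of the pair, so the line of intersection of two planes is parallel to the cross product of the two normals.

```python
import math

def sub(u, v):
    return (u[0] - v[0], u[1] - v[1], u[2] - v[2])

def dot(u, v):
    return u[0] * v[0] + u[1] * v[1] + u[2] * v[2]

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def perpendicular_part(p, axis):
    """Component of p perpendicular to the unit vector axis."""
    k = dot(p, axis)
    return (p[0] - k * axis[0], p[1] - k * axis[1], p[2] - k * axis[2])

def local_rotation(a1, b1, a2, b2, eps=1e-9):
    """Return (unit axis, angle in radians) for two matching atom pairs.

    a1, a2 come from the first fragment and b1, b2 are their matching
    atoms in the second fragment. Returns None when the bisecting planes
    are parallel or antiparallel, so that no axis direction is defined.
    """
    n1 = sub(b1, a1)              # normal of the bisecting plane of pair 1
    n2 = sub(b2, a2)              # normal of the bisecting plane of pair 2
    raw = cross(n1, n2)           # direction of the line of intersection
    length = math.sqrt(dot(raw, raw))
    scale = math.sqrt(dot(n1, n1) * dot(n2, n2))
    if length <= eps * max(scale, eps):
        return None
    axis = (raw[0] / length, raw[1] / length, raw[2] / length)
    # Assumption: signed angle about the axis taking a1 towards b1.
    pa = perpendicular_part(a1, axis)
    pb = perpendicular_part(b1, axis)
    angle = math.atan2(dot(axis, cross(pa, pb)), dot(pa, pb))
    if angle < 0.0:               # report a non-negative angle
        axis = (-axis[0], -axis[1], -axis[2])
        angle = -angle
    return axis, angle
```

For two atoms related by an exact rotation about an axis through the origin, both displacement vectors are perpendicular to that axis, which is why the cross product recovers its direction.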
Special Cases

For the case of just one atom pair (the “sequence-0” case), there is only one bisecting plane, and the axis is in the direction of the cross-product of the vectors from the center of mass to the two atoms of the pair.

If the two bisecting planes for the two pairs of atoms are parallel or antiparallel, there is no intersection to use as an axis direction. In that case we apply Andrews’ vector “pair” algorithm (http://vector.sf.net), computing the rotation needed to bring the atoms of the first pair along the same axis and then the rotation around that axis to bring the atoms of the second pair as close as possible. This produces the needed axis. This is a rare case in real molecules but has been handled for completeness.

When the two planes do intersect, the direction of the axis might be inverted. To resolve that ambiguity, we choose the direction closest to the direction of the cross-product of the vectors to the midpoints of the lines between the representatives of the two pairs in each molecule.
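Two of these special cases are simple enough to sketch directly. The code below is an illustration with hypothetical function names, not the vector or RasMol library code. It assumes all positions are given relative to the center of mass, and it reads the disambiguation rule as: take the midpoint of the two atoms in the first molecule and the midpoint of their partners in the second molecule, and prefer the axis direction with a non-negative projection on the cross-product of those two midpoint vectors. The parallel-plane case is not covered.

```python
import math

def dot(u, v):
    return u[0] * v[0] + u[1] * v[1] + u[2] * v[2]

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def single_pair_rotation(a, b, eps=1e-12):
    """Sequence-0 case: axis and angle from a single matching atom pair.

    a and b are the positions of the two atoms of the pair relative to
    the center of mass. Returns None if a, b and the origin are collinear.
    """
    raw = cross(a, b)
    length = math.sqrt(dot(raw, raw))
    if length < eps:
        return None
    axis = (raw[0] / length, raw[1] / length, raw[2] / length)
    angle = math.atan2(length, dot(a, b))   # angle between a and b
    return axis, angle

def orient_axis(axis, a1, a2, b1, b2):
    """Flip axis if it points away from the midpoint cross-product.

    a1, a2 are atoms of the first molecule; b1, b2 are their partners.
    """
    m1 = tuple(0.5 * (p + q) for p, q in zip(a1, a2))   # midpoint, molecule 1
    m2 = tuple(0.5 * (p + q) for p, q in zip(b1, b2))   # midpoint, molecule 2
    reference = cross(m1, m2)
    if dot(axis, reference) < 0.0:
        return (-axis[0], -axis[1], -axis[2])
    return tuple(axis)
```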
Quaternions and SLERP

A rotation may be represented equally well as a rotation axis and angle, as a 3 x 3 matrix or as a quaternion [Hamilton 1844], but quaternions are computationally most efficient and much easier to combine [Shoemake 1985].

A rotation around a unit vector axis [x, y, z] by angle α is equivalent to the quaternion

q = cos(α/2) + sin(α/2) (x i + y j + z k)

where i, j and k are square roots of -1 defined so that ij = k = -ji, jk = i = -kj, ki = j = -ik.

Two unit quaternions q0 and q1 can be combined smoothly with Spherical Linear Interpolation (SLERP):

SLERP(q0, q1; t) = [sin((1 - t)Ω) q0 + sin(tΩ) q1] / sin Ω, where cos Ω = q0 · q1 and 0 ≤ t ≤ 1.

For combining local rotations, it is important to catch equivalent rotations with opposing axes before combining them and then invert one axis, resulting in a Hemispherical Linear Interpolation (HLERP).
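A minimal Python sketch of these three steps follows. The function names are hypothetical and are not the CQRlib API. Quaternions are stored as (w, x, y, z) tuples, and the hemisphere check is implemented as a sign test on the quaternion dot product, which is one common reading of the HLERP step described above and should be treated as an assumption.

```python
import math

def quat_from_axis_angle(axis, angle):
    """Unit quaternion (w, x, y, z) for a rotation by angle about axis."""
    x, y, z = axis
    n = math.sqrt(x * x + y * y + z * z)    # normalise the axis
    s = math.sin(angle / 2.0) / n
    return (math.cos(angle / 2.0), x * s, y * s, z * s)

def slerp(q0, q1, t):
    """Spherical linear interpolation between unit quaternions."""
    d = sum(a * b for a, b in zip(q0, q1))
    d = max(-1.0, min(1.0, d))              # guard acos against round-off
    omega = math.acos(d)
    so = math.sin(omega)
    if so < 1e-8:
        # q0 and q1 coincide up to sign: nothing to interpolate
        return tuple(q0)
    w0 = math.sin((1.0 - t) * omega) / so
    w1 = math.sin(t * omega) / so
    q = tuple(w0 * a + w1 * b for a, b in zip(q0, q1))
    n = math.sqrt(sum(c * c for c in q))
    return tuple(c / n for c in q)

def hlerp(q0, q1, t):
    """SLERP after moving q1 into the same hemisphere as q0."""
    if sum(a * b for a, b in zip(q0, q1)) < 0.0:
        q1 = tuple(-c for c in q1)          # q and -q are the same rotation
    return slerp(q0, q1, t)
```

For example, interpolating halfway between the identity and a 90 degree rotation about z gives a 45 degree rotation about z, and `hlerp` gives the same answer if the second quaternion is supplied with its sign reversed.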
Earlier Use of Quaternions

Diamond [Diamond 1988] used what is essentially the Cayley representation of a quaternion for the desired rotation, dividing by the cosine term and omitting the resulting constant 1, which leaves the vector tan(α/2) [x, y, z]. This works except at 180 degrees.

Horn [Horn 1987] independently proposed a true quaternion representation for the desired rotation.

In both Diamond’s and Horn’s approaches, the 3 x 3 matrix of the Kabsch algorithm is replaced by a 4 x 4 matrix, but the use of quaternions guarantees proper rotations.

Kearsley [Kearsley 1989] adopted a similar quaternion-based approach, functionally equivalent to Diamond’s.

Theobald [Theobald 2005] showed that there are numerical problems with Diamond’s approach, producing incorrect RMSDs in some cases, and proposed a fast, numerically stable approach to calculating the RMSD.

The derivation in Horn’s paper suggests the possibility of looking at the rotations three atoms at a time but does not pursue that thread of research. We follow that concept, working only two atoms at a time, and, by combining quaternions by SLERP rather than by the simple matrix summation in Horn, achieve a close approximation to Kabsch.

Jmol implements Horn’s quaternion-based algorithm and offers a different quaternion-based orientation fit [Hanson 2009].
Combining Local Rotations for Commensurate Molecules

For commensurate molecules, a reasonably accurate approximation to the results of the Kabsch algorithm can be achieved by tracing the chain, calculating local rotations as quaternions for each Cα from the Cα atoms of a few (1 to 3) residues on each side of it, and combining them with HLERP.

For example, if we take the matching Cα atoms from PDB entries 3I82 and 3I87 [Tanaka Sawaya Yeates 2010], the Kabsch algorithm achieves a fit with an RMSD of 3.10 Å. The local rotation fit tracing the chain with a window of one residue above and below achieves essentially the same fit with an RMSD of 3.11 Å.

More remarkable is that essentially the same result can be achieved by using non-local rotations, matching residues to residues far from the target residue. For commensurate structures the fit is surprisingly stable.
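One simple way to combine a list of local-rotation quaternions into a single overall rotation is an incremental mean, sketched below. This is an assumption about the combining step, not a description of the RasMol implementation: the k-th quaternion is blended into the running mean with weight 1/k, so every local rotation ends up with equal weight. The function names are hypothetical, and quaternions are (w, x, y, z) tuples.

```python
import math

def hlerp(q0, q1, t):
    """Hemispherical SLERP between unit quaternions (w, x, y, z)."""
    d = sum(a * b for a, b in zip(q0, q1))
    if d < 0.0:                       # same rotation, opposite sign: flip
        q1 = tuple(-c for c in q1)
        d = -d
    omega = math.acos(min(1.0, d))
    so = math.sin(omega)
    if so < 1e-8:                     # effectively identical rotations
        return tuple(q0)
    w0 = math.sin((1.0 - t) * omega) / so
    w1 = math.sin(t * omega) / so
    q = tuple(w0 * a + w1 * b for a, b in zip(q0, q1))
    n = math.sqrt(sum(c * c for c in q))
    return tuple(c / n for c in q)

def hlerp_mean(quats):
    """Equal-weight running HLERP mean of a non-empty sequence."""
    it = iter(quats)
    mean = next(it)
    for k, q in enumerate(it, start=2):
        mean = hlerp(mean, q, 1.0 / k)   # new item gets weight 1/k
    return mean

def rotation_angle_deg(q):
    """Rotation angle in degrees (0 to 180) of a unit quaternion."""
    w = max(-1.0, min(1.0, abs(q[0])))
    return math.degrees(2.0 * math.acos(w))
```

With rotations of 10, 20 and 30 degrees about a shared axis, the running mean passes through 15 degrees and ends at 20 degrees, regardless of the sign with which each quaternion is supplied. For rotations about different axes the result depends on the order of accumulation, which is one reason to treat this as a sketch.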
Comparison of Residue-by-Residue Distances for Kabsch vs. Local Rotation

[Figure: Comparison of 3I87 aligned to 3I82 by the Kabsch algorithm versus our local rotation algorithm with sequence ranges of 0 to 5 residues. Axes: distance in Ångstroms versus residue number.]
Extending Local Rotations to Incommensurate Structures

As seen in commensurate structures, local rotations provide strong indicators of deviations from overall rotational alignment, suggesting their use in clustering to identify alignable substructures.

[Figure: Populations of local rotations in 15 degree bands of all pairs nearest to geodesic points on the sphere of rotation axes, comparing 3I82 and 3I87. Z is towards the viewer, Y is up. Note the clusters in the 15-45 degree range for axes both to the NW and ENE.]
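The angular half of that binning can be sketched as follows. This is only an illustration with a hypothetical function name: it counts rotation angles in 15 degree bands and leaves out the assignment of each rotation axis to its nearest geodesic point on the sphere. It assumes the band width divides 180 evenly.

```python
from collections import Counter

def angle_bands(angles_deg, width=15.0):
    """Count rotation angles (degrees) in bands of the given width.

    Angles are folded into the range 0 to 180 first, so -20 and 340
    both land in the same band as 20. Returns {(low, high): count}.
    """
    n_bands = int(180.0 // width)
    counts = Counter()
    for a in angles_deg:
        a = abs(a) % 360.0
        if a > 180.0:
            a = 360.0 - a
        index = min(int(a // width), n_bands - 1)   # 180 joins the top band
        counts[(index * width, (index + 1) * width)] += 1
    return dict(sorted(counts.items()))
```

Bands with unusually high counts, such as the 15-45 degree clusters noted in the figure, are candidates for substructures that share a common rotation.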
Relationship to FATCAT

Ye and Godzik [Ye Godzik 2003] describe FATCAT as “Flexible structure AlignmenT by Chaining AFPs (Aligned Fragment Pairs) with Twists (FATCAT), a new method for structural alignment of proteins. The FATCAT approach simultaneously addresses the two major goals of flexible structure alignment: optimizing the alignment and minimizing the number of rigid-body movements (twists) around pivot points (hinges) introduced in the reference protein. In contrast, currently existing flexible structure alignment programs treat the hinge detection as a post-process of a standard rigid body alignment.”

FATCAT is an important tool in understanding structural homology and in forming hypotheses about the relationships among structurally distinct proteins.

Our local rotation algorithm shows promise as a possible next step after FATCAT processing, both for understanding more of the detail of changes in morphology and for suggesting strong relationships in morphology among protein fragments that are not related by simple twists and pivots. It does so by using similar patterns of local rotation quaternions as signatures rather than relying only on patterns of Cα distances.
Features Shown by Local Rotation

[Figure: Comparison of 3I87 aligned to 3I82 by our local rotation algorithm with sequence ranges of 0 to 5 residues, showing the sines of the local angles. Axes: absolute value of sine of local rotation angle versus residue number.]
Characteristics of our Local Rotation Algorithm

- Fast – linear in the number of atoms
- Local and Global – examines a molecule residue-by-residue, but the HLERP average gives the global fit
- Accurate – closely reproduces the best-fit rotation (RMSD of 3.11 Å versus 3.10 Å for Kabsch in the 3I82/3I87 example)
- Not opaque – provides residue-by-residue analysis
- Sensitive to local features (see residues 40-60 in the prior graph)
- Potentially extensible to incommensurate structures and substructures

The chain-trace of local rotation angles is a particularly sensitive measure of local distortions between two generally similar structures. It fills a role for structural distortions between molecules reminiscent of the role played by Ramachandran plots for distortions within a single macromolecule.
Current Status and Future Plans

The code for commensurate structures is being tested in RasMol 2.7.6 and is available in the RasMol-2.7.6 branch of the svn repository of the openrasmol project on sourceforge: http://sf.net/projects/openrasmol

The basic code for vector manipulations is available in the vector project on sourceforge: http://sf.net/projects/vector

The code used for nearest neighbor calculations is available in the neartree project on sourceforge: http://sf.net/projects/neartree

The code for quaternion manipulations is available in the cqrlib project on sourceforge: http://sf.net/projects/cqrlib

Production release is expected in Fall 2010.
References

[Diamond 1988] Diamond, R. (1988), “A note on the rotational superposition problem,” Acta Cryst. A44:2, 211–216.
[Hamilton 1844] Hamilton, W. R. (1844), “On quaternions; or on a new system of imaginaries in Algebra,” Philosophical Magazine Series 3 25:169, 489–495.
[Hanson 2009] Hanson, R. M. (2009), “Use of quaternions in biomolecular structure analysis,” 238th ACS National Meeting, Washington, D.C., March 22-26, 2009.
[Horn 1987] Horn, B. K. P. (1987), “Closed-form solution of absolute orientation using unit quaternions,” J. Optical Soc. America A 4:4, 629–642.
[Kabsch 1978] Kabsch, W. (1978), “A discussion of the solution for the best rotation to relate two sets of vectors,” Acta Cryst. A34:5, 827–828.
[Kavraki 2007] Kavraki, L. E. (2007), “Molecular Distance Measures,” Connexions, http://cnx.org/content/m11608/1.23/, Jun 11, 2007.
[Kearsley 1989] Kearsley, S. K. (1989), “On the orthogonal transformation used for structural comparisons,” Acta Cryst. A45:2, 208–210.
[Shoemake 1985] Shoemake, K. (1985), “Animating rotation with quaternion curves,” ACM SIGGRAPH Computer Graphics 19:3, 245–254.
[Tanaka Sawaya Yeates 2010] Tanaka, S., Sawaya, M. R., Yeates, T. O. (2010), “Structure and mechanisms of a protein-based organelle in Escherichia coli,” Science 327:5961, 81–84.
[Theobald 2005] Theobald, D. L. (2005), “Rapid calculation of RMSDs using a quaternion based characteristic polynomial,” Acta Cryst. A61:4, 478–480.
[Ye Godzik 2003] Ye, Y., Godzik, A. (2003), “Flexible structure alignment by chaining aligned fragment pairs allowing twists,” Bioinformatics 19, suppl. 2, ii246–ii255.