Structure Refinement. BCHM 5984 September 7, 2009. Methods of 3-D Structure Prediction. De novo structure prediction Derived completely from sequence Comparative modeling Fold recognition / Threading (when no homologs exist) Homology modeling (when clear homologs exist).
Structure Refinement BCHM 5984 September 7, 2009
Methods of 3-D Structure Prediction • De novo structure prediction • Derived completely from sequence • Comparative modeling • Fold recognition / Threading (when no homologs exist) • Homology modeling (when clear homologs exist)
Steps in Homology Modeling • Identify homologues • Determine sequence identity • Align sequences • Identify conserved / non-conserved regions • Generate model for conserved regions • Generate model for non-conserved regions • Build sidechains • Evaluate and refine
Understanding the Problem Leach, AR (2001) “Molecular Modelling Principles and Applications”. 2nd Ed. Prentice Hall Publishers.
Proteins are Complicated • Many more atoms • Must consider: • Ideal bond lengths, bond angles, dihedrals • Electrostatic interaction (hydrogen bonds, ionic interactions) and van der Waals interactions
How to Solve the Problem • Given a function $f$ which depends on one or more independent variables $x_1, x_2, \ldots, x_i$, find the values of those variables where $f$ has a minimum value. • At a minimum point, the first derivative of the function with respect to each of the variables is zero and the second derivatives are all positive: $\partial f / \partial x_i = 0$ and $\partial^2 f / \partial x_i^2 > 0$ • The function $f$ is the potential energy • The variables $x_i$ are the atomic Cartesian coordinates • Change the coordinates ($x_i$) until we find the configuration with the smallest potential energy
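Examples of the Functions • $E = E_{\text{covalent}} + E_{\text{noncovalent}}$ can be further expanded to: • $E_{\text{covalent}} = E_{\text{bond}} + E_{\text{angle}} + E_{\text{dihedral}}$ • $E_{\text{noncovalent}} = E_{\text{electrostatic}} + E_{\text{vdw}}$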
Examples of the Functions • $E_{\text{noncovalent}} = E_{\text{electrostatic}} + E_{\text{vdw}}$ • $E_{\text{electrostatic}}$: Coulombic Potential • $E_{\text{vdw}}$: Lennard-Jones Potential
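The equations for these two potentials are not reproduced in the slide text. As a sketch, the common textbook forms are given below; the symbols are assumptions rather than the slide's own notation: $q_i$, $q_j$ are partial charges, $r_{ij}$ is the interatomic distance, $\varepsilon_0$ is the vacuum permittivity, $\epsilon_{ij}$ is the Lennard-Jones well depth, and $\sigma_{ij}$ is the distance at which the Lennard-Jones energy is zero. Individual force fields may use different but equivalent conventions.

```latex
E_{\text{electrostatic}} = \sum_{i<j} \frac{q_i q_j}{4 \pi \varepsilon_0 \, r_{ij}}
\qquad
E_{\text{vdw}} = \sum_{i<j} 4 \epsilon_{ij}
  \left[ \left( \frac{\sigma_{ij}}{r_{ij}} \right)^{12}
       - \left( \frac{\sigma_{ij}}{r_{ij}} \right)^{6} \right]
```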
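Examples of the Functions • $E_{\text{covalent}} = E_{\text{bond}} + E_{\text{angle}} + E_{\text{dihedral}}$ • $E_{\text{bond}}$: Bond Stretching Potential • $E_{\text{angle}}$: Harmonic Angle Potential • $E_{\text{dihedral}}$: Dihedral Potential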
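The equations for the covalent terms are likewise missing from the slide text. A sketch of the forms most often seen in textbooks follows; the notation is assumed: $l_0$ and $\theta_0$ are the ideal bond length and angle, $k_b$, $k_\theta$, $k_\phi$ are force constants, $n$ is the dihedral multiplicity, and $\delta$ is the phase. Some force fields fold the factor of 1/2 into the force constant or use other functional forms.

```latex
E_{\text{bond}} = \sum_{\text{bonds}} \tfrac{1}{2} k_b \, (l - l_0)^2
\qquad
E_{\text{angle}} = \sum_{\text{angles}} \tfrac{1}{2} k_\theta \, (\theta - \theta_0)^2
\qquad
E_{\text{dihedral}} = \sum_{\text{dihedrals}} k_\phi \left[ 1 + \cos(n\phi - \delta) \right]
```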
What You Need to Start • Cartesian coordinates of your model • (x, y, z) for every atom = 3N variables where N is the number of atoms • Energy minimizer program • Knows potential energy functions for minimization • Knows ideal bond lengths, bond angles, etc. for all atomic interactions, covalent and non-covalent (also called force fields)
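To make the two inputs concrete, here is a minimal Python sketch of what an energy minimizer is given: a list of (x, y, z) coordinates and a table of parameters. Everything in it is hypothetical: the parameter values in `PARAMS` are toy numbers, not taken from any real force field, and only a harmonic bond term and a Lennard-Jones term are included.

```python
import math

# Hypothetical toy parameters (NOT from any real force field).
PARAMS = {"r0": 1.0, "k_bond": 100.0, "epsilon": 0.5, "sigma": 3.0}

def bond_energy(r, r0, k):
    """Harmonic bond stretching: zero at the ideal length r0."""
    return 0.5 * k * (r - r0) ** 2

def lennard_jones(r, epsilon, sigma):
    """Lennard-Jones 12-6 potential for one non-bonded pair."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 ** 2 - sr6)

def total_energy(coords, bonds, nonbonded, p=PARAMS):
    """coords: list of (x, y, z); bonds / nonbonded: lists of atom index pairs."""
    e = 0.0
    for i, j in bonds:
        e += bond_energy(math.dist(coords[i], coords[j]), p["r0"], p["k_bond"])
    for i, j in nonbonded:
        e += lennard_jones(math.dist(coords[i], coords[j]), p["epsilon"], p["sigma"])
    return e

# Three atoms = 3N = 9 variables; atoms 0 and 1 are bonded.
coords = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (5.0, 0.0, 0.0)]
energy = total_energy(coords, bonds=[(0, 1)], nonbonded=[(0, 2), (1, 2)])
```

A minimizer repeatedly calls a function like `total_energy` (and its derivatives) while adjusting `coords`.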
Energy Minimization Methods • Non-derivative methods • Simplex • First-order derivative methods • Steepest descents • Conjugate gradients • Second-order derivative methods • Newton-Raphson • Quasi-Newton methods • Davidson-Fletcher-Powell (DFP) • Broyden-Fletcher-Goldfarb-Shanno (BFGS)
Simplex Method • Moves around like an “amoeba”
Non-Derivative Methods • Advantages: • Works well when starting configuration is very high in potential energy • Disadvantages: • Surprisingly slow (calculations are fast, but it takes many iterations) • Not good for large biomolecules
Steepest Descents Method 1.) Evaluate the sum of all forces on the system (the negative first derivative of the potential energy function) 2.) Move in the direction of the force until potential energy stops decreasing 3.) Turn 90° and return to step 2 • $s_x = -g_x / |g_x|$ • $s$ = step direction • $g$ = gradient • $x$ = coordinates of system (the subscript $x-1$ denotes the previous step) • The next step is orthogonal: $g_x \cdot g_{x-1} = 0$
SD: When to Turn • Line Search • Find three points along a line where the energy at the middle point is lower than at the other two points • Fit a function (for example, a quadratic) through the three points and determine its minimum • The minimum becomes the new middle point, and repeat • Arbitrary Step • Try a small step size to see that potential energy decreases • Iteratively increase step size until potential energy increases • Multiply the final step size by 0.5
SD: When to Stop • After a predefined energy minimum has been reached • For example, < 1.0 kJ / mol • After a predefined number of steps • For example, after 1000 orthogonal steps
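The steepest descents loop with the arbitrary-step rule and both stopping conditions can be sketched in Python as follows. This is an illustration under stated assumptions, not the method of any particular package: the step is doubled on each trial, the convergence test is placed on the size of the gradient (`force_tol`) rather than on the energy itself, and the two-variable "elongated well" energy function is a hypothetical stand-in for a real force field.

```python
import math

def steepest_descents(energy, gradient, x, first_step=0.01,
                      force_tol=1e-4, max_steps=1000):
    """Return (coordinates, steps_taken) after steepest-descents minimization."""
    x = list(x)
    for n in range(max_steps):
        g = gradient(x)
        norm = math.sqrt(sum(gi * gi for gi in g))
        if norm < force_tol:              # stop: residual force is small enough
            return x, n
        s = [-gi / norm for gi in g]      # unit step direction, s = -g / |g|

        def moved(t):
            return [xi + t * si for xi, si in zip(x, s)]

        e0 = energy(x)
        t = first_step
        while energy(moved(t)) >= e0:     # the small trial step must lower the energy
            t *= 0.5
            if t < 1e-14:                 # no downhill step found: give up here
                return x, n
        e_best = energy(moved(t))
        while energy(moved(2.0 * t)) < e_best:   # keep doubling while energy drops
            t *= 2.0
            e_best = energy(moved(t))
        # The doubled step 2t raised the energy; half of it is t, which is taken.
        x = moved(t)
    return x, max_steps                   # stop: predefined number of steps reached

# Hypothetical elongated well with its minimum at (0, 0).
def well_energy(p):
    return p[0] ** 2 + 10.0 * p[1] ** 2

def well_gradient(p):
    return [2.0 * p[0], 20.0 * p[1]]

start = [5.0, 1.0]
x_min, n_steps = steepest_descents(well_energy, well_gradient, start)
```

Because the step length is only found approximately, successive directions here are close to, but not exactly, orthogonal; an exact line search would make them exactly orthogonal.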
SD: Searching Problem • Does not work well in (relatively) flat energy wells • Takes too many steps / too long to finish
Conjugate Gradients Method 1.) Evaluate the sum of all forces on the system (the negative first derivative of the potential energy function) 2.) Move along a search direction that combines the current force with the previous search direction (the first step is along the force itself) until potential energy stops decreasing 3.) Return to step 1 • Figure legend: Red line = Conjugate gradients, Green line = Steepest descents
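The difference between the two first-order methods can be sketched on a quadratic energy surface, where the line minimum has a closed form. The Python below is an illustration under assumptions: it uses the Fletcher-Reeves formula for mixing in the previous direction (one of several variants), and the matrix `A` and starting point are hypothetical. Setting `use_conjugate=False` turns the same loop into steepest descents with an exact line search.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def matvec(A, v):
    return [dot(row, v) for row in A]

def minimize_quadratic(A, b, x, use_conjugate, tol=1e-10, max_steps=10000):
    """Minimize E(x) = 0.5*x.A.x - b.x; return (x, steps_taken)."""
    g = [ax - bi for ax, bi in zip(matvec(A, x), b)]    # gradient of E
    d = [-gi for gi in g]                               # first direction: the force
    for n in range(max_steps):
        if math.sqrt(dot(g, g)) < tol:
            return x, n
        Ad = matvec(A, d)
        alpha = -dot(g, d) / dot(d, Ad)                 # exact minimum along d
        x = [xi + alpha * di for xi, di in zip(x, d)]
        g_new = [gi + alpha * adi for gi, adi in zip(g, Ad)]
        # Fletcher-Reeves: how much of the previous direction to keep.
        beta = dot(g_new, g_new) / dot(g, g) if use_conjugate else 0.0
        d = [-gn + beta * di for gn, di in zip(g_new, d)]
        g = g_new
    return x, max_steps

# Hypothetical elongated (relatively flat along x) two-variable well.
A = [[2.0, 0.0], [0.0, 20.0]]
b = [0.0, 0.0]
x_cg, cg_steps = minimize_quadratic(A, b, [5.0, 1.0], use_conjugate=True)
x_sd, sd_steps = minimize_quadratic(A, b, [5.0, 1.0], use_conjugate=False)
```

On this surface conjugate gradients needs far fewer steps than steepest descents, which zigzags across the narrow valley.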
Steepest Descents vs. Conjugate Gradients • Steepest descents: • Stable and rigorous • Generally slower and takes more steps than CG in flat wells • Can take bigger steps and finish faster in steep wells • Conjugate gradients: • Slower in the beginning, but can be faster overall (takes fewer steps) in flat wells • Less stable than SD (may need restarting) • Both methods (ideally) converge to the same local energy minimum
Second-Order Derivative Methods • Use first derivatives to see which way the gradient flows • Use second derivatives to see changes in the way the gradient flows • Try to predict the best spot to “jump” to • Newton-Raphson method: $x_{n+1} = x_n - f'(x_n) / f''(x_n)$ • $x_n$ = current position • $x_{n+1}$ = next position • $f'(x_n)$ = first derivative of energy function • $f''(x_n)$ = second derivative of energy function
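For a single variable, the Newton-Raphson update can be sketched in a few lines of Python. The energy function here, f(x) = x⁴/4 − x with its minimum at x = 1, is a hypothetical example chosen only because its derivatives are easy to write down.

```python
def newton_raphson(d1, d2, x, tol=1e-10, max_steps=50):
    """Iterate x <- x - f'(x)/f''(x); return (x, steps_taken)."""
    for n in range(max_steps):
        step = d1(x) / d2(x)
        x -= step
        if abs(step) < tol:        # the jump has become negligibly small
            return x, n + 1
    return x, max_steps

# Hypothetical energy f(x) = x**4 / 4 - x: f'(x) = x**3 - 1, f''(x) = 3 * x**2
x_min, steps = newton_raphson(lambda x: x ** 3 - 1.0,
                              lambda x: 3.0 * x ** 2,
                              x=2.0)
```

With 3N variables the division by f'' becomes a multiplication by the inverse of the matrix of second derivatives, which is where the cost grows.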
Newton-Raphson can be Slow • A Hessian matrix is a matrix of second-order derivatives of a function • Must be calculated in each step for Newton-Raphson method
The BFGS Assumption • Calculating second-order derivatives is hard and time consuming • Never actually calculates a Hessian matrix, just estimates it as it goes along • Estimated by looking at successive gradients • Not technically a “true” second-order derivative method
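The idea of estimating curvature from successive gradients can be illustrated in one dimension, where it reduces to the secant method. The Python sketch below is only a one-variable analogue and is not BFGS itself, which updates a full matrix estimate; the energy function f(x) = x⁴/4 − x and the two starting points are hypothetical.

```python
def secant_minimize(d1, x_prev, x, tol=1e-10, max_steps=100):
    """Quasi-Newton step in 1-D: f'' is estimated, never computed directly."""
    g_prev = d1(x_prev)
    for n in range(max_steps):
        g = d1(x)
        if abs(g) < tol:                           # gradient is essentially zero
            return x, n
        curvature = (g - g_prev) / (x - x_prev)    # estimate of f'' from two gradients
        x_prev, g_prev = x, g
        x = x - g / curvature                      # Newton-like jump with the estimate
    return x, max_steps

# Hypothetical energy f(x) = x**4 / 4 - x, so f'(x) = x**3 - 1 (minimum at x = 1)
x_min, steps = secant_minimize(lambda x: x ** 3 - 1.0, x_prev=2.0, x=1.9)
```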
Second-Order Derivative Methods • Advantages • Takes the fewest steps • Fastest (for small molecules) • Disadvantages • For big systems, can require too much memory • Best suited for small molecules Red line = BFGS method Green line = Conjugate gradients
Choosing an EM Method • Depends on: • Storage / computational capabilities • Number of atoms in the system • When working with proteins, always use steepest descents or conjugate gradients
Inherent Problem of EM • Only finds local minima • No available method is guaranteed to find the global minimum from an arbitrary starting point
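The dependence on the starting point can be shown with a hypothetical one-variable double-well energy, f(x) = x⁴ − 2x² + 0.5x, which has a shallow minimum near x ≈ 0.93 and a deeper one near x ≈ −1.06. The Python sketch below uses a plain fixed-step downhill walk (an assumption made for brevity) from two different starting points.

```python
def double_well(x):
    """Hypothetical energy with two minima of different depth."""
    return x ** 4 - 2.0 * x ** 2 + 0.5 * x

def double_well_gradient(x):
    return 4.0 * x ** 3 - 4.0 * x + 0.5

def walk_downhill(x, step=0.01, n_steps=2000):
    """Fixed-step gradient descent: always moves downhill, never over a barrier."""
    for _ in range(n_steps):
        x -= step * double_well_gradient(x)
    return x

x_from_right = walk_downhill(2.0)    # ends in the shallow well
x_from_left = walk_downhill(-2.0)    # ends in the deep well
```

Both runs converge, but only the one started on the left reaches the lower-energy minimum; nothing in the run started on the right indicates that a deeper well exists.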
Performing Energy Minimization • Links • Dundee PRODRG2 Server (http://davapc1.bioch.dundee.ac.uk/prodrg/) • Swiss-PDBViewer (http://ca.expasy.org/spdbv/text/energy.htm) • GROMACS (http://www.gromacs.org/) • NAMD (http://www.ks.uiuc.edu/Research/namd/) • AMBER (http://ambermd.org/) • CHARMM (http://www.charmm.org/) • Methods of EM in GROMACS • Steepest descents • Conjugate gradients • L-BFGS (limited-memory Broyden-Fletcher-Goldfarb-Shanno quasi-Newtonian minimizer)
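As an illustration of how the method is selected in GROMACS, a minimal parameter-file fragment might look like the sketch below. The file name and the values are hypothetical and should be checked against the mdp documentation for the GROMACS version in use. Note that the GROMACS tolerance is expressed as a maximum force rather than as an energy.

```ini
; Hypothetical em.mdp fragment; values are illustrative, not recommendations
integrator = steep    ; steepest descents (cg = conjugate gradients, l-bfgs = L-BFGS)
emtol      = 10.0     ; stop when the maximum force is below this (kJ mol^-1 nm^-1)
emstep     = 0.01     ; initial step size (nm)
nsteps     = 1000     ; stop after this many steps at the latest
```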