Structure Refinement

Structure Refinement BCHM 5984 September 7, 2009

Methods of 3-D Structure Prediction • De novo structure prediction • Derived completely from sequence • Comparative modeling • Fold recognition / Threading (when no homologs exist) • Homology modeling (when clear homologs exist)

Steps in Homology Modeling 1.) Identify homologues 2.) Determine sequence identity 3.) Align sequences 4.) Identify conserved / non-conserved regions 5.) Generate model for conserved regions 6.) Generate model for non-conserved regions 7.) Build sidechains 8.) Evaluate and refine

Understanding the Problem Leach, AR (2001) “Molecular Modelling Principles and Applications”. 2nd Ed. Prentice Hall Publishers.

Proteins are Complicated • Many more atoms • Must consider: • Ideal bond lengths, bond angles, dihedrals • Electrostatic interaction (hydrogen bonds, ionic interactions) and van der Waals interactions

How to Solve the Problem • Given a function f which depends on one or more independent variables x1, x2, …, xi, find the values of those variables where f has a minimum value. • At a minimum point, the first derivative of the function with respect to each of the variables is zero and the second derivatives are all positive: ∂f/∂xi = 0 and ∂²f/∂xi² > 0 • The function f is the potential energy • The variables xi are the atomic Cartesian coordinates • Change the coordinates (xi) until we find the arrangement with the smallest potential energy

Examples of the Functions • E = E_covalent + E_noncovalent can be further expanded to: • E_covalent = E_bond + E_angle + E_dihedral • E_noncovalent = E_electrostatic + E_vdw

Examples of the Functions • E_noncovalent = E_electrostatic + E_vdw • E_electrostatic: Coulombic potential • E_vdw: Lennard-Jones potential
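The equations for these two potentials did not survive in the transcript. As a minimal sketch, assuming the usual textbook forms (a point-charge Coulomb term and a 12-6 Lennard-Jones term), the two non-covalent contributions for one atom pair could be computed as below. The function names, the parameter names (`epsilon`, `sigma`, `eps_r`) and the choice of kJ/mol, nm, and elementary-charge units are assumptions made for illustration, not taken from the slides.

```python
import math

# Electric conversion factor 1/(4*pi*eps0) in kJ mol^-1 nm e^-2
# (assumed unit system: energies in kJ/mol, distances in nm, charges in e).
COULOMB_CONSTANT = 138.935458


def coulomb_energy(q_i, q_j, r, eps_r=1.0):
    """Coulombic potential between two point charges separated by r."""
    return COULOMB_CONSTANT * q_i * q_j / (eps_r * r)


def lennard_jones_energy(r, epsilon, sigma):
    """12-6 Lennard-Jones potential: repulsive r^-12 term, attractive r^-6 term."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)


# Hypothetical parameters for a single atom pair.
r_min = 2.0 ** (1.0 / 6.0) * 0.3  # separation where the LJ energy is lowest
print(lennard_jones_energy(r_min, epsilon=0.65, sigma=0.3))
print(coulomb_energy(0.4, -0.8, r=0.3))
```

In this form the Lennard-Jones energy is zero at r = sigma and reaches its minimum of -epsilon at r = 2^(1/6) sigma, which is why those two parameters are often read as the atom "size" and the well depth.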

Examples of the Functions • E_covalent = E_bond + E_angle + E_dihedral • E_bond: bond stretching potential • E_angle: harmonic angle potential • E_dihedral: dihedral potential
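The covalent terms can be sketched the same way. The exact functional forms shown on the slide are not in the transcript, so the code below assumes the forms most force fields use: harmonic springs for bonds and angles, and a periodic cosine term for dihedrals. All function and parameter names are illustrative assumptions.

```python
import math


def bond_energy(r, r0, k_b):
    """Harmonic bond stretching: zero at the ideal bond length r0."""
    return 0.5 * k_b * (r - r0) ** 2


def angle_energy(theta, theta0, k_theta):
    """Harmonic angle bending: zero at the ideal angle theta0 (radians)."""
    return 0.5 * k_theta * (theta - theta0) ** 2


def dihedral_energy(phi, k_phi, n, phi_s):
    """Periodic dihedral term with multiplicity n and phase shift phi_s (radians)."""
    return k_phi * (1.0 + math.cos(n * phi - phi_s))


# Hypothetical parameters: a bond stretched by 0.01 nm, an angle bent by 5 degrees,
# and a threefold dihedral evaluated at 60 degrees.
e_covalent = (
    bond_energy(0.163, r0=0.153, k_b=250000.0)
    + angle_energy(math.radians(114.5), math.radians(109.5), k_theta=400.0)
    + dihedral_energy(math.radians(60.0), k_phi=5.0, n=3, phi_s=0.0)
)
print(e_covalent)
```

Each term is smallest when the geometry matches the "ideal" value stored in the force field, which is what the minimizer exploits.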

What You Need to Start • Cartesian coordinates of your model • (x, y, z) for every atom = 3N variables where N is the number of atoms • Energy minimizer program • Knows potential energy functions for minimization • Knows ideal bond lengths, bond angles, etc. for all atomic interactions, covalent and non-covalent (also called force fields)

Energy Minimization Methods • Non-derivative methods • Simplex • First-order derivative methods • Steepest descents • Conjugate gradients • Second-order derivative methods • Newton-Raphson • Quasi-Newton methods • Davidson-Fletcher-Powell (DFP) • Broyden-Fletcher-Goldfarb-Shanno (BFGS)

Simplex Method • Moves around like an “amoeba”

Non-Derivative Methods • Advantages: • Works well when starting configuration is very high in potential energy • Disadvantages: • Surprisingly slow (calculations are fast, but it takes many iterations) • Not good for large biomolecules
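The "amoeba" behaviour of the simplex method can be illustrated with a small downhill-simplex (Nelder-Mead style) routine. This is a simplified sketch written for this transcript, not the implementation of any particular package: it uses only reflection, expansion, inside contraction, and shrink moves, and the test function is a hypothetical two-variable bowl rather than a molecular energy.

```python
def simplex_minimize(f, start, step=1.0, tol=1e-10, max_iter=2000):
    """Minimize f without derivatives by moving a simplex of n + 1 points."""
    n = len(start)
    # Initial simplex: the start point plus one point displaced along each axis.
    simplex = [list(start)]
    for i in range(n):
        point = list(start)
        point[i] += step
        simplex.append(point)

    for _ in range(max_iter):
        simplex.sort(key=f)
        best, worst = simplex[0], simplex[-1]
        if abs(f(worst) - f(best)) < tol:
            break
        # Centroid of every vertex except the worst one.
        centroid = [sum(p[i] for p in simplex[:-1]) / n for i in range(n)]

        def along(t):
            # Point on the line through the worst vertex and the centroid.
            return [c + t * (c - w) for c, w in zip(centroid, worst)]

        reflected = along(1.0)
        if f(reflected) < f(best):
            expanded = along(2.0)
            simplex[-1] = expanded if f(expanded) < f(reflected) else reflected
        elif f(reflected) < f(simplex[-2]):
            simplex[-1] = reflected
        else:
            contracted = along(-0.5)
            if f(contracted) < f(worst):
                simplex[-1] = contracted
            else:
                # Shrink every vertex halfway toward the best one.
                simplex = [best] + [
                    [b + 0.5 * (x - b) for b, x in zip(best, p)] for p in simplex[1:]
                ]
    simplex.sort(key=f)
    return simplex[0]


def bowl(p):
    return (p[0] - 1.0) ** 2 + (p[1] + 2.0) ** 2


print(simplex_minimize(bowl, [5.0, 5.0]))
```

Only energy evaluations are needed, which matches the advantage listed above, but every move changes a single vertex, so many iterations are required once the number of variables grows to 3N for a protein.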

Steepest Descents Method 1.) Evaluate the sum of all forces on the system (the negative first derivative of the potential energy function) 2.) Move in the direction of the force until potential energy stops decreasing 3.) Turn 90° and return to step 2 • s_x = -g_x / |g_x| • s = step direction • g = gradient • x = coordinates of system • The next step is orthogonal to the previous one: g_x · g_(x-1) = 0

SD: When to Turn • Line Search • Find three points along the search line where the energy at the middle point is lower than at the other two points • Fit a function (for example, a quadratic) through the three points and determine its minimum • The minimum becomes the new middle point; repeat • Arbitrary Step • Try a small step size and check that the potential energy decreases • Iteratively increase the step size until the potential energy increases • Multiply the final step size by 0.5
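One pass of the line-search idea can be written down directly. The sketch below assumes the fitted function is a parabola through the three bracketing points, which is a common choice but is not stated on the slide; the function name and the example energy profile are hypothetical.

```python
def parabolic_minimum(a, b, c, f):
    """Return the minimum of the parabola through (a, f(a)), (b, f(b)), (c, f(c)).

    Assumes a < b < c and f(b) is lower than both f(a) and f(c).
    """
    fa, fb, fc = f(a), f(b), f(c)
    numerator = (b - a) ** 2 * (fb - fc) - (b - c) ** 2 * (fb - fa)
    denominator = (b - a) * (fb - fc) - (b - c) * (fb - fa)
    return b - 0.5 * numerator / denominator


def profile(x):
    # Hypothetical energy along the search direction.
    return (x - 2.0) ** 2


print(parabolic_minimum(0.0, 1.0, 4.0, profile))
```

For an energy profile that really is quadratic along the line, a single fit lands on the minimum; for a general profile the estimate becomes the new middle point and the fit is repeated.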

SD: When to Stop • After a predefined energy minimum has been reached • For example, < 1.0 kJ / mol • After a predefined number of steps • For example, after 1000 orthogonal steps
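Putting the pieces together, a steepest descents loop with the arbitrary-step scheme might look like the sketch below. Several details are assumptions rather than slide content: the growth factor of 1.2 after a successful step, the use of a gradient-norm threshold (`g_tol`) as the convergence test alongside the step limit, and the two-variable test function standing in for a molecular energy.

```python
import math


def steepest_descent(energy, gradient, x, step=0.01, g_tol=1e-4, max_steps=10000):
    """Steepest descents with an adaptive ("arbitrary") step size."""
    e = energy(x)
    for _ in range(max_steps):                     # stop after a fixed number of steps
        g = gradient(x)
        g_norm = math.sqrt(sum(gi * gi for gi in g))
        if g_norm < g_tol:                         # stop when the forces are small
            break
        # Unit step direction s = -g / |g|, scaled by the current step size.
        trial = [xi - step * gi / g_norm for xi, gi in zip(x, g)]
        e_trial = energy(trial)
        if e_trial < e:
            x, e = trial, e_trial                  # accept the move
            step *= 1.2                            # assumed growth factor
        else:
            step *= 0.5                            # energy rose: halve the step
    return x, e


def energy(p):
    return (p[0] - 1.0) ** 2 + 4.0 * (p[1] + 2.0) ** 2


def gradient(p):
    return [2.0 * (p[0] - 1.0), 8.0 * (p[1] + 2.0)]


x_min, e_min = steepest_descent(energy, gradient, [0.0, 0.0])
print(x_min, e_min)
```

Because a move is only accepted when the energy drops, the energy never increases from one accepted step to the next, which is part of why the method is considered robust for poor starting structures.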

SD: Searching Problem • Does not work well in (relatively) flat energy wells • Takes too many steps / too long to finish

Conjugate Gradients Method 1.) Evaluate the sum of all forces on the system (the negative first derivative of the potential energy function) 2.) Move in the direction of the force until potential energy stops decreasing 3.) Return to step 1 • Figure legend: red line = conjugate gradients; green line = steepest descents
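The step list above does not show what makes the directions "conjugate". In the standard formulation (an addition here, not stated on the slide), each new search direction is the current force plus a fraction of the previous direction. The sketch below compares the two methods on a hypothetical two-variable quadratic energy, where the line search can be done exactly; the Fletcher-Reeves formula for the mixing factor `beta` is one common choice and is an assumption.

```python
def matvec(A, v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def minimize_quadratic(A, b, x, use_conjugate=True, tol=1e-10, max_iter=10000):
    """Minimize 0.5 * x.A.x - b.x; returns (x, number of line searches)."""
    r = [bi - yi for bi, yi in zip(b, matvec(A, x))]  # force = -gradient
    d = list(r)                                       # first direction: the force
    for iteration in range(max_iter):
        rr = dot(r, r)
        if rr < tol * tol:
            return x, iteration
        Ad = matvec(A, d)
        alpha = dot(r, d) / dot(d, Ad)                # exact line search along d
        x = [xi + alpha * di for xi, di in zip(x, d)]
        r = [ri - alpha * adi for ri, adi in zip(r, Ad)]
        # Fletcher-Reeves mixing factor; zero reduces this to steepest descents.
        beta = dot(r, r) / rr if use_conjugate else 0.0
        d = [ri + beta * di for ri, di in zip(r, d)]
    return x, max_iter


A = [[1.0, 0.0], [0.0, 10.0]]  # a long, narrow energy well
b = [0.0, 0.0]
start = [10.0, 1.0]
x_cg, n_cg = minimize_quadratic(A, b, start, use_conjugate=True)
x_sd, n_sd = minimize_quadratic(A, b, start, use_conjugate=False)
print("conjugate gradients:", n_cg, "line searches")
print("steepest descents:", n_sd, "line searches")
```

On a narrow well like this one, steepest descents zigzags between the walls while conjugate gradients needs far fewer line searches, which is the behaviour the red and green paths in the figure illustrate.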

Steepest Descents vs. Conjugate Gradients • Steepest descents: • Stable and rigorous • Generally slower and takes more steps than CG in flat wells • Can take bigger steps and finish faster in steep wells • Conjugate gradients: • Slower in the beginning, but can be faster overall (takes fewer steps) in flat wells • Less stable than SD (may need restarting) • Both methods (ideally) converge to the same local energy minimum

Second-Order Derivative Methods • Use first derivatives to see which way the gradient flows • Use second derivatives to see how the gradient changes • Try to predict the best spot to “jump” to • Newton-Raphson method: x_n+1 = x_n - f'(x_n) / f''(x_n) • x_n = current position • x_n+1 = next position • f'(x_n) = first derivative of energy function • f''(x_n) = second derivative of energy function
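For a single variable, the Newton-Raphson update defined by these symbols is x_n+1 = x_n - f'(x_n) / f''(x_n), and it fits in a few lines. The sketch below is one-dimensional only; the function names and the test function f(x) = x^2 + exp(-x) are hypothetical choices for illustration.

```python
import math


def newton_raphson(df, d2f, x, tol=1e-10, max_iter=50):
    """Iterate x <- x - f'(x) / f''(x) until the step is smaller than tol."""
    for _ in range(max_iter):
        step = df(x) / d2f(x)
        x -= step
        if abs(step) < tol:
            break
    return x


def df(x):
    # First derivative of f(x) = x**2 + exp(-x)
    return 2.0 * x - math.exp(-x)


def d2f(x):
    # Second derivative of the same function (always positive)
    return 2.0 + math.exp(-x)


x_min = newton_raphson(df, d2f, x=3.0)
print(x_min, df(x_min))

# On a purely quadratic function the first jump lands on the minimum.
print(newton_raphson(lambda x: 2.0 * (x - 3.0), lambda x: 2.0, x=10.0))
```

With 3N coordinates the scalar f''(x) becomes a 3N x 3N matrix of second derivatives, and the division becomes a linear solve at every step.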

Newton-Raphson can be Slow • A Hessian matrix is a matrix of second-order derivatives of a function • Must be calculated in each step for Newton-Raphson method

The BFGS Assumption • Calculating second-order derivatives is hard and time consuming • Never actually calculates a Hessian matrix, just estimates it as it goes along • Estimated by looking at successive gradients • Not technically a “true” second-order derivative method
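The idea of estimating curvature from successive gradients is easiest to see in one dimension, where it reduces to a secant estimate of the second derivative. The sketch below is that one-dimensional analogue, not BFGS itself (BFGS updates a full matrix estimate); the function names and the test function are hypothetical.

```python
import math


def secant_minimize(df, x_prev, x, tol=1e-10, max_iter=100):
    """Newton-like steps using a curvature estimated from two successive gradients."""
    g_prev, g = df(x_prev), df(x)
    for _ in range(max_iter):
        if abs(g) < tol or g == g_prev:
            break
        # Estimated second derivative: change in gradient over change in position.
        curvature = (g - g_prev) / (x - x_prev)
        x_prev, g_prev = x, g
        x = x - g / curvature
        g = df(x)
    return x


def df(x):
    # First derivative of f(x) = x**2 + exp(-x); no second derivative is supplied.
    return 2.0 * x - math.exp(-x)


x_min = secant_minimize(df, x_prev=0.0, x=1.0)
print(x_min, df(x_min))
```

Only gradients are ever evaluated, yet the steps behave much like Newton-Raphson steps once the curvature estimate settles.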

Second-Order Derivative Methods • Advantages • Takes the fewest steps • Fastest (for small molecules) • Disadvantages • For big systems, can require too much memory • Best suited for small molecules • Figure legend: red line = BFGS method; green line = conjugate gradients

Choosing an EM Method • Depends on: • Storage / computational capabilities • Number of atoms in the system • When working with proteins, always use steepest descents or conjugate gradients

Inherent Problem of EM • Only finds local minima • No method available can find the global minimum from any starting point
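The local-minimum problem shows up even in one dimension. The sketch below uses a hypothetical double-well energy and a plain fixed-step downhill walk (a stand-in for any of the minimizers above); the two starting points end in different wells, and only one of them is the global minimum.

```python
def energy(x):
    # Hypothetical double-well energy: two minima of different depth.
    return x ** 4 - 2.0 * x ** 2 + 0.5 * x


def gradient(x):
    return 4.0 * x ** 3 - 4.0 * x + 0.5


def descend(x, step=0.01, n_steps=5000):
    """Follow the negative gradient with a fixed step size."""
    for _ in range(n_steps):
        x -= step * gradient(x)
    return x


left = descend(-2.0)
right = descend(2.0)
print("start -2.0 ->", left, energy(left))
print("start +2.0 ->", right, energy(right))
```

The run started on the right stops in the shallower well because every downhill method only ever moves toward lower energy and cannot cross the barrier between the wells.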

Performing Energy Minimization • Links • Dundee PRODRG2 Server (http://davapc1.bioch.dundee.ac.uk/prodrg/) • Swiss-PDBViewer (http://ca.expasy.org/spdbv/text/energy.htm) • GROMACS (http://www.gromacs.org/) • NAMD (http://www.ks.uiuc.edu/Research/namd/) • AMBER (http://ambermd.org/) • CHARMM (http://www.charmm.org/) • Methods of EM in GROMACS • Steepest descents • Conjugate gradients • L-BFGS (limited-memory Broyden-Fletcher-Goldfarb-Shanno quasi-Newtonian minimizer)

Discuss Homework 2
