Molecular Modeling: Geometry Optimization

C372 Introduction to Cheminformatics II
Kelsey Forsythe

Why Extrema?
• Equilibrium structure/conformer: MOST likely to be observed
• Once the geometrically optimum structure is found, energies, frequencies, etc. can be calculated and compared with experiment
• Used in other simulations (e.g. dynamics calculations)
• Used in reaction rate calculations (e.g. $1/\nu_{\text{saddle}} \propto$ reaction time)
• Characteristics of the transition state
• PES interpolation (Collins et al.)

Nomenclature
• PES (potential energy surface) is equivalent to the Born-Oppenheimer surface
• A point on the surface corresponds to a set of nuclear positions
• Minimum and maximum
  • Local
  • Global
• Saddle point (a minimum along some directions and a maximum along others)

Local vs. Global?
• Conformational analysis (equilibrium conformer): a global geometry optimization that yields multiple structurally stable conformational geometries (i.e. equilibrium geometries)
• Equilibrium geometry: may come from a local geometry optimization, which finds the minimum closest to a given starting structure (conformer), or may be an equilibrium conformer
• BOTH are geometry optimizations (i.e. finding where the potential gradient is zero)
• $E_{\text{local}} \ge E_{\text{global}}$

Terminology

[Figure: cyclohexane potential energy curve with the global maxima, local maxima, local minima, and global minimum labeled]

Geometry Optimization
• Basic scheme
  • Find the first derivative (gradient) of the potential energy
  • Set it equal to zero
  • Find the value(s) of the coordinate(s) that satisfy the equation
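As a worked illustration of the basic scheme, take the two-coordinate function $x^2 + xy + y^2$ that appears in the examples later in these slides. Treating it as an energy $E(x,y)$ is an assumption made here for illustration only.

```latex
\begin{align}
E(x,y) &= x^2 + xy + y^2 \\
\frac{\partial E}{\partial x} &= 2x + y = 0, \qquad
\frac{\partial E}{\partial y} = x + 2y = 0 \\
\Rightarrow\; (x, y) &= (0, 0), \qquad E(0,0) = 0
\end{align}
```

The second derivatives ($\partial^2 E/\partial x^2 = \partial^2 E/\partial y^2 = 2$, $\partial^2 E/\partial x\,\partial y = 1$) give Hessian eigenvalues 1 and 3, both positive, so this stationary point is a minimum.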

Methods (1-d)
• No gradients (no functional form for E)
  • Bracketing
  • Golden section (for $a>b>c$, the optimal fractional bracket distance $(a-b)/(a-c)$ is set by the golden ratio: ≈ 0.382, or ≈ 0.618 measured from the other end)
  • Parabolic interpolation (Brent's method)
• Gradients
  • Steepest descent
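A minimal sketch of a golden-section search, assuming a function with a single minimum inside the starting bracket. The names `golden_section_minimize`, `lo`, `hi`, and `tol` are illustrative and do not come from the slides.

```python
import math

def golden_section_minimize(f, lo, hi, tol=1e-6):
    """Shrink the bracket [lo, hi] around the minimum of a unimodal 1-d function."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0   # ~0.618, the golden-section fraction
    x1 = hi - inv_phi * (hi - lo)            # interior point at ~0.382 of the bracket
    x2 = lo + inv_phi * (hi - lo)            # interior point at ~0.618 of the bracket
    f1, f2 = f(x1), f(x2)
    while abs(hi - lo) > tol:
        if f1 < f2:
            # Minimum lies in [lo, x2]; the old x1 becomes the new x2.
            hi, x2, f2 = x2, x1, f1
            x1 = hi - inv_phi * (hi - lo)
            f1 = f(x1)
        else:
            # Minimum lies in [x1, hi]; the old x2 becomes the new x1.
            lo, x1, f1 = x1, x2, f2
            x2 = lo + inv_phi * (hi - lo)
            f2 = f(x2)
    return 0.5 * (lo + hi)

# Usage: minimum of (x - 2)^2 bracketed by [0, 5]
x_min = golden_section_minimize(lambda x: (x - 2.0) ** 2, 0.0, 5.0)
```

Only one new function evaluation is needed per iteration because the golden-section spacing lets one interior point be reused.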

Methods (n-d) W/O Gradients (Zeroth Order)
• NO GRADIENTS = ZEROTH ORDER
• Line search
• Simplex/downhill simplex (useful for rough surfaces)
• Powell's direction-set method (faster than simplex)

Methods (n-d) W/ Gradients (First Order)
• Steepest descent
• Conjugate gradient (storage $\propto N$)
  • Fletcher-Reeves
  • Polak-Ribiere
• Quasi-Newton/variable metric (storage $\propto N^2$)
  • Davidon-Fletcher-Powell
  • Broyden-Fletcher-Goldfarb-Shanno

Line Search

Steepest Descent

Line Search (1-d)
• Steepest descent (gradient descent method)
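A minimal sketch of steepest descent on the function $x^2 + xy + y^2$ used in the examples later in these slides. For brevity it takes a fixed step along the negative gradient instead of a full line search; the step size, tolerance, and function names are assumptions made for illustration.

```python
def gradient(x, y):
    """Analytic gradient of f(x, y) = x^2 + x*y + y^2."""
    return 2.0 * x + y, x + 2.0 * y

def steepest_descent(x, y, step=0.1, gtol=1e-8, max_iter=10000):
    """Repeatedly step along the negative gradient until the gradient norm is small."""
    for iteration in range(max_iter):
        gx, gy = gradient(x, y)
        if (gx * gx + gy * gy) ** 0.5 < gtol:
            return x, y, iteration
        x -= step * gx
        y -= step * gy
    return x, y, max_iter

# Usage: start from an arbitrary point; the result approaches the minimum at (0, 0)
x_min, y_min, n_iter = steepest_descent(3.0, -2.0)
```

Replacing the fixed step with a 1-d minimization along the gradient direction would give the line-search variant named on the slide.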

Global Multidimensional Methods
• Stochastic tunneling
• Molecular dynamics
• Monte Carlo
• Simulated annealing
• Genetic algorithm

Second Order Methods: Newton's Method
• Advantages
  • Iterative (fast)
  • Better energy estimate
• Disadvantages
  • Computational cost scales as $N^3$
  • Involves calculating the Hessian of the energy
  • Assigning weights to configuration/coordinates

Modeling Potential Energy (1-d)
• First order
• (N-1)th order

Modeling Potential Energy (>1-d)
• Hessian
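A sketch of the standard Taylor expansion that these slide titles refer to, truncated at second order about a reference geometry $r_0$. The symbols $\mathbf{g}$ (gradient) and $\mathbf{H}$ (Hessian) are assumed notation, not taken from the slides.

```latex
\begin{align}
E(r) &\approx E(r_0) + \left.\frac{dE}{dr}\right|_{r_0}(r-r_0)
      + \frac{1}{2}\left.\frac{d^2E}{dr^2}\right|_{r_0}(r-r_0)^2 \\
E(\mathbf{r}) &\approx E(\mathbf{r}_0) + \mathbf{g}^{T}(\mathbf{r}-\mathbf{r}_0)
      + \frac{1}{2}(\mathbf{r}-\mathbf{r}_0)^{T}\mathbf{H}\,(\mathbf{r}-\mathbf{r}_0) \\
g_i &= \left.\frac{\partial E}{\partial r_i}\right|_{\mathbf{r}_0}, \qquad
H_{ij} = \left.\frac{\partial^2 E}{\partial r_i\,\partial r_j}\right|_{\mathbf{r}_0}
\end{align}
```

The first line is the 1-d case; the second is the multidimensional case, in which the second-derivative term becomes the Hessian matrix.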

Newton’s Method

Newton's Method
• Equivalent to rotating the Hessian (coordinate transformation, r → r') such that the Hessian is diagonal
• Gradient projection along the ith eigenvector
• Eigenvalues from Hessian rotation/diagonalization
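A sketch of the Newton step written in the Hessian eigenbasis, which is one way to read the bullets above. The symbols $\mathbf{g}$ (gradient), $\mathbf{H}$ (Hessian), $\mathbf{v}_i$ and $\varepsilon_i$ (Hessian eigenvectors and eigenvalues), and $f_i$ (gradient projection) are assumed notation.

```latex
\begin{align}
\Delta\mathbf{r} &= -\mathbf{H}^{-1}\mathbf{g} \\
\mathbf{H}\mathbf{v}_i &= \varepsilon_i \mathbf{v}_i, \qquad f_i = \mathbf{v}_i^{T}\mathbf{g} \\
\Delta r'_i &= -\frac{f_i}{\varepsilon_i}, \qquad
\Delta\mathbf{r} = \sum_i \Delta r'_i\,\mathbf{v}_i
\end{align}
```

In the rotated coordinates each component of the step depends only on its own gradient projection and eigenvalue, so the n-dimensional problem separates into n one-dimensional Newton steps.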

Second Order Methods
• Advantages
  • Only one iteration for quadratic functions!
  • Efficient (relative to first-order methods)
  • Quadratic convergence: $g_N/g_{N-1} = (g_{N-1}/g_{N-2})^2$, with $g_k$ the gradient at iteration $k$ (i.e. successive reductions in the gradient by factors of 10, 100, 10000)
  • Better energy estimate
• Disadvantages
  • $N^2$ storage requirements (compared to $N$ for conjugate gradient)
  • Computational cost scales as $N^3$
  • Involves calculating the Hessian (~10 times the time of a gradient calculation)
• Approximate Hessian (pseudo-Newton methods)
  • Davidon-Fletcher-Powell
  • Broyden-Fletcher-Goldfarb-Shanno
  • Powell
• Often used in transition-structure searches (saddle point locator)
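A minimal sketch of a single Newton step on the quadratic function $x^2 + xy + y^2$ from the examples later in these slides, illustrating the "only one iteration for quadratic functions" point. The 2×2 linear solve is written out by hand to stay dependency-free; the function names are illustrative.

```python
def gradient(x, y):
    """Gradient of f(x, y) = x^2 + x*y + y^2."""
    return 2.0 * x + y, x + 2.0 * y

def hessian(x, y):
    """Hessian of f; constant because f is quadratic."""
    return (2.0, 1.0), (1.0, 2.0)

def newton_step(x, y):
    """Return the point reached after one Newton step, r - H^{-1} g."""
    gx, gy = gradient(x, y)
    (hxx, hxy), (hyx, hyy) = hessian(x, y)
    det = hxx * hyy - hxy * hyx
    # Apply the explicit inverse of the 2x2 Hessian to the gradient.
    dx = -(hyy * gx - hxy * gy) / det
    dy = -(hxx * gy - hyx * gx) / det
    return x + dx, y + dy

# Usage: one step from an arbitrary start lands on the minimum at (0, 0)
x_new, y_new = newton_step(3.0, -2.0)
```

For a non-quadratic energy the same step would be repeated, recomputing the gradient and Hessian at each new point.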

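Second Order Methods: Levenberg-Marquardt
• Far from the minimum the Taylor expansion is poor, so the Newton step fails: $r \ne r_0 - b/A$ (with $b$ the gradient and $A$ the Hessian); use $r = r_0 - \beta b$ instead
• Find $\beta$ such that the move is in the direction of the minimum
• Given $r_0$ and $E(r_0)$, pick an initial value of $\lambda$
• Find $A' = (1+\lambda)A$
• Find $x$ such that $A'x = -b$
• Calculate $E(r_0 + x)$ and adjust $\lambda$ accordingly to reach the minimum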
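A minimal 1-d sketch of the damping loop described above. The test energy $\sqrt{1+r^2}$, the factor-of-10 updates to $\lambda$, and all function names are assumptions chosen for illustration; this energy is convenient because an undamped Newton step from $r_0 = 2$ overshoots to $r = -8$.

```python
import math

def energy(r):
    return math.sqrt(1.0 + r * r)

def gradient(r):                      # b in the slide's notation
    return r / math.sqrt(1.0 + r * r)

def hessian(r):                       # A in the slide's notation
    return (1.0 + r * r) ** -1.5

def levenberg_marquardt(r, lam=1.0, gtol=1e-6, max_iter=200):
    """Damped Newton iteration: solve (1 + lam) * A * x = -b, then adapt lam."""
    e = energy(r)
    for _ in range(max_iter):
        b = gradient(r)
        if abs(b) < gtol:
            break
        x = -b / ((1.0 + lam) * hessian(r))
        e_new = energy(r + x)
        if e_new < e:
            r, e = r + x, e_new       # step lowered the energy: accept it
            lam /= 10.0               # and trust the Newton step more
        else:
            lam *= 10.0               # step raised the energy: reject, damp harder
    return r

# Usage: converges toward the minimum at r = 0 even though plain Newton diverges
r_min = levenberg_marquardt(2.0)
```

Large $\lambda$ gives short, cautious steps; small $\lambda$ recovers the Newton step near the minimum. In more than one dimension the usual formulation scales only the diagonal of $A$ by $(1+\lambda)$.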

Simplex Methods
• Minimization bounds → polygon of N+1 vertices
• Solution is a vertex of the N+1-d polygon
• Procedure (downhill simplex method)
  • Begin with a simplex built from the input coordinate values
  • Find the lowest point on the simplex
  • Find the highest point on the simplex
  • Reflect the highest point through the opposite face ($x_1 = -x_o$, measured from the centroid of that face)
  • If $E(x_1) < E(x_o)$ then expand ($x = x + \lambda$)
  • Else
    • Try an intermediate point
    • If $E(x_{new}) < E(x_o)$, expand
    • If $E(x_{new}) > E(x_o)$, contract
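The procedure above can be sketched as a compact downhill simplex (Nelder-Mead style) loop. The reflection, expansion, and contraction coefficients (1, 2, 0.5), the stopping test, and the function name are common choices assumed here, not values given in the slides.

```python
def downhill_simplex(f, simplex, tol=1e-12, max_iter=2000):
    """Minimize f starting from a simplex given as N+1 points of N coordinates each."""
    n = len(simplex) - 1
    pts = [list(p) for p in simplex]
    for _ in range(max_iter):
        pts.sort(key=f)                              # pts[0] lowest, pts[-1] highest
        if abs(f(pts[-1]) - f(pts[0])) < tol:
            break
        worst = pts[-1]
        centroid = [sum(p[i] for p in pts[:-1]) / n for i in range(n)]

        def along(t):
            # Point on the line through the worst vertex and the centroid.
            return [c + t * (c - w) for c, w in zip(centroid, worst)]

        reflected = along(1.0)
        if f(reflected) < f(pts[0]):
            expanded = along(2.0)                    # reflection was good: try going further
            pts[-1] = expanded if f(expanded) < f(reflected) else reflected
        elif f(reflected) < f(pts[-2]):
            pts[-1] = reflected
        else:
            contracted = along(-0.5)                 # halfway between worst and centroid
            if f(contracted) < f(worst):
                pts[-1] = contracted
            else:
                best = pts[0]                        # shrink everything toward the best vertex
                pts = [best] + [[(b + q) / 2.0 for b, q in zip(best, p)] for p in pts[1:]]
    return min(pts, key=f)

# Usage: three vertices for a two-coordinate problem
f = lambda p: p[0] ** 2 + p[0] * p[1] + p[1] ** 2
best = downhill_simplex(f, [[3.0, -2.0], [4.0, -2.0], [3.0, -1.0]])
```

Only function values are compared, which is why no gradients are needed.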

Simplex (Simplices)

[Figure: downhill simplex moves, after Numerical Recipes: initial vertices, reflection, reflection and expansion, contraction]

Simplex Methods
• Advantages
  • Gradients not required
• Disadvantages
  • Time to minimize is long

Example
• Find the minimum of $f(x,y) = x^2 + y^2$
• Line search #1: $x_n = x_{n-1} - 0.1\,e_x$

Example
• Find the minimum of $f(x,y) = x^2 + y^2$
• Line search #2: $y_n = y_{n-1} - 0.1\,e_y$

Example
• Find the minimum of $f(x,y) = x^2 + xy + y^2$
• Line search #1: $x_n = x_{n-1} - 0.1\,e_x$

Example
• Find the minimum of $f(x,y) = x^2 + xy + y^2$
• Line search #2: $y_n = y_{n-1} - 0.1\,e_y$

Example (Spoiling)
• Find the minimum of $f(x,y) = x^2 + xy + y^2$
• Line search #3: $x_n = x_{n-1} - 0.1\,e_x$
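A minimal sketch of the alternating line searches in the examples above, using fixed steps of 0.1 along one axis at a time. The starting point (2, 2), the stopping rule (stop when the next step no longer lowers f), and the function names are assumptions made for illustration.

```python
def f(x, y):
    return x * x + x * y + y * y

def line_search(x, y, axis, step=0.1):
    """Walk along one coordinate axis in fixed steps until f stops decreasing."""
    dx, dy = (step, 0.0) if axis == "x" else (0.0, step)
    if f(x - dx, y - dy) < f(x + dx, y + dy):
        dx, dy = -dx, -dy                 # choose the downhill direction
    while f(x + dx, y + dy) < f(x, y):
        x, y = x + dx, y + dy
    return x, y

# Usage: three successive searches starting from (2, 2)
x1, y1 = line_search(2.0, 2.0, "x")   # search #1 along x
x2, y2 = line_search(x1, y1, "y")     # search #2 along y
x3, y3 = line_search(x2, y2, "x")     # search #3 along x again
```

Because of the $xy$ cross term, moving along y in search #2 shifts the position of the minimum along x, so search #3 has to move x again. That is the "spoiling" referred to in the last example; for $x^2 + y^2$ the two coordinates are independent and one search per axis is enough.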

Global-Simulated Annealing
• Crystal cooling/heating
• Applications
  • Macromolecules (conformer searches)
  • Traveling salesman problem
  • Electronic circuits

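Global-Simulated Annealing
• Uphill moves allowed!!
• Given configuration $X_i$ and $E(X_i)$
• Step in direction $\Delta X$
• If $E(X_i + \Delta X) < E(X_i)$: move accepted
• If $E(X_i + \Delta X) > E(X_i)$, then
  • Choose a random number $Y$ with $0 < Y < 1$
  • If $\exp\{-[E(X_i + \Delta X) - E(X_i)]/k_B T\} > Y$: move accepted (Metropolis et al.)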

Global-Simulated Annealing
• Uphill moves allowed!!
• Implementation
  • Must define the T sequence (the temperature schedule)
  • Must choose the distribution of random numbers
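A minimal sketch of simulated annealing with the Metropolis acceptance test. The 1-d double-well energy, the geometric temperature sequence, the uniform step distribution, and every parameter value are assumptions chosen for illustration; temperature is in energy units, so $k_B = 1$.

```python
import math
import random

def energy(x):
    """Toy double well: local minimum near x = 1.35, global minimum near x = -1.47."""
    return x ** 4 - 4.0 * x ** 2 + x

def simulated_annealing(x, t_start=5.0, t_end=0.01, n_steps=5000, max_step=0.5, seed=0):
    rng = random.Random(seed)
    cooling = (t_end / t_start) ** (1.0 / n_steps)   # geometric T sequence
    e = energy(x)
    best_x, best_e = x, e
    t = t_start
    for _ in range(n_steps):
        x_new = x + rng.uniform(-max_step, max_step)  # uniform trial displacement
        e_new = energy(x_new)
        # Downhill moves are always accepted; uphill moves pass the Metropolis test.
        if e_new < e or math.exp(-(e_new - e) / t) > rng.random():
            x, e = x_new, e_new
            if e < best_e:
                best_x, best_e = x, e
        t *= cooling
    return best_x, best_e

# Usage: start in the higher (local) well and let the walker cross the barrier
best_x, best_e = simulated_annealing(1.35)
```

A purely downhill method started at x = 1.35 would stay in the local well; the accepted uphill moves at high temperature are what allow the search to reach the deeper one.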

Global-Monte Carlo Algorithms
• von Neumann, Ulam and Metropolis (1940s)
  • Fissionable material modeling
• Buffon (1700s)
  • Needle drop: approximate π

Global-Monte Carlo Algorithms
• Approximating π
• Approximating areas/integrals with random selection of points
[Figure: unit square with axes from 0 to 1 and points labeled A, B, C, D]
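A minimal sketch of approximating π by random selection of points: the fraction of points in the unit square that fall inside the quarter circle of radius 1 approximates π/4. The function name and the seeding are illustrative assumptions.

```python
import random

def estimate_pi(n_points, seed=0):
    """Estimate pi from the fraction of random points inside the quarter unit circle."""
    rng = random.Random(seed)
    inside = 0
    for _ in range(n_points):
        x, y = rng.random(), rng.random()   # uniform point in the unit square
        if x * x + y * y <= 1.0:
            inside += 1
    return 4.0 * inside / n_points

# Usage: more points give a better estimate, with error shrinking like 1/sqrt(n_points)
pi_estimate = estimate_pi(100000)
```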

Global-Monte Carlo Algorithms
• Sample mean integration
• Consider any uniform density/distribution of points, $\rho$
• Choose M points at random

Global-Monte Carlo Algorithms
• Consider any uniform density/distribution of points, $\rho$
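A minimal sketch of sample mean integration with a uniform distribution of points: the integral of f over [a, b] is estimated as (b − a) times the mean of f at M randomly chosen points. The function name and the seeding are illustrative assumptions.

```python
import random

def sample_mean_integral(f, a, b, m, seed=0):
    """Estimate the integral of f over [a, b] from M uniformly distributed points."""
    rng = random.Random(seed)
    total = sum(f(rng.uniform(a, b)) for _ in range(m))
    return (b - a) * total / m

# Usage: the integral of x^2 from 0 to 1 is exactly 1/3
estimate = sample_mean_integral(lambda x: x * x, 0.0, 1.0, 100000)
```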

Global-Monte Carlo Algorithms
• Metropolis et al.
  • Introduced non-uniform density
• Error $\propto 1/N^{1/2}$ (N = number of samplings)

Global-Genetic Algorithms
• “Population” of conformations/structures
• Each “parent” conformer is comprised of “genes”
• “Offspring” are generated from mixtures of “genes”
• “Mutations” allowed
• Most fit “offspring” kept for the next “generation”
• “Fitness” = low energy
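A minimal sketch of the genetic algorithm outlined above. Each "gene" is taken to be a torsion angle and the toy energy $\sum_i (1 - \cos\theta_i)$ stands in for a real conformational energy; the population size, mutation width, and function names are all assumptions made for illustration.

```python
import math
import random

def energy(genes):
    """Toy conformational energy; lowest (zero) when every torsion angle is 0."""
    return sum(1.0 - math.cos(theta) for theta in genes)

def genetic_search(n_genes=4, pop_size=30, n_keep=10, generations=100, sigma=0.2, seed=0):
    rng = random.Random(seed)
    pop = [[rng.uniform(-math.pi, math.pi) for _ in range(n_genes)]
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=energy)
        parents = pop[:n_keep]                    # fittest = lowest energy survive
        children = []
        while len(parents) + len(children) < pop_size:
            mom, dad = rng.sample(parents, 2)
            # Offspring take each gene from one parent or the other...
            child = [rng.choice(pair) for pair in zip(mom, dad)]
            # ...followed by a small random mutation of every gene.
            child = [g + rng.gauss(0.0, sigma) for g in child]
            children.append(child)
        pop = parents + children
    best = min(pop, key=energy)
    return best, energy(best)

# Usage: the best conformer found and its energy
best_genes, best_energy = genetic_search()
```

Keeping the parents unchanged in each generation means the best energy found can never get worse from one generation to the next.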

Global-Rugged
• Multi-resolution
• Graduated non-convex smoothing

Others
• Fragment approach
  • Fix/constrain part of the structure while optimizing the rest
• Rule-based
  • Proteins
    • Fix the tertiary structure according to the statistical likelihood of the amino acid sequence adopting such a structure
• Homology modeling
  • Use the geometry of similar molecules as the starting point for the aforementioned methods

Geometry Optimization (Summary)
• Optimum structure gives useful information
• First derivative is zero at a minimum/maximum
• Use the second derivative to establish minimum/maximum
• As N increases so does dimensionality/complexity/beauty/difficulty

Geometry Optimization (Summary)
• Method used depends on
  • System size
    • 1-d (line search, bracketing, steepest descent)
    • N-d local (downhill)
      • W/o derivatives
        • Simplex
        • Direction set methods (Powell's)
      • W/ derivatives
        • Conjugate gradient
        • Newton or variable metric methods
    • N-d global
      • Monte Carlo
      • Simulated annealing
      • Genetic algorithms
  • Form of energy
    • Analytic
    • Not analytic

References
• Allen, M. P. and Tildesley, D. J., Computer Simulation of Liquids
• Press, W. H. et al., Numerical Recipes: The Art of Scientific Computing

Next Time
