550 likes | 562 Views
Explore projective integration, rare events, reverse integration, and free energy surface reconstruction in complex systems. Apply to micelle formation and alanine dipeptide dynamics. Methods include MD simulations and Fokker-Planck equation.
E N D
IMA, Minneapolis, January 2008 Computational Experiments in coarse graining atomistic simulations (Equation-free-& Variable Free- Framework For Complex/Multiscale Systems) I.G. Kevrekidis -C. W. Gear, G. Hummer, R. Coifman and several other good people- Department of Chemical Engineering, PACM & Mathematics Princeton University, Princeton, NJ 08544
SIAM– July, 2004 Clustering and stirring in a plankton model Young, Roberts and Stuhne, Nature 2001
Simulation Method • Random (equal) birth and death, probability: l = m. • Brownian motion. • Advective stirring. (j, qare random phases) • IC: 20000 particles randomly placed in 1*1 box • Analytical Equation for G(r):
RESTRICTION - a many-one mapping from a high-dimensional description (such as a collection of particles in Monte Carlo simulations) to a low-dimensional description - such as a finite element approximation to a distribution of the particles. LIFTING - a one-many mapping from low- to high-dimensional descriptions. We do the step-by-step simulation in the high-dimensional description. We do the macroscopic tasks in the low-dimensional description.
So, the main points: • You have a “microscopic code” • Somebody tells you what are good coarse variable(s) • Somebody tells you what KIND of equation this variable satisfies (deterministic, stochastic…) but NOT what the equation looks like. • Then you can use the IDEA that such an equation exists and closes to accelerate the simulation/ extraction of information. • KEY POINT – make micro states consistent with macro states
Application to Micelle Formation • Hydrophobic tail (T) • Hydrophilic head (H)
Lattice Model (Larson et al., 1985) • Surfactant = chain of H and T beads (H4T4) • No explicit solvent • Hydrophobic-hydrophilic interactions • modeled as direct interaction between • H and T beads • Pairwise interactions with nearest sites:
Snapshot of Micellar System T = 7.0, µ = - 47.40
Kinetic Approach to Rare Events(Hummer and Kevrekidis, JCP 118, 10762 (2003)) • Evolution of coarse variables is slow • micelle birth/death are rare events • Reconstruction of free energy surface: • long equilibrium simulation • series of short nonequilibrium simulations
Reconstruction of Free Energy Surface Obtain G() and () from short-scale nonequilibrium simulations
Results: Micelle Destruction Rate Kramers’ formula • Result of nonequilibrium simulation: k = 5.58 x 10-9 • Equilibrium result: k = 7.70 x 10-9 • CPU time required: less than 7% of equilibrium simulation • Extension to multi-dimensional systems (Hummer and Kevrekidis, 2003) • Chapman-Kolmogorov equation
Reverse Projective Integration – a sequence of outer integration steps backward; based on forward steps + estimation 1 2 3 Reverse Integration: a little forward, and then a lot backward ! We are studying the accuracy and stability of these methods
Details of Multidimensional Dynamics Small clusters: 2d dynamics Larger clusters: 1d dynamics
w/ Gerhard Hummer, NIDDK / J.Chem.Phys. 03 Alanine Dipeptide In 700 tip3p waters The waters The dipeptide and the Ramachandran plot
Start with constrained MD • Let 50 configurations free • Estimate d/dt of average • Perform projective step
Alanine Dipeptide Energy Landscape G.Hummer, I.G.Kevrekidis. Coarse molecular dynamics of a peptide fragment: Free energy, kinetics, and long-time dynamics computations J.Chem. Phys. 118 (2003).
Protocols for Coarse MD (CMD) using Reverse Ring Integration MD Simulations run FORWARD in Time Initialize ring nodes Step Ring BACKWARD in Energy
Fokker-Planck Equation (2D)for distribution P(x1,x2) Probability Currents 2D FPE: Diffusion Coefficients Drift Coefficient
Drift and Diffusion Coefficients MD Simulations run FORWARD in Time Ring node ICs multiple replicas per node compute drift (v) and diffusion (D) coefficients P Dihedral angles use (v, D) estimates to check for existence of potential () and compute potential values at each node
Ring Stepping in Generalized Potential Ring at + Ring at Goal: Compute potential associated with ring nodes (ring “height” above x1-x2 plane)
( D = scalar) Case I: Diffusion matrix is proportional to unit matrix: Probability Current: Probability Current vanishes Potential Conditions I Drift Coefficient Potential Conditions Conditions for existence of generalized potential : Path Path
Short bursts of MD simulation initialized at each node in the ring Data analysis of forward in time MD provides gradient information small step FORWARD in time large step BACKWARD in energy CMD Exploration of Alanine Dipeptideusing drift coefficients Ring nodes Reverse ring integration stagnates at saddle points on coarse free energy landscape ONLY drifts estimated here ring step size is scaled by unknown diffusivity D nodal “heights” unknown stepsize ~
UNFOLDED TRANSITION STATE FOLDED Fraction non-native contacts formed Fraction native contacts formed Src homology 3 (SH3) domain Protein modeled using off-lattice C representation associated with simplified minimally frustrated Hamiltonian Small Modular Domain 55-75 amino acids long blue (N-terminus) to red (C-terminus) n-Src Diverging Turn C-terminus Distal Loop Protein folding intrinsically low-dimensional “Collective” (coarse, slow) coordinates fully describe long time system dynamics N-terminus Characteristic fold consisting of five or six β-strands arranged as two tightly packed anti-parallel β sheets P. Das, S. Matysiak, and C. Clementi, Proc. Natl. Acad. Sci. USA 102, 10141 (2005).
Reverse Ring Integration and MDCoarse MD (CMD) Transition state identification using CMD
So, again, the main points • Somebody needs to tell you what the coarse variables are • And what TYPE of equation they satisfy • And then you can use this information to bias the simulations “intelligently” accelerating the extraction of information (also need hypothesis testing).
and now for something completely different: Little stars ! (well…. think fishes)
Fish Schooling Models Initial State Position, Direction, Speed INFORMED UNINFORMED Compute Desired Direction Zone of Attraction Rij< Zone of Deflection Rij< Normalize Update Direction for Informed Individuals ONLY Couzin, Krause, Franks & Levin (2005) Nature (433) 513 Update Positions
INFORMED DIRN STICK STATES STUCK ~ typically around rxn coordinate value of about 0.5 INFORMED individual close to front of group away from centroid
INFORMED DIRN SLIP STATES SLIP ~ wider range of rxn coordinate values for slip 00.35 INFORMED individual close to group centroid
Effective Fokker-Planck Equation FPE: Diffusion Coefficient Drift Coefficient Potential:
Coarse Free Energy Calculation Estimate Drift and Diffusion coefficients numerically from simulation “bursts” X0 X0 X0 Improved estimates using Maximum Likelihood Estimation (MLE) Y. Aït-Sahalia. Maximum Likelihood Estimation of Discretely Sampled Diffusions: A Close-Form Approximation Approach. Econometrica 70 (2002). Kopelevich, Panagiotopoulos & Kevrekidis J Chem Phys 122 (2005) Hummer & Kevrekidis J Chem Phys 118 (2003)
Energy Landscape – Fish Swarming Problem CENTROID Informed DIRN SLIP SLIP STUCK
Using the computer to select variable Rationale: Straight Line Distance between locations NOT representative of actual transition difficulty\distance Lake Carnegie, Princeton, NJ
Lake Carnegie, Princeton, NJ Using the computer to select variable Rationale: Straight Line Distance Curved Transition Distance Actual transition difficulty represented by curved path
Using the computer to select good variables Rationale: Straight Line DistanceIS representative of actual transition difficulty\distance in small LOCAL patches Patch size related to problem “geography” Lake Carnegie, Princeton, NJ
Z X Y Using the computer to select variable Rationale: 3D Dataset with 2D manifold Selected Datapoint Euclidean Distance Euclidean distance in input space may be weak indicator of INTRINSIC similarity of datapoints Geodesic distance is good for this dataset
Dataset as Weighted Graph edges 3 W23 weights 2 W12 1 vertices parameter t
Unequal separation (Euclidean distance) between IC ( ) and limits of random walk ( , ) Multiple random walks through simulation data initialized at