Coordinates and Pathways in MM and QM/MM modeling. Haiyan Liu School of Life Sciences, University of Science and Technology of China.
In MM and QM/MM modeling of biomolecules, we often aim to understand the mechanisms of processes, many of which are too slow to be investigated by direct simulation.
Examples
• To study protein functions: Possible chemical/conformational (sub)states? Mechanisms of transitions between them?
• To study protein/peptide folding: Any preferred “pathways” or “order of events”? Roles of topologies and sequences?
Two (?) basic causes for macroscopic slowness • Need to overcome major enthalpic barriers (e.g., chemical reactions…) • Need to “zoom” into a very limited region in the conformational space (e.g., protein folding, binding…)
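To illustrate the first cause with numbers, here is a minimal sketch assuming a simple Arrhenius picture, in which the mean waiting time in a state grows as exp(ΔE / k_B T). The function name `relative_waiting_time` and the barrier values are illustrative assumptions, not taken from the talk.

```python
import math

K_B = 0.0083144626  # Boltzmann constant in kJ/(mol K)


def relative_waiting_time(barrier_kj_mol, temperature_k):
    """Arrhenius factor exp(dE / kT): waiting time in units of the attempt period.

    Assumption: a single barrier and a temperature-independent prefactor.
    """
    return math.exp(barrier_kj_mol / (K_B * temperature_k))


if __name__ == "__main__":
    for barrier in (20.0, 40.0, 60.0):  # hypothetical barrier heights
        print(barrier, relative_waiting_time(barrier, 300.0))
```

Under this assumption each additional few k_B T of barrier height multiplies the waiting time by a large factor, while the transition itself stays short. That gap is why direct simulation becomes impractical.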
Among major obstacles in simulations
• Sampling (in)efficiency
[Figure: system state (A or B) versus time, with the waiting time and the transition time marked]
Two basic types of approaches
A. Connecting known terminal states
A1 “Forced” barrier crossing
• Umbrella sampling, targeted or steered MD
• Drawback: projects a many-dimensional system onto a few pre-assumed reaction coordinates
• A projected representation of the many-dimensional problem
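As an illustration of “forced” barrier crossing, the following is a minimal sketch of one umbrella-sampling window on a one-dimensional model double well. The model potential, the spring constant, and the Metropolis sampler are all assumptions chosen for brevity. A real application would bias a reaction coordinate of a molecular system inside an MD code.

```python
import numpy as np


def double_well(x, height=5.0):
    """Model potential in units of kT: minima at x = -1 and +1, barrier at x = 0."""
    return height * (x**2 - 1.0) ** 2


def umbrella_bias(x, center, k_spring):
    """Harmonic restraint that holds the reaction coordinate near `center`."""
    return 0.5 * k_spring * (x - center) ** 2


def metropolis_window(center, k_spring, n_steps=20000, step=0.2, seed=0):
    """Sample one umbrella window with Metropolis Monte Carlo (energies in kT)."""
    rng = np.random.default_rng(seed)
    x = center
    energy = double_well(x) + umbrella_bias(x, center, k_spring)
    samples = np.empty(n_steps)
    for i in range(n_steps):
        trial = x + rng.uniform(-step, step)
        e_trial = double_well(trial) + umbrella_bias(trial, center, k_spring)
        # Accept downhill moves always, uphill moves with Boltzmann probability.
        if e_trial <= energy or rng.random() < np.exp(energy - e_trial):
            x, energy = trial, e_trial
        samples[i] = x
    return samples


if __name__ == "__main__":
    window = metropolis_window(center=0.0, k_spring=50.0)
    print(window.mean(), window.std())
```

With the restraint centered on the barrier top, the window samples a region that an unbiased run would rarely visit. The sketch sidesteps the drawback noted on the slide only because the model has a single coordinate.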
Problems associated with improper projection
• Restrained optimization: discontinuous environment
• Potential of mean force along Rc: sampling minima but not transition states
[Figure axes: reaction coordinates (Rc) and environmental degrees of freedom]
A2 Chain-of-states or path optimization methods
• Discrete representation of pathways (a pathway is represented by a chain of replicas)
• “Enforced” continuity of the pathway
• A parametric representation of the many-dimensional problem
B. Introducing more frequent transitions between states
• Accelerate minimum-escaping (elevated-temperature simulations, conformational flooding or local elevation, parallel replica simulations, potential energy function deformation)
• The key is to avoid over-expanding the accessible conformational space.
Accelerated sampling approaches
• Potential energy-based vs. kinetic energy-based
• Equilibrium vs. non-equilibrium sampling
• Degree of freedom (DOF)-specific vs. DOF-nonspecific
• Delocalized (collective) DOFs or local DOFs
Coordinates (or order parameters) are essential, provided that we have a good enough energy model…
• “Forced” transitions and free energy surfaces: which coordinates to project onto?
• Chain-of-states method: enforcing continuity on which coordinates?
• Accelerated sampling: which coordinates to apply the bias to?
Examples
• Local elevation: potential energy-based, non-equilibrium, DOF-specific, local DOFs
• Conformational flooding: potential energy-based, non-equilibrium, DOF-specific, delocalized DOFs
• Temperature REMD: kinetic energy-based, equilibrium, DOF-nonspecific
• Amplified collective motion (ACM) model: kinetic energy-based, non-equilibrium, DOF-specific, delocalized DOFs
…
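Temperature REMD is the one equilibrium method in the list above. A minimal sketch of the standard Metropolis criterion for swapping two replicas shows why. The function name `remd_swap_probability` and the energy units (kJ/mol) are assumptions for illustration.

```python
import math

K_B = 0.0083144626  # Boltzmann constant in kJ/(mol K)


def remd_swap_probability(energy_i, temp_i, energy_j, temp_j):
    """Acceptance probability for exchanging the configurations of two replicas.

    Replica i currently runs at temp_i with potential energy energy_i,
    replica j at temp_j with energy_j. Accepting swaps with this probability
    preserves the canonical distribution at each temperature.
    """
    beta_i = 1.0 / (K_B * temp_i)
    beta_j = 1.0 / (K_B * temp_j)
    delta = (beta_i - beta_j) * (energy_i - energy_j)
    # delta >= 0 means the swap is always accepted; this also avoids overflow.
    return 1.0 if delta >= 0.0 else math.exp(delta)


if __name__ == "__main__":
    print(remd_swap_probability(-200.0, 300.0, -100.0, 400.0))
```

The criterion depends only on total potential energies and temperatures, which is what makes the method DOF-nonspecific.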
Our works in recent years
• Amplified collective motion MD simulation (B)
• Obtaining minimum energy paths in QM/MM modeling of enzymatic reactions with a modified nudged elastic band method (A2)
• Coarsely guided sampling of folding trajectories of a small protein domain in implicit solvent (A1)
• Hamiltonian replica exchange simulation with free energy surface-derived umbrella potentials (B)
Accelerating conformational search by amplifying collective motions
Collective coordinates have been used in the analysis of protein dynamics for a long time:
• Normal mode analysis
• Principal component (or essential dynamics) analysis of conformational sets
• Coarse-grained elastic network models
Several important observations from such studies:
• Protein motions (e.g., atomic positional fluctuations) are dominated by a very small number of slow modes.
• These slow modes often correspond to functional motions.
• The low-frequency space is insensitive to the details of the models.
Zhang et al., Biophys. J., 2003, 84, 3583. He et al., J. Chem. Phys., 2003, 119, 4005.
Derive low-frequency collective modes using the coarse-grained elastic network model
• No need for an exact minimum; only a single conformation is used
• Low-frequency modes can be updated on the fly in a simulation
• Correctly captures the low-frequency modes along the “valley” on the energy surface (for compact structures)
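The mode derivation can be sketched with an anisotropic elastic network, in which every pair of sites closer than a cutoff is joined by an identical spring. This is a generic sketch, not the exact model of the cited papers. The cutoff, the uniform spring constant `gamma`, and the function names are assumptions.

```python
import numpy as np


def anm_hessian(coords, cutoff=1.0, gamma=1.0):
    """Hessian of an elastic network built from a single conformation.

    coords: (N, 3) array of site positions (e.g., C-alpha atoms).
    cutoff: pairs closer than this distance are connected (assumed value).
    """
    n = len(coords)
    hessian = np.zeros((3 * n, 3 * n))
    for i in range(n):
        for j in range(i + 1, n):
            d = coords[j] - coords[i]
            r2 = d @ d
            if r2 > cutoff**2:
                continue
            block = -gamma * np.outer(d, d) / r2
            hessian[3 * i:3 * i + 3, 3 * j:3 * j + 3] = block
            hessian[3 * j:3 * j + 3, 3 * i:3 * i + 3] = block
            # Diagonal blocks accumulate minus the off-diagonal blocks.
            hessian[3 * i:3 * i + 3, 3 * i:3 * i + 3] -= block
            hessian[3 * j:3 * j + 3, 3 * j:3 * j + 3] -= block
    return hessian


def slowest_modes(coords, n_modes=3, cutoff=1.0):
    """Return the n_modes lowest non-trivial eigenvalues and eigenvectors."""
    evals, evecs = np.linalg.eigh(anm_hessian(coords, cutoff))
    # The first six modes are rigid-body translations and rotations
    # (assuming the network is connected and rigid).
    return evals[6:6 + n_modes], evecs[:, 6:6 + n_modes]
```

Because the Hessian needs only the current coordinates, it can be rebuilt from any snapshot, which is what allows the modes to be updated on the fly.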
Advantages
• Sampling in conformational space is extended along “valleys” of the energy landscape.
• No “melting” of local structures.
• Low-frequency subspace updated on the fly.
• No deformation of the potential energy surface.
• No pre-definition of “path” or “reaction coordinates”.
Drawbacks
• Functionally important motions may not correspond to the slowest few modes
• Does not correspond to any equilibrium ensemble, so results are difficult to make quantitative
Test systems
• Inter-domain motions of T4 lysozyme in explicit solvent
• Folding of an S-peptide analog (in implicit solvent described by a Generalized-Born model)
Bacteriophage T4 lysozyme
[Figure: X-ray structures; labels: env 1 (0.40 nm), env 2 (0.13 nm)]
First three modes of the coarse-grained model: 80% of the variations
ACM-MD produces larger fluctuations
Atomic positional RMS fluctuations in MD (300 K, dashed line) and ACM-MD (three slowest modes: 800 K; other modes: 300 K)
Zhang et al., Biophys. J., 2003, 84, 3583
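The two-temperature setup in this caption can be sketched as follows. This is a simplified stand-in, not the published ACM algorithm. It assumes the collective modes are orthonormal in mass-weighted coordinates and it rescales velocities instantaneously, where a production code would more likely couple each subspace to its own thermal bath. The function names are hypothetical.

```python
import numpy as np

K_B = 0.0083144626  # Boltzmann constant in kJ/(mol K)


def _split(vel, masses, modes):
    """Split mass-weighted velocities into the mode subspace and the remainder."""
    sqrt_m = np.sqrt(np.repeat(masses, 3))
    u = vel.reshape(-1) * sqrt_m
    u_slow = modes @ (modes.T @ u)  # projection onto the collective modes
    return sqrt_m, u_slow, u - u_slow


def subspace_temperatures(vel, masses, modes):
    """Instantaneous temperatures of the slow subspace and of all other DOFs."""
    _, u_slow, u_rest = _split(vel, masses, modes)
    n_slow = modes.shape[1]
    n_rest = u_slow.size - n_slow
    return (u_slow @ u_slow / (n_slow * K_B),
            u_rest @ u_rest / (n_rest * K_B))


def acm_rescale(vel, masses, modes, t_slow=800.0, t_rest=300.0):
    """Rescale each subspace to its own target temperature.

    vel: (N, 3) velocities; masses: (N,); modes: (3N, k) orthonormal columns.
    """
    sqrt_m, u_slow, u_rest = _split(vel, masses, modes)
    temp_slow, temp_rest = subspace_temperatures(vel, masses, modes)
    u_new = (u_slow * np.sqrt(t_slow / temp_slow)
             + u_rest * np.sqrt(t_rest / temp_rest))
    return (u_new / sqrt_m).reshape(vel.shape)
```

Only the kinetic energy along the chosen modes is raised, so the potential energy surface itself is left unchanged. Constraints and center-of-mass motion are ignored in this sketch.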
ACM-MD sampled larger variations in the two PCA directions.
Projection onto the two largest principal components of the crystal structures (dots), MD trajectory (red), and ACM-MD trajectory (blue).
Zhang et al., Biophys. J., 2003, 84, 3583
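A projection of this kind can be sketched in a few lines. The sketch assumes that all structures have already been superposed on a common reference and flattened to rows of length 3N. The function name `pca_project` is hypothetical.

```python
import numpy as np


def pca_project(reference_set, trajectory, n_components=2):
    """Project trajectory frames onto the principal components of a reference set.

    reference_set: (M, 3N) array, e.g., a set of crystal structures.
    trajectory:    (T, 3N) array of simulation frames.
    Returns a (T, n_components) array of projections.
    """
    mean = reference_set.mean(axis=0)
    # Rows of vt are the principal axes, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(reference_set - mean, full_matrices=False)
    axes = vt[:n_components]
    return (trajectory - mean) @ axes.T
```

Calling the function once per trajectory, with the same reference set each time, places MD and ACM-MD frames on the same two axes so that their coverage can be compared directly.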
ACM-MD and normal MD are similar in intra-domain motions
[Figure panels: number of residues in secondary structures; RMSD from native structure; N-terminal domain; C-terminal domain. Solid: MD. Dotted: ACM-MD.]
Zhang et al., Biophys. J., 2003, 84, 3583
Folding of an S-peptide analog
[Figure: panels a and b, comparing MD and ACM-MD]
ACM-MD refolds the peptide while normal MD cannot
[Figure panels: MD, starting from native; ACM-MD, starting from native; MD, starting from unfolded; ACM-MD, starting from unfolded]
Solid: RMSD from the unfolded structure as a function of time
Dotted: RMSD from the native structure as a function of time
Zhang et al., Biophys. J., 2003, 84, 3583
The ACM method:
• Collective DOFs; kinetic energy-based; improves sampling
• Non-equilibrium ensemble, and therefore difficult to make quantitative
Application by another group: Biochemistry, 2006, 45 (51), 15269-15278
Chain-of-states method in path optimization: the nudged elastic band method
• Each replica moves to minimize the force perpendicular to the path and to maintain an even distribution of the replicas along the path
• Force: reaction coordinate driven
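The force on each replica can be sketched as below for a generic nudged elastic band. This is the textbook form with a simple bisecting tangent, not the modified method of this work. The toy potential V(x, y) = y² and the spring constant are assumptions for illustration.

```python
import numpy as np


def neb_forces(images, gradient, k_spring=1.0):
    """Return NEB forces for a chain of images; the end points are held fixed."""
    images = np.asarray(images, dtype=float)
    forces = np.zeros_like(images)
    for i in range(1, len(images) - 1):
        # Simple tangent estimate from the two neighbouring images.
        tangent = images[i + 1] - images[i - 1]
        tangent = tangent / np.linalg.norm(tangent)
        grad = gradient(images[i])
        # True force with its component along the path removed.
        perpendicular = -(grad - (grad @ tangent) * tangent)
        # Spring force along the path, which evens out the image spacing.
        gap_next = np.linalg.norm(images[i + 1] - images[i])
        gap_prev = np.linalg.norm(images[i] - images[i - 1])
        parallel = k_spring * (gap_next - gap_prev) * tangent
        forces[i] = perpendicular + parallel
    return forces


def valley_gradient(point):
    """Gradient of the toy potential V(x, y) = y**2 (a valley along x)."""
    return np.array([0.0, 2.0 * point[1]])


chain = np.array([[0.0, 0.1], [0.3, 0.1], [1.0, 0.1]])
forces = neb_forces(chain, valley_gradient)
```

For the middle image, the perpendicular part pulls it down into the valley while the spring part pushes it toward the midpoint between its neighbours. Together these are the two goals stated on the slide.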
Advantages:
• No pre-assumed reaction coordinate.
• Well suited to parallel computation.
Problems for enzymatic reactions:
• Enzyme systems contain many floppy degrees of freedom.
• Impractically small radius of convergence.
Soft spectator degree of freedom Y spoils the NEB calculation
Xie et al., J. Chem. Phys., 2004, 120, 8039.
Heuristic solution: exclude spectator degrees of freedom
• Use a set of inter-atomic distances (chemical subspace)
• Multiple-step reactions
[Figure axes: f(d) and d]
Xie et al., J. Chem. Phys., 2004, 120, 8039.
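The idea of measuring the path only in a chemical subspace can be sketched as follows. The atom indices, the pair list, and the function names are hypothetical. The published method's definition of f(d) and its treatment of forces are not reproduced here.

```python
import numpy as np


def chemical_subspace(coords, pairs):
    """Map Cartesian coordinates (N, 3) to a vector of selected distances."""
    return np.array([np.linalg.norm(coords[i] - coords[j]) for i, j in pairs])


def path_spacing(images, pairs):
    """Distance between consecutive replicas, measured in the chemical subspace."""
    d = np.array([chemical_subspace(c, pairs) for c in images])
    return np.linalg.norm(np.diff(d, axis=0), axis=1)


# Three atoms: 0 and 1 form the reacting bond, 2 is a floppy spectator.
start = np.array([[0.0, 0.0, 0.0], [0.15, 0.0, 0.0], [1.0, 1.0, 0.0]])
spectator_moved = start.copy()
spectator_moved[2] += [0.5, -0.3, 0.2]   # large spectator displacement
bond_stretched = start.copy()
bond_stretched[1] += [0.05, 0.0, 0.0]    # small change in the reacting bond

spacing = path_spacing([start, spectator_moved, bond_stretched], [(0, 1)])
```

In this toy case the large spectator move contributes nothing to the path length, while the small bond stretch does. Continuity of the chain is then enforced only on the coordinates that define the chemistry.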
The acylation step of type A beta-lactamase
Energy decomposition: TS stabilization
Xie et al., J. Chem. Phys., 2004, 120, 8039.
An application: metal preferences of metalloproteases
• E. coli peptide deformylase: prefers Fe++ over Zn++
• Thermolysin: prefers Zn++
Dong et al., J. Phys. Chem. B, 2008, 112, 10280-10290.
Comparative modeling of Zn-TLN and Zn-PDF using NEB
Dong et al., J. Phys. Chem. B, 2008, 112, 10280-10290.
ab initio QM/MM Potential energy surfaces reproduce metal preferences
Dong et al., J. Phys. Chem. B, 2008, 112, 10280-10290.
Summary
• Some general discussion of “coordinates”-based or DOF-specific approaches to accelerating the modeling of slow processes
• Two particular types of approaches
  • Amplified collective motions
  • NEB adapted for the simulation of enzyme reactions
• An example showing that comparative modeling provides biochemical insights
Acknowledgements
• Zhiyong Zhang, Jianbin He (ACM)
• Li Xie (adapted NEB)
• Minghui Dong (PDF and TLN)
• All former and current group members
• Adapted NEB: Weitao Yang and group
• Funding: CAS, NSCFC