360 likes | 511 Views
Tools and methods for multiscale biomolecular simulations. Celeste Sagui Department of Physics, NC State University, Raleigh NC.
E N D
Tools and methods for multiscale biomolecular simulations Celeste Sagui Department of Physics, NC State University, Raleigh NC We represent a partnership of 7 researchers located at the 3 Universities and the National Institute of Environmental and Health Sciences (NIEHS), all located within North Carolina’s Research Triangle NC State Members: Jerry Bernholc, Lubos Mitas, Christopher Roland and Celeste Sagui (PI) – Physics UNC Member: Lee Pedersen – Chemistry and NIEHS Duke Member: John Board – Computer Science NIEHS Collaborator: – Tom Darden – Structural Biology Lab
ITR Scientific Aims • explore science as to enable a set of scalable computational tools for large-scale, multiscale biomolecular simulations • multiscale methods are to range from Quantum Monte Carlo (QMC) to continuum methods • codes will be based on real-space grids, with multigrid acceleration and convergence • electrostatics will be treated in a highly efficient and accurate manner • codes will be used to solve paradigmatic biomolecular problems • codes will ultimately be distributed under the Open Source GPL license
Accurate and efficient electrostatics for large-scale biomolecular simulations (Sagui, Darden, Roland) Coupling of QMC and MD (Mitas) Thomas Fermi and DFT (Bernholc) PMEMD and AMBER 8.0 (Pedersen, Duke) Applications Some Highlights of Current Progress
Classical Molecular Dynamics Developments: Accurate Electrostatics
Accurate electrostatics is absolutely essential for meaningful biomolecular simulations (i.e., they stabilize the delicate 3-d structures, bind complexes together, represent the computational bottleneck in current simulations, etc) Key Challenges: More accurate description of electrostatics with higher-order multipoles is needed (Our solution: Wannier functions) Computationally efficient ways of simulating such systems are needed (Our solution: advanced PME/ Multigrid approaches) Accurate and Efficient Electrostatics for Large-Scale Biomolecular Simulations
Limitations of Current Modeling of Electrostatic Fields • Higher-order multipoles have not been implemented due to their overwhelming costs: • Multipoles up to hexadecapoles have 35 degrees of freedom, so that interaction matrix between them has 1225 components i.e., cost of fixed cutoff implementation is 3 orders of magnitude more than just charges alone !! • Ewald implementation grows like O(N2) • Use of cutoffs alleviate the problem: WRONG !! • much of the cost originates in the direct part • truncation leads to artifactual behavior unless cutoffs of the order of 25 Å are used
Our Solution • Implement a McMurchie-Davidson formalism for the direct part of the Ewald summation • Switch most of the calculation to reciprocal space • Implement a Particle-Mesh Ewald (PME)-based approach for single-processor machines • Implement a multigrid-based approach for parallel machines Sagui, Pedersen, and Darden, J. Chem. Phys. 120, 73 (2004)
PME-based results for 4096 water molecules Relative RMS force error: 5x10-4 ; for error 5x10-5, hexadecapole cost is 4.4 secs; 5x10-6 – cost is 5 secs With Rc=8 Å cutoff, costs is 6 times more than with PME and has a RMS error of about 0.05 Single processor Intel Xeon, 3.06 GHz, 512kB cache, 2GB memory, g77 compiler
Multigrid-based results for 4096 water molecules Relative RMS force error: 5x10-4 Single processor Intel Xeon, 3.06 GHz, 512kB cache, 2GB memory, g77 compiler
MD simulation of DNA decamer d(CCAACGTTGG)2 in a crystal environment J. Baucom et al, submitted to J. Chem. Phys. 2004
RMS deviation of crystal simulations at constant pressure with respect to crystal structure charges and induced dipoles charges only
Calculating Multipole Moments via Wannier Functions (WFs) • to partition the charge density and calculate the multipole moments, we use a WANNIER FUNCTION (WF) approach • this has several advantages: • WFs provide for a chemical and physically intuitive way of partitioning the charge (ref. Marzari and Vanderbilt, PRB 56, 12847 (1997)) • WFs are distributed in space, which allows for a more faithful representation of the electrostatic potential • no ad hoc assignment of the charge • numerically quite stable procedure Ref: C. Sagui, P. Pomorski, T. Darden and C. Roland, J. Chem. Phys. 120, 4530 (2004)
Maximally localized Wannier functions for water • water molecule has 4 WFs • 2 associated with OH bond (light blue) • 2 associated with O lone pairs (dark blue)
Electrostatic potential for single water molecule as generated by WFs ab initio m mdq mdqo mdqoh
Wannier functions for carbon dioxide • CO2 has 8 WFs • 6 associated with the CO bonds (light blue) • 2 associated with O (dark blue)
Electrostatic potential for carbon dioxide molecule as generated by WFs ab initio m mdq mdqo mdqoh
New continuous quantum Monte Carlo/molecular dynamics method • we propose a new method for coupling ab initio molecular dynamics ionic with stochastic DMC electronic steps to provide accurate DMC energies “on-the-fly” • exploits the slowness of MD evolution which enables to update the QMC sampling process very efficiently • accurate for both thermal averages and description of energies along the pathways • we have carried out the first QMC/MD simulations using both forces and energies from QMC Ref: J. Grossmann and L. Mitas, preprint 2004
Coupling of QMC and MD: Basic Idea average distance by an electron in a typical DMC time step 10-2 … 10-1 a.u. average distance made by an ion in one MD time step 10-4 … 10-3 a.u. Instead of discrete sampling of each point with a new QMC run: calculate QMC energies“on-the-fly”during the dynamic simulation ! Continuously update the DMC walkers so that they correctly represent the evolving wave function (CDMC method) Evolution of both configuration spaces is coupled: as the ionic dynamical trajectories evolve, so does the population of DMC electrons
ab initio MD Step R, (R) Compute orbital overlaps with current DMC walkers do VMC then DMC yes • Stable DMC population • How accurate is it? • Benchmark against discrete DMC Orbital swapping or rotation? no Check for node crossings compute weights Take DMC step(s) and calculate energy Stable CDMC simulation Successful CDMC Algorithm
As simulation progresses, 1-step CDMC energies begin to differ significantly from discrete DMC • Using 3 steps corrects time “lag” • 33 times more efficient than discrete sampling Thermal Averages (over 1 ps) E(discrete DMC) = -6.228(2) E(1 step continuous) = -6.220(2) E(2 steps continuous) = -6.220(2) E(3 steps continuous) = -6.226(2) E(10 steps continuous) = -6.230(2) E(20 steps continuous) = -6.228(2) • Thermal averages are converged for N≥3 • Same convergence (3 CDMC steps) observed for Si2H6 and Si5H12 CDMC: Number of DMC step needed per MD step • Use large discrete sampled runs (1000 steps each) for comparison
CDMC: Si2H6 As for SiH4, asymmetrically stretch molecule and let go Average temperature ~ 1500 K
CDMC: Si2H6 Results ForSi2H6, 3 steps appears to lead to stability as for SiH4 # steps looks like a function of dynamics rather than size Can pinpoint specific types of strain that lead to wf lag
H2O Dissociation SiH4 at 1500 K • DMC-MD and DFT-MD trajectories are in excellent agreement • DMC forces in very good agreement with DFT forces Test of quantum Monte Carlo/molecular dynamics method on water dissociation “QMC only” molecular dynamics, with no external input from DFT
Development of hydrid QM calculations: interfacing DFT with Thomas-Fermi calculations • Idea: in many biological systems, only part of the system is chemically active • Use ab initio methods for this part (real-space, multigrid-based code) • Use more approximate methods for the rest of the system (in this case Thomas Fermi approach, with frozen density for molecules, and gradient corrections) Ref: M. Hodak, W. Lu, and Bernholc, in preparation
Hybrid calculation tests • Interaction of two water molecules in hybrid calculation - Hydrogen bonding test - Molecule 1: Ab initio - Molecule 2: Thomas-Fermi Gives estimated speed-up of 500 times !!
A redesign of the AMBER SANDER program, along with a rewrite in FORTRAN 90 were undertaken with the goal of substantially improving the practicality of multi-nanosecond PME simulations (i.e., in 100,000 to 300,000 atom range) Resulting software has been released to the AMBER community in 3 phases – the new software is named PMEMD for PARTICLE MESH EWALD MOLECULAR DYNAMICS Improvements in performance and parallel scalability of AMBER MD Software Release Dates: July 2003 – PMEND 3.00 October 2003 – PMEMD 3.10 March 2004 – PMEMD 8.0 (part of AMBER 8.0) PMEMD is now the primary high performance MD modeling tool in AMBER !!
Representative Benchmark Results for the Factor IX constant pressure system from Dr. Lalith Perera, a solvated system with a total of 90906 atoms. The time step was 1.5 fs, the direct force cutoff was 8.0 angstrom, all simulations used PME. The runs were done on the IBM 1.3 GHz p690 Regatta at the Edinburgh Parallel Computing Centre. #procs PMEMD 8 PMEMD 3.1 PMEMD 3.0 Sander 8 Sander 7 Sander 6 psec/day psec/day psec/day psec/day psec/day psec/day 8 nd 346 353 nd 233 182 16 672 607 594 nd 279 258 32 1125 1035 929 nd 306 297 64 1975 1770 1127 369 318 339 96 2743 2304 nd nd nd nd 112 2945 2631 nd nd nd nd 128 2516* 2864 nd 339 nd nd * Performance falloff observed here. Max performance obtained at higher processor count was 3600 psec/day, but required using only 4 of the 8 cpu's on each 8 cpu multi-chip module of the SP4.
Software References Maximum throughput obtainable for SP4 is up an order of magnitude for PMEMD 8, than any other version ofSANDER Software Publication References: R.E. Duke and L.G. Pedersen, PMEMD 3.0 (2003) R.E. Duke and L.G. Pedersen, PMEMD 3.1 (2003) D.A. Case et al, AMBER 8.0 (2004)
QM/MM studies of enzymatic sulfate transfer in the heparan sulfate precursor QM/MM studies of enzymatic sulfate transfer in estrogen sulfation PMEMD study of the mammalian P450 enzyme and the ternary blood coagulation complex tissue factors Protein folding study on ionic domains of the coagulation protein protothrombin Solvation and deprotonation of formic acid Crystallographic studies of DNA Binding of vancomycin and teicoplanin antibiotics to bacterial cell wall termini Structure and function of serine proteases QM/MM study of role of Mg ions in the mechanism of DNA polymerase Some Applications …
Mixed Quantum and Molecular Mechanics Simulations of Sulfuryl Transfer Reaction Catalyzed by Human Estrogen Sulfotransferase P. Lin and L. Pedersen K47 PAPS E2 H107
Mixed Quantum and Molecular Mechanics Simulations of Sulfuryl Transfer Reaction Catalyzed by Human Estrogen Sulfotransferase P. Lin and L. Pedersen • estrogen is one of the most important hormones found in the human body • it is extremely important that the body regulate estrogen, being able to both turn it on and off • the deactivation of estrogen takes place by means of transfering a sulfate group to the hormone • the details of this important reaction were investigated by means of a mixed quantum and classical molecular dynamics simulation, as shown in the movie • movie shows how the sulfate group gets placed on the estrogen
Scientific aims are to produce a set of scalable and portable computational tools for multiscale biomolecular calculations Considerable progress in number of aspects: Development of accurate and efficient methods for treatment of long-range electrostatic forces Development of QMC and MD methods Development of DFT and TF interface PMEMD and AMBER 8.0 Applications Summary