Discovering reaction pathways via an innovative hypersphere search method in organic chemistry. Explore potential energy surfaces to prioritize pathways without detailed system knowledge. Utilizes normal coordinates for efficient energy calculations.
Scaled Hypersphere Search Method By Timm Reumann Prof. Dr. K. K. Baldridge Organic Chemistry Institute, University of Zürich, Switzerland
Potential Energy (Hyper-) Surface (PES) PES: Potential Energy Surface
Potential Energy (Hyper-) Surface (PES)
[Figure: Energy vs. Reaction Coordinate, with ground states GS0, GS1, GS2 and transition states TS1, TS2]
• Molecule in Ground State (GS): Stable Equilibrium of Forces
• Molecule in Transition State (TS): Unstable Equilibrium of Forces, with 2 directions leading downwards
• Reaction Path: Valley on PES connecting 2 GSs
Chemical Bond and Electron Clouds
Shape of Molecule => Energy?
• Electrostatic Interaction between Electrons (--) => Repulsion
• Electrostatic Interaction between Atomic Cores (++) => Repulsion
• Electrostatic Interaction between Electrons and Atomic Cores (-+) => Attraction
• Overlap Interaction => Attraction (usually)
[Figures: 1. Bond Stretching, 2. Angle Bending]
Motivation: • Finding all (relevant) reaction pathways for a given molecule (e.g. catalyst) • e.g.: • modifying a catalyst to favor a specific reaction • unfortunately also a side reaction favored now • survey over • Prioritizing reaction pathways • Without Specific Knowledge about given System (e.g. Initial Guess)
But:
• Initial guess of TS, Pathway, or further GS still required by established methods (e.g. nudged elastic band and string methods)
• OR: systematic exploration of all possible reaction pathways starting at one GS
1. No general formula for energy of a molecule available (esp. at TS)
• Iteration on single electron clouds and their total charge distribution
• yields electron clouds of minimal energy and the energy itself (Self-Consistent-Field Procedure, SCF)
2. Calculation of 1st and 2nd derivatives of the energy with respect to the atomic positions to follow the reaction path
• requires converged electron clouds => SCF-Procedure beforehand
• (esp. 2nd derivatives computationally expensive)
3. Changing one coordinate affects the dependency of the energy on another coordinate (Energy coupling between coordinates)
Harmonic Approximation of Energy (computationally cheap) Instead of 1st and 2nd Derivatives
• Describing Chemical Bond as a Spring:
• F = -k_f (x - x_0) => E_harm = 0.5 k_f (x - x_0)^2
• E_harm → ∞ for x → ∞
• unbreakable bonds
• no chemical reaction
[Figure: E_harm and E_SCF plotted against x, with the dissociation limit of E_SCF and the gap ΔE marked; GS and TS positions along x]
Since Chemical Bonds ARE breakable:
• Downwards Deviation (ΔE) of "true" energy (E_SCF) from harmonic energy (E_harm)
• potential reaction path
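A minimal sketch of the downward deviation ΔE for a single bond. The slides obtain the "true" energy from SCF runs; here a Morse potential stands in for E_SCF, which is purely an illustrative assumption, as are the parameter values and function names.

```python
import math

def e_harm(x, x0, kf):
    """Harmonic (spring) energy around the equilibrium bond length x0."""
    return 0.5 * kf * (x - x0) ** 2

def e_morse(x, x0, kf, d_e):
    """Morse energy with curvature kf at x0 and dissociation energy d_e.

    Stand-in for E_SCF (assumption): it levels off at d_e for large x,
    so the bond can break.
    """
    a = math.sqrt(kf / (2.0 * d_e))
    return d_e * (1.0 - math.exp(-a * (x - x0))) ** 2

def delta_e(x, x0=1.0, kf=1.0, d_e=0.2):
    """Deviation of the 'true' energy from the harmonic energy."""
    return e_morse(x, x0, kf, d_e) - e_harm(x, x0, kf)

if __name__ == "__main__":
    for x in (0.5, 1.0, 2.0, 3.0):
        print(f"x = {x:.1f}  dE = {delta_e(x):+.4f}")
```

At the equilibrium length ΔE vanishes; on stretching it turns negative, which is the signature the search looks for.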
Exploration of PES in Normal Coordinates II Instead of Real Space Coordinates
Energy Coupling between Real Space Coordinates:
E_tot = E(X_1, …, Y_2, …, Z_3) + E(X_2, …, Y_3) + …
E_tot = E(L_1, ϕ_1) + E(ϕ_1, L_1, L_2) + … + E(φ_1, ϕ_1, L_1, L_2, L_3, …) + …
L: Bond Length; ϕ, φ: Bending, Torsion Angle
• For example:
• Stretching a bond
• Weaker bond
• Bending that bond now easier
• N^d SCF-Energy Calculations required (N sample points per coordinate, d coordinates)
[Figure: grid of sample points over Real Space Coord. 1 and Real Space Coord. 2 for H2O]
Exploration of PES in Normal Coordinates II Instead of Real Space Coordinates
Decoupling in Normal Coordinates of Vibrational Modes:
• smart combinations of basic atomic displacements
• compensating each other's influences on energy terms
• each mode represented by basis vector n_i in normal coordinate space
E_tot ≈ e_1 n_1^2 + e_2 n_2^2 + … + e_n n_n^2
• Only N·d Sample Points Necessary
• at least around Ground State
[Figure: H2O vibrational modes n_1, n_2, n_3 with e_i values 4080, 1960, 4080; sample points along Normal Coord. 1 and Normal Coord. 2]
Maeda, S.; Ohno, K.; Chemical Physics Letters, 2003, 381, 177
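A small sketch of why decoupling matters for the number of SCF runs, reading the two counts on the slides as N^d for a full grid over coupled coordinates and N·d for independent scans along decoupled ones. The function names and the example numbers (10 points per coordinate, 3 vibrational modes as for water) are assumptions for illustration.

```python
def samples_coupled(n_per_coord, n_coords):
    """Full grid: every combination of coordinate values needs its own energy."""
    return n_per_coord ** n_coords

def samples_decoupled(n_per_coord, n_coords):
    """Independent 1-D scans: one scan per decoupled coordinate."""
    return n_per_coord * n_coords

if __name__ == "__main__":
    n, d = 10, 3
    print("coupled:  ", samples_coupled(n, d))
    print("decoupled:", samples_decoupled(n, d))
```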
Overall Procedure
1. GS Molecule
2. Vibrational Modes + Eigenvalues (G)
3. Radius of Hypersphere: R_k = R_{k-1} + ΔR
4. Exploration of Spherical PES
5. Points of locally lowest ΔE => Deformed Molecules on Reaction Paths
6. TS region reached? Checking dE/dx, d^2E/dx^2 (G)
• No: back to step 3 with the next radius
• Yes: Optimization to TS (G) => Transition States
7. Downhill Walks (G) => New GS Molecules
8. Comparison with old GS Molecules => Really New GS Molecules
(G): Done with GAMESS-US
[Figure: E_harm and E_SCF vs. q, with dissociation limit, ΔE, GS and TS marked]
Exploration of Spherical PES
Real Coordinate Space:
• Vibrational Modes v_i
• Energy Values e_i
• GS Geometry x_GS
Scaled Normal Coordinate Space:
• Scaling: q_i = v_i e_i^0.5, Q = {q_1, …, q_n}^T
• Radius R_k
• Generation of Grid Points n on Hypersphere
Deformations of GS: x_i = Q^-1 n + x_GS
e.g.:
     X      Y      Z
O    0.09  -1.02   0.00
H   -2.46  -2.70   0.00
H    5.66  -2.60   0.00
[Figure: hypersphere with axes n_1, n_2, n_3]
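A minimal sketch of the back-transformation from a grid point n on the hypersphere to a deformed geometry. It reads "Q^-1 n" as dividing each component n_i by e_i^0.5 and moving along the orthonormal mode v_i, and it ignores mass-weighting; both are assumptions, as are the two-mode toy system and all names below. With this reading, every point on a sphere of radius R has the same harmonic energy 0.5 R^2.

```python
import math

def circle_points(radius, n_points):
    """Evenly spaced grid points n = (n_1, n_2) on a circle (2-D 'hypersphere')."""
    return [
        (radius * math.cos(2.0 * math.pi * k / n_points),
         radius * math.sin(2.0 * math.pi * k / n_points))
        for k in range(n_points)
    ]

def to_real_space(n, modes, eigenvalues, x_gs):
    """x = x_GS + sum_i (n_i / sqrt(e_i)) * v_i for orthonormal modes v_i."""
    x = list(x_gs)
    for n_i, v_i, e_i in zip(n, modes, eigenvalues):
        amplitude = n_i / math.sqrt(e_i)
        for k, v_ik in enumerate(v_i):
            x[k] += amplitude * v_ik
    return x

def harmonic_energy(x, modes, eigenvalues, x_gs):
    """0.5 * sum_i e_i * (projection of the displacement on v_i)^2."""
    dx = [a - b for a, b in zip(x, x_gs)]
    energy = 0.0
    for v_i, e_i in zip(modes, eigenvalues):
        projection = sum(d * v for d, v in zip(dx, v_i))
        energy += 0.5 * e_i * projection ** 2
    return energy

# Toy system (hypothetical values): two coordinates, two orthonormal modes.
S = 1.0 / math.sqrt(2.0)
MODES = [(S, S), (S, -S)]
EIGENVALUES = [4.0, 1.0]
X_GS = (0.0, 0.0)
RADIUS = 0.5

geometries = [to_real_space(n, MODES, EIGENVALUES, X_GS)
              for n in circle_points(RADIUS, 8)]
energies = [harmonic_energy(x, MODES, EIGENVALUES, X_GS) for x in geometries]

if __name__ == "__main__":
    for x, e in zip(geometries, energies):
        print([round(c, 4) for c in x], round(e, 6))
```

The stiff mode (larger e_i) is displaced less than the soft one, so the sphere in scaled coordinates corresponds to an ellipsoid in real space.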
Scaled Normal Coordinate Space:
• E_harm = 0.5 R_k^2
• Points with ΔE-Values: ΔE = E_SCF - E_harm (ΔE > 0 or ΔE < 0)
• Interpolation of ΔE
• Local Minima of ΔE
Real Coordinate Space:
• x_i = Q^-1 n + x_GS
• SCF Runs => E_SCF
• Deformed Molecules x_i on Reaction Paths
[Figure: hypersphere with axes n_1, n_2, n_3]
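A sketch of the last step on the sphere: computing ΔE = E_SCF - E_harm at each grid point and picking the local minima. The "SCF" energy is a toy function of the angle on a circle and the interpolation step is skipped; both are simplifying assumptions, and all names are hypothetical.

```python
import math

def e_scf_toy(theta, radius):
    """Toy stand-in for an SCF energy on the circle (assumption, not a real SCF run)."""
    return 0.5 * radius ** 2 - 0.1 * (1.0 + math.cos(2.0 * theta))

def local_minima(values):
    """Indices whose value is below both cyclic neighbours."""
    n = len(values)
    return [i for i in range(n)
            if values[i] < values[i - 1] and values[i] < values[(i + 1) % n]]

RADIUS = 1.0
N_POINTS = 12
angles = [2.0 * math.pi * k / N_POINTS for k in range(N_POINTS)]
e_harm = 0.5 * RADIUS ** 2                      # constant on the hypersphere
delta_e = [e_scf_toy(t, RADIUS) - e_harm for t in angles]
minima = local_minima(delta_e)

if __name__ == "__main__":
    for i in minima:
        print(f"point {i}: angle = {angles[i]:.3f} rad, dE = {delta_e[i]:+.3f}")
```

Each index returned marks a direction in which the true energy drops furthest below the harmonic estimate, i.e. a candidate reaction path.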
Everything much Easier Now Or???
• Number of Sample Points for Molecules *
• Vibrational Modes calculated at the GS are only valid for the GS
• Going away from GS (deforming molecule):
• Vibrational Modes more and more invalid
• increasing energy coupling between normal coordinates
• Complete SCF-energy calculation for each point necessary
• Computational Time: a few seconds for simple molecules (e.g. water), a few hours for big ones (e.g. organic platinum complex)
* Maeda, S.; Ohno, K.; Chemical Physics Letters, 2003, 381, 177
Submission of Jobs
• Folder of Input Files + Resource Requests (# of Cores, Memory, Wall Time)
• Grid Interface
• Cluster 1, Cluster 2, Cluster 3, Cluster 4
• Grid Interface
• Folder of Output Files
GAMESS: General Atomic and Molecular Electronic Structure System
• 2 Versions of GAMESS: GAMESS-US (free) and GAMESS-UK (commercial)
• multipurpose Quantum Chemistry Package
• Energies, Forces, Vibrational Modes on various theory levels, Properties of Molecules …
• used here: Optimization of GS and TS Molecules, Calculation of Vibrational Modes and SCF-Energies
• Distributed Data Interface (DDI):
• communication layer for parallel execution of GAMESS
• manages dynamic memory allocation and data exchange between single cores and processes
• does I/O-operations for each single calculation process
SCF-Procedure
• optimizing shapes of electron clouds => lowest energy
• energy of a single cloud depends on the shapes of all other clouds due to electrostatic interaction (and exchange interaction)
• Knowledge about all other clouds required to determine each single one
Loop:
1. Initial Electron Clouds
2. Summing up => Total Charge Distribution
3. Optimization (Eigenvector Determination)
4. ΔE or ΔChargeDist. below predefined value?
• No: back to step 2
• Yes: Done => Optimal Electron Clouds
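The self-consistency loop can be sketched generically as a fixed-point iteration with a convergence threshold and a cycle limit. This is not GAMESS's SCF; the scalar "state" and the toy update functions are assumptions chosen only to show the control flow, including a case that never converges.

```python
import math

def self_consistent_loop(update, state, tol=1e-8, max_cycles=100):
    """Iterate state -> update(state) until the change drops below tol.

    Returns (final_state, cycles_used, converged).
    """
    for cycle in range(1, max_cycles + 1):
        new_state = update(state)
        change = abs(new_state - state)
        state = new_state
        if change < tol:
            return state, cycle, True
    return state, max_cycles, False

if __name__ == "__main__":
    # Converging toy update: x = cos(x) has a stable fixed point.
    print(self_consistent_loop(math.cos, 1.0))
    # Oscillating toy update: the change never shrinks, so the loop gives up.
    print(self_consistent_loop(lambda x: -x, 1.0))
```

The third return value is what an automated workflow would inspect: a run that hits the cycle limit is flagged instead of being treated as a valid energy.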
Problems with SCF-Runs on large Number of Jobs (e.g. 2000)
1) Automated Error Treatment:
• inappropriate input parameters
• allocated resources exceeded
• node / cluster crashed
• convergence problems of SCF-Procedure
2) Prediction of Resource Requirements (examples for very small molecules, 4 - 20 atoms):
• Memory, Number of CPUs, e.g.: 3 - 25 MB per Core, 16 Cores
• Hard Disk Space, e.g.: 6 - 30546 MB
• Wall Clock Time, e.g.: 1 - 17 sec
Automated Error Treatment 1. Separation of Jobs (successful, failure type 1, 2, …) • Inspection of output files for (error-) messages • e.g.: ---------------------------------------- ddikick.x: exited gracefully. real 0m8.210s user 0m0.003s sys 0m0.006s finish time: Mon Jul 11 22:14:07 CEST 2011 Read from remote host compute-1-412: Connection reset by peer Failed creating /state/partition1/grid027/121792 on compute-1-412 • Sorting them according to found messages
Some more Examples: Initiating 16 compute processes on 7 nodes to run the following command: /share/apps/gamess-2010R1-ethernet-gfortran//gamess.00.x INP00018 ddikick.x: Timed out while waiting for DDI processes to check in. ddikick.x: Fatal error detected. The error is most likely to be in the application, so check for input errors, disk space, memory needs, application bugs, etc. ddikick.x will now clean up all processes, and exit... II,JST,KST,LST =318 1 1 1 NREC = 3768 INTLOC = 5381 PWRT: NODE 0 ENCOUNTERED I/O ERROR WRITING UNIT 8 EXECUTION OF GAMESS TERMINATED -ABNORMALLY- AT Wed Jul 13 10:08:36 2011 4613448 WORDS OF DYNAMIC MEMORY USED CPU 0: STEP CPU TIME= 28.41 TOTAL CPU TIME= 57.7 ( 1.0 MIN) TOTAL WALL CLOCK TIME= 95.6 SECONDS, CPU UTILIZATION IS 60.40% DDI Process 0: error code 911 ddikick.x: application process 0 quit unexpectedly. ddikick.x: Fatal error detected. The error is most likely to be in the application, so check for input errors, disk space, memory needs, application bugs, etc. ddikick.x will now clean up all processes, and exit...
Automated Error Treatment
2. Case Specific Treatment
1) Successful Job: Extraction of SCF-Energy
2) Inappropriate input parameters: correction by a human, e.g.: “ILLEGAL …” in output file
3) Allocated Resources exceeded: Resubmission of Job with larger Resources. How to avoid repetition of entire computation (e.g. SCF-Procedure)?
4) Node / cluster crashed: Resubmission of Job on another cluster (?)
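A minimal sketch of the sorting step, using only message fragments that appear in the example outputs quoted in these slides. The category labels and the order of the checks are assumptions; a real workflow would need a longer, tested pattern list.

```python
import re

# (label, pattern) pairs; labels are hypothetical, patterns come from the
# example outputs in the slides. Failure patterns are checked before success,
# because a failed job can still contain "exited gracefully".
FAILURE_PATTERNS = [
    ("input_error", re.compile(r"ILLEGAL")),
    ("io_error", re.compile(r"I/O ERROR WRITING")),
    ("ddi_timeout", re.compile(r"Timed out while waiting for DDI processes")),
    ("node_problem", re.compile(r"Connection reset by peer|Failed creating")),
    ("abnormal_exit", re.compile(r"TERMINATED -ABNORMALLY-")),
]

def classify_output(text):
    """Return the first matching failure label, else 'success' or 'unknown'."""
    for label, pattern in FAILURE_PATTERNS:
        if pattern.search(text):
            return label
    if "exited gracefully" in text:
        return "success"
    return "unknown"

if __name__ == "__main__":
    sample = ("ddikick.x: exited gracefully.\n"
              "Read from remote host compute-1-412: Connection reset by peer\n")
    print(classify_output(sample))
```

Jobs labelled "unknown" are the ones worth inspecting by hand, since they point to messages the pattern list does not cover yet.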
Automated Error Treatment Convergence Problems How to make the computer recognize Convergence Problems?
Prediction of Resource Requirements
1. Memory and Number of Cores
• Allocated for each Core:
• Memory for Replicated Data (MEM_replic)
• Memory for Shared Data (MEM_shared)
• Using several cores to accumulate enough total memory:
• MEM_shared + MEM_replic < available MEM_Core
• MEM_shared = total MEM_shared / N_Cores
• available MEM_Core = total MEM_Core - MEM_GAMESS+OS
• e.g.: MEM_GAMESS+OS ≈ 50 MB + 25 MB
• more cores => faster run
• fewer cores => memory overflow
[Figure: node with Core 1 and Core 2, each with an SCF-Run, plus DDI-Process and HDD]
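The memory condition can be turned into a small helper that returns the smallest core count satisfying it. The function name, the search limit, and the example numbers are assumptions; the 75 MB default overhead follows the "≈ 50 MB + 25 MB" example.

```python
def min_cores_for_memory(total_shared_mb, replicated_mb, total_core_mb,
                         overhead_mb=75.0, max_cores=1024):
    """Smallest N with total_shared/N + replicated < total_core - overhead.

    Returns None if the replicated data alone does not fit on one core,
    or if no N up to max_cores satisfies the condition.
    """
    available = total_core_mb - overhead_mb
    if replicated_mb >= available:
        return None
    for n_cores in range(1, max_cores + 1):
        if total_shared_mb / n_cores + replicated_mb < available:
            return n_cores
    return None

if __name__ == "__main__":
    # Hypothetical job: 400 MB shared in total, 25 MB replicated, 175 MB per core.
    print(min_cores_for_memory(400.0, 25.0, 175.0))
```

Replicated memory is the hard limit here: adding cores shrinks only the shared share per core, so a job whose replicated data exceeds the available memory per core cannot be rescued by requesting more cores.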
Prediction of Resource Requirements
2. Speed up with Number of Cores
• Communication limits linear speed up
[Figure: 1 / Wall Clock Time vs. N_Cores for two cases: SCF-Energy for Molecule with paired electrons; Vibrational Modes for Molecule with unpaired electrons]
• Waiting time in Queuing System
Prediction of Resource Requirements
3. Hard Disk Storage (HD):
• Each node with its own HD (2 TB)
• accumulating HD space for a single job by using more nodes, i.e. cores
• if HD space is exceeded on 1 node, the entire job crashes - in the best case. Otherwise the system crashes, if GAMESS is not restricted to a Scratch Partition
• Mainly Overlap- and Interaction Integrals of electron clouds of isolated atoms (at least in energy calculations)
• required HD ≈ N_Int · Bytes_Int
• max N_Int = 1/sym · 1/8 · N_Basisfunctions^4
• But: Lots of Integrals almost = 0 and neglected by GAMESS
• actual N_Int << max N_Int
• fraction of ignored Integrals strongly depends on molecule
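A rough upper bound for the integral file size, assuming the usual N^4/8 count of two-electron integrals over N basis functions, reduced by a symmetry factor. The bytes per stored integral (value plus index labels) is a placeholder assumption, not a GAMESS figure, and the function names are hypothetical.

```python
def max_integrals(n_basis, symmetry_factor=1):
    """Upper bound on the number of integrals: N^4 / (8 * sym)."""
    return n_basis ** 4 / (8 * symmetry_factor)

def disk_upper_bound_mb(n_basis, bytes_per_integral=12, symmetry_factor=1):
    """Upper bound on disk space in MB; bytes_per_integral is an assumption."""
    total_bytes = max_integrals(n_basis, symmetry_factor) * bytes_per_integral
    return total_bytes / 1024 ** 2

if __name__ == "__main__":
    for n in (50, 100, 200):
        print(n, "basis functions:", round(disk_upper_bound_mb(n), 1), "MB at most")
```

Doubling the basis set multiplies the bound by 16, so the bound quickly overshoots real usage once small integrals are dropped.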
Prediction of Resource Requirements
Check-Run Mode of GAMESS:
• Input File with all Settings, plus one additional: Exetyp = Check
• GAMESS
• Output File with Estimates of MEM_replic, total MEM_shared, actual N_Int, and some more
• Reliability not assessed yet
How to test Check-Run Mode of GAMESS systematically?
Prediction of Resource Requirements 4. Wall Clock Time: • Depends on: • Level of Theory (RHF ≤ DFT < MP2 < CC… << Full-CI) • Size of Basis Set constituting the Electron Clouds • Number and Size of Atoms in Molecule • And: • Number of required SCF-Cycles for given Molecule • (almost) impossible to predict Benchmarks necessary to obtain Empirical Function for Wall Clock Time: WCT = f(input parameters)
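One possible shape for such an empirical function is a power law in a single size parameter, WCT ≈ a · N^b, fitted by least squares in log-log space. The functional form, the choice of N, and the data below are all assumptions; the data are synthetic (generated from a known law to check the fit), not benchmark results.

```python
import math

def fit_power_law(sizes, times):
    """Least-squares fit of log(t) = log(a) + b * log(n); returns (a, b)."""
    xs = [math.log(n) for n in sizes]
    ys = [math.log(t) for t in times]
    m = len(xs)
    x_mean = sum(xs) / m
    y_mean = sum(ys) / m
    s_xy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    s_xx = sum((x - x_mean) ** 2 for x in xs)
    b = s_xy / s_xx
    a = math.exp(y_mean - b * x_mean)
    return a, b

def predict_wct(n, a, b):
    """Predicted wall clock time for size n under the fitted law."""
    return a * n ** b

# Synthetic data from t = 0.002 * n^3 (not measured values).
SIZES = [10, 20, 40, 80]
TIMES = [0.002 * n ** 3 for n in SIZES]
A_FIT, B_FIT = fit_power_law(SIZES, TIMES)

if __name__ == "__main__":
    print("a =", A_FIT, "b =", B_FIT)
    print("predicted WCT for n = 160:", predict_wct(160, A_FIT, B_FIT))
```

Real benchmarks would scatter around any such curve, in particular because the number of SCF cycles varies from molecule to molecule, so the fit gives an estimate to pad rather than a guarantee.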
2 Problems for you • Predicting Memory-, Core- and HD-Space Requirements: • Testing Check-Run Mode of GAMESS systematically • Automated Recognition of Convergence Problems • in SCF-Energy Calculation
Thank you for your Attention Suggestions?