
Multiresolution Adaptive Numerical Scientific Simulation




  1. Multiresolution Adaptive Numerical Scientific Simulation
     Ariana Beste¹, George I. Fann¹, Robert J. Harrison¹,², Rebecca Hartman-Baker¹, Shinichiro Sugiki¹
     ¹Oak Ridge National Laboratory, ²University of Tennessee, Knoxville
     In collaboration with Gregory Beylkin⁴, Fernando Perez⁴, Lucas Monzon⁴, Martin Mohlenkamp⁵ and others
     ⁴University of Colorado, ⁵Ohio University
     harrisonrj@ornl.gov

  2. DOE funding
     • This work is funded by the U.S. Department of Energy, Division of Basic Energy Sciences, Office of Science, under contract DE-AC05-00OR22725 with Oak Ridge National Laboratory.
     • This research was performed in part using resources of the National Energy Research Scientific Computing Center, which is supported by the Office of Energy Research of the U.S. Department of Energy under contract DE-AC03-76SF0098, and of the Center for Computational Sciences at Oak Ridge National Laboratory under contract DE-AC05-00OR22725.

  3. Outline
     • Multiresolution basics
     • Parallel decomposition and tools
     • Underlying representation
     • Application characteristics
     • Current storage strategy

  4. Molecular Science Software Project (EMSL / PNNL)
     • PNNL: Yuri Alexeev, Eric Bylaska, Bert deJong, Mahin Hackler, Karol Kowalski, Lisa Pollack, Tjerk Straatsma, Marat Valiev
     • ORNL: Edo Apra, Robert Harrison, Vincent Meunier
     • Ames: Ricky Kendall, TL Windus
     • Gary Black, Brett Didier, Todd Elsenthagen, Sue Havre, Carina Lansing, Bruce Palmer, Karen Schuchardt, Lisong Sun, Erich Vorpagel
     • Manoj Krishnan, Jarek Nieplocha, Bruce Palmer, Vinod Tipparaju
     • http://www.emsl.pnl.gov/docs/nwchem/nwchem.html

  5. Computational Chemistry Endstation
     International collaboration spanning 8 universities and 5 national labs
     • Led out of UT/ORNL
     • Focus
        • Actinides, Aerosols, Catalysis
        • ORNL Cray XT3, ANL BG/L
     • Capabilities:
        • Chemically accurate thermochemistry
        • Many-body methods required
        • Mixed QM/QM/MM dynamics
        • Accurate free-energy integration
        • Simulation of extended interfaces
        • Families of relativistic methods
     • NWChem: Largest CCSD(T) calculation
        • Pollack, EMSL, 2005
        • 1960-processor Itanium2 cluster
        • 1468 basis functions (aug-cc-pVQZ)
        • Perturbative triples (T)
        • 23 hours on 1400 processors
        • 75% of peak = 6.3 TFlops
     [Figure: Scaling of MADNESS, 64-4096 CPUs on XT3]

  6. Multiresolution chemistry objectives
     • Complete elimination of the basis error
        • One-electron models (e.g., HF, DFT)
        • Pair models (e.g., MP2, CCSD, …)
     • Correct scaling of cost with system size
     • General approach
        • Readily accessible by students and researchers
        • Higher level of composition
        • Direct computation of chemical energy differences
     • New computational approaches
        • Fast algorithms with guaranteed precision

  7. How to “think” multiresolution
     • Consider a ladder of function spaces
        • E.g., increasing quality atomic basis sets, or finer resolution grids, …
     • Telescoping series
        • Instead of using the most accurate representation, use the difference between successive approximations
        • Representation on V0 small/dense; differences sparse
        • Computationally efficient; many possible insights
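
The telescoping idea on this slide can be illustrated with a minimal sketch. It assumes the simplest possible ladder, piecewise-constant (Haar-like) averages on a uniform 1-D grid, rather than the basis MADNESS actually uses; the names `decompose`, `reconstruct` and the test function are hypothetical. The finest representation is rewritten as one coarse value (V0) plus the differences between successive levels, and the differences are mostly negligible where the function is smooth.

```python
import math

def decompose(values):
    """Rewrite fine-level cell values as a level-0 average plus per-level differences."""
    levels = []
    s = list(values)
    while len(s) > 1:
        half = len(s) // 2
        coarse = [(s[2 * i] + s[2 * i + 1]) / 2 for i in range(half)]
        diff = [(s[2 * i] - s[2 * i + 1]) / 2 for i in range(half)]
        levels.append(diff)
        s = coarse
    levels.reverse()  # coarsest differences first
    return s[0], levels

def reconstruct(s0, levels):
    """Sum the telescoping series: V0 plus the successive differences."""
    s = [s0]
    for diff in levels:
        fine = []
        for c, d in zip(s, diff):
            fine.extend((c + d, c - d))
        s = fine
    return s

# A localized bump sampled at cell midpoints on 64 cells (illustrative choice).
n = 64
values = [math.exp(-50.0 * ((i + 0.5) / n - 0.5) ** 2) for i in range(n)]
s0, levels = decompose(values)
rebuilt = reconstruct(s0, levels)

# Count difference coefficients that exceed a truncation threshold.
eps = 1e-3
total = sum(len(d) for d in levels)
kept = sum(1 for d in levels for x in d if abs(x) > eps)
```

Comparing `kept` with `total` shows how many difference coefficients survive truncation at `eps`; the untruncated reconstruction reproduces the fine-level values up to rounding.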

  8. High-level composition using functions and operators
     • Conventional quantum chemistry uses explicitly indexed sparse arrays of matrix elements
        • Complex, tedious and error prone
     • Python classes for Function and Operator
        • in 1, 2, 3, 6 and general dimensions
        • wide range of operations
              Hpsi = -0.5*Delsq*psi + V*psi
              J = Coulomb.apply(rho)
     • All with guaranteed speed and precision
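
To show how an expression such as `Hpsi = -0.5*Delsq*psi + V*psi` can be made to work through operator overloading, here is a toy sketch. It is not the MADNESS Python interface: the `Function` and `Laplacian` classes below are hypothetical stand-ins that store samples on a uniform periodic 1-D grid and use a second-order finite difference, so they carry none of the adaptive precision guarantees the slide refers to. `Coulomb.apply` is not modelled.

```python
import math

class Function:
    """Toy function: samples on a uniform periodic 1-D grid with spacing h."""
    def __init__(self, values, h):
        self.values = list(values)
        self.h = h

    def __add__(self, other):
        return Function([a + b for a, b in zip(self.values, other.values)], self.h)

    def __mul__(self, other):
        # Pointwise product of two Functions, e.g. V*psi.
        return Function([a * b for a, b in zip(self.values, other.values)], self.h)

    def __rmul__(self, scalar):
        return Function([scalar * a for a in self.values], self.h)

class Laplacian:
    """Toy operator: periodic second-order finite-difference second derivative."""
    def __init__(self, scale=1.0):
        self.scale = scale

    def __rmul__(self, scalar):
        # scalar * operator gives a rescaled operator, so -0.5*Delsq is an operator.
        return Laplacian(scalar * self.scale)

    def __mul__(self, f):
        # operator * Function applies the operator.
        v, h, n = f.values, f.h, len(f.values)
        out = [self.scale * (v[(i - 1) % n] - 2.0 * v[i] + v[(i + 1) % n]) / h ** 2
               for i in range(n)]
        return Function(out, h)

Delsq = Laplacian()

n = 256
h = 1.0 / n
psi = Function([math.sin(2.0 * math.pi * i * h) for i in range(n)], h)
V = Function([1.0] * n, h)

Hpsi = -0.5 * Delsq * psi + V * psi
```

For this choice of `psi` and a constant potential, `Hpsi` is close to `(2*pi**2 + 1) * psi`, which gives a quick sanity check on the composition.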

  9. New MADNESS solver
     • Total rewrite in C++
     • Three levels of parallelism targeting massively parallel computers built from multi-processor nodes
        • In anticipation of highly threaded processors
        • Ideally targets low-latency AM+MPI+threads
        • Portable implementation uses polling+MPI+threads
     • Core math functionality is now running
        • 3D functions, real and complex (1-6D functions will be added this FY)
     • Scaling demonstrated up to 4096 processors; designed for 100K+

  10. 1-D Example: Sub-Tree Parallelism
      [Figure: 1-D binary tree with levels labelled 0 to 6]
      • Both sub-trees can be done in parallel
      • In 3-D, nodes split into 8 children; in 6-D there are 64 children
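
The child counts quoted on this slide follow from halving a box along every dimension, which gives 2^d children in d dimensions. A small sketch, assuming tree nodes are keyed by a hypothetical `(level, index)` pair where `index` is a d-tuple of integer translations (the slides do not say how MADNESS labels its nodes):

```python
from itertools import product

def children(level, index):
    """Return the 2**d children of node (level, index), one per corner of the box."""
    return [(level + 1, tuple(2 * l + b for l, b in zip(index, bits)))
            for bits in product((0, 1), repeat=len(index))]

one_d = children(1, (1,))              # two children on the 1-D line
three_d = children(0, (0, 0, 0))       # 3-D root
six_d = children(0, (0,) * 6)          # 6-D root
```

Each child roots an independent sub-tree, which is what makes sub-tree parallelism natural for this data structure.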

  11. Distributed-memory Cilk-like model
      Parameter: MPI rank, probe(), set(), get()
      Task: input parameters, output parameters, probe(), run()
          Compress(tree, result):
              Parameter left, right
              if (tree.left)  Compress(tree.left, left)
              if (tree.right) Compress(tree.right, right)
              AddTask(Op, left, right, result)
          WaitTasks()
      Benefits:
      • Most receives pre-posted, greatly increasing scalability
      • Communication latency and transfer time largely hidden
      • Much simpler composition than explicit message passing
      • Positions code to use “intelligent” runtimes with work stealing
      • Positions code for efficient use of multi-core chips
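
The Parameter/Task pseudocode above can be made concrete with a single-process sketch. It keeps the slide's method names (`probe`, `set`, `get`, `run`) but everything else is an assumption for illustration: there is no MPI, the scheduler is a simple polling loop, `Node` is a hypothetical binary tree type, leaves set their result directly, and `Op` is taken to be addition.

```python
class Parameter:
    """Single-assignment value that a task can poll for readiness."""
    def __init__(self):
        self._ready = False
        self._value = None

    def probe(self):
        return self._ready

    def set(self, value):
        self._value = value
        self._ready = True

    def get(self):
        if not self._ready:
            raise RuntimeError("parameter not yet set")
        return self._value

class Task:
    """Runs op on its inputs once all of them are available."""
    def __init__(self, op, inputs, output):
        self.op, self.inputs, self.output = op, inputs, output

    def probe(self):
        return all(p.probe() for p in self.inputs)

    def run(self):
        self.output.set(self.op(*[p.get() for p in self.inputs]))

class Node:
    def __init__(self, value=0.0, left=None, right=None):
        self.value, self.left, self.right = value, left, right

pending = []

def add_task(op, left, right, result):
    pending.append(Task(op, [left, right], result))

def compress(tree, result):
    if tree.left is None and tree.right is None:
        result.set(tree.value)          # assumption: a leaf supplies its own value
        return
    left, right = Parameter(), Parameter()
    for child, param in ((tree.left, left), (tree.right, right)):
        if child is not None:
            compress(child, param)
        else:
            param.set(0.0)              # assumption: a missing child contributes zero
    add_task(lambda a, b: a + b, left, right, result)

def wait_tasks():
    """Poll until every queued task has run, executing whichever are ready."""
    while pending:
        ready = [t for t in pending if t.probe()]
        if not ready:
            raise RuntimeError("no runnable task: unmet dependency")
        for task in ready:
            task.run()
            pending.remove(task)

tree = Node(left=Node(left=Node(1.0), right=Node(2.0)),
            right=Node(left=Node(3.0), right=Node(4.0)))
result = Parameter()
compress(tree, result)   # only builds the task graph
wait_tasks()             # tasks fire as their inputs become available
```

The recursion only records dependencies; nothing is combined until `wait_tasks()` runs, which is the property that lets a real runtime overlap communication with computation.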

  12. Essential techniques for fast computation
      • Multiresolution
      • Low-separation rank
      • Low-operator rank

  13. Separated representations
      • Key to computing in higher dimensions
      • Analogs of SVD exploit low operator rank
      • Generalized form exploits other operator properties
         • E.g., these all have full operator rank, but low-separation-rank constructions exist:
            • Identity operator
            • Green’s functions of many PDEs (Poisson, Helmholtz)
            • All-electron Schrödinger Hamiltonian
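
One well-known way to obtain a low-separation-rank form of the Poisson Green's function is sketched below; the slides do not say which construction MADNESS uses, so treat this as an illustration only. It relies on the identity 1/r = (2/√π) ∫ exp(-r² e^{2s} + s) ds over all real s, discretized with the trapezoid rule. Each term is a Gaussian in r, and a Gaussian in r factors exactly into a product of 1-D Gaussians in x, y and z. The function names, the interval [-20, 5] and the step 0.25 are choices made for this example and suit roughly 0.1 ≤ r ≤ 1.

```python
import math

def inverse_r_gaussian_sum(s_lo=-20.0, s_hi=5.0, h=0.25):
    """Return (coeffs, exps) such that 1/r ~= sum_k coeffs[k] * exp(-exps[k] * r**2)."""
    n = int(round((s_hi - s_lo) / h)) + 1
    coeffs, exps = [], []
    for k in range(n):
        s = s_lo + k * h
        coeffs.append(2.0 / math.sqrt(math.pi) * h * math.exp(s))
        exps.append(math.exp(2.0 * s))
    return coeffs, exps

def eval_separated_3d(coeffs, exps, x, y, z):
    """Evaluate the sum with every term written as a product of three 1-D factors."""
    return sum(c * math.exp(-t * x * x) * math.exp(-t * y * y) * math.exp(-t * z * z)
               for c, t in zip(coeffs, exps))

coeffs, exps = inverse_r_gaussian_sum()
x, y, z = 0.3, 0.4, 0.5
r = math.sqrt(x * x + y * y + z * z)
approx = eval_separated_3d(coeffs, exps, x, y, z)
rel_err = abs(approx * r - 1.0)
```

The number of terms is the separation rank of the approximation; tightening the accuracy or widening the range of r means extending the interval or shrinking the step, which adds terms only slowly.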

  14. [Figure: blocks of an operator in the (x, y) plane, labelled by x-y, y-x and |x-y|; r = separation rank]
      • In 3D, ideally must be one box removed from the diagonal
      • Diagonal box has full rank
      • Boxes touching the diagonal (face, edge, or corner) have increasingly low rank
      • Away from the diagonal, r = O(-log ε)

  15. Molecular electronic Schrödinger equation
      • A 3N-dimensional, non-separable, second-order differential equation
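
The equation shown on this slide did not survive in the transcript. For reference, the standard textbook form for N electrons moving in the field of fixed nuclei with charges Z_A at positions R_A, in atomic units and with spin omitted, is given below; this is the usual form, not necessarily the exact notation of the slide.

```latex
\hat{H}\,\Psi(\mathbf{r}_1,\dots,\mathbf{r}_N) = E\,\Psi(\mathbf{r}_1,\dots,\mathbf{r}_N),
\qquad
\hat{H} = -\frac{1}{2}\sum_{i=1}^{N}\nabla_i^{2}
          \;-\; \sum_{i=1}^{N}\sum_{A}\frac{Z_A}{|\mathbf{r}_i-\mathbf{R}_A|}
          \;+\; \sum_{i<j}\frac{1}{|\mathbf{r}_i-\mathbf{r}_j|}
```

The electron-electron term couples every pair of coordinates, which is why the equation does not separate into N independent 3-D problems.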

  16. Dynamics of fundamental few-electron systems (Krstic and Harrison)
      • Electron + atom/molecule scattering
      • Molecules in intense radiation field
      • Challenges
         • Scattering: highly oscillatory states
         • Dissociation: continuum states
         • Quantum treatment of light nuclei
         • Rydberg states: very large volumes
      • In principle, adaptive multiresolution techniques are ideal
         • Single basis treats bound and continuum states on equal footing
         • Long time steps possible via integral operator for time evolution
         • Separated representations provide path to higher dimensions
      • Waiting for the new production code before we can efficiently apply the free-particle propagator for an implicit scheme (integral kernel is exp(-ix²/2t))
      • Need a more strongly band-limited basis?
      • Want to do this in at least 5-9D; 12D being considered

  17. “Independent” particle models
      • Atomic and molecular orbitals
      • Each electron feels the mean field of all other electrons (self-consistent field, Hartree-Fock)
      • Replaces the linear 3N-D Schrödinger equation with a non-linear 3-D eigenproblem
      • Provides the structure of the periodic table and the chemical bond
      • Linear combination of atomic orbitals (LCAO)
      • E.g., molecular orbitals for water, H2O
      [Figure: water molecular orbitals labelled -0.53, -1.31, -0.67, -20.44, -0.48]

  18. Density functional theory (DFT)
      • Hohenberg-Kohn theorem
         • The energy is a functional of the density (3D)
      • Kohn-Sham
         • Practical approach to DFT, parameterizing the density with orbitals (easier treatment of kinetic energy)
         • Very similar computationally to Hartree-Fock, but potentially exact

  19. Reduced scaling method
      • Eigenfunctions (canonical orbitals) can be delocalized
         • Limits to O(VN) data and O(VN²) compute
      • Solve instead for localized orbitals that span the same space
         • Limits to O(N ln V) data and compute
      • Multiresolution representation makes this easy
      • Remaining linear algebra has small pre-factor and is sparse

  20. Current I/O Strategy
      • Looked seriously at HDF and Phil’s API
         • Substantial effort for adoption; HDF performance questions
         • Substantial benefits from interoperability
      • Short-term driver is checkpoint/restart
      • Tunable subset of nodes doing I/O
         • Currently nodes at a level in the tree (in 3D: 1, 8, 64, …)
         • Collect data from other nodes
         • Serialize to disk in either binary or text (XML)
      • Already want interfaces to visualization tools
      • Starting to consider interfaces to external solvers
         • Sundance, PETSc, …
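
The I/O scheme described on this slide (pick the tree nodes at one level as writers, funnel data to them, serialize as binary or XML) can be sketched as follows. This is a guess at the shape of such a scheme, not the MADNESS implementation: the `(level, index)` node keys, the rule for nodes coarser than the I/O level, the record layout and the XML element names are all assumptions made for the example.

```python
import struct
import xml.etree.ElementTree as ET

def io_node_count(io_level, dim=3):
    """Number of tree nodes at io_level: 1, 8, 64, ... in 3D."""
    return (2 ** dim) ** io_level

def io_owner(level, index, io_level):
    """Map a tree node to the node at io_level that writes its data."""
    if level < io_level:
        # Assumption: data coarser than the I/O level goes to the first I/O node.
        return (io_level, tuple(0 for _ in index))
    shift = level - io_level
    return (io_level, tuple(i >> shift for i in index))

def to_binary(coeffs):
    """Little-endian record: uint32 count followed by that many float64 values."""
    return struct.pack(f"<I{len(coeffs)}d", len(coeffs), *coeffs)

def from_binary(blob):
    (n,) = struct.unpack_from("<I", blob, 0)
    return list(struct.unpack_from(f"<{n}d", blob, 4))

def to_xml(level, index, coeffs):
    """Text alternative: one <node> element carrying its key and coefficients."""
    node = ET.Element("node", level=str(level), index=" ".join(map(str, index)))
    node.text = " ".join(repr(c) for c in coeffs)
    return ET.tostring(node, encoding="unicode")

owner = io_owner(3, (5, 2, 7), io_level=1)
blob = to_binary([1.0, -2.5, 3.25])
text = to_xml(3, (5, 2, 7), [1.0, -2.5, 3.25])
```

Raising `io_level` multiplies the number of writers by 8 per level in 3D, which is the tunable knob the slide mentions.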

  21. Summary of MADNESS data
      • Discontinuous spectral element
         • Legendre polynomials, or
         • Approximate prolate spheroidal functions
      • Structured, deeply refined, adaptive mesh
      • In higher dimensions
         • Separated representations in most elements
      • Mix of data types
         • Float, double, float-complex, double-complex
      • 100s to 10,000s of distinct functions in 3D
         • 10s of GB to 10s of TB of data
      • Few functions in 6+D
         • 100s of GB to 10s of TB
