First-Principles Molecular Dynamics for Petascale Computers

François Gygi, Dept of Applied Science, UC Davis (fgygi@ucdavis.edu, http://eslab.ucdavis.edu); Zhaojun Bai, Dept of Computer Science, UC Davis; Giulia Galli, Dept of Chemistry, UC Davis; Kwan-Liu Ma, Dept of Computer Science, UC Davis.



  1. First-Principles Molecular Dynamics for Petascale Computers • François Gygi, Dept of Applied Science, UC Davis, fgygi@ucdavis.edu, http://eslab.ucdavis.edu • Zhaojun Bai, Dept of Computer Science, UC Davis • Giulia Galli, Dept of Chemistry, UC Davis • Kwan-Liu Ma, Dept of Computer Science, UC Davis • Supported by NSF-ITR-HECURA 0749217

  2. The Qbox project • Qbox is a C++/MPI implementation of First-Principles Molecular Dynamics (FPMD) • Qbox includes a quantum mechanical description of electronic structure within Density Functional Theory • Applications to Materials Science, Chemistry, Nanoscience • Software development focuses on large-scale parallelism

  3. Qbox code architecture • [Diagram] Components shown: Qbox; ScaLAPACK/PBLAS; XercesC (XML parser); BLAS/ATLAS; BLACS; FFTW lib; DGEMM lib; MPI • http://eslab.ucdavis.edu/software/qbox

  4. Qbox performance results • Electronic structure of a 1000-atom Molybdenum sample • 12,000 electrons • LLNL BlueGene/L • 1 k-point: 108.8 TFlop/s (30% of peak) • 4 k-points: 187.7 TFlop/s (51% of peak) • 8 k-points: 207.3 TFlop/s (56% of peak) • 2006 ACM/IEEE Gordon Bell Award for peak performance
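The percentages on slide 4 are the sustained rate divided by the machine's theoretical peak. The short sketch below reproduces them under one assumption that is not stated in the slides: a peak of 367 TFlop/s for the full LLNL BlueGene/L. The function and constant names are illustrative only.

```python
# Sustained rates from slide 4, in TFlop/s, keyed by number of k-points.
SUSTAINED_TFLOPS = {1: 108.8, 4: 187.7, 8: 207.3}

# Assumption (not given in the slides): theoretical peak of the machine, in TFlop/s.
ASSUMED_PEAK_TFLOPS = 367.0


def fraction_of_peak(sustained, peak=ASSUMED_PEAK_TFLOPS):
    """Return sustained performance as a percentage of the peak rate."""
    return 100.0 * sustained / peak


if __name__ == "__main__":
    for kpoints, rate in SUSTAINED_TFLOPS.items():
        pct = fraction_of_peak(rate)
        print(f"{kpoints} k-point(s): {rate} TFlop/s is about {pct:.0f}% of peak")
```

With the assumed peak, the three rates round to the 30%, 51%, and 56% quoted on the slide.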

  5. Current Qbox availability • TeraGrid platforms: Mercury, NCSA; Cobalt, NCSA; Tungsten, NCSA; BlueGene/L, SDSC; IBM p655, SDSC • Other platforms: ANL BG/L; ANL BG/P; NERSC Franklin, Cray XT4; NCSA Abe

  6. New scalable algorithms for electronic structure calculations • One-sided Jacobi simultaneous diagonalization algorithm used in electronic structure calculations • 64-node dual-dual-core AMD Opteron/Infinipath cluster • 1 rack ANL BlueGene/L
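The slide names the algorithm but does not describe it. As background only, the sketch below shows the basic one-sided (Hestenes) Jacobi iteration for a single matrix: plane rotations are applied to pairs of columns until all columns are mutually orthogonal, at which point the column norms are the singular values. This is a simplified, serial, single-matrix illustration in pure Python. It is not the simultaneous-diagonalization variant mentioned on the slide and not the Qbox implementation, and the function name and parameters are hypothetical.

```python
import math


def one_sided_jacobi(a, tol=1e-12, max_sweeps=50):
    """Return the singular values of matrix `a` (list of rows), largest first.

    Columns are orthogonalized pairwise by plane rotations (Hestenes method).
    """
    m, n = len(a), len(a[0])
    u = [row[:] for row in a]  # work on a copy
    for _ in range(max_sweeps):
        rotated = False
        for p in range(n - 1):
            for q in range(p + 1, n):
                # Squared norms of columns p and q, and their inner product.
                alpha = sum(u[i][p] ** 2 for i in range(m))
                beta = sum(u[i][q] ** 2 for i in range(m))
                gamma = sum(u[i][p] * u[i][q] for i in range(m))
                if abs(gamma) <= tol * math.sqrt(alpha * beta):
                    continue  # columns already orthogonal to tolerance
                rotated = True
                # Rotation angle that makes the two columns orthogonal.
                zeta = (beta - alpha) / (2.0 * gamma)
                t = math.copysign(1.0, zeta) / (abs(zeta) + math.sqrt(1.0 + zeta * zeta))
                c = 1.0 / math.sqrt(1.0 + t * t)
                s = c * t
                for i in range(m):
                    up, uq = u[i][p], u[i][q]
                    u[i][p] = c * up - s * uq
                    u[i][q] = s * up + c * uq
        if not rotated:
            break  # a full sweep without rotations: converged
    norms = [math.sqrt(sum(u[i][j] ** 2 for i in range(m))) for j in range(n)]
    return sorted(norms, reverse=True)


if __name__ == "__main__":
    print(one_sided_jacobi([[2.0, 1.0], [1.0, 2.0]]))
```

Because each rotation touches only two columns, rotations on disjoint column pairs are independent of one another, which is the property that makes one-sided Jacobi methods attractive on parallel machines.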

  7. Qbox scalability for nanoscience applications • Electronic structure of a 2260-atom silicon nanowire • Cray-XT4, up to 8k CPUs • Superlinear scaling due to cache effects and size-dependent MPI protocols • 86% parallel efficiency between 2k and 8k CPUs
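To make the 86% figure on slide 7 concrete, the sketch below uses the usual definition of relative parallel efficiency between two runs, (T_small × N_small) / (T_large × N_large), and derives the speedup implied by a given efficiency. It assumes that "2k" and "8k" mean 2048 and 8192 CPUs; the slides give no timings, so none are invented here, and the function names are illustrative.

```python
def relative_efficiency(t_small, n_small, t_large, n_large):
    """Parallel efficiency of the larger run relative to the smaller run.

    t_* are wall-clock times and n_* are CPU counts; 1.0 means ideal scaling.
    """
    return (t_small * n_small) / (t_large * n_large)


def implied_speedup(efficiency, n_small, n_large):
    """Speedup of the larger run over the smaller one for a given efficiency."""
    return efficiency * n_large / n_small


if __name__ == "__main__":
    # Assumption: "2k" and "8k" CPUs on the slide mean 2048 and 8192.
    speedup = implied_speedup(0.86, 2048, 8192)
    print(f"86% efficiency from 2048 to 8192 CPUs implies a {speedup:.2f}x speedup")
```

In other words, quadrupling the CPU count at 86% efficiency yields a speedup of about 3.44 instead of the ideal 4.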

  8. Qbox parallel I/O strategy • Advanced functions in MPI-IO are not supported by all file systems (MPI_File_write_shared, etc.) • Qbox uses a strategy based on shared file pointer objects • Achieves >700 MB/s write rate for file sizes of 50–250 GB

  9. Analysis of MPI message traffic patterns in Qbox • Multiple traffic patterns are involved during a Qbox simulation: physics kernels, 3D Fourier transforms, ScaLAPACK linear algebra • Logical-to-physical mapping of tasks has a large impact on performance on large platforms (> 4k CPUs) • We are developing instrumentation and visualization tools to analyze message traffic patterns on various interconnect architectures • [Figure] Mapping of 65536 MPI tasks on the 32x32x64 torus of the LLNL BG/L
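The effect of logical-to-physical mapping can be illustrated with a small sketch. It places MPI ranks on a 32x32x64 torus in plain x-fastest order and counts the network hops between two ranks, taking wraparound links into account. The x-fastest ordering is an assumption chosen for illustration; it is not claimed to be the mapping used by Qbox or by the BG/L system software, and the function names are hypothetical.

```python
DIMS = (32, 32, 64)  # torus dimensions from the slide: 32 * 32 * 64 = 65536 nodes


def rank_to_coords(rank, dims=DIMS):
    """Map a rank to (x, y, z) torus coordinates, x varying fastest (assumed order)."""
    nx, ny, nz = dims
    if not 0 <= rank < nx * ny * nz:
        raise ValueError(f"rank {rank} is outside the torus")
    return (rank % nx, (rank // nx) % ny, rank // (nx * ny))


def torus_hops(a, b, dims=DIMS):
    """Minimum hop count between two coordinates, using wraparound links."""
    return sum(min(abs(p - q), n - abs(p - q)) for p, q, n in zip(a, b, dims))


if __name__ == "__main__":
    first, last = rank_to_coords(0), rank_to_coords(65535)
    print("rank 0 ->", first, "| rank 65535 ->", last)
    print("hops between them:", torus_hops(first, last))
```

Under this ordering, ranks that are far apart numerically can still be physically close because of the wraparound links, while ranks that communicate heavily in a given pattern (for example within one ScaLAPACK process column) may end up many hops apart, which is why the choice of mapping matters.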

  10. Analysis of MPI message traffic patterns in Qbox • Screenshot of the message traffic visualization tool showing MPI calls in a ScaLAPACK matrix multiplication (C. Muelder, K-L Ma, UCDavis)

  11. Qbox current developments • Deployment on TeraGrid track-2 platforms • Applications to Nanoscience simulations (G. Galli, Chemistry, UC Davis) • Specialized linear algebra algorithms (Z. Bai, Computer Science, UC Davis) • Visualization (K-L. Ma, Computer Science, UC Davis) • Application-specific data compression algorithms • Large dataset management (10^10 – 10^12 bytes) • XML standards for electronic structure data (http://www.quantum-simulation.org) • http://eslab.ucdavis.edu • Supported by NSF-ITR-HECURA 0749217