
HPC Software Development at LLNL


Presentation Transcript


  1. HPC Software Development at LLNL Presented to College of St. Rose, Albany • Feb. 11, 2013 • Todd Gamblin • Center for Applied Scientific Computing

  2. LLNL has some of the world’s largest supercomputers • Sequoia • #1 in the world, June 2012 • IBM Blue Gene/Q • 96 racks, 98,304 nodes • 1.5 million cores • 5-D Torus network • Transactional Memory • Runs lightweight, Linux-like OS • Login nodes are Power7, but compute nodes are PowerPC A2 cores. Requires cross-compiling.

  3. LLNL has some of the world’s largest supercomputers • Zin • Intel Sandy Bridge • 2,916 16-core nodes • 46,656 processor cores • InfiniBand fat tree interconnect • Commodity parts • Runs TOSS, LLNL’s Red Hat-based Linux distro

  4. LLNL has some of the world’s largest supercomputers • Others • Almost 30 clusters total • See http://computing.llnl.gov

  5. Supercomputers run very large-scale simulations • Multi-physics simulations • Material Strength • Laser-Plasma Interaction • Quantum Chromodynamics • Fluid Dynamics • Lots of complicated numerical methods for solving equations: • Adaptive Mesh Refinement (AMR) • Adaptive Multigrid • Unstructured Mesh • Structured Mesh [Slide images: NIF Target, Supernova, AMR Fluid Interface]

  6. Structure of the Lab • Code teams • Work on physics applications • Larger code teams are 20+ people • Software developers • Applied mathematicians • Physicists • Work to meet milestones for lab missions

  7. Structure of the Lab • Livermore Computing (LC) • Runs the supercomputing center • Development Environment Group • Works with application teams to improve code performance • Knows about compilers, debuggers, performance tools • Develops performance tools • Software Development Group • Develops

  8. Structure of the Lab • Center for Applied Scientific Computing (CASC) • Most CS researchers are in CASC • Large groups doing: • Performance Analysis Tools • Power optimization • Resilience • Source-to-source Compilers • FPGAs and new architectures • Applied Math and numerical analysis

  9. Performance Tools Research • Write software to measure the performance of other software • Profiling • Tracing • Debugging • Visualization • Tools themselves need to perform well: • Parallel Algorithms • Scalability and low overhead are important

  10. Development Environment • Application codes are written in many languages • Fortran, C, C++, Python • Some applications have been around for 50+ years • Tools are typically written in C/C++ • Tools typically run as part of an application • Need to be able to link with application environment • Non-parallel parts of tools are often in Python. • GUI • front-end scripts • some data analysis

  11. We’ve started using Atlassian tools for collaboration • http://www.atlassian.com • Confluence Wiki • JIRA Bug Tracker • Stash git repo hosting • Several advantages for our distributed environment: • Scale to lots of users • Fine-grained permissions allow us to stay within our security model

  12. Simple example: Measuring MPI • Parallel Applications use the MPI Library for communication • We want to measure time spent in MPI calls • Also interested in other metrics • Semantics, parameters, etc. • We write a lot of interposer libraries [Slide diagram labels: Parallel Application, Single Process]

  13. Simple example: Measuring MPI • Parallel Applications use the MPI Library for communication • We want to measure time spent in MPI calls • Also interested in other metrics • Semantics, parameters, etc. • We write a lot of interposer libraries
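
For context, here is a minimal application-side sketch (not from the slides; the buffer contents and root rank are illustrative only) of the kind of MPI_Bcast call such an interposer would intercept:

    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv) {
        MPI_Init(&argc, &argv);

        int rank;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        /* Rank 0 fills a small buffer; MPI_Bcast copies it to every rank.
           With an interposer linked in, this is the call that gets timed. */
        int data[4] = {0};
        if (rank == 0) {
            for (int i = 0; i < 4; i++) data[i] = i + 1;
        }
        MPI_Bcast(data, 4, MPI_INT, 0, MPI_COMM_WORLD);

        printf("rank %d got data[3] = %d\n", rank, data[3]);

        MPI_Finalize();
        return 0;
    }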

  14. Example Interposer Code

    int MPI_Bcast(void *buffer, int count, MPI_Datatype dtype,
                  int root, MPI_Comm comm) {
        double start = get_time_ns();
        int rc = PMPI_Bcast(buffer, count, dtype, root, comm);
        double duration = get_time_ns() - start;
        record_time(MPI_Bcast, duration);
        return rc;
    }

  • This wrapper intercepts MPI_Bcast calls from the application • It does its own measurement • Then it calls the MPI library through the PMPI_Bcast entry point • Allows us to measure time spent in particular routines
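
The slide assumes two tool-side helpers, get_time_ns() and record_time(), which are not shown. As a rough sketch (my assumption, not the slide's code), the timer could be built on the standard clock_gettime() call:

    #include <time.h>

    /* Return the current time in nanoseconds from a monotonic clock,
       so measured durations are not skewed by wall-clock adjustments. */
    static double get_time_ns(void) {
        struct timespec ts;
        clock_gettime(CLOCK_MONOTONIC, &ts);
        return (double)ts.tv_sec * 1e9 + (double)ts.tv_nsec;
    }

To use an interposer like this, the wrapper is typically compiled into its own library and linked ahead of the MPI library, or preloaded into a dynamically linked executable (for example with LD_PRELOAD), so the application's MPI_Bcast calls resolve to the wrapper while the wrapper calls through to PMPI_Bcast.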

  15. Another type of problem: communication optimization • See other slide set.
