
Cactus 4.0

Presentation Transcript


  1. Cactus 4.0

  2. Cactus Computational Toolkit and Distributed Computing
     • Solving Einstein’s Equations
     • Impact on computation
     • Large collaborations essential and difficult!
     • Code becomes the collaborating tool.
     • Cactus, a new community code for 3D GR-Astrophysics
       • Toolkit for many PDE systems
       • Suite of solvers for Einstein system
     • Metacomputing for the general user
     • Distributed computing experiments with Cactus and Globus
     Gabrielle Allen, Ed Seidel
     Albert-Einstein-Institut, MPI-Gravitationsphysik

  3. Einstein’s Equations and Gravitational Waves
     • Einstein’s General Relativity
       • Fundamental theory of Physics (Gravity)
       • Black holes, neutron stars, gravitational waves, ...
       • Among most complex equations of physics: dozens of coupled, nonlinear hyperbolic-elliptic equations with 1000’s of terms
     • New field: Gravitational Wave Astronomy
       • Will yield new information about the Universe
       • What are gravitational waves? “Ripples in the curvature of spacetime”
       • A last major test of Einstein’s theory: do they exist?
       • Eddington: “Gravitational waves propagate at the speed of thought”
       • 1993 Nobel Prize Committee: Hulse-Taylor Pulsar (indirect evidence)
     (Figure: colliding BH’s and NS’s; strain h = Δs/s ~ 10⁻²²)

  4. Detecting Gravitational Waves
     • LIGO, VIRGO (Pisa), GEO600, … $1 Billion Worldwide
     • We need results from numerical relativity to:
       • Detect them… pattern matching against numerical templates to enhance the signal/noise ratio
       • Understand them… just what are the waves telling us?
     (Figure: Hanford, Washington site; 4 km arms)

  5. Merger Waveform Must Be Found Numerically
     • Teraflop computation, AMR, elliptic-hyperbolic, ???

  6. Axisymmetric Black Hole Simulations: Cray C90
     • Collision of two Black Holes (“Misner Data”)
     • Evolution of Highly Distorted Black Hole

  7. Computational Needs for 3D Numerical Relativity
     • Finite difference codes: ~10⁴ Flops/zone/time step, ~100 3D arrays
     • Currently use 250³ zones: ~15 GBytes, ~15 TFlops/time step
     • Need 1000³ zones: ~1000 GBytes, ~1000 TFlops/time step
     • Need TFlop, TByte machine
     • Need parallel AMR, I/O
     • Initial data: 4 coupled nonlinear elliptics
     • Time step update:
       • explicit hyperbolic update
       • also solve elliptics
     (Figure: evolution from t=0 to t=100)
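     As a rough check on the memory figures above (assuming 8-byte double-precision values):
        100 arrays × 250³ zones × 8 bytes ≈ 1.25 × 10¹⁰ bytes ≈ 12.5 GBytes
        100 arrays × 1000³ zones × 8 bytes ≈ 8 × 10¹¹ bytes ≈ 800 GBytes
     which, allowing for work arrays and overhead, is consistent with the ~15 GBytes and ~1000 GBytes quoted in the slide.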

  8. Mix of Varied Technologies and Expertise!
     • Scientific/Engineering:
       • formulation of equations, equation of state, astrophysics, hydrodynamics ...
     • Numerical Algorithms:
       • Finite differences? Finite elements? Structured meshes?
       • Hyperbolic equations: explicit vs implicit, shock treatments, dozens of methods (and presently nothing is fully satisfactory!)
       • Elliptic equations: multigrid, Krylov subspace, spectral, preconditioners (elliptics currently require most of the time…)
       • Mesh Refinement?
     • Computer Science:
       • Parallelism (HPF, MPI, PVM, ???)
       • Architecture Efficiency (MPP, DSM, Vector, NOW, ???)
       • I/O Bottlenecks (generate gigabytes per simulation, checkpointing…)
       • Visualization of all that comes out!

  9. Clearly need huge teams, with huge expertise base, to attack such problems…
     • … in fact need collections of communities
     • But how can they work together effectively?
     • Need a code environment that encourages this…

  10. NSF Black Hole Grand Challenge Alliance
      • University of Texas (Matzner, Browne)
      • NCSA/Illinois/AEI (Seidel, Saylor, Smarr, Shapiro, Saied)
      • North Carolina (Evans, York)
      • Syracuse (G. Fox)
      • Cornell (Teukolsky)
      • Pittsburgh (Winicour)
      • Penn State (Laguna, Finn)
      Develop Code To Solve Gμν = 0

  11. NASA Neutron Star Grand Challenge
      “A Multipurpose Scalable Code for Relativistic Astrophysics”
      • NCSA/Illinois/AEI (Saylor, Seidel, Swesty, Norman)
      • Argonne (Foster)
      • Washington U (Suen)
      • Livermore (Ashby)
      • Stony Brook (Lattimer)
      Develop Code To Solve Gμν = 8πTμν

  12. What we learn from Grand Challenges
      • Successful, but also problematic…
      • No existing infrastructure to support collaborative HPC
      • Many scientists are Fortran programmers, and NOT computer scientists
      • Many sociological issues of large collaborations and different cultures
      • Many language barriers… applied mathematicians, computational scientists, physicists have very different concepts and vocabularies
      • Code fragments, styles, routines often clash
      • Successfully merged code (after years) often impossible to transplant into more modern infrastructure (e.g., add AMR or switch to MPI…)
      • Many serious problems… this is what the Cactus Code seeks to address

  13. What Is Cactus?
      • Cactus was developed as a general computational framework for solving PDEs (originally in numerical relativity and astrophysics)
      • Modular… for easy development, maintenance and collaboration. Users supply “thorns” which plug into a compact core “flesh”
      • Configurable… thorns register parameter, variable and scheduling information with a “runtime function registry” (RFR). Object-oriented-inspired features
      • Scientist friendly… thorns written in F77, F90, C, C++
      • Accessible parallelism… the driver layer (a thorn) is hidden from physics thorns by a fixed flesh interface
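      To make the thorn/flesh split concrete, here is a minimal sketch of a C evolution thorn written against the flesh interface. It assumes the C bindings of the released Cactus 4.x flesh (cctk.h, the CCTK_ARGUMENTS macros, CCTK_GFINDEX3D); the routine name WaveToyC_Evolution and the grid function phi are illustrative, and details may differ slightly from the beta described here. The routine would be registered with the scheduler through the thorn’s schedule.ccl, and its variables and parameters through interface.ccl and param.ccl.

         #include "cctk.h"
         #include "cctk_Arguments.h"
         #include "cctk_Parameters.h"

         void WaveToyC_Evolution(CCTK_ARGUMENTS)
         {
           DECLARE_CCTK_ARGUMENTS;
           DECLARE_CCTK_PARAMETERS;
           int i, j, k;

           /* cctk_lsh[] is the processor-local grid size handed in by whatever
              parallel driver thorn is active (e.g. PUGH); the physics code
              never calls MPI directly */
           for (k = 1; k < cctk_lsh[2] - 1; k++)
             for (j = 1; j < cctk_lsh[1] - 1; j++)
               for (i = 1; i < cctk_lsh[0] - 1; i++)
               {
                 int idx = CCTK_GFINDEX3D(cctkGH, i, j, k);
                 /* finite-difference update of the grid function phi at idx
                    goes here; phi itself would be declared in interface.ccl */
                 (void) idx;
               }
         }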

  14. What Is Cactus?
      • Standard interfaces… interpolation, reduction, IO, coordinates. Actual routines supplied by thorns
      • Portable… Cray T3E, Origin, NT/Win9*, Linux, O2, Dec Alpha, Exemplar, SP2
      • Free and open community code… distributed under the GNU GPL. Uses as much free software as possible
      • Up-to-date… new computational developments and/or thorns immediately available to users (optimisations, AMR, Globus, IO)
      • Collaborative… the thorn structure makes it possible for a large number of people to use and develop toolkits… the code becomes the collaborating tool
      • New version… Cactus beta-4.0 released 30th August

  15. Core Thorn Arrangements Provide Tools
      • Parallel drivers (presently MPI-based)
      • (Mesh refinement schemes: Nested Boxes, DAGH, HLL)
      • Parallel I/O for output, file reading, checkpointing (HDF5, FlexIO, Panda, etc.)
      • Elliptic solvers (PETSc, multigrid, SOR, etc.)
      • Interpolators
      • Visualization tools (IsoSurfacer)
      • Coordinates and boundary conditions
      • Many relativity thorns
      • Groups develop their own thorn arrangements to add to these

  16. Cactus 4.0
      (Architecture diagram: thorns Boundary, CartGrid3D, WaveToyF77, WaveToyF90, PUGH, GrACE, IOFlexIO and IOHDF5 plugging into the FLESH (Parameters, Variables, Scheduling))
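      A hedged sketch of how such an arrangement is activated at run time, assuming the standard Cactus parameter-file syntax (an ActiveThorns list plus thorn::parameter assignments). The file name and the individual parameter names are illustrative; each thorn documents its own:

         # wavetoy.par (hypothetical example)
         ActiveThorns = "CartGrid3D Boundary PUGH WaveToyF77 IOHDF5"

         # grid size, handled by the driver thorn
         driver::global_nx = 50
         driver::global_ny = 50
         driver::global_nz = 50

         # number of evolution steps (flesh parameter)
         cactus::cctk_itlast = 100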

  17. Current Status
      • It works: many people, with different backgrounds, different personalities, on different continents, working together effectively on problems of common interest
      • Dozens of physics/astrophysics and computational modules developed and shared by “seed” community
      • Connected modules work together, largely without collisions
      • Test suites used to ensure integrity of both code and physics
      • How to get it…
        • Workshop 27 Sept - 1 Oct, NCSA: http://www.ncsa.uiuc.edu/SCD/Training/
      (Movie from Werner Benger, ZIB)

  18. Near Perfect Scaling
      • Excellent scaling on many architectures:
        • Origin up to 128 processors
        • T3E up to 1024 processors
        • NCSA NT cluster up to 128 processors
      • Achieved 142 GFlop/s on a 1024-node T3E-1200 (benchmarked for the NASA NS Grand Challenge)
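      For scale (assuming the usual figure of 1.2 GFlop/s peak per T3E-1200 processor): 1024 PEs × 1.2 GFlop/s ≈ 1.2 TFlop/s peak, so 142 GFlop/s corresponds to roughly 12% of peak, in line with what memory-bound finite-difference codes typically achieve.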

  19. Many Developers: Physics & Computational Science
      (Diagram of contributing groups and technologies: AEI, NCSA, ZIB, Wash. U, NASA, SGI, Valencia; DAGH/AMR (UTexas), FlexIO, HDF5, PETSc (Argonne), Globus (Foster), Panda I/O (UIUC CS))

  20. Metacomputing: harnessing power when and where it is needed
      • Easy access to available resources
        • Find resources for interactive use: Garching? ZIB? NCSA? SDSC?
        • Do I have an account there? What’s the password?
        • How do I get the executable there?
        • Where to store data?
        • How to launch the simulation? What are the local queue structure/OS idiosyncrasies?

  21. Metacomputing: harnessing power when and where it is needed
      • Access to more resources
        • Einstein equations require extreme memory, speed
        • Largest supercomputers too small!
        • Networks very fast!
        • DFN gigabit testbed: 622 Mbit/s Potsdam-Berlin-Garching, connect multiple supercomputers
        • Gigabit networking to US possible
        • Connect workstations to make a supercomputer

  22. Metacomputing: harnessing power when and where it is needed
      • Acquire resources dynamically during simulation!
        • Need more resolution in one area
      • Interactive visualization, monitoring and steering from anywhere
        • Watch simulation as it progresses… live visualisation
        • Limited bandwidth: compute visualization online with the simulation
        • High bandwidth: ship data to be visualised locally
      • Interactive steering
        • Are parameters screwed up? Very complex?
        • Is memory running low? AMR! What to do? Refine selectively or acquire additional resources via Globus? Delete unnecessary grids?

  23. Metacomputing: harnessing power when and where it is needed
      • Call up an expert colleague… let her watch it too
        • Sharing data space
        • Remote collaboration tools
        • Visualization server: all privileged users can login and check status/adjust if necessary

  24. Globus: Can provide many such services for Cactus
      • Information (Metacomputing Directory Service: MDS)
        • Uniform access to structure/state information: Where can I run Cactus today?
      • Scheduling (Globus Resource Access Manager: GRAM)
        • Low-level scheduler API: How do I schedule Cactus to run at NCSA?
      • Communications (Nexus)
        • Multimethod communication + QoS management: How do I connect Garching and ZIB together for a big run?
      • Security (Globus Security Infrastructure)
        • Single sign-on, key management: How do I get authority at SDSC for Cactus?

  25. Globus: Can provide many such services for Cactus
      • Health and status (Heartbeat Monitor): Is my Cactus run dead?
      • Remote file access (Global Access to Secondary Storage: GASS): How do I manage my output, and get the executable to Argonne?
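      To make the GRAM piece concrete: a job request is written in the Globus Resource Specification Language (RSL) and handed to the GRAM client. The sketch below is only illustrative; the executable path and values are placeholders, and attribute details varied between Globus releases of this period.

         &(executable=/u/someuser/Cactus/exe/cactus_wavetoy)
          (arguments=wavetoy.par)
          (count=128)
          (jobtype=mpi)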

  26. Colliding Black Holes and MetaComputing: German Project supported by DFN-Verein
      • Solving Einstein’s Equations
      • Developing Techniques to Exploit High Speed Networks
      • Remote Visualization
      • Distributed Computing Across OC-12 Networks between AEI (Potsdam), Konrad-Zuse-Institut (Berlin), and RZG (Garching-bei-München)

  27. Distributing Spacetime: SC’97 Intercontinental Metacomputing at AEI/Argonne/Garching/NCSA
      (Figures: Immersadesk; 512-node T3E)

  28. Metacomputing the Einstein Equations: Connecting T3E’s in Berlin, Garching, San Diego

  29. Collaborators
      • A distributed astrophysical simulation involving the following institutions:
        • Albert Einstein Institute (Potsdam, Germany)
        • Washington University (St. Louis, MO)
        • Argonne National Laboratory (Chicago, IL)
        • NLANR Distributed Applications Team (Champaign, IL)
      • The following supercomputer centers:
        • San Diego Supercomputer Center (268-proc. T3E)
        • Konrad-Zuse-Zentrum in Berlin (232-proc. T3E)
        • Max-Planck-Institute in Garching (768-proc. T3E)

  30. The Grand Plan
      • Distribute simulation across 128 PE’s of the SDSC T3E and 128 PE’s of the Konrad-Zuse-Zentrum T3E in Berlin, using Globus
      • Visualize isosurface data in real time on an Immersadesk in Orlando
      • Transatlantic bandwidth from an OC-3 ATM network
      (Figure: San Diego to Berlin)
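      A hedged sketch of how such a co-allocated run might be requested: Globus RSL supports multirequests (the leading ‘+’), one sub-request per resource manager. The contact strings, paths and parameter values below are placeholders, not the actual SC98 configuration.

         +( &(resourceManagerContact="<SDSC T3E contact>")
             (count=128)(jobtype=mpi)
             (executable=/path/on/sdsc/cactus_exe)(arguments=run.par) )
          ( &(resourceManagerContact="<ZIB T3E contact>")
             (count=128)(jobtype=mpi)
             (executable=/path/on/zib/cactus_exe)(arguments=run.par) )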

  31. SC98 Neutron Star Collision
      (Movie from Werner Benger, ZIB)

  32. Cactus scaling across PE’s (Jason Novotny, NLANR)

  33. Analysis of metacomputing experiments
      • It works! (That’s the main thing we wanted at SC98…)
      • Cactus not optimized for metacomputing: messages too small, lower MPI bandwidth, could be better:
        • ANL-NCSA: measured bandwidth 17 Kbits/sec (small messages) to 25 Mbits/sec (large messages); latency 4 ms
        • Munich-Berlin: measured bandwidth 1.5 Kbits/sec (small messages) to 4.2 Mbits/sec (large messages); latency 42.5 ms
        • Within a single machine: order of magnitude better
      • Bottom line:
        • Expect to be able to improve performance significantly
        • Can run much larger jobs on multiple machines
        • Start using Globus routinely for job submission
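      The small-message numbers are what a simple latency/bandwidth model predicts: the time to transfer an S-bit message is roughly t(S) ≈ L + S/B, with L the latency and B the link bandwidth, so tiny messages are latency-dominated. As an illustration (assuming the measurement used messages of order a single 8-byte value), 64 bits over the 42.5 ms Munich-Berlin latency gives 64 / 0.0425 ≈ 1.5 Kbits/sec, matching the figure above, while large messages approach the link’s true bandwidth.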

  34. The Dream: not far away...
      (Diagram: a budding Einstein in Berlin submits a job; the Globus Resource Manager finds resources “whatever, wherever” (Garching T3E, NCSA Origin 2000 array, Ultra 3000, mass storage); the Cactus/Einstein solver and physics modules (e.g. BH initial data) run on top of MPI, MG, AMR, DAGH, Viz, I/O, ...)

  35. Cactus 4.0 Credits
      • Vision and Motivation: Bernard Schutz, Ed Seidel “the Evangelist”, Wai-Mo Suen
      • Cactus flesh and design: Gabrielle Allen, Tom Goodale, Joan Massó, Paul Walker
      • Computational toolkit: the flesh authors, Gerd Lanferman, Thomas Radke, John Shalf
      • Development toolkit: Bernd Bruegmann, Manish Parashar, many others
      • Relativity and astrophysics: the flesh authors, Miguel Alcubierre, Toni Arbona, Carles Bona, Steve Brandt, Bernd Bruegmann, Thomas Dramlitsch, Ed Evans, Carsten Gundlach, Gerd Lanferman, Lars Nerger, Mark Miller, Hisaaki Shinkai, Ryoji Takahashi, Malcolm Tobias
