
A Report on the ACTS Toolkit (acts-support@nersc)


Presentation Transcript


  1. A Report on the ACTS Toolkit (acts-support@nersc.gov) Osni Marques and Tony Drummond (LBNL/NERSC) oamarques@lbl.gov, ladrummond@lbl.gov

  2. What is the ACTS Toolkit? (information center: http://acts.nersc.gov) • Advanced Computational Testing and Simulation • Tools for developing parallel applications • developed (primarily) at DOE Labs • ~20 tools • ACTS is an “umbrella” project • leverages numerous independently funded projects • collects the tools in a toolkit

  3. ACTS: Project Goals • Extended support for experimental software • Make ACTS tools available on DOE computers • Provide technical support (acts-support@nersc.gov) • Maintain ACTS information center (http://acts.nersc.gov) • Coordinate efforts with other supercomputing centers • Enable large scale scientific applications • Educate and train

  4. What needs to be computed? [diagram mapping problem classes to tools: linear systems: ScaLAPACK, SuperLU, Aztec/Trilinos; PDEs: PETSc, Hypre; ODEs: PVODE; optimization: TAO]

  5. What codes are being developed?
  • TAU: performance analysis and monitoring
  • Overture: operations with grids for PDE applications
  • SILOON: scripting interface for C++ numerics
  • POOMA: infrastructure for distributed computing
  • PAWS: coupling distributed applications
  • PETE: expression templates for C++
  • CUMULVS: interactive visualization
  • Global Arrays: parallel programs that use large distributed arrays

  6. ACTS: levels of support
  • High: intermediate-level support, plus tool expertise and conducting tutorials
  • Intermediate: basic-level support, plus a higher level of support to users of the tool
  • Basic: basic knowledge of the tools, help with installation, compilation of user reports (acts-support@nersc.gov)

  7. ACTS Tools Installed on NERSC Computers (see also http://acts.nersc.gov/tools)

  8. Aztec http://acts.nersc.gov/aztec • Solves large sparse systems of linear equations on distributed memory machines • Implements Krylov iterative methods (CG, CGS, Bi-CG-Stab, GMRES, TFQMR) • Suite of preconditioners (Jacobi, Gauss-Seidel, overlapping domain decomposition with sparse LU, ILU, BILU within domains) • Highly efficient, scalable (1000 processors on the “ASCI Red” machine) • A minimal call sketch follows below
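
To make the calling sequence concrete, here is a minimal sketch of how an Aztec solve is typically configured in C. The option and parameter names (AZ_solver, AZ_gmres, AZ_precond, AZ_tol, ...) are Aztec's documented constants, but the matrix argument is assumed to have been built elsewhere (e.g., in Aztec's MSR format), and exact calling conventions vary across Aztec releases, so treat this as a sketch rather than a verified program.

```c
/* Sketch of a typical Aztec solve (Aztec 2.x-style C interface).
 * Assumes Amat was built elsewhere in Aztec's MSR/VBR format. */
#include <mpi.h>
#include "az_aztec.h"

void solve_with_aztec(AZ_MATRIX *Amat, double *x, double *b)
{
  int    proc_config[AZ_PROC_SIZE];   /* parallel machine description */
  int    options[AZ_OPTIONS_SIZE];    /* integer solver choices       */
  double params[AZ_PARAMS_SIZE];      /* floating-point parameters    */
  double status[AZ_STATUS_SIZE];      /* convergence information      */

  AZ_set_proc_config(proc_config, MPI_COMM_WORLD);
  AZ_defaults(options, params);                /* start from defaults   */

  options[AZ_solver]          = AZ_gmres;      /* Krylov method         */
  options[AZ_precond]         = AZ_dom_decomp; /* overlapping DD        */
  options[AZ_subdomain_solve] = AZ_ilu;        /* ILU inside subdomains */
  options[AZ_max_iter]        = 500;
  params[AZ_tol]              = 1.0e-8;        /* residual tolerance    */

  /* NULL preconditioner/scaling arguments let Aztec construct them
     internally from the options above. */
  AZ_iterate(x, b, options, params, status, proc_config,
             Amat, NULL, NULL);
}
```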

  9. AztecOO/Trilinos • Trilinos encompasses efforts in linear solvers, eigensolvers, nonlinear and time-dependent solvers, and others. • Provides a common framework for current and future solver projects: • A common set of concrete linear algebra objects for solver development and application interfaces. • A consistent set of solver interfaces via abstract classes (API). • AztecOO improves on Aztec by: • Using objects for defining the matrix and RHS. • Providing more preconditioners/scalings. • Using C++ class design to enable more sophisticated use. • AztecOO's interfaces allow: • Continued use of Aztec functionality. • Introduction of new solver capabilities outside of Aztec.

  10. CUMULVS http://acts.nersc.gov/cumulvs • Collaborative User Migration, User Library for Visualization and Steering • Enables parallel programming with the integration of: • Interactive visualization (local and remote) • Multiple views • Fault tolerance • Computational steering

  11. CUMULVS [figure]

  12. Hypre http://acts.nersc.gov/hypre
  • Before writing your code: choose a conceptual interface; choose a solver / preconditioner; choose a matrix type that is compatible with your solver / preconditioner and conceptual interface
  • Now write your code: build auxiliary structures (e.g., grids, stencils); build the matrix/vector through the conceptual interface; build the solver/preconditioner; solve the system; get the desired information from the solver
  A sketch of these steps follows below.
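
The sketch below walks through those steps with hypre's IJ (linear-algebraic) conceptual interface: a 1D Laplacian (my example problem, not from the slides) is assembled, BoomerAMG is chosen as the preconditioner for PCG, and the iteration count is queried at the end. The calls follow recent hypre releases and may differ slightly in the version contemporary with this report.

```c
/* hypre sketch: assemble a 1D Laplacian through the IJ interface and
 * solve it with BoomerAMG-preconditioned PCG. Run under MPI. */
#include <stdio.h>
#include <mpi.h>
#include "HYPRE.h"
#include "HYPRE_IJ_mv.h"
#include "HYPRE_parcsr_ls.h"

int main(int argc, char *argv[])
{
  int myid, nprocs, N = 100;
  MPI_Init(&argc, &argv);
  MPI_Comm_rank(MPI_COMM_WORLD, &myid);
  MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

  /* Rows owned by this process: [ilower, iupper] */
  int local = N / nprocs, ilower = myid * local;
  int iupper = (myid == nprocs - 1) ? N - 1 : ilower + local - 1;

  /* 1. Build the matrix through the IJ conceptual interface */
  HYPRE_IJMatrix A; HYPRE_ParCSRMatrix parA;
  HYPRE_IJMatrixCreate(MPI_COMM_WORLD, ilower, iupper, ilower, iupper, &A);
  HYPRE_IJMatrixSetObjectType(A, HYPRE_PARCSR);
  HYPRE_IJMatrixInitialize(A);
  for (int i = ilower; i <= iupper; i++) {
    int cols[3], nnz = 0; double vals[3];
    if (i > 0)     { cols[nnz] = i - 1; vals[nnz++] = -1.0; }
    cols[nnz] = i; vals[nnz++] = 2.0;
    if (i < N - 1) { cols[nnz] = i + 1; vals[nnz++] = -1.0; }
    HYPRE_IJMatrixSetValues(A, 1, &nnz, &i, cols, vals);
  }
  HYPRE_IJMatrixAssemble(A);
  HYPRE_IJMatrixGetObject(A, (void **) &parA);

  /* 2. Build b (all ones) and x (zero initial guess) the same way */
  HYPRE_IJVector bb, xx; HYPRE_ParVector parb, parx;
  HYPRE_IJVectorCreate(MPI_COMM_WORLD, ilower, iupper, &bb);
  HYPRE_IJVectorSetObjectType(bb, HYPRE_PARCSR);
  HYPRE_IJVectorInitialize(bb);
  HYPRE_IJVectorCreate(MPI_COMM_WORLD, ilower, iupper, &xx);
  HYPRE_IJVectorSetObjectType(xx, HYPRE_PARCSR);
  HYPRE_IJVectorInitialize(xx);
  for (int i = ilower; i <= iupper; i++) {
    double one = 1.0, zero = 0.0;
    HYPRE_IJVectorSetValues(bb, 1, &i, &one);
    HYPRE_IJVectorSetValues(xx, 1, &i, &zero);
  }
  HYPRE_IJVectorAssemble(bb); HYPRE_IJVectorGetObject(bb, (void **) &parb);
  HYPRE_IJVectorAssemble(xx); HYPRE_IJVectorGetObject(xx, (void **) &parx);

  /* 3. Build the solver/preconditioner: PCG with BoomerAMG */
  HYPRE_Solver solver, precond;
  HYPRE_ParCSRPCGCreate(MPI_COMM_WORLD, &solver);
  HYPRE_PCGSetTol(solver, 1e-8);
  HYPRE_PCGSetMaxIter(solver, 100);
  HYPRE_BoomerAMGCreate(&precond);
  HYPRE_PCGSetPrecond(solver, (HYPRE_PtrToSolverFcn) HYPRE_BoomerAMGSolve,
                      (HYPRE_PtrToSolverFcn) HYPRE_BoomerAMGSetup, precond);

  /* 4-5. Solve the system and query the solver */
  HYPRE_ParCSRPCGSetup(solver, parA, parb, parx);
  HYPRE_ParCSRPCGSolve(solver, parA, parb, parx);
  int its; HYPRE_PCGGetNumIterations(solver, &its);
  if (myid == 0) printf("PCG converged in %d iterations\n", its);

  MPI_Finalize();
  return 0;
}
```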

  13. Hypre: Interfaces [diagram: linear system interfaces mapping data layouts (structured, composite, block-structured, unstructured, CSR) to linear solvers (GMG, FAC, Hybrid, AMGe, ILU, ...)] Multiple interfaces are necessary to provide the “best” solvers and data layouts

  14. PETSc http://acts.nersc.gov/petsc • Portable, Extensible Toolkit for Scientific Computing • What can it do? Support the development of parallel PDE solvers: implicit or semi-implicit solution methods; finite element, finite difference, or finite volume discretizations • Specification of the mathematics of the problem: vectors (field variables) and matrices (operators) • How is the problem solved? Linear, nonlinear, and time-stepping (ODE) solvers (see the sketch below)
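
For illustration, here is a small self-contained PETSc program in the spirit of this slide: it assembles a 1D Laplacian (my example problem) and solves it with the KSP linear solver object, whose method and preconditioner are selected at run time. The calls follow the modern PETSc API, which has drifted somewhat since the release this report describes; error checking is omitted for brevity.

```c
/* PETSc sketch: solve a 1D Laplacian with the KSP interface.
 * Method/preconditioner chosen at run time, e.g.
 *   mpiexec -n 4 ./a.out -ksp_type gmres -pc_type bjacobi */
#include <petscksp.h>

int main(int argc, char **argv)
{
  Mat A; Vec x, b; KSP ksp;
  PetscInt i, rstart, rend, N = 100;

  PetscInitialize(&argc, &argv, NULL, NULL);

  /* Matrix: PETSc chooses the parallel layout */
  MatCreate(PETSC_COMM_WORLD, &A);
  MatSetSizes(A, PETSC_DECIDE, PETSC_DECIDE, N, N);
  MatSetFromOptions(A);
  MatSetUp(A);
  MatGetOwnershipRange(A, &rstart, &rend);
  for (i = rstart; i < rend; i++) {
    if (i > 0)     MatSetValue(A, i, i - 1, -1.0, INSERT_VALUES);
    if (i < N - 1) MatSetValue(A, i, i + 1, -1.0, INSERT_VALUES);
    MatSetValue(A, i, i, 2.0, INSERT_VALUES);
  }
  MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY);
  MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);

  /* Vectors: b = 1, x holds the solution */
  MatCreateVecs(A, &x, &b);
  VecSet(b, 1.0);

  /* Linear solver object, configured from the command line */
  KSPCreate(PETSC_COMM_WORLD, &ksp);
  KSPSetOperators(ksp, A, A);
  KSPSetFromOptions(ksp);
  KSPSolve(ksp, b, x);

  KSPDestroy(&ksp); MatDestroy(&A); VecDestroy(&x); VecDestroy(&b);
  PetscFinalize();
  return 0;
}
```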

  15. PETSc: Features • Parallelism: uses MPI • Data layout: structured and unstructured meshes; partitioning and coloring • Viewers: printing data object information; visualization of field and matrix data • Profiling and performance tuning: -log_summary; profiling by stages of an application; user-defined events

  16. PVODE http://acts.nersc.gov/pvode PVODE actually refers to a trio of closely related solvers: • PVODE, for systems of ordinary differential equations • KINSOL, for systems of nonlinear algebraic equations • IDA, for systems of differential-algebraic equations PVODE has since evolved into SUNDIALS (SUite of Nonlinear and DIfferential/ALgebraic equation Solvers). A minimal integration sketch follows below.
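
As a flavor of the successor package, here is a minimal sketch that integrates the toy problem y' = -y (my example) with CVODE, the ODE component of SUNDIALS. The calls follow the SUNDIALS 7.x serial API, which threads a SUNContext through every constructor; signatures differ in older releases, and certainly in the PVODE of this report's era.

```c
/* CVODE (SUNDIALS 7.x) sketch: integrate y' = -y, y(0) = 1, with BDF. */
#include <stdio.h>
#include <cvode/cvode.h>
#include <nvector/nvector_serial.h>
#include <sunmatrix/sunmatrix_dense.h>
#include <sunlinsol/sunlinsol_dense.h>

/* Right-hand side f(t, y) = -y */
static int f(sunrealtype t, N_Vector y, N_Vector ydot, void *user_data)
{
  NV_Ith_S(ydot, 0) = -NV_Ith_S(y, 0);
  return 0;
}

int main(void)
{
  SUNContext ctx;
  SUNContext_Create(SUN_COMM_NULL, &ctx);

  N_Vector y = N_VNew_Serial(1, ctx);
  NV_Ith_S(y, 0) = 1.0;                       /* y(0) = 1 */

  void *cvode_mem = CVodeCreate(CV_BDF, ctx); /* stiff integrator */
  CVodeInit(cvode_mem, f, 0.0, y);
  CVodeSStolerances(cvode_mem, 1e-8, 1e-10);

  /* Dense linear solver for the Newton iterations */
  SUNMatrix A = SUNDenseMatrix(1, 1, ctx);
  SUNLinearSolver LS = SUNLinSol_Dense(y, A, ctx);
  CVodeSetLinearSolver(cvode_mem, LS, A);

  sunrealtype t;
  CVode(cvode_mem, 1.0, y, &t, CV_NORMAL);    /* advance to t = 1 */
  printf("y(1) = %g (exact: %g)\n", (double)NV_Ith_S(y, 0), 0.36787944);

  CVodeFree(&cvode_mem);
  SUNLinSolFree(LS); SUNMatDestroy(A); N_VDestroy(y);
  SUNContext_Free(&ctx);
  return 0;
}
```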

  17. ScaLAPACK http://acts.nersc.gov/scalapack [diagram: software hierarchy; global layer: ScaLAPACK over the PBLAS; local layer: LAPACK and the BLACS; underneath: platform-specific BLAS and PVM/MPI/...] Version 1.7 released in August 2001. Parallel BLAS. Linear systems, least squares, singular value decomposition, eigenvalues. Communication routines targeting linear algebra operations. Clarity, modularity, performance and portability. ATLAS can be used for automatic tuning.

  18. ScaLAPACK: Goals • Efficiency: optimized computation and communication engines; block-partitioned algorithms (Level 3 BLAS) for good node performance • Reliability: whenever possible, use LAPACK algorithms and error bounds • Scalability: as the problem size and number of processors grow; replace LAPACK algorithms that did not scale (feeding new ones back into LAPACK) • Portability: isolate machine dependencies to the BLAS and the BLACS • Flexibility: modularity; build a rich set of linear algebra tools (BLAS, BLACS, PBLAS) • Ease of use: calling interface similar to LAPACK (a calling sketch follows below)
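
As a concrete taste of that calling interface, the sketch below distributes a diagonally dominant matrix (my example) over a 2 x 2 process grid in ScaLAPACK's 2D block-cyclic layout and solves one right-hand side with PDGESV. The hand-written prototypes assume the common trailing-underscore Fortran naming convention; details like this vary by platform and build.

```c
/* ScaLAPACK sketch: solve A x = b with PDGESV on a 2 x 2 process grid.
 * Run with exactly 4 MPI processes. */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

void Cblacs_pinfo(int *mypnum, int *nprocs);
void Cblacs_get(int context, int what, int *val);
void Cblacs_gridinit(int *context, char *order, int nprow, int npcol);
void Cblacs_gridinfo(int context, int *nprow, int *npcol, int *myrow, int *mycol);
void Cblacs_gridexit(int context);
int  numroc_(int *n, int *nb, int *iproc, int *isrcproc, int *nprocs);
void descinit_(int *desc, int *m, int *n, int *mb, int *nb, int *irsrc,
               int *icsrc, int *ictxt, int *lld, int *info);
void pdgesv_(int *n, int *nrhs, double *a, int *ia, int *ja, int *desca,
             int *ipiv, double *b, int *ib, int *jb, int *descb, int *info);

int main(int argc, char **argv)
{
  int ictxt, mypnum, nprocs, myrow, mycol, nprow = 2, npcol = 2;
  int n = 1000, nb = 64, nrhs = 1, izero = 0, ione = 1, info;

  MPI_Init(&argc, &argv);
  Cblacs_pinfo(&mypnum, &nprocs);
  Cblacs_get(-1, 0, &ictxt);                  /* default system context */
  Cblacs_gridinit(&ictxt, "Row", nprow, npcol);
  Cblacs_gridinfo(ictxt, &nprow, &npcol, &myrow, &mycol);

  /* Local dimensions of the 2D block-cyclically distributed A and b */
  int mloc = numroc_(&n, &nb, &myrow, &izero, &nprow);
  int nloc = numroc_(&n, &nb, &mycol, &izero, &npcol);
  double *A    = malloc((size_t)mloc * nloc * sizeof *A);
  double *b    = malloc((size_t)mloc * sizeof *b);
  int    *ipiv = malloc((mloc + nb) * sizeof *ipiv);

  /* Fill the local blocks: diagonally dominant A, b = 1 */
  for (int j = 0; j < nloc; j++) {
    int gj = (j / nb) * npcol * nb + mycol * nb + j % nb;   /* global col */
    for (int i = 0; i < mloc; i++) {
      int gi = (i / nb) * nprow * nb + myrow * nb + i % nb; /* global row */
      A[i + (size_t)j * mloc] = (gi == gj) ? 2.0 * n : 1.0;
    }
  }
  for (int i = 0; i < mloc; i++) b[i] = 1.0;

  /* Array descriptors tell ScaLAPACK how the data is laid out */
  int descA[9], descB[9], lld = mloc > 1 ? mloc : 1;
  descinit_(descA, &n, &n,    &nb, &nb, &izero, &izero, &ictxt, &lld, &info);
  descinit_(descB, &n, &nrhs, &nb, &nb, &izero, &izero, &ictxt, &lld, &info);

  pdgesv_(&n, &nrhs, A, &ione, &ione, descA, ipiv,
          b, &ione, &ione, descB, &info);
  if (mypnum == 0) printf("pdgesv info = %d\n", info);

  free(A); free(b); free(ipiv);
  Cblacs_gridexit(ictxt);
  MPI_Finalize();
  return 0;
}
```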

  19. SuperLU http://acts.nersc.gov/superlu • Solves Ax=b by sparse Gaussian elimination • Sequential, SMP and distributed memory (MPI) implementations • Suitable for general sparse A, nonsymmetric, real or complex • Performance depends strongly on: • Sparsity structure; good if (number of flops) / (number of nonzeros) is large • Ordering of equations and unknowns (controls fill-in, parallelism) • A small driver sketch follows below
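
To show what "general sparse A" looks like at the API level, here is a small sequential SuperLU example using the simple driver dgssv on a 3 x 3 system (my toy example) in compressed-column storage. It reflects recent SuperLU releases (header slu_ddefs.h); older versions spelled some names differently.

```c
/* Sequential SuperLU sketch: solve a 3x3 sparse system with dgssv.
 * A = [2 0 1; 0 3 0; 1 0 2] in compressed-column (Harwell-Boeing) form. */
#include <stdio.h>
#include "slu_ddefs.h"

int main(void)
{
  int    m = 3, n = 3, nnz = 5, info;
  double a[]    = {2.0, 1.0, 3.0, 1.0, 2.0};  /* nonzeros, column by column */
  int    asub[] = {0, 2, 1, 0, 2};            /* row index of each nonzero  */
  int    xa[]   = {0, 2, 3, 5};               /* start of each column in a  */
  double rhs[]  = {1.0, 1.0, 1.0};

  SuperMatrix A, L, U, B;
  dCreate_CompCol_Matrix(&A, m, n, nnz, a, asub, xa, SLU_NC, SLU_D, SLU_GE);
  dCreate_Dense_Matrix(&B, m, 1, rhs, m, SLU_DN, SLU_D, SLU_GE);

  int *perm_c = intMalloc(n);   /* column permutation: controls fill-in */
  int *perm_r = intMalloc(m);   /* row permutation: partial pivoting    */

  superlu_options_t options;
  set_default_options(&options);
  options.ColPerm = COLAMD;     /* fill-reducing column ordering */

  SuperLUStat_t stat;
  StatInit(&stat);

  dgssv(&options, &A, perm_c, perm_r, &L, &U, &B, &stat, &info);
  /* On exit, B (i.e., rhs) holds the solution; info = 0 means success */
  printf("info = %d, x = (%g, %g, %g)\n", info, rhs[0], rhs[1], rhs[2]);

  /* A's and B's numerical data is user-owned; free only the wrappers */
  Destroy_SuperMatrix_Store(&A);
  Destroy_SuperMatrix_Store(&B);
  Destroy_SuperNode_Matrix(&L);
  Destroy_CompCol_Matrix(&U);
  StatFree(&stat);
  SUPERLU_FREE(perm_c);
  SUPERLU_FREE(perm_r);
  return 0;
}
```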

  20. Distributed SuperLU: Performance Highlights • Uses static instead of dynamic pivoting to be as scalable as Cholesky • Performance on a 512-processor Cray T3E: • 10.2 Gflops for MIXING-TANK, fluid flow, n = 29957, nonzeros/row = 67 • 8.4 Gflops for ECL32, device simulation, n = 51993, nonzeros/row = 7.3 • 2.5 Gflops for BBMAT, fluid flow, n = 38744, nonzeros/row = 46 (20% parallel efficiency) • Used to solve an open quantum mechanics problem (Science, 24 Dec 1999): • n = 736 K on 64 PEs, Cray T3E, in 5.7 minutes • n = 1.8 M on 24 PEs, ASCI Blue Pacific, in 24 minutes

  21. TAO http://acts.nersc.gov/tao • Toolkit for Advanced Optimization • Object-oriented techniques • Component-based interaction • Leverage of existing parallel computing infrastructure • Reuse of external toolkits • Algorithms for: • Unconstrained optimization • Bound-constrained optimization • Linearly constrained optimization • Nonlinearly constrained optimization • A minimal driver sketch follows below
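
Below is a minimal TAO driver, sketched with today's PETSc-integrated TAO API (TaoSetSolution and TaoSetObjectiveAndGradient appeared long after this report; the standalone TAO of that era used different calls). It minimizes a simple separable quadratic of my choosing with the limited-memory variable-metric method.

```c
/* TAO sketch: minimize f(x) = sum_i (x_i - 1)^2 with TAOLMVM.
 * Uses the modern PETSc-integrated TAO API (PETSc >= 3.17). */
#include <petsctao.h>

static PetscErrorCode FormFunctionGradient(Tao tao, Vec X, PetscReal *f,
                                           Vec G, void *ctx)
{
  const PetscScalar *x;
  PetscScalar       *g;
  PetscInt           i, n;
  PetscReal          flocal = 0.0;

  VecGetLocalSize(X, &n);
  VecGetArrayRead(X, &x);
  VecGetArray(G, &g);
  for (i = 0; i < n; i++) {
    PetscReal d = PetscRealPart(x[i]) - 1.0;
    flocal += d * d;                 /* local piece of the objective */
    g[i]    = 2.0 * d;               /* local piece of the gradient  */
  }
  VecRestoreArrayRead(X, &x);
  VecRestoreArray(G, &g);
  /* The objective must be the global sum across processes */
  MPI_Allreduce(&flocal, f, 1, MPIU_REAL, MPI_SUM,
                PetscObjectComm((PetscObject)tao));
  return 0;
}

int main(int argc, char **argv)
{
  Tao tao; Vec x;

  PetscInitialize(&argc, &argv, NULL, NULL);
  VecCreateMPI(PETSC_COMM_WORLD, PETSC_DECIDE, 50, &x);
  VecSet(x, 0.0);                    /* starting point x = 0 */

  TaoCreate(PETSC_COMM_WORLD, &tao);
  TaoSetType(tao, TAOLMVM);          /* limited-memory variable metric */
  TaoSetSolution(tao, x);
  TaoSetObjectiveAndGradient(tao, NULL, FormFunctionGradient, NULL);
  TaoSetFromOptions(tao);            /* e.g. -tao_monitor */
  TaoSolve(tao);

  TaoDestroy(&tao); VecDestroy(&x);
  PetscFinalize();
  return 0;
}
```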

  22. TAO: interfaces [figure]

  23. TAU http://acts.nersc.gov/tau • Profiling of Java, C++, C, and Fortran codes • Detailed information (much more than prof/gprof) • Profiles for each unique template instantiation • Time spent exclusively and inclusively in each function • Start/stop timers • Profiling data maintained for each thread, context, and node • Parallel I/O • Statistics on the number of calls for each profiled function • Profiling groups for organizing and controlling instrumentation • Support for CPU hardware counters (PAPI) • Graphical display of parallel profiling data (built-in viewers, interface to Vampir) • A basic instrumentation sketch follows below
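
TAU also offers a manual instrumentation API for C; a minimal sketch is shown below. The macro names come from TAU's documented C interface, but the build details (e.g., compiling through tau_cc.sh so that TAU.h and the runtime resolve) are assumptions about a typical installation.

```c
/* TAU manual instrumentation sketch: time a worker routine.
 * Compile through TAU (e.g. tau_cc.sh) so TAU.h and the runtime resolve. */
#include <TAU.h>

void compute(long n)
{
  /* Declare and start a timer scoped to this routine */
  TAU_PROFILE_TIMER(t, "compute", "void (long)", TAU_USER);
  TAU_PROFILE_START(t);

  volatile double s = 0.0;
  for (long i = 0; i < n; i++) s += i * 0.5;  /* the work being measured */

  TAU_PROFILE_STOP(t);
}

int main(int argc, char **argv)
{
  TAU_PROFILE_INIT(argc, argv);
  TAU_PROFILE_SET_NODE(0);   /* single process; use the MPI rank otherwise */

  compute(1000000L);
  return 0;                  /* profile.* files are written at exit */
}
```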

  24. TAU: Control Windows • COSY: COmpile manager Status displaY • FANCY: File ANd Class displaY • SPIFFY: Structured Programming Interface and Fancy File displaY • CAGEY: CAll Graph Extended displaY • CLASSY: CLASS hierarchY browser • RACY: Routine and data ACcess profile displaY • SPEEDY: Speedup and Parallel Execution Extrapolation DisplaY

  25. Why do we need these tools? [chart: performance (GFlop/s) of the fastest computers, 1990-2001, from the Cray Y-MP, Fujitsu VP-2600, NEC SX-3, TMC CM-2 and CM-5, Intel Paragon, Fujitsu VPP-500, and Hitachi CP-PACS up through Intel ASCI Red, SGI ASCI Blue Mountain, ASCI Blue Pacific SST, and ASCI White Pacific (7,424 processors) at roughly 7,000 GFlop/s in 2001] A computation that took 1 full year to complete in 1980 could be done in ~10 hours in 1992, in ~16 minutes in 1997 and in ~27 seconds in 2001! • High performance tools: portable; library calls; robust algorithms; help code optimization • More code development in less time • More simulation in less computer time

  26. Lessons Learned • There is still a gap between tool developers and application developers, which leads to duplication of effort. • The tools currently included in the ACTS Toolkit should be seen as dynamically configurable and should be grouped into toolkits on user/application demand. • Users demand long-term support of the tools. • Applications and users play an important role in making the tools mature. • Tools evolve or are superseded by other tools. • There is a demand for tool interoperability and more uniformity in documentation and user interfaces. • There is a need for an intelligent and dynamic catalog/repository of high performance tools.

  27. User Community [diagram: the ACTS user community; disciplines (engineering, mathematics, physics, chemistry, biology, medicine, bioinformatics, computer sciences) and activities (numerical simulations, challenge codes, collaboratories, computing systems, scientific computing centers) feed a pool of software tools, which reaches users through ACTS workshops and training, computer vendors, a testing and acceptance phase, and work on interoperability]

  28. http://acts.nersc.gov [screenshot of the ACTS information center: agenda, accomplishments, conferences, releases, etc.; tool descriptions, installation details, examples, etc.; goals and other relevant information; points of contact; search engine]

  29. Please mark your calendars!

  30. ACTS acts-support@nersc.gov http://acts.nersc.gov
