TAU Performance System Performance Tools BOF, SC’07 5:30pm – 7pm, Tuesday, A9 Sameer S. Shende sameer@cs.uoregon.edu http://tau.uoregon.edu Performance Research Laboratory University of Oregon
Acknowledgements • Dr. Allen D. Malony, Professor • Alan Morris, Senior software engineer • Wyatt Spear, Software engineer • Scott Biersdorff, Software engineer • Matt Sottile, Research faculty • Rob Yelle, Research faculty • Kevin Huck, Ph.D. student • Aroon Nataraj, Ph.D. student • Shangkar Mayanglambam, Ph.D. student • Brad Davidson, Systems administrator
TAU Parallel Performance System • http://tau.uoregon.edu/ • Multi-level performance instrumentation • Multi-language automatic source instrumentation • Flexible and configurable performance measurement • Widely-ported parallel performance profiling system • Computer system architectures and operating systems • Different programming languages and compilers • Support for multiple parallel programming paradigms • Multi-threading, message passing, mixed-mode, hybrid
What is TAU? • Portable profiling and tracing toolkit • BSD-style license • Automatic source-level instrumentation (PDT, Opari [FZJ]) • Routine and loop level • Instrumentation optimization (TAU_THROTTLE) • Measurement • Callpath, phase, and parameter-based profiling • PAPI [UTK] support in profiling and tracing • One or more native or preset events • Analysis tools • Parallel profile analysis (ParaProf) • Performance data management (PerfDMF database) • Performance data mining (PerfExplorer) • Kernel monitoring with KTAU
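The slide names TAU_THROTTLE without describing the rule behind it. As a rough illustration of the idea only, and not TAU's actual implementation, the sketch below flags a routine for throttling when it is called very often and each call is very short. The variable names TAU_THROTTLE_NUMCALLS and TAU_THROTTLE_PERCALL and the default thresholds (100000 calls, 10 microseconds per call) are assumptions taken from commonly documented TAU defaults and should be checked against the installed release; should_throttle is a hypothetical helper.

```shell
#!/bin/sh
# Assumed defaults; verify against the documentation of your TAU release.
: "${TAU_THROTTLE_NUMCALLS:=100000}"   # call-count threshold
: "${TAU_THROTTLE_PERCALL:=10}"        # microseconds per call

# should_throttle CALLS INCLUSIVE_USEC   (hypothetical helper, not a TAU command)
# Succeeds (exit 0) when the routine is called more than the call-count
# threshold AND its average inclusive time per call is below the per-call
# threshold. The comparison is rearranged to avoid integer division.
should_throttle() {
    calls=$1
    usec=$2
    [ "$calls" -gt "$TAU_THROTTLE_NUMCALLS" ] &&
        [ "$usec" -lt $((TAU_THROTTLE_PERCALL * calls)) ]
}

# 2,000,000 calls at 2 microseconds each: frequent and cheap.
if should_throttle 2000000 4000000; then echo "throttle"; else echo "keep"; fi
```

A routine like this adds measurement overhead out of proportion to the time it accounts for, which is why removing its instrumentation is a reasonable optimization.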
What is new in TAU and PDT? • TAU v2.17 and PDT v3.12 released at SC’07 [tau.uoregon.edu] • Support for new platforms • IBM BG/P (Linux) • SiCortex • Cray XT (Compute Node Linux) • Windows Cluster 2003 • Improved support for VampirTrace [TU Dresden] for atomic events, native OTF generation • Automatic wrapper library generation (tau_wrap) for HDF5, I/O… • Enhanced Eclipse/PTP plugin for tool interoperability • PerfExplorer: Custom charts, multiple database support • ParaProf supports multiple profile formats, databases • PERI XML, TAU, perfsuite, cube 2 & 3, mpiP, HPMtoolkit, gprof… • Support for storing metadata in profiles, TAU portal, PerfDMF • PDT now supports type information in PDB files and a GFortran parser
TAU Demos at SC’07 • ASC/NNSA Booth #1617, demo station #2 • Wednesday (11/14): 12-1pm, 3-5pm • Thursday (11/15): 10-11am • Schedule available at tau.uoregon.edu • SiCortex Booth • ANL Booth (KTAU presentation) • Thu. 12-1pm • Paper: “Ghost in the machine: Observing the Effects of Kernel Operation in Parallel Application Performance,” A. Nataraj, A. Morris, A. Malony, M. Sottile, P. Beckman, SC’07, A2/A5, Wed. 10:30am
Future Research Directions • Improving tool interoperability • OTF [TU Dresden] • TotalView [Totalview Tech] • Scalasca/KOJAK instrumentation [FZJ] • Hybrid sampling- and instrumentation-based measurements • PerfSuite [NCSA] • Kernel measurements for tracking I/O using KTAU and ZeptoOS [ANL] • Binary rewriter integration using DyninstAPI [U. Maryland, U. Wisconsin] • Improvements in SiCortex integrated tool environment • TAU Portal, regression testing
Program Database Toolkit (PDT) • [Diagram] Application / Library → C / C++ parser and Fortran parser (F77/90/95) → IL → C / C++ IL analyzer and Fortran IL analyzer → Program Database Files → DUCTAPE • Tools built on DUCTAPE: PDBhtml (program documentation), SILOON (application component glue), CHASM (C++ / F90/95 interoperability), TAU_instr (automatic source instrumentation)
Using TAU: A Brief Introduction
• To instrument source code using PDT, choose an appropriate TAU stub makefile in <arch>/lib:
    % setenv TAU_MAKEFILE /usr/tau-2.17/x86_64/lib/Makefile.tau-mpi-pdt-pgi
    % setenv TAU_OPTIONS '-optVerbose …' (see tau_compiler.sh)
• Use tau_f90.sh, tau_cxx.sh or tau_cc.sh as the Fortran, C++ or C compiler:
    % mpif90 foo.f90
  changes to
    % tau_f90.sh foo.f90
• Execute the application and analyze the performance data:
    % pprof (for text-based profile display)
    % paraprof (for GUI)
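The commands above use csh syntax (setenv). For readers on sh or bash, a minimal sketch of the same setup follows. The makefile path and the -optVerbose option are the ones shown on the slide; tau_wrapper_for and tau_build_cmd are hypothetical helpers, not part of TAU, and the list of file extensions is an assumption. The script only prints the build command, so it can be tried on a machine without TAU installed.

```shell
#!/bin/sh
# sh/bash equivalent of the slide's csh `setenv` lines.
export TAU_MAKEFILE=/usr/tau-2.17/x86_64/lib/Makefile.tau-mpi-pdt-pgi
export TAU_OPTIONS='-optVerbose'   # further options: see tau_compiler.sh

# Hypothetical helper: choose the TAU compiler wrapper named on the slide
# from the source file's extension (the extension list is an assumption).
tau_wrapper_for() {
    case "$1" in
        *.f|*.f77|*.f90|*.f95|*.F|*.F90) echo "tau_f90.sh" ;;
        *.cpp|*.cxx|*.cc|*.C)            echo "tau_cxx.sh" ;;
        *.c)                             echo "tau_cc.sh" ;;
        *) echo "unknown source type: $1" >&2; return 1 ;;
    esac
}

# Hypothetical helper: print the build command instead of running it.
tau_build_cmd() {
    wrapper=$(tau_wrapper_for "$1") || return 1
    echo "$wrapper $*"
}

tau_build_cmd foo.f90     # prints: tau_f90.sh foo.f90
```

In a real build the printed command would be executed in place of the original mpif90, mpicxx or mpicc invocation, after which pprof or paraprof can be run in the directory containing the generated profiles.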
TAU’s ParaProf Profile Browser: Manager • Application metadata • Multiple PerfDMF databases
TAU’s ParaProf Scalable Profile Browser • S3D: 6400 cores on XT3+XT4 System (Jaguar) • Gap represents XT3 nodes
S3D Scatter Plot: Visualizing Hybrid XT3+XT4 • Red nodes are XT4, blue are XT3 • 6400 cores
PerfDMF Architecture • K. Huck, A. Malony, R. Bell, A. Morris, “Design and Implementation of a Parallel Performance Data Management Framework,” ICPP 2005.
PerfExplorer: S3D Total Runtime Breakdown • 12,000 cores • Chart labels: WRITE_SAVEFILE, MPI_Wait
Support Acknowledgements • US Department of Energy (DOE) • Office of Science • MICS, Argonne National Lab • ASC/NNSA • University of Utah ASC/NNSA Level 1 • ASC/NNSA, Lawrence Livermore National Lab • US Department of Defense (DoD) • NSF HEC-RTF and SDCI • Research Centre Juelich • TU Dresden • Los Alamos National Laboratory • ParaTools, Inc. • PSC, NCSA, and U. Oregon