TAU Performance Tool on Ranger and Kraken Mahin Mahmoodi October 22, 2009
Outline • TAU’s features • Instructions using TAU on Ranger & Kraken • TAU overhead • TAU MD code profiling • Callpath • Outer Loop • Selective • Phase • Tracing • References
TAU: Tuning & Analysis Utilities
• Supports essentially all computing platforms
• Automatic and manual instrumentation for profiling, tracing, and sampling
• Routine, loop, block, structure, and phase profiling; compiler and binary instrumentation
• Supports multiple programming languages and programming paradigms
• Fortran, Java, C/C++, Python, ...; multi-threading, message passing, mixed-mode, hybrid, ...
• Performance measurement of I/O and the Linux kernel
• Tracks heap memory for each routine
• Application to OS noise analysis
• Tools for performance data management and mining
• Sophisticated visualization (2-D and 3-D views)
• Translation to multiple trace formats
• Developed by Allen Malony et al. at the University of Oregon and ParaTools
• TAU website: http://tau.uoregon.edu
Location of TAU on Ranger & Kraken
Ranger
% module avail tau
tau/2.17(default)
- The latest TAU version is available from: /share/home/00968/tg802155/tau-2.18.2p4
• Configured with pgi7_2/mvapich/1.0.1 and with intel10_1/mvapich/1.0.1
Kraken
% module avail tau
tau-2.18.1, tau-2.18.2
- Configured with the PGI compiler
General Instructions for TAU
Step 1: Auto Instrumentation
• Use a TAU Makefile stub
• Compile with the TAU compiler scripts (tau_cc.sh, tau_f90.sh, tau_cxx.sh)
Example
- setenv TAU_MAKEFILE <TAU_path>/lib/Makefile.tau-papi-pdt-pgi
- setenv TAU_OPTIONS "-optVerbose -optKeepFiles" (optional)
- tau_cc.sh -o hello hello.c (hello is the instrumented binary)
If using a makefile
- make CC=tau_cc.sh
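For Bourne-style shells (sh, bash), the csh commands in Step 1 might look like the sketch below. TAU_ROOT and the helper tau_wrapper_for are hypothetical names introduced here, and the mapping from file extension to compiler script is an assumption, not something the slides specify.

```shell
#!/bin/sh
# Sketch: sh/bash version of the Step 1 setup.
# TAU_ROOT is a placeholder for <TAU_path>; it is not a variable TAU defines.
TAU_ROOT="${TAU_ROOT:-/path/to/tau}"
export TAU_MAKEFILE="$TAU_ROOT/lib/Makefile.tau-papi-pdt-pgi"
export TAU_OPTIONS="-optVerbose -optKeepFiles"   # optional

# Hypothetical helper: choose a TAU compiler script from the file extension.
# The extension list is an assumption made for this example.
tau_wrapper_for() {
    case "$1" in
        *.c)               echo tau_cc.sh ;;
        *.f90|*.F90)       echo tau_f90.sh ;;
        *.cpp|*.cxx|*.cc)  echo tau_cxx.sh ;;
        *)                 return 1 ;;
    esac
}

src=hello.c
wrapper=$(tau_wrapper_for "$src")
if [ -f "$src" ] && command -v "$wrapper" >/dev/null 2>&1; then
    "$wrapper" -o hello "$src"      # hello is the instrumented binary
else
    echo "skipping build: $src or $wrapper not available" >&2
fi
```

The guard keeps the script harmless on a machine where the tau module has not been loaded.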
General Instructions for TAU (cont.)
Step 2: Execution
• module load papi
• Set the necessary environment variables and run the instrumented binary as normal (this step generates profile files, one per core)
• Information on the environment variables: http://www.cs.uoregon.edu/research/tau/docs/newguide/bk03apa.html#d0e15219
Step 3: Profiling report
• Add the TAU bin directory to your path: set path=( <tau_path>/bin $path )
• Run pprof (text) or paraprof (GUI) to view the results
To view on a Windows workstation:
- % paraprof --pack app.ppk (packs the profiles on the remote machine)
- Open app.ppk on the workstation (TAU must be installed there first)
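Because Step 2 writes one profile file per core, a quick sanity check before Step 3 is to count those files. The sketch below assumes TAU's usual profile.<node>.<context>.<thread> file naming, which the slides do not spell out; count_profiles is a hypothetical helper.

```shell
#!/bin/sh
# Hypothetical helper: count TAU profile files in a directory (default: .).
# Assumes files are named profile.<node>.<context>.<thread>.
count_profiles() {
    dir=${1:-.}
    n=0
    for f in "$dir"/profile.*.*.*; do
        # An unmatched glob stays literal, so test that the file exists.
        if [ -e "$f" ]; then
            n=$((n + 1))
        fi
    done
    echo "$n"
}

found=$(count_profiles .)
echo "profile files found: $found"

# Only call pprof when there is something to report and it is on PATH.
if [ "$found" -gt 0 ] && command -v pprof >/dev/null 2>&1; then
    pprof
fi
```

If the count does not match the number of cores in the job, the run probably did not finish cleanly or was launched from a different directory.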
The UNRES
The UNRES molecular dynamics (MD) code uses a carefully derived mesoscopic protein force field to study and predict protein folding pathways by means of molecular dynamics simulations.
• http://www.chem.cornell.edu/has5
% paraprof – Main Data Window
UNRES on Ranger, 16-way, 32 nodes
[Figure: paraprof Main Data Window showing the per-core breakdown of time by subroutine (Core 0, Core 1), with a "left-click here" annotation]
TAU Instrumentation Overhead
• TAU uses direct measurement
• Deterministic approach
• Auto and manual instrumentation
• Profilers do not give detailed insight into the timing behavior of an application
• Instrumentation introduces overhead
Overhead (time in seconds):
- MD steps, base: 51.4
- MD steps with TAU: 315
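To put the two timings in proportion, the arithmetic can be sketched as below. Only the two numbers come from the slide; the variable names are illustrative.

```shell
#!/bin/sh
# Timings from the slide, in seconds.
base=51.4        # MD steps, uninstrumented
with_tau=315     # MD steps with full TAU instrumentation

# Slowdown factor and added wall time, rounded to one decimal place.
slowdown=$(LC_ALL=C awk -v b="$base" -v t="$with_tau" 'BEGIN { printf "%.1f", t / b }')
added=$(LC_ALL=C awk -v b="$base" -v t="$with_tau" 'BEGIN { printf "%.1f", t - b }')

echo "slowdown: ${slowdown}x, added time: ${added} s"
```

A slowdown of roughly six times is what motivates the selective instrumentation described in the slides that follow.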
Reducing the TAU Instrumentation Overhead
• In the Main Data Window, from the File menu, select Create Selective Instrumentation File
• Specify the filtering criteria in the selection window
• Save the throttled routines to a file
• Include that file in the compilation (more info later)
TAU Commonly Used Features • Callpath profiling • Selective instrumentation • Loop Instrumentation • Phase profiling • Tracing
TAU Callpath Profiling
At run time:
• setenv TAU_CALLPATH 1
• setenv TAU_CALLPATH_DEPTH 30 (default depth is 2)
• To see the call graph: in the Main Data Window, right-click on a core, then select the Thread Statistical Table option
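In sh or bash, the same run-time settings could be wrapped in a small function so the depth is validated before the job starts. This is only a sketch; set_tau_callpath is a hypothetical helper, and only the two variable names and the default depth of 2 come from the slide.

```shell
#!/bin/sh
# Hypothetical helper: enable TAU callpath profiling at a given depth.
# With no argument it uses 2, the default depth quoted on the slide.
set_tau_callpath() {
    depth=${1:-2}
    case "$depth" in
        *[!0-9]*)
            echo "depth must be a non-negative integer" >&2
            return 1
            ;;
    esac
    export TAU_CALLPATH=1
    export TAU_CALLPATH_DEPTH="$depth"
}

# Same values as the csh example on the slide.
set_tau_callpath 30
echo "TAU_CALLPATH=$TAU_CALLPATH TAU_CALLPATH_DEPTH=$TAU_CALLPATH_DEPTH"
```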
Selective Instrumentation
• Instrument the code normally
• Generate the select.tau file as shown in slide …
• Set TAU_OPTIONS and recompile:
setenv TAU_OPTIONS "-optVerbose -optKeepFiles -optPreProcess -optTauSelectFile=select.tau"
• Further selective options can be added to select.tau:
- Files to include/exclude
- Routines to include/exclude
- Directives for loop instrumentation
- Phase definitions
Sample select.tau
% cat select.tau
BEGIN_EXCLUDE_LIST
DDOT
DAXPY
VECPR
DIST
BETA
ALPHA
END_EXCLUDE_LIST
BEGIN_INSTRUMENT_SECTION
static phase name="PHASE_MD" file="minimize_p*" line=153 to line=154
loops file="prim_advance_mod*" routine="PRIM_ADVANCE_MOD::PREQ_ADVANCE_EXP"
END_INSTRUMENT_SECTION
BEGIN_INCLUDE_LIST
EELEC
EGB
GINV_MULT
ESCP
END_INCLUDE_LIST
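A hand-edited select file is easy to break by dropping or mistyping an END_ line. The sketch below is a hypothetical pre-compile check, not a TAU tool: it only verifies that every BEGIN_<NAME> line is closed by the matching END_<NAME> line, and it uses a shortened copy of the sample above.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if every BEGIN_<NAME> line is closed by
# the matching END_<NAME> line, with no nesting.
check_select_file() {
    awk '
        /^BEGIN_/ { if (sect != "") bad = 1; sect = substr($1, 7); next }
        /^END_/   { if (substr($1, 5) != sect) bad = 1; sect = ""; next }
        END       { if (bad || sect != "") exit 1 }
    ' "$1"
}

# Write a shortened version of the sample select.tau and check it.
sample=$(mktemp)
cat > "$sample" <<'EOF'
BEGIN_EXCLUDE_LIST
DDOT
DAXPY
END_EXCLUDE_LIST
BEGIN_INSTRUMENT_SECTION
loops file="prim_advance_mod*" routine="PRIM_ADVANCE_MOD::PREQ_ADVANCE_EXP"
END_INSTRUMENT_SECTION
EOF

if check_select_file "$sample"; then
    echo "select file sections are balanced"
else
    echo "select file has unbalanced sections" >&2
fi
rm -f "$sample"
```

The check says nothing about whether the routine names exist in the code; it only catches structural slips before a long recompile.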
Load Imbalance Is Resolved
• The choice of the serial algorithm created the load imbalance
[Figure: load imbalance in the original code]
UNRES Start-up Time Is Improved
• Optimizing the start-up routines reduced MPI_Bcast time by a factor of about 4.9 (262.50 vs. 54.01 sec)
[Figure: profiles of the original code and the optimized code]
Tracing
• Captures run-time events
• Timestamp, process, thread, and event type are recorded
• Entry/exit of functions for each process/thread
• MPI sender, receiver, length, tag, communicator
• Tracing preserves the context
• Temporal and spatial relationships
• Traces can become very large
• May cause perturbation
TAU Tracing and Vampir Visualization
To generate TAU trace files:
• Instrument the code with TAU normally
• setenv TAU_TRACE 1
• Run normally to generate the *.trc and *.edf files
• % tau_treemerge.pl (merges them into tau.trc and tau.edf)
• % tau2otf tau.trc tau.edf app.otf
Trace visualization and analysis:
• % vampir app.otf (or vng client with vngd server)
• Vampir is available on bigred and quarry at IU
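The trace steps above can be chained in a small sh script. This is a sketch under the assumption that the TAU tools are on PATH after the module is loaded; missing_tools is a hypothetical helper, and the commands themselves are the ones listed on the slide.

```shell
#!/bin/sh
# Hypothetical helper: print the names of any tools not found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

export TAU_TRACE=1        # sh/bash form of "setenv TAU_TRACE 1"
# Run the instrumented binary here, as you normally would.

missing=$(missing_tools tau_treemerge.pl tau2otf)
if [ -z "$missing" ]; then
    # Merge the per-process traces, then convert them to OTF for Vampir.
    tau_treemerge.pl && tau2otf tau.trc tau.edf app.otf
else
    echo "not merging traces; missing tools:" $missing >&2
fi
```

Checking for the tools first avoids leaving a partially merged trace behind when only one of them is available.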
UNRES Trace in Timeline View
• Intuitive navigation and zooming help to quickly identify inefficient or faulty parts of a code
[Figure: timeline view; labeled regions: MPI Messages, Thumbnail]
The EGB calculation in processes 9–31 starts later than in processes 0–8
References
• TAU
- http://tau.uoregon.edu
- http://www.cs.uoregon.edu/research/tau/docs.php
- http://www.cs.uoregon.edu/research/tau/docs/newguide/re01.html
- http://www.cs.uoregon.edu/research/tau/docs/newguide/bk03apa.html#d0e15219
- http://www.cs.uoregon.edu/research/tau/docs/scenario/index.html
• POINT: Productivity from Open, Integrated Tools
- http://www.nic.uoregon.edu/point
- http://www.psc.edu/general/software/packages/tau/TAU-quickref.pdf
• IU’s Vampir documentation
- http://www.pti.iu.edu/hpa/vampir-workshop