Profiling with Xprofiler on Blue Horizon
Yifeng Cui and Laura C. Carrington, yfcui@sdsc.edu
San Diego Supercomputer Center
Overview
• Introduction
• How to compile for Xprofiler
• How to run Xprofiler on Blue Horizon
• Using Xprofiler
• Available documentation
Introduction: Profiling Tools
• prof: standard Unix tool; displays a flat profile giving the percentage of CPU time used by each procedure
• gprof: standard Unix tool; reports everything prof does, plus call graph timing information for each procedure and its callers and callees
• tprof: uses the AIX trace facility to interrupt your program at each tick of the CPU clock and build a trace table; not part of basic AIX
• Xprofiler: AIX tool; gives access to the profile data through a graphical user interface
• PE Benchmarker: part of AIX 5L
Introduction: Profiler Instrumentation
• These profiling tools all depend on timing statements that the compiler injects throughout the code (via the -pg flag)
• The timing routines slow down your code and change the proportion of time it spends in various functions and loops
• Profiler instrumentation can slow your code significantly, perhaps by 50% or more
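One way to gauge the slowdown on your own code is to time the same run built with and without -pg and compare the two. A minimal sketch, assuming the wall-clock times have already been measured; `overhead_pct` is a hypothetical helper and the timings are made up for illustration:

```shell
# Hypothetical helper: percentage slowdown of a profiled run relative to
# an unprofiled run. Both arguments are wall-clock times in seconds.
overhead_pct() {
    awk -v plain="$1" -v prof="$2" \
        'BEGIN { printf "%.1f\n", (prof - plain) / plain * 100 }'
}

# Made-up timings: 120 s without -pg, 180 s with -pg.
overhead_pct 120 180    # prints 50.0
```

An overhead this large is a reminder to read the profile for relative hot spots rather than for absolute run times.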
Introduction: Xprofiler
• IBM performance profiler with a Motif/X Window style interface
• Graphical interface for gprof profiles
• Works for both sequential and parallel programs
• Part of the IBM Parallel Environment
• Compile and link as for gprof
Introduction: Xprofiler
• Xprofiler provides:
  • a graphical function call tree display and textual profile reports for understanding CPU usage and function call counts
  • a summary of the activity of all threads
• Xprofiler does not provide:
  • other kinds of information, such as CPU idle time, I/O, or communication
  • information about specific threads in a multi-threaded program
Introduction: Xprofiler
• Xprofiler’s graphical displays include:
  • Summary or average graphs
  • 2D or 3D graphs
  • Filtered graphs, including/excluding:
    • By function names
    • By CPU time
    • By call counts
    • …
Introduction: Xprofiler
• Xprofiler’s textual reports include:
  • Flat profile
  • Call graph profile
  • Function index
  • Function call summary
  • Library statistics
How to Compile for Xprofiler
Just add the “-g -pg” flags to the compiler command:
    mpxlf90 -g -pg stf_01.f
    mpcc -g -pg stc_01.c
• “-pg” produces an executable with profiling tags built in
• “-g” gives a breakdown of ticks per line of code
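The compile lines above can be wrapped in a small script so the profiling flags are never forgotten. A minimal sketch: `profile_compile_cmd` is a hypothetical helper that only prints the command (a dry run), and the mapping from file extension to compiler is an assumption based on the two examples on this slide:

```shell
# Hypothetical helper: print (do not run) the profiling compile line for
# a source file, choosing the compiler from the file extension.
profile_compile_cmd() {
    case "$1" in
        *.f) echo "mpxlf90 -g -pg $1" ;;   # Fortran example from the slide
        *.c) echo "mpcc -g -pg $1" ;;      # C example from the slide
        *)   echo "unsupported source file: $1" >&2; return 1 ;;
    esac
}

profile_compile_cmd stf_01.f
profile_compile_cmd stc_01.c
```

To actually compile on a system that has these compilers, the printed line could be run directly or passed to `sh -c`.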
How to Run Xprofiler
• Run the code the way you normally do, either interactively or in batch.
• After execution completes, one “gmon.out” file is generated for each processor involved in the run.
• Launch Xprofiler with the executable and the profile files:
    % xprofiler ./a.out gmon.out.*
  or start it without arguments and specify the files from within the GUI:
    % xprofiler
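Before launching the GUI it can help to confirm that the run actually left profile files behind. A minimal sketch: `xprofiler_launch_cmd` is a hypothetical helper that counts the gmon.out files in a directory and prints the launch line rather than starting Xprofiler; the default names `.` and `a.out` are assumptions:

```shell
# Hypothetical helper: check for gmon.out files and print the launch line.
# $1 = directory holding the profile files (default: current directory)
# $2 = executable name (default: a.out)
xprofiler_launch_cmd() {
    dir="${1:-.}"
    exe="${2:-a.out}"
    count=0
    for f in "$dir"/gmon.out*; do
        # The glob stays unexpanded when nothing matches, so test each entry.
        if [ -f "$f" ]; then
            count=$((count + 1))
        fi
    done
    if [ "$count" -eq 0 ]; then
        echo "no gmon.out files in $dir; was the code built with -pg?" >&2
        return 1
    fi
    echo "xprofiler $dir/$exe $dir/gmon.out*"
}
```

Called as `xprofiler_launch_cmd . a.out` after a run, it prints the command to paste at the prompt, or an error if no profile data was written.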
Using Xprofiler
Start Xprofiler:
    xprofiler a.out gmon.out*
• The “View” menu lets you zoom in and out
• The “Filter” menu lets you filter the main view
• The “Report” menu gives you reports on the performance of different routines in the code
Using Xprofiler: Loading Files
The Load Files dialog window lets you specify the application’s executable and the corresponding profile data files (gmon.out).
Using Xprofiler: Filtering Main View
Use the filter to show just the executable’s call tree.
Functions are represented by solid green boxes; in summary mode, the size of a box is related to the CPU time spent in the routine:
• Height: amount of CPU time the function spent executing itself
• Width: amount of CPU time the function spent executing itself plus its descendant functions
Using Xprofiler: Uncluster Functions
To see more detail, uncluster the function boxes.
Blue arrows represent the calls made between functions; the arrowhead indicates the direction of the call.
Using Xprofiler: Searching for a Routine
To find a particular routine, use the “Utility” menu to search the tree.
The routine will change color so that you can locate it.
Using Xprofiler: Hidden Menus
• Function menu
  • Perform a number of operations on any function shown in the function call tree: access its statistical data, look at its source code, and control which functions are displayed
• Arc menu
  • Locate the caller and callee functions for a particular call arc (a call arc represents a call between two functions within the function call tree)
• Cluster node menu
  • Control the way your libraries are displayed
Using Xprofiler: Viewing the Flat Profile
Use the “Report” menu to display the “Flat Profile”, which shows total execution times and call counts. The report fields are:
• Percentage of CPU usage for this function
• Running sum of the number of seconds used by this function and those listed above it
• Number of seconds used by this function alone
• Number of times the function is called
• Average number of ms spent in the function per call
• Average number of ms spent in the function and its descendants per call
The data presented in the Flat Profile is the same data generated by gprof.
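The relationship between the flat profile fields can be worked through by hand. A minimal sketch: `flat_profile` is a hypothetical helper, not part of Xprofiler or gprof, and the three functions and their timings are made up. It reads "name self-seconds calls" lines, already sorted by decreasing self time, and derives percent time, cumulative seconds, and self ms per call:

```shell
# Hypothetical helper: derive flat-profile style fields from
# "name self_seconds calls" lines read on standard input.
flat_profile() {
    awk '
        { name[NR] = $1; self[NR] = $2; calls[NR] = $3; total += $2 }
        END {
            for (i = 1; i <= NR; i++) {
                cum += self[i]                       # running sum of self seconds
                pct = self[i] / total * 100          # share of total CPU time
                ms = (calls[i] > 0) ? self[i] * 1000 / calls[i] : 0
                printf "%-8s %5.1f %6.2f %6.2f %5d %8.2f\n", name[i], pct, cum, self[i], calls[i], ms
            }
        }'
}

# Made-up data: columns printed are name, %time, cumulative s, self s, calls, self ms/call.
printf '%s\n' 'solve 6.00 200' 'update 3.00 1000' 'init 1.00 1' | flat_profile
```

With these numbers, `update` accounts for 30.0% of the time, the cumulative column reaches 9.00 s at its row, and each call costs 3.00 ms on average.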
Using Xprofiler: Call Graph Profile
Use the “Report” menu to display the “Call Graph Profile”, which shows the percentage of total CPU usage that each function and its descendants consumed. The report fields are:
• Name of the function and its index number. Each function has an associated index number, which serves as the function's identifier.
• Percentage of the total CPU usage of the program used by this function and its descendants.
• Number of seconds this function spends on itself.
• Number of seconds spent on the descendants of this function, on behalf of this function.
• Call count. Depending on whether the function is a parent, a child, or the function of interest (the function whose index is listed in the index field of this row), this value can stand for one of the following:
  • Number of times a parent called the function of interest
  • Number of times the function of interest called itself recursively
  • Number of times the function of interest called a child
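The percentage field in the call graph combines a function's own time with the time of its descendants. A minimal sketch of that arithmetic: `callgraph_pct` is a hypothetical helper and the timings are made up for illustration:

```shell
# Hypothetical helper: percentage of total CPU time charged to a function
# together with its descendants, as in a call graph row.
# $1 = self seconds, $2 = descendant seconds, $3 = total program seconds
callgraph_pct() {
    awk -v self="$1" -v desc="$2" -v total="$3" \
        'BEGIN { printf "%.1f\n", (self + desc) / total * 100 }'
}

# Made-up row: 1 s in the function itself, 8 s in its descendants,
# out of 10 s for the whole program.
callgraph_pct 1.00 8.00 10.00    # prints 90.0
```

A function like this one is cheap on its own but sits above most of the work, which is exactly what the flat profile alone would not show.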
Using Xprofiler: Function Report
Use the right mouse button over a function to select “Statistics Report” for that function.
The Statistics Report gives information on CPU usage, call counts, time per call, and other statistics about the function.
Using Xprofiler: Viewing Source Code
To display source code, use the right mouse button over a function and select “Show Source Code” for that function, or go via Report menu -> Flat Profile -> Code Display.
The Source Code display gives a rough estimate of the clock ticks spent per line of source code, provided the executable was compiled with -g -pg.
Using Xprofiler: Tips
• Ideally, put gmon.out, the executable, and the source in the same directory
• Libraries must match across systems
• Use gmon.out.* to get the complete picture
• Uncluster functions to obtain a clear overview of the call tree
• If xprofiler fails with a “bad font” error message:
  • include the following line in your $HOME/.Xdefaults:
      *narc*font: -*-roman-medium-r-normal-*-0-0-100-100-*-*-*-*
    (note: make sure the fonts are scalable, with the format -*-0-0-)
  • then load it:
      xrdb -load $HOME/.Xdefaults
Documentation
• NPACI Blue Horizon documentation
  http://www.npaci.edu/BlueHorizon
• IBM Parallel Environment for AIX: Operation and Use, vol. 2 (see chapter 4: Profiling parallel programs with Xprofiler)
  http://www.rs6000.ibm.com/doc_link/en_US/a_doc_lib/sp32/pe/html/d3d31mst.html
• IBM Parallel Environment for AIX (PE)
  http://www.rs6000.ibm.com/support/sp/resctr/public/pebooks.html#hhguide
Lab Session for Xprofiler: Environment Setup
Setup for running X Window applications on PCs:
1. Log in to b80login.sdsc.edu using CRT (located in Applications common) for interactive jobs, or to horizon.npaci.edu for batch jobs.
2. Launch Exceed (located either in Applications common or as a shortcut on your desktop called "Humming Bird").
3. Set your environment. For csh:
    setenv DISPLAY t-wolf.sdsc.edu:0.0
   where "t-wolf" is the name of the PC you are using.
4. Copy the files from the Tools_examples directory into your own working space:
   * create a directory to work with TotalView and Xprofiler: mkdir Tools
   * change into the new directory: cd Tools
   * copy the files into the new directory: cp /work/Training/Tools_examples/* .
NOTE: On a 2-button mouse, the center mouse button is emulated by clicking the right and left buttons together.
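Step 3 gives the csh form only. For users whose login shell is sh, ksh, or bash, the equivalent is sketched below; this is an assumption about your shell, and "t-wolf" is still just the example PC name from the step above:

```shell
# sh/ksh/bash equivalent of the csh line "setenv DISPLAY t-wolf.sdsc.edu:0.0".
# Replace "t-wolf" with the name of the PC you are sitting at.
DISPLAY=t-wolf.sdsc.edu:0.0
export DISPLAY

# Confirm the setting before starting any X application.
echo "DISPLAY is set to $DISPLAY"
```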
Lab Session for Xprofiler: Running Xprofiler
1. Compile either the Fortran or the C example (st_01) with one of the following:
    mpxlf90 -g -pg stf_01.f
    mpcc -g -pg -lm stc_01.c
2. Run the executable either interactively or in batch.
   Interactive command on b80 nodes:
    poe a.out -nodes 1 -tasks_per_node 2 -rmpool 1 -euilib ip -euidevice en0
   or run Xprofiler_runme batch on regular nodes (first edit llscript for your values):
    llsubmit llscript
3. Launch Xprofiler:
    xprofiler a.out gmon.out*
4. Explore Xprofiler and the code's performance by changing the view (Filter menu), looking at the different profiles (Report menu), etc.