Paradyn Goals • Performance measurement tool that • scales to long-running programs on large parallel and distributed systems • automates much of the search for performance bottlenecks • avoids space and time overhead of trace-based tools
Paradyn Approach • Dynamically instrument application • Automatically control instrumentation in search of performance problems • Look for high level problems (e.g., too much synchronization blocking, I/O blocking, or memory delays) using small amount of instrumentation • Once general problem is found, selectively insert more instrumentation to find specific causes
Paradyn Components • Front end and user interface that allow user to • display performance visualization • use the Performance Consultant to find bottlenecks • start and stop the application • monitor status of the application • Paradyn daemons • monitor and instrument application processes • pvmd, mpid, winntd
Using Paradyn • Program preparation: • Current release 2.0 requires linking applications with the Paradyn instrumentation libraries; future releases will be able to instrument unmodified binaries • Static linking is required on IBM AIX platforms • The application must be compiled with the -g flag
Paradyn Run-time Analysis • Paradyn is designed to either start up application processes and kill them upon exit, or to attach to and detach from running (or stopped) processes. • Attaching to a running process is currently implemented on Solaris. • Paradyn currently does not detach but only kills upon exit.
Metric-Focus Pairs • Metric-focus grid based on two vectors • list of performance metrics (e.g., CPU time, blocking time, message rates, I/O rates) • list of program components (e.g., procedures, processes, message channels, barrier instances) • Cross product forms matrix from which user selects metric-focus pairs • Elements of matrix can be single-valued (e.g., current value, average, min, max) or time-histograms • Time-histogram is a fixed size data structure that records behavior of a metric over time
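One way a fixed-size time-histogram can cover an execution of unbounded length is to fold adjacent buckets and double the bucket width whenever the buckets fill. The C++ sketch below illustrates that idea; all names (TimeHistogram, addSample, the bucket count and initial width) are hypothetical, not Paradyn's actual implementation:

#include <array>
#include <cstddef>

// Hypothetical fixed-size time-histogram. When all buckets are full,
// adjacent buckets are averaged pairwise and the bucket width doubles,
// so space stays constant no matter how long the metric is recorded.
class TimeHistogram {
    static const std::size_t kBuckets = 10;
    std::array<double, kBuckets> value{};  // metric value per time bucket
    double bucketWidth = 0.1;              // seconds covered by each bucket
    std::size_t used = 0;                  // buckets filled so far

public:
    void addSample(double metricValue) {
        if (used == kBuckets) {            // out of space: fold the buckets
            for (std::size_t i = 0; i < kBuckets / 2; ++i)
                value[i] = (value[2 * i] + value[2 * i + 1]) / 2.0;
            used = kBuckets / 2;
            bucketWidth *= 2.0;            // each bucket now spans twice the time
        }
        value[used++] = metricValue;
    }
    double width() const { return bucketWidth; }
};

As the run lengthens, early samples are averaged into coarser buckets, which is what lets a long execution be displayed at a bounded space cost.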
“Where” Axis • After loading program, Paradyn adds entries for program resources to Where Axis window • files • procedures • processes • machines
Performance Visualizations • Before or while running a program, the user can define performance visualizations in terms of metric-focus pairs • select focus from Where Axis • select metrics from Metrics Dialog Box • select visualization from Start Visualization Menu
Paradyn Phases • Contiguous time intervals within an application’s execution • Two kinds • global phase: starts at the beginning of program execution and extends to the current time • local phases: non-overlapping subintervals of the global phase
Paradyn Phases (cont.) • Data collection for new phase occurs at finer granularity than for global phase. • Visualizations can show data for either local phase or global phase. • Performance Consultant can simultaneously search both local phase and global phase.
Performance Consultant • Based on the W3 Search Model • “Why” - what type of performance problem occurs • “Where” - where in the program the problem occurs • “When” - when during execution the problem occurs
Performance Consultant (cont.) • Automatically locates potential bottlenecks in your application • Contains definitions of a set of performance problems in terms of hypotheses - e.g., PerfMetricX > Specified Threshold • Continually selects and refines which performance metrics are enabled and for which foci • Reports bottlenecks that exist for significant portion of phase being measured
Why Axis • Hypothesis hierarchy rooted at TopLevelHypothesis, with refinements: • ExcessiveSyncWaitingTime • CPUBound • ExcessiveIOBlockingTime • TooManySmallIOOps
Why Axis (cont.) • CPUBound: Compares CPU time to the tunable constant PC_CPUThreshold • ExcessiveSyncWaitingTime: Compares total synchronization waiting time to the tunable constant PC_SyncThreshold • ExcessiveIOBlockingTime: Compares total I/O waiting time to the tunable constant PC_IOThreshold • TooManySmallIOOps: Compares the average number of bytes per I/O operation to PC_IOThreshold
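Each of these hypotheses boils down to comparing an observed metric against its tunable constant. A minimal C++ sketch of that test (struct layout, default values, and function names are illustrative only, not Paradyn's internal interfaces):

// Hypothetical tunable constants (default values invented for illustration).
struct TunableConstants {
    double PC_CPUThreshold  = 0.20;  // fraction of phase spent on the CPU
    double PC_SyncThreshold = 0.20;  // fraction spent blocked on synchronization
    double PC_IOThreshold   = 0.25;  // fraction spent blocked on I/O
};

// Each hypothesis compares an observed metric, normalized to the phase
// being measured, against its threshold; a true result triggers refinement.
bool cpuBound(double cpuFraction, const TunableConstants &tc) {
    return cpuFraction > tc.PC_CPUThreshold;
}
bool excessiveSyncWaitingTime(double syncFraction, const TunableConstants &tc) {
    return syncFraction > tc.PC_SyncThreshold;
}
bool excessiveIOBlockingTime(double ioFraction, const TunableConstants &tc) {
    return ioFraction > tc.PC_IOThreshold;
}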
Search History Graph • DAG with (hypothesis : focus) pairs as nodes • Top node represents (TopLevelHypothesis : WholeProgram) • Child nodes represent possible refinements • Search is expanded any time a (hypothesis : focus) pair tests true
Search History Graph (cont.) • Node status given by color • green background indicates Unknown status • white foreground indicates active test • pink background indicates hypothesis tested false • blue background indicates hypothesis tested true • yellow line represents Why Axis refinement • purple line represents Where Axis refinement
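The expansion step can be pictured as a small recursive walk over (hypothesis : focus) nodes. In this hedged C++ sketch, the test and refinement functions are stubs, since the real machinery lives inside the Performance Consultant:

#include <string>
#include <vector>

// A Search History Graph node pairs a hypothesis with a focus.
struct Node {
    std::string hypothesis;   // e.g., "ExcessiveSyncWaitingTime"
    std::string focus;        // e.g., "WholeProgram" or one procedure
    std::vector<Node> children;
};

// Stand-ins for the real machinery: the test compares instrumented data
// for (hypothesis, focus) against its tunable threshold, and the
// refinement functions walk the Why and Where axes.
bool testHypothesis(const Node &) { return false; }
std::vector<std::string> whyRefinements(const std::string &) { return {}; }
std::vector<std::string> whereRefinements(const std::string &) { return {}; }

// Expand a node only when its hypothesis tests true: add one child per
// Why Axis refinement and one per Where Axis refinement, then recurse.
void expand(Node &n) {
    if (!testHypothesis(n)) return;                 // false: prune this branch
    for (const std::string &h : whyRefinements(n.hypothesis))
        n.children.push_back(Node{h, n.focus, {}});
    for (const std::string &f : whereRefinements(n.focus))
        n.children.push_back(Node{n.hypothesis, f, {}});
    for (Node &child : n.children)
        expand(child);
}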
Tunable Constants • PC_CPUThreshold: used for hypothesis CPUBound • PC_SyncThreshold: used for hypothesis ExcessiveSyncWaitingTime • PC_IOThreshold: used for hypothesis ExcessiveIOBlockingTime • MinObservationTime: all tests will be continued for at least this interval of time before any conclusions are drawn. • costLimit: determines an upper bound on the total amount of instrumentation that can be active at a given time.
Visualization Modules (visi’s) • External processes that use VisiLib RPC interface to access performance data in real time • Visi’s provided with Paradyn • time-histogram • bar chart • table • 3-d terrain
Dyninst API • http://www.cs.umd.edu/~hollings/dyninstAPI • Machine-independent interface for runtime program instrumentation • Insertion and removal of instrumentation code into and from running processes • Process and OS independent specification of instrumentation code • C++ library interface • Can be used to build debuggers, performance measurement tools, simulators, and computation steering systems
Dyninst API (cont.) • Currently supported platforms • SPARC SunOS and Solaris • x86 Solaris and NT • IBM AIX/SP • DEC Alpha • Planned for near future • SGI Origin 2000
Dyninst Terminology • point - location in a program where instrumentation can be inserted • snippet - representation of a bit of executable code to be inserted into a program at a point • e.g., To record number of times a procedure is invoked: • point - first instruction in the procedure • snippet - statement to increment a counter
Dyninst Terminology (cont.) • thread - thread of execution, which may be a normal process or a lightweight thread • image - static representation of a program on disk • application - process being modified • mutator - program that uses the API to modify the application
Using the Dyninst API • Declare a single object of class BPatch • Identify the application process to be created or modified • appThread = bpatch.createProcess(pathname, argv); • appThread = bpatch.attachProcess(pathname, processId); • Define a snippet and the points where it should be inserted
Dyninst Example

BPatch_image *appImage;
BPatch_Vector<BPatch_point *> *points;

// Open the program image associated with the thread and return a handle to it.
appImage = appThread->getImage();

// Find and return the entry point of "InterestingProcedure".
points = appImage->findProcedurePoint("InterestingProcedure", BPatch_entry);

// Create a counter variable (but first get a handle to the correct type).
BPatch_variableExpr *intCounter = appThread->malloc(*appImage->findType("int"));

// Create a code block to increment the integer by one:
//     intCounter = intCounter + 1
BPatch_arithExpr addOne(BPatch_assign, *intCounter,
    BPatch_arithExpr(BPatch_plus, *intCounter, BPatch_constExpr(1)));

// Insert the snippet of code into the application.
appThread->insertSnippet(addOne, *points);
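Taken together, this is the invocation-counting example from the terminology slide: the point is the entry of InterestingProcedure, and the snippet is the counter increment inserted there.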
DAIS • Dynamic Application Instrumentation System • Proposed by Douglas Pase at IBM • Platform-independent client-server library for building debugging and performance tools • Based on dyninst
DAIS (cont.) • Support proposed for • code patches • periodic instrumentation • inferior remote procedure calls (IRPCs) • remote memory reads and writes • dynamic subroutine placement • process control for debugging • Planned demo tools • dynamic printf • trace capture for MPI