Using Dyninst to Dynamically Control Memory Reference Tracing

Jeff Odom

Sigma Goals
• Collaboration between IBM and UMD
• Family of tools to understand caches
• Focus on detailed statistics
• Complement existing hardware counters
• Ability to handle real applications
• MPI and OpenMP programs
• Fortran and C
• Provide hints about restructuring
• Padding (both inter- and intra-data-structure)
• Blocking

Approach
• Run instrumented program
• Capture full information about memory use
• Produce compact trace
• Extract loops and memory strides
• Post-execution tools
• Detailed simulator
• Full discrete event simulator
• Memory profiler
• Share of accesses due to each data structure
• Cache Prediction Tool
• Predict cache misses using symbolic equations
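To make the memory profiler bullet concrete, here is a minimal sketch of computing the share of accesses due to each data structure. It is not Sigma's implementation: the function name `access_shares`, the `(name, start, size)` tuple layout, and the `<unknown>` bucket are assumptions made for illustration, and the address ranges are assumed not to overlap.

```python
from bisect import bisect_right
from collections import Counter


def access_shares(structures, addresses):
    """Return {name: fraction of accesses} for a traced address list.

    structures: list of (name, start, size) tuples with non-overlapping
                address ranges (hypothetical layout, not Sigma's aux files).
    addresses:  iterable of accessed addresses taken from a trace.
    """
    # Sort ranges by start address so each lookup is a binary search.
    ranges = sorted((start, start + size, name) for name, start, size in structures)
    starts = [r[0] for r in ranges]

    counts = Counter()
    total = 0
    for addr in addresses:
        total += 1
        i = bisect_right(starts, addr) - 1  # last range starting at or before addr
        if i >= 0 and addr < ranges[i][1]:
            counts[ranges[i][2]] += 1
        else:
            counts["<unknown>"] += 1  # address outside every known structure

    return {name: n / total for name, n in counts.items()} if total else {}


if __name__ == "__main__":
    structs = [("a", 0x1000, 0x100), ("b", 0x2000, 0x100)]
    print(access_shares(structs, [0x1000, 0x1004, 0x2000, 0x3000]))
```

The same per-structure bucketing could be applied to simulated cache misses instead of raw accesses.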

Representing Program Execution
• Capture full execution behavior
• Record all basic blocks and memory addresses
• Produces large traces (due to looping)
• Trace compression
• Maintain pattern buffer
• Scan for repeating patterns
• Extract memory strides
• Repeat algorithms for nested loops
[Figure: compressed trace layout. An RPT record groups basic-block entries (BLK1, BLK2, BLK3) with their address entries (ADR); the labeled fields are Base, Count, Length, and Stride.]
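The stride-extraction step can be sketched as follows. This is an illustrative, greedy single-level version, not Sigma's algorithm: it collapses runs of addresses with a constant stride into `(base, count, stride)` records and ignores the pattern buffer, the Length field, and nested-loop handling. The names `compress_strides` and `expand` are hypothetical.

```python
def compress_strides(addresses):
    """Collapse constant-stride runs into (base, count, stride) records.

    Greedy, single pass: each run starts at the first address not yet
    covered and extends while the difference between neighbors is constant.
    """
    records = []
    i, n = 0, len(addresses)
    while i < n:
        base = addresses[i]
        if i + 1 == n:
            records.append((base, 1, 0))  # lone trailing address
            break
        stride = addresses[i + 1] - base
        count = 2
        while i + count < n and addresses[i + count] - addresses[i + count - 1] == stride:
            count += 1
        records.append((base, count, stride))
        i += count
    return records


def expand(records):
    """Rebuild the original address sequence from the records."""
    return [base + k * stride for base, count, stride in records for k in range(count)]


if __name__ == "__main__":
    trace = [100, 104, 108, 112, 500, 508, 516]
    recs = compress_strides(trace)
    print(recs)
    print(expand(recs) == trace)
```

Regular accesses collapse to a handful of records, while irregular accesses degrade to roughly one record per two addresses, which is why the regularity of the data matters so much to compression.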

Not Enough
• A few seconds of execution generates gigabytes of trace
• Regularity of data critical to compression
• Lossy tracing
• Statistically “rebuild” trace from sampled set

Sampling
• Leverages Sigma
• Most scientific apps are loop based
• Regular data access gives better compression
• Time step boundary
• Outermost loop
• Non-uniform memory access OK
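As a rough illustration of sampling at time step boundaries, the sketch below picks which iterations of the outermost loop to trace and scales the sampled counts back up to a whole-run estimate. The function names, the fixed sampling period, and the linear scaling are assumptions for illustration; the slides do not say how Sigma rebuilds the full trace.

```python
def sampled_steps(num_steps, rate):
    """Return the time step indices to trace so about `rate` of steps are kept."""
    period = max(1, round(1 / rate))  # e.g. rate 0.01 -> every 100th step
    return [s for s in range(num_steps) if s % period == 0]


def estimate_total(per_step_counts, steps):
    """Scale counts observed in the sampled steps up to the whole run.

    per_step_counts: an event count (for example cache misses) per time step;
                     only the entries listed in `steps` are read.
    """
    sampled = [per_step_counts[s] for s in steps]
    return sum(sampled) * len(per_step_counts) / len(sampled)


if __name__ == "__main__":
    steps = sampled_steps(1000, 0.01)
    print(len(steps))                          # number of traced time steps
    print(estimate_total([5] * 1000, steps))   # estimate from the sample
```

Linear scaling like this only works when the traced steps are representative of the rest, which is the risk with variable time steps or with sampling a single step.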

Sigma + Dyninst
• Dyninst natural choice
• Vary sample rate without recompilation
• Adaptive/progressive rate during execution
• Leverage existing Sigma infrastructure
• Only generate trace
• Offline simulation step unchanged

DynSigma
• Mutator parses executable, inserts instrumentation, generates aux files
• Instructions/module
• Stack/global variables
• Functions/line #
• Group points by basic block (NEW)
• Find load/store instrumentation via BPatch_basicBlock::findPoint()
• Mutatee generates trace
• Inserted Sigma library

Sample Application
• Seismic simulation from SPEC-HPC 2002
• Models multiple seismic processes
• Process results pipelined
• Variable time steps
• Different data pattern for each process
• C & Fortran
• Fortran: data processing
• C: dynamic memory management, I/O

L1 cache memtime by data structure

L2 cache memtime by data structure

L1 + L2 memtime by data structure

Why go to all the trouble?
• How about just one time step?

Size does matter
• Includes 0:12 mutator overhead

Conclusions
• Compressed traces may be very large for short runtimes
• Sampling single time step no good
• Concentrate on main processing loop
• Small (1%) samples accurate enough

Ongoing & Future Work
• Measure another application
• Determining time steps at runtime
• Extending code coverage with counters
• Adaptive sampling rates
• Multi-pass memory profiling
• Irregular accesses
• Sampling
• Multithreaded applications
