160 likes | 412 Views
PinPlay : A Framework for Deterministic Replay and Reproducible Analysis of Parallel Programs. 2010.10.22 2010 Fall OSLab . Seminar. Rest of the Talk. Non-determinism PinPlay Overview Usage Examples Results Summary. Non-Determinism. Program execution is not repeatable across runs
E N D
PinPlay: A Framework for Deterministic Replay and Reproducible Analysis of Parallel Programs 2010.10.22 2010 Fall OSLab. Seminar
Rest of the Talk • Non-determinism • PinPlay Overview • Usage Examples • Results • Summary
Non-Determinism • Program execution is not repeatable across runs • Interactions with environment (single-threaded) • Shared-memory interleaving (multi-threaded) • Source of many problems • Hard to predict and test behaviors leads to bugs • Very hardand unpleasant to debug • Breaks program analysesthat rely on repeatability • Obstacle for adoption of parallel programming
Dealing with Non-Determinism • Eliminate it • Deterministic program execution enforced by runtime (e.g. constrained execution • Deterministic Replay • Let it be butcapture and reproduce execution if needed • Every instruction gets same input as in original run • This paper: User-level Deterministic Replay • Implementation, challenges and usage examples
Requirements • No OS or hardware changes • No changes in user environment • Manageable log sizes for long runs • Reasonable run-time overhead • Multi-threaded and multi-processed applications • Integration with other existing analysis tools (e.g. Dynamic analyzers, debuggers, profilers) • No assumptions about synchronization APIs
PinPlay User-level deterministic replay and analysis Logs (pinballs) Binary + Input PinPlay Normal Program Output + capture OS (Linux® or Windows®) • Run in application’s native environment • Replays user code • OS independent: cross-OS replay! • Easily integrates with other tools and debuggers Analysis Tools Logs (pinballs) + PinPlay replay Debuggers OS (Linux® or Windows®)
Replay Models • Parallel-capture and parallel-replay T0 T2 T1 T0 T2 T1 T0 T2 T1 Logs (pinballs) PinPlay PinPlay • Parallel-capture and isolated-replay T0 PinPlay Logs (pinballs) Logs (pinballs) PinPlay PinPlay T1 Logs (pinballs) PinPlay T2
Information Captured For Replay All memory Values • Code executed (user and libraries) • Position of code and stack • Output of some instructions (e.g. RDTSC) • Subset of shared-memory access interleaving • Subset of Memory Values • Shadow-memory to capture first reads without prior writes and OS side-effects automatically • Values changed by remote threads • Initial registers and OS register side-effects: • Signals/Exceptions/APCs/system calls Reads without prior writes OS side-effects used by app Values from remote threads All other values (not captured)
PinPlay Architecture User Land Application code and data Capable of logging, replaying and reloggingexecution (recapture from a replaying run) pinball Your Pin-based Tool PinPlay Lib Replayer Logger Instrumentation and analysis to capture logs Instrumentation and analysis to inject side-effects Intel’s Pin (JIT compiler and instrumentor) * OS (Linux® or Windows®) * http://www.pintool.org/
Cross-OS Replay and Challenges • Log on one OS and replay on another • System call translations • Most OS activity does not happen on replay (only side-effects restored) • Semantics is translated across OSes (e.g. create thread) • Memory mapping • Problem: address space different across OSes • Solution: use Pin’s Fetch API to redirect code and memory operand rewriting to redirect data Remap code code code address space on Windows® address space on Linux® Remap data data data
Usage Example: Program Analysis • Sampling and checkpointing for simulation • One run for profiling and finding representative regions, another for checkpointing • Requirement: both runs must be identical Logs (pinballs) PinPlay + Profiler Logs (pinballs) PinPlay Per-Process pinball Multi-process MPI program Per-Process pinball Checkpoints for simulation PinPlay + Checkpointer Representative Regions • Pinballs are used to share workloads for Pin-based analyses among architects
Usage Example: Replay for Debugging • Capture a buggy run and replay under debugger • Guaranteed to reproduce the bug and helps root causing • Works with off-the-shelf unmodified debuggers (e.g. GDB) • PinPlay based tool extends GDB commands with your own • Limitation: debugger can’t change control-flow • Used to debug various multi-threaded applications • Also using it for in-house debugging of concurrency issues with a major database vendor PinPlay Enabled Debugger Tool Logs (pinballs) GDB (unmodified) Binary remote protocol Intel’s Pin
Sources of Slowdown • Instrumentation of every memory operation to identify system call side-effects and log data • Could be done by OS at the cost of OS modification or OS-specific analysis (doesn’t work on Windows®) • Locks for shadow-memory accesses • Could be eliminated by using a shadow-copy per thread at the cost of significant increase in log sizes
Summary • User-level deterministic capture and replay • No OS changes, special hardware, or virtualization • Integrates w/ other Pin-tools for repeatable analysis and debugging • Replay occurs on any machine and works across OSes (Windows to Linux) • Pinballs are OS-independent and self-contained • Ideal for sharing workloads among researchers, for Pin-based analyses
Applying to us… • monitoring • thread operations • thread start/end • synchronization operation • lock acquire/release • read/write access to shared data • instruction level memory read/write