Explore a system for comprehensive code instrumentation without traps, minimizing jumps. Enhance efficiency by relocating functions and utilizing multitramps for whole-program coverage.
Generalized Code Relocation for Instrumentation and Efficiency
Andrew R. Bernat
University of Wisconsin
bernat@cs.wisc.edu
Generalized Code Relocation

Design Objectives
• Whole-program instrumentation
  • Instrument every instruction in the program
  • … and all control flow edges as well
• Efficient instrumentation
  • No traps!
  • Minimize extraneous jumps
  • Restrict register save/restores
• Flexible, extensible instrumentation system
  • Laying the groundwork for binary rewriting

Multitramps
• Whole-program instrumentation
  • All instructions, including neighbors
  • All control flow edges
• One trampoline per basic block
  • Reduces number of extra branches
• Hierarchical code generation
  • Extensible
  • Allows for a variety of optimizations

Function Relocation
• Efficient instrumentation
  • Blocks too small for branch to instrumentation
  • Instrumentation too far away
  • No traps!
• Shared functions
  • Copy to remove sharing
• Function rewriting
  • Undo optimizations

Old Instrumentation Overview
[Figure labels: Application Program; Function foo (instr1, instr2, instr3); Base Trampoline (Save Regs, Restore Regs, instr2, Save Regs, Restore Regs); Mini Trampolines (Instrumentation Code, Instrumentation Code)]

Old Instrumentation - Consecutive
[Figure labels: Application Program; Function foo (instr1, instr2, instr3); Multiple Base Trampolines (instr1, instr2); Mini Trampolines]

Old Instrumentation – Uninstrumentable Neighbors
[Figure labels: Application Program; Function foo (instr1, instr2, instr3); Base Trampoline (instr1, Save Regs, Restore Regs, instr2, Save Regs, Restore Regs, instr3); Mini Trampolines (Instrumentation Code, Instrumentation Code)]

Edge instrumentation
• Instrument edges via another level of indirection (plus extra branches)
[Figure labels: Application Program; Function foo (pre-branch, branch, fallthrough); ‘Edge’ Trampoline (branch taken, jump); Base Trampolines (save/restore, save/restore, save/restore)]

Limitations of Old Instrumentation
• Incomplete instrumentation coverage
  • Often could not instrument “nearby” instructions
• Inefficient instrumentation
  • Edges, consecutive instructions require extra branches
• Platform-specific implementation
  • Inextensible and bug-prone

Multitramp Principles
• Basic-block instrumentation
  • One jump to/from per block
  • Efficient instrumentation of neighbor instructions
• Logical view: a control flow graph
  • Relocated instructions + instrumentation
  • Apply compiler techniques to dynamic instrumentation

Multitramps
[Figure labels: Application Program; Function foo; Basic Block; Multitramp (Base Tramp, Instruction, Instruction, Base Tramp, Branch, Fallthrough, Target)]

Multitramp Implementation
• A multitramp is a tree of code objects
• Code objects provide the following:
  • Maximum space required (worst case)
  • Generate, install, and link callbacks
  • Map relocated to original address
• Single mechanism for both instruction and edge instrumentation

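The tree-of-code-objects idea can be sketched roughly as follows. This is an illustrative model only, not Dyninst's implementation: the class names (CodeObject, RelocatedInsn, BaseTramp), the placeholder byte strings, and the addresses are all assumptions, and only the "generate" callback is modeled (install and link are omitted).

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class CodeObject:
    """Hypothetical node in a multitramp tree (illustrative names only)."""
    name: str
    children: List["CodeObject"] = field(default_factory=list)

    def own_max_size(self) -> int:
        # An interior node contributes no bytes of its own.
        return 0

    def max_size(self) -> int:
        # Worst-case space: own bytes plus every child's worst case.
        return self.own_max_size() + sum(c.max_size() for c in self.children)

    def generate(self, addr: int, addr_map: Dict[int, int]) -> bytes:
        # Emit children in order, each at the next free address.
        out = bytearray()
        for child in self.children:
            out += child.generate(addr + len(out), addr_map)
        return bytes(out)


@dataclass
class RelocatedInsn(CodeObject):
    orig_addr: int = 0
    raw: bytes = b""

    def own_max_size(self) -> int:
        return len(self.raw)

    def generate(self, addr: int, addr_map: Dict[int, int]) -> bytes:
        # Record relocated address -> original address.
        addr_map[addr] = self.orig_addr
        return self.raw


@dataclass
class BaseTramp(CodeObject):
    # Placeholder bytes standing in for save / call-minitramp / restore code.
    stub: bytes = b"SAVE" b"CALL" b"REST"

    def own_max_size(self) -> int:
        return len(self.stub)

    def generate(self, addr: int, addr_map: Dict[int, int]) -> bytes:
        return self.stub


def build_example() -> CodeObject:
    # One instruction with instrumentation before and after it.
    return CodeObject("multitramp", [
        BaseTramp("pre-instruction tramp"),
        RelocatedInsn("insn", orig_addr=0x1000, raw=b"\x90"),
        BaseTramp("post-instruction tramp"),
    ])
```

Because every node answers the same two questions (how much space at most, and what bytes at a given address), edge instrumentation could be added as just another node type, which is one way to read the "single mechanism" bullet.
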
Multitramp Example
    save               ; BT 1
    branch <MT 1>
    restore            ; BT 1
    <relocated instr>
    branch <BT 3>
    save               ; BT 2
    branch <MT 3>
    restore            ; BT 2
    return
    save               ; BT 3
    branch <MT 4>
    restore            ; BT 3
    return
[Figure labels: Base Tramp 1 (Mini Tramp 1, Mini Tramp 2); Instruction; Branch; Base Tramp 2 (Mini Tramp 3); Base Tramp 3 (Mini Tramp 4)]

In-Line Instrumentation
• Current out-of-line model is based on the requirements of Paradyn
  • Frequent insertion/removal of instrumentation
• Limited opportunity for optimization
  • Particularly register saves and restores
• What about long-lived instrumentation?

In-Line Instrumentation
• In-line instrumentation into a single code sequence:
  • Relocated instructions
  • Save/restore code
  • Instrumentation
• Replace entire sequence when something changes!
BPatch::setMergeTramp(true)

Multitramp Status
• Extensible implementation
  • Can add new code objects to multitramp CFG:
    • Raw binary sections
    • Control flow-altering code
• In-line instrumentation
  • POWER, x86-64
• Platform-independent design
  • Encapsulated platform-dependent sections
• Included with all platforms in Dyninst 5.0

Multitramp Results
• Whole-program instrumentation
  • Instrument every instruction in the program
  • … and all control flow edges as well
• Efficient instrumentation
  • No traps!
  • Minimize extraneous jumps
  • Restrict register save/restores
• Flexible, extensible instrumentation system
  • Laying the groundwork for binary rewriting

Function Relocation
• The basic block may be too small to contain a branch to instrumentation
  • IA-32, x86-64
• We may not have the available registers to construct a long branch
  • POWER, SPARC
• Solution: relocate on a function level
  • Sufficient space to fit large branches
  • Dead registers that can be used to branch

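The "block too small" case can be illustrated with a small check. This is a sketch under assumptions: the function name and block names are hypothetical, and the 5-byte threshold is the size of an IA-32 near jump with a 32-bit displacement, matching the "block must be 5 bytes long" example used elsewhere in these slides.

```python
from typing import Dict, List

# Size in bytes of an IA-32 "jmp rel32" (1 opcode byte + 4 displacement bytes).
JUMP_SIZE = 5


def blocks_needing_relocation(block_sizes: Dict[str, int],
                              instrumented: List[str],
                              jump_size: int = JUMP_SIZE) -> List[str]:
    """Return instrumented blocks that cannot hold a branch in place.

    Hypothetical helper: a block smaller than the branch cannot be patched
    without overwriting its neighbor, so its function would be relocated.
    """
    return [b for b in instrumented if block_sizes[b] < jump_size]
```

For example, with block sizes of 12, 3, and 7 bytes, only the 3-byte block would force relocation of its function.
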
Old Approach
• One-time relocation
• Preemptively expand possible instrumentation sites:
  • Function entry, exit, call sites; loop entry, exits
  • But what about everything else?
• Linear scan of the function, ignoring control flow
  • Dangerous with in-lined data

Incremental Function Relocation
• A function is a list of basic blocks
• Accumulate modifications to each block
  • Ex: block must be 5 bytes long
• Generate relocated versions on-the-fly
  • Only modify what is necessary
• Add instrumentation to the new function

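The accumulate-then-generate flow above might look like the following sketch. The names (Block, FunctionRelocator, require_min_size, relocate) are hypothetical and the model tracks only block sizes and addresses, not instruction bytes or branch fix-ups.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Block:
    name: str
    size: int  # size in bytes in the original function


@dataclass
class FunctionRelocator:
    """Hypothetical model: a function is a list of basic blocks."""
    blocks: List[Block]
    min_sizes: Dict[str, int] = field(default_factory=dict)

    def require_min_size(self, block_name: str, size: int) -> None:
        # Accumulate modifications; keep the largest requirement per block.
        current = self.min_sizes.get(block_name, 0)
        self.min_sizes[block_name] = max(current, size)

    def relocate(self, new_base: int) -> List[Tuple[str, int, int]]:
        # Lay out a relocated copy, enlarging only blocks that need it.
        layout = []
        addr = new_base
        for b in self.blocks:
            size = max(b.size, self.min_sizes.get(b.name, 0))
            layout.append((b.name, addr, size))
            addr += size
        return layout
```

Calling relocate again after further require_min_size calls yields a fresh layout, which loosely mirrors the idea of generating relocated versions on the fly as requirements accumulate.
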
Function Relocation - Example
• Block 2 is too small to patch in a jump
• Add modification
  • Copy the function
  • Enlarge block 2
  • Replace block 5
[Figure labels: original function (block 1, block 2, block 3, block 4, block 5); relocated function (block 1, block 2, block 3, block 4)]

Other Uses for Relocation
• Overlapping functions
  • Relocation disambiguates code
  • Instrument unique per-function copy
• Undo optimizations
  • Rewrite the function during relocation
  • Example: unwinding a tail call

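One plausible reading of "unwinding a tail call" is replacing a jump to another function with a call followed by a return, so the caller regains control and its exit can be instrumented. The sketch below works on symbolic (opcode, target) pairs; the representation and the function name are assumptions for illustration, not the slides' actual mechanism.

```python
from typing import List, Optional, Set, Tuple

Insn = Tuple[str, Optional[str]]  # (opcode, target or None)


def unwind_tail_calls(insns: List[Insn], function_entries: Set[str]) -> List[Insn]:
    """Rewrite 'jmp <function>' as 'call <function>; ret'.

    Jumps to targets that are not known function entries (ordinary
    intra-function branches) are left unchanged.
    """
    out: List[Insn] = []
    for op, target in insns:
        if op == "jmp" and target in function_entries:
            out.append(("call", target))
            out.append(("ret", None))
        else:
            out.append((op, target))
    return out
```

The rewritten sequence is longer than the original, which is why a rewrite like this fits naturally into relocation, where the function is being copied and resized anyway.
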
Function Relocation Status
• Platform-independent function relocation engine
  • IA-32, x86-64, POWER, SPARC
• Support for multiple relocated versions
  • On-the-fly code relocation
• Extensible modification interface
  • Block must be 5 bytes long
  • Modify the instructions in the block

Design Objectives
• Whole-program instrumentation
  • Instrument every instruction in the program
  • … and all control flow edges as well
• Efficient instrumentation
  • No traps!
  • Minimize extraneous jumps
  • Restrict register save/restores
• Flexible, extensible instrumentation system
  • Laying the groundwork for binary rewriting

Conclusion
• Multitramps
  • Whole-program instrumentation approach
• Function relocation
  • Instrument everywhere (without traps)
• People
  • Drew Bernat – Multitramps
  • Nate Rosenblum – Function relocation
  • Nick Rutar – Register optimizations