Dimension: An Instrumentation Tool for Virtual Execution Environments
Jing Yang, Shukang Zhou and Mary Lou Soffa
Department of Computer Science, University of Virginia
VEE 2006, June 16th 2006, Ottawa, Canada
Motivation
• Increasing usage of VEEs in many areas
  • Performance [Bala et al, PLDI ’00]
  • Security [Scott et al, ACSAC ’02]
  • Power consumption [Hazelwood et al, ISLPED ’04]
• Increasing importance of instrumentation for VEEs
  • Requested by both developers and users
• Challenge: Building instrumentation for VEEs
  • When to add instrumentation
    • Instrumentation is added when a VEE is built
      • Repetitive work, time-consuming
      • Only for some preplanned purposes
    • Instrumentation is added after a VEE is built
      • A standalone instrumentation system which can be used by different VEEs for different purposes – even harder
Translation-based VEEs
• We focus on translation-based VEEs
  • Dynamically translate source binary to target binary
  • Target binary is stored in a code cache for execution
• Handle two binaries simultaneously
  • Input source binary
  • Dynamically generated target binary
• Instrumentation for translation-based VEEs
  • Performed on both source binary and target binary
  • A form of binary instrumentation
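The translate-then-cache behaviour described on this slide can be sketched as a small dispatcher. This is an illustration only: the `code_cache` layout, the `translate` stand-in, and the direct-mapped lookup are assumptions made for the example, not how Strata, Jikes RVM, or any other VEE is implemented.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified code cache: a direct-mapped table keyed by the
   source address. A real VEE stores generated machine code, not an id. */
enum { CACHE_SLOTS = 64 };

typedef struct {
    unsigned long src_addr;  /* address in the source binary */
    int           tgt_id;    /* stands in for the translated target code */
    int           valid;     /* nonzero once the slot holds a translation */
} cache_entry;

typedef struct {
    cache_entry slots[CACHE_SLOTS];
    int         translations;  /* how many times the translator ran */
} code_cache;

/* Stand-in translator: only counts invocations and derives an id. */
static int translate(code_cache *cc, unsigned long src_addr) {
    cc->translations++;
    return (int)(src_addr & 0x7fffffff);
}

/* Dispatcher step: return the target code for src_addr, translating and
   storing it in the code cache on a miss. */
int dispatch(code_cache *cc, unsigned long src_addr) {
    assert(cc != NULL);
    cache_entry *e = &cc->slots[src_addr % CACHE_SLOTS];
    if (!e->valid || e->src_addr != src_addr) {
        e->src_addr = src_addr;
        e->tgt_id   = translate(cc, src_addr);
        e->valid    = 1;
    }
    return e->tgt_id;
}
```

Dispatching the same source address twice runs the translator once; the second request is served from the cache, which is why instrumentation has to be applied to the cached target binary as well as reasoned about in terms of the source binary.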
Dimension
• Flexibility – plug and play
  • Minimum modification to a VEE to use Dimension
  • Minimum reconfiguration of Dimension (architecture, language)
• Comprehensiveness
  • Able to instrument both source binary and target binary
  • Instrumentation can be done at various levels of granularity
• Ease of use
  • Simple user specification for instrumentation
• Efficiency
  • Reasonable instrumentation overhead
Relationship between VEEs and Instrumentation
[Figure: three layered configurations (a), (b) and (c), each with an Application on top and OS + Hardware at the bottom]
• Instrumentation built into the VEE ("VEE with instrumentation")
  • Significant modification
  • Hard to reuse
• Instrumentation as a separate layer between the VEE and OS + Hardware
  • Unnecessary translations
  • Unnecessary context-switches
• Configuration (c): Application, VEE, Instrumentation, OS + Hardware
  • Easy to reuse
  • Lightweight modification
  • One translation and context-switch
Scenario
[Figure: example scenario. A VEE implemented in Java translates IA-32 to MIPS and calls Dimension's Initialize, Instrument and Finalize interfaces. Dimension (C) relies on a Stub Functions Library (…, Java, …) and a Binary-Editing Utility Library (…, IA-32, MIPS).]
When to Add Instrumentation
[Figure: the Application runs on a VEE (Initializer, Dispatcher, Translator, Code Cache, Finalizer) connected to Dimension; other labels: Translation Unit, Instrumentation Unit]
• Instrument source binary via corresponding target binary
• Probe-based technique
• Clear interfaces between Dimension and VEE
Probe-Based Technique for Variable-length ISA
[Figure: an instrumentation unit before and after a probe is inserted]
• Original instrumentation unit
  • 01 D8 add eax, ebx
  • 29 D8 sub eax, edx
  • 83 C0 12 add eax, 0x12
• Instrumented code: a JMP to the trampoline overwrites the start of the unit
• Trampoline: the displaced instructions, each accompanied by Save Context, Set Up Parameters, Call Analysis Routines, Restore Context
• Analysis Routine: called from the trampoline
Components and Interfaces
• Interfaces between the VEE and Dimension
  • void InitDimension();
  • void StartInstrumentation(addr src_start, addr src_end, addr tgt_start, addr tgt_end, src_to_tgt_mapping map, bb_info bb);
  • void FinalizeDimension();
[Figure: VEE components (Initializer, Dispatcher, Translator, Code Cache, Finalizer) connected to Dimension components (Initialization Assistant, Instrumentation Repository, Instrumentation Assistant, Instrumenter, Finalization Assistant, Auxiliary Code Cache)]
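A minimal sketch of how a VEE might drive these three interfaces follows. The slide names the types `addr`, `src_to_tgt_mapping` and `bb_info` but does not define them, so the definitions below are hypothetical, as is the `vee_on_unit_translated` hook; the function bodies are stand-ins that only record calls, since the real ones belong to Dimension.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical type definitions: named on the slide, not defined there. */
typedef unsigned long addr;
typedef struct { addr src; addr tgt; } map_entry;
typedef struct { const map_entry *entries; size_t count; } src_to_tgt_mapping;
typedef struct { const addr *bb_starts; size_t count; } bb_info;

/* Stand-in bodies that only record the calls made by the VEE. */
static int initialized, finalized, units_instrumented;

void InitDimension(void) { initialized = 1; }

void StartInstrumentation(addr src_start, addr src_end,
                          addr tgt_start, addr tgt_end,
                          src_to_tgt_mapping map, bb_info bb) {
    assert(initialized && !finalized);
    assert(src_start <= src_end && tgt_start <= tgt_end);
    (void)map;
    (void)bb;
    units_instrumented++;
}

void FinalizeDimension(void) { finalized = 1; }

/* Hypothetical translator hook: called once a translation unit has been
   generated, before it is executed from the code cache. For brevity the
   unit is described as a single basic block with one mapping entry. */
void vee_on_unit_translated(addr src_start, addr src_end,
                            addr tgt_start, addr tgt_end) {
    map_entry one = { src_start, tgt_start };
    addr bb_start = src_start;
    src_to_tgt_mapping map = { &one, 1 };
    bb_info bb = { &bb_start, 1 };
    StartInstrumentation(src_start, src_end, tgt_start, tgt_end, map, bb);
}
```

In this sketch the VEE's Initializer would call `InitDimension()` once, the Translator would call the hook for every new translation unit, and the Finalizer would call `FinalizeDimension()` once at exit.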
Instrumentation Algorithms
[Figure: data flow inside Dimension. Components: Instrumentation Repository, Initialization Assistant, Instrumentation Assistant, Instrumenter, Finalization Assistant, Auxiliary Code Cache. Inputs: Instrumentation Specification, Source Binary, Basic Block Information, Source-to-Target Mapping, Target Binary. Intermediate results: Plan 1, Plan 2, Opt Plan. Output: Trampoline.]
Optimizing Instrumentation
• Instrumentation overhead and optimizations
  • Execute the jump which branches to the trampoline
    • Probe-coalescing [Kumar et al, PASTE ’05]
    • Parameters should remain available if coalesced
  • Perform the context-switch
    • Partial context-switch
    • Registers in most platforms
  • Transfer control to analysis routines
    • Analysis routine inlining
    • Only inline short ones to avoid code expansion
  • Execute analysis routines
    • Lightweight binary-to-binary optimization
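One way to picture the partial context-switch is as a register save set that shrinks from "every register" to "only the registers the analysis routine may write". The sketch below assumes that interpretation; the bit-mask register numbering and all function names are hypothetical and are not taken from Dimension.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical register numbering: bit i of a mask stands for register i. */
typedef uint32_t reg_mask;

/* Count the set bits, i.e. how many registers a mask names. */
static int reg_count(reg_mask m) {
    int n = 0;
    while (m) {
        m &= m - 1;  /* clear the lowest set bit */
        n++;
    }
    return n;
}

/* Full context-switch: save every register of a machine with num_regs. */
reg_mask full_save_set(int num_regs) {
    assert(num_regs > 0 && num_regs <= 32);
    return num_regs == 32 ? UINT32_MAX : ((reg_mask)1 << num_regs) - 1;
}

/* Partial context-switch: save only registers the routine may write. */
reg_mask partial_save_set(reg_mask clobbered_by_routine, int num_regs) {
    return clobbered_by_routine & full_save_set(num_regs);
}

/* Number of register saves (and matching restores) avoided per probe. */
int saves_avoided(reg_mask clobbered_by_routine, int num_regs) {
    return reg_count(full_save_set(num_regs))
         - reg_count(partial_save_set(clobbered_by_routine, num_regs));
}
```

With eight registers and a routine that writes two of them, six saves and six restores are avoided on every execution of the probe.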
Case Study
• Strata [Scott et al, CGO ’03]
  • SPARC/Solaris
  • Single-entry translation units
  • Mainly one-to-one mapping from source binary to target binary, except for some control-transfer instructions
• Jikes RVM [Arnold et al, OOPSLA ’02]
  • IA-32/Linux
  • Multiple-entry translation units – basic block information provided
  • Mapping from bytecode to machine code is maintained
• Interface insertion points are easily located
Scenario
[Figure: Strata scenario. Strata, implemented in C, translates SPARC to SPARC and calls Dimension's Initialize, Instrument and Finalize interfaces. Dimension relies on the Stub Functions Library and the Binary-Editing Utility Library.]
Scenario
[Figure: Jikes RVM scenario. Jikes RVM, implemented in Java, translates Java bytecode to IA-32 and calls Dimension's Initialize, Instrument and Finalize interfaces. Dimension (C) relies on the Stub Functions Library and the Binary-Editing Utility Library.]
Evaluation
• Experiments
  • Effectiveness of optimizations
    • Inlining, partial context-switch, probe coalescing
    • Calculating the average number of integer-add instructions executed in each basic block
  • Generality versus efficiency
    • Dimension versus Jazz
    • Branch coverage testing
  • Comparison in traditional execution environments
    • Strata-Dimension versus Valgrind, DynamoRIO and Pin
    • Basic block counting
    • The data for Valgrind, DynamoRIO and Pin is from [Luk et al, PLDI ’05]
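The first experiment's measurement can be sketched as two analysis routines plus a final division. The routine names and the `simulate_block` driver are hypothetical; in a real run the routines would be invoked by probes that Dimension inserts, not by a driver function.

```c
#include <assert.h>

/* Counters updated by the analysis routines. */
static unsigned long bb_executed;
static unsigned long int_adds_executed;

/* Hypothetical analysis routine: invoked at the entry of each basic block. */
void count_bb(void) { bb_executed++; }

/* Hypothetical analysis routine: invoked at each integer-add instruction. */
void count_int_add(void) { int_adds_executed++; }

/* Stand-in for instrumented code: one execution of a basic block that
   contains `adds` integer-add instructions. */
void simulate_block(int adds) {
    assert(adds >= 0);
    count_bb();
    for (int i = 0; i < adds; i++)
        count_int_add();
}

/* Average integer-adds per executed basic block; 0.0 if no block ran. */
double avg_int_adds_per_bb(void) {
    if (bb_executed == 0)
        return 0.0;
    return (double)int_adds_executed / (double)bb_executed;
}
```

Because both routines are very short, they are natural candidates for the inlining and partial context-switch optimizations that the experiment evaluates.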
Effectiveness of Optimizations
[Figure: target binary instrumentation for Strata. Values labeled on the chart: 8.6x, 6.4x, 2.6x, 2.0x]
Effectiveness of Optimizations
[Figure: target binary instrumentation for Jikes RVM. Values labeled on the chart: 2.4x, 2.1x, 1.4x, 1.1x]
Effectiveness of Optimizations
[Figure: source binary instrumentation for Jikes RVM. Values labeled on the chart: 1.7x, 1.5x, 1.1x, 1.2x]
Evaluation: Generality versus efficiency
• Dimension versus Jazz [Misurda et al, ICSE ’05]
• Branch coverage testing
Generality versus efficiency
[Figure: comparison of slowdown from instrumentation between Jazz and Dimension]
Evaluation: Comparison in traditional execution environments
• Strata-Dimension versus three dynamic instrumentation systems
  • Valgrind [Nethercote, Ph.D. thesis, Univ. of Cambridge, 2004]
  • DynamoRIO [Bruening et al, CGO ’03]
  • Pin [Luk et al, PLDI ’05]
• Basic block counting
• The data for Valgrind, DynamoRIO and Pin is from [Luk et al, PLDI ’05]
Comparison in traditional execution environments
[Figure: comparison of slowdown from instrumentation in traditional execution environments. Values labeled on the chart: 7.5x, 4.9x, 2.3x, 2.6x]
Related Work
• Binary instrumentation systems developed for traditional execution environments
  • Static instrumentation systems
    • ATOM [Srivastava et al, PLDI ’94]
    • Cannot handle a VEE’s target binary, which is generated on-the-fly
  • Dynamic instrumentation systems
    • DTrace [Cantrill et al, OSDI ’04], Pin [Luk et al, PLDI ’05]
    • Cannot handle a VEE’s source binary if it is non-executable
• Binary instrumentation systems designed for VEEs
  • DynamoRIO [Bruening et al, CGO ’03]
  • FIST [Kumar et al, WOSS ’04]
  • Tightly bound to a specific VEE
  • Cannot instrument both the source and target binaries
Conclusion
• Dimension – first standalone instrumentation tool specially designed for VEEs
• Easy for different VEEs to use
• Generality does not impact efficiency
• Reasonable instrumentation overhead compared to other systems
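Instrumentation Specification
#include <stdio.h>
/* The Dimension API header (its name is not given on the slide) must also be
   included; it supplies EXPORT, DIM_InsertBBCall, FUNCPTR and the constants. */

FILE *trace;

// Print a basic block record (defined before its first use below)
void record_bb(void *addr) {
    fprintf(trace, "%p\n", addr);
}

// Called when program begins
EXPORT void DIM_ProgramBegin() {
    trace = fopen("trace.out", "w");
    // Call record_bb at the entry of every source basic block
    DIM_InsertBBCall(SOURCE, ENTRY,
                     FUNCPTR(record_bb), ARG_BB_ADDR, ARG_END);
}

// Called when program ends
EXPORT void DIM_ProgramEnd() {
    fclose(trace);
}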
Probe-Based Technique
• Replace each instruction with a jump that branches to a trampoline, a code sequence that:
  • Executes the original instruction
  • Performs a context-switch
  • Prepares the parameters for the analysis routine
  • Transfers control to the analysis routine
• Problems with variable-length ISAs
  • A jump is longer than the original instruction
    • A jump replaces several instructions
    • Each instrumentation unit should have a single entry at its top
  • The instrumentation unit is shorter than the size of a jump
    • Use a shorter but more expensive instruction instead of a jump
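The first variable-length problem, a jump that is longer than the instruction it replaces, can be sketched as a small planning step that walks the instruction lengths of a unit until the jump fits. The `probe_plan` type and `plan_probe` function are hypothetical names for this illustration; the lengths 2, 2 and 3 come from the example unit shown earlier in the deck, and 5 bytes is the size of an IA-32 near `jmp rel32`.

```c
#include <assert.h>
#include <stddef.h>

/* Result of planning a probe at the top of an instrumentation unit. */
typedef struct {
    int    ok;       /* 0 if the whole unit is shorter than the jump */
    size_t insns;    /* instructions displaced into the trampoline */
    size_t bytes;    /* bytes covered by the displaced instructions */
    size_t padding;  /* covered bytes left over after the jump */
} probe_plan;

/* Walk the instruction lengths from the top of the unit until the jump fits.
   Only whole instructions may be displaced, so the covered span can exceed
   the jump size and leave padding behind it. */
probe_plan plan_probe(const size_t *insn_len, size_t n, size_t jump_len) {
    probe_plan p = {0, 0, 0, 0};
    assert(insn_len != NULL && jump_len > 0);
    for (size_t i = 0; i < n; i++) {
        p.bytes += insn_len[i];
        p.insns++;
        if (p.bytes >= jump_len) {
            p.ok = 1;
            p.padding = p.bytes - jump_len;
            return p;
        }
    }
    /* Unit too short for a jump: a shorter instruction would be needed. */
    p.insns = 0;
    p.bytes = 0;
    return p;
}
```

For the example unit, a 5-byte jump displaces all three instructions (7 bytes) and leaves 2 bytes of padding; a unit holding only the first two instructions (4 bytes) is rejected, which is the second problem listed on the slide.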
Reconfiguration
• For new architectures that VEEs are executing on
  • Binary-editing utility library
  • Provides general binary-editing services to Dimension
• For new languages used in VEE implementation
  • Dimension is written in C and compiled as a shared object
  • If a VEE is not implemented in C, stub functions are needed to call C functions, e.g., the Java Native Interface
  • Parameter wrapping in stub functions, e.g., Java
• Dimension needs no direct modification
Future Work
• Overcome the ISA and VEE restrictions
  • Fixed-length ISA: limited offset of a jump
  • Variable-length ISA: short instrumentation unit problem
  • VEE: fragment patching
• Determine the required information on its own
  • Basic block information and source-to-target mapping
• Automatic reconfiguration
  • Binary-editing utility library and stub functions
• High-level context capture
  • An arbitrary local variable in a Java bytecode method
Acknowledgements
• This paper benefited from fruitful discussions with Naveen Kumar and Jonathan Misurda
• We also thank the anonymous reviewers for their useful suggestions and comments on how to improve the work