Bytecode Instrumentation Revealed Session 3256

Presentation Transcript


  1. Bytecode Instrumentation Revealed, Session 3256. Joseph Coha, Enterprise Java Lab, Hewlett Packard Company

  2. Learning Goals • Understand bytecode instrumentation (BCI) • Use BCI as a tool to answer questions about your application’s performance

  3. Agenda • What is bytecode instrumentation? • Class file format • What’s hard about implementation? • Examples of how it’s done • Transformations • Selecting methods to instrument • Collecting the data • Experience • Summary

  4. Bytecode Instrumentation • Modification of the class file content • Based on early object code modification work • Applied to Java™ as bytecode manipulation • Support for optimization, generics, AOP, … • JVM™ hooks • -Xprep, preprocessor as a property • Special class loader hooks • JVM Tool Interface (JVMTI) • Includes callback mechanism for class loading • Enables bytecode instrumentation • JVMTI (JSR 163) in J2SE 5.0

  5. Publicly Available BCI Tools • ASM: ObjectWeb Consortium – __asm__ • BCEL: Apache – Bytecode Engineering Library • BIT: University of Colorado – Bytecode Instrumenting Tool • Javassist: JBoss – Java Programming Assistant • java_crw_demo: J2SE 5.0 JVMTI class load • jclasslib: JProfiler – ej-technologies • JikesBT: IBM – Jikes Bytecode Toolkit • JOIE: Duke University • Others …

  6. Products Using BCI • Borland – Optimizeit ServerTrace • Borland – Optimizeit Profiler • HP – OpenView Transaction Analyzer (OVTA) • Wily Technology – Introscope • AdventNet – JMX + BCI • Eclipse • JUnit • Profiler • AOP: AspectJ, AspectWerkz • Most Java profilers: JProbe, JProfiler, …

  7. Java Class File Constant Pool

        Class file layout (diagram): Header • Constant Pool • Access Flags, this, super • Implemented Interfaces • Fields • Methods • Class Attributes (diagram annotation: 5 bytes)

        cp_info                   { u1 tag; u1 info[]; }
        CONSTANT_Methodref_info   { u1 tag; u2 class_index; u2 name_and_type_index; }
        CONSTANT_NameAndType_info { u1 tag; u2 name_index; u2 descriptor_index; }
        CONSTANT_Utf8_info        { u1 tag; u2 length; u1 bytes[length]; }
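To make the layout above concrete, here is a minimal sketch of reading the start of a class file image up to the constant pool count. The function names are hypothetical and not taken from any of the tools in this talk; the only facts relied on are that class file integers are big-endian, that the file starts with the magic number 0xCAFEBABE followed by two u2 version fields, and that a u2 constant_pool_count comes next.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Class file integers are stored big-endian. */
uint16_t read_u2(const unsigned char *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

uint32_t read_u4(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Returns constant_pool_count, or 0 if the image is too short
 * or does not start with the class file magic number. */
uint16_t constant_pool_count(const unsigned char *image, size_t len)
{
    assert(image != NULL);
    if (len < 10 || read_u4(image) != 0xCAFEBABEu) {
        return 0;
    }
    /* Bytes 4-7 hold minor_version and major_version; the count follows. */
    return read_u2(image + 8);
}
```

Valid constant pool indices run from 1 to constant_pool_count - 1, since index 0 is not part of the pool.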

  8. Java Class File Constant Pool

  9. Java Class File – Methods

        Class file layout (diagram): Header • Constant Pool • Access Flags, this, super • Implemented Interfaces • Fields • Methods • Class Attributes

        method_info {
            u2 access_flags;
            u2 name_index;
            u2 descriptor_index;
            u2 attributes_count;
            attribute_info attributes[attributes_count];
        }

        code_attribute {
            u2 attribute_name_index;
            u4 attribute_length;
            u2 max_stack;
            u2 max_locals;
            u4 code_length;
            u1 code[code_length];
            u2 exception_table_length;
            {   u2 start_pc;
                u2 end_pc;
                u2 handler_pc;
                u2 catch_type;
            } exception_table[exception_table_length];
            u2 attributes_count;
            attribute_info attributes[attributes_count];
        }
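Because injected bytecodes grow code[], the enclosing attribute_length has to be recomputed. The following is a small sketch of that arithmetic, derived only from the code_attribute layout above; the function name and parameters are hypothetical. attribute_length counts everything after the first six bytes (attribute_name_index and attribute_length itself).

```c
#include <assert.h>
#include <stdint.h>

/* attribute_length of a Code attribute:
 *   max_stack (2) + max_locals (2) + code_length field (4) + code bytes
 *   + exception_table_length field (2) + 8 bytes per exception entry
 *   + attributes_count field (2) + bytes of the nested attributes. */
uint32_t code_attribute_length(uint32_t code_length,
                               uint32_t exception_count,
                               uint32_t nested_attr_bytes)
{
    assert(exception_count <= 0xFFFFu); /* exception_table_length is a u2 */
    return 2u + 2u + 4u + code_length
         + 2u + 8u * exception_count
         + 2u + nested_attr_bytes;
}
```

For example, a method with 10 bytes of code and one exception handler has an attribute_length of 30; injecting 9 more bytes of code raises it to 39.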

  10. Bytecode Instructions Interesting instructions and their side-effects • Control flow • goto, jsr, ret, if*, *cmp*, athrow • Table jumping: lookupswitch, tableswitch • Challenging because of the jump table representation • Targets are indices into the array of bytecodes • Method invocation • Method call: invokestatic, invokevirtual, invokespecial, invokeinterface • Method return: return, {i,l,f,d,a}return • Stack operations and loads/stores • Push, pop, dup, load, *load, *store, … • max_stack must be updated with control flow information after traversal of the basic blocks
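One reason control flow instructions are "interesting" is that their operands must be fixed up whenever bytes are inserted. Branch operands are encoded as signed offsets relative to the address of the branch instruction, so a sketch of the fix-up might look like the following. The function names are hypothetical, and the policy that a position exactly at the injection point moves past the injected bytes is an assumption, not something the slides specify.

```c
#include <assert.h>
#include <stdint.h>

/* Map an old bytecode position to its position after inject_len bytes
 * are inserted in front of old position inject_at. Assumption: positions
 * at or after the injection point shift, so a branch still lands on the
 * original instruction rather than on the injected code. */
int32_t shift_position(int32_t pos, int32_t inject_at, int32_t inject_len)
{
    return pos >= inject_at ? pos + inject_len : pos;
}

/* Recompute the relative offset of a branch located at branch_pc. */
int32_t relocate_branch_offset(int32_t branch_pc, int32_t offset,
                               int32_t inject_at, int32_t inject_len)
{
    assert(inject_len >= 0);
    int32_t new_pc     = shift_position(branch_pc, inject_at, inject_len);
    int32_t new_target = shift_position(branch_pc + offset, inject_at, inject_len);
    return new_target - new_pc;
}
```

A forward branch at position 2 with offset 8 jumps over an injection of 3 bytes at position 5, so its offset grows to 11. If the result no longer fits in 16 bits, a goto has to be widened to goto_w, which is one source of the size bloat discussed later.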

  11. Bytecode Instructions Interesting instructions and their side-effects • Field access • get/put field and static • Arithmetic operations • Impact the stack size • Type conversions and checks • checkcast, typecast, instanceof • Object allocation • new • Arrays: newarray, anewarray, multianewarray

  12. BCI Transformations • When • Ahead-of-time / static • At run-time / dynamic • Load time • Requires class loader interaction (pre 5.0) • Arbitrary time • RedefineClasses() in 5.0 • Scope • Whole program • Selective • Class • Method • Operations • Replacement • Insertion • Deletion

  13. Parsing the Class File • Goal: Rapid parsing and low memory usage • Methods: • Ad hoc • Intermediate representation (IR) • Abstraction of class file elements • All elements as objects: BCEL • Flyweight constants for immutables (stack ops) • Lightweight alternatives • Visitor pattern for simple transformations: ASM • Must provide good public interface for chosen audiences • BCEL targets bytecode experts • Javassist targets Java programmers

  14. BCEL UML Overview JavaClass and Code details From Dahm et al.

  15. Agenda • What is bytecode instrumentation? • Class file format • What’s hard about implementation? • Examples of how it’s done • Transformations • Selecting methods to instrument • Collecting the data • Experience • Summary

  16. What’s Hard? • Creating an intermediate form • Deserialization • All objects as intermediate representation • Less costly alternatives – bookkeeping overhead • Size of intermediate representation • Applying multiple transformations • May require update to abstract representation with each change • Look for rewind() methods • Keeping the class file verifiable • Consistency • Size

  17. What’s Hard? • Constant Pool size bloat • Only adding items to the Constant Pool • “Dead entry” elimination requires whole class analysis • Method size bloat • Multiple return points • Exception unwinding through frames • Keeping entry/exit data consistent • Instrumenting classes early – BCI in Java • Making BCI work – Debugging BCI written in C

  18. What’s Hard? • Making BCI fast • Required for adaptive profiling systems • Inserting correct instrumentation • Correctness • Verification • Consistency: max stack, max locals, sizes, … • Thread safety • Race conditions • Deadlocks • “Stale” frames when re-instrumenting

  19. Creating Instrumentation Hooks (pre-5.0) • Experience with BCEL and class loaders [McPhail] • Custom class loader • Finds, instruments, and loads class before system class loader • System classes (java.* and javax.*) still loaded by the system class loader • Type problem with classes loaded by both class loaders – ClassCastException

  20. Creating Instrumentation Hooks (pre-5.0) • Hidden classpath • Follows delegation model for class loading • Requires the classes be found through some mechanism other than the classpath • Does not work if software expects to find classes or resources in the classpath • Remedy by creating class loader with parent set to getClass().getClassLoader() instead of the system class loader

  21. Agenda • What is bytecode instrumentation? • Class file format • What’s hard about implementation? • Examples of how it’s done • Transformations • Selecting methods to instrument • Collecting the data • Experience • Summary

  22. BCI Instrumentation • Traversal of elements • Filter that selects instrumentation site • Generator inserts transformation

  23. Approaches for Method Instrumentation • Rename • Entry point renamed • Wrap • Original method’s name • Add instrumentation • Call original target method • Insert • Directly modify the method/class • Mixin • Add field, interface, method to class • Add prefix/suffix to method(s)

  24. JRat (BCEL-based) jrat.sourceforge.net Wrapping for collection of timing information

  25. java_crw_demo BCI Library • C library in 5.0 JRE used by HPROF and other JVMTI agents • BCI support: [O’Hair] • Class initialization • Entry to the java.lang.Object init method (signature "()V") • Inject invokestatic call to tclass.obj_init_method(object) • Method instrumentation • Entry • Inject invokestatic call to tclass.call_method(class_num,method_num) • Map the class_num and method_num using the crw library • Return (each site) • Inject invokestatic call to tclass.return_method(class_num,method_num) • newarray type opcode • Duplicate array object on the stack • Inject invokestatic call to tclass.newarray_method(object)

  26. java_crw_demo BCI Library • Non-instrumented methods • init methods • finalize methods whose length is 1 • "system" classes clinit methods • java.lang.Thread.currentThread() • Modifications • No methods or fields added to any class • Only add new constant pool entries at end of original constant pool table • Exception, line, and local variable tables for each method adjusted • Bytecode optimization to use • Smaller offsets • Fewest 'wide' opcodes • Goals • Minimize number of bytecodes at each injection site • Classes with N return opcodes or N newarray opcodes will get N injections • Input arguments to java_crw_demo determine injections made

  27. java_crw_demo Example: Constant Pool Addition • fillin_cpool_entry • Write the information as a constant pool entry • add_new_cpool_entry • Call to fillin_cpool_entry • add_new_method_cpool_entry • Call to add UTF8 name to constant pool • Call to add UTF8 descr index to constant pool • Call to add name type to constant pool • Call to add method type to constant pool • cpool_setup (index 0 not in pool) • add_new_method_cpool_entry • inject_class • cpool_setup

  28. Add the runtime tracking method to the Constant Pool

        static CrwCpoolIndex
        add_new_method_cpool_entry(CrwClassImage *ci, CrwCpoolIndex class_index,
                                   const char *name, const char *descr)
        {
            CrwCpoolIndex name_index, descr_index, name_type_index;
            int len;

            /* Utf8 entries for the method name and its descriptor */
            len = (int)strlen(name);
            name_index = add_new_cpool_entry(ci, JVM_CONSTANT_Utf8, len, 0, name, len);
            len = (int)strlen(descr);
            descr_index = add_new_cpool_entry(ci, JVM_CONSTANT_Utf8, len, 0, descr, len);
            /* NameAndType ties the two together; Methodref adds the owning class */
            name_type_index = add_new_cpool_entry(ci, JVM_CONSTANT_NameAndType,
                                                  name_index, descr_index, NULL, 0);
            return add_new_cpool_entry(ci, JVM_CONSTANT_Methodref,
                                       class_index, name_type_index, NULL, 0);
        }
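The slide relies on add_new_cpool_entry, whose body is not shown. As a rough, self-contained illustration of what appending these four entries produces at the byte level, here is a sketch that writes them into a fixed buffer. PoolTail and every function name below are hypothetical and are not the java_crw_demo implementation; the tag values (Utf8 = 1, Methodref = 10, NameAndType = 12) and entry layouts come from the class file format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { TAG_UTF8 = 1, TAG_METHODREF = 10, TAG_NAME_AND_TYPE = 12 };

typedef struct {
    unsigned char bytes[256]; /* encoded entries appended so far */
    size_t        len;        /* number of bytes used */
    uint16_t      next_index; /* index the next new entry will receive */
} PoolTail;

static void put_u1(PoolTail *p, unsigned v)
{
    assert(p->len < sizeof p->bytes);
    p->bytes[p->len++] = (unsigned char)v;
}

static void put_u2(PoolTail *p, unsigned v)
{
    put_u1(p, (v >> 8) & 0xFF); /* big-endian: high byte first */
    put_u1(p, v & 0xFF);
}

/* Plain ASCII names are encoded identically in the class file's UTF-8 variant. */
uint16_t add_utf8(PoolTail *p, const char *s)
{
    size_t n = strlen(s);
    assert(n <= 0xFFFF && p->len + 3 + n <= sizeof p->bytes);
    put_u1(p, TAG_UTF8);
    put_u2(p, (unsigned)n);
    memcpy(p->bytes + p->len, s, n);
    p->len += n;
    return p->next_index++;
}

/* NameAndType and Methodref share the shape: tag, u2, u2. */
uint16_t add_pair(PoolTail *p, unsigned tag, uint16_t a, uint16_t b)
{
    put_u1(p, tag);
    put_u2(p, a);
    put_u2(p, b);
    return p->next_index++;
}

uint16_t add_methodref(PoolTail *p, uint16_t class_index,
                       const char *name, const char *descr)
{
    uint16_t name_index  = add_utf8(p, name);
    uint16_t descr_index = add_utf8(p, descr);
    uint16_t name_type   = add_pair(p, TAG_NAME_AND_TYPE, name_index, descr_index);
    return add_pair(p, TAG_METHODREF, class_index, name_type);
}
```

Appending only at the end of the pool keeps every existing index valid, which is why no other part of the class needs renumbering.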

  29. java_crw_demo Example: Injection at Method Entry • injection_template • Insertion of the actual bytecodes • entry_injection_code - create injection code at entry to a method • injection_template • method_inject_and_write_code • entry_injection_code • Write bytecode image • method_write_bytecodes • method_inject_and_write_code • Adjust all offsets: • Code length • Maximum stack • Exception table • Code attributes • Attribute length

  30. Generating the bytecodes

        static ByteOffset
        injection_template(MethodImage *mi, ByteCode *bytecodes,
                           ByteOffset max_nbytes, CrwCpoolIndex method_index)
        {
            CrwClassImage *ci = mi->ci;
            ByteOffset nbytes = 0;
            unsigned max_stack;

            /* Two int arguments are pushed before the call */
            max_stack = mi->max_stack + 2;
            nbytes += push_pool_constant_bytecodes(bytecodes+nbytes,
                                                   ci->class_number_index);
            nbytes += push_short_constant_bytecodes(bytecodes+nbytes,
                                                    mi->number);
            bytecodes[nbytes++] = (ByteCode)opc_invokestatic;
            bytecodes[nbytes++] = (ByteCode)(method_index >> 8);
            bytecodes[nbytes++] = (ByteCode)method_index;
            bytecodes[nbytes] = 0;
            /* Check max stack value */
            if ( max_stack > mi->new_max_stack ) {
                mi->new_max_stack = max_stack;
            }
            return nbytes;
        }
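For readers who want to see the resulting bytes, here is a simplified, self-contained sketch of the same idea. It is not the java_crw_demo code: emit_entry_call is a hypothetical name, and it pushes both numbers with sipush for brevity, whereas the slide pushes the class number through a constant pool entry. The opcode values (sipush = 0x11, invokestatic = 0xB8) are from the JVM instruction set.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { OPC_SIPUSH = 0x11, OPC_INVOKESTATIC = 0xB8 };

/* Emits: sipush class_num; sipush method_num; invokestatic method_index.
 * Returns the number of bytes written (9), or 0 if the buffer is too
 * small or a number does not fit sipush's signed 16-bit operand. */
size_t emit_entry_call(unsigned char *buf, size_t cap,
                       uint16_t class_num, uint16_t method_num,
                       uint16_t method_index)
{
    size_t n = 0;

    assert(buf != NULL);
    if (cap < 9 || class_num > 0x7FFF || method_num > 0x7FFF) {
        return 0;
    }
    buf[n++] = OPC_SIPUSH;
    buf[n++] = (unsigned char)(class_num >> 8);
    buf[n++] = (unsigned char)(class_num & 0xFF);
    buf[n++] = OPC_SIPUSH;
    buf[n++] = (unsigned char)(method_num >> 8);
    buf[n++] = (unsigned char)(method_num & 0xFF);
    buf[n++] = OPC_INVOKESTATIC;
    buf[n++] = (unsigned char)(method_index >> 8);   /* constant pool index, */
    buf[n++] = (unsigned char)(method_index & 0xFF); /* high byte first      */
    return n;
}
```

The caller would still need to raise max_stack to cover the two pushed ints, and to shift every later offset in the method by the returned byte count.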

  31. Using java_crw_demo(): 19 Parameters • Caller assigned class number for class • Internal class name • Example: java/lang/Object (use “/” separator) • Pointer to class file image for the class • Length of the class file in bytes • Set to 1 if this is a system class • Prevents injections into empty <clinit>, finalize, and <init> methods • Class that has methods to call at the injection sites (tclass) • Signature of tclass • Format: "L" + tclass_name + ";" • Method name in tclass to call at offset 0 for every method • Signature of this method • Format: "(II)V"

  32. Using java_crw_demo • Method name in tclass to call at all return opcodes in every method • Signature of the method • Format: "(II)V" • Method name in tclass to call when injecting java.lang.Object.<init> • Signature of the method • Format: "(Ljava/lang/Object;)V" • Method name in tclass to call after every newarray opcode in every method • Signature of the method • Format: "(Ljava/lang/Object;II)V" • Returns a pointer to new class file image • Returns the length of the new image • Pointer to function to call on any fatal error • NULL sends error to stderr • Pointer to function that gets called with all details on methods in the class • NULL means skip this call

  33. java_crw_demo parameters

        JNIEXPORT void JNICALL java_crw_demo(
            unsigned class_number,
            const char *name,
            const unsigned char *file_image,
            long file_len,
            int system_class,
            char* tclass_name,
            char* tclass_sig,
            char* call_name,
            char* call_sig,
            char* return_name,
            char* return_sig,
            char* obj_init_name,
            char* obj_init_sig,
            char* newarray_name,
            char* newarray_sig,
            unsigned char **pnew_file_image,
            long *pnew_file_len,
            FatalErrorHandler fatal_error_handler,
            MethodNumberRegister mnum_callback
        );

  34. Start-up Using JVMTI Native Interface • Start-up • -agentlib:<agent-lib-name>=<options> • Agent_OnLoad() called when library loaded • Set-up the callbacks • jvmtiEventCallbacks callbacks; • callbacks.ClassFileLoadHook = &cbClassFileLoadHook; • Load the java_crw_demo native BCI library • Application running • When class loaded (or reloaded): • Callback made to cbClassFileLoadHook() • Generate unique class identification number • Instrument bytecode • Supply Java method(s) to call from instrumentation • Return modified class file

  35. Class file load callback: generating the class index

        static void JNICALL
        cbClassFileLoadHook(jvmtiEnv *jvmti_env, JNIEnv* env,
                            jclass class_being_redefined, jobject loader,
                            const char* name, jobject protection_domain,
                            jint class_data_len, const unsigned char* class_data,
                            jint* new_class_data_len, unsigned char** new_class_data)
        {
            /* Declarations of loader_index and signature are elided on the slide */
            if ( gdata->bci_counter == 0 ) {
                class_prime_system_classes();
            }
            gdata->bci_counter++;
            ClassIndex cnum;
            loader_index = loader_find_or_create(env,loader);
            /* A redefined class may already have an index; a new one never does */
            if ( class_being_redefined != NULL ) {
                cnum = class_find_or_create(signature, loader_index);
            } else {
                cnum = class_create(signature, loader_index);
            }
            /* continued on the next slides */

  36. Call BCI library to instrument

            ((JavaCrwDemo)(gdata->java_crw_demo_function))(
                cnum, name, class_data, class_data_len, system_class,
                "sun/tools/hprof/Tracker",      /* Class with method */
                "Lsun/tools/hprof/Tracker;",
                "CallSite", "(II)V",            /* Method to call */
                NULL, NULL,
                NULL, NULL,
                NULL, NULL,
                &new_image, &new_length,
                &my_crw_fatal_error_handler,
                &class_set_methods);

  37. Return the new class file

            if ( new_length > 0 ) {
                unsigned char *jvmti_space;
                /* The returned image must live in JVMTI-allocated memory */
                jvmti_space = (unsigned char *) jvmtiAllocate((jint)new_length);
                (void)memcpy((void*)jvmti_space, (void*)new_image, (int)new_length);
                *new_class_data_len = (jint)new_length;
                *new_class_data = jvmti_space;
            } else …

  38. Agenda • What is bytecode instrumentation? • Class file format • What’s hard about implementation? • Examples of how it’s done • Transformations • Selecting methods to instrument • Collecting the data • Experience • Summary

  39. Method Selection Constraints • BCI library determines constraints • Avoid circular dependencies with Java library • Load-time instrumentation problem • JVMPI • Java-based BCI • No instrumentation of BCI classes • Non-issue with ahead-of-time instrumentation • Dynamic instrumentation adds significant flexibility • Allows class redefinition after first use
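A class-level filter is one simple way to honor the constraints above. The sketch below shows one possible policy, and it is an assumption rather than what any particular tool does: skip the class that receives the instrumentation calls (to avoid recursion) and skip java/ and javax/ classes (to sidestep load-time circularity). The function names are hypothetical; class names use the internal "/" separator.

```c
#include <assert.h>
#include <string.h>

static int has_prefix(const char *s, const char *prefix)
{
    return strncmp(s, prefix, strlen(prefix)) == 0;
}

/* Returns 1 if the class should be instrumented, 0 otherwise. */
int should_instrument(const char *internal_name, const char *tracker_class)
{
    assert(internal_name != NULL && tracker_class != NULL);
    /* Never instrument the class that the injected code calls into. */
    if (strcmp(internal_name, tracker_class) == 0) {
        return 0;
    }
    /* Assumed policy: leave core library classes alone. */
    if (has_prefix(internal_name, "java/") || has_prefix(internal_name, "javax/")) {
        return 0;
    }
    return 1;
}
```

A filter like this would typically run at the top of the class load callback, before any parsing of the class file image.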

  40. Method Selection Functional Goals • Depends on technical demands of environment • Transaction measurement tool • Instrument top-level J2EE classes • Development-time tool • Instrument user-level code • Filters to select classes/methods • Application server • Third party classes • AOP tool • Method selection for function • Special logging • Special debug support

  41. Method Selection Performance Goals • Overall goal: Minimal performance impact • Runtime overhead • BCI injection • Modified control flow • BCI data collection

  42. Method Selection Performance Solutions • Optimize injected code aggressively • Filtering opportunities to minimize overhead • Accessor methods • Small methods • Third party code • Well-understood, application-specific code • Make most expensive instrumentation conditional • Sample application performance • Turn on data collection selectively

  43. Method Selection New Opportunities • JVMTI allows staged instrumentation • Make conservative decisions initially • Remove or add instrumentation as data is collected and application understood • Adaptive opportunities • Efficient profiling • Targeted optimizations • Find and fix bugs

  44. Agenda • What is bytecode instrumentation? • Class file format • What’s hard about implementation? • Examples of how it’s done • Transformations • Selecting methods to instrument • Collecting the data • Experience • Summary

  45. Approaches for Data Collection • C heap data structures • Requires call to native code from Java • Java heap • Measurement data structures interfere with program • Files • I/O can be slow, even with buffering • Pipes • Sockets/Network • RMI, RMI/SSL • JMX

  46. Efficient Collection • Minimize method calls • Finding class and method names • Thread information • Native calls • Cross language calls • Keep critical sections short • Use thread local storage to avoid contention • Global data structures need locks
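The thread-local storage bullet can be sketched as per-thread counters that are merged into shared totals only occasionally. All names here are hypothetical. Note one deliberate substitution: the slide says global data structures need locks, while this sketch uses C11 atomics for the merge step instead of a lock, which is enough for simple counters.

```c
#include <assert.h>
#include <stdatomic.h>

#define MAX_METHODS 64

/* Each thread counts into its own array, so the hot path takes no lock. */
static _Thread_local unsigned long local_calls[MAX_METHODS];

/* Shared totals, touched only when a thread flushes. */
static atomic_ulong total_calls[MAX_METHODS];

/* Called from the instrumentation on every method entry. */
void record_call(unsigned method)
{
    assert(method < MAX_METHODS);
    local_calls[method]++;
}

/* Called rarely, for example when a thread exits. */
void flush_thread_counts(void)
{
    for (unsigned m = 0; m < MAX_METHODS; m++) {
        if (local_calls[m] != 0) {
            atomic_fetch_add(&total_calls[m], local_calls[m]);
            local_calls[m] = 0;
        }
    }
}

unsigned long total_for(unsigned method)
{
    assert(method < MAX_METHODS);
    return atomic_load(&total_calls[method]);
}
```

Totals stay at zero until a flush, so a reader sees slightly stale data in exchange for a contention-free hot path.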

  47. Efficient Collection • Persist data that will no longer be modified • Data from thread that has exited can be written to file • Long term collection • Minimize total amount of data collected • Toggle on/off • Sample • Filter

  48. Agenda • What is bytecode instrumentation? • Class file format • What’s hard about implementation? • Examples of how it’s done • Transformations • Selecting methods to instrument • Collecting the data • Experience • Summary

  49. Demo • BCI in action • Using J2SE 5.0 java.lang.instrument

  50. java.lang.instrument
