Approaches to Reflective Method Invocation

Approaches to Reflective Method Invocation Dr. Ian Rogers, Dr. Jisheng Zhao, and Prof. Ian Watson The University of Manchester Third International Workshop on Implementation, Compilation, Optimization of Object-Oriented Languages, Programs and Systems (ICOOOLPS 2008) July 7, Paphos, Cyprus

Reflective Method Invocation • Motivation • Allow dynamic extension of applications by the use of methods/constructors not known at compile time • Uses include Java Beans, JNI code • Overheads • Creating representation • Invocation • Parameter boxing
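The three overheads listed above can be seen in a minimal sketch using the standard java.lang.reflect API. The class and method names (ReflectiveCallDemo, add, callAddReflectively) are made up for illustration and do not come from the talk.

```java
import java.lang.reflect.Method;

class ReflectiveCallDemo {
    // Target method; the caller below only knows its name as a string.
    public static int add(int a, int b) {
        return a + b;
    }

    static int callAddReflectively(int a, int b) {
        try {
            // Overhead 1: creating the representation (a Method object).
            Method m = ReflectiveCallDemo.class.getMethod("add", int.class, int.class);
            // Overheads 2 and 3: the reflective invocation itself, plus boxing
            // both int arguments into an Object[] and unboxing the Integer result.
            Object result = m.invoke(null, a, b);
            return (Integer) result;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(callAddReflectively(2, 3)); // prints 5
    }
}
```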

Reflective Method Invocation

Implementation with out-of-line code • Out-of-line code is code that performs the bridge from regular Java bytecode to the dynamic method • Written in native code, C, assembler, etc. • Used in Jikes RVM – performance would indicate also used in IBM DK and BEA JRockit

Optimizing out-of-line code • Objects representing methods are immutable • Constant methods that are invoked can have the invocation and parameter boxing overheads eliminated • Constant methods are created by calls to pure routines or by chasing initialized final references
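As a rough illustration of the simplification described above: when the Method object is reachable through an initialized final reference, the reflective call is equivalent to a direct call. The sketch below is hand-written Java showing the before and after shapes. It is not the compiler transformation itself, and all names are hypothetical.

```java
import java.lang.reflect.Method;

class ConstantMethodDemo {
    public static int twice(int x) {
        return 2 * x;
    }

    // An initialized final reference: a compiler can chase it to one
    // specific, immutable Method object.
    private static final Method TWICE = lookup();

    private static Method lookup() {
        try {
            return ConstantMethodDemo.class.getMethod("twice", int.class);
        } catch (NoSuchMethodException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    // Before: reflective call through the constant Method, with boxing.
    static int viaReflection(int x) {
        try {
            return (Integer) TWICE.invoke(null, x);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    // After (conceptually): the invocation and boxing overheads are gone.
    static int simplified(int x) {
        return twice(x);
    }

    public static void main(String[] args) {
        System.out.println(viaReflection(21) == simplified(21)); // prints true
    }
}
```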

Bytecode generation • Bytecode to implement a reflective method call can be dynamically generated at runtime by creating a special class that performs the method invocation • Pros • Bytecode is interpreted so can boost performance of even interpreted code • Not reliant on finding method as a constant value • Cons • Cost of producing and storing bytecode • Used in Sun’s HotSpot VM
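To make the idea of a "special class that performs the method invocation" concrete, the sketch below writes by hand what such a generated class might look like for String.concat(String). The Invoker interface and StringConcatInvoker class are hypothetical stand-ins, not the classes any particular VM emits. A real implementation would produce the equivalent bytecode at runtime rather than source code.

```java
class GeneratedInvokerDemo {
    // Hypothetical common interface that every generated class implements.
    interface Invoker {
        Object invoke(Object receiver, Object[] args);
    }

    // Stand-in for a generated class: it casts the receiver and argument
    // and then makes an ordinary, non-reflective call.
    static final class StringConcatInvoker implements Invoker {
        @Override
        public Object invoke(Object receiver, Object[] args) {
            return ((String) receiver).concat((String) args[0]);
        }
    }

    public static void main(String[] args) {
        Invoker invoker = new StringConcatInvoker();
        System.out.println(invoker.invoke("boot", new Object[] {"image"})); // prints bootimage
    }
}
```

Because the call inside invoke is ordinary bytecode, it benefits from whatever the VM already does for regular calls, which is the "Pros" point above.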

Eager and lazy bytecode generation • Eager • Generate class on construction of method object • Field holding generated object can be final • Lazy • Generate class on first method invocation • Use hashtable to hold object potentially avoiding storage overhead • Use of pure methods can eliminate hashtable lookup in opt compiled code
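The difference between the two strategies can be sketched as follows. The generate method merely stands in for runtime class generation and counts how often it runs. EagerMethod, LazyMethod and Invoker are hypothetical names, and the map-based cache is one possible reading of "use hashtable to hold object".

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class EagerLazyDemo {
    interface Invoker {
        Object invoke(Object receiver, Object[] args);
    }

    // Counts how many "classes" have been generated so far.
    static int generated = 0;

    // Stand-in for generating and instantiating an invoker class.
    static Invoker generate(String name) {
        generated++;
        return (receiver, args) -> name + ":" + receiver;
    }

    // Eager: generate on construction, so the field can be final.
    static final class EagerMethod {
        private final Invoker invoker;

        EagerMethod(String name) {
            this.invoker = generate(name);
        }

        Object invoke(Object receiver, Object... args) {
            return invoker.invoke(receiver, args);
        }
    }

    // Lazy: generate on first invocation and keep the invoker in a side table,
    // so method objects that are never invoked carry no extra field or class.
    static final class LazyMethod {
        private static final Map<LazyMethod, Invoker> CACHE = new ConcurrentHashMap<>();
        private final String name;

        LazyMethod(String name) {
            this.name = name;
        }

        Object invoke(Object receiver, Object... args) {
            return CACHE.computeIfAbsent(this, m -> generate(m.name)).invoke(receiver, args);
        }
    }

    public static void main(String[] args) {
        EagerMethod eager = new EagerMethod("e");
        LazyMethod lazy = new LazyMethod("l");
        System.out.println(generated);        // prints 1: only the eager invoker exists
        System.out.println(lazy.invoke("x")); // prints l:x and generates the lazy invoker
        System.out.println(generated);        // prints 2
    }
}
```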

Synthetic performance simplification

DaCapo Performance

Conclusions • Maximum performance achievable by simplification or bytecode generation • Bytecode generation cheap enough to beat simplification • Eager bytecode generation gives best DaCapo execution time improvement • Lazy bytecode generation gives best DaCapo mean speed up

Pure Method Analysis within Jikes RVM Dr. Jisheng Zhao, Dr. Ian Rogers, Dr. Chris Kirkham and Prof. Ian Watson The University of Manchester Third International Workshop on Implementation, Compilation, Optimization of Object-Oriented Languages, Programs and Systems (ICOOOLPS 2008) July 7, Paphos, Cyprus

What’s a pure method? • Example: Greek next day of the week • Input: day of week • Output: next day of week

Literal argument gives another literal as a result • DEUTERA -> TRITH • TRITH -> TETARTH • TETARTH -> PEMPTH • PEMPTH -> PARASKEUH • PARASKEUH -> SABBATO • SABBATO -> KURIAKH • KURIAKH -> DEUTERA

How can we optimize? • getNextDay(TRITH)

How can we optimize? • getNextDay(TRITH) The answer can only be TETARTH

How can we optimize? • X = getToday(); • Y = getNextDay(X); • … • Z = getNextDay(X);

How can we optimize? • X = getToday(); • Y = getNextDay(X); • … • Z = getNextDay(X); Must generate the same result, so copy 1st result
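The two optimizations on these slides can be written out in Java. The sketch below is one possible getNextDay over the day names from the earlier slide. The enum-based implementation and the beforeOptimization/afterOptimization methods are illustrative assumptions, showing by hand what a compiler could do once it knows the method is pure.

```java
class NextDayDemo {
    enum Day { DEUTERA, TRITH, TETARTH, PEMPTH, PARASKEUH, SABBATO, KURIAKH }

    // Pure: the result depends only on the argument and nothing observable changes.
    static Day getNextDay(Day day) {
        Day[] all = Day.values();
        return all[(day.ordinal() + 1) % all.length];
    }

    // Before: the same pure call is made twice with the same argument.
    static Day[] beforeOptimization(Day x) {
        Day y = getNextDay(x);
        Day z = getNextDay(x);
        return new Day[] {y, z};
    }

    // After: the second call must give the same result, so copy the first.
    static Day[] afterOptimization(Day x) {
        Day y = getNextDay(x);
        Day z = y;
        return new Day[] {y, z};
    }

    public static void main(String[] args) {
        // A literal argument can be folded to a literal result.
        System.out.println(getNextDay(Day.TRITH));   // prints TETARTH
        System.out.println(getNextDay(Day.KURIAKH)); // prints DEUTERA
    }
}
```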

Other optimizations • Escape analysis • Can eliminate synchronization if method object is passed to pure method • Dead code elimination • Unused results of pure methods that don’t throw exceptions can have instructions eliminated • Memoization
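Memoization, the last item above, can be sketched at the library level. This is a generic illustration with a hypothetical memoize helper, not the mechanism used inside Jikes RVM. It is only safe when the wrapped function really is pure.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

class MemoizeDemo {
    // Wraps a pure function so each distinct argument is evaluated at most once.
    // Not thread-safe; null results are not cached.
    static <A, R> Function<A, R> memoize(Function<A, R> pure) {
        Map<A, R> cache = new HashMap<>();
        return arg -> cache.computeIfAbsent(arg, pure);
    }

    public static void main(String[] args) {
        int[] evaluations = {0};
        // The counter exists only to observe the caching; the squaring itself is pure.
        Function<Integer, Integer> square = memoize(x -> {
            evaluations[0]++;
            return x * x;
        });
        System.out.println(square.apply(7)); // prints 49
        System.out.println(square.apply(7)); // prints 49, served from the cache
        System.out.println(evaluations[0]);  // prints 1
    }
}
```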

Knowing something is pure • Implementation of getNextDay may use a map or other potentially mutable data storage • Stationary field analysis, amongst others, shows this is unlikely [Unkel and Lam ’08, Rogers, Zhao and Watson ’08]

Means of determining purity • Programmer-provided annotations • Simple bytecode analysis • e.g. a method having a method call, load or store wouldn’t be pure • Optimizing compiler analysis • examine bytecode after optimization to determine purity
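The first approach, programmer-provided annotations, might look like the sketch below. The @Pure annotation here is declared locally for illustration and is an assumption, as are the isDeclaredPure helper and the example methods. The annotation is a promise by the programmer, and nothing in this sketch verifies it.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

class PureAnnotationDemo {
    // Hypothetical marker annotation, kept at runtime so it can be queried.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface Pure {}

    @Pure
    static int square(int x) {
        return x * x;
    }

    static int counter;

    // Not pure: it stores to a static field, so it is left unannotated.
    static int next() {
        return ++counter;
    }

    // What a compiler might ask before folding or de-duplicating a call.
    static boolean isDeclaredPure(String name, Class<?>... parameterTypes) {
        try {
            return PureAnnotationDemo.class
                    .getDeclaredMethod(name, parameterTypes)
                    .isAnnotationPresent(Pure.class);
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isDeclaredPure("square", int.class)); // prints true
        System.out.println(isDeclaredPure("next"));              // prints false
    }
}
```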

DaCapo mean speedup

Conclusions • Pure methods provide optimization opportunities • 1469 methods are determined to be pure in Jikes RVM boot image through simple analysis • Optimizing compiler analysis provides further runtime improvement • Runtime optimizing compiler analysis limited as few methods are compiled by optimizing compiler, simple analysis still possible

Related work • A. Salcianu and M. Rinard • Static offline analysis handling pointers to objects created within method • Haiying Xu, Christopher J. F. Pickett, and Clark Verbrugge • Multiple levels of pure-ness found through SOOT framework • Basis for memoization optimization in an interpreter that fails to regain the overhead

Boot Image Layout for Jikes RVM Dr. Ian Rogers, Dr. Jisheng Zhao, and Prof. Ian Watson The University of Manchester Third International Workshop on Implementation, Compilation, Optimization of Object-Oriented Languages, Programs and Systems (ICOOOLPS 2008) July 7, Paphos, Cyprus

What is boot image layout? • Boot image captures the state of a VM when it starts • This state includes • code to run when the VM starts • objects required for that execution • As no threads are active the only live objects are literals or static fields

Depth-first traversal (animated figure) • Object graph: static Foo foo → Foo object (String value1, String value2) → two String objects (char[] value, int count, int offset) → two char[] objects (int length, char….) • Objects are written to the boot image in the order: char[], String, char[], String, Foo
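The depth-first order in the figure can be reproduced with a small model of the object graph. Obj and layout are hypothetical names, and this toy version records only type names rather than copying real objects into an image.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

class DepthFirstLayoutDemo {
    // Toy object: a type name plus the objects it references.
    static final class Obj {
        final String type;
        final List<Obj> refs;

        Obj(String type, Obj... refs) {
            this.type = type;
            this.refs = new ArrayList<>(Arrays.asList(refs));
        }
    }

    static List<String> layout(Obj root) {
        List<String> image = new ArrayList<>();
        Set<Obj> seen = Collections.newSetFromMap(new IdentityHashMap<>());
        visit(root, seen, image);
        return image;
    }

    private static void visit(Obj obj, Set<Obj> seen, List<String> image) {
        if (!seen.add(obj)) {
            return; // already placed in the image
        }
        for (Obj ref : obj.refs) {
            visit(ref, seen, image);
        }
        image.add(obj.type); // written once everything it references is written
    }

    public static void main(String[] args) {
        Obj value1 = new Obj("String", new Obj("char[]"));
        Obj value2 = new Obj("String", new Obj("char[]"));
        Obj foo = new Obj("Foo", value1, value2);
        System.out.println(layout(foo)); // prints [char[], String, char[], String, Foo]
    }
}
```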

Problems • References within objects need to be scanned by a stop-the-world GC • References are distributed throughout boot image • Can optimize this by observing that only mutable references need to be scanned

Visualizing depth-first traversal’s references Red = Reference White = Non-reference 2666 pages contain references

Breadth-first traversal (animated figure) • Same object graph as for depth-first traversal: static Foo foo → Foo object → two String objects → two char[] objects • References found in each object (value1, value2, then each String’s value) are placed on a queue and processed in order • Objects are written to the boot image in the order: Foo, String, String, char[], char[]
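A corresponding sketch of the breadth-first order, using the same kind of toy object model. Obj and layout are again hypothetical names, and only type names are recorded.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Queue;
import java.util.Set;

class BreadthFirstLayoutDemo {
    // Toy object: a type name plus the objects it references.
    static final class Obj {
        final String type;
        final List<Obj> refs;

        Obj(String type, Obj... refs) {
            this.type = type;
            this.refs = new ArrayList<>(Arrays.asList(refs));
        }
    }

    static List<String> layout(Obj root) {
        List<String> image = new ArrayList<>();
        Set<Obj> seen = Collections.newSetFromMap(new IdentityHashMap<>());
        Queue<Obj> queue = new ArrayDeque<>();
        seen.add(root);
        queue.add(root);
        while (!queue.isEmpty()) {
            Obj obj = queue.remove();
            image.add(obj.type); // written as soon as it is dequeued
            for (Obj ref : obj.refs) {
                if (seen.add(ref)) {
                    queue.add(ref); // references wait their turn in FIFO order
                }
            }
        }
        return image;
    }

    public static void main(String[] args) {
        Obj value1 = new Obj("String", new Obj("char[]"));
        Obj value2 = new Obj("String", new Obj("char[]"));
        Obj foo = new Obj("Foo", value1, value2);
        System.out.println(layout(foo)); // prints [Foo, String, String, char[], char[]]
    }
}
```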

Visualizing breadth-first traversal’s references Red = Reference White = Non-reference 2017 pages contain references

Prioritized traversal • Breadth-first traversal queues references • The next element to be removed from a queue can be prioritized • Make the breadth first queue a prioritized queue and then implement comparators

Criteria for prioritization • Name of object’s class • Object’s type reference ID • Size of object • Number of references within object • Number of mutable references within object • Density – references ÷ object size • These criteria can be chained • the next comparator is used when the result of the comparison is identical • Future work: profiling, in particular allocation frequency
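Replacing the FIFO queue with a priority queue and chaining comparators could be sketched as below. The chain used here (class name first, then number of references) is just one example of the criteria listed above, chosen for illustration. Obj and layout are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.PriorityQueue;
import java.util.Set;

class PrioritizedLayoutDemo {
    // Toy object: a type name plus the objects it references.
    static final class Obj {
        final String type;
        final List<Obj> refs;

        Obj(String type, Obj... refs) {
            this.type = type;
            this.refs = new ArrayList<>(Arrays.asList(refs));
        }
    }

    // Chained criteria: the second comparator is consulted only on a tie.
    static final Comparator<Obj> BY_NAME_THEN_REF_COUNT =
            Comparator.comparing((Obj o) -> o.type)
                      .thenComparingInt(o -> o.refs.size());

    static List<String> layout(Obj root, Comparator<Obj> priority) {
        List<String> image = new ArrayList<>();
        Set<Obj> seen = Collections.newSetFromMap(new IdentityHashMap<>());
        PriorityQueue<Obj> queue = new PriorityQueue<>(priority);
        seen.add(root);
        queue.add(root);
        while (!queue.isEmpty()) {
            Obj obj = queue.poll(); // lowest-ranked pending object, not the oldest
            image.add(obj.type);
            for (Obj ref : obj.refs) {
                if (seen.add(ref)) {
                    queue.add(ref);
                }
            }
        }
        return image;
    }

    public static void main(String[] args) {
        Obj data = new Obj("int[]");
        Obj text = new Obj("String", new Obj("char[]"));
        Obj foo = new Obj("Foo", data, text);
        // Plain breadth-first order would be Foo, int[], String, char[].
        System.out.println(layout(foo, BY_NAME_THEN_REF_COUNT)); // prints [Foo, String, char[], int[]]
    }
}
```

In this small graph, ordering by name places the two primitive arrays next to each other, which is the effect the results slide attributes to name-based prioritization.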

Different prioritization approaches and number of pages used

Results from prioritization • Name is a better prioritization criterion than type reference ID • Places arrays of primitive types together • Density ignoring final fields is better than using final fields • Best scheme can avoid references on ~5MB worth of pages (5,316,608 bytes) when compared to depth-first traversal
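The byte figure above can be related to the page counts on the visualization slides. The sketch assumes 4096-byte pages. The slides do not state the page size, but 5,316,608 is an exact multiple of 4096, which makes that assumption plausible. The remaining-pages figure is derived here and is not a number from the talk.

```java
class PageSavingsDemo {
    // Assumption: 4 KiB pages (not stated in the slides).
    static final long PAGE_SIZE_BYTES = 4096;

    static long pagesFor(long bytes) {
        return bytes / PAGE_SIZE_BYTES;
    }

    public static void main(String[] args) {
        long savedBytes = 5_316_608L;   // from the results slide
        long depthFirstPages = 2666;    // from the depth-first visualization slide
        long savedPages = pagesFor(savedBytes);
        System.out.println(savedBytes % PAGE_SIZE_BYTES); // prints 0
        System.out.println(savedPages);                   // prints 1298
        System.out.println(depthFirstPages - savedPages); // prints 1368
    }
}
```

Under that assumption, the best scheme would leave about 1368 pages containing references, roughly half of the 2666 pages reported for depth-first traversal.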

Visualizing different approaches Depth-first Breadth-first Prioritized

Effect on performance

Conclusions • Boot image layout using a prioritized traversal can dramatically reduce the pages that are traversed during a stop-the-world GC • Effect on performance negligible • Careful selection of benchmark may demonstrate improvement

Related work • Layout of objects is considered for object inlining as well as for copying GCs • Xianglong Huang, Stephen M. Blackburn, Kathryn S. McKinley, J. Eliot B. Moss, Zhenlin Wang, and Perry Cheng. The garbage collection advantage: Improving program locality. • Wen-ke Chen, Sanjay Bhansali, Trishul M. Chilimbi, Xiaofeng Gao, and Weihaw Chuang. Profile-guided proactive garbage collection for locality optimization. • Michael S. Lam, Paul R. Wilson, and Thomas G. Moher. Object type directed garbage collection to improve locality.
