Optimizing dynamic dispatch with fine-grained state tracking. Salikh Zakirov, Shigeru Chiba and Etsuya Shibayama. Tokyo Institute of Technology, Dept. of Mathematical and Computing Sciences. 2010-10-18
Mixin • code composition technique [Diagram, two panels: "Mixin use declaration" and "Mixin semantics", each showing BaseServer, Server and the Additional Security mixin]
Dynamic mixin • Temporary change in class hierarchy • Available in Ruby, Python, JavaScript [Diagram: the Server / BaseServer hierarchy without and with the Additional Security mixin]
Dynamic mixin (2) • Powerful technique of dynamic languages • Enables • dynamic patching • dynamic monitoring • Can be used to implement • Aspect-oriented programming • Context-oriented programming • Widely used in Ruby, Python • e.g. Object-Relational Mapping
Dynamic mixin in Ruby • Ruby has dynamic mixin • but only “install”, no “remove” operation • “remove” can be implemented easily • 23 lines
Target application
• Mixin is installed and removed frequently
• Application server with dynamic features

    class BaseServer
      def process() … end
    end

    class Server < BaseServer
      def process()
        if request.isSensitive()
          Server.class_eval { include AdditionalSecurity }
        end
        super # delegate to superclass
        … # remove mixin
      end
    end

    module AdditionalSecurity
      def process()
        … # security check
        super # delegate to superclass
      end
    end
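A runnable sketch of the pattern on this slide may help. It is an illustration under assumptions, not the authors' server: the `Request` struct, its `sensitive` field, and the `log` array are hypothetical, and the removal step is left out because stock Ruby provides no mixin "remove" operation.

```ruby
# Hypothetical request type; the `sensitive` field is an assumption.
Request = Struct.new(:sensitive)

class BaseServer
  def process(request, log)
    log << :base
  end
end

module AdditionalSecurity
  def process(request, log)
    log << :security_check
    super # delegate to BaseServer#process
  end
end

class Server < BaseServer
  def process(request, log)
    if request.sensitive
      # Dynamic mixin installation: the module is inserted between
      # Server and BaseServer in the method lookup path.
      Server.class_eval { include AdditionalSecurity }
    end
    super # delegate to the next implementation in the lookup path
    # The slide removes the mixin here; stock Ruby cannot, so it stays installed.
    log
  end
end

server = Server.new
first  = server.process(Request.new(false), []) # mixin not yet installed
second = server.process(Request.new(true), [])  # installs the mixin
third  = server.process(Request.new(false), []) # mixin is still installed
p first, second, third
p Server.ancestors.take(3)
```

The third call still runs the security check even though the request is not sensitive, which is why the target application needs a "remove" operation as well.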
Overhead is high Reasons • Invalidation granularity • clearing whole method cache • invalidating all inline caches • next calls require full method lookup • Inline caching saves just 1 target • which changes with mixin operations • even though mixin operations are mostly repeated
Our research problem • Improve performance of application which frequently uses dynamic mixin • Make invalidation granularity smaller • Make dynamic dispatch target cacheable in presence of dynamic mixin operations
Proposal • Reduce granularity of inline cache invalidation • Fine-grained state tracking • Cache multiple dispatch targets • Polymorphic inline caching • Enable cache reuse on repeated mixin installation and removal • Alternate caching
Basics: Inline caching
• Consider a call site: cat.speak()
• Dynamic dispatch implementation (executable code):

    method = lookup(cat, "speak")
    method(cat)

• Expensive! But the result is mostly the same
• Inline caching:

    if (cat has type ic.class) {
      ic.method(cat)
    } else {
      ic.method = lookup(cat, "speak")
      ic.class = cat.class
      ic.method(cat)
    }

[Diagram: class Animal and its subclass Cat, method implementation speak() { … }; the inline cache ic of the call site stores class Cat and method speak for the instance cat]
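The inline caching pseudocode above can be simulated in plain Ruby. This is only a sketch: a real inline cache lives inside the interpreter's call instruction, whereas the `CallSite` class here is a hypothetical stand-in, and `instance_method` plays the role of the full `lookup`.

```ruby
# Hypothetical model of one call site with a monomorphic inline cache.
class CallSite
  attr_reader :lookups

  def initialize(name)
    @name = name
    @cached_class = nil
    @cached_method = nil
    @lookups = 0
  end

  def call(receiver)
    unless receiver.class.equal?(@cached_class)
      # Cache miss: do the expensive lookup and remember the result.
      @lookups += 1
      @cached_method = receiver.class.instance_method(@name)
      @cached_class = receiver.class
    end
    # Cache hit (or freshly filled cache): call the stored target.
    @cached_method.bind(receiver).call
  end
end

class Animal
  def speak
    "speak"
  end
end

class Cat < Animal
end

site = CallSite.new(:speak)
cat = Cat.new
results = 3.times.map { site.call(cat) }
lookups_same_class = site.lookups

site.call(Animal.new) # a receiver of another class misses the cache
lookups_after_other_class = site.lookups
puts lookups_same_class, lookups_after_other_class
```

Three calls with the same receiver class cost a single lookup; only a receiver of a different class triggers another one.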
Inline caching: problem
• What if the method has been overridden?

    if (cat has type ic.class) {
      ic.method(cat)
    } else {
      ic.method = lookup(cat, "speak")
      ic.class = cat.class
      ic.method(cat)
    }

[Diagram: Training with its own speak() { … } is added to the Animal / Cat hierarchy, while the inline cache ic still stores class Cat and the previously cached method speak for the instance cat]
Inline caching: invalidation

    if (cat has type ic.class && state == ic.state) {
      ic.method(cat)
    } else {
      ic.method = lookup(cat, "speak")
      ic.class = cat.class; ic.state = state
      ic.method(cat)
    }

• Single global state object
• too coarse invalidation granularity

[Diagram: installing Training with speak() { … } changes the global state from 1 to 2; the inline cache ic stores class Cat, method speak and the state value seen at caching time]
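The effect of a single global state can be sketched as follows. Everything interpreter-like here is a hypothetical stand-in: `ToyVM.state` models the global state, `ToyVM.install_mixin` models a hierarchy change that bumps it, and `Dog` is an unrelated class added only to show the granularity problem.

```ruby
# Hypothetical stand-in for interpreter internals; not a real Ruby API.
module ToyVM
  @state = 1

  class << self
    attr_reader :state

    def install_mixin(klass, mod)
      klass.send(:include, mod)
      @state += 1 # any hierarchy change bumps the one global state
    end
  end
end

class CallSite
  attr_reader :lookups

  def initialize(name)
    @name = name
    @lookups = 0
    @klass = @method = @state = nil
  end

  def call(receiver)
    unless receiver.class.equal?(@klass) && ToyVM.state == @state
      @lookups += 1
      @method = receiver.class.instance_method(@name) # full lookup
      @klass = receiver.class
      @state = ToyVM.state
    end
    @method.bind(receiver).call
  end
end

class Animal
  def speak
    "speak*1*"
  end
end

class Cat < Animal
end

class Dog # unrelated to the mixin
  def bark
    "woof"
  end
end

module Training
  def speak
    "speak*2*"
  end
end

cat = Cat.new
dog = Dog.new
speak_site = CallSite.new(:speak)
bark_site = CallSite.new(:bark)

before = speak_site.call(cat)
bark_site.call(dog)
ToyVM.install_mixin(Cat, Training)
after = speak_site.call(cat)
bark_site.call(dog)
puts before, after, speak_site.lookups, bark_site.lookups
```

The state check fixes the stale-target problem for `cat.speak`, but the `dog.bark` call site also has to repeat its lookup even though nothing relevant to it changed.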
Fine-grained state tracking • Many state objects • small invalidation extent • share as much as possible • One state object for each family of methods called from the same call site • State objects associated with lookup path • links updated during method lookups • Invariant • Any change that may affect method dispatch must also trigger change of associated state object
State object allocation

Inline caching code:

    if (cat has type ic.class && ic.pstate.state == ic.state) {
      ic.method(cat)
    } else {
      ic.method, ic.pstate = lookup(cat, "speak", ic.pstate)
      ic.class = cat.class; ic.state = ic.pstate.state
      ic.method(cat)
    }

[Diagram: Animal implements speak() { *1* }; Cat has no implementation of speak; the inline cache ic of cat.speak() stores class Cat, method speak*1*, state 1 and a pstate pointer to the state object for speak, whose value is 1]
Mixin installation

Inline caching code:

    if (cat has type ic.class && ic.pstate.state == ic.state) {
      ic.method(cat)
    } else {
      ic.method, ic.pstate = lookup(cat, "speak", ic.pstate)
      ic.class = cat.class; ic.state = ic.pstate.state
      ic.method(cat)
    }

[Diagram: Training with speak() { *2* } is installed between Cat and Animal; the state object for speak changes from 1 to 2, and after the next call ic stores method speak*2* and state 2 instead of speak*1* and state 1]
Mixin removal

Inline caching code:

    if (cat has type ic.class && ic.pstate.state == ic.state) {
      ic.method(cat)
    } else {
      ic.method, ic.pstate = lookup(cat, "speak", ic.pstate)
      ic.class = cat.class; ic.state = ic.pstate.state
      ic.method(cat)
    }

[Diagram: Training with speak() { *2* } is removed; the state object for speak changes from 2 to 3, and after the next call ic stores method speak*1* and state 3 instead of speak*2* and state 2]
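The allocation, installation and removal steps above can be replayed on a toy object model. This is a heavily simplified sketch and not the authors' implementation: `ToyClass`, `ToyVM`, `StateObject` and `CallSite` are hypothetical, the receiver is represented by its class, and one state object is kept per method name rather than being attached to the lookup path and returned by `lookup`.

```ruby
class StateObject
  attr_reader :value

  def initialize
    @value = 1
  end

  def bump
    @value += 1
  end
end

# Toy class: a method table plus a superclass pointer.
class ToyClass
  attr_reader :table
  attr_accessor :parent

  def initialize(parent = nil, table = {})
    @parent = parent
    @table = table
  end
end

class ToyVM
  def initialize
    # Simplification: one state object per method name.
    @states = Hash.new { |h, name| h[name] = StateObject.new }
  end

  def state_for(name)
    @states[name]
  end

  def lookup(klass, name)
    klass = klass.parent while klass && !klass.table.key?(name)
    klass && klass.table[name]
  end

  def install(klass, mixin)
    mixin.parent = klass.parent
    klass.parent = mixin
    # Invariant: only methods the mixin can affect change state.
    mixin.table.each_key { |name| state_for(name).bump }
  end

  def remove(klass)
    mixin = klass.parent
    klass.parent = mixin.parent
    mixin.table.each_key { |name| state_for(name).bump }
  end
end

class CallSite
  attr_reader :lookups

  def initialize(vm, name)
    @vm = vm
    @name = name
    @lookups = 0
    @pstate = vm.state_for(name)
    @klass = @method = @state = nil
  end

  def call(klass)
    unless klass.equal?(@klass) && @pstate.value == @state
      @lookups += 1
      @method = @vm.lookup(klass, @name)
      @klass = klass
      @state = @pstate.value
    end
    @method.call
  end
end

animal = ToyClass.new(nil, { speak: -> { "speak*1*" }, eat: -> { "eat" } })
cat = ToyClass.new(animal)
training = ToyClass.new(nil, { speak: -> { "speak*2*" } })
vm = ToyVM.new
speak_site = CallSite.new(vm, :speak)
eat_site = CallSite.new(vm, :eat)

trace = [speak_site.call(cat)]
eat_site.call(cat)
vm.install(cat, training)
trace << speak_site.call(cat)
eat_site.call(cat)
vm.remove(cat)
trace << speak_site.call(cat)
eat_site.call(cat)
p trace, speak_site.lookups, eat_site.lookups, vm.state_for(:speak).value
```

The state for `speak` goes 1, 2, 3 as on the slides, and the `eat` call site keeps hitting its cache throughout because the mixin never touches its state object.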
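Alternate caching • Detect repetition • Conflicts detected by state check • Inline cache contents oscillate [Diagram: an alternate cache attached to the super pointer of Cat has entries for Training and Animal holding the speak state values 4 and 3; as Training with speak() { *2* } is installed and removed, the state object for speak and the inline cache ic of cat.speak() alternate between state 3 with method speak*1* and state 4 with method speak*2*]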
Polymorphic caching • Use multiple entries in inline cache [Diagram: same alternate cache with entries for Training and Animal holding the speak state values 4 and 3; the inline cache ic of cat.speak() holds two entries for class Cat: method *1* with state 3 and method *2* with state 4]
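How alternate caching and polymorphic inline caching work together can be sketched on a toy object model. The sketch rests on several assumptions: all class names are hypothetical, the receiver is represented by its class, state objects are kept per method name, the polymorphic cache is unbounded, and the conflict detection mentioned on the alternate caching slide is omitted.

```ruby
class StateObject
  attr_accessor :value

  def initialize(value)
    @value = value
  end
end

class ToyClass
  attr_reader :table, :alternate
  attr_accessor :parent

  def initialize(parent = nil, table = {})
    @parent = parent
    @table = table
    @alternate = {} # parent target => { method name => saved state value }
  end
end

class ToyVM
  def initialize
    @fresh = 0
    @states = Hash.new { |h, name| h[name] = StateObject.new(fresh_value) }
  end

  def fresh_value
    @fresh += 1
  end

  def state_for(name)
    @states[name]
  end

  def lookup(klass, name)
    klass = klass.parent while klass && !klass.table.key?(name)
    klass && klass.table[name]
  end

  def install(klass, mixin)
    mixin.parent = klass.parent
    switch_parent(klass, mixin, mixin.table.keys)
  end

  def remove(klass)
    mixin = klass.parent
    switch_parent(klass, mixin.parent, mixin.table.keys)
  end

  private

  # Alternate caching: save the state values of the old configuration and
  # reuse the values saved for the new one if it has been seen before.
  def switch_parent(klass, new_parent, names)
    saved = klass.alternate[klass.parent] = {}
    names.each { |n| saved[n] = state_for(n).value }
    restored = klass.alternate[new_parent] || {}
    names.each { |n| state_for(n).value = restored[n] || fresh_value }
    klass.parent = new_parent
  end
end

class PolymorphicCallSite
  attr_reader :lookups

  def initialize(vm, name)
    @vm = vm
    @name = name
    @lookups = 0
    @pstate = vm.state_for(name)
    @entries = {} # [class, state value] => method
  end

  def call(klass)
    key = [klass, @pstate.value]
    unless @entries.key?(key)
      @lookups += 1
      @entries[key] = @vm.lookup(klass, @name)
    end
    @entries[key].call
  end
end

animal = ToyClass.new(nil, { speak: -> { "speak*1*" } })
training = ToyClass.new(nil, { speak: -> { "speak*2*" } })
cat = ToyClass.new(animal)
vm = ToyVM.new
site = PolymorphicCallSite.new(vm, :speak)

results = []
3.times do
  results << site.call(cat)
  vm.install(cat, training)
  results << site.call(cat)
  vm.remove(cat)
end
p results, site.lookups
```

Because repeated installation and removal reuse the same two state values, the two cache entries stay valid and only the first pass through the loop performs lookups.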
State object merge • One-time invalidation [Diagram: executable code with the call sites animal.speak() and cat.speak() and a while(true) { … } loop containing "remove mixin"; Animal speak() { *1* } is overridden by Training speak() { *2* }; labels S and Q mark the objects being merged; instances animal and cat (class Cat)]
Overheads of proposed scheme • Increased memory use • 1 state object per polymorphic method family • additional method entries • alternate cache • polymorphic inline cache entries • Some operations become slower • Lookup needs to track and update state objects • Explicit state object checks on method dispatch
Generalizations (beyond Ruby) • Delegation object model • track arbitrary delegation pointer change • Thread-local delegation • allow for thread-local modification of delegation pointer • by having thread-local state object values • Details in the article…
Evaluation • Implementation based on Ruby 1.9.2 • Hardware • Intel Core i7 860 2.8 GHz
Evaluation: microbenchmarks • Single method call overhead • Inline cache hit • state checks 1% • polymorphic inline caching 49% overhead • Full lookup • 2x slowdown
Dynamic mixin-heavy microbenchmark (smaller is better)
Evaluation: application • Application server with dynamic mixin on each request (smaller is better)
Evaluation • Fine-grained state tracking considerably reduces overhead • Alternate caching brings only small improvement • Number of call sites affected by mixin is low • Lookup cost / inline cache hit cost is low • about 1.6x on Ruby
Related work • Dependency tracking in Self • focused on reducing recompilation, rather than reducing method lookups • Inline caching for Objective-C • state object associated with method, no dynamic mixin support
Conclusion • We proposed a combination of techniques • Fine-grained state tracking • Alternate caching • Polymorphic inline caching • To increase efficiency of inline caching • with frequent dynamic mixin installation and removal
Method caching in Ruby • Global hashtable • indexed by method name and class • On method lookup • gives answer in 1 hash lookup • On miss • answer obtained by recursive lookup • result stored in method cache • On method redefinition or mixin operation • method cache cleared completely
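The behaviour described on this slide can be approximated with a small simulation. It is a sketch only: `GlobalMethodCache` is a hypothetical class, `instance_method` stands in for the recursive lookup, and the cache is cleared by hand after the mixin operation instead of by the interpreter.

```ruby
# Hypothetical model of a global method cache keyed by class and method name.
class GlobalMethodCache
  attr_reader :misses

  def initialize
    @table = {}
    @misses = 0
  end

  def lookup(klass, name)
    @table.fetch([klass, name]) do |key|
      # Miss: fall back to the full lookup and store the answer.
      @misses += 1
      @table[key] = klass.instance_method(name)
    end
  end

  def clear
    @table.clear
  end
end

class Animal
  def speak
    "speak*1*"
  end

  def eat
    "eat"
  end
end

class Cat < Animal
end

module Training
  def speak
    "speak*2*"
  end
end

cache = GlobalMethodCache.new
cache.lookup(Cat, :speak) # miss
cache.lookup(Cat, :eat)   # miss
cache.lookup(Cat, :speak) # hit: one hash lookup
misses_before = cache.misses

Cat.send(:include, Training) # mixin operation
cache.clear                  # the whole cache is thrown away

cache.lookup(Cat, :eat)      # miss again, although eat is unaffected
speak_owner = cache.lookup(Cat, :speak).owner
puts misses_before, cache.misses, speak_owner
```

Clearing everything keeps the cache correct after the mixin is installed, but it also discards the entry for `eat`, which is the coarse invalidation the proposal aims to avoid.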