800 likes | 881 Views
Delve into the world of Just-In-Time (JIT) compilation, its definitions, benefits, limitations, security aspects, history, and applications in Java, mobile, and web environments. Learn about compilers, interpreters, and JIT's impact on performance and memory usage.
E N D
JIT Zak Sang, Laura Tebben
Agenda • Definitions: Compilers vs. interpreters vs. JIT • Timeline • Benefits of JIT • Limitations of JIT • Security of JIT • Applications • Java • Mobile • Web
Introduction From Wikipedia: “In computing, just-in-time (JIT) compilation (also dynamic translation or run-time compilations) is a way of executing computer code that involves compilation during execution of a program – at run time – rather than prior to execution. Most often, this consists of source code or more commonly bytecode translation to machine code, which is then executed directly.”
Compiler • Translates all source code to binary machine code upfront • Requires a lot of memory since object code is generated • Very performant since translation work is done ahead of time
Interpreter • Also translates source code to machine code • but does so at runtime • translation can be done at the line, statement, or instruction level • Requires less memory since object code is not produced • Instead machine code is generated as needed • Generally less performant than compiled programs because translation happens at runtime* • * optimization specific to running environment can be used to help this
JIT Compiler • Translates source code to an IR (eg: bytecode), which is dynamically translated to machine code at run time • Adds more optimization than interpreters • Combines benefits of interpreters and traditional compilers • Translation to IR gets some of the translation work out of the way and saves computation at runtime • Optimizations happen at run-time for frequently used portions of code • Platform-specific optimizations can be used • Allows for more flexibility
More Info on JIT • AOT compilation to bytecode • Parse source code, basic optimizations • Bytecode • Interpreted by VM • JIT compiler dynamically translated bytecode to native machine code • cached
Timeline - Compilers • 1950s - Early compiled languages used to allow for higher level languages • FORTRAN • 1957 - First commercially available compiler • COBOL • 1960 - First cross-platform compiler compiled to RVAC 501 and RCA II • Performance optimization • 1970 - BLISS - A systems language eventually overtaken by C • Late 1980s - Programming directly in assembly starts to decline
Timeline - Interpreters • 1952 - Interpreters introduced to improve working with limited computers • Limited program storage space • ~1960 - LISP ‘eval’ function implemented by Steve Russell • Evaluate individual LISP expressions • Theorized in design by John McCarthy • 1975 - Microsoft BASIC • Popular on personal computers (1970s-1980s)
Timeline - JIT Compilers • ~1960 - LISP • translation at runtime • related to interpreters • 1968 - Regex in QED editor (Ken Thompson) • 1983 - Smalltalk • translation on demand • with caching of compiled code • resulted in the Self dialect • ~1993 - Java • Patterns derived from Self/ Smalltalk
Timeline - JIT Compilers • ~2009 - JIT on the Web • Firefox 3.5 (TraceMonkey) • ~2009-2010 - JIT on Android • Tracing added to DVM in 2010 • ~2014 - JIT removed from Android • Replaced with native AOT compilation • Improved performance • Improved power consumption • 2016 - JIT added back to Android • Improves performance at runtime • Saves storage space for apps • Speeds up updates (since apps aren’t recompiled upfront on each update)
Compiled Code • Quick program startup time • Quick runtime • Compilation overhead incurred only once
Interpreted Code • Faster to develop in • More flexible (https://en.wikipedia.org/wiki/Interpreted_language) • Platform independent • Dynamic typing • Dynamic scoping • Reflection • Possibly a smaller executable program size
JIT - Storage • More portable since object code isn’t generated • Hardware agnostic • Bytecode is compact • Can potentially save a lot of memory because only parts of the program that are used are JIT-compiled
JIT - Performance • Faster startup time than compiling AOT • NOT faster than executing precompiled binary • Initial translation to IR makes JIT-compiled code faster than interpreters • Translation caching can help reduce the runtime translation effort • JIT compilation allows for platform-specific optimizations • Adaptive optimization uses runtime information to further optimize the generated machine code
JIT - Optimizations • Optimize to the CPU • Optimize to the version of the OS • Change compilation based on usage stats • Inline library functions • Can easily rearrange code for better cache utilization
Compiled Code • Lack of cross-platform support • Can’t optimize for runtime behavior • Slower to develop in compiled languages
Interpreted Code • Slower runtime than compiled code • Must ship source code • Prone to programmer error with regards to typing • More susceptible to code injection attacks
JIT - More complex implementation • Requires multiple steps • AOT compile to bytecode • Bytecode interpreted (HotSpot) • Compiled to machine code at runtime when threshold is hit • Recompiled with higher optimization when another threshold is hit • Deoptimized if assumptions change • Could drastically increase compilation time
JIT - Heavier runtime • Like interpreters, JIT compilers require more work to be done during runtime which may take resources away from the main program • Increased memory, CPU usage • Often results in slower run speeds • Delay when compiling to machine code
Security • JIT requires executable pages containing arbitrary code to be generated • Code can be crafted to generate nop sleds and arbitrary code • Can be used with an arbitrary code execution vulnerability to run the generated code
push var a load push pop push ... ... ...
push var a load push pop push ... ... ...
push var a load var b push pop push ... ... ...
push var a load push pop push ... ... ...
push var a load var c push pop push ... ... ...
push var a load var c push pop push ... ... ...
push var a load var c push pop push ... ... ...
push var a load var c push pop push ... ... ...
Security - Executable Pages • JIT requires writable pages to be executable since machine code is generated at runtime (data is code) • Now arbitrary code can be generated (JIT Spray) • http://www.semantiscope.com/research/BHDC2010/BHDC-2010-Paper.pdf • Potential Vulnerability: • Attacker finds buffer overflow • Buffer is constructed containing valid instructions • Attacker redirects execution to these instructions
https://media.blackhat.com/bh-us-11/Rohlf/BH_US_11_RohlfIvnitskiy_Attacking_Client_Side_JIT_Compilers_Slides.pdfhttps://media.blackhat.com/bh-us-11/Rohlf/BH_US_11_RohlfIvnitskiy_Attacking_Client_Side_JIT_Compilers_Slides.pdf
Security - Executable Pages in JIT (solutions) • Have pages only be writable or executable • https://jandemooij.nl/blog/2015/12/29/wx-jit-code-enabled-in-firefox/
Security - Executable Pages in JIT (solutions) • JIT Emission Randomization • Similar to ASLR, but for output of JIT compiler • Can randomize spacing and ordering of functions • Randomization may still be predictable
Security - Executable Pages in JIT (solutions) • Constant folding - break constants into scatter 2-byte chunks an reassemble at runtime • Can be defeated if scattering is predictable • Not effective on some instructions
Security - Executable Pages in JIT (solutions) • Constant blinding - xor constants with a secret • Used by most browsers
References • https://www.ibm.com/support/knowledgecenter/en/SSYKE2_8.0.0/com.ibm.java.vm.80.doc/docs/jit_overview.html • https://en.wikipedia.org/wiki/Just-in-time_compilation • https://www.ibm.com/support/knowledgecenter/en/SSYKE2_8.0.0/com.ibm.java.vm.80.doc/docs/jit_optimize.html • http://www.cs.columbia.edu/~aho/cs6998/Lectures/14-09-22_Croce_JIT.pdf • https://aboullaite.me/understanding-jit-compiler-just-in-time-compiler/ • https://en.wikipedia.org/wiki/Optimizing_compiler#History • https://en.wikipedia.org/wiki/History_of_compiler_construction • https://www.linuxjournal.com/forums/history-compilers • http://www.semantiscope.com/research/BHDC2010/BHDC-2010-Paper.pdf • https://jandemooij.nl/blog/2015/12/29/wx-jit-code-enabled-in-firefox/
References (contd) • https://media.blackhat.com/bh-us-11/Rohlf/BH_US_11_RohlfIvnitskiy_Attacking_Client_Side_JIT_Compilers_Slides.pdf • https://www.ibm.com/support/knowledgecenter/zosbasics/com.ibm.zos.zappldev/zappldev_85.htm • http://time.com/69316/basic/ • https://en.wikipedia.org/wiki/Interpreted_language • https://source.android.com/devices/tech/dalvik/jit-compiler.html • https://en.wikipedia.org/wiki/Android_Runtime
Agenda • Native (Java) • Mobile (Android) • Web (JavaScript)
The basics • Before runtime • Translates source code to IR (bytecode) • At runtime • Compiles bytecode to machine code • Determines semantics of individual bytecodes • Using trees • Saves the compilation results for future use
HotSpot JVM • Interpreter • Template-based • Maps machine code to each bytecode instruction • Tiered compilation • Client compiler (C1): • Focus is compilation speed • Limited set of optimizations • Server compiler (C2): • Focus is code performance • Heavy overhead
HotSpot JVM • C1 and C2 compilers • C1 • 3 phases • Front end constructs high level IR from bytecode (platform independent) • Uses static single assignment form to enable optimizations • Back end generates low level IR from high level IR (platform dependent) • Register allocation, optimization, machine code • C2 • Fully optimizing compiler • Uses static single assignment IR • Does typical compiler optimizations • Also some Java-specific optimizations • Null-check elimination, range-check elimination, etc