Design and Implementation of the Joeq Virtual Machine
John Whaley
Stanford University
Sun Microsystems Labs
Mountain View, CA
August 26, 2003
About me
• Worked on Java VMs since JDK 1.0
• 1996: Extended AWT to support pen input
• 1997: Clean-room Java VM written in C++
• 1998: Jalapeno: designed opt compiler, …
• 1999: MIT Flex: dataflow framework, etc.
• 2000: IBM Tokyo JIT: x86 performance
• 2001: joeq virtual machine
Key Features
• Implemented in 100% Java
• Includes native methods to manipulate addresses, memory, registers directly.
• Native vs. hosted execution
• Native: run directly on hardware
• Hosted: run on top of another VM
• Bootstrap to native via reflection
• Supports both GC and explicit deallocation
Key Features
• Compiler and program analysis framework
• Multiple languages: Java, C, C++, …
• Single intermediate representation
• Static, quasi-static, and dynamic compilation
• Single unified compiler infrastructure
• Online and offline profiling system
• M:N thread scheduler
Motivation/Purpose
• Started Ph.D. studies, needed a research infrastructure
• Purpose:
• Try out new ideas
• Do research
• Publish papers
• Not out to:
• Compete with other VMs
• Make a shippable product
• Change the world
Other Options
• SUIF
• Written in C++
• Limited support for Java
• No dynamic compilation or runtime system
• EDG frontend: not 100% gcc compatible
• Jalapeno
• Written in Java
• Very familiar with the system
• Supports Java only
• Not available outside of IBM
Other Options
• MIT Flex compiler
• Written in Java
• Familiar with system
• Open-source GPL
• Statically-compiled Java only
• Kaffe, etc.
• Written in C
• Poor design, poor performance
Why Another VM?
• General problem with established projects:
• Established users and code base made it difficult to make major changes.
• Wanted to fix the design "mistakes" of Jalapeno and MIT Flex compiler
• More productive in Java than in C++
Design Goals
• Ease of trying out new research ideas
• Implemented in Java
• Modularity.
• Lots of reusable code, use of software patterns.
• Support Java and C/C++
• A single intermediate representation
• Support GC and explicit deallocation
Design Goals
• Support static, quasi-static, dynamic compilation.
• Unified compiler framework.
• Compiler implemented in Java.
• Allow "maybe" responses due to incomplete information.
• General code patching mechanism.
• Profile framework allows online/offline profiling.
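One way to read the "maybe" goal is as a three-valued answer to compiler queries. The following is a minimal sketch, assuming a class hierarchy in which only some classes have been loaded so far. `Answer`, `PartialHierarchy`, and the class names used with it are hypothetical and not taken from joeq's source.

```java
import java.util.HashMap;
import java.util.Map;

/** Three-valued result for queries asked with incomplete information. */
enum Answer { YES, NO, MAYBE }

/** Hypothetical class hierarchy that knows only the classes loaded so far. */
final class PartialHierarchy {
    // Maps each loaded class to its superclass (null for the root).
    // Assumes the recorded hierarchy is acyclic.
    private final Map<String, String> superOf = new HashMap<String, String>();

    void load(String cls, String superCls) {
        superOf.put(cls, superCls);
    }

    /** Is `sub` the same as, or a subclass of, `sup`? */
    Answer isSubclass(String sub, String sup) {
        String c = sub;
        while (c != null) {
            if (c.equals(sup)) return Answer.YES;
            // The superclass chain leaves the loaded part of the hierarchy,
            // so neither YES nor NO can be proven yet.
            if (!superOf.containsKey(c)) return Answer.MAYBE;
            c = superOf.get(c);
        }
        return Answer.NO; // walked past the root without meeting `sup`
    }
}
```

A client could treat `MAYBE` conservatively, or compile speculatively and fall back on code patching if the assumption later turns out to be wrong.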
Design Goals
• Get something up and running quickly.
• Make compiler, runtime easy to debug
• Hijack class libraries from running VM
• LGPL: can borrow code from other open-source projects
• Goal: Self-bootstrapping after one month
• Make it available for others to use.
• Documentation, etc.
Not Design Goals
• Performance leader
• An endless pit, takes a lot of effort
• Performance just needs to be “reasonable”
• Should be designed for good performance if someone wanted to put in the effort
• 100% conformance to specification
• If programs work, that’s good enough.
• No access to good test suites, anyway.
System Overview
Consequences of 100% Java
• Implementation purity
• Self-applicable
• VM code is great for program analysis, makes a great test suite
• Portability
• >95% of the code is system-independent
• Hosted execution
• Easier software engineering
• Exceptions, GC, software patterns, existing tools
Consequences of 100% Java
• Java is not a panacea of portability
• Hosted execution works OK on most VMs
• Native bootstrapping is horribly VM-dependent
• Internal class library changes cause Joeq to break
• Supporting multiple JDK versions is difficult
Bootstrapping technique
• Use reflection and code analysis to determine root set of methods and objects
• Dump the objects and code into an object file (COFF or ELF format)
• Use a standard linker to generate an executable
• Easy support for static and quasi-static compilation, cross-language calls, dynamic linking, etc.
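The object half of the first step, finding everything reachable from a root by reflection, might look roughly like the following. This is a sketch only: `ReachableObjects` and `Node` are hypothetical names, and the code analysis that finds the required methods is not shown. Fields that the host VM refuses to open are simply skipped here.

```java
import java.lang.reflect.Array;
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.Set;

/** Hypothetical demo type: a linked node with one reference field. */
final class Node {
    Node next;
    int id;
}

/** Collects every object reachable from a root via instance fields and array elements. */
final class ReachableObjects {
    static Set<Object> from(Object root) {
        // Identity-based set: distinct objects that are equals() must both be kept.
        Set<Object> seen = Collections.newSetFromMap(new IdentityHashMap<Object, Boolean>());
        Deque<Object> work = new ArrayDeque<Object>();
        visit(root, seen, work);
        while (!work.isEmpty()) {
            Object o = work.pop();
            Class<?> c = o.getClass();
            if (c.isArray()) {
                // Only arrays of references can point at further objects.
                if (!c.getComponentType().isPrimitive()) {
                    int n = Array.getLength(o);
                    for (int i = 0; i < n; i++) {
                        visit(Array.get(o, i), seen, work);
                    }
                }
                continue;
            }
            // Walk the class and its superclasses to reach private and inherited fields.
            for (; c != null; c = c.getSuperclass()) {
                for (Field f : c.getDeclaredFields()) {
                    if (Modifier.isStatic(f.getModifiers()) || f.getType().isPrimitive()) continue;
                    try {
                        f.setAccessible(true);
                        visit(f.get(o), seen, work);
                    } catch (RuntimeException | IllegalAccessException e) {
                        // The host VM does not expose this field; skip it.
                    }
                }
            }
        }
        return seen;
    }

    /** Queues an object the first time it is seen; ignores null. */
    private static void visit(Object o, Set<Object> seen, Deque<Object> work) {
        if (o != null && seen.add(o)) work.push(o);
    }
}
```

Calling `ReachableObjects.from(root)` returns the set of objects a boot image would have to contain for that root. Writing them out in COFF or ELF form is a separate step.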
Bootstrapping trickiness
• Custom class loaders
• Have to hijack class loader and wrap it
• Files, etc. must be reinitialized
• Some state stored in native code
• Objects created during image write
• Finalizer threads, reflection caches, character encodings, …
• Reflection doesn’t work on all objects
• Throwable backtrace, ThreadLocal, etc.
Consequences of bootstrapping technique
• Standard file formats very useful
• Use existing tools and debuggers
• Big startup time improvement on applications (30x)
• Skips all of the initialization code, JIT startup costs
• Large object files, number of relocations cause problems with some tools.
Consequences of bootstrapping technique
• Automatic discovery of necessary code: time-consuming, too conservative.
• Hardwired class list: smaller and faster, but breaks often.
• Problem: Instantiating an object means class is initialized, which brings in class initializer and many more objects
Consequences of bootstrapping technique
• Bootstrapping process is a major pain
• Time-consuming: reflection is inefficient
• Difficult to debug
• Process breaks with different JDK versions, environment variables, command line options, locales, etc.
Class library implementation
• GNU Classpath: too incompatible, too buggy
• Hijack Sun class library by class merging
• Make a “mirror” class with the same name.
• Special class loader merges the classes.
• Easy implementation of native methods.
• Native code is just normal Java code.
• Perfect compatibility, easy updates
Consequences of mirror classes
• Types don’t match, so javac complains
• Cast to java.lang.Object, then back down.
• Doesn’t work on different class libraries.
• Many changes between subversions.
• Use a hierarchy of mirror classes
• Incompatible changes lead to many hacks.
Multiple language support
• Joeq has support for:
• Java class files
• SUIF files
• C, C++, Fortran, …
• x86 object code
• All are translated into a single intermediate representation, the Quad.
Quad intermediate representation
• Analyses and optimizations are instantly applicable to all languages
• Cross-language inlining and optimization
• Elimination of JNI overhead
• Support for raw address manipulation in Java falls out naturally
• Type-accurate garbage collection for well-behaved C/C++ programs
Quad intermediate representation
• Generic interfaces for operators
• Lots of shared code
• Types are optional
• Type analysis will construct type information
• Doesn’t support all esoteric C/C++ features
• Computed labels, C++ nastiness, etc.
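To make the idea of one instruction shape shared by every front end concrete, here is a small sketch of a quad-style instruction with optional operand types. The names `Op`, `Operand`, and `Quad` and their fields are assumptions chosen for illustration. They do not reproduce joeq's actual classes or its operator hierarchy.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical operator set; a real operator hierarchy would be much larger. */
enum Op { MOVE, ADD, LOAD, STORE }

/** A register or constant. The type is optional and may be filled in later. */
final class Operand {
    final String name;
    final String type; // null when no type information is available yet

    Operand(String name, String type) {
        this.name = name;
        this.type = type;
    }

    @Override public String toString() {
        return type == null ? name : name + ":" + type;
    }
}

/** One instruction: an operator, an optional destination, and its source operands. */
final class Quad {
    final Op op;
    final Operand dest; // null for operators that define nothing, such as STORE
    final List<Operand> srcs;

    Quad(Op op, Operand dest, Operand... srcs) {
        this.op = op;
        this.dest = dest;
        // Copy the varargs array so later changes by the caller cannot leak in.
        this.srcs = Collections.unmodifiableList(Arrays.asList(srcs.clone()));
    }

    @Override public String toString() {
        StringBuilder sb = new StringBuilder();
        if (dest != null) sb.append(dest).append(" = ");
        sb.append(op);
        for (Operand s : srcs) sb.append(' ').append(s);
        return sb.toString();
    }
}
```

Because an untyped operand is just one whose `type` is `null`, a front end for C could emit quads without types and leave it to a later type analysis to fill them in, while a Java front end could supply them up front.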
Hierarchy of Operators
Memory management
• Memory management is abstracted into different heaps
• Each heap has its own allocation/deallocation policy
• Interface for querying garbage collection policies
• Type-accurate, semi-accurate, conservative
• GC-safe points or at any instruction
• Thread-local allocation pools
• Working out an interface with JMTk
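The heap abstraction can be pictured as one interface with several policies behind it. The sketch below assumes a hypothetical `Heap` interface over a simulated address space. `BumpHeap` stands in for a collector-managed heap and `FreeListHeap` for one with explicit deallocation. None of these names come from joeq or JMTk.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical heap interface; addresses are offsets into a simulated region. */
interface Heap {
    /** Returns the address of a new block, or -1 if the request cannot be met. */
    int allocate(int size);

    /** Returns true if the block was released, false if this heap cannot release it. */
    boolean free(int address);
}

/** Bump-pointer heap: allocation only; space would be reclaimed by a collector. */
final class BumpHeap implements Heap {
    private final int limit;
    private int top = 0;

    BumpHeap(int limit) {
        this.limit = limit;
    }

    public int allocate(int size) {
        if (size <= 0 || size > limit - top) return -1;
        int address = top;
        top += size;
        return address;
    }

    public boolean free(int address) {
        return false; // explicit deallocation is not part of this policy
    }
}

/** Fixed-size-block heap with explicit deallocation through a free list. */
final class FreeListHeap implements Heap {
    private final int blockSize;
    private final Deque<Integer> freeBlocks = new ArrayDeque<Integer>();
    private final Set<Integer> live = new HashSet<Integer>();

    FreeListHeap(int blockSize, int blockCount) {
        this.blockSize = blockSize;
        for (int i = 0; i < blockCount; i++) freeBlocks.addLast(i * blockSize);
    }

    public int allocate(int size) {
        if (size <= 0 || size > blockSize || freeBlocks.isEmpty()) return -1;
        int address = freeBlocks.removeFirst();
        live.add(address);
        return address;
    }

    public boolean free(int address) {
        // Reject addresses that are not currently allocated (including double frees).
        if (!live.remove(address)) return false;
        freeBlocks.addFirst(address);
        return true;
    }
}
```

Code that allocates through `Heap` does not need to know which policy is in force, which is the property that lets garbage-collected and explicitly managed data coexist.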
Consequences of memory management framework
• Debugging
• Run under hosted execution mode
• Image snapshots
• 100% type-accurate is hard
• Coordinating threads for GC
• Making a general interface is tricky
Thread scheduler
• M:N thread scheduler
• Lightweight Java threads
• Thread switch at any instruction
• Uses local thread queues and work-stealing
• Timer ticks by using setitimer interrupts (Linux) or a separate thread (Windows)
• Thread-local information stored off of fs register
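The combination of local queues and work-stealing can be sketched as follows. This illustrates the general technique with a hypothetical `WorkStealingWorker` class built on `java.util.concurrent`. It is not joeq's scheduler, which switches Java threads rather than running `Runnable` tasks.

```java
import java.util.concurrent.ConcurrentLinkedDeque;

/** Hypothetical worker that owns a local queue and steals when it runs dry. */
final class WorkStealingWorker {
    private final ConcurrentLinkedDeque<Runnable> queue = new ConcurrentLinkedDeque<Runnable>();

    /** The owner adds new work at the tail of its own queue. */
    void push(Runnable task) {
        queue.addLast(task);
    }

    /**
     * Returns the next task to run, or null if no worker has any.
     * The owner takes the newest task from its own tail; a thief takes the
     * oldest task from a victim's head, so the two rarely touch the same end.
     */
    Runnable next(WorkStealingWorker[] all) {
        Runnable task = queue.pollLast();
        if (task != null) return task;
        for (WorkStealingWorker victim : all) {
            if (victim == this) continue;
            task = victim.queue.pollFirst();
            if (task != null) return task;
        }
        return null;
    }
}
```

In a real M:N scheduler each native thread would loop on `next`, so an idle processor picks up queued work from a busy one without a central run queue.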
Consequences of Java thread scheduler
• Accessing threads in a machine-independent way is not easy
• Linux pthread implementation is broken
• Lots of bugs, race conditions, inefficiencies
• Changing stack pointer is not always supported
• Use of fs register is not always supported
• Windows support is much nicer (?)
Running an Open-Source Project
• Lots of interest, but very few people actually follow through
• Not many people have the skills
• Of those, not many have the time
• Of those, even fewer have the perseverance
• The result is that there have only been minor contributions by others
• Documentation, testing, file releases, updating the web site all take time.
Running an Open-Source Project
• What’s needed:
• Nightly build scripts and regression testing
• Implementation hackers
• People interested in GC
Conclusion: What I’ve learned
• Software patterns are useful
• Joeq: 100K lines of code
• Modular design is key
• Trying out new type checker: ~2 hours
• For maximum efficiency, design the system to be easily debuggable.
• Preemptively eliminate obvious problems.
• It’s more fun to write code when you also write the compiler.