Dive into the core mechanisms of Object-Oriented languages & understand the importance of Virtual Machines in modern programming. Discover how VMs enhance portability, security, & memory efficiency. Learn about compiling Smalltalk to bytecodes & decoding bytecode instructions. Uncover the strength of VMs in enabling security features, ease of memory handling, and executing reflective programs through an explicit model. Immerse yourself in the intricate workings of Virtual Machines.
Getting Under the Hood: Exploring Mechanisms of OO Languages
Object-Oriented Analysis
Agenda
Virtual Machines
• Why bother?
• The Squeak VM
• Comparison to Java VM
Generic functions (guest lecture)
Garbage collection (Monday)
Goal: Either motivate you to tinker under the hood yourself, or at least make you aware of the important mechanisms underlying many OO languages.
Motivation
Last lecture, we talked about how to explore performance issues in Squeak.
OO and other modern languages have many features that ease the transition from design ideas to implementation.
These features appeal to programmers/designers.
In the end, everything boils down to pushing around bits.
How design features are implemented in OO languages appeals to another kind of programmer.
What is a Virtual Machine?
Tell me what you know.
Virtual Machines
• Why use a virtual machine?
  • Portability, memory, security
• How Smalltalk compiles and executes bytecodes
• Specific details of Squeak’s VM
  • Object memory, primitives, garbage collection (later)
• How Java’s VM differs
  • Support for native types
Strengths of Virtual Machines
Portability
• Virtual machines allow for binary compatibility between platforms
• A simple VM implementation allows for easy porting of the VM
Memory handling
• Programmer doesn’t explicitly allocate and de-allocate memory
• Typically, no pointers
• VM provides garbage collection
Strengths of Virtual Machines
Security
• Not running on the bare machine, so it can be safer
Explicit execution model
• Easier to create reflective programs, meta-programming
• Threads and control structures are available to the programmer
Strengths of Virtual Machines
More memory efficient
• Definitely an issue for low-power, low-memory consumer devices
Can be faster at startup
• Side effect of being more memory efficient
For some activities, speed is a wash
• Floating-point, network...
Compiling Smalltalk to Bytecodes
Bytecodes are the instructions of the Smalltalk virtual machine
The Smalltalk virtual machine is not register-based (like Pentium, Sparc, etc.)
• There are registers, but they’re not made visible to the programmer
Instead, it’s stack-based
• Parameters are pushed onto the stack, and then are manipulated from there
Kinds of Bytecodes in the Smalltalk VM
• Push bytecodes
  • Put things (like instance variables) on the stack for manipulation
• Store bytecodes
  • From stack into object variable
• Send bytecodes
  • Cause a message to be sent
• Return bytecodes
  • Return top of stack, or nil, true, false, self
• Jump bytecodes
  • For optimizing branches
Shhh… It’s not all messages!
Jump bytecodes allow us to optimize some operations
Example Compilation of Smalltalk
    center
        ^origin + corner/2
Compute the center point of a Rectangle
• Binary messages evaluate left to right, so this computes (origin + corner) / 2
• It’s actually a little different in current Squeak
• But if you type the above in, it will compile to the same bytecodes
Bytecodes: 0, 1, 176, 119, 185, 124
Meaning of bytecodes
0: Push the value of the receiver’s first instance variable (origin) onto the stack
1: Push the value of the receiver’s second instance variable (corner) onto the stack
176: Send a binary message with the selector + (result is left on the stack)
119: Push the SmallInteger 2 onto the stack
185: Send a binary message with the selector / (result is left on the stack)
124: Return the object on top of the stack as the value of the message center
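The table above can be written down as a small lookup. This is a minimal Python sketch, not Squeak's decoder: it knows only the six bytecodes used by center, and the names CENTER_BYTECODES and describe are made up for illustration.

```python
# Hypothetical decoder covering only the bytecodes that appear in `center`.
CENTER_BYTECODES = [0, 1, 176, 119, 185, 124]

MEANINGS = {
    0: "push receiver variable 0 (origin)",
    1: "push receiver variable 1 (corner)",
    176: "send +",
    119: "push constant 2",
    185: "send /",
    124: "return stack top",
}


def describe(bytecode):
    """Return a readable description of one bytecode, or fail loudly."""
    try:
        return MEANINGS[bytecode]
    except KeyError:
        raise ValueError(f"bytecode {bytecode} is not part of this sketch")


if __name__ == "__main__":
    for code in CENTER_BYTECODES:
        print(f"{code:>3}: {describe(code)}")
```

Reading the output top to bottom shows the left-to-right evaluation: both operands are pushed, + is sent, then 2 is pushed and / is sent to the sum.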
Methods go in a MethodDictionary
Every method is compiled into a CompiledMethod
The CompiledMethod is stored in a MethodDictionary (one for every class) with the selector of the method as the key
Lookup of a message involves finding the class that can return a CompiledMethod for the sought-after message selector
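Lookup can be pictured as a walk from the receiver's class up through its superclasses, asking each method dictionary for the selector. The Python sketch below is a model under that assumption; ToyClass and lookup are hypothetical names, and strings stand in for CompiledMethods.

```python
class ToyClass:
    """Stand-in for a Smalltalk class: a name, a superclass, a method dictionary."""

    def __init__(self, name, superclass=None):
        self.name = name
        self.superclass = superclass
        self.method_dictionary = {}  # selector -> compiled method (a string here)


def lookup(cls, selector):
    """Return (defining class, method), or None if no class understands it."""
    while cls is not None:
        method = cls.method_dictionary.get(selector)
        if method is not None:
            return cls, method
        cls = cls.superclass  # not found here, so try the superclass
    return None  # a real system would send doesNotUnderstand: at this point


if __name__ == "__main__":
    object_class = ToyClass("Object")
    object_class.method_dictionary["printString"] = "Object>>printString"
    rectangle = ToyClass("Rectangle", object_class)
    rectangle.method_dictionary["center"] = "Rectangle>>center"
    print(lookup(rectangle, "printString")[1])
```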
Constants go in the Literal Frame
Each method gets compiled into a CompiledMethod
A CompiledMethod includes a header, the bytecodes, and a “literal frame”
• The literal frame contains the constants from the method
• Strings, numbers that are too big to be SmallIntegers, symbols for messages, etc.
• Obviously, references to the literal frame are a bit slower because it’s another memory access
An Example with Literals
Transcript and the string are literals that appear (earlier) in the CompiledMethod
What does that mean?
• Push on the stack the literal #Transcript
• Duplicate it (for that cascade in a moment)
• Push the constant 'Hello World!'
• Send the (literal) #show:
• Pop the stack (we don’t care about the result)
• Send the (literal) #cr
• Pop the stack
• Return self
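The eight steps can be replayed against a literal frame to see why the duplicate is needed for the cascade. This is a rough Python sketch: the symbolic instruction names, FakeTranscript, and the (selector, argument count) encoding of message literals are all assumptions for illustration, not Squeak's actual bytecode encoding.

```python
class FakeTranscript:
    """Records what would be shown; stands in for the real Transcript."""

    def __init__(self):
        self.output = []

    def show(self, text):
        self.output.append(text)
        return self

    def cr(self):
        self.output.append("\n")
        return self


# Hypothetical literal frame: a global name, a string, and two message selectors.
LITERALS = ["Transcript", "Hello World!", ("show", 1), ("cr", 0)]

INSTRUCTIONS = [
    ("push_literal_variable", 0),  # push Transcript
    ("dup", None),                 # keep a second copy for the cascade
    ("push_literal_constant", 1),  # push 'Hello World!'
    ("send", 2),                   # send show:
    ("pop", None),                 # discard the result of show:
    ("send", 3),                   # send cr to the remaining copy
    ("pop", None),
    ("return_self", None),
]


def run(instructions, literals, global_table, receiver):
    stack = []
    for op, operand in instructions:
        if op == "push_literal_variable":
            stack.append(global_table[literals[operand]])
        elif op == "push_literal_constant":
            stack.append(literals[operand])
        elif op == "dup":
            stack.append(stack[-1])
        elif op == "send":
            selector, arg_count = literals[operand]
            args = [stack.pop() for _ in range(arg_count)][::-1]
            target = stack.pop()
            stack.append(getattr(target, selector)(*args))
        elif op == "pop":
            stack.pop()
        elif op == "return_self":
            return receiver
    raise RuntimeError("method ended without a return")


if __name__ == "__main__":
    transcript = FakeTranscript()
    run(INSTRUCTIONS, LITERALS, {"Transcript": transcript}, receiver="anObject")
    print(transcript.output)
```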
What the VM Interpreter Knows
• The CompiledMethod being executed
• The location of the next bytecode to execute: Instruction pointer (IP)
• The receiver (self) and arguments of the message to this method
• Any temporary variables
• A stack
What the VM Interpreter Does
• Fetch a bytecode from the CompiledMethod via the IP
• Increment the IP
• Perform the function specified
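The fetch, increment, perform cycle can be sketched in a few lines. This Python model runs the six bytecodes of center from the earlier slide; Point and run are hypothetical stand-ins, and a real interpreter would dispatch sends through method lookup rather than hard-coding + and /.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Point:
    """Tiny stand-in for a Smalltalk Point."""
    x: float
    y: float

    def __add__(self, other):
        return Point(self.x + other.x, self.y + other.y)

    def __truediv__(self, divisor):
        return Point(self.x / divisor, self.y / divisor)


def run(bytecodes, receiver_variables):
    """Interpret the handful of bytecodes used by `center`."""
    stack = []
    ip = 0
    while True:
        bytecode = bytecodes[ip]  # fetch via the instruction pointer
        ip += 1                   # increment the instruction pointer
        # perform the function specified
        if bytecode in (0, 1):    # push receiver variable 0 or 1
            stack.append(receiver_variables[bytecode])
        elif bytecode == 119:     # push the constant 2
            stack.append(2)
        elif bytecode == 176:     # send +
            argument = stack.pop()
            receiver = stack.pop()
            stack.append(receiver + argument)
        elif bytecode == 185:     # send /
            argument = stack.pop()
            receiver = stack.pop()
            stack.append(receiver / argument)
        elif bytecode == 124:     # return stack top
            return stack.pop()
        else:
            raise ValueError(f"bytecode {bytecode} is not part of this sketch")


if __name__ == "__main__":
    origin, corner = Point(0, 0), Point(10, 20)
    print(run([0, 1, 176, 119, 185, 124], [origin, corner]))
```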
Contexts
When a message is sent, the active method execution has to be saved
• That’s saved as a context
• Context is all the state of the VM interpreter
MethodContexts differ slightly from BlockContexts
• BlockContexts are created at runtime during execution of a method
Optimizing contexts is a big part of making a VM faster
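One way to picture a context is as a record of the interpreter state plus a link to the context that sent the message. The Python sketch below is an illustration only; the Context fields and the send and return_value helpers are assumed names, not Squeak's MethodContext layout.

```python
from dataclasses import dataclass, field
from typing import Any, List, Optional


@dataclass
class Context:
    """Saved interpreter state for one method activation (simplified)."""
    method_name: str
    receiver: Any
    ip: int = 0
    stack: List[Any] = field(default_factory=list)
    sender: Optional["Context"] = None


def send(active, method_name, receiver):
    """Start a new activation; the active context is kept as its sender."""
    return Context(method_name, receiver, sender=active)


def return_value(active, value):
    """Finish an activation: push the result on the sender's stack and resume it."""
    caller = active.sender
    caller.stack.append(value)
    return caller


if __name__ == "__main__":
    outer = Context("center", "aRectangle", ip=3)
    inner = send(outer, "+", "origin")
    resumed = return_value(inner, "aPoint")
    print(resumed.method_name, resumed.ip, resumed.stack)
```

The sender's IP and stack are untouched while the inner activation runs, which is what lets execution pick up exactly where it left off.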
Primitives
Some CompiledMethods may just point to primitives
• Primitives are code inside the VM interpreter for handling things like:
  • Arithmetic
  • Storage management
  • Control
  • Input-Output
Making Your Own Primitives
You can use the CCodeGenerator to build your own primitives
• You can even use it to generate stubs that you later rewrite with your own C
The key challenge is dealing with boxing and unboxing
• Moving between native types and objects
• There are issues of “glue”, conversions, etc.
History of Primitives in Squeak
Primitives were originally just numbered 1…256
Named primitives allowed for external shared libraries (DLLs)
Named primitives can now be generated with SLANG (System Language: a minimal Smalltalk subset that the code generator can easily translate to C)
Glue generators are also in development
Lots more on the Squeak Swiki
Making a Primitive, Step 1
For example: Build a primitive that returns the sum of two input integers
Step 1: Define the plugin class
    InterpreterPlugin subclass: #FooPlugin
        instanceVariableNames: ''
        classVariableNames: ''
        poolDictionaries: ''
        category: 'Werdna-Foostuff'
Making a Primitive, Step 2
Step 2: Define the calling class and calling method
    Object subclass: #Foo
        instanceVariableNames: 'myInteger'
        classVariableNames: ''
        poolDictionaries: ''
        category: 'Werdna-Foostuff'

    integerSum: firstInteger and: secondInteger
        "answer the sum of firstInteger and secondInteger"
        <primitive: 'primFooIntegerSumAnd' module: 'Foo'>
        ^FooPlugin doPrimitive: 'primFooIntegerSumAnd'
What’s going on here
    <primitive: 'primitiveName' module: 'moduleName'>
This will always fail until the primitive is defined, so execution will fall through to the next line
    ^FooPlugin doPrimitive: 'primFooIntegerSumAnd'
which allows us to test in Squeak
Making a Primitive, Step 3
Step 3: Define the primitive method
    primFooIntegerSumAnd
        ":firstInteger and: secondInteger"
        "answer sum of (int) firstInteger and (int) secondInteger"
        | firstInteger secondInteger |
        self export: true.    "Public function for extern use"
        self inline: false.   "Don't bother inlining for speed"
        secondInteger := interpreterProxy stackIntegerValue: 0.
        firstInteger := interpreterProxy stackIntegerValue: 1.
        interpreterProxy pop: 3.
        interpreterProxy pushInteger: (firstInteger + secondInteger).
What’s all that code?
It’s close enough to Smalltalk to execute!
• Inlining compiles the body of a function in place of a call to it; it is an optimization technique
• InterpreterProxy fills in for the Squeak VM, which we need in order to manipulate the context (e.g., push/pop operations)
• What we’re doing is:
  • Get the arguments
  • Pop them and the receiver object off the stack
  • Push the result back on the stack
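The three steps are easier to follow with the stack drawn out. The Python sketch below models only the stack discipline; ToyProxy and its method names are hypothetical and merely echo stackIntegerValue:, pop:, and pushInteger: from the Smalltalk above.

```python
class ToyProxy:
    """Minimal stand-in for the interpreter proxy: just an operand stack."""

    def __init__(self, stack):
        self.stack = list(stack)  # last element is the top of the stack

    def stack_integer_value(self, offset):
        """Read the value `offset` slots below the top without popping it."""
        return self.stack[-1 - offset]

    def pop(self, count):
        """Discard the top `count` entries."""
        del self.stack[len(self.stack) - count:]

    def push_integer(self, value):
        self.stack.append(value)


def prim_foo_integer_sum_and(proxy):
    # Get the arguments: the last argument is on top (offset 0).
    second = proxy.stack_integer_value(0)
    first = proxy.stack_integer_value(1)
    # Pop both arguments and the receiver (three entries in all).
    proxy.pop(3)
    # Push the result where the caller expects it.
    proxy.push_integer(first + second)


if __name__ == "__main__":
    proxy = ToyProxy(["callerTemp", "aFoo", 3, 4])  # receiver, then arguments
    prim_foo_integer_sum_and(proxy)
    print(proxy.stack)
```

Popping three entries rather than two matters: the receiver sits below the arguments, and the result takes its place.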
Making a Primitive, Step 4
Step 4: Test
• Believe it or not, it works as-is!
• The plugin code handles arguments and other issues for you inside of Squeak
    Foo new integerSum: 3 and: 4
How did that work?
    doPrimitive: primitiveName
        | proxy plugin |
        proxy := InterpreterProxy new.
        proxy loadStackFrom: thisContext sender.
        plugin := self simulatorClass new.
        plugin setInterpreter: proxy.
        plugin perform: primitiveName asSymbol.
        ^proxy stackValue: 0
Making a Primitive, Step 5
Step 5: Tell the primitive its name and compile to C
    moduleName
        "return the name of this plug-in library"
        ^'Foo'

    FooPlugin translateDoInlining: true
Making a Primitive, Step 6
Step 6: Compile your primitive!
• You’ll need support files (e.g., .h files)
    InterpreterSupportCode writeMacSourceFiles
• Generates them on the Mac
What our primitive looks like in C
    EXPORT(int) primFooIntegerSumAnd(void) {
        int x;
        int y;
        int _return_value;

        x = interpreterProxy->stackIntegerValue(1);
        y = interpreterProxy->stackIntegerValue(0);
        if (interpreterProxy->failed()) {
            return null;
        }
        _return_value = interpreterProxy->integerObjectOf((x + y));
        if (interpreterProxy->failed()) {
            return null;
        }
        interpreterProxy->popthenPush(3, _return_value);
        return null;
    }
Object Memory
ObjectMemory allows manipulation of objects as unique object pointers (oops)
• Each oop is 32 bits
• If bit 1 (the low-order tag bit) is 1, then the remaining 31 bits represent a SmallInteger in 2’s-complement notation
• If bit 1 is 0, then the oop is the address of an object
Some Smalltalks make it an address into an ObjectTable
ObjectMemory also implements garbage collection
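The tagging scheme is a pair of shifts and a mask. Here is a small Python sketch of it, assuming the tag is the low-order bit of a 32-bit word as described above; the function names are invented for this example.

```python
WORD_MASK = 0xFFFFFFFF          # oops are 32-bit words
MIN_SMALL_INTEGER = -(1 << 30)  # 31-bit two's-complement range
MAX_SMALL_INTEGER = (1 << 30) - 1


def small_integer_to_oop(value):
    """Encode an integer as a tagged oop: shift left one bit, set the tag bit."""
    if not MIN_SMALL_INTEGER <= value <= MAX_SMALL_INTEGER:
        raise OverflowError(f"{value} does not fit in 31 bits")
    return ((value << 1) | 1) & WORD_MASK


def is_small_integer(oop):
    """Tag bit 1 means SmallInteger; tag bit 0 means object address."""
    return (oop & 1) == 1


def oop_to_small_integer(oop):
    """Decode a tagged oop back to a signed integer."""
    if not is_small_integer(oop):
        raise ValueError("oop is an object address, not a SmallInteger")
    raw = oop >> 1                # the 31-bit payload
    if raw & (1 << 30):           # sign bit of the 31-bit field is set
        raw -= 1 << 31
    return raw


if __name__ == "__main__":
    for n in (0, 3, -1, MAX_SMALL_INTEGER, MIN_SMALL_INTEGER):
        oop = small_integer_to_oop(n)
        print(f"{n:>11} -> 0x{oop:08X} -> {oop_to_small_integer(oop)}")
```

Because the value lives in the pointer itself, small-integer arithmetic needs no memory access and no allocation.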
Object Header
Each object contains 1-3 words of header information
• 3 bits for garbage collection
• 12 bits for hash value
• 5 bits for compact class index, or 30 bits for direct class oop
• Size of object
VM Optimizations in Squeak
Compact classes are cached for easy access
    Smalltalk compactClassesArray
Methods are cached since they’re often re-used
Message-to-method lookups are cached at two levels to prevent frequent re-lookup
at: accesses an atCache first to speed up Array and similar references
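The idea behind a method cache is to remember the answer of a full lookup, keyed by receiver class and selector. The Python sketch below shows a single cache level only and does not reflect how Squeak's two levels are actually organized; CachingLookup and its fields are hypothetical.

```python
class CachingLookup:
    """Wrap a slow lookup function with a (class, selector) -> method cache."""

    def __init__(self, slow_lookup):
        self.slow_lookup = slow_lookup
        self.cache = {}
        self.misses = 0

    def find(self, class_name, selector):
        key = (class_name, selector)
        if key not in self.cache:
            # Cache miss: pay for the full lookup once and remember the answer.
            self.misses += 1
            self.cache[key] = self.slow_lookup(class_name, selector)
        return self.cache[key]

    def flush(self):
        """Must be called when methods change, or stale answers are returned."""
        self.cache.clear()


if __name__ == "__main__":
    def full_lookup(class_name, selector):
        return f"{class_name}>>{selector}"

    lookup = CachingLookup(full_lookup)
    for _ in range(1000):
        lookup.find("Rectangle", "center")
    print(lookup.misses)
```

The flush method hints at the cost of caching: recompiling a method has to invalidate the cache.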
Exploring the Squeak VM
VM Interpreter is in Interpreter
• Can actually run it to simulate Squeak
    (InterpreterSimulator new openOn: Smalltalk imageName) test
Object Memory is in ObjectMemory
The VM is generated using the CCodeGenerator
    Interpreter translate: 'interp.c' doInlining: true.
Example: Implementation of bytecodes
    duplicateTopBytecode
        self fetchNextBytecode.
        self internalPush: self internalStackTop.

    pushConstantMinusOneBytecode
        self fetchNextBytecode.
        self internalPush: ConstMinusOne.
A Tour of the Java VM
While we’re getting our hands dirty with bits, let’s talk about the Java VM
• How does it differ?
• What’s the same?
Java VM also executes bytecodes
Like Squeak, the Java VM is stack-based
Java VM bytecodes are typed
• Java supports native data types, as well as objects
• So we need integer push (iload) as well as float and double (fload, dload)
But most of the rest are similar
• Some special-purpose ones exist to optimize things like switch
More Java bytecodes have operands
Example of Java-to-bytecodes
    void spin() {
        int i;
        for (i = 0; i < 100; i++) {
            ;    // Loop body is empty
        }
    }
Bytecodes for Example
     0  iconst_0         // Push int constant 0
     1  istore_1         // Store into local variable 1 (i=0)
     2  goto 8           // First time through don't increment
     5  iinc 1 1         // Increment local variable 1 by 1 (i++)
     8  iload_1          // Push local variable 1 (i)
     9  bipush 100       // Push int constant 100
    11  if_icmplt 5      // Compare and loop if less than (i < 100)
    14  return           // Return void when done
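Stepping through the listing by hand shows how the loop test sits at the bottom and how operands make instructions different lengths. Here is a rough Python sketch of a simulator for just these eight instructions; the tuple encoding and the run function are assumptions, and branch targets are written as absolute offsets exactly as they appear in the listing.

```python
# Offset -> instruction, copied from the listing above.
SPIN = {
    0: ("iconst_0",),
    1: ("istore_1",),
    2: ("goto", 8),
    5: ("iinc", 1, 1),
    8: ("iload_1",),
    9: ("bipush", 100),
    11: ("if_icmplt", 5),
    14: ("return",),
}

# Instruction lengths in bytes (opcode plus operands), matching the offsets above.
SIZES = {
    "iconst_0": 1, "istore_1": 1, "goto": 3, "iinc": 3,
    "iload_1": 1, "bipush": 2, "if_icmplt": 3, "return": 1,
}


def run(code):
    """Execute the method and return the final value of local variable 1."""
    local_variables = [None, None]  # slot 0 is `this`, slot 1 is `i`
    stack = []
    pc = 0
    while True:
        instruction = code[pc]
        op = instruction[0]
        next_pc = pc + SIZES[op]
        if op == "iconst_0":
            stack.append(0)
        elif op == "istore_1":
            local_variables[1] = stack.pop()
        elif op == "goto":
            next_pc = instruction[1]
        elif op == "iinc":
            local_variables[instruction[1]] += instruction[2]
        elif op == "iload_1":
            stack.append(local_variables[1])
        elif op == "bipush":
            stack.append(instruction[1])
        elif op == "if_icmplt":
            right = stack.pop()
            left = stack.pop()
            if left < right:
                next_pc = instruction[1]
        elif op == "return":
            return local_variables[1]
        pc = next_pc


if __name__ == "__main__":
    print(run(SPIN))
```

Note the typed opcodes: every instruction here carries an i prefix or an int-specific name, where the Smalltalk bytecodes earlier carried no type at all.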
Java VM makes threads more significant
In the Java VM, stacks are associated with threads, not with method contexts (frames)
• Frames are pushed and popped from the stack as methods come and go
• The same thread always keeps the same stack
There are monitorenter and monitorexit bytecodes for synchronized{} blocks
Every object has its own lock (semaphore)
Java VM’s Exceptions
Exceptions are handled within the VM
When an exception occurs, the current method frame is searched for a handler
If not there, pop the current frame, and search the next one on the stack, until one is found
Advantage: Fast
Disadvantage: Can’t continue
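The search described above also explains the disadvantage: frames are discarded on the way to the handler, so there is nothing left to resume. The Python sketch below models that, with frames as (method name, handled exception names) pairs; find_handler and this frame representation are invented for illustration.

```python
def find_handler(frames, exception_name):
    """Unwind `frames` (innermost last) until one handles the exception.

    Frames above the handler are popped and lost, so execution cannot
    continue from the point where the exception was raised.
    """
    while frames:
        method_name, handled = frames[-1]
        if exception_name in handled:
            return method_name
        frames.pop()  # no handler in this frame: discard it and keep searching
    return None       # nothing handled it; the stack has been unwound completely


if __name__ == "__main__":
    frames = [
        ("main", {"IOException"}),
        ("loadFile", set()),
        ("parseHeader", set()),
    ]
    print(find_handler(frames, "IOException"), frames)
```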
Java Security Support
Java specifies a class file format very exactly
That class file format must be followed for the class to be executed
The Java VM does some bytecode verification
The VM thus guarantees a level of security
Java’s JIT (Just In Time Compilation)
JIT compilation turns a method into a native code routine on-the-fly
Native code routines are stored in a cache
• Hard to do well
• Easy to implement different semantics
• Not portable
Can give you outstanding speed benefits
• But compilation itself takes time
• So, on some benchmarks, raw VM is better than JIT
Issue: Is it still a VM anymore? Do you still get VM benefits?
Comparing Squeak and Java VMs
Bytecodes
• Surprisingly similar!
• In fact, Java can be compiled to Smalltalk bytecodes
• But not easily vice versa
• Java bytecodes are somewhat more complex
  • Typed, more operands
Garbage collection
• Latest version of Java: Dead heat
Squeak/Java VMs, Part 2
Exceptions
• Squeak handles exceptions within Squeak
• Java handles exceptions within the VM
• Java’s are faster
• Squeak’s are more flexible (e.g., notification as an exception that then continues)
Security
• Definitely in Java’s favor
Squeak/Java VMs, Part 3
Threads/Processes
• With Java, they’re all in the VM
  • Obviously, fast
  • More overhead in the VM
• With Squeak, they’re done with Squeak and primitives
  • Are they slower? Don’t know. Anyone want to measure?
Conclusion: Squeak/Java VMs
Java
• With JIT, faster
• More secure, better support for threads
• Much more complex and harder to port to different platforms
Squeak
• Slower
• More flexible
• Easier to port