CS 242. 2008. Review. John Mitchell. Final Exam Wednesday Dec 10 12:15-3:15 PM Gates B01. Thanks!. Additional Lectures Kathleen Fisher Teaching Assistants Jason Bau, Ankur Taly, Bear Travis Graders Deepa Mahajan, Lingfeng Yang, and Lily Huang. Announcements.
CS 242 2008 Review John Mitchell Final Exam Wednesday Dec 10 12:15-3:15 PM Gates B01
Thanks! • Additional Lectures • Kathleen Fisher • Teaching Assistants • Jason Bau, Ankur Taly, Bear Travis • Graders • Deepa Mahajan, Lingfeng Yang, and Lily Huang
Announcements • Today – last homework due 5PM • Homework graded tomorrow, available Friday • Friday – no discussion section • Unless outpouring of interest • Course evaluations • Please do online; JCM and Kathleen Fisher separately • Next week • Office hours • Final exam – Wed 12:15 in Gates B01 • Two pages of notes • Local SCPD students come to campus for exam • Remote SCPD students: please fax back right away !!
Course Goals • Understand how programming languages work • Appreciate trade-offs in language design • Be familiar with basic concepts so you can understand discussions about • Language features you haven’t used • Analysis and environment tools • Implementation costs and program efficiency • Language support for program development
General Themes in this Course • Language provides an abstract view of machine • We don’t see registers, length of instruction, etc. • We see functions, objects, threads, … • If programs don’t depend on implementation method, compiler writers can choose the best implementation • Language design is full of difficult trade-offs • Expressiveness vs efficiency, ... • Important to decide what the language is for • Every feature requires implementation data structures and algorithms
Good languages designed for specific purpose • C: systems programming • Lisp: symbolic computation, automated reasoning • FP: functional programming, algebraic laws • ML: theorem proving • Clu, ML modules: modular programming • Simula: simulation • Smalltalk: Dynabook • C++: add objects to C • Java: set-top box, internet programming • JavaScript: web applications
Good language design presents abstract machine • Lisp: cons cells, read-eval-print loop • FP: ?? • ML: functions are basic control structure, memory model includes closures and reference cells • C: the underlying machine + abstractions • Simula: activation records and stack; object references • Smalltalk: objects and methods • C++: ?? • Java: Java virtual machine ?? Classes and objects
Design Issues • Language design involves many trade-offs • space vs. time • efficiency vs. safety • efficiency vs. flexibility • efficiency vs. portability • static detection of type errors vs. flexibility • simplicity vs. "expressiveness", etc. • These must be resolved in a manner that is • consistent with the language design goals • preserves the integrity of the abstract machine
Many program properties are undecidable (can't be determined statically) • Halting problem • nil pointer detection • alias detection • perfect garbage detection • etc.
Static type systems • detect (some) program errors statically • can support more efficient implementations • are less flexible than either no type system or a dynamic one
Languages are still evolving • Object systems • Concurrency primitives • Abstract view of concurrent systems • Scripting languages, web site construction • Domain-specific languages • Aspect-oriented programming and many other “fads” • Every good idea is a fad until it sticks
Outline of the course • JavaScript • Block structure and activation records • Exceptions • Haskell • Pure functional prog. • Type inference • Type classes • I/O, Monads • Operational semantics • Modularity and objects • encapsulation • dynamic lookup • subtyping • inheritance • Simula and Smalltalk • Self • C++ • Java / Security • Concurrency
Simula Point Class
class Point(x,y); real x,y;
begin
  boolean procedure equals(p); ref(Point) p;
    if p =/= none then equals := abs(x - p.x) + abs(y - p.y) < 0.00001;
  real procedure distance(p); ref(Point) p;
    if p == none then error else distance := sqrt((x - p.x)**2 + (y - p.y)**2);
end ***Point***
p :- new Point(1.0, 2.5); q :- new Point(2.0, 3.5); if p.distance(q) > 2 then ...
Annotations: formal p is pointer to Point • uninitialized ptr has value none • :- is pointer assignment
Representation of objects • Object is represented by activation record with access link to find global variables according to static scoping • Diagram: p refers to a record holding access link, real x = 1.0, real y = 2.5, proc equals (code for equals), proc distance (code for distance)
Derived classes in Simula • A class decl may be prefixed by a class name: class A; A class B; A class C; B class D • An object of a “prefixed class” is the concatenation of objects of each class in prefix • d :- new D(…) gives an object d laid out as A part, B part, D part
Subtyping • The type of an object is its class • The type associated with a subclass is treated as a subtype of the type associated with its superclass • Example: class A(…); ... A class B(…); ... ref (A) a :- new A(…); ref (B) b :- new B(…); a :- b /* legal since B is subclass of A */ ... b :- a /* also legal, but run-time test */
Smalltalk: Point class • Class definition written in tabular form class name Point super class Object class var pi instance var x y class messages and methods …names and code for methods... instance messages and methods …names and code for methods...
Run-time representation • Point object (x = 2, y = 3) refers to Point class • Point class: Template (x, y) and Method dictionary (newX:Y:, ..., draw, move) • ColorPoint object (x = 4, y = 5, color = red) refers to ColorPoint class • ColorPoint class: Template (x, y, color) and Method dictionary (newX:Y:C:, color, draw) • This is a schematic diagram meant to illustrate the main idea. Actual implementations may differ.
Collection Hierarchy • Classes shown: Collection, Indexed, Set, Updatable, Dictionary, Sorted collection, Array • Messages shown: isEmpty, size, includes:, …; at:; add:, remove:; at:Put:; sortBlock:, …; associationAt:; replaceFrom:to:with: • Diagram legend: Subtyping, Inheritance
Self: Messages and Methods • When message is sent, object searched for slot with name. • If none found, all parents are searched. • Runtime error if more than one parent has a slot with the same name. • If slot is found, its contents evaluated and returned. • Runtime error if no slot found. • Diagram slots: clone …, parent*; print …, parent*; x 3, x:
Changing Parent Pointers • frog: jump …, eatFly … • prince: dance …, eatCake … • p: parent*, parent*:, name Charles, name: • p jump. p eatFly. p parent: prince. p dance.
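The lookup rule and the parent reassignment on the two Self slides can be sketched in Java. This is only an illustrative model, not how Self is implemented: `ProtoObject`, `slot`, `setParent`, and `send` are hypothetical names, slot contents are plain strings rather than evaluated code, and a slot reachable through two different parents always counts as ambiguous.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of a Self-style object: named slots plus parent links.
class ProtoObject {
    private final Map<String, String> slots = new HashMap<String, String>();
    private final List<ProtoObject> parents = new ArrayList<ProtoObject>();

    ProtoObject slot(String name, String contents) {
        slots.put(name, contents);
        return this;
    }

    // Replaces the parent list, mimicking assignment to a parent slot.
    void setParent(ProtoObject parent) {
        parents.clear();
        parents.add(parent);
    }

    // Collects every slot that answers `name`: the object's own slot wins,
    // otherwise all parents are searched.
    private List<String> find(String name) {
        List<String> found = new ArrayList<String>();
        if (slots.containsKey(name)) {
            found.add(slots.get(name));
            return found;
        }
        for (ProtoObject p : parents) {
            found.addAll(p.find(name));
        }
        return found;
    }

    // Run-time error if no slot, or more than one parent slot, matches.
    String send(String name) {
        List<String> found = find(name);
        if (found.isEmpty()) throw new IllegalStateException("no slot named " + name);
        if (found.size() > 1) throw new IllegalStateException("ambiguous slot " + name);
        return found.get(0);
    }
}

class ProtoDemo {
    public static void main(String[] args) {
        ProtoObject frog = new ProtoObject().slot("jump", "frog jumps").slot("eatFly", "frog eats a fly");
        ProtoObject prince = new ProtoObject().slot("dance", "prince dances").slot("eatCake", "prince eats cake");
        ProtoObject p = new ProtoObject().slot("name", "Charles");
        p.setParent(frog);
        System.out.println(p.send("jump"));
        p.setParent(prince);            // behavior changes at run time
        System.out.println(p.send("dance"));
    }
}
```

After the parent is switched to `prince`, sending `jump` to `p` fails with the "no slot" error, which is the dynamic behavior change the slide illustrates.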
Optimize method lookup: Polymorphic Inline Caches • Typical call site has <10 distinct receiver types • So often can cache all receivers • At each call site, for each new receiver, extend patch code: if type = rectangle jump to method_rect; if type = circle jump to method_circle; call general_lookup • After some threshold, revert to simple inline cache (megamorphic site) • Order clauses by frequency • Inline short methods into PIC code
C++ Run-time representation • Point object: vptr, x = 3 • Point vtable: code for move • ColorPoint object: vptr, x = 5, c = blue • ColorPoint vtable: code for move, code for darken
Point* p = new Pt(3); p->move(2); // (*(p->vptr[0]))(p,2)
Point* cp = new ColorPt(5,blue); cp->move(2); // (*(cp->vptr[0]))(cp,2)
Function subtyping • If circle <: shape, then shape → circle <: circle → circle <: circle → shape, and shape → circle <: shape → shape <: circle → shape (argument types are contravariant, result types covariant) • C++ compilers recognize limited forms of function subtyping
C++ Multiple Inheritance • Class C is derived from both A and B • Offset in vtbl is used in call to pb->f, since C::f may refer to A data that is above the pointer pb • Call to pc->g can proceed through C-as-B vtbl • C object layout: pa, pc point to vptr (C-as-A vtbl: & C::f, 0) followed by A data (the A object); pb points to vptr (C-as-B vtbl: & B::g, 0; & C::f) followed by B data (the B object); then C data
Java types • Reference Types: Object; Throwable and Exception types; user-defined classes Shape, Circle, Square; arrays Object[ ], Shape[ ], Circle[ ], Square[ ] • Primitive Types: boolean, int, byte, …, float, long
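A rough Java model of the per-call-site cache described above, assuming a hypothetical `CachedCallSite` class, a `methodTable` map that stands in for `general_lookup`, and an arbitrary threshold of four entries. A real PIC patches machine code at the call site; this sketch only mimics the decision logic, and once the threshold is reached it simply stops extending the cache.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

class Rectangle { }
class Circle { }

// Hypothetical call site with a small cache of (receiver type, target) pairs.
class CachedCallSite {
    static final int MAX_ENTRIES = 4;   // assumed threshold, not from the slides
    private final Map<Class<?>, Function<Object, String>> methodTable;
    private final List<Class<?>> cachedTypes = new ArrayList<Class<?>>();
    private final List<Function<Object, String>> cachedTargets =
            new ArrayList<Function<Object, String>>();
    int generalLookups = 0;             // counts slow-path lookups

    CachedCallSite(Map<Class<?>, Function<Object, String>> methodTable) {
        this.methodTable = methodTable;
    }

    String send(Object receiver) {
        Class<?> type = receiver.getClass();
        // Fast path: "if type = rectangle jump to method_rect", and so on.
        for (int i = 0; i < cachedTypes.size(); i++) {
            if (cachedTypes.get(i) == type) {
                return cachedTargets.get(i).apply(receiver);
            }
        }
        // Slow path: "call general_lookup", then extend the cache.
        generalLookups++;
        Function<Object, String> target = methodTable.get(type);
        if (target == null) throw new IllegalStateException("message not understood");
        if (cachedTypes.size() < MAX_ENTRIES) {
            cachedTypes.add(type);
            cachedTargets.add(target);
        }
        return target.apply(receiver);
    }
}

class PicDemo {
    static CachedCallSite drawSite() {
        Map<Class<?>, Function<Object, String>> table =
                new HashMap<Class<?>, Function<Object, String>>();
        table.put(Rectangle.class, r -> "method_rect");
        table.put(Circle.class, c -> "method_circle");
        return new CachedCallSite(table);
    }

    public static void main(String[] args) {
        CachedCallSite site = drawSite();
        site.send(new Rectangle());
        site.send(new Circle());
        site.send(new Rectangle());     // served from the cache
        System.out.println("general lookups: " + site.generalLookups);
    }
}
```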
Array subtyping • Covariance • if S <: T then S[ ] <: T[ ] • Standard type error
class A {…}
class B extends A {…}
B[ ] bArray = new B[10];
A[ ] aArray = bArray; // considered OK since B[] <: A[]
aArray[0] = new A(); // compiles, but run-time error: raises ArrayStoreException
Java 1.0 vs Generics
Java 1.0: class Stack { void push(Object o) { ... } Object pop() { ... } ...} String s = "Hello"; Stack st = new Stack(); ... st.push(s); ... s = (String) st.pop();
Generics: class Stack<A> { void push(A a) { ... } A pop() { ... } ...} String s = "Hello"; Stack<String> st = new Stack<String>(); st.push(s); ... s = st.pop();
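The covariance example can be run as written once the elided class bodies are filled in. The sketch below uses empty bodies for `A` and `B`, and `ArrayCovarianceDemo` and `storeFails` are names made up for illustration.

```java
class A { }
class B extends A { }

class ArrayCovarianceDemo {
    // Returns true if storing an A into a B[] (viewed as an A[]) fails at run time.
    static boolean storeFails() {
        B[] bArray = new B[10];
        A[] aArray = bArray;          // accepted statically because B[] <: A[]
        try {
            aArray[0] = new A();      // compiles, but the array really holds B elements
            return false;
        } catch (ArrayStoreException e) {
            return true;              // the JVM checks the element type on the store
        }
    }

    public static void main(String[] args) {
        System.out.println("store failed: " + storeFails());
    }
}
```

The run-time check on every reference-array store is the price paid for accepting the assignment `aArray = bArray` statically.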
Example • Generic interface • Generic class implementing Collection interface class LinkedList<A> implements Collection<A> { protected class Node { A elt; Node next = null; Node (A elt) { this.elt = elt; } } ... } interface Collection<A> { public void add (A x); public Iterator<A> iterator (); } interface Iterator<E> { E next(); boolean hasNext(); }
F-bounded polymorphism • Generic interface interface Comparable<T> { public int compareTo(T arg); … } • Example public static <T extends Comparable<T>> T max(Collection<T> coll) { T candidate = coll.iterator().next(); for (T elt : coll) { if (candidate.compareTo(elt) < 0) candidate = elt; } return candidate; } • candidate.compareTo : T → int
The Java Virtual Machine (JVM) • [Figure: JVM architecture. Components shown: class file verifier, network class loader instance, primordial class loader, class area, heap, execution engine, JIT, native method area, native method loader, native methods, Security Manager, operating system. Other labels: local, untrusted classes, trusted classes, Java code, native code.]
JVM uses stack machine • Java: class A extends Object { int i; void f(int val) { i = val + 1; } } • Bytecode:
Method void f(int)
  aload 0 ; object ref this
  iload 1 ; int val
  iconst 1
  iadd ; add val + 1
  putfield #4 <Field int i>
  return
JVM Activation Record • local variables • operand stack • Return addr, exception info, Const pool res. • data area refers to const pool
Bytecode rewriting: invokevirtual • After search, rewrite bytecode to use fixed offset into the vtable. No search on second execution. • Bytecode: invokevirtual becomes inv_virt_quick • Constant pool: “A.foo()” becomes vtable offset
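To see the bound `T extends Comparable<T>` at work, here is a small usage sketch. `Version` and `MaxDemo` are hypothetical names introduced for illustration; `max` is the method from the slide, and it assumes a non-empty collection.

```java
import java.util.Arrays;
import java.util.Collection;

// Hypothetical element type: comparable to itself, so it satisfies T extends Comparable<T>.
class Version implements Comparable<Version> {
    final int major;
    final int minor;

    Version(int major, int minor) {
        this.major = major;
        this.minor = minor;
    }

    public int compareTo(Version other) {
        if (major != other.major) return Integer.compare(major, other.major);
        return Integer.compare(minor, other.minor);
    }
}

class MaxDemo {
    // Same method as on the slide; throws NoSuchElementException on an empty collection.
    static <T extends Comparable<T>> T max(Collection<T> coll) {
        T candidate = coll.iterator().next();
        for (T elt : coll) {
            if (candidate.compareTo(elt) < 0) candidate = elt;
        }
        return candidate;
    }

    public static void main(String[] args) {
        Version newest = max(Arrays.asList(new Version(1, 4), new Version(1, 5), new Version(1, 0)));
        System.out.println(newest.major + "." + newest.minor);
    }
}
```

Inside `max`, the call `candidate.compareTo(elt)` type-checks only because the bound says a `T` can be compared with another `T`.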
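The bytecode for `f` can be traced by hand with an explicit operand stack. The sketch below is a toy model in Java, not the JVM's actual interpreter loop: `TinyObject` and `TinyStackMachine` are made-up names, and the constant-pool reference `#4` is replaced by a direct field access.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Stand-in for class A from the slide: one int field named i.
class TinyObject {
    int i;
}

class TinyStackMachine {
    // Simulates the instructions of "void f(int val) { i = val + 1; }" one by one.
    static void runF(TinyObject self, int val) {
        Object[] locals = { self, val };             // local 0 = this, local 1 = val
        Deque<Object> operands = new ArrayDeque<Object>();

        operands.push(locals[0]);                    // aload 0
        operands.push(locals[1]);                    // iload 1
        operands.push(1);                            // iconst 1

        int right = (Integer) operands.pop();        // iadd: pop two ints, push the sum
        int left = (Integer) operands.pop();
        operands.push(left + right);

        int value = (Integer) operands.pop();        // putfield: pop value, then object ref
        TinyObject target = (TinyObject) operands.pop();
        target.i = value;
        // return: the operand stack is empty at this point
    }

    public static void main(String[] args) {
        TinyObject a = new TinyObject();
        runF(a, 41);
        System.out.println(a.i);
    }
}
```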
Bytecode rewriting: invokeinterface Cache address of method; check class on second use Bytecode Constant pool invokeinterface “A.foo()” inv_int_quick “A.foo()”
Java 1.5 Implementation • Homogeneous implementation • Algorithm • replace class parameter <A> by Object, insert casts • if <A extends B>, replace A by B • Why choose this implementation? • Backward compatibility of distributed bytecode • Surprise: sometimes faster because class loading slow class Stack { void push(Object o) { ... } Object pop() { ... } ...} class Stack<A> { void push(A a) { ... } A pop() { ... } ...}
Some details that matter • Allocation of static variables • Heterogeneous: separate copy for each instance • Homogeneous: one copy shared by all instances • Constructor of actual class parameter • Heterogeneous: class G<T> … T x = new T; • Homogeneous: new T may just be Object ! • Creating an object of the type parameter (new T) is not allowed in Java • Resolve overloading • Heterogeneous: resolve at instantiation time (C++) • Homogeneous: no information about type parameter
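Two consequences of the homogeneous implementation can be observed directly in Java. In this sketch `Box` and `ErasureDemo` are hypothetical names: the first method shows that differently parameterized lists share one run-time class, and the second shows that a static field is shared by all instantiations.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical generic class with a static counter.
class Box<T> {
    static int created = 0;      // one copy for every instantiation of Box<T>
    final T value;

    Box(T value) {
        this.value = value;
        created++;
    }
}

class ErasureDemo {
    // List<String> and List<Integer> are the same class at run time.
    static boolean sameRuntimeClass() {
        List<String> strings = new ArrayList<String>();
        List<Integer> ints = new ArrayList<Integer>();
        return strings.getClass() == ints.getClass();
    }

    // Creating a Box<String> and a Box<Integer> bumps the same counter twice.
    static int sharedCounterDelta() {
        int before = Box.created;
        new Box<String>("a");
        new Box<Integer>(1);
        return Box.created - before;
    }

    public static void main(String[] args) {
        System.out.println(sameRuntimeClass());
        System.out.println(sharedCounterDelta());
    }
}
```

In a heterogeneous implementation such as C++ templates, each instantiation would get its own copy of the static counter.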
Java Sandbox • Four complementary mechanisms • Class loader • Separate namespaces for separate class loaders • Associates protection domain with each class • Verifier and JVM run-time tests • NO unchecked casts or other type errors, NO array overflow • Preserves private, protected visibility levels • Security Manager • Called by library functions to decide if request is allowed • Uses protection domain associated with code, user policy • Coming up in a few slides: stack inspection
Verifier • Bytecode may not come from standard compiler • Evil hacker may write dangerous bytecode • Verifier checks correctness of bytecode • Every instruction must have a valid operation code • Every branch instruction must branch to the start of some other instruction, not middle of instruction • Every method must have a structurally correct signature • Every instruction obeys the Java type discipline • Last condition is fairly complicated.
Vulnerabilities in JavaVM • [Figure: vulnerabilities reported (axis 0 to 45) versus years since first release (axis 0 to 9), July 1996 to July 2005] Slide: David Evans
Where are They? several of these were because of jsr complexity Slide: David Evans
Stack Inspection • Permission depends on • Permission of calling method • Permission of all methods above it on stack • Up to method that is trusted and asserts this trust Many details omitted here method f method g method h java.io.FileInputStream Stories: Netscape font / passwd bug; Shockwave plug-in
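The walk over the call stack can be modeled in a few lines. This is a simplified sketch of the rule on the slide, not the `java.security` API: `Frame`, `StackInspection`, and `allowed` are hypothetical names, and each frame just carries two booleans.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stack frame: method name, whether its code holds the permission,
// and whether it is trusted code that asserts its trust.
class Frame {
    final String method;
    final boolean hasPermission;
    final boolean assertsTrust;

    Frame(String method, boolean hasPermission, boolean assertsTrust) {
        this.method = method;
        this.hasPermission = hasPermission;
        this.assertsTrust = assertsTrust;
    }
}

class StackInspection {
    // Frames are listed from the most recent call back to the oldest caller.
    static boolean allowed(List<Frame> stack) {
        for (Frame f : stack) {
            if (!f.hasPermission) return false;   // some caller lacks the permission
            if (f.assertsTrust) return true;      // stop at a frame that asserts trust
        }
        return true;                              // every frame had the permission
    }

    public static void main(String[] args) {
        List<Frame> stack = Arrays.asList(
                new Frame("java.io.FileInputStream", true, false),
                new Frame("h", true, false),
                new Frame("g", false, false),     // untrusted caller
                new Frame("f", true, false));
        System.out.println(allowed(stack));
    }
}
```

If `h` were trusted library code that asserted its trust, the walk would stop there and the untrusted caller `g` would no longer cause the request to be denied.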
Concurrent language examples • Language Examples • Cobegin/coend • Multilisp futures (skip this year) • Actors • Concurrent ML (skip this year) • Java • Some features to compare • Thread creation • Communication • Concurrency control (synchronization and locking)
Actors [Hewitt, Agha, Tokoro, Yonezawa, ...] • Each actor (object) has a script • In response to input, actor may atomically • create new actors • initiate communication • change internal state • Communication is • Buffered, so no message is lost • Guaranteed to arrive, but not in sending order • Order-preserving communication is harder to implement • Programmer can build ordered primitive from unordered • Inefficient to have ordered communication when not needed
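One common way to build an ordered primitive from unordered delivery is to tag each message with a sequence number and have the receiver hold back messages that arrive early. The sketch below assumes that scheme; `OrderedReceiver` and its methods are hypothetical names, and it models a single sender only.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical receiver that restores sending order from sequence numbers.
class OrderedReceiver {
    private final Map<Integer, String> pending = new HashMap<Integer, String>();
    private final List<String> delivered = new ArrayList<String>();
    private int nextSeq = 0;   // sequence number expected next

    // Called in whatever order the unordered transport delivers messages.
    void receive(int seq, String payload) {
        pending.put(seq, payload);
        // Release every message that is now contiguous with what was delivered.
        while (pending.containsKey(nextSeq)) {
            delivered.add(pending.remove(nextSeq));
            nextSeq++;
        }
    }

    List<String> delivered() {
        return delivered;
    }

    public static void main(String[] args) {
        OrderedReceiver r = new OrderedReceiver();
        r.receive(2, "c");     // arrives early, held back
        r.receive(0, "a");
        r.receive(1, "b");     // fills the gap, releases "b" and "c"
        System.out.println(r.delivered());
    }
}
```

The buffering and bookkeeping here is the cost the slide refers to: paying it for every message is wasteful when the application does not need ordering.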
Java Concurrency • Threads • Create process by creating thread object • Communication • Shared variables • Method calls • Mutual exclusion and synchronization • Every object has a lock (inherited from class Object) • synchronized methods and blocks • Synchronization operations (inherited from class Object) • wait : pause current thread until another thread calls notify • notify : wake up waiting threads
Example synchronized methods [Lea]
class LinkedCell { // Lisp-style cons cell containing value and link to next cell
  protected double value;
  protected final LinkedCell next;
  public LinkedCell (double v, LinkedCell t) { value = v; next = t; }
  public synchronized double getValue() { return value; }
  public synchronized void setValue(double v) { value = v; } // assignment not atomic
  public LinkedCell next() { return next; } // no synch needed
}
Join, another form of synchronization • Wait for thread to terminate
class Future extends Thread {
  private int result;
  public void run() { result = f(…); }
  public int getResult() { return result; }
}
…
Future t = new Future();
t.start(); // start new thread
…
t.join(); x = t.getResult(); // wait and get result
Stack<T>: produce, consume methods
public synchronized void produce (T object) { stack.add(object); notify(); }
public synchronized T consume () {
  while (stack.isEmpty()) {
    try { wait(); } catch (InterruptedException e) { }
  }
  int lastElement = stack.size() - 1;
  T object = stack.get(lastElement);
  stack.remove(lastElement);
  return object;
}
Why is loop needed here? See: http://www1.coe.neu.edu/~jsmith/tutorial.html
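A self-contained variant of the produce/consume pair, for experimenting with the question on the slide. The loop matters because a woken thread must re-check the condition: another consumer may have taken the element first, and `wait` may also return spuriously. `SharedStack` and `SharedStackDemo` are illustrative names, and this version uses `notifyAll` and propagates `InterruptedException` instead of swallowing it, which differs from the slide.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative thread-safe stack guarded by the object's own lock.
class SharedStack<T> {
    private final List<T> stack = new ArrayList<T>();

    public synchronized void produce(T object) {
        stack.add(object);
        notifyAll();                      // wake every waiting consumer
    }

    public synchronized T consume() throws InterruptedException {
        while (stack.isEmpty()) {         // re-check after every wake-up
            wait();
        }
        return stack.remove(stack.size() - 1);
    }
}

class SharedStackDemo {
    // One consumer thread sums the values 1..n pushed by the calling thread.
    static int transfer(final int n) {
        final SharedStack<Integer> s = new SharedStack<Integer>();
        final int[] sum = new int[1];
        Thread consumer = new Thread(() -> {
            try {
                for (int k = 0; k < n; k++) sum[0] += s.consume();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        consumer.start();
        for (int v = 1; v <= n; v++) s.produce(v);
        try {
            consumer.join();              // join makes the consumer's writes visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return sum[0];
    }

    public static void main(String[] args) {
        System.out.println(transfer(100));
    }
}
```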
How to make classes thread-safe • Synchronize critical sections • Make fields private • Synchronize sections that should not run concurrently • Make objects immutable • State cannot be changed after object is created public RGBColor invert() { RGBColor retVal = new RGBColor(255 - r, 255 - g, 255 - b); return retVal; } • Application of pure functional programming for concurrency • Use a thread-safe wrapper • New thread-safe class has objects of original class as fields • Wrapper class provides methods to access original class object
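The `invert` method only makes sense as part of an immutable class, so here is one possible complete version. The field names `r`, `g`, `b` come from the slide; the constructor, the accessors, and `RGBColorDemo` are assumptions added for illustration.

```java
// Immutable color: all fields are final and no method changes them.
final class RGBColor {
    private final int r;
    private final int g;
    private final int b;

    RGBColor(int r, int g, int b) {
        this.r = r;
        this.g = g;
        this.b = b;
    }

    // Returns a new object instead of modifying this one.
    public RGBColor invert() {
        return new RGBColor(255 - r, 255 - g, 255 - b);
    }

    public int red()   { return r; }
    public int green() { return g; }
    public int blue()  { return b; }
}

class RGBColorDemo {
    public static void main(String[] args) {
        RGBColor c = new RGBColor(10, 20, 30);
        RGBColor inverted = c.invert();
        System.out.println(inverted.red() + " " + inverted.green() + " " + inverted.blue());
        System.out.println(c.red());      // the original is unchanged
    }
}
```

Because no thread can ever observe an `RGBColor` changing, instances can be shared between threads without any locking.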
Limitations of Java 1.4 primitives • No way to back off from an attempt to acquire a lock • Cannot give up after waiting for a specified period of time • Cannot cancel a lock attempt after an interrupt • No way to alter the semantics of a lock • Reentrancy, read versus write protection, fairness, … • No access control for synchronization • Any method can perform synchronized(obj) for any object • Synchronization is done within methods and blocks • Limited to block-structured locking • Cannot acquire a lock in one method and release it in another See http://java.sun.com/developer/technicalArticles/J2SE/concurrency/
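For contrast, the `java.util.concurrent.locks` package added in Java 5 addresses the first limitation: a thread can give up after a timeout. The sketch below uses `ReentrantLock.tryLock(long, TimeUnit)`; the class name `BackoffDemo`, the method name, and the 50 ms timeout are arbitrary choices for illustration.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

class BackoffDemo {
    // The calling thread holds the lock while a second thread tries to take it
    // with a timeout. Returns whether the second thread got the lock.
    static boolean secondThreadAcquires() {
        final ReentrantLock lock = new ReentrantLock();
        final boolean[] acquired = new boolean[1];
        lock.lock();
        try {
            Thread t = new Thread(() -> {
                try {
                    // Back off after 50 ms instead of blocking forever.
                    acquired[0] = lock.tryLock(50, TimeUnit.MILLISECONDS);
                    if (acquired[0]) lock.unlock();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();   // lock attempt was cancelled
                }
            });
            t.start();
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            lock.unlock();
        }
        return acquired[0];
    }

    public static void main(String[] args) {
        System.out.println("second thread acquired lock: " + secondThreadAcquires());
    }
}
```

With a `synchronized` block in place of `tryLock`, the second thread would have no way to stop waiting. Explicit `lock()` and `unlock()` calls also need not be in the same method, which removes the block-structured restriction at the cost of making a forgotten `unlock` possible.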
Java Memory Model • Semantics of multithreaded access to shared memory • Competitive threads access shared data • Can lead to data corruption • Need semantics for incorrectly synchronized programs • Determines • Which program transformations are allowed • Should not be too restrictive • Which program outputs may occur on correct implementation • Should not be too generous Reference: http://www.cs.umd.edu/users/pugh/java/memoryModel/jsr-133-faq.html
Program and locking order [Manson, Pugh] • Thread 1: y = 1; lock M; x = 1; unlock M • Thread 2: lock M; i = x; unlock M; j = y • Edges in the diagram: program order within each thread; lock sync from unlock M in Thread 1 to lock M in Thread 2
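The two-thread example can be written as Java to make the guarantee concrete. When Thread 2 acquires M after Thread 1 released it and so reads `x == 1`, program order plus lock synchronization also force it to read `y == 1`. `OrderingDemo` and `runOnce` are illustrative names, and the field layout is an assumption.

```java
class OrderingDemo {
    private final Object m = new Object();   // plays the role of lock M
    private int x = 0;
    private int y = 0;
    private int i = -1;
    private int j = -1;

    // Runs both threads once and checks the guarantee: i == 1 implies j == 1.
    static boolean runOnce() {
        final OrderingDemo d = new OrderingDemo();
        Thread t1 = new Thread(() -> {
            d.y = 1;
            synchronized (d.m) {              // lock M ... unlock M
                d.x = 1;
            }
        });
        Thread t2 = new Thread(() -> {
            synchronized (d.m) {              // lock M ... unlock M
                d.i = d.x;
            }
            d.j = d.y;
        });
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        // If Thread 2 saw x == 1, it locked after Thread 1 unlocked, so y = 1 is visible.
        return d.i == 0 || d.j == 1;
    }

    public static void main(String[] args) {
        System.out.println(runOnce());
    }
}
```

When Thread 2 wins the race for the lock and reads `i == 0`, the memory model gives no such guarantee about `j`: it may be 0 or 1.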