Interlanguage Working Without Tears: Blending SML with Java • Andrew Kennedy, Nick Benton • Microsoft Research Cambridge
Goal • Fun for functional programmers: • GUIs, 3-d, sound, video, email, crypto, imaging, server-side code, phone, TV, … • Achieved by an interface between SML and Java. • Implemented in MLj, a compiler that generates Java class files from SML’97 source.
Three approaches to interop • Bilateral interface with marshalling and explicit calling conventions (e.g. JNI, O’Caml interface for C). • Multilateral interface with IDL (e.g. COM, CORBA) together with particular language mappings (e.g. H/Direct, Caml COM, MCORBA). • Language integration (MLj).
1. Explicit Bilateral Interface • Two languages have distinct type systems and calling conventions. • Interface by: • Marshalling data between “compatible” types (e.g. java.lang.String to const char* by copying). Often restricted to a subset of the type system. • Giving directives for exporting and importing functions with language-specific calling conventions (e.g. _pascal _cdecl). • Usually tied to particular compiler implementations (e.g. SML/NJ and MLWorks have different C interfaces). • Realistically used only by experts.
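To make "marshalling by copying" concrete, here is a minimal sketch of the copy that sits behind a java.lang.String to const char* conversion. The class CStringMarshal and its methods are hypothetical helpers, not part of JNI, and the sketch uses standard UTF-8 as a simplification (JNI itself uses a modified UTF-8 encoding).

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper (not a JNI API): marshals between Java Strings and
// NUL-terminated byte buffers of the kind C code treats as const char*.
final class CStringMarshal {
    // Copies the string's UTF-8 bytes into a fresh array with a trailing 0 byte.
    static byte[] toCString(String s) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        return Arrays.copyOf(utf8, utf8.length + 1); // padding byte is 0
    }

    // Reverse direction: reads bytes up to the first NUL and builds a new String.
    static String fromCString(byte[] buf) {
        int len = 0;
        while (len < buf.length && buf[len] != 0) {
            len++;
        }
        return new String(buf, 0, len, StandardCharsets.UTF_8);
    }
}
```

Both directions allocate and copy, so neither side ever sees the other's representation directly. That is the cost the bullet above alludes to.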
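Example: JNI
#include <jni.h>
#include <stdio.h>

JNIEXPORT jstring JNICALL
Java_Prompt_getLine(JNIEnv *env, jobject obj, jstring prompt)
{
    char buf[128];
    /* Copy the Java string out as a C string, print it, then release the copy. */
    const char *str = (*env)->GetStringUTFChars(env, prompt, 0);
    printf("%s", str);
    (*env)->ReleaseStringUTFChars(env, prompt, str);
    …
    /* Width limit keeps the input within buf. */
    scanf("%127s", buf);
    return (*env)->NewStringUTF(env, buf);
}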
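For context, the Java side of such a native method might look like the sketch below. The class name Prompt and method name getLine are implied by the C symbol Java_Prompt_getLine. The library name "Prompt" and the helper class PromptInspect are assumptions made for illustration.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

// Java side of the native method implemented in C as Java_Prompt_getLine.
class Prompt {
    // Declared here, implemented in C; the C symbol name is derived from
    // the class name (Prompt) and the method name (getLine).
    native String getLine(String prompt);

    // Assumption: the C code is built into a shared library named "Prompt".
    // Loading is kept in an explicit method so that the class can still be
    // inspected when the library is absent.
    static void loadNativeLibrary() {
        System.loadLibrary("Prompt");
    }
}

// Hypothetical helper that checks the declaration without calling into C.
final class PromptInspect {
    // Reports whether Prompt.getLine carries the native modifier.
    static boolean getLineIsNative() {
        try {
            Method m = Prompt.class.getDeclaredMethod("getLine", String.class);
            return Modifier.isNative(m.getModifiers());
        } catch (NoSuchMethodException e) {
            return false;
        }
    }
}
```

Calling getLine before loadNativeLibrary succeeds would fail at run time with an UnsatisfiedLinkError, so the two halves have to be kept in step by hand.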
2. IDL-based interop • Idea: • Use a language-independent interface definition language (IDL) to describe the signatures of functions that are to be called across the border. • Generate stub code using a language-specific tool. • Good because it separates the interface from the language and supports multilateral interop. • But: the programmer has to write IDL code.
3. Our approach: Integration • Idea: • When the “semantic gap” between two languages is small, integrate features of one language into the other. • If done well, can be used by novices. • But: language (and perhaps implementation) specific.
Our languages: SML & Java • Both languages are strongly typed with good correspondences: • Numeric types match closely • Strings are immutable vectors • Arrays have run-time sizing and bounds checking • Neither language has explicit pointer types • Both languages have automatic storage management. • Exception handling in both languages is similar. • But: there are significant differences too.
Interop in MLj 0.1 • The bolt-it-on approach: • Name Java types (Java.int, "java.util.Vector") and provide coercions between ML and Java types (e.g. Java.fromInt, Java.toInt) • Provide new constructs corresponding to some Java language constructs (in fact, often closer to JVM bytecodes) MLj 0.1
Example
_public _method "handleEvent" (e : Event option) : Java.boolean =
  if Java.toInt(_getfield "id" (valOf e)) = Java.toInt(_getfield Event "WINDOW_DESTROY")
  then OS.Process.terminate OS.Process.success
  else _invoke "handleEvent" (_super, e)
Response from some users: Ugh!
New design for MLj • The blending approach: • Don’t just attempt to replicate Java constructs • Instead: • re-use SML concepts where appropriate • invent clean new syntax elsewhere MLj 1.0
Design goals • Simplicity • Lightweight syntax • Easy to convert Java code into MLj • Compatibility: • SML’97 programs typecheck and run without change • Safety: • Java-style type safety + avoid NullPointerException • Power: • Improve on Java where possible
A non-goal • To pass ML-specific values into Java • It’s less useful – write code in MLj instead • It could compromise safety (e.g. by mutating ML values) • It requires uniform data representations, but we want the chance to optimise the representations
Example code
open javax.swing java.awt java.awt.event
_classtype SampleApplet () : JApplet () with
local
  val prefix = "Counter: "
  val count = ref 0
  val label = JLabel (prefix ^ "0", JLabel.CENTER)
  fun makeButton (title, increment) =
    let
      val button = JButton (title : string)
      val listener =
        ActionListener () with
          actionPerformed (e : ActionEvent option) =
            (count := !count + increment;
             label.#setText(prefix ^ Int.toString (!count)))
        end
    in
      button.#addActionListener(listener);
      button
    end
in
  init () =
    let
      val SOME pane = this.#getContentPane ()
      val button1 = makeButton ("Add One", 1)
      val button2 = makeButton ("Add Two", 2)
    in
      pane.#add(button1, BorderLayout.WEST);
      pane.#add(label, BorderLayout.CENTER);
      pane.#add(button2, BorderLayout.EAST)
    end
end
Analogies between ML and Java
Java → SML
• package → structure
• import → open
• class name → type identifier
• static field → val binding
• static method → fun binding
• multiple args → tuple
• void → unit
• null → NONE
• mutability → ref
• private fields → local decs
Java features listed without an SML analogue: non-static fields, non-static methods, class defs, casts, instanceof
Null values • Java reference values (arrays & objects) can take the value null • ML doesn’t have this notion, so values of array and class types are interpreted as “non-null instance” • Then datatype 'a option = NONE | SOME of 'a is used for possibly-null objects and arrays
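A rough Java-side analogy for this treatment of null is java.util.Optional, which arrived in Java well after this talk and is not what MLj uses. The sketch below is only meant to show the idea of making "possibly null" visible in the type. The class NullAsOption and its methods are hypothetical.

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical helper showing null wrapped as an explicit option value.
final class NullAsOption {
    // Map.get returns null for a missing key; wrapping the result makes the
    // possibly-null case explicit, in the spirit of T option on the slide.
    static Optional<String> lookup(Map<String, String> table, String key) {
        return Optional.ofNullable(table.get(key)); // null becomes empty (NONE)
    }

    // The caller has to deal with both cases, much like a case expression
    // over NONE and SOME x.
    static int lengthOrZero(Optional<String> s) {
        return s.map(String::length).orElse(0);
    }
}
```

A plain String parameter then plays the role of the "non-null instance" reading, and only values typed as Optional need the extra check.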
Fields and methods • Final fields (Java’s “const”) = ML values • Non-final fields = ML refs • Methods are given function types with • Tuples for multiple args • Unit for void arg and result • Implicit Java-style casts on arguments + T to T option
Fields and methods, cont. • Static fields & methods are just bindings in ML structures (= Java class) embedded in a hierarchy of structures (= Java packages) • Non-static members are accessed through .# notation • Constructors are just bindings with the same name as the type (= Java new C) • Improving on Java: first-class fields and methods e.g. • val colours = map (valOf o java.awt.Color.getColor) ["red","green"] • val labels = map javax.swing.JLabel ["ICFP","PLI"]
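Later versions of Java gained method and constructor references, which give a similar effect to the first-class methods shown here. The sketch below illustrates the idea in Java 8 or later. The class FirstClassMembers is hypothetical, and StringBuilder stands in for JLabel so that the example needs no GUI toolkit.

```java
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical demo of passing methods and constructors as values.
final class FirstClassMembers {
    // Applies f to every element, like SML's map.
    static <A, B> List<B> map(Function<A, B> f, List<A> xs) {
        return xs.stream().map(f).collect(Collectors.toList());
    }

    // A static method passed as a value.
    static List<Integer> parseAll(List<String> xs) {
        return map(Integer::parseInt, xs);
    }

    // A constructor passed as a value (StringBuilder is a stand-in for JLabel).
    static List<StringBuilder> buildAll(List<String> xs) {
        return map(StringBuilder::new, xs);
    }
}
```

In the MLj version no special syntax is needed, because a constructor or static method is already an ordinary function binding.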
Casts and typecase • Java-style upcasts, using Caml-like syntax: val c = JButton ("My button") :> Component • Also used for downcasts, but a neater alternative is “cast patterns”: case (e : Expr) of ce :> CondExpr => … | ae :> AssignExpr => …
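For comparison, the Java idiom that cast patterns replace is a chain of instanceof tests, each followed by a downcast. In the sketch below only the class names Expr, CondExpr and AssignExpr come from the slide. The fields and the TypeCase class are invented for illustration.

```java
// Hypothetical expression hierarchy: only the class names come from the
// slide; the fields exist purely to give each branch something to use.
abstract class Expr {}

final class CondExpr extends Expr {
    final String test;
    CondExpr(String test) { this.test = test; }
}

final class AssignExpr extends Expr {
    final String target;
    AssignExpr(String target) { this.target = target; }
}

final class TypeCase {
    // Each branch tests the run-time class, then downcasts and binds a name:
    // the two steps that a cast pattern such as "ce :> CondExpr" does at once.
    static String describe(Expr e) {
        if (e instanceof CondExpr) {
            CondExpr ce = (CondExpr) e;
            return "conditional on " + ce.test;
        } else if (e instanceof AssignExpr) {
            AssignExpr ae = (AssignExpr) e;
            return "assignment to " + ae.target;
        }
        return "other expression";
    }
}
```

In the Java version the test and the cast can drift apart under editing. A cast pattern keeps them together in one construct.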
Creating Java classes in ML • Export an ML structure as a class, with functional values interpreted as static methods, non-functional values interpreted as static fields • New _classtype construct
Example
_classtype Point(xinit, yinit)
with
local
  val x = ref xinit
  val y = ref yinit
in
  getX() = !x
  and getY() = !y
  and move(xinc,yinc) = (x := !x+xinc; y := !y+yinc)
  and moveHoriz xinc = this.#move(xinc, 0)
  and moveVert yinc = this.#move(0, yinc)
end
Example • Single constructor, with args used throughout definition (as in O’Caml) • No fields! (Instead, use local definitions) • Use of this as in Java
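As a point of comparison, a Java class with roughly the same behaviour is sketched below. The int field types are an assumption, since the MLj source leaves them to inference, and moveHoriz and moveVert are written to change only x and only y respectively, which is taken to be the intent.

```java
// Java rendering of the Point class; int coordinates are an assumption.
class Point {
    private int x; // plays the role of: val x = ref xinit
    private int y; // plays the role of: val y = ref yinit

    Point(int xinit, int yinit) {
        x = xinit;
        y = yinit;
    }

    int getX() { return x; }
    int getY() { return y; }

    void move(int xinc, int yinc) {
        x += xinc;
        y += yinc;
    }

    // Both helpers go through move, so a subclass that overrides move
    // also changes their behaviour.
    void moveHoriz(int xinc) { this.move(xinc, 0); }
    void moveVert(int yinc) { this.move(0, yinc); }
}
```

The Java version needs separate field declarations and constructor assignments. The MLj version gets the same effect from the constructor arguments and two local ref bindings.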
Example, cont.
_classtype ColouredPoint(x,y,c) : Point(x,y)
with
  getColour() = c : java.awt.Color
  and move (xinc,yinc) = this.##move(xinc*2, yinc*2)
end
Example, cont. • Superclass specified with arguments to its constructor • Overriding of methods • Special syntax for superclass method invocation
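The three points above map onto Java as super(...) in the constructor, an overriding method, and super.move(...). The sketch below repeats a minimal Point base class so that it stands alone. The enum Colour is a hypothetical stand-in for java.awt.Color, used so that the example has no AWT dependency, and the int coordinates are an assumption.

```java
// Stand-in for java.awt.Color (assumption, to avoid depending on AWT).
enum Colour { RED, GREEN, BLUE }

// Minimal base class, repeated here so the sketch compiles on its own.
class Point {
    private int x;
    private int y;

    Point(int xinit, int yinit) { x = xinit; y = yinit; }

    int getX() { return x; }
    int getY() { return y; }

    void move(int xinc, int yinc) { x += xinc; y += yinc; }
}

class ColouredPoint extends Point {
    private final Colour c;

    // Superclass constructor arguments, as in ": Point(x,y)".
    ColouredPoint(int x, int y, Colour c) {
        super(x, y);
        this.c = c;
    }

    Colour getColour() { return c; }

    // Overrides move; super.move corresponds to this.##move in MLj.
    @Override
    void move(int xinc, int yinc) {
        super.move(xinc * 2, yinc * 2);
    }
}
```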
Finale • Classic functional techniques: • Backtracking & lazy lists to solve Eight Queens • Combinators for music (à la Hudak) • Interpreted using Java multimedia libraries…
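For readers unfamiliar with the Eight Queens demo, here is a minimal backtracking sketch in Java. It is not the authors' MLj program: it builds an eager list of solutions rather than a lazy one, and the class Queens and its methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical backtracking solver for the n-queens puzzle.
final class Queens {
    // Returns every placement of n non-attacking queens; element r of a
    // solution is the column of the queen in row r.
    static List<int[]> solve(int n) {
        List<int[]> out = new ArrayList<>();
        place(new int[n], 0, out);
        return out;
    }

    // Tries each column for the current row and recurses on the next row.
    private static void place(int[] cols, int row, List<int[]> out) {
        if (row == cols.length) {
            out.add(cols.clone());
            return;
        }
        for (int c = 0; c < cols.length; c++) {
            if (safe(cols, row, c)) {
                cols[row] = c;
                place(cols, row + 1, out); // returning from here is the backtrack step
            }
        }
    }

    // A square is safe if no earlier queen shares its column or a diagonal.
    private static boolean safe(int[] cols, int row, int c) {
        for (int r = 0; r < row; r++) {
            if (cols[r] == c || Math.abs(cols[r] - c) == row - r) {
                return false;
            }
        }
        return true;
    }
}
```

A lazy-list version would produce solutions on demand, so asking for only the first one would avoid exploring the rest of the search space.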
Conclusion • Language interop is hard to get right – it’s a language design problem like any other • We think we’ve done a good job! • See the paper for formalisation in the style of the Definition of Standard ML • Main line of future work: better inference • Currently, some programs with unique typings are rejected because types are inferred on-the-fly • Instead, first do pass over term generating constraints, then solve them. • Available soon in MLj – for now, see http://www.dcs.ed.ac.uk/~mlj