CPS Transform for Dependent ML. Hongwei Xi University of Cincinnati and Carsten Schürmann Yale University. Overview. Motivation Program error detection at compile-time Compilation certification Dependent ML (DML) Programming Examples Theoretical Foundation CPS Transform for DML
CPS Transform for Dependent ML. Hongwei Xi, University of Cincinnati, and Carsten Schürmann, Yale University
Overview • Motivation • Program error detection at compile-time • Compilation certification • Dependent ML (DML) • Programming Examples • Theoretical Foundation • CPS Transform for DML • Conclusion
Program Error Detection Unfortunately one often pays a price for [languages which impose no disciplines of types] in the time taken to find rather inscrutable bugs — anyone who mistakenly applies CDR to an atom in LISP and finds himself absurdly adding a property list to an integer, will know the symptoms. -- Robin Milner A Theory of Type Polymorphism in Programming Therefore, a stronger type discipline allows for capturing more program errors at compile-time.
Some Advantages of Types • Detecting program errors at compile-time • Enabling compiler optimizations • Facilitating program verification • Using types to encode program properties • Verifying the encoded properties through type-checking • Serving as program documentation • Unlike informal comments, types are formally verified and can thus be fully trusted
Compiler Correctness • How can we prove the correctness of a (realistic) compiler? • Verifying that the semantics of e is the same as the semantics of |e| for every program e • But this simply seems too challenging (and is unlikely to be feasible) [Diagram: source program e --compilation |.|--> target code |e|]
Compilation Certification • Assume that P(e) holds, i.e., e has the property P (e.g., memory safety, termination, etc.) • Then P(|e|) should hold, too • A compiler can be designed to produce a certificate to assert that |e| does have the property P [Diagram: source program e, where P(e) holds --compilation |.|--> target code |e|, where P(|e|) holds]
Semantics-preserving Compilation
  e  ------------->  |e|
  D of (e evaluates to v)  ------------->  |D| of (|e| evaluates to |v|)
• This seems unlikely to be feasible in practice
Type-preserving Compilation
  e  ------------->  |e|
  e:t  ------------->  |e|:|t|
  D of e:t  ------------->  |D| of |e|:|t|
• D and |D| are both represented in LF
• The LF type-checker does all type-checking!
Limitations of (Simple) Types • Not general enough • Many correct programs cannot be typed • For instance, downcasts are widely used in Java • Not specific enough • Many interesting program properties cannot be captured • For instance, types in Java cannot guarantee safe array access
Narrowing the Gap [Figure labels: NuPrl, Coq (Program Extraction, Proof Synthesis); Dependent ML; ML]
Some Design Decisions • Practical type-checking • Realistic programming features • Conservative extension • Pay-only-if-you-use policy
Ackermann Function in DML
  fun ack (m, n) =
    if m = 0 then n+1
    else if n = 0 then ack (m-1, 1)
    else ack (m-1, ack (m, n-1))
  withtype {a:nat,b:nat} int(a) * int(b) -> nat
  (* Note: nat = [a:int | a >= 0] int(a) *)
Binary Search in DML
  fun bs (vec, key) =
    let
      fun loop (l, u) =
        if l > u then -1
        else
          let
            val m = (l + u) / 2
            val x = sub (vec, m) (* m needs to be within bounds *)
          in
            if x = key then m
            else if x < key then loop (m+1, u)
            else loop (l, m-1)
          end
      withtype {i:int,j:int | 0 <= i <= j+1 <= n} int(i) * int(j) -> int
    in
      loop (0, length (vec) - 1)
    end
  withtype {n:nat} 'a array(n) * 'a -> int
  (* length: {n:nat} 'a array(n) -> int(n) *)
  (* sub: {n:nat,i:nat | i < n} 'a array(n) * int(i) -> 'a *)
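As a rough illustration of what the index annotation on loop buys, here is a minimal Python sketch that checks the same constraint at run time. DML discharges the constraint statically; the assert statements below are only a dynamic stand-in, and the Python names are chosen for this sketch rather than taken from the talk.

```python
def bs(vec, key):
    """Binary search over a sorted list; returns an index of key, or -1."""
    n = len(vec)

    def loop(l, u):
        # Runtime stand-in for the DML index constraint 0 <= i <= j+1 <= n.
        assert 0 <= l <= u + 1 <= n
        if l > u:
            return -1
        m = (l + u) // 2
        # From l <= m <= u and the constraint above, 0 <= m < n follows,
        # which is exactly what the type of sub demands of its index.
        assert 0 <= m < n
        x = vec[m]
        if x == key:
            return m
        if x < key:
            return loop(m + 1, u)
        return loop(l, m - 1)

    return loop(0, n - 1)
```

Both recursive calls re-establish the constraint, which is why a DML type-checker can prove the array access safe without any run-time bounds check.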
ML0: start point
  base types      d  ::= int | bool | (user defined datatypes)
  types           t  ::= d | t1 -> t2 | t1 * t2
  patterns        p  ::= x | c(p) | <> | <p1, p2>
  match clauses   ms ::= (p => e) | (p => e | ms)
  expressions     e  ::= x | f | c | if (e, e1, e2) | <> | <e1, e2> | lam x:t. e | fix f:t. e | e1(e2) | let x=e1 in e2 end | case e of ms
  values          v  ::= x | c | <v1, v2> | lam x:t. e
  context         G  ::= . | G, x: t | G, f: t
Integer Constraint Domain
We use a for index variables
  index expressions        i, j ::= a | c | i + j | i - j | i * j | i / j | ...
  index propositions       P, Q ::= i < j | i <= j | i > j | i >= j | i = j | i <> j | P /\ Q | P \/ Q
  index sorts              g    ::= int | {a : g | P}
  index variable contexts  f    ::= . | f, a: g | f, P
  index constraints        F    ::= P | P => F | forall a: g. F
Dependent Types
  dependent types  t ::= ... | d(i) | Pi a:g. t | Sigma a:g. t
  (concrete syntax: {a:g} t for Pi a:g. t, and [a:g] t for Sigma a:g. t)
For instance:
  int(0), bool array(16);
  nat = [a:int | a >= 0] int(a);
  {a:int | a >= 0} int list(a) -> int list(a)
DML0: ML0 + dependent types
  expressions      e ::= ... | lam a:g. v | e[i] | <i | e> | open e1 as <a | x> in e2 end
  values           v ::= ... | lam a:g. v | <i | v>
  typing judgment  f; G |- e: t
Some Typing Rules

  f, a:g; G |- e: t
  ---------------------------------- (type-ilam)
  f; G |- lam a:g. e: Pi a:g. t

  f; G |- e: Pi a:g. t    f |- i: g
  ---------------------------------- (type-iapp)
  f; G |- e[i]: t[a:=i]
Some Typing Rules (cont'd)

  f; G |- e: t[a:=i]    f |- i: g
  ---------------------------------- (type-pack)
  f; G |- <i | e>: Sigma a:g. t

  f; G |- e1: Sigma a:g. t1    f, a:g; G, x:t1 |- e2: t2
  -------------------------------------------------------- (type-open)
  f; G |- open e1 as <a | x> in e2 end: t2
Some Typing Rules (cont'd)

  f; G |- e: bool(i)    f, i=1; G |- e1: t    f, i=0; G |- e2: t
  ---------------------------------------------------------------- (type-if)
  f; G |- if (e, e1, e2): t
Erasure: from DML0 to ML0
The erasure function erases all syntax related to type indices:
  | bool(1) | = | bool(0) | = bool
  | [a:int | a >= 0] int(a) | = int
  | {n:nat} 'a list(n) -> 'a list(n) | = 'a list -> 'a list
  | open e1 as <a | x> in e2 end | = let x = |e1| in |e2| end
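The erasure on types can be sketched in a few lines of Python. The tuple encoding of types used here is an assumption made purely for illustration (it is not the representation used in DML or in the talk): a base type is ("base", name, index), and quantified types are ("pi", a, sort, body) or ("sigma", a, sort, body).

```python
def erase(t):
    """Erase index information from a type (hypothetical tuple encoding)."""
    tag = t[0]
    if tag == "base":
        # d(i) becomes d: drop the index expression.
        return ("base", t[1], None)
    if tag in ("arrow", "prod"):
        return (tag, erase(t[1]), erase(t[2]))
    if tag in ("pi", "sigma"):
        # Index quantifiers disappear entirely; only the body survives.
        return erase(t[3])
    raise ValueError(f"unknown type constructor: {tag}")


# nat = [a:int | a >= 0] int(a)
nat = ("sigma", "a", "int | a >= 0", ("base", "int", "a"))

# {n:nat} int list(n) -> int list(n)
same_length = ("pi", "n", "nat",
               ("arrow", ("base", "int list", "n"), ("base", "int list", "n")))
```

Under this encoding, erase(nat) yields the plain int type and erase(same_length) yields int list -> int list, matching the second and third equations on the slide (with 'a instantiated to int).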
Relating DML0 to ML0
[Diagram: a square relating evaluation and erasure]
  program:type in DML0       --evaluation-->  answer:type in DML0
        | erasure                                   | erasure
  |program|:|type| in ML0    --evaluation-->  |answer|:|type| in ML0
• Type preservation holds in DML0
• A program is already typable in ML0 if it is typable in DML0
Polymorphism • Polymorphism is largely orthogonal to dependent types • We have adopted a two phase type-checking algorithm
References and Exceptions • A straightforward combination of effects with dependent types leads to unsoundness • We have adopted a form of value restriction to restore the soundness
Quicksort in DML
  fun qs [] = []
    | qs (x :: xs) = par (x, xs, [], [])
  withtype {n:nat} int list(n) -> int list(n)

  and par (x, [], l, g) = qs (l) @ (x :: qs (g))
    | par (x, y :: ys, l, g) =
        if y <= x then par (x, ys, y :: l, g)
        else par (x, ys, l, y :: g)
  withtype {p:nat,q:nat,r:nat} int * int list(p) * int list(q) * int list(r) -> int list(p+q+r+1)
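The two withtype annotations state that qs preserves length and that par returns a list of length p+q+r+1. A minimal Python sketch of the same program, with those length properties checked by assert at run time instead of by a type-checker, might look as follows (the asserts are an illustrative substitute for the static DML check, not part of the original program):

```python
def qs(xs):
    """Quicksort; the assert mirrors int list(n) -> int list(n)."""
    if not xs:
        return []
    result = par(xs[0], xs[1:], [], [])
    assert len(result) == len(xs)
    return result


def par(x, ys, l, g):
    """Partition ys around x; the assert mirrors the result type int list(p+q+r+1)."""
    expected = len(ys) + len(l) + len(g) + 1
    if not ys:
        result = qs(l) + [x] + qs(g)
    else:
        y, rest = ys[0], ys[1:]
        if y <= x:
            result = par(x, rest, [y] + l, g)
        else:
            result = par(x, rest, l, [y] + g)
    assert len(result) == expected
    return result
```

Note that the types only capture lengths; they say nothing about the output being sorted.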
CPS Transformation for DML (I)
Transformation on types:
  || d(i) ||*          = d(i)
  || t1 -> t2 ||*      = || t1 ||* -> || t2 ||
  || Pi a:g. t ||*     = Pi a:g. || t ||*
  || Sigma a:g. t ||*  = Sigma a:g. || t ||*
  || t ||              = (|| t ||* -> ans) -> ans
  (* ans is some newly introduced type *)
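The type translation can be sketched as two mutually recursive Python functions, reading the computation type ||t|| as (||t||* -> ans) -> ans. The tuple encoding of types is a hypothetical one chosen for this sketch: ("base", name, index), ("arrow", t1, t2), ("pi", a, sort, body), ("sigma", a, sort, body).

```python
# The newly introduced answer type.
ANS = ("base", "ans", None)


def cps_star(t):
    """||t||*: the translation of the type of a value."""
    tag = t[0]
    if tag == "base":
        # Indexed base types d(i) are left unchanged, index included.
        return t
    if tag == "arrow":
        # Arguments are values; results are computations.
        return ("arrow", cps_star(t[1]), cps(t[2]))
    if tag in ("pi", "sigma"):
        # The index quantifier is kept; only the body is translated.
        return (tag, t[1], t[2], cps_star(t[3]))
    raise ValueError(f"unknown type constructor: {tag}")


def cps(t):
    """||t||: a computation takes a continuation ||t||* -> ans and yields ans."""
    return ("arrow", ("arrow", cps_star(t), ANS), ANS)
```

Because index expressions and quantifiers pass through untouched, the dependent-type information of the source program is still present in the translated type.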
CPS Transformation for DML (II)
Transformation on expressions:
  ||c||*         = c
  ||x||*         = x
  ||lam x. e||*  = lam x. ||e||
  ||v||          = lam k. k(||v||*)
  ||e1(e2)||     = lam k. ||e1||(lam x1. ||e2||(lam x2. x1(x2)(k)))
  ||e[i]||       = lam k. ||e||(lam x. x[i](k))
  ||fix f. v||   = lam k. (fix f. ||v||)(k)
  ... ...
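The value and application cases can be tried out on a tiny untyped lambda calculus. The sketch below is an illustration only: the tuple encoding of terms, the fresh-name scheme (which assumes source variables never start with an underscore), and the evaluator are all assumptions of this example, and the index-related cases e[i] and fix are left out.

```python
import itertools

_counter = itertools.count()


def fresh(prefix):
    """Generate a variable name assumed not to clash with source variables."""
    return f"_{prefix}{next(_counter)}"


def cps_star(v):
    """||v||*: translate a value (constant, variable, or lambda)."""
    if v[0] in ("const", "var"):
        return v
    if v[0] == "lam":
        return ("lam", v[1], cps(v[2]))
    raise ValueError("not a value")


def cps(e):
    """||e||: translate an expression into a function of a continuation k."""
    k = fresh("k")
    if e[0] in ("const", "var", "lam"):
        # ||v|| = lam k. k(||v||*)
        return ("lam", k, ("app", ("var", k), cps_star(e)))
    if e[0] == "app":
        # ||e1(e2)|| = lam k. ||e1||(lam x1. ||e2||(lam x2. x1(x2)(k)))
        x1, x2 = fresh("x"), fresh("x")
        call = ("app", ("app", ("var", x1), ("var", x2)), ("var", k))
        return ("lam", k,
                ("app", cps(e[1]),
                 ("lam", x1, ("app", cps(e[2]), ("lam", x2, call)))))
    raise ValueError(f"unsupported expression: {e[0]}")


def evaluate(e, env=None):
    """Call-by-value evaluator; lambdas become Python closures."""
    env = env or {}
    tag = e[0]
    if tag == "const":
        return e[1]
    if tag == "var":
        return env[e[1]]
    if tag == "lam":
        return lambda arg: evaluate(e[2], {**env, e[1]: arg})
    if tag == "app":
        return evaluate(e[1], env)(evaluate(e[2], env))
    raise ValueError(f"unsupported expression: {tag}")


ident = ("lam", "y", ("var", "y"))
prog = ("app", ident, ("const", 42))
```

Running the translated program with the identity as its initial continuation, evaluate(("app", cps(prog), ident)), gives the same answer as evaluate(prog).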
CPS Transformation for DML
Theorem: Assume D :: f; G |- e : t. Then D can be transformed to ||D|| such that ||D|| :: f; ||G|| |- ||e|| : ||t||, where ||G||(x) = ||G(x)||* and ||G||(f) = ||G(f)|| for all x, f in the domain of G.
• This theorem can be readily encoded into LF
• We have done this in Twelf
Contributions • A CPS transform for DML • The transform can be lifted to the level of typing derivation • The notion of typing derivation compilation • A novel approach to compilation certification in the presence of dependent types
End of the Talk • Thank You! Questions?