Dependent Types in Practical Programming

Hongwei Xi, University of Cincinnati

Overview • Motivation • Program error detection at compile-time • Compilation certification • Proof carrying code (PCC) • Dependently typed programming languages • Design decisions • Dependent ML (functional) and Xanadu (imperative) • Theoretical development • Practical applications • Conclusion • Demo

Program Error Detection Unfortunately one often pays a price for [languages which impose no disciplines of types] in the time taken to find rather inscrutable bugs — anyone who mistakenly applies CDR to an atom in LISP and finds himself absurdly adding a property list to an integer, will know the symptoms. -- Robin Milner A Theory of Type Polymorphism in Programming Our work in this direction is inspired by and closely related to the work on refinement types (Davies, Freeman and Pfenning)

Some Advantages of Types • Detecting program errors at compile-time • Enabling compiler optimizations • Facilitating program verification • Using types to encode program properties • Verifying the encoded properties through type-checking • Serving as program documentation • Unlike informal comments, types are formally verified and can thus be fully trusted

Limitations of (Simple) Types • Not general enough • Many correct programs cannot be typed • For instance, downcasts are widely used in Java • Not specific enough • Many interesting program properties cannot be captured • For instance, types in Java cannot guarantee safe array access

Narrowing the Gap
[Figure: Dependent ML placed between ML and NuPrl/Coq; labels: Program Extraction, Proof Synthesis]

Safe Array Subscripting
• int(n): the singleton type for expressions of value n, where n ranges over integers
• 'a array(n): the type for arrays of size n, where n ranges over natural numbers
• length: {n:nat} 'a array(n) -> int(n)
• sub: {i:nat,n:nat | i < n} 'a array(n) * int(i) -> 'a
• update: {i:nat,n:nat | i < n} 'a array(n) * int(i) * 'a -> unit

Dot Product in DML
    fun dotprod (u, v) =
      let
        fun loop (i, len, sum) =
          if i = len then sum
          else loop (i+1, len, sum + sub(u,i) * sub(v,i))
        withtype {i:nat | i <= n} int(i) * int(n) * int -> int
      in
        loop (0, length (u), 0)
      end
    withtype {n:nat} int array(n) * int array(n) -> int
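The dot product above can be mimicked in Python as a rough sketch, with run-time checks standing in for the index constraints that DML discharges at compile-time. The names sub, dotprod and loop come from the slide; the exceptions and assertions are assumptions made only for illustration.

```python
def sub(arr, i):
    """Checked read, mirroring sub: {i:nat,n:nat | i < n} 'a array(n) * int(i) -> 'a."""
    n = len(arr)
    if not (0 <= i < n):
        # In DML this case is ruled out statically; here it is a run-time error.
        raise IndexError(f"index constraint 0 <= {i} < {n} violated")
    return arr[i]


def dotprod(u, v):
    # The DML type {n:nat} int array(n) * int array(n) -> int demands equal sizes.
    if len(u) != len(v):
        raise ValueError("both arrays must have the same size n")
    n = len(u)

    def loop(i, total):
        # Corresponds to the loop type {i:nat | i <= n} int(i) * int(n) * int -> int.
        assert 0 <= i <= n
        if i == n:
            return total
        # i <= n together with i != n gives i < n, so both reads are in bounds.
        return loop(i + 1, total + sub(u, i) * sub(v, i))

    return loop(0, 0)
```

For example, dotprod([1, 2, 3], [4, 5, 6]) evaluates 1*4 + 2*5 + 3*6. The point of the dependent types is that none of the checks in sub would be needed at run-time once the program type-checks.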

A Type for Arrays
• A polymorphic type for arrays:
    record <'a> array { size: int; data[]: 'a }
• But this does not enforce that the integer stored in size is the size of the array to which data points
[Figure: array record with fields size and data]

A Dependent Type for Arrays
• A polymorphic type for arrays:
    {n:nat} record <'a> array(n) { size: int(n); data[n]: 'a }

Dot Product in Xanadu
    {n:nat}
    int dp (u: <int>array(n), v: <int>array(n)) {
      var: int i, sum;;
      sum = 0;
      invariant: [a:int | a >= 0] (i: int(a))
      for (i=0; i < u.size; i = i+1) {
        sum = sum + u.data[i] * v.data[i];
      }
      return sum;
    }

Some Design Decisions • Practical type-checking • Realistic programming features • Conservative extension • Pay-only-if-you-use policy

ML0: start point
    base types     δ ::= int | bool | (user defined datatypes)
    types          τ ::= δ | τ1 -> τ2 | τ1 * τ2
    patterns       p ::= x | c(p) | <> | <p1, p2>
    match clauses  ms ::= (p => e) | (p => e | ms)
    expressions    e ::= x | f | c | if (e, e1, e2) | <> | <e1, e2> | lam x:τ. e | fix f:τ. e | e1(e2) | let x = e1 in e2 end | case e of ms
    values         v ::= x | c | <v1, v2> | lam x:τ. e
    contexts       Γ ::= . | Γ, x: τ

Integer Constraint Domain
We use a for index variables.
    index expressions        i, j ::= a | c | i + j | i - j | i * j | i / j | …
    index propositions       P, Q ::= i < j | i <= j | i > j | i >= j | i = j | i <> j | P ∧ Q | P ∨ Q
    index sorts              γ ::= int | {a : γ | P}
    index variable contexts  φ ::= . | φ, a: γ | φ, P
    index constraints        Φ ::= P | P ⊃ Φ | ∀a: γ.Φ

Dependent Types
    dependent types  τ ::= ... | δ(i) | Πa: γ.τ | Σa: γ.τ
For instance:
    int(0), bool array(16);
    nat = [a:int | a >= 0] int(a);
    {a:int | a >= 0} int list(a) -> int list(a)

DML0: ML0 + dependent types
    expressions      e ::= ... | λa: γ.v | e[i] | <i | e> | open e1 as <a | x> in e2 end
    values           v ::= ... | λa: γ.v | <i | v>
    typing judgment  φ; Γ |- e: τ

Some Typing Rules (cont'd)
    φ; Γ |- e: bool(i)    φ, i=1; Γ |- e1: τ    φ, i=0; Γ |- e2: τ
    ---------------------------------------------------------------- (type-if)
    φ; Γ |- if (e, e1, e2): τ

Erasure: from DML0 to ML0
The erasure function erases all syntax related to type indices:
    |bool(1)| = |bool(0)| = bool
    |[a:int | a >= 0] int(a)| = int
    |{n:nat} 'a list(n) -> 'a list(n)| = 'a list -> 'a list
    |open e1 as <a | x> in e2 end| = let x = |e1| in |e2| end
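A minimal sketch of erasure on types only, assuming a hypothetical tuple encoding that is not part of DML: ("base", name, index) for an indexed base type, ("arrow", t1, t2) and ("prod", t1, t2) for function and product types, and ("pi", var, sort, body) or ("sigma", var, sort, body) for the universal and existential dependent types. Erasure of expressions (such as open ... as ... in ... end) is not covered.

```python
def erase(t):
    """Map an indexed type to its index-free ML0 counterpart."""
    tag = t[0]
    if tag == "base":
        # ("base", name, index) -> ("base", name): the index is dropped.
        return ("base", t[1])
    if tag in ("arrow", "prod"):
        # Erase both components and keep the type constructor.
        return (tag, erase(t[1]), erase(t[2]))
    if tag in ("pi", "sigma"):
        # ("pi", var, sort, body): the quantifier disappears, the body is erased.
        return erase(t[3])
    raise ValueError(f"unknown type form: {tag!r}")


# bool(1), with the index kept as a string in this toy encoding
bool_one = ("base", "bool", "1")
# [a:int | a >= 0] int(a)
nat = ("sigma", "a", "{a:int | a >= 0}", ("base", "int", "a"))
# {n:nat} 'a list(n) -> 'a list(n)
list_fn = ("pi", "n", "nat",
           ("arrow", ("base", "'a list", "n"), ("base", "'a list", "n")))
```

Applying erase to bool_one, nat and list_fn yields the encodings of bool, int and 'a list -> 'a list, matching the first three equations on the slide.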

Elaboration for DML0 • Elaboration is a mapping that maps an external language into an internal language • We have constructed an algorithm doing elaboration for DML0 • We have proven the correctness of the algorithm

An Example of Elaboration
External language:
    fun zip ([], []) = []
      | zip (x :: xs, y :: ys) = (x, y) :: zip (xs, ys)
    withtype {n:nat} 'a list(n) * 'b list(n) -> ('a * 'b) list(n)
Internal language:
    fun zip[0] ([], []) = []
      | zip[a+1] (cons[a] (x, xs), cons[a] (y, ys)) = cons[a] ((x, y), zip[a] (xs, ys))
    withtype {n:nat} 'a list(n) * 'b list(n) -> ('a * 'b) list(n)

A Sample Constraint
• The following constraint is generated while type-checking the zip function:
    ∀p:nat. ∀q:nat. p + 1 = n ∧ q + 1 = n ⊃ p = q
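As a sanity check, the constraint can be tested by brute force over a small range of natural numbers. This is only a bounded check written for illustration, not a proof and not the procedure a DML type-checker uses; the function names and the bound are assumptions.

```python
from itertools import product


def zip_constraint(n, p, q):
    # (p + 1 = n and q + 1 = n) implies p = q, with "A implies B" as "not A or B".
    return not (p + 1 == n and q + 1 == n) or p == q


def holds_up_to(bound):
    # Enumerate every (n, p, q) with 0 <= n, p, q < bound.
    return all(zip_constraint(n, p, q)
               for n, p, q in product(range(bound), repeat=3))
```

Calling holds_up_to(20) checks all 8000 triples below 20. The constraint is valid in general because p + 1 = n and q + 1 = n give p = n - 1 = q.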

A Use of Existential Types
    fun filter p [] = []
      | filter p (x :: xs) = if p(x) then x :: (filter p xs) else filter p xs
    withtype ('a -> bool) -> {n:nat} 'a list(n) -> [m:nat | m <= n] 'a list(m)
'a list (* [m:nat] 'a list(m) *)
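One way to picture the existential result type [m:nat | m <= n] 'a list(m) is as a package of a witness m together with a list of that length. The sketch below returns such a pair in Python; the name filter_packed and the pair representation are assumptions for illustration, and the assertion plays the role of the constraint m <= n that DML verifies statically.

```python
def filter_packed(p, xs):
    n = len(xs)
    kept = [x for x in xs if p(x)]
    m = len(kept)
    # The witness must satisfy the index constraint of the existential type.
    assert 0 <= m <= n
    # The pair (m, kept) stands in for an existential package <m | v>.
    return (m, kept)
```

The caller learns only that some m <= n exists, not which one, which is why the length of the output cannot be expressed with a universally quantified index alone.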

Polymorphism • Polymorphism is largely orthogonal to dependent types • We have adopted a two-phase type-checking algorithm

References and Exceptions • A straightforward combination of effects with dependent types leads to unsoundness • We have adopted a form of value restriction to restore soundness

Quicksort in DML
    fun qs [] = []
      | qs (x :: xs) = par (x, xs, [], [])
    withtype {n:nat} int list(n) -> int list(n)
    and par (x, [], l, g) = qs (l) @ (x :: qs (g))
      | par (x, y :: ys, l, g) =
          if y <= x then par (x, ys, y :: l, g)
          else par (x, ys, l, y :: g)
    withtype {p:nat,q:nat,r:nat} int * int list(p) * int list(q) * int list(r) -> int list(p+q+r+1)

Quicksort in DML (cont'd)
Note that qs(xs) is a permutation of xs
    datatype intlist with (nat, nat) =
        Nil(0,0)
      | {i:int,s:int,l:nat} Cons(s+i,l+1) of int(i) * intlist(s,l)
    Nil: intlist(0,0)
    Cons: {i:int,s:int,l:nat} int(i) * intlist(s,l) -> intlist(s+i,l+1)
    qs: {s:int,l:nat} intlist(s,l) -> intlist(s,l)

Binary Search in Xanadu
    {n:nat}
    int bs(key: int, vec: <int> array(n)) {
      var: int l, m, u, x;;
      l = 0; u = vec.size - 1;
      invariant: [i:int,j:int | 0<=i<=j+1<=n] (l:int(i), u:int(j))
      while (l <= u) {
        m = l + (u-l) / 2;
        x = vec.data[m];
        if (x < key) { l = m + 1; }
        else if (x > key) { u = m - 1; }
        else { return m; }
      }
      return -1;
    }
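The loop invariant 0 <= i <= j+1 <= n can be exercised with a direct Python transcription of the binary search, where an assertion takes the place of the invariant that Xanadu verifies at compile-time. This is a sketch for illustration; the use of a Python list for the array is an assumption.

```python
def bs(key, vec):
    n = len(vec)
    l, u = 0, n - 1
    while l <= u:
        # State invariant from the slide: 0 <= l <= u + 1 <= n.
        assert 0 <= l <= u + 1 <= n
        m = l + (u - l) // 2
        # l <= m <= u and u + 1 <= n give 0 <= m < n, so this read is in bounds.
        x = vec[m]
        if x < key:
            l = m + 1   # m <= u keeps l <= u + 1
        elif x > key:
            u = m - 1   # m >= l keeps l <= u + 1
        else:
            return m
    return -1
```

The comments spell out why each assignment preserves the invariant, which is the reasoning the type-checker turns into index constraints.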

Termination Verification
• Termination
  • is a liveness property
  • cannot be verified at run-time
  • is often proven with a well-founded metric that decreases whenever a recursive function call is made

Ackermann Function in DML
    fun ack (m, n) =
      if m = 0 then n+1
      else if n = 0 then ack (m-1, 1)
      else ack (m-1, ack (m, n-1))
    withtype {m:nat,n:nat} <m,n> => int(m) * int(n) -> int
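To see the metric <m,n> at work, the sketch below threads the caller's metric through each recursive call and asserts that the callee's metric is strictly smaller in the lexicographic order. The extra parameter bound is an assumption introduced only for this run-time illustration; in DML the decrease is verified statically and no such argument exists.

```python
def ack(m, n, bound=None):
    assert m >= 0 and n >= 0          # m and n range over nat
    if bound is not None:
        # Python compares tuples lexicographically, matching the metric order.
        assert (m, n) < bound
    if m == 0:
        return n + 1
    if n == 0:
        return ack(m - 1, 1, (m, n))            # <m-1, 1> < <m, n>
    inner = ack(m, n - 1, (m, n))               # <m, n-1> < <m, n>
    return ack(m - 1, inner, (m, n))            # <m-1, _> < <m, n>
```

Only small arguments are practical, since the function grows extremely fast and Python limits recursion depth.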

Metric Typing Judgments
Definition (Metric). Let μ = <i_1,...,i_n> be a tuple of index expressions. We write φ |- μ: metric if we have φ |- i_j: nat for 1 <= j <= n.
We use Πa:γ.μ => τ for a decorated type.
We use the judgment φ; Γ |- e: τ <<f μ0 to mean that for each occurrence of f[i] in e, μ[a->i] < μ0 holds, where f is declared in Γ to have type Πa:γ.μ => τ.

Some Metric Typing Rules
• The rule (<<-app) is:
    φ; Γ |- e1: τ1 -> τ2 <<f μ0    φ; Γ |- e2: τ1 <<f μ0
    ------------------------------------------------------
    φ; Γ |- e1(e2): τ2 <<f μ0
• The rule (<<-lab) is:
    φ |- i: γ    φ |- μ[a->i] < μ0    Γ(f) = Πa:γ.μ => τ
    ------------------------------------------------------
    φ; Γ |- f[i]: τ[a->i] <<f μ0

DML0,μ
The following typing rule is for forming functions:
    φ, a:γ; Γ, f: Πa:γ.μ => τ |- e: τ <<f μ
    ---------------------------------------- (type-fun)
    φ; Γ |- fun f[a:γ] is e: Πa:γ.τ

Reducibility
Definition. Suppose that e is a closed expression of type τ and e ->* v holds for some value v.
• τ is a base type. Then e is reducible.
• τ = τ1 -> τ2. Then e is reducible if e(v1) is reducible for every reducible value v1 of type τ1.
• τ = τ1 * τ2. Then e is reducible if v = <v1, v2> and v1, v2 are reducible.
• τ = Πa: γ.τ1. Then e is reducible if e[i] is reducible for every i: γ.
• τ = Σa: γ.τ1. Then e is reducible if v = <i | v1> and v1 is reducible.

μ-reducibility
Definition. Let e be a well-typed closed function fun f[a:γ]: μ => τ is v and μ0 be a closed metric. e is μ0-reducible if e[i] is reducible for each i: γ satisfying μ[a->i] < μ0.
Theorem. Every closed expression e is reducible if it is well-typed in DML0,μ.

Quicksort in DML
    fun qs [] = []
      | qs (x :: xs) = par (x, xs, [], [])
    withtype {n:nat} <n,0> => int list(n) -> int list(n)
    and par (x, [], l, g) = qs (l) @ (x :: qs (g))
      | par (x, y :: ys, l, g) =
          if y <= x then par (x, ys, y :: l, g)
          else par (x, ys, l, y :: g)
    withtype {p:nat,q:nat,r:nat} <p+q+r,p+1> => int * int list(p) * int list(q) * int list(r) -> int list(p+q+r+1)
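The two metrics <n,0> for qs and <p+q+r,p+1> for par can be checked at run-time with the following sketch. The parameter bound, which carries the caller's metric, is an assumption added for illustration only, as are the assertions on list lengths that mirror the result types int list(n) and int list(p+q+r+1).

```python
def qs(xs, bound=None):
    n = len(xs)
    metric = (n, 0)                       # <n, 0>
    if bound is not None:
        assert metric < bound             # lexicographic decrease
    if not xs:
        return []
    out = par(xs[0], xs[1:], [], [], metric)
    assert len(out) == n                  # int list(n) -> int list(n)
    return out


def par(x, ys, l, g, bound):
    p, q, r = len(ys), len(l), len(g)
    metric = (p + q + r, p + 1)           # <p+q+r, p+1>
    assert metric < bound
    if not ys:
        # Here p = 0, so <q, 0> and <r, 0> are both below <q+r, 1>.
        out = qs(l, metric) + [x] + qs(g, metric)
    elif ys[0] <= x:
        out = par(x, ys[1:], [ys[0]] + l, g, metric)
    else:
        out = par(x, ys[1:], l, [ys[0]] + g, metric)
    assert len(out) == p + q + r + 1      # result type int list(p+q+r+1)
    return out
```

In the recursive calls of par the sum p+q+r stays the same while p+1 drops by one, which is why the metric needs a second component.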

Ongoing Research • Compilation certification • Dependently typed assembly language (with Robert Harper) • Proof construction for proof-carrying code

Compiler Correctness
• How can we prove the correctness of a (realistic) compiler?
• Verifying that the semantics of e is the same as the semantics of |e| for every program e
• But this simply seems too challenging (and is unlikely to be feasible)
[Figure: compilation |.| maps source program e to target code |e|]

Semantics-preserving Compilation
    e ----> |e|
    D of e ⇓ v ----> |D| of |e| ⇓ |v|
• This seems unlikely to be feasible in practice

Compilation Certification
• Assume that P(e) holds, i.e., e has the property P
• Then P(|e|) should hold, too
• A compiler can be designed to produce a certificate to assert that |e| does have the property P
[Figure: compilation |.| maps source program e, for which P(e) holds, to target code |e|, for which P(|e|) holds]

Type-preserving Compilation
    e ----> |e|
    e: τ ----> |e|: |τ|
    D of e: τ ----> |D| of |e|: |τ|
• D and |D| are both represented in LF
• The LF type-checker does all type-checking!

Proof-Carrying Code
[Figure: stages Unpacking, Verifying, Executing applied to proof-carrying code; labels: Code, Proof, Memory Safety, Termination]

Proof Construction
• Building type derivations at source level with a practical type inference algorithm
• Translating such type derivations into proofs at target level
[Figure: compilation |.| maps source program e to target code |e|; proof translation maps the proof of P(e) to the proof of P(|e|)]

Contributions • Novel language design • Reduction of type-checking to constraint satisfaction • Unobtrusive programming via elaboration • Solid theoretical foundation • Prototype implementation and evaluation

Related Work • Refinement types (Freeman, Davies & Pfenning) • Cayenne (Augustsson) • TALC Compiler (Morrisett et al. at Cornell) • Safe C compiler: Touchstone (Necula & Lee) • TIL compiler (the Fox project at CMU) • FLINT compiler (Shao et al. at Yale) • Secure Internet Programming (Appel, Felten et al. at Princeton)

End of the Talk (Demo Time) • Thank You! Questions?
