Dependently Typed Pattern Matching Hongwei Xi Boston University
Datatypes • Available in various functional programming languages such as SML and Haskell • Convenience in programming • Clarity in code
An Example: Random-Access Lists • Cons: O(log n) (Amortized: O(1)) • Uncons: O(log n) (Amortized: O(1)) • Lookup operation: O(log n) • Update operation: O(log n)
Datatype for Random-Access Lists • datatype 'a ralist = Nil | One of 'a | Even of 'a ralist * 'a ralist | Odd of 'a ralist * 'a ralist • L1: x1, …, xn; L2: y1, …, yn; Even(L1, L2): x1, y1, …, xn, yn • L1: x1, …, xn, xn+1; L2: y1, …, yn; Odd(L1, L2): x1, y1, …, xn, yn, xn+1
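To make the interleaving concrete, here is a minimal Python sketch of the same structure. The tuple encoding and the names `to_list` and `lookup` are assumptions made for illustration and do not come from the slides. `lookup` suggests why indexing takes a logarithmic number of steps on a well-formed list: each step halves the index.

```python
# Hypothetical tuple encoding of the four constructors (not from the slides):
# ("Nil",), ("One", x), ("Even", l1, l2), ("Odd", l1, l2)

def to_list(l):
    """Flatten a random-access list into the sequence it represents."""
    tag = l[0]
    if tag == "Nil":
        return []
    if tag == "One":
        return [l[1]]
    xs, ys = to_list(l[1]), to_list(l[2])
    out = []
    for x, y in zip(xs, ys):      # interleave: x1, y1, ..., xn, yn
        out += [x, y]
    if tag == "Odd":
        out.append(xs[-1])        # the extra element x(n+1) of the first list
    return out

def lookup(l, i):
    """Return the element at position i, halving the index at every level."""
    tag = l[0]
    if tag == "One" and i == 0:
        return l[1]
    if tag in ("Even", "Odd"):
        # Even positions live in the first sublist, odd positions in the second.
        return lookup(l[1] if i % 2 == 0 else l[2], i // 2)
    raise IndexError("index out of range")

even = ("Even", ("One", "a"), ("One", "b"))
odd = ("Odd", ("Even", ("One", 1), ("One", 3)), ("One", 2))
```

Here `odd` represents the sequence 1, 2, 3: its first sublist holds the elements at even positions and its second sublist those at odd positions.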
Some Inadequacies • Even should only be applied to two nonempty lists of equal length • Odd should only be applied to two nonempty lists where the first list contains exactly one more element than the second • Unfortunately, these invariants cannot be captured by the type system of ML
Dependent Datatypes for Random-Access Lists
• datatype 'a ralist with nat =
      Nil(0)
    | One(1) of 'a
    | {n:pos} Even(n+n) of 'a ralist(n) * 'a ralist(n)
    | {n:pos} Odd(n+n+1) of 'a ralist(n+1) * 'a ralist(n)
• For instance, Even is given the type: {n:pos} 'a ralist(n) * 'a ralist(n) -> 'a ralist(n+n)
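In a language without dependent types, the index constraints above can only be checked at run time. The following Python sketch mimics them with a hypothetical helper `checked_length`, using an assumed tuple encoding of the constructors; it illustrates the invariants and is not how DML enforces them, since DML checks them statically.

```python
# Assumed tuple encoding: ("Nil",), ("One", x), ("Even", l1, l2), ("Odd", l1, l2)

def checked_length(l):
    """Return the length of l, raising ValueError if an index constraint fails."""
    tag = l[0]
    if tag == "Nil":
        return 0                        # Nil(0)
    if tag == "One":
        return 1                        # One(1)
    if tag not in ("Even", "Odd"):
        raise ValueError(f"unknown constructor: {tag}")
    n1, n2 = checked_length(l[1]), checked_length(l[2])
    if n2 == 0:
        raise ValueError("n must be positive")          # {n:pos}
    if tag == "Even" and n1 != n2:
        raise ValueError("Even needs two lists of equal length")
    if tag == "Odd" and n1 != n2 + 1:
        raise ValueError("Odd needs the first list to be one element longer")
    return n1 + n2                      # n+n for Even, n+n+1 for Odd
```

For example, `("Even", ("Nil",), ("Nil",))` is accepted by the plain ML datatype but rejected here, just as it would be rejected by the dependent datatype.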
uncons in Dependent ML (DML)
• fun('a) uncons (One x) = (x, Nil)
    | uncons (Even (l1, l2)) =
        (case uncons l1 of
           (x, Nil) => (x, l2)
         | (x, l1) => (x, Odd (l2, l1)))
    | uncons (Odd (l1, l2)) =
        let val (x, l1) = uncons l1 in (x, Even (l2, l1)) end
  withtype {n:pos} 'a ralist(n) -> 'a * 'a ralist(n-1)
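For readers who want to run the algorithm, here is a direct Python transcription of uncons. The tuple encoding and the helper `drain` are assumptions added for illustration. The explicit `rest == NIL` test plays the role of the first case clause, and the fall-through plays the role of the second, which is exactly the sequential reading that the type checker does not see.

```python
# Assumed tuple encoding: ("Nil",), ("One", x), ("Even", l1, l2), ("Odd", l1, l2)
NIL = ("Nil",)

def uncons(l):
    """Split a nonempty random-access list into its head and its tail."""
    tag = l[0]
    if tag == "One":
        return l[1], NIL
    if tag == "Even":
        _, l1, l2 = l
        x, rest = uncons(l1)
        if rest == NIL:                 # clause (x, Nil) => (x, l2)
            return x, l2
        return x, ("Odd", l2, rest)     # clause (x, l1) => (x, Odd (l2, l1))
    if tag == "Odd":
        _, l1, l2 = l
        x, rest = uncons(l1)
        return x, ("Even", l2, rest)
    raise ValueError("uncons applied to an empty list")

def drain(l):
    """Hypothetical helper: repeatedly uncons until the list is empty."""
    out = []
    while l != NIL:
        x, l = uncons(l)
        out.append(x)
    return out

sample = ("Odd", ("Even", ("One", 1), ("One", 3)), ("One", 2))  # represents 1, 2, 3
```

Calling `drain(sample)` yields the elements in list order, which is a quick way to confirm that the Even and Odd cases swap the sublists correctly.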
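Pattern Matching in DML • Nondeterministic at compile-time • Sequential at run-time • This can cause an annoying problem in DML: the previous code for uncons does not type-check, because when checking the clause (x, l1) => (x, Odd (l2, l1)) the type checker cannot assume that l1 failed to match Nil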
Mutually Disjoint Patterns • Note that nondeterministic pattern matching is the same as sequential pattern matching if all patterns are mutually disjoint • We can manually expand patterns into disjoint ones, but this may be inconvenient and error-prone
An Example of Expansion
• (case uncons l1 of
     (x, Nil) => (x, l2)
   | (x, l1) => (x, Odd (l2, l1)))
  is expanded into
  (case uncons l1 of
     (x, Nil) => (x, l2)
   | (x, l1 as One _) => (x, Odd (l2, l1))
   | (x, l1 as Even _) => (x, Odd (l2, l1))
   | (x, l1 as Odd _) => (x, Odd (l2, l1)))
The Problem • Given patterns p, p1, …, pn, we intend to find a list of patterns p’1, …, p’n’ such that a value v matches p but none of p1, …, pn if and only if it matches one of p’1, …, p’n’. • Note that p’1, …, p’n’ need not be disjoint. • An algorithm that generates the least n’ is said to be optimal.
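The problem statement can be illustrated with a naive pattern-subtraction sketch in Python. This is not the optimal algorithm presented in the talk and makes no claim about the size of its output; the signature table `SIG`, the pattern encoding, and the names `subtract` and `complement` are all assumptions made for illustration. It also assumes that positions of an opaque type such as 'a only ever carry wildcards.

```python
WILD = "_"

# Hypothetical signature table: type -> constructor -> argument types.
# None marks an opaque type such as 'a, where only wildcards are expected.
SIG = {
    "ralist": {"Nil": [], "One": [None],
               "Even": ["ralist", "ralist"], "Odd": ["ralist", "ralist"]},
    "pair": {"Pair": [None, "ralist"]},   # stands for the result type of uncons
}

def subtract(p, q, ty):
    """Patterns covering the values of type ty that match p but not q."""
    if q == WILD:
        return []                          # q matches everything
    if p == WILD:
        out = []                           # expand p into one pattern per constructor
        for c, arg_tys in SIG[ty].items():
            out += subtract((c, [WILD] * len(arg_tys)), q, ty)
        return out
    (c, ps), (d, qs) = p, q
    if c != d:
        return [p]                         # different constructors never overlap
    out = []
    # A value escapes q exactly when some argument escapes the matching subpattern.
    for i, arg_ty in enumerate(SIG[ty][c]):
        for r in subtract(ps[i], qs[i], arg_ty):
            out.append((c, ps[:i] + [r] + ps[i + 1:]))
    return out

def complement(p, qs, ty):
    """Patterns matching p but none of the patterns in qs."""
    current = [p]
    for q in qs:
        current = [r for cur in current for r in subtract(cur, q, ty)]
    return current

# (x, l1) minus (x, Nil), with variables written as wildcards.
expanded = complement(("Pair", [WILD, WILD]),
                      [("Pair", [WILD, ("Nil", [])])], "pair")
```

With these definitions, `expanded` contains three patterns, one each for One, Even and Odd in the second component, which corresponds to the manual expansion of the uncons case expression. As in the problem statement, the resulting patterns need not be disjoint in general.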
The Result • An algorithm, essentially based on Laville’s work, is presented and proven to be optimal. • Note that this algorithm is exponential. • We do handle datatypes with infinitely many constructors (e.g., integers).
A Motivating Example
• fun restore (R(R t, y, c), z, d) = R(B t, y, B(c, z, d))
    | restore (R(a, x, R(b, y, c)), z, d) = R(B (a, x, b), y, B(c, z, d))
    | restore (a, x, R(R(b, y, c), z, d)) = R(B (a, x, b), y, B(c, z, d))
    | restore (a, x, R(b, y, R t)) = R(B (a, x, b), y, B t)
    | restore t == B t (* == indicates the need for resolving sequentiality *)
  withtype …
• The last clause in the above definition needs to be expanded into 36 clauses in order to type-check.
Exhaustiveness of Patterns
• datatype 'a list with nat = nil(0) | {n:nat} cons(n+1) of 'a * 'a list(n)
• fun('a, 'b) zip ([], []) = []
    | zip (x :: xs, y :: ys) = (x, y) :: zip (xs, ys)
  withtype {n:nat} 'a list(n) * 'b list(n) -> ('a * 'b) list(n)
• The pattern matching clauses in the definition of zip are exhaustive: neither ([], _ :: _) nor (_ :: _, []) can have type 'a list(n) * 'b list(n) for any natural number n.
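The exhaustiveness argument for zip can be replayed with a small Python sketch. Each list pattern shape imposes a constraint on the shared index n, and a pair of shapes is possible only if some n satisfies both. The bounded search in `inhabited` is a stand-in chosen for illustration; it is an assumption of this sketch and not how a DML type checker solves index constraints.

```python
from itertools import product

# Constraint that each list pattern shape imposes on the length index n.
SHAPES = {
    "[]": lambda n: n == 0,        # nil(0)
    "_ :: _": lambda n: n >= 1,    # cons(n+1) for some natural number n
}

def inhabited(shape_a, shape_b, bound=16):
    """Can both components have the same length n? Bounded search, for illustration."""
    return any(SHAPES[shape_a](n) and SHAPES[shape_b](n) for n in range(bound))

# The two clauses that appear in the definition of zip.
covered = {("[]", "[]"), ("_ :: _", "_ :: _")}

# Shape pairs that are not covered by a clause and could still occur.
missing = [pq for pq in product(SHAPES, repeat=2)
           if pq not in covered and inhabited(*pq)]
```

Under these assumptions `missing` is empty: the two uncovered shape pairs would require n to be both 0 and at least 1, so no well-typed argument can reach them.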
Exhaustiveness of Patterns
• fun('a) nth_safe (0, x :: _) = x
    | nth_safe (i, _ :: xs) = nth_safe (i-1, xs)
  withtype {i:nat, n:nat | i < n} int(i) * 'a list(n) -> 'a
• The pattern matching clauses are also exhaustive since …
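Tag Check Elimination • [Figure: decision tree for matching a pair of lists, with nodes numbered 1 to 7. Node 1: Pat = (_, _), Pos = o.0. Nodes 2 and 3: Pat = ([], _) and Pat = (_ :: _, _), each with Pos = o.1. Leaves 4 to 7: Pat = ([], []), Pat = ([], _ :: _), Pat = (_ :: _, []), Pat = (_ :: _, _ :: _).]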
Interpreter (I)
• sort typ = Int | Bool | Fun of typ * typ
• sort ctx = nil | :: of typ * ctx
• datatype exp = Int of int | Bool of bool
    | Add of exp * exp | Sub of exp * exp
    | Eq of exp * exp | If of exp * exp * exp
    | One | Shift of exp | Lam of exp
    | App of exp * exp | Fix of exp
Interpreter (II)
• We can refine exp with a type index expression of sort typ * ctx:
    Add: {c:ctx} exp(Int, c) * exp(Int, c) -> exp(Int, c)
    One: {t:typ, c:ctx} exp(t, t :: c)
    Shift: {ta:typ, tb:typ, c:ctx} exp(ta, c) -> exp(ta, tb :: c)
    Lam: {ta:typ, tb:typ, c:ctx} exp(tb, ta :: c) -> exp(Fun(ta, tb), c)
    …
Interpreter (III)
• fun evaluate e = eval (e, [])
  withtype {t:typ} exp(t, nil) -> value(t)
  and eval (Zero e, env) = let val ValInt i = eval (e, env) in ValBool (i = 0) end
    … …
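As a rough illustration of what such an interpreter does at run time, here is an untyped Python sketch for most of the exp constructors. The tuple encoding, the closure tag `ValFun`, the helper `untag`, and the choice to compare integers in Eq are assumptions of this sketch; Fix is left out because the slides do not show its evaluation rule. The calls to `untag` are the run-time tag checks that the indexed type of exp would make unnecessary.

```python
def untag(value, expected):
    """Run-time tag check on a value; returns the payload without the tag."""
    if value[0] != expected:
        raise TypeError(f"expected {expected}, got {value[0]}")
    return value[1:]

def eval_exp(e, env):
    """Evaluate expression e; env is a list of values, innermost binding first."""
    tag = e[0]
    if tag == "Int":
        return ("ValInt", e[1])
    if tag == "Bool":
        return ("ValBool", e[1])
    if tag in ("Add", "Sub", "Eq"):
        (i,) = untag(eval_exp(e[1], env), "ValInt")
        (j,) = untag(eval_exp(e[2], env), "ValInt")
        if tag == "Eq":
            return ("ValBool", i == j)
        return ("ValInt", i + j if tag == "Add" else i - j)
    if tag == "If":
        (b,) = untag(eval_exp(e[1], env), "ValBool")
        return eval_exp(e[2] if b else e[3], env)
    if tag == "One":
        return env[0]                       # the most recently bound variable
    if tag == "Shift":
        return eval_exp(e[1], env[1:])      # skip one binding
    if tag == "Lam":
        return ("ValFun", e[1], env)        # closure: body plus defining environment
    if tag == "App":
        body, closure_env = untag(eval_exp(e[1], env), "ValFun")
        arg = eval_exp(e[2], env)
        return eval_exp(body, [arg] + closure_env)
    raise ValueError(f"unsupported expression: {tag}")

def evaluate(e):
    """Evaluate a closed expression in the empty environment."""
    return eval_exp(e, [])

inc = ("Lam", ("Add", ("One",), ("Int", 1)))      # fn x => x + 1
const = ("Lam", ("Lam", ("Shift", ("One",))))     # fn x => fn y => x
```

For instance, `evaluate(("App", inc, ("Int", 41)))` applies the increment function to 41, and `const` shows how Shift reaches past the innermost binding.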
Untagged Representation • Obviously, there is no need for tags if we never do tag-checking on the values of a particular datatype • However, garbage collection makes things much more difficult
Conclusion • Dependent datatypes can more accurately model data structures • More program errors can be detected at compile-time • Code becomes more robust • This is a case where safer code runs faster