Typed Compilation of Recursive Datatypes

Joseph C. Vanderwaart, Derek Dreyer, Leaf Petersen, Karl Crary, Robert Harper, and Perry Cheng. Carnegie Mellon University. TLDI 2003.

Presentation Transcript


  1. Typed Compilation of Recursive Datatypes
     • Joseph C. Vanderwaart, Derek Dreyer, Leaf Petersen, Karl Crary, Robert Harper, and Perry Cheng
     • Carnegie Mellon University
     • TLDI 2003

  2. SML Datatypes
     • Elegant mechanism for defining recursive variant types, such as:
         datatype intlist = Nil | Cons of int * intlist
     • It is important that constructor applications and pattern matching be implemented efficiently
     • Subject of this talk:
       • How to implement SML datatypes efficiently in a type-preserving compiler
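As a small illustration of the two operations the slide singles out, the sketch below builds a value with constructor applications and consumes it by pattern matching. The names xs, sum, and total are illustrative and do not come from the talk.

```sml
(* The datatype from the slide. *)
datatype intlist = Nil | Cons of int * intlist

(* Constructor applications build a three-element list. *)
val xs = Cons (1, Cons (2, Cons (3, Nil)))

(* Pattern matching takes the list apart again. *)
fun sum Nil = 0
  | sum (Cons (n, rest)) = n + sum rest

val total = sum xs
```

Every Cons and every match clause here is a point where the compiler must choose a concrete implementation, which is why their cost matters.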

  3. Formal Framework
     • Harper and Stone’s type-theoretic interpretation of Standard ML (HS):
       • “Elaborates” SML programs into a type theory
     • Reasons for using HS:
       • Models the first phase of a type-preserving compiler, in particular the TILT compiler (developed at CMU)
       • Can explain datatype semantics in terms of type theory

  4. Overview
     • Three interpretations of datatypes:
       • Harper-Stone interpretation
       • Transparent interpretation
       • Coercion interpretation
     • Comparison on three axes:
       • Efficiency
       • Fidelity to the Definition of SML
       • Meta-theoretic complexity

  5. The Harper-Stone Interpretation

  6. Datatype Semantics
     • SML datatypes are generative:
       • Identical datatype declarations in separate modules yield distinct (abstract) types
     • HS elaborates datatypes as modules providing:
       • The datatype itself, defined as a recursive sum type
       • Functions to construct and destruct values of the datatype
     • HS models generativity by “sealing” the datatype module with an abstract signature
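Generativity can be observed directly in SML source. In the sketch below, the structures A and B and the function sizeA are hypothetical names chosen for illustration: the two datatype declarations are textually identical, yet a function over A.t does not accept a B.t value.

```sml
(* Two textually identical datatype declarations in separate modules. *)
structure A = struct datatype t = Leaf | Node of t * t end
structure B = struct datatype t = Leaf | Node of t * t end

(* A function over A.t only. *)
fun sizeA A.Leaf = 1
  | sizeA (A.Node (l, r)) = sizeA l + sizeA r

val okay = sizeA (A.Node (A.Leaf, A.Leaf))

(* Uncommenting the next line gives a type error, because B.t is a
   different type from A.t even though the declarations match:
val bad = sizeA B.Leaf
*)
```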

  7. ExpDec Example
         datatype exp = VarExp of var | LetExp of dec * exp
         and dec = ValDec of var * exp | SeqDec of dec * dec

         VarExp(v)     ≈ “v”
         LetExp(d,e)   ≈ “let d in e”
         ValDec(v,e)   ≈ “val v = e”
         SeqDec(d1,d2) ≈ “d1; d2”

  8. ExpDec Implementation
         structure ExpDec :> EXPDEC = struct
           type exp = μ1(α,β).(var + β * α, var * α + β * β)
           type dec = μ2(α,β).(var + β * α, var * α + β * β)
           fun exp_in x = roll_exp(x)
           fun exp_out x = unroll_exp(x)
           fun dec_in x = roll_dec(x)
           fun dec_out x = unroll_dec(x)
         end

  9. ExpDec Interface
         signature EXPDEC = sig
           type exp
           type dec
           val exp_in : var + (dec * exp) -> exp
           val exp_out : exp -> var + (dec * exp)
           val dec_in : (var * exp) + (dec * dec) -> dec
           val dec_out : dec -> (var * exp) + (dec * dec)
         end

  10. Elaborating Constructor Calls
      • The client of the datatype does the injection into the sum, then calls the datatype’s “in” function:
          VarExp(v)     ⇝ ExpDec.exp_in(inj1(v))
          LetExp(d,e)   ⇝ ExpDec.exp_in(inj2(d,e))
          ValDec(v,e)   ⇝ ExpDec.dec_in(inj1(v,e))
          SeqDec(d1,d2) ⇝ ExpDec.dec_in(inj2(d1,d2))
      • But a function call to an in function for every constructor application is too expensive.
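The elaboration on slides 8 to 10 can be approximated in SML itself. This is a sketch under stated assumptions, not the actual HS internal language: SML has no μ types, so the single-constructor datatypes RollExp and RollDec stand in for roll and unroll, the sum type with Inj1 and Inj2 stands in for binary sums, var is taken to be string, and isLet is a hypothetical client function.

```sml
type var = string

(* Binary sum standing in for "+" in the internal language. *)
datatype ('a, 'b) sum = Inj1 of 'a | Inj2 of 'b

signature EXPDEC = sig
  type exp
  type dec
  val exp_in  : (var, dec * exp) sum -> exp
  val exp_out : exp -> (var, dec * exp) sum
  val dec_in  : (var * exp, dec * dec) sum -> dec
  val dec_out : dec -> (var * exp, dec * dec) sum
end

(* Sealing with :> hides the implementation, as on slide 8. *)
structure ExpDec :> EXPDEC = struct
  datatype exp = RollExp of (var, dec * exp) sum
       and dec = RollDec of (var * exp, dec * dec) sum
  fun exp_in x = RollExp x
  fun exp_out (RollExp x) = x
  fun dec_in x = RollDec x
  fun dec_out (RollDec x) = x
end

(* Client code: inject into the sum, then call the "in" function. *)
val v = ExpDec.exp_in (Inj1 "x")          (* VarExp("x")    *)
val d = ExpDec.dec_in (Inj1 ("y", v))     (* ValDec("y", v) *)
val e = ExpDec.exp_in (Inj2 (d, v))       (* LetExp(d, v)   *)

(* Pattern matching goes through the "out" function. *)
fun isLet t =
  case ExpDec.exp_out t of
      Inj1 _ => false
    | Inj2 _ => true
```

Because ExpDec is sealed, the client cannot mention RollExp or RollDec, so every constructor application really is a call across the module boundary.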

  11. Inlining the Constructor Calls
      • We would like to inline the rolls to avoid calling the exp_in and dec_in functions:
          VarExp(v)     ⇝ roll_ExpDec.exp(inj1(v))
          LetExp(d,e)   ⇝ roll_ExpDec.exp(inj2(d,e))
          ValDec(v,e)   ⇝ roll_ExpDec.dec(inj1(v,e))
          SeqDec(d1,d2) ⇝ roll_ExpDec.dec(inj2(d1,d2))
      • But the definitions of exp and dec are not known outside of ExpDec, so inlining the rolls is ill-typed!

  12. Separate Compilation
      • Not a problem if the client of the datatype is defined in the same compilation unit:
        • Unseal the datatype ⇒ the rolls become well-typed
      • Is a problem if the client of the datatype is defined in a separately compiled module:
        • The datatype is an abstract import of the client
        • Can’t assume knowledge of its implementation
      • Similar problem for datatypes in functor arguments

  13. A Transparent Interpretation

  14. Making Datatypes Transparent
      • Expose the implementation of a datatype as a recursive sum type in its interface:
          signature EXPDEC = sig
            type exp = μ1(α,β).(var + β * α, var * α + β * β)
            type dec = μ2(α,β).(var + β * α, var * α + β * β)
            (* in and out function specs as before *)
          end
      • Inlining calls to the in and out functions is now well-typed outside of ExpDec

  15. Implications of Transparency
      • Datatypes are no longer generative
        • Identically defined datatypes are “visibly” equal
        • More types are equivalent, so more programs may typecheck
      • Matching a datatype specification is harder
        • To match a datatype spec, a datatype must now be implemented as a particular recursive sum type
        • Depending on how you define recursive type equivalence, fewer programs may typecheck!

  16. Transparent Matching Example
          struct
            datatype exp = VarExp of var | LetExp of dec * exp
            and dec = ValDec of var * exp | SeqDec of dec * dec
          end
          :>
          sig
            type exp
            datatype dec = ValDec of var * exp | SeqDec of dec * dec
          end
          ?

  17. Transparent Matching Example
          struct
            type exp = μ1(α,β).(var + β * α, var * α + β * β)
            type dec = μ2(α,β).(var + β * α, var * α + β * β)
          end
          :>
          sig
            type exp
            datatype dec = ValDec of var * exp | SeqDec of dec * dec
          end
          ?

  18. Transparent Matching Example
          struct
            type exp = μ1(α,β).(var + β * α, var * α + β * β)
            type dec = μ2(α,β).(var + β * α, var * α + β * β)
          end
          :>
          sig
            type exp
            type dec = μ1(β).(var * exp + β * β)
          end
          ?

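  19. Transparent Matching Example
          struct
            type exp = μ1(α,β).(var + β * α, var * α + β * β)
            type dec = μ2(α,β).(var + β * α, var * α + β * β)
          end
          :>
          sig
            type exp
            type dec = μ1(β).(var * exp + β * β)
          end
          ?
      • The match holds only if the two definitions of dec are equal:
          μ2(α,β).(var + β * α, var * α + β * β)  =?  μ1(β).(var * exp + β * β)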

  20. Notation
      • Use δ to stand for a recursive type, i.e.:
          δ ::= μk(α1,...,αn).(τ1,...,τn)   (k ∈ 1..n)
      • Expansion of a recursive type: expand(δ)
      • For example, if intlist = μα. 1 + int * α, then expand(intlist) = 1 + int * intlist
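The expansion can be spelled out for the mutually recursive case as well. The general equation below is the usual reading of expand (substitute each projection of the recursive bundle for its bound variable) and is an assumption about the paper's exact definition. Applied to the ExpDec types, it recovers the sum types that appear in the EXPDEC signature.

```latex
% Write \delta_i = \mu_i(\alpha_1,\dots,\alpha_n).(\tau_1,\dots,\tau_n).
\[
  \mathrm{expand}(\delta_k) \;=\; \tau_k[\delta_1/\alpha_1,\dots,\delta_n/\alpha_n]
\]
% For ExpDec: exp = \mu_1(\alpha,\beta).(\ldots) and dec = \mu_2(\alpha,\beta).(\ldots),
% with \tau_1 = var + \beta * \alpha and \tau_2 = var * \alpha + \beta * \beta.
\begin{align*}
  \mathrm{expand}(exp) &= (var + \beta * \alpha)[exp/\alpha,\, dec/\beta]
                        = var + dec * exp \\
  \mathrm{expand}(dec) &= (var * \alpha + \beta * \beta)[exp/\alpha,\, dec/\beta]
                        = var * exp + dec * dec
\end{align*}
```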

  21. Iso-Recursive Types
      • Iso-recursive equivalence is purely structural:
        • δ ≠ expand(δ), but the two are isomorphic
        • roll_δ : expand(δ) → δ
        • unroll_δ : δ → expand(δ)
      • Works fine for H-S with abstract datatypes, but…
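The isomorphism between δ and expand(δ) can be mimicked in SML for the intlist example. This is only an approximation: the constructor Roll plays the role of roll_δ, and the names sum, expanded, nil', cons, and len are illustrative rather than taken from the talk.

```sml
datatype ('a, 'b) sum = Inj1 of 'a | Inj2 of 'b

(* intlist = μα. 1 + int * α, with Roll marking each fold point. *)
datatype intlist = Roll of (unit, int * intlist) sum

(* expand(intlist) = 1 + int * intlist *)
type expanded = (unit, int * intlist) sum

(* The two halves of the isomorphism. *)
fun roll (x : expanded) : intlist = Roll x
fun unroll (Roll x : intlist) : expanded = x

(* Nil and Cons expressed through roll. *)
val nil' = roll (Inj1 ())
fun cons (n, l) = roll (Inj2 (n, l))

(* Pattern matching expressed through unroll. *)
fun len l =
  case unroll l of
      Inj1 () => 0
    | Inj2 (_, rest) => 1 + len rest
```

The types intlist and expanded are distinct here, and values move between them only through roll and unroll, which is the iso-recursive discipline.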

  22. Transparent Matching Example
          struct
            type exp = μ1(α,β).(var + β * α, var * α + β * β)
            type dec = μ2(α,β).(var + β * α, var * α + β * β)
          end
          :>
          sig
            type exp
            type dec = μ1(β).(var * exp + β * β)
          end
          ? ✗ (no: the two definitions of dec are not equal under iso-recursive equivalence)

  23. Equi-Recursive Types
      • Another form of recursive type equivalence:
        • δ = expand(δ)
        • μα.τ(α) represents the unique solution of α = τ(α)
        • δ = μα.τ(α) iff δ = τ(δ)
      • Equi-recursive equivalence is sufficient:
        • dec matches its specification
        • Enables the transparent interpretation to accept all valid SML datatype matchings
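The claim that dec matches its specification follows from the two rules on this slide. The derivation below is a sketch in which the abbreviations δ_exp, δ_dec, and δ' are introduced here for readability, with exp in the specification taken to refer to the structure's own exp.

```latex
% \delta_{exp} = \mu_1(\alpha,\beta).(var + \beta * \alpha,\; var * \alpha + \beta * \beta)
% \delta_{dec} = \mu_2(\alpha,\beta).(var + \beta * \alpha,\; var * \alpha + \beta * \beta)
% \delta'      = \mu\beta.(var * \delta_{exp} + \beta * \beta)   (the type in the specification)
\begin{align*}
  \delta_{dec} &= \mathrm{expand}(\delta_{dec})
      && \text{equi-recursive rule } \delta = \mathrm{expand}(\delta) \\
               &= var * \delta_{exp} + \delta_{dec} * \delta_{dec}
      && \text{definition of expand}
\end{align*}
% So \delta_{dec} solves \beta = var * \delta_{exp} + \beta * \beta.
% By the rule  \delta = \mu\alpha.\tau(\alpha) \iff \delta = \tau(\delta):
\[
  \delta_{dec} \;=\; \mu\beta.(var * \delta_{exp} + \beta * \beta) \;=\; \delta'
\]
```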

  24. A Hybrid Equivalence
      • Equi-recursive equivalence is overkill:
        • Unnecessary to equate a recursive type with a non-recursive type (its expansion)
      • Hybrid of iso- and equi-recursive equivalence:
        • Based on the FLINT intermediate language [League and Shao]
        • Restriction of the Amadio-Cardelli algorithm
        • Only equates δ’s with δ’s
      • The paper gives details of the hybrid algorithm, along with a formal argument that it is sufficient

  25. Complications
      • Strong versions of type equivalence are not well studied outside the simply typed λ-calculus. (TILT’s ILs have higher-order constructors, singleton kinds, …)
      • Conflicts with SML semantics:
        • Datatypes are no longer generative.
        • Problems involving datatypes in sharing and where type constraints.
      • To implement SML, these issues must be handled another way.

  26. The Coercion Interpretation

  27. Those in and out Functions
      • Recall the definitions given during elaboration:
          fun in(x) = roll(x)
          fun out(x) = unroll(x)
      • Consider the roll and unroll operations.
        • Commonly implemented as “no-ops”. That is, the values v and roll(v) are represented the same.
        • So roll and unroll are just “retyping” operators, or coercions.
      • The untyped machine code for in/out is the same as for the identity function.

  28. ExpDec Revisited
          signature EXPDEC = sig
            type exp
            type dec
            val exp_in : var + (dec * exp) -> exp
            val exp_out : exp -> var + (dec * exp)
            val dec_in : (var * exp) + (dec * dec) -> dec
            val dec_out : dec -> (var * exp) + (dec * dec)
          end
      • At runtime, exp_in and exp_out act as the identity, but:
        • This cannot be recognized from the type
      • New type constructor: τ1 ⇒ τ2 (replaces -> in the four specs above)
        • Inhabited only by coercive terms
        • Coerciveness of exp_in, exp_out is reflected in the type
        • Applications can be ignored at runtime

  29. Coercions
      • New constructs for the internal language:
        • Coercion values fold/unfold replace roll_δ/unroll_δ
        • Special type τ1 ⇒ τ2 distinguishes them from functions.
        • Special application syntax: v @ e
      • Define in/out using coercions:
          val in : expand(δ) ⇒ δ = fold
          val out : δ ⇒ expand(δ) = unfold
      • Define constructor applications using coercion applications:
          VarExp(x) ⇝ ExpDec.exp_in@(inj1(x))

  30. Coercion Erasure
      • Why are coercion applications better than function applications? Because:
        • A closed value of coercion type can only be fold or unfold.
        • No work is required at run time to apply either fold or unfold.
        • To compile v@e, generate the same code as for e.
      • Safety argument (in the paper)
        • Formalized via a translation into an untyped target calculus.
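The erasure idea can be sketched as a function from a small source term language to an untyped target. This is a toy model, not the calculus from the paper: the datatypes coercion, term, and uterm and the function erase are all hypothetical, and the coercion position holds only a literal Fold or Unfold, reflecting the slide's observation about closed coercion values.

```sml
(* The only closed coercion values. *)
datatype coercion = Fold | Unfold

(* Source terms, with a separate form for coercion application v @ e. *)
datatype term =
    Var of string
  | Lam of string * term
  | App of term * term            (* ordinary function application *)
  | Inj of int * term             (* injection into a sum *)
  | CoApp of coercion * term      (* coercion application v @ e *)

(* Untyped target terms: there is no coercion form at all. *)
datatype uterm =
    UVar of string
  | ULam of string * uterm
  | UApp of uterm * uterm
  | UInj of int * uterm

(* Erasure keeps everything except coercion applications,
   which compile to the code for their argument. *)
fun erase (Var x) = UVar x
  | erase (Lam (x, e)) = ULam (x, erase e)
  | erase (App (f, e)) = UApp (erase f, erase e)
  | erase (Inj (i, e)) = UInj (i, erase e)
  | erase (CoApp (_, e)) = erase e
```

For instance, a constructor application such as exp_in@(inj1(v)) modelled as CoApp (Fold, Inj (1, Var "v")) erases to UInj (1, UVar "v"), while an ordinary App survives as a UApp and still costs a call.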

  31. Performance
      • Run times of benchmarks under the three interpretations:
        • Harper-Stone ≈ 37% slower than the others
        • Coercion interpretation about the same as transparent.
      • The coercion interpretation is faithful to SML semantics and requires only a simple extension to the type theory.

  32. Conclusion
