Explore how phantom types use additional type constraints to encode subtyping hierarchies in a modern functional language. Learn how to apply phantom types to turn run-time domain errors into compile-time type errors, and how to derive safe interfaces and implementations while preserving the existing implementation.
Phantom Types and Subtyping
Matthew Fluet, Riccardo Pucella
Dept. of Computer Science, Cornell University

The Setting (I)
• Modern strongly typed functional language
  • Parametric polymorphism
      val id: 'a -> 'a = fn x => x
  • Type constraints
      val idInt: int -> int = fn x => x
  • Datatypes and type constructors
      datatype 'a tree = Leaf of 'a | Node of 'a tree * 'a tree
TCS2002

The Setting (II)
• Modern strongly typed functional language
• Limited expressibility at foreign function interfaces
  • No polymorphic types
  • No user defined datatypes
• No primitive notion of subtyping

The Problem
    datatype atom = I of int | B of bool
    fun mkI (i:int):atom = I(i)
    fun mkB (b:bool):atom = B(b)
    fun toString (v:atom):string = ...
    fun double (v:atom):atom = ...
    fun conj (v1:atom, v2:atom):atom = ...

    toString (mkI 1)        ⇒ "1"
    toString (mkB false)    ⇒ "false"
    double (mkB true)       ⇒ run-time error
    conj (mkI 3, mkB true)  ⇒ run-time error

Wish List
• Raise compile-time type errors on domain violations rather than run-time errors
  • toString should apply to all atoms
  • double should only apply to integer atoms
  • conj should only apply to boolean atoms
• Preserve the implementation
• Would like to treat integer and boolean atoms as subtypes of all atoms

A First Solution (I)
    type All, Int, Bool
    datatype 'a atom = I of int | B of bool
    fun mkI (i:int):Int atom = ...
    fun mkB (b:bool):Bool atom = ...
    fun toString (v:All atom):string = ...
    fun double (v:Int atom):Int atom = ...
    fun conj (v1:Bool atom, v2:Bool atom):Bool atom = ...

    double (mkB true)       compile-time type error; Int atom ≠ Bool atom
    conj (mkI 3, mkB true)  compile-time type error; Bool atom ≠ Int atom
    toString (mkI 1)        compile-time type error; Int atom ≠ All atom
    toString (mkB false)    compile-time type error; Bool atom ≠ All atom

Phantom Types
    type All, Int, Bool
    datatype 'a atom = I of int | B of bool
    fun mkI (i:int):Int atom = ...
    fun mkB (b:bool):Bool atom = ...
• Phantom types
  • Abstract types that need not have any corresponding run-time values
• Phantom type variables
  • Type instantiations of 'a in 'a atom do not contribute to the run-time representation of atoms

A First Solution (II)
    type All, Int, Bool
    datatype 'a atom = I of int | B of bool
    fun mkI (i:int):Int atom = ...
    fun mkB (b:bool):Bool atom = ...
    fun toString (v:All atom):string = ...
    fun double (v:Int atom):Int atom = ...
    fun conj (v1:Bool atom, v2:Bool atom):Bool atom = ...
    fun intToAll (v:Int atom):All atom = v
    fun boolToAll (v:Bool atom):All atom = v

    toString (intToAll (mkI 1))       ⇒ "1"
    toString (boolToAll (mkB false))  ⇒ "false"

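The coercion-based version can be sketched as a compilable Standard ML fragment. This is only an illustration under stated assumptions: the tag types are declared as one-constructor datatypes (the constructor names AllTag, IntTag and BoolTag are hypothetical), and because atom is a plain datatype here, the coercions rebuild the value at the new tag instead of returning v unchanged.

```sml
(* Tag types: one distinct type per implicit subtype. The constructor
   names AllTag, IntTag, BoolTag are hypothetical and never used. *)
datatype All = AllTag
datatype Int = IntTag
datatype Bool = BoolTag

(* 'a is a phantom type variable: it appears in no constructor. *)
datatype 'a atom = I of int | B of bool

fun mkI (i : int) : Int atom = I i
fun mkB (b : bool) : Bool atom = B b

fun toString (v : All atom) : string =
  case v of
      I i => Int.toString i
    | B b => Bool.toString b

(* With atom declared as a datatype, returning v unchanged would not
   type-check, so each coercion rebuilds the value at the All tag. *)
fun intToAll (v : Int atom) : All atom =
  case v of I i => I i | B b => B b

fun boolToAll (v : Bool atom) : All atom =
  case v of I i => I i | B b => B b

val s1 = toString (intToAll (mkI 1))
val s2 = toString (boolToAll (mkB false))

(* Rejected by the type checker, since Int atom is not All atom:
   val bad = toString (mkI 1) *)
```

Every call to toString now needs an explicit coercion, which is the inconvenience the next slide removes.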
A Better Solution
    type All, Int, Bool
    datatype 'a atom = I of int | B of bool
    fun mkI (i:int):Int atom = ...
    fun mkB (b:bool):Bool atom = ...
    fun toString (v:'a atom):string = ...
    fun double (v:Int atom):Int atom = ...
    fun conj (v1:Bool atom, v2:Bool atom):Bool atom = ...

    double (mkB true)       compile-time type error; Int atom ≠ Bool atom
    conj (mkI 3, mkB true)  compile-time type error; Bool atom ≠ Int atom
    toString (mkI 1)        well typed; Int atom unifies with 'a atom
    toString (mkB false)    well typed; Bool atom unifies with 'a atom

The Phantom Types Technique
• Use a superfluous type variable and type constraints to encode “extra” information
• Underlies many interesting uses of type systems
  • Foreign function interfaces
  • Embedded languages
  • Uncaught exception analysis
• A “folklore” technique

Contributions
• A general encoding of subtyping hierarchies into phantom types
• A formalization of one use of the phantom types technique

Outline
• A recipe for interfaces and implementations
• Encoding subtyping hierarchies
• Bounded polymorphism
• Formalization

Features of the example
• An underlying primitive type of values
• A set of operations
• A hierarchy of implicit subtypes
[Diagram: the operations mkI and toString annotated with the implicit subtypes Int and All; the hierarchy All, with Int and Bool below it]

From Subtyping to Polymorphism

The Recipe
• Given:
  • A primitive type τp
  • An implicit subtyping hierarchy σ1,…,σn
  • An implementation of τp and its operations
• Derive:
  • A “safe” interface (the types)
  • A “safe” implementation (the code)
• Restrictions
  • Shared representation and operations

Applying the Recipe
• Given:
  • A primitive type: atom
  • An implicit subtyping hierarchy: All, Int, Bool
  • An implementation: structure Atom
• Derive:
  • A “safe” interface: signature SAFE_ATOM
  • A “safe” implementation: structure SafeAtom

Deriving the Interface (I)
• Introduce a type α τ
• Encode each implicit type σ as ⟨σ⟩ τ, using a concrete encoding ⟨σ⟩C and an abstract encoding ⟨σ⟩A
• ⟨σ1⟩C unifies with ⟨σ2⟩A iff σ1 ≤ σ2
• Example:
    AllC  = unit    AllA  = α
    IntC  = int     IntA  = int
    BoolC = bool    BoolA = bool

Deriving the Interface (II)
• Use concrete encodings in all covariant type positions
• Use abstract encodings in most contravariant type positions

Deriving the Interface (III)
    signature ATOM = sig
      type atom
      val mkI: int -> atom
      val mkB: bool -> atom
      val toString: atom -> string
      val double: atom -> atom
      val conj: atom * atom -> atom
    end

    signature SAFE_ATOM = sig
      type 'a atom
      val mkI: int -> IntC atom
      val mkB: bool -> BoolC atom
      val toString: AllA atom -> string
      val double: IntA atom -> IntC atom
      val conj: BoolA atom * BoolA atom -> BoolC atom
    end

Applying the Recipe
• Given:
  • An abstract type: atom
  • An implicit subtyping hierarchy: All, Int, Bool
  • An implementation: structure Atom
• Derive:
  • A “safe” interface: signature SAFE_ATOM
  • A “safe” implementation: structure SafeAtom

Deriving the Implementation (I)
• Need a type α τ isomorphic to τp
  • the type system should consider τ1 τ and τ2 τ equivalent iff τ1 and τ2 are equivalent
• Opaque signature constraint
  • Hides all type implementation details

Deriving the Implementation (II)
    structure SafeAtom1 :> SAFE_ATOM = struct
      type 'a atom = Atom.atom
      val mkI = Atom.mkI
      val mkB = Atom.mkB
      val toString = Atom.toString
      val double = Atom.double
      val conj = Atom.conj
    end

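Putting the interface and implementation steps together, the recipe can be sketched end to end. The fragment below is a sketch, not the authors' exact code: it assumes the simple encodings from the earlier example (IntC = IntA = int, BoolC = BoolA = bool, AllA = a fresh type variable) and fills in the bodies of structure Atom with the run-time-checked operations from the problem statement.

```sml
(* The original, unsafe implementation: domain errors surface at run time. *)
structure Atom =
struct
  datatype atom = I of int | B of bool
  fun mkI i = I i
  fun mkB b = B b
  fun toString (I i) = Int.toString i
    | toString (B b) = Bool.toString b
  fun double (I i) = I (2 * i)
    | double _ = raise Fail "type mismatch"
  fun conj (B b1, B b2) = B (b1 andalso b2)
    | conj _ = raise Fail "type mismatch"
end

(* The safe interface, with the example encodings written out. *)
signature SAFE_ATOM =
sig
  type 'a atom
  val mkI : int -> int atom                       (* result: IntC *)
  val mkB : bool -> bool atom                     (* result: BoolC *)
  val toString : 'a atom -> string                (* argument: AllA *)
  val double : int atom -> int atom               (* IntA -> IntC *)
  val conj : bool atom * bool atom -> bool atom   (* BoolA * BoolA -> BoolC *)
end

(* Opaque ascription hides that 'a atom is Atom.atom for every 'a, so
   int atom and bool atom are distinct types outside the structure. *)
structure SafeAtom :> SAFE_ATOM =
struct
  type 'a atom = Atom.atom
  val mkI = Atom.mkI
  val mkB = Atom.mkB
  val toString = Atom.toString
  val double = Atom.double
  val conj = Atom.conj
end

val doubled = SafeAtom.toString (SafeAtom.double (SafeAtom.mkI 21))
val both = SafeAtom.toString (SafeAtom.conj (SafeAtom.mkB true, SafeAtom.mkB false))

(* Rejected at compile time, because bool atom does not unify with int atom:
   val bad = SafeAtom.double (SafeAtom.mkB true) *)
```

The code of Atom is reused unchanged; only the types seen by clients differ.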
Encoding Subtyping Hierarchies (I)
• Powerset lattice encoding
  • S = {s1,…,sn} is a finite set
  • Subsets X ⊆ S, ordered by inclusion
  • ⟨X⟩C = t1 × … × tn, where ti = unit if si ∈ X, and ti = unit z otherwise
  • ⟨X⟩A = t1 × … × tn, where ti = αi if si ∈ X, and ti = αi z otherwise

Encoding Subtyping Hierarchies (II)
    All  = {s1, s2}    AllC  = unit × unit        AllA  = α1 × α2
    Int  = {s1}        IntC  = unit × unit z      IntA  = α1 × α2 z
    Bool = {s2}        BoolC = unit z × unit      BoolA = α1 z × α2
    None = {}          NoneC = unit z × unit z    NoneA = α1 z × α2 z

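The powerset encoding for All, Int and Bool can be written out as Standard ML type abbreviations and plugged into a safe interface. This is a sketch under assumptions: the constructor name Z, the signature name SAFE_ATOM_P and the structure name SafeAtomP are hypothetical, and the abstract encodings are modelled as two-parameter type abbreviations so that each use can supply fresh type variables.

```sml
(* 'a z is the marker type constructor; the constructor name Z is assumed. *)
datatype 'a z = Z

(* Concrete encodings for S = {s1, s2}. *)
type AllC  = unit * unit
type IntC  = unit * unit z
type BoolC = unit z * unit

(* Abstract encodings, parameterised by the type variables they mention. *)
type ('a1, 'a2) AllA  = 'a1 * 'a2
type ('a1, 'a2) IntA  = 'a1 * 'a2 z
type ('a1, 'a2) BoolA = 'a1 z * 'a2

signature SAFE_ATOM_P =
sig
  type 'a atom
  val mkI : int -> IntC atom
  val mkB : bool -> BoolC atom
  val toString : ('a1, 'a2) AllA atom -> string
  val double : ('a1, 'a2) IntA atom -> IntC atom
  val conj : ('a1, 'a2) BoolA atom * ('b1, 'b2) BoolA atom -> BoolC atom
end

structure SafeAtomP :> SAFE_ATOM_P =
struct
  datatype rep = I of int | B of bool
  type 'a atom = rep   (* the phantom parameter is ignored at run time *)
  val mkI = I
  val mkB = B
  fun toString (I i) = Int.toString i
    | toString (B b) = Bool.toString b
  fun double (I i) = I (2 * i)
    | double _ = raise Fail "type mismatch"
  fun conj (B b1, B b2) = B (b1 andalso b2)
    | conj _ = raise Fail "type mismatch"
end

(* IntC = unit * unit z unifies with IntA = 'a1 * 'a2 z and with AllA. *)
val four = SafeAtomP.toString (SafeAtomP.double (SafeAtomP.mkI 2))
val no = SafeAtomP.toString (SafeAtomP.mkB false)

(* Rejected: BoolC = unit z * unit does not unify with 'a1 * 'a2 z,
   because unit is not of the form 'a2 z:
   val bad = SafeAtomP.double (SafeAtomP.mkB true) *)
```

A concrete encoding unifies with an abstract one exactly when the first set is included in the second, which is the subtyping check performed by the type checker.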
Encoding Subtyping Hierarchies (III)
• Any finite hierarchy can be embedded in the powerset lattice of a set S
• Better encodings for specific classes of hierarchies

Bounded Polymorphism
• Extends both parametric polymorphism and subtyping
  • double: ∀α≤Int. α → α
  • toString: ∀α≤All. α → string
  • plus: ∀α≤Int.(α × α) → α
• Provides a connection between type instantiation and subtyping
• We can safely encode a restricted form of bounded polymorphism using a simple extension of our recipe

Formalization
• Translation
  • From a language with a restricted form of bounded polymorphism
  • To a language with parametric polymorphism
  • Using the “recipe” given earlier
• See paper for details

Conclusion
• Use type equivalence to encode information in a free type variable
• Use unification to enforce a particular relation on the information
• Practical issues
  • complexity of types

The Problem (II)
    datatype atom = I of int | B of bool
    fun mkI (i:int):atom = I(i)
    fun mkB (b:bool):atom = B(b)
    fun toString (v:atom):string =
      case v of I(i) => Int.toString(i)
              | B(b) => Bool.toString(b)
    fun double (v:atom):atom =
      case v of I(i) => I (2 * i)
              | _ => raise (Fail "type mismatch")
    fun conj (v1:atom, v2:atom):atom =
      case (v1,v2) of (B(b1),B(b2)) => B (b1 andalso b2)
                    | _ => raise (Fail "type mismatch")

A Better Solution (II)
    type All = unit and Int = unit and Bool = unit
    datatype 'a atom = I of int | B of bool
    fun mkI (i:int):Int atom = I(i)
    fun mkB (b:bool):Bool atom = B(b)
    fun toString (v:'a atom):string =
      case v of I(i) => Int.toString(i)
              | B(b) => Bool.toString(b)
    fun double (v:Int atom):Int atom =
      case v of I(i) => I (2 * i)
              | _ => raise (Fail "type mismatch")
    fun conj (v1:Bool atom, v2:Bool atom):Bool atom =
      case (v1,v2) of (B(b1),B(b2)) => B (b1 andalso b2)
                    | _ => raise (Fail "type mismatch")

Bounded Polymorphism (I)
• Extends both parametric polymorphism and subtyping
[Two example type schemes; their symbols were lost in extraction]

Bounded Polymorphism (II)
• IntA → IntC
• Example: Nat ≤ Int
• double: ∀α≤Int. α → α
  • IntA → IntA
  • where α = IntA
• plus: ∀α≤Int.(α × α) → α
  • where α = IntA
• plus (mkI 1, natToInt (mkN 2))

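The idea of reusing the abstract encoding in the result position can be sketched for a two-element hierarchy Nat ≤ Int. Everything below is an assumption made for illustration: the encodings IntC = unit, NatC = unit z, IntA = 'a, NatA = 'a z, the names SAFE_NUM, SafeNum and toInt, and the choice that mkN raises Domain on negative input.

```sml
(* Marker type constructor; the constructor name Z is assumed. *)
datatype 'a z = Z

signature SAFE_NUM =
sig
  type 'a num
  val mkI : int -> unit num               (* result: IntC *)
  val mkN : int -> unit z num             (* result: NatC *)
  val natToInt : 'a z num -> unit num     (* NatA -> IntC *)
  (* Bounded polymorphism: the argument's encoding IntA = 'a is reused
     in the result, so a Nat argument yields a Nat result. *)
  val double : 'a num -> 'a num
  val plus : 'a num * 'a num -> 'a num
  val toInt : 'a num -> int
end

structure SafeNum :> SAFE_NUM =
struct
  type 'a num = int
  fun mkI i = i
  fun mkN n = if n < 0 then raise Domain else n
  fun natToInt n = n
  fun double n = 2 * n
  fun plus (a, b) = a + b
  fun toInt n = n
end

(* double keeps the Nat tag: n4 has type unit z SafeNum.num. *)
val n4 = SafeNum.double (SafeNum.mkN 2)

(* Both arguments of plus must carry the same tag, hence the coercion. *)
val three = SafeNum.plus (SafeNum.mkI 1, SafeNum.natToInt (SafeNum.mkN 2))

(* Rejected, since unit num and unit z num do not unify:
   val bad = SafeNum.plus (SafeNum.mkI 1, SafeNum.mkN 2) *)
```

With the plain recipe, double would return IntC and lose the fact that doubling a Nat gives a Nat; sharing the type variable between argument and result is what recovers it.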
Bounded Polymorphism (III)
• Limitations
  • Type variable bounds
    • ∀α≤σ. ∀β≤α. (…)
    • where α = ⟨σ⟩A and β = ⟨α⟩A
  • First-class polymorphism
  • Functional subtyping
    • ∀α≤(σ1 → σ2). α → σ2
    • α → ⟨σ2⟩C where α = ⟨σ1⟩C → ⟨σ2⟩A

Formalization (II)
    let f1 = Λα1≤σ1. λx:τ1. c1 x in
    …
    let fn = Λαn≤σn. λx:τn. cn x in
    []
• “safe” interface types