Type Checking and Type Inference

Type Checking and Type Inference L2-TypeInference

Motivation
• Application Programmers
  • Reliability: logical and typographical errors manifest themselves as type errors that can be caught mechanically, thereby increasing our confidence in the code execution.
• Language Implementers
  • Storage allocation (temporaries)
  • Generating coercion code
  • Optimizations

Evolution of Type System
• Typeless (assembly language): any instruction can be run on any data “bit pattern”.
• Implicit typing and coercion (FORTRAN).
• Explicit type declarations (Pascal): type equivalence.
• Weak typing (C): arrays (bounds not checked), union type, actuals not checked against formals.
• Data abstraction (Ada, CLU): type is independent of representation details.
• Generic types (Ada, Java/C++/C#): compile-time facility for “container” classes; reduces source code duplication.

Languages
• Strongly typed (“Type errors always caught.”)
  • Statically typed (e.g., Ada, COOL and Eiffel)
    • Compile-time type checking: efficient.
  • Dynamically typed (e.g., Scheme and Smalltalk)
    • Run-time type checking: flexible.
• Weakly typed (e.g., C)
  • Unreliable casts (int to/from pointer).
• Object-oriented languages such as C# and Java impose restrictions that guarantee type safety and efficiency, but bind the code to function names at run-time.

Type inference is abstract interpretation.
Evaluation:
    ( 1 + 4 ) / 2.5
    5 / 2.5            (+ : int * int -> int)
    2.0                (/ : real * real -> real; needs coercion of 5, an error in ML)
Type inference:
    ( int + int ) / real
    int / real
    real

Expression Grammar: Type Inference Example
    E -> E + E | E * E | x | y | i | j
• Arithmetic evaluation
    x, y in {…, -1.1, …, 2.3, …}
    i, j in {…, -1, 0, 1, …}
    +, * : “infinite table”
• Type inference
    x, y : real
    i, j : int

Values can be abstracted as type names and arithmetic operations can be abstracted as operations on these type names.
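The abstraction described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tuple encoding of expressions, the names `TYPE_ENV`, `OP_TABLE` and `infer`, and the decision to reject mixed int/real operands (as ML does) are all choices made for this sketch, not part of the slides.

```python
# Abstract interpretation of  E -> E + E | E * E | x | y | i | j.
# Concrete evaluation would map variables to numbers; here they are mapped
# to type names, and + and * become lookups in a small finite table.

# Assumed environment, taken from the slide: x, y are reals; i, j are ints.
TYPE_ENV = {"x": "real", "y": "real", "i": "int", "j": "int"}

# Finite operation table on type names (replaces the "infinite table").
# Mixed operands are absent on purpose: no implicit coercion.
OP_TABLE = {
    ("int", "int"): "int",
    ("real", "real"): "real",
}


def infer(expr):
    """Return the type name of expr, or raise TypeError.

    expr is a variable name (str) or a tuple (op, left, right)
    with op in {"+", "*"}.
    """
    if isinstance(expr, str):
        return TYPE_ENV[expr]
    op, left, right = expr
    operand_types = (infer(left), infer(right))
    if operand_types not in OP_TABLE:
        raise TypeError(f"{op} cannot be applied to {operand_types}")
    return OP_TABLE[operand_types]
```

For instance, `infer(("+", "i", ("*", "j", "i")))` yields `"int"`, while `infer(("+", "x", "i"))` is rejected because no table entry exists for a real and an int.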

Type correctness is neither necessary nor sufficient for programs to run successfully.
    if true then "5" else 0.5
• Not type correct, but runs fine.
    if true then 1.0/0.0 else 3.5
• Type correct, but causes run-time error.
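A small Python sketch can make the two cases concrete. It assumes a hypothetical tuple encoding of expressions and two helper names, `check` and `evaluate`, invented here; it also relies on the fact that Python raises `ZeroDivisionError` for float division by zero, which stands in for the run-time error on the slide.

```python
def check(expr):
    """Static check: return the type name of expr or raise TypeError."""
    if isinstance(expr, bool):
        return "bool"
    if isinstance(expr, float):
        return "real"
    if isinstance(expr, str):
        return "string"
    tag = expr[0]
    if tag == "/":
        if check(expr[1]) == check(expr[2]) == "real":
            return "real"
        raise TypeError("/ expects two reals")
    if tag == "if":
        _, cond, then_branch, else_branch = expr
        if check(cond) != "bool":
            raise TypeError("condition must be bool")
        then_type = check(then_branch)
        # Both branches are checked, even the one that will never run.
        if then_type != check(else_branch):
            raise TypeError("branches have different types")
        return then_type
    raise ValueError(f"unknown expression {expr!r}")


def evaluate(expr):
    """Dynamic evaluation: only the selected branch is ever touched."""
    if not isinstance(expr, tuple):
        return expr
    if expr[0] == "/":
        return evaluate(expr[1]) / evaluate(expr[2])
    _, cond, then_branch, else_branch = expr
    return evaluate(then_branch) if evaluate(cond) else evaluate(else_branch)


ill_typed = ("if", True, "5", 0.5)                # rejected by check, runs fine
well_typed = ("if", True, ("/", 1.0, 0.0), 3.5)   # accepted by check, fails at run time
```

The checker looks at both branches without knowing which one is taken, which is why it is more conservative than evaluation in the first case and blind to the division by zero in the second.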

Assigning types to expressions
• Uniquely determined
    fn s => s ^ ".\n";
    val it = fn : string -> string
• Over-constrained (type error without coercion)
    (2.5 + 2)
• Under-constrained
  • Overloading
      fn x => fn y => x + y;        (* resolvable *)
      fn record => #name(record);   (* error *)
  • Polymorphism
      fn x => 1;
      val it = fn : 'a -> int

Type Signatures
• fun rdivc x y = x / y
    rdivc : real -> real -> real
• fun rdivu (x,y) = x / y
    rdivu : real * real -> real
• fun plusi x y = x + y
    plusi : int -> int -> int
• fun plusr (x:real,y) = x + y
    plusr : real * real -> real

Polymorphic Types
• Semantics of operations on data structures such as stacks, queues, lists, and tables are independent of the component type.
• A polymorphic type system provides a natural representation of generic data structures without sacrificing type safety.
• Polymorphism
    fun I x = x;
    I 5;
    I "x";
• For all types a:
    I : a -> a

Composition
    val h = f o g;
    fun comp f g = let fun h x = f (g x) in h end;
“Scope vs Lifetime: Closure”: the returned h still refers to f and g after comp has returned.
    comp : (a -> b) -> (l -> a) -> (l -> b)
• Generality
• Equality constraints
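The same definition can be sketched in Python with type variables, as a rough analogue of the ML signature. The names `A`, `B`, `L` mirror a, b, l on the slide; the annotations are hints for an external checker and are not enforced when the code runs.

```python
from typing import Callable, TypeVar

A = TypeVar("A")
B = TypeVar("B")
L = TypeVar("L")


def comp(f: Callable[[A], B], g: Callable[[L], A]) -> Callable[[L], B]:
    """Return the composition of f and g.

    The result type of g and the argument type of f share the variable A:
    this is the equality constraint. A, B and L are otherwise unconstrained:
    this is the generality.
    """
    def h(x: L) -> B:
        # h is a closure: f and g stay alive after comp has returned.
        return f(g(x))
    return h


digits = comp(len, str)   # int -> str -> int, so digits : int -> int
```

Here `digits(12345)` evaluates `len(str(12345))`, long after the call to `comp` has finished.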

map-function
    fun map f [] = []
      | map f (x::xs) = f x :: map f xs
    map : (a -> b) -> (a list) -> (b list)
    map (fn x => 0) [1, 2, 3]
    map (fn x => x::[]) ["a", "b"]
• list patterns; term matching.
• definition by cases; ordering of rules
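For readers more used to an imperative notation, the two rules can be mirrored in Python. This is a sketch only: `map_list` is a hypothetical name, and Python lists are arrays rather than cons cells, so the unpacking below merely imitates the `x::xs` pattern.

```python
def map_list(f, xs):
    """Apply f to every element of xs, defined by the same two cases as the ML version."""
    if not xs:               # rule 1: map f [] = []
        return []
    x, *rest = xs            # rule 2: map f (x::xs) = f x :: map f xs
    return [f(x)] + map_list(f, rest)
```

The two uses from the slide become `map_list(lambda x: 0, [1, 2, 3])` and `map_list(lambda x: [x], ["a", "b"])`; in the first, b is int, and in the second, b is a list of strings.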

Conventions
• Function application is left-associative.
    f g h = ( ( f g ) h )
• -> is right-associative.
    int->real->bool = int->(real->bool)
• :: is right-associative.
    a::b::c::[] = a::(b::(c::[]))
• Function application binds stronger than ::.
    f x :: xs = ( f x ) :: xs
• Parentheses around -> matter.
    (a -> b) -> (b -> c) =/= (a -> b -> b -> c)

Polymorphic Type System

Goals
• Allow expression of “for all types T”
    fun I x = x
    I : 'a -> 'a
• Allow expression of type-equality constraints
    fun fst (x,y) = x
    fst : 'a * 'b -> 'a
• Support notion of instance of a type
    I 5
    I : int -> int

Polymorphic Type System
• Type constants: int, bool, …
• Type variables: 'a, …, ''a, …
• Type constructors: ->, *, …
• Principal type of an expression is the most general type, which the ML system infers.
• Type Checking Rule: an expression of a type can legally appear in all contexts where an instance of that type can appear.
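The notion of "instance" used in the rule above amounts to one-way matching: a type is an instance of a more general one if some substitution for the type variables turns the general type into the specific one. The following Python sketch assumes a hypothetical encoding (strings starting with `'` are type variables, other strings are type constants, tuples such as `("->", t1, t2)` are constructor applications); `match`, `ID` and `FST` are names chosen for this illustration.

```python
def match(general, specific, binding=None):
    """Return a substitution that turns general into specific, or None.

    Only variables of `general` are bound; `specific` is treated as fixed.
    """
    binding = {} if binding is None else binding
    if isinstance(general, str) and general.startswith("'"):
        if general in binding:
            # A variable seen before must stand for the same type again.
            return binding if binding[general] == specific else None
        binding[general] = specific
        return binding
    if isinstance(general, str) or isinstance(specific, str):
        # Type constants (or a constant against a constructor) must be identical.
        return binding if general == specific else None
    if general[0] != specific[0] or len(general) != len(specific):
        return None
    for general_arg, specific_arg in zip(general[1:], specific[1:]):
        if match(general_arg, specific_arg, binding) is None:
            return None
    return binding


ID = ("->", "'a", "'a")                    # I   : 'a -> 'a
FST = ("->", ("*", "'a", "'b"), "'a")      # fst : 'a * 'b -> 'a
```

Under this encoding, `match(ID, ("->", "int", "int"))` succeeds with `'a` bound to `int`, whereas `match(ID, ("->", "int", "string"))` fails because `'a` cannot stand for two different types at once.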

Signature of Equality
    = : 'a * 'a -> bool ?
• Equality (“=”) is not computable for all types, e.g., function values.
• So types ('a, 'b, …) are partitioned into
  • types that support equality (''a, ''b, …)
  • types that do not support equality.
    = : ''a * ''a -> bool
• int, string, etc. are equality types. They are closed under cartesian product, but not under function space constructor.
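The closure properties stated above can be phrased as a recursive predicate. The sketch below is an assumption-laden illustration: it reuses a tuple encoding of types invented for this purpose, and the set `EQUALITY_CONSTANTS` lists only a few base types for the sake of the example.

```python
# Assumed base types that admit equality (illustrative, not exhaustive).
EQUALITY_CONSTANTS = {"int", "string", "bool"}


def admits_equality(ty):
    """Decide whether "=" may be used at type ty.

    Encoding (hypothetical): "''a" is an equality type variable, "'a" an
    ordinary type variable, other strings are type constants, and tuples
    are ("*", t1, t2) or ("->", t1, t2).
    """
    if isinstance(ty, str):
        if ty.startswith("''"):
            return True                      # equality type variable
        if ty.startswith("'"):
            return False                     # unrestricted type variable
        return ty in EQUALITY_CONSTANTS
    constructor, *args = ty
    if constructor == "->":
        return False                         # function values cannot be compared
    if constructor == "*":
        return all(admits_equality(arg) for arg in args)
    raise ValueError(f"unknown type constructor {constructor!r}")
```

A product such as `("*", "int", "string")` is accepted, but as soon as a function type appears anywhere inside the product the whole type is rejected.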

Type Inference
    I (3,5) = (3,5)
      I     : (int * int) -> (int * int)
      (3,5) : (int * int)
• Principal type of I is the generalization of all possible types of its uses.
    Curry2   : ('a * 'b -> 'c) -> ('a -> 'b -> 'c)
    Uncurry2 : ('a -> 'b -> 'c) -> ('a * 'b -> 'c)
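The two signatures above fully determine what Curry2 and Uncurry2 can do. A possible rendering in Python follows; the lower-case names `curry2` and `uncurry2`, and the helper `rdivu` (modelled on the tupled division shown earlier in the deck), are choices made for this sketch.

```python
from typing import Callable, Tuple, TypeVar

A = TypeVar("A")
B = TypeVar("B")
C = TypeVar("C")


def curry2(f: Callable[[Tuple[A, B]], C]) -> Callable[[A], Callable[[B], C]]:
    """('a * 'b -> 'c) -> ('a -> 'b -> 'c): take the pair's parts one at a time."""
    return lambda a: lambda b: f((a, b))


def uncurry2(g: Callable[[A], Callable[[B], C]]) -> Callable[[Tuple[A, B]], C]:
    """('a -> 'b -> 'c) -> ('a * 'b -> 'c): take both arguments as one pair."""
    return lambda pair: g(pair[0])(pair[1])


def rdivu(pair):
    """Tupled real division, rdivu : real * real -> real."""
    x, y = pair
    return x / y


rdivc = curry2(rdivu)     # rdivc : real -> real -> real
```

Applying `uncurry2` to `rdivc` gives back a function that behaves like `rdivu`, which is the sense in which the two conversions are inverses.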

Subtle Points
• The type of a use of a function must be an instance of the principal type inferred in the definition.
• The type of a value is fixed. So, multiple occurrences of a symbol denoting the same value (such as several uses of a formal parameter in a function body) must have identical type.

Systematic Type Derivation
    fun c f g x = f(g(x))
• Step 1: Assign most general types to left-hand-side arguments and the (rhs) result.
    f : t1    g : t2    x : t3    f(g(x)) : t4
  Thus, type of c is:
    c : t1 -> t2 -> t3 -> t4

Step 2: Analyze and propagate type constraints
• Application Rule (AR): If f x : t then x : t1 and f : t1 -> t, for some new t1.
• Equality Rule (ER): If both x : t and x : t1 can be deduced for the value of a variable x, then t = t1.
• Function Rule: (t -> u) = (t1 -> u1) iff (t = t1) /\ (u = u1)

    f (g x) : t4
    AR:  (g x) : t5    f : t5 -> t4
    AR:  x : t6        g : t6 -> t5
    ER:  t1 = t5 -> t4
         t2 = t6 -> t5
         t3 = t6
Step 3: Overall deduced type
    c : (t5 -> t4) -> (t6 -> t5) -> (t6 -> t4)
    (unary function composition)
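The derivation of c can be replayed mechanically. The Python sketch below is one possible way to do so, not the algorithm of any particular ML implementation: every string is treated as a type variable, `("->", a, b)` encodes a function type, and `resolve`, `occurs`, `unify` and `show` are hypothetical helper names. The Equality Rule and the Function Rule together correspond to `unify`; the two uses of the Application Rule appear as the fresh variables t5 and t6.

```python
def resolve(ty, subst):
    """Replace every bound variable in ty by its current value."""
    if isinstance(ty, str):
        return resolve(subst[ty], subst) if ty in subst else ty
    _, arg, res = ty
    return ("->", resolve(arg, subst), resolve(res, subst))


def occurs(var, ty):
    """True if variable var appears anywhere inside ty."""
    if isinstance(ty, str):
        return var == ty
    return occurs(var, ty[1]) or occurs(var, ty[2])


def unify(a, b, subst):
    """Make a and b equal by extending subst, or raise TypeError."""
    a, b = resolve(a, subst), resolve(b, subst)
    if a == b:
        return
    if isinstance(a, str):
        if occurs(a, b):
            raise TypeError(f"circular constraint on {a}")
        subst[a] = b                     # Equality Rule: a and b denote one type
    elif isinstance(b, str):
        unify(b, a, subst)
    else:
        unify(a[1], b[1], subst)         # Function Rule: argument types agree
        unify(a[2], b[2], subst)         # ... and result types agree


def show(ty):
    """Print a type using the convention that -> is right-associative."""
    if isinstance(ty, str):
        return ty
    _, arg, res = ty
    left = show(arg)
    if not isinstance(arg, str):
        left = f"({left})"
    return f"{left} -> {show(res)}"


subst = {}
# Step 1: f : t1, g : t2, x : t3, f (g x) : t4.
# Step 2, AR on f (g x) : t4 gives (g x) : t5 and f : t5 -> t4.
unify("t1", ("->", "t5", "t4"), subst)
# Step 2, AR on g x : t5 gives x : t6 and g : t6 -> t5.
unify("t2", ("->", "t6", "t5"), subst)
unify("t3", "t6", subst)
# Step 3: substitute into c : t1 -> t2 -> t3 -> t4.
c_type = resolve(("->", "t1", ("->", "t2", ("->", "t3", "t4"))), subst)
```

`show(c_type)` prints the result without the redundant final parentheses, as `(t5 -> t4) -> (t6 -> t5) -> t6 -> t4`, which is the same type as the one written above because -> is right-associative.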

Example
    fun f x y = fst x + fst y
given
    op + : int -> int -> int
    fst : 'a * 'b -> 'a
Step 1:
    x : t1    y : t2    (fst_1 x + fst_2 y) : t3
The two instantiations of fst need not have the same type.
    fst_1 : u1 * u2 -> u1
    fst_2 : v1 * v2 -> v1
    f : t1 -> t2 -> t3

Step 2: Applying rule for +
    (fst_1 x) : int    (fst_2 y) : int    t3 = int
Step 3:
    fst_1 : int * u2 -> int
    fst_2 : int * v2 -> int
    t1 = int * u2
    t2 = int * v2
    f : int * 'a -> int * 'b -> int
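The key move in this example is that each use of fst gets its own copy of the polymorphic type, with fresh variables. A minimal Python sketch of that step is given below. The encoding and the names `instantiate`, `substitute` and `FST` are assumptions of this sketch, and the fresh variables come out as u1, u2, u3, u4 rather than the slide's u1, u2, v1, v2.

```python
import itertools


def instantiate(ty, fresh_names):
    """Copy a polymorphic type, renaming each quoted variable consistently to a fresh one."""
    mapping = {}

    def rename(t):
        if isinstance(t, str):
            if t.startswith("'"):
                if t not in mapping:
                    mapping[t] = next(fresh_names)
                return mapping[t]
            return t                      # type constant such as int
        constructor, *args = t
        return (constructor, *[rename(arg) for arg in args])

    return rename(ty)


def substitute(ty, binding):
    """Apply a solved constraint set (variable -> type) to ty."""
    if isinstance(ty, str):
        return binding.get(ty, ty)
    constructor, *args = ty
    return (constructor, *[substitute(arg, binding) for arg in args])


FST = ("->", ("*", "'a", "'b"), "'a")           # fst : 'a * 'b -> 'a
names = (f"u{n}" for n in itertools.count(1))   # supply of fresh variable names

fst_1 = instantiate(FST, names)   # u1 * u2 -> u1
fst_2 = instantiate(FST, names)   # u3 * u4 -> u3

# The rule for + forces both results to be int.
solved = {"u1": "int", "u3": "int"}
fst_1_final = substitute(fst_1, solved)   # int * u2 -> int
fst_2_final = substitute(fst_2, solved)   # int * u4 -> int
```

Because the two copies share no variables, the second components (u2 and u4 here) remain independent, which is why the final type of f has two distinct type variables.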

Example (fixed point)
    fun fix f = f (fix f)
1. Assume f : 'a -> 'b
2. From (fix f = f (…)) infer fix : ('a -> 'b) -> 'b
3. From (… = f (fix f)) infer fix : ('a -> 'b) -> 'a
Both types describe the same fix, so 'a = 'b and
    fix : ('a -> 'a) -> 'a

Recursive Definition (curried function)
    fun f x y = f (f x) 0;
    f : (int -> 'a) -> int -> 'a
    fun f x y = f (f x) (f x y);
    f : ('a -> 'a) -> 'a -> 'a
    fun f f = f;        (* identity function *)
    (* the formal parameter f shadows the function name f in the body *)
    fun f x y = f x;    (* illegal: circular type *)
    fun f g = g f;      (* illegal: circular type *)

Example (ill-typed definition)
    fun selfApply f = f f
1. f_1 : t1    (f_2 f_3) : t2    selfApply : t1 -> t2
2. f_3 : t3    f_2 : t3 -> t2
3. t1 = t3 = t3 -> t2 (unsatisfiable)
(cf. val selfApply = I I)
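The constraint t3 = t3 -> t2 is unsatisfiable because t3 would have to contain itself. Inference procedures typically detect this with an "occurs check" before binding a variable. The Python sketch below shows only that check; the tuple encoding and the names `occurs` and `bind` are hypothetical.

```python
def occurs(var, ty):
    """True if the type variable var appears anywhere inside ty."""
    if isinstance(ty, str):
        return var == ty
    return any(occurs(var, part) for part in ty[1:])


def bind(var, ty):
    """Solve the equation var = ty, rejecting circular solutions."""
    if ty != var and occurs(var, ty):
        raise TypeError(f"unsatisfiable: {var} occurs in its own definition")
    return {var: ty}
```

`bind("t1", "t3")` succeeds, while `bind("t3", ("->", "t3", "t2"))` raises `TypeError`, which corresponds to step 3 of the derivation for selfApply.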

Problematic Case
    fn x => (1, "a");
    val it = fn : 'a -> int * string
    fn f => (f 1, f "a");
    Type error
“Least upper bound” of int and string does not exist, and the expression cannot be typed as (('a -> 'b) -> 'b * 'b) because the following expression is not type correct.
    ( (fn f => (f 1, f "a")) (Real.Math.sqrt) );

(cont’d)
    fn (x,y) => (size x, length y);
    val it = fn : string * 'a list -> int * int
    fn z => (size z, length z);
    Type error
string and 'a list cannot be unified, so z cannot be given a type 'a that would yield 'a -> (int * int). Accepting it would conflict with the type instantiation rule, compromising type safety.

Equivalence?
    (let val V = E in F end) =/= ((fn V => F) E)
Even though both these expressions have the same behavior, the lambda abstraction is more restricted because it is type checked independently of the context of its use. In particular, the polymorphic type variable introduced in an abstraction must be treated differently from that introduced in a local definition.
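One common way to realise this difference is a generalization step applied to let-bound names only: type variables that are not mentioned in the surrounding environment are quantified and may be instantiated afresh at each use, while a lambda-bound name keeps a single, fixed type variable. The Python sketch below illustrates just that step; the encoding and the names `free_type_vars` and `generalize` are assumptions, and refinements such as ML's value restriction are ignored.

```python
def free_type_vars(ty):
    """Set of type variables (strings starting with ') occurring in ty."""
    if isinstance(ty, str):
        return {ty} if ty.startswith("'") else set()
    result = set()
    for arg in ty[1:]:
        result |= free_type_vars(arg)
    return result


def generalize(ty, env):
    """Variables of ty that may be quantified: those not free in env."""
    env_vars = set()
    for bound_type in env.values():
        env_vars |= free_type_vars(bound_type)
    return free_type_vars(ty) - env_vars


ID = ("->", "'a", "'a")     # type of E when E is fn x => x

# let val V = E in F end: V's type is generalized before F is checked.
let_quantified = generalize(ID, {})

# (fn V => F) E: while F is checked, V is a formal with an unknown type 'v
# that is itself part of the environment, so nothing can be quantified.
lambda_quantified = generalize("'v", {"V": "'v"})
```

With `let_quantified` equal to `{"'a"}`, F may use V at both int and string. With `lambda_quantified` empty, every use of V inside F must agree on one type, which is the situation in `fn f => (f 1, f "a")`.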
