A type-checking algorithm
The task: (since we start with empty H, why is the goal not just E?)
The rule set (revisited next page) is algorithmic. The rules are syntax-directed:
• For each expression, there is a unique potentially applicable rule
• For goals in the body of a proper rule:
  • The environment is determined by the head goal's environment
  • The expression is a direct sub-expression of the head goal's expression
types
To derive an algorithm, we make all conditions in the rules explicit:
(The "exists" is not a problem here. Why?)
Why are the conditions placed in these positions?
The algorithm:
For an expression without let, the algorithm works in two stages:
• From the root of the expression to the leaves, determining the type environment for each sub-expression
• From the leaves to the root, determining the types of the sub-expressions
[Figure: diagram labelled with the environments H, H1, H2]
For a let expression (or sub-expression):
• Apply the algorithm to the defining expressions, to obtain their types
• Use these types in determining the type environment for the body
[Figure: diagram labelled H, let]
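The two stages and the let case can be sketched as one recursive function: the environment is passed down on the way to the leaves, and types are returned on the way back to the root. This is only a minimal sketch under stated assumptions, not the algorithm from the slides: the tuple encoding of expressions, the type names 'int' and 'bool', and the names `type_of` and `TypeCheckError` are all hypothetical.

```python
# Hypothetical encoding (not from the slides): expressions are tuples,
# types are 'int', 'bool', or ('fun', argument_type, result_type).

class TypeCheckError(Exception):
    pass

def type_of(env, expr):
    """Return the unique type of expr under env, or raise TypeCheckError."""
    tag = expr[0]
    if tag == 'num':                      # ('num', 3)
        return 'int'
    if tag == 'bool':                     # ('bool', True)
        return 'bool'
    if tag == 'var':                      # ('var', 'x')
        if expr[1] not in env:
            raise TypeCheckError('free variable: ' + expr[1])
        return env[expr[1]]
    if tag == 'if':                       # ('if', test, then_part, else_part)
        if type_of(env, expr[1]) != 'bool':
            raise TypeCheckError('test is not boolean')
        t_then = type_of(env, expr[2])
        t_else = type_of(env, expr[3])
        if t_then != t_else:
            raise TypeCheckError('branches have different types')
        return t_then
    if tag == 'lam':                      # ('lam', 'x', declared_type, body)
        _, name, arg_type, body = expr
        body_env = dict(env)
        body_env[name] = arg_type         # root to leaves: extend the environment
        return ('fun', arg_type, type_of(body_env, body))
    if tag == 'app':                      # ('app', function, argument)
        t_fun = type_of(env, expr[1])
        t_arg = type_of(env, expr[2])
        if not (isinstance(t_fun, tuple) and t_fun[0] == 'fun'):
            raise TypeCheckError('applying a non-function')
        if t_fun[1] != t_arg:             # type equality test
            raise TypeCheckError('argument type mismatch')
        return t_fun[2]
    if tag == 'let':                      # ('let', 'x', defining, body)
        _, name, defining, body = expr
        body_env = dict(env)
        body_env[name] = type_of(env, defining)  # defining expression first
        return type_of(body_env, body)
    raise TypeCheckError('unknown expression form: ' + str(tag))

identity = ('lam', 'x', 'int', ('var', 'x'))
program = ('let', 'f', identity, ('app', ('var', 'f'), ('num', 3)))
result = type_of({}, program)             # closed program: empty environment
```

Because every expression form selects exactly one branch, the function either returns a single type or raises, which mirrors the syntax-directed reading of the rules.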
Claim: [correctness of the algorithm (w.r.t. the rules)]
Proof: by induction on E.
Since a well-typed expression has at most one type under an environment, the algorithm either returns a unique type or fails.
For a closed program E, the initial call uses the empty environment H.
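Comment: The type checker is the component that discovers missing declarations (free-variable errors).
Example (assume this is entered as the first line of an OCaml session):
let f = fun n -> if n = 0 then 1 else n * f (n - 1);;
Here f occurs free in the body, because the definition is not marked rec, so the type checker rejects it.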
On implementation:
A compiler constructs a symbol table:
• Types for declared variables (from the declarations)
• Hierarchical organization, reflecting the region hierarchy
Type-checking of a program phrase is performed with respect to the right place in the table.
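A hierarchical symbol table can be sketched with Python's `collections.ChainMap`, where each nested region adds a new layer in front of the enclosing ones. This is an illustration only; the names `global_scope` and `enter_region` and the sample declarations are hypothetical and not taken from any particular compiler.

```python
from collections import ChainMap

# Outermost region: declared variables and their types (sample data).
global_scope = ChainMap({'n': 'int', 'f': ('fun', 'int', 'int')})

def enter_region(table, declarations):
    """Return the table for a nested region; outer declarations stay visible."""
    return table.new_child(dict(declarations))

inner = enter_region(global_scope, {'n': 'bool', 'x': 'int'})

inner_n = inner['n']          # nearest declaration of n wins
inner_f = inner['f']          # found in the enclosing region
outer_n = global_scope['n']   # the outer table is unchanged
```

Looking a name up in `inner` corresponds to type-checking a phrase with respect to its own place in the table: inner declarations shadow outer ones, and leaving the region simply means going back to the outer table.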
On the correctness of the rules
A well-typed program never goes wrong. Specifically:
• It never generates a run-time type error
• It halts only if it reaches an extended value (progress)
Together: type safety (some use "safety" only for the first).
We also prove:
• It never applies a function to an argument of the wrong type (according to the type declared for the function)
In the typed language, this is also a type error.
We prove using transition semantics.
A problem: we deal with two worlds (TFL, FL), each with only some of the concepts.
TFL:
• Typed expressions
• Typing relation, well-typed expressions
• No transition semantics, no run-time type errors
FL:
• Untyped expressions
• No typings
• Transition semantics
• Run-time type errors
[Figure: diagram labelled erase]
The solution: a transition relation for TFL, the same as for FL but on typed expressions, except
The diagram (almost) commutes: (except when?)
Progress holds for this language as well!
Now transitions and run-time errors apply to the upper level, TFL.
Observation: A run-time type error w.r.t. t is not well-typed:
• v1 v2, where v1 is neither an operation nor a function (hence, by canonical forms, its type is not a function type)
• if v e2 e3, where v is a non-boolean
This covers all run-time type errors that have a transition to ER.
Corollary: An expression that contains a run-time type error (even if not as its selected sub-expression) is not well-typed.
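The two shapes of run-time type error listed above can be recognized by a simple predicate. The following is a sketch under assumptions: the tuple encoding of untyped expressions and the names `is_value` and `is_runtime_type_error` are hypothetical, chosen only to make the two cases concrete.

```python
# Hypothetical encoding of values:
# ('num', n), ('bool', b), ('op', name), ('lam', parameter, body)
VALUE_TAGS = ('num', 'bool', 'op', 'lam')

def is_value(expr):
    return expr[0] in VALUE_TAGS

def is_runtime_type_error(expr):
    """True if expr is one of the two error shapes named in the observation."""
    tag = expr[0]
    if tag == 'app':                       # ('app', v1, v2)
        operator, operand = expr[1], expr[2]
        # v1 v2 where v1 is neither an operation nor a function
        return (is_value(operator) and is_value(operand)
                and operator[0] not in ('op', 'lam'))
    if tag == 'if':                        # ('if', v, e2, e3)
        test = expr[1]
        # the test is already a value, but not a boolean
        return is_value(test) and test[0] != 'bool'
    return False

bad_application = ('app', ('num', 3), ('num', 4))
bad_conditional = ('if', ('num', 0), ('num', 1), ('num', 2))
good_application = ('app', ('lam', 'x', ('var', 'x')), ('num', 4))
```

In both error shapes the offending value has a non-function or non-boolean type, so no typing rule for application or conditional can apply, which is the content of the observation.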
Assume WT(E), but E's execution generates a type error.
Q: How do we show this is impossible?
A: We prove a type preservation property.
This property is the key to type safety.
How is it related to the informal introduction to static type checking?
A comment: We prove type preservation for regular expressions only; ER has no type (alternatives?)
Theorem [Type preservation / Subject reduction]
(where E' is not ER)
Proof: by induction on the selection path of E
• This path never enters a lambda, hence H is fixed
We first show the induction step (easier), then the basis (redexes).
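For orientation, subject reduction is usually stated in the following form. This is the standard textbook formulation, written in assumed notation (H for the type environment, the turnstile for the typing relation, an arrow for a single transition step); it is not necessarily the exact wording of the theorem above.

```latex
\[
  H \vdash E : T
  \;\wedge\; E \rightarrow E'
  \;\wedge\; E' \neq \mathit{ER}
  \;\Longrightarrow\;
  H \vdash E' : T
\]
```

Read this way, a transition step may change the expression but not its type, so a well-typed expression can only step to well-typed expressions.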
Induction step: by cases on E
• E = E1 E2 → E'1 E2 (the case for E2 is symmetric)
• If: a step on the test, similar
• Tuple: a step on a component, similar
• Let: a step on a defining expression, similar
Q: Where did we use inversion above?
Basis: (redexes)
The “difficult” case: function application
Lemma: [type preservation under substitution]
Intuitively simple, formally by induction on E.
End of proof of type preservation.
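A substitution lemma of this kind is commonly stated as follows. The notation is an assumption: H[x : S] extends the environment H with x of type S, and E[V/x] substitutes the value V for x in E. This is the usual textbook shape, offered as a sketch rather than as the lemma's exact wording.

```latex
\[
  H[x : S] \vdash E : T
  \;\wedge\; H \vdash V : S
  \;\Longrightarrow\;
  H \vdash E[V/x] : T
\]
```

This is what the function-application redex needs: the body of the function is typed under the environment extended with the parameter, and the argument has the parameter's declared type, so the substituted body keeps the result type.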
Back to the original goal:
Corollary (of the preservation theorem): [correctness of the type rules]
Corollary:
Now, transfer correctness to the FL setting.
Corollary: If E (in TFL) is well-typed, then
• the FL execution of erase(E) will not generate a type error
The execution may generate other errors, or be infinite (one reason natural semantics has not been used for the correctness proof).
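The erase mapping from TFL to FL can be pictured as a function that drops the declared types and keeps everything else. The sketch below assumes a hypothetical tuple encoding in which a typed lambda is ('lam', parameter, declared_type, body) and its untyped counterpart is ('lam', parameter, body); the encoding is illustrative and not taken from the slides.

```python
def erase(expr):
    """Map a typed expression to its untyped counterpart (hypothetical encoding)."""
    tag = expr[0]
    if tag in ('num', 'bool', 'var'):
        return expr                        # leaves carry no type annotations
    if tag == 'lam':                       # ('lam', parameter, declared_type, body)
        _, name, _declared_type, body = expr
        return ('lam', name, erase(body))
    if tag == 'let':                       # ('let', name, defining, body)
        _, name, defining, body = expr
        return ('let', name, erase(defining), erase(body))
    # 'app', 'if', 'tuple': every component after the tag is a sub-expression
    return (tag,) + tuple(erase(sub) for sub in expr[1:])

typed = ('app', ('lam', 'x', 'int', ('var', 'x')), ('num', 3))
untyped = erase(typed)
```

Since only annotations are removed, each TFL expression and its erasure have the same shape, which is what lets results proved on typed transitions carry over to FL executions.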
Discussion of TFL's type system
The type system is monomorphic:
• Each literal (number, boolean, operation name) has a unique type (transferred to the type checker by axiom (const))
• Variables have unique types, by declarations (transferred to the type checker by axiom (var))
• Each well-typed expression has a unique type (by induction on expressions)
• Each value (including tuples and functions) has a unique type (if its expression is well-typed)
But are all our assumptions satisfied in real PLs?
• The operation name + may be associated with two operations, on int and on real (ad-hoc polymorphism)
• The operation = has a polymorphic type: (an eq-type)
• The constant nil (the empty list, not yet introduced) has a polymorphic type
Q: Is the type system still monomorphic? In what sense?
On type equality (equivalence)
The type checker uses type equality tests (where?)
How is type equality defined?
• By structure of the types: structural equivalence
  • Types are equivalent if they have the same structure
• By name: name equivalence
  • Type names are associated (once) with some structure
  • Types are equivalent only if they have the same name
Type systems that use "by name" are called nominal. They include Pascal and Java.
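The difference between the two definitions shows up as soon as two distinct names are bound to the same structure. The sketch below uses a hypothetical `NamedType` class and nested tuples for structures; it illustrates the two tests and does not describe how Pascal or Java implement them.

```python
class NamedType:
    """A type name bound once to a structure (hypothetical representation)."""
    def __init__(self, name, structure):
        self.name = name
        self.structure = structure

def structurally_equivalent(t1, t2):
    # Compare the underlying structures; tuple equality recurses into components.
    return t1.structure == t2.structure

def name_equivalent(t1, t2):
    # Only the declared names matter.
    return t1.name == t2.name

point = NamedType('point', ('tuple', 'int', 'int'))
pair = NamedType('pair', ('tuple', 'int', 'int'))

same_structure = structurally_equivalent(point, pair)
same_name = name_equivalent(point, pair)
```

Under structural equivalence the two types are interchangeable, while a nominal system keeps them apart even though their structures coincide.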
Comments and Discussion
Dynamic typing:
• Only values (operations and functions included) are typed
• Type compatibility is determined at run time ("last minute")
• Types are general: (n-ary) function, list
• Functions can be applied to all arguments of the right arity
  (define id (lambda (x) x)) (id 3) (id id)
• Collections may be heterogeneous
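The same points can be observed in any dynamically typed language. The following sketch uses Python rather than the Scheme of the example above; the variable names are illustrative only.

```python
def identity(x):
    # No declared parameter type: any single argument is accepted.
    return x

three = identity(3)
same_function = identity(identity)

# A heterogeneous collection: each element carries its own type.
mixed = [1, True, 'text', identity]
kinds = [type(value).__name__ for value in mixed]

# Type compatibility is checked only when the operation is executed.
try:
    1 + 'text'
    late_error = None
except TypeError as error:
    late_error = type(error).__name__
```

The ill-typed addition is accepted when the program is loaded and is reported only when that line runs, which is the "last minute" check described above.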
Static, monomorphic:
• Values are typed (most, especially functions, are uni-typed)
• Cells are (uni-)typed (not seen in TFL)
  • Cells can be assigned only values of their type
• Variables and expressions are statically typed (most expressions are uni-typed)
• Types must be detailed
  • Operations, functions: both in and out types; functions can be applied only to their in type
  • Collections: include the element type, hence must be homogeneous (and uni-typed) (exception: records)
• Conditional expressions are conservatively typed
Pros and Cons
Dynamic:
For: flexibility
• Non-restrictive types
• No fixed types for cells, expressions, …
Against:
• Increased overhead
  • Extra storage
  • Extra run time
• Reduced safety (late discovery of many errors)
Static (mono):
For:
• Increased performance
  • Reduced storage (no tags) (but modern PLs may include tags for other reasons)
  • Fewer run-time checks
  • Data structure storage and access optimized by type
• Increased safety
  • Improved documentation
  • Early discovery of many errors
Against:
• Conservative checking, more type errors: some OK programs are rejected
• Monomorphism, restrictive types (uni-typed functions, homogeneous collections): reduced flexibility, non-generic programs, lack of reusability
Examples: one needs to write
• Uni-typed append functions, one for each type
• Uni-typed search tree procedures for each type
Since parameters must be declared, this cannot be avoided.
Many users do not accept these restrictions, and prefer dynamically typed languages (e.g. scripting PLs).
What are the possible solutions?
The PL research/development community offers: polymorphic type systems
Kinds of polymorphic type systems:
• Parametric polymorphism (à la ML)
  • Values, in particular functions, have many types and are reusable
• Sub-type polymorphism (à la OO)
  • Values, in particular functions, have many types and are reusable
  • Collections can have elements of many types
In the last two decades, approaches to merging the two have been developed.
The war between the dynamic and static schools is still active.
What next?
• We extend FL and TFL with various constructs: recursion, cells, … (depends on available time). For each, we examine semantics and typing
• We proceed to the environment model
In the future, we discuss ML-style parametric polymorphism, and hopefully also sub-type polymorphism.