PROPERTIES OF A TYPE ABSTRACT INTERPRETER
getting a better understanding of the relation between the type systems and abstract interpretation approaches to property inference • a well understood case: type inference in functional programming à la ML MOTIVATION OF THE EXPERIMENT
type systems • Hindley’s monomorphic types • let-polymorphic types à la ML, with monomorphic recursion • Damas-Milner’s algorithm (DM) for type inference • polymorphic recursion • abstract interpretation • Monsuez’s reconstruction [FSTTCS 92, SAS 93] as computation of an abstract semantics • Cousot’s systematic derivation by abstract interpretation [POPL 97] of a hierarchy of type systems and type inference algorithms THE STARTING POINT
most of the technical results are taken from Cousot 97 • the concrete semantics is the collecting version of the denotational semantics of eager untyped λ-calculus, without let and with (mutual) recursion • the abstract domain is the one defined for monomorphic types • terms (monotypes with variables) × idempotent substitutions • the abstract partial order is based on the instantiation relation on terms • t1 ≤ t2, if t2 is an instance of t1 • the abstract domain is non-Noetherian THE FOUNDATIONS OF OUR EXPERIMENT 1
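The instantiation relation behind the abstract order can be sketched in OCaml as one-way matching. This is only an illustration: the `ty` encoding (numbered variables) and the names `matches` and `is_instance` are assumptions of the sketch, not taken from the implementation discussed in the slides.

```ocaml
(* Monotypes with variables; Var carries a variable number (assumed encoding). *)
type ty = Int | Var of int | Arrow of ty * ty

(* [matches s general instance] extends the substitution [s] (an association
   list) so that applying it to [general] yields [instance], if possible. *)
let rec matches s general instance =
  match general, instance with
  | Int, Int -> Some s
  | Var v, _ -> (
      match List.assoc_opt v s with
      | None -> Some ((v, instance) :: s)
      | Some bound -> if bound = instance then Some s else None)
  | Arrow (a1, b1), Arrow (a2, b2) -> (
      match matches s a1 a2 with
      | Some s' -> matches s' b1 b2
      | None -> None)
  | _ -> None

(* t1 precedes t2 in the order of the slide when t2 is an instance of t1. *)
let is_instance ~general ~instance = matches [] general instance <> None

let id_poly = Arrow (Var 0, Var 0) (* 'a -> 'a *)
let id_int = Arrow (Int, Int)      (* int -> int *)
```

Here `is_instance ~general:id_poly ~instance:id_int` holds, while the converse does not, since a ground type has no proper instances.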
the abstract semantics corresponding to the Damas-Milner type inference algorithm • which is the same as Hindley’s monomorphic types in a language with no “let” construct • is obtained in Cousot 97 by abstracting a polymorphic recursion semantics (à la Mycroft) • the last abstraction removes the fixpoint computation (which would be needed in the case of recursive functions) since there is no fixpoint computation in the DM algorithm • why? • the fixpoint computation might diverge since the abstract domain is non-Noetherian • the DM algorithm can be understood (Monsuez 1992) as the application of a widening operator (based on unification) after the first iteration THE FOUNDATIONS OF OUR EXPERIMENT 2
the Hindley and ML inference algorithms handle recursive definitions by means of a mechanism which can be explained in terms of a widening operator consisting of • 1 abstract fixpoint computation step • a unification • our algorithm handles recursion by performing • k abstract fixpoint computation steps • the unification, only if we have not reached the fixpoint • as done by Monsuez in the case of polymorphic types OUR CONTRIBUTION: THE ALGORITHM
if we perform just one fixpoint computation step (k = 1), we get exactly the ML result • for every k > 1, either we reach the least fixpoint • which is in general more precise than the ML result • or we get exactly the same result as ML • we get an improvement in precision only if we reach the least fixpoint in k steps • contrary to what was claimed by Monsuez OUR CONTRIBUTION: THE RESULTS 1
the resulting abstract semantics lies between the Damas-Milner semantics and the Mycroft semantics in Cousot’s hierarchy • the corresponding type system turns out to lie between monomorphic types and polymorphic recursion • the algorithm is much simpler than the one for polymorphic recursion • no need for quantification in type terms • no type generalization in the fixpoint computation, hence no polymorphic recursion • it succeeds in typing all the programs used to show the power of polymorphic recursion OUR CONTRIBUTION: THE RESULTS 2
the type abstract interpreter • the abstract semantics described by rules • the widenings • discussion and examples • which type system? • between monomorphic and polymorphic recursion • looking at the interpreter • a new simple and powerful type system? • conclusion • the type system approach to static analysis can profitably import techniques from abstract interpretation PLAN OF THE LECTURE
untyped eager λ-calculus • x, f, … ∈ X: program variables • e, e1, … ∈ E: program expressions • e ::= id x | λx.e | e1 + e2 | if e1 then e2 else e3 | int n | e1(e2) | μf.λx.e • in the Church/Curry monotype semantics the type of an expression e is a set of typings <H,m> stating that the standard evaluation of e in an environment • where the global variables x have type H(x) given by the type environment H • returns a value of type m (monotype) LANGUAGE AND MONOTYPES
Hindley’s algorithm finds the principal typing for the Church/Curry monotype system • an exact representation for all possible typings • Hindley’s algorithm uses the domain of Herbrand terms • monotypes with variables • reconstructed as an abstract interpreter by Cousot [POPL 97] • is the same as the ML and Damas-Milner algorithms on the fragment without let polymorphism MONOTYPES WITH VARIABLES
our implementation (in OCaml) of the abstract semantics, • http://www.di.unipi.it/~levi/typesav/pagina2.html • apart from the case of recursion, is essentially the DM algorithm implementation in Cousineau-Mauny 98 • mutual recursion in the examples only • ML’s notation for concrete syntax and types THE TYPE INFERENCE ABSTRACT INTERPRETER 1
the abstract semantics of an expression e in a given type environment H produces the type t of the expression e together with some constraint g on the type variables in H • the intuition is that the constraint g is how H should be instantiated in order for e to have the type t • the rules of the abstract type interpreter use • apply(g, t) = g(t) • unify(a=b, g1, g2) computes the solved form of the set of equations { a=b, x1i = t1i, x2i = t2i } • with x1i / t1i ∈ g1 and x2i / t2i ∈ g2 THE TYPE INFERENCE ABSTRACT INTERPRETER 2
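A minimal OCaml sketch of the two operations, assuming substitutions are kept as association lists in solved form. The type names, the `Unify_error` exception and the helper `solve` are hypothetical choices for this sketch; the actual implementation may be organised differently.

```ocaml
(* Monotypes with variables; variables are numbered (an assumed encoding). *)
type ty = Int | Var of int | Arrow of ty * ty

(* A substitution in solved form: bound variables never occur on the right. *)
type subst = (int * ty) list

exception Unify_error

(* apply(g, t) = g(t) *)
let rec apply (g : subst) (t : ty) : ty =
  match t with
  | Int -> Int
  | Var v -> (match List.assoc_opt v g with Some u -> u | None -> t)
  | Arrow (a, b) -> Arrow (apply g a, apply g b)

let rec occurs v = function
  | Int -> false
  | Var w -> v = w
  | Arrow (a, b) -> occurs v a || occurs v b

(* Solve the equations one at a time, keeping [g] idempotent. *)
let rec solve (g : subst) (eqs : (ty * ty) list) : subst =
  match eqs with
  | [] -> g
  | (a, b) :: rest -> (
      match apply g a, apply g b with
      | Int, Int -> solve g rest
      | Var v, Var w when v = w -> solve g rest
      | Var v, t | t, Var v ->
          if occurs v t then raise Unify_error
          else
            let g' = (v, t) :: List.map (fun (x, u) -> (x, apply [ (v, t) ] u)) g in
            solve g' rest
      | Arrow (a1, b1), Arrow (a2, b2) -> solve g ((a1, a2) :: (b1, b2) :: rest)
      | _ -> raise Unify_error)

(* unify(eqs, g1, ..., gn): solved form of eqs plus the bindings of each gi. *)
let unify (eqs : (ty * ty) list) (gs : subst list) : subst =
  let as_eqs g = List.map (fun (v, t) -> (Var v, t)) g in
  solve [] (eqs @ List.concat (List.map as_eqs gs))
```

Treating each input substitution as a set of equations and re-solving everything is the simplest way to obtain a single solved form; it is not tuned for efficiency.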
H |- int n ⇒ (int, ε)

H |- id x ⇒ H(x)

H |- e1 ⇒ (t1, g1)   H |- e2 ⇒ (t2, g2)   g = unify({t1=int, t2=int, g1, g2})
----------------------------------------
H |- e1 + e2 ⇒ (apply(g,t1), g)

THE RULES 1
H |- e1 ⇒ (t1, g1)   H |- e2 ⇒ (t2, g2)   H |- e3 ⇒ (t3, g3)   g = unify({t1=int, t2=t3, g1, g2, g3})
-----------------------------------------------------
H |- if e1 then e2 else e3 ⇒ (apply(g,t2), g)

H |- e1 ⇒ (t1, g1)   H |- e2 ⇒ (t2, g2)   g = unify({t1 = f1 -> f2, t2 = f1, g1, g2})
----------------------------------------
H |- e1 e2 ⇒ (apply(g,f2), g)

H[x ← (f1, ε)] |- e ⇒ (t, g)   t1 = apply(g,f1)
-----------------------------------------------
H |- λx.e ⇒ (t1 -> t, g)

THE RULES 2
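The rules for the non-recursive constructs can be read as a recursive function over expressions. The OCaml sketch below is one possible reading, not the authors' code: the constructors, `infer`, `fresh` and the list-based substitutions are assumptions of the sketch, and f1, f2 in the rules are taken to be fresh type variables.

```ocaml
type ty = Int | Var of int | Arrow of ty * ty
type subst = (int * ty) list

type expr =
  | Num of int | Id of string | Add of expr * expr
  | If of expr * expr * expr | App of expr * expr | Lam of string * expr

exception Unify_error

let rec apply (g : subst) = function
  | Int -> Int
  | Var v as t -> (match List.assoc_opt v g with Some u -> u | None -> t)
  | Arrow (a, b) -> Arrow (apply g a, apply g b)

let rec occurs v = function
  | Int -> false
  | Var w -> v = w
  | Arrow (a, b) -> occurs v a || occurs v b

let rec solve g = function
  | [] -> g
  | (a, b) :: rest -> (
      match apply g a, apply g b with
      | Int, Int -> solve g rest
      | Var v, Var w when v = w -> solve g rest
      | Var v, t | t, Var v ->
          if occurs v t then raise Unify_error
          else solve ((v, t) :: List.map (fun (x, u) -> (x, apply [ (v, t) ] u)) g) rest
      | Arrow (a1, b1), Arrow (a2, b2) -> solve g ((a1, a2) :: (b1, b2) :: rest)
      | _ -> raise Unify_error)

let unify eqs gs =
  solve [] (eqs @ List.concat (List.map (List.map (fun (v, t) -> (Var v, t))) gs))

(* Fresh type variables, assumed to play the role of f1, f2 in the rules. *)
let counter = ref 0
let fresh () = incr counter; Var !counter

(* H maps each program variable to a pair (type, constraint). *)
let rec infer h = function
  | Num _ -> (Int, [])
  | Id x -> List.assoc x h (* raises Not_found on unbound variables *)
  | Add (e1, e2) ->
      let t1, g1 = infer h e1 in
      let t2, g2 = infer h e2 in
      let g = unify [ (t1, Int); (t2, Int) ] [ g1; g2 ] in
      (apply g t1, g)
  | If (e1, e2, e3) ->
      let t1, g1 = infer h e1 in
      let t2, g2 = infer h e2 in
      let t3, g3 = infer h e3 in
      let g = unify [ (t1, Int); (t2, t3) ] [ g1; g2; g3 ] in
      (apply g t2, g)
  | App (e1, e2) ->
      let t1, g1 = infer h e1 in
      let t2, g2 = infer h e2 in
      let f1 = fresh () in
      let f2 = fresh () in
      let g = unify [ (t1, Arrow (f1, f2)); (t2, f1) ] [ g1; g2 ] in
      (apply g f2, g)
  | Lam (x, e) ->
      let f1 = fresh () in
      let t, g = infer ((x, (f1, [])) :: h) e in
      (Arrow (apply g f1, t), g)
```

For instance, `fst (infer [] (Lam ("x", Add (Id "x", Num 1))))` evaluates to `Arrow (Int, Int)`, and applying an integer as if it were a function raises `Unify_error`.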
if we abstract systematically the concrete semantics, the abstract semantics of recursion should be an abstract fixpoint computation

H |- μf.λx.e ⇒[Tn-1] (t1, g1)   H |- μf.λx.e ⇒[Tn] (t2, g2)   (t1, g1) = (t2, g2)
-----------------------------------------------------
H |- μf.λx.e ⇒ (t2, g2)

H |- μf.λx.e ⇒[Tn-1] (t1, g1)   H |- (μf.λx.e, (t1, g1)) ⇒[f] (t, g)
------------------------------------------------------------
H |- μf.λx.e ⇒[Tn] (t, g)

H |- μf.λx.e ⇒[T0] (f1, ε)

H[f ← (t, g)] |- λx.e ⇒ (t1, g1)
-----------------------------------------------
H |- (μf.λx.e, (t, g)) ⇒[f] (t1, g1)

THE RULES FOR RECURSION
we have infinite ascending chains • we are not guaranteed to be able to find a solution in a finite number of steps • the abstract computation for the expression μf.λx.f • does not terminate • we introduce a family of widening operators obtained by generalizing the operator introduced by Hindley’s inference algorithm • used also in ML • # let rec f x = f;; • This expression has type 'a -> 'b but is here used with type 'b. RECURSION AND WIDENING
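To see why the computation for μf.λx.f cannot stabilise, the iterates can be written down by hand from the recursion rules: the first is a fresh variable, and each step wraps the previous type in an arrow with a fresh argument variable. The OCaml sketch below only builds that chain; the names `iterate` and `size` and the variable numbering are assumptions of the sketch.

```ocaml
type ty = Var of int | Arrow of ty * ty

(* n-th iterate for μf.λx.f, derived by hand from the recursion rules:
   iterate 0 is a variable, iterate (n+1) is a fresh variable -> iterate n. *)
let rec iterate n = if n = 0 then Var 0 else Arrow (Var n, iterate (n - 1))

(* Number of nodes of a type term. *)
let rec size = function
  | Var _ -> 1
  | Arrow (a, b) -> 1 + size a + size b
```

Each iterate is strictly larger than the previous one, so no two consecutive iterates coincide up to renaming, and the bounded iteration has to fall back on the unification step, which fails here on the occurs check just as in the ML session quoted above.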
H |- μf.λx.e ⇒[widk] (t, g)
--------------------------
H |- μf.λx.e ⇒ (t, g)

H |- μf.λx.e ⇒[widk-1] (t1, g1)   H |- (μf.λx.e, (t1, g1)) ⇒[f] (t, g)   (t1, g1) = (t, g)
------------------------------------------------------------
H |- μf.λx.e ⇒[widk] (t, g)

H |- μf.λx.e ⇒[widk-1] (t1, g1)   H |- (μf.λx.e, (t1, g1)) ⇒[f] (t, g)   not((t1, g1) = (t, g))   j = unify({apply(g, t1) = t, g})
------------------------------------------------------------
H |- μf.λx.e ⇒[widk] (apply(j,t1), j)

H |- μf.λx.e ⇒[wid0] (f1, ε)

H[f ← (t, g)] |- λx.e ⇒ (t1, g1)
-----------------------------------------------
H |- (μf.λx.e, (t, g)) ⇒[f] (t1, g1)

THE FAMILY OF WIDENINGS
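The bounded iteration followed by unification can be sketched in OCaml as below. This is a reading of the rules under stated assumptions, not the authors' implementation: up to k plain fixpoint steps are performed and the unification is applied only after the k-th step if no fixpoint was found; the fixpoint test compares only the types, up to renaming of variables (reasonable for closed terms); and all names (`infer`, `canon`, `Rec`, ...) are hypothetical.

```ocaml
type ty = Int | Var of int | Arrow of ty * ty
type expr =
  | Num of int | Id of string | Add of expr * expr | If of expr * expr * expr
  | App of expr * expr | Lam of string * expr
  | Rec of string * string * expr (* Rec (f, x, e) stands for μf.λx.e *)

exception Unify_error

let rec apply g = function
  | Int -> Int
  | Var v as t -> (match List.assoc_opt v g with Some u -> u | None -> t)
  | Arrow (a, b) -> Arrow (apply g a, apply g b)

let rec occurs v = function
  | Int -> false
  | Var w -> v = w
  | Arrow (a, b) -> occurs v a || occurs v b

let rec solve g = function
  | [] -> g
  | (a, b) :: rest -> (
      match apply g a, apply g b with
      | Int, Int -> solve g rest
      | Var v, Var w when v = w -> solve g rest
      | Var v, t | t, Var v ->
          if occurs v t then raise Unify_error
          else solve ((v, t) :: List.map (fun (x, u) -> (x, apply [ (v, t) ] u)) g) rest
      | Arrow (a1, b1), Arrow (a2, b2) -> solve g ((a1, a2) :: (b1, b2) :: rest)
      | _ -> raise Unify_error)

let unify eqs gs =
  solve [] (eqs @ List.concat (List.map (List.map (fun (v, t) -> (Var v, t))) gs))

let counter = ref 0
let fresh () = incr counter; Var !counter

(* Rename variables by order of first occurrence, to compare up to renaming. *)
let canon t =
  let seen = ref [] in
  let rec go = function
    | Int -> Int
    | Arrow (a, b) -> let a' = go a in Arrow (a', go b)
    | Var v -> (
        match List.assoc_opt v !seen with
        | Some n -> Var n
        | None -> let n = List.length !seen in seen := (v, n) :: !seen; Var n)
  in
  go t

let rec infer k h = function
  | Num _ -> (Int, [])
  | Id x -> List.assoc x h
  | Add (e1, e2) ->
      let t1, g1 = infer k h e1 in
      let t2, g2 = infer k h e2 in
      let g = unify [ (t1, Int); (t2, Int) ] [ g1; g2 ] in
      (apply g t1, g)
  | If (e1, e2, e3) ->
      let t1, g1 = infer k h e1 in
      let t2, g2 = infer k h e2 in
      let t3, g3 = infer k h e3 in
      let g = unify [ (t1, Int); (t2, t3) ] [ g1; g2; g3 ] in
      (apply g t2, g)
  | App (e1, e2) ->
      let t1, g1 = infer k h e1 in
      let t2, g2 = infer k h e2 in
      let f1 = fresh () in
      let f2 = fresh () in
      let g = unify [ (t1, Arrow (f1, f2)); (t2, f1) ] [ g1; g2 ] in
      (apply g f2, g)
  | Lam (x, e) ->
      let f1 = fresh () in
      let t, g = infer k ((x, (f1, [])) :: h) e in
      (Arrow (apply g f1, t), g)
  | Rec (f, x, e) ->
      let step tg = infer k ((f, tg) :: h) (Lam (x, e)) in
      let rec iterate n prev =
        let t1 = fst prev in
        let t, g = step prev in
        if canon t = canon t1 then (t, g) (* fixpoint, up to renaming *)
        else if n >= k then (
          (* widening: unify the previous iterate with the new one *)
          let j = unify [ (apply g t1, t) ] [ g ] in
          (apply j t1, j))
        else iterate (n + 1) (t, g)
      in
      iterate 1 (fresh (), [])

(* let rec f x = (function x -> function y -> x) 0 (f 0) *)
let ex2 =
  Rec ("f", "x", App (App (Lam ("x", Lam ("y", Id "x")), Num 0), App (Id "f", Num 0)))

let show k e = canon (fst (infer k [] e))
```

With this sketch, `show 1 ex2` gives `Arrow (Int, Int)` (the ML type int -> int), while `show 3 ex2` gives `Arrow (Var 0, Int)`, that is 'a -> int. For `Rec ("f", "x", Id "f")` the final unification raises `Unify_error` for the values of k tried, mirroring the error reported by ML.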
Theorem 1 • Hindley’s algorithm is equivalent to our abstract semantics, with the choice of k = 1 Theorem 2 • the relation widk defines a family of widening operators which return a correct approximation of the abstract least fixpoint PROPERTIES OF OUR ABSTRACT SEMANTICS 1
widenings with more iterations might lead to more precise results, as claimed by Monsuez: false! Theorem 3 • if H |- μf.λx.e ⇒[widh] (t1, g1) • and H |- μf.λx.e ⇒[widk] (t2, g2) • then (t1, g1) = (t2, g2) • unless one of the two is the least fixpoint PROPERTIES OF OUR ABSTRACT SEMANTICS 2
a function which cannot be typed by ML, from Cousot 97 • f f1 g n x = g(f1^n(x))
# let rec f f1 g n x = if n=0 then g(x) else f(f1)(function x -> (function h -> g(h(x)))) (n-1) x f1;;
This expression has type ('a -> 'a) -> 'b but is here used with type 'b.
• we succeed in typing it with two more iterations, obtaining the type • val f: ('a -> 'a) -> ('a -> 'b) -> int -> 'a -> 'b = <fun> • the inferred type is the least fixpoint • Cousot computes the same type using a polytype system à la Church-Curry EXAMPLE 1
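As a sanity check of the type reported above, OCaml accepts the same function once that type is supplied as an explicitly polymorphic annotation (a language feature for polymorphic recursion, unrelated to the abstract interpreter of the slides). The sample arguments below are hypothetical.

```ocaml
(* With the annotation, 'a and 'b are universally quantified, so the
   recursive call may use f at the instance 'b := ('a -> 'a) -> 'b. *)
let rec f : 'a 'b. ('a -> 'a) -> ('a -> 'b) -> int -> 'a -> 'b =
 fun f1 g n x ->
  if n = 0 then g x
  else f f1 (fun x -> fun h -> g (h x)) (n - 1) x f1

(* f f1 g n x = g (f1^n x): here g (f1 (f1 (f1 0))) with f1 = succ, g = double. *)
let result = f (fun x -> x + 1) (fun x -> x * 2) 3 0
```

Here `result` is 6, in agreement with the specification f f1 g n x = g(f1^n(x)).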
step 0
t0 = 'a1
g0 = ε
step 1
t1 = 'a5->('a4->'a2)->int->'a4->'a2
g1 = {'a1='a5->('a3->(('a3->'a4)->'a2))->int->'a4->('a5->'a2)}
unification in widening fails: {'a4='a3, 'a2=('a3->'a3)->'a2, …}
This expression has type ('a -> 'a) -> 'b but is here used with type 'b.
TRACE OF THE COMPUTATION 1
step 1
t1 = 'a5->('a4->'a2)->int->'a4->'a2
g1 = {'a1='a5->('a3->(('a3->'a4)->'a2))->int->'a4->('a5->'a2)}
step 2
t2 = ('a7->'a7)->('a7->'a6)->int->'a7->'a6
g2 = {'a2=('a7->'a7)->'a6}
unification in widening fails: {'a5='a7->'a7, 'a4='a7, 'a6=('a7->'a7)->'a6, …}
TRACE OF THE COMPUTATION 2
step 2
t2 = ('a7->'a7)->('a7->'a6)->int->'a7->'a6
g2 = {'a2=('a7->'a7)->'a6}
step 3 (fixpoint!)
t3 = ('a->'a)->('a->'b)->int->'a->'b
g3 = {'a6=('a->'a)->'b, 'a7='a}
TRACE OF THE COMPUTATION 3
ML’s typing • # let rec f x = (function x -> function y -> x) 0 (f 0);; • val f: int -> int = <fun> • we succeed in getting a more precise type with two more iterations • val f: 'a -> int = <fun> • Monsuez gets the same type by polymorphic recursion EXAMPLE 2 [Monsuez 92]
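The more precise type can be checked against OCaml itself by supplying it as an explicitly polymorphic annotation. This only confirms that 'a -> int is a valid type for the definition; it does not reproduce the inference described in the slides. The unused parameters are renamed with a leading underscore, which is an editorial choice of this sketch.

```ocaml
(* Same definition as in the example, annotated with the more precise type.
   Note: calling f would loop forever under eager evaluation, since the
   argument (f 0) is evaluated first, so the sketch only type-checks it. *)
let rec f : 'a. 'a -> int = fun _x -> (fun a -> fun _b -> a) 0 (f 0)

(* f can now be used at argument types other than int. *)
let at_string = (f : string -> int)
let at_int = (f : int -> int)
```

Without the annotation OCaml infers int -> int, as reported above, because the recursive call `f 0` forces the argument type to int.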
ML’s typing • # let rec p x = if q x = q 1 then p x else p x • and q x = p 1;; • val p: int -> 'a = <fun> • val q: int -> 'a = <fun> • we succeed in getting a more precise type with two more iterations • val p: 'a -> 'b = <fun> • val q: 'a -> 'b = <fun> • Monsuez gets the same type by polymorphic recursion EXAMPLE 3: mutual recursion [Monsuez 93]
the main difference between Hindley’s algorithm and our abstract type interpreter lies in the way we deal with recursive definitions • the examples show that we are more precise • why? WHICH TYPE SYSTEM
each function application needs to have the same type • such a type is exactly the type of the recursive function • in Hindley’s algorithm each instantiation of the type of a recursive function produced by a function application has a direct effect in “guessing” the type of the recursive function • the instantiation produced by a function application has to produce the same instantiation on the type of the recursive function itself MONOMORPHIC TYPE SYSTEMS
each function application can have any type which is an instance of the recursive function type • the instantiation of the type of a recursive function in a function application does not produce the same instantiation on the type of the recursive function itself • different function applications of a recursive function can lead to different (possibly incompatible) instantiations of the recursive function type POLYMORPHIC TYPE SYSTEMS
a form of recursion which lies between monomorphism and polymorphic recursion • each function application can have any type which is an instance of the recursive function type as long as all these different instances are compatible • different function applications of a recursive function can lead to different (but compatible) instantiations of the recursive function type OUR TYPE SYSTEM
we collect all the constraints produced by the different function applications in each abstract fixpoint computation step • the instantiations must be compatible, i.e., they must produce a satisfiable constraint • we never apply the constraint to the type of the recursive function • the instantiation of the type of a recursive function in a function application does not produce the same instantiation on the type of the recursive function itself Inside The Fixpoint Computation
for all the types checked by the Church/Curry monotype system there exists a principal type computed by our abstract type interpreter • there are types which can be inferred by our abstract type interpreter which cannot be checked by the Church/Curry monotype system • any type that can be inferred with our abstract type interpreter can indeed be checked by the Damas-Milner-Mycroft polytype system In The Middle
universally quantified types are inserted in the environment after having been generalized • all the free type variables are universally quantified • any instance can be obtained by renaming and instantiating the universally quantified variables whenever needed in a function application POLYMORPHIC RECURSION À LA Damas-Milner-Mycroft
our abstract domain is much simpler • it is the standard monomorphic types with variables domain without quantification • no type generalization • the Damas-Milner-Mycroft polytype system could check an expression where the recursive function is called with two incomparable instances • this case seems not to arise in meaningful programs • all the examples introduced as a motivation for polymorphic recursion can indeed be typed by our abstract interpreter OUR POLYMORPHIC RECURSION
by computing better approximations of the fixpoints (bounded iteration + possible widening) in the abstract semantics of recursive functions • we succeed in inferring more precise types • we solve some problems of the ML type inference algorithm without resorting to more complex type systems • example 1 is typed by Cousot using a polytype system à la Church-Curry (and a fixpoint computation) • example 2 is typed by Monsuez (and Mycroft) using polymorphic recursion (and a fixpoint computation) WHAT DO WE LEARN FROM THE EXAMPLES?
type systems are very important to handle a large class of properties • functional and object-oriented programming • calculi for concurrency and mobility • the type system directly reflects the property we are interested in • typing rules are easy to understand • it is often hard to move from the typing rules to the type inference algorithm • systematic techniques are needed • abstract interpretation provides some of these techniques FROM TYPE SYSTEMS TO TYPE INFERENCE
which information needs to be added to types in order to perform an accurate inference? • the DM algorithm adds idempotent substitutions • a kind of relational information [Monsuez 92], very often used in abstract domains, to achieve a better precision in the abstract computation • the theory of abstract interpretation provides techniques for the systematic refinement of domains, which can be very useful to systematically transform the property of interest into a good (possibly complete) abstract domain • how to cope with fixpoint approximation • abstract interpretation tells us when we can safely compute abstract fixpoints and when and how we should use widening operators FROM TYPE SYSTEMS TO TYPE INFERENCE
R. Gori and G. Levi. • An Experiment in Type Inference and Verification by Abstract Interpretation • Verification, Model Checking and Abstract Interpretation, • Third Int’l Workshop, VMCAI 2002, Venice • A. Cortesi, Ed., LNCS 2294, 225-239, 2002 • R. Gori and G. Levi. • Properties of a Type Abstract Interpreter • Verification, Model Checking and Abstract Interpretation, • Fourth Int’l Workshop, VMCAI 2003, New York • L. Zuck et al., Eds., LNCS 2575, 132-145, 2003 REFERENCES