Types for Units of Measure Andrew Kennedy Microsoft Research
Motivation • Types catch errors in programs • Programs written in type-safe programming languages cannot crash the system • But there are errors that (currently) they don’t catch: • out-of-bounds array accesses • division by zero • taking the head of an empty list • adding a kilogram to a metre • Aim: to show that units of measure can be checked by types.
Approaches to the problem • Why not emulate using existing type systems? • Because there’s no support for the algebraic properties of units of measure (e.g. m s^-1 = s^-1 m) • Better: add special units types to the language and type-check w.r.t. algebraic properties of units. • But what about generic functions (e.g. mean and standard deviation of a list of quantities)? What types do they have? • Better still: a polymorphic type system for units of measure.
Types with units • Floating-point types are parameterised on units, e.g.
    m : kg num
    a : (m/s^2) num
    force : (kg*m/s^2) num
• Arithmetic operations respect units of measure, e.g.
    force := m*a   (accepted: kg * m/s^2 matches the type of force)
    force := m+a   (rejected: kg and m/s^2 are different units)
Units vs Dimensions • Dimensions are classes of interconvertible units e.g. • lb, kg, tonne are all instances of mass • m, inch, chain are all instances of length • Why not work with dimensions instead? • We could, but to support multiple systems of measurement we’d need to keep track of units anyway • To keep things simple, we stick to units and assume that distinct units are incompatible (so no conversions) • Assume some set of base units from which all others are derived (e.g. kg, m, s)
Polymorphism • Take ML/Haskell as our model, where functions are parametric in their types, e.g.
    length : 'a list -> int
    map : ('a -> 'b) -> ('a list -> 'b list)
• Parameterise on units instead of on types:
    mean : 'u num list -> 'u num
    variance : 'u num list -> ('u^2) num
Arithmetic • Now the built-in arithmetic functions can be assigned polymorphic types:
    + : 'u num * 'u num -> 'u num
    - : 'u num * 'u num -> 'u num
    * : 'u num * 'v num -> ('u*'v) num
    / : 'u num * 'v num -> ('u/'v) num
    < : 'u num * 'u num -> bool
    sqrt : ('u^2) num -> 'u num
    sin : 1 num -> 1 num   (angles are dimensionless ratios)
• Curiosity: zero should be polymorphic (it can have any units), all other constants are dimensionless (they have no units).
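The typing of the arithmetic primitives can be mimicked at run time. The following is only a sketch in Python, under the assumption that a unit is stored as a map from base-unit names to integer exponents; `Quantity`, `UnitError` and `sqrt` are hypothetical names, and the checks happen dynamically rather than in a static type system as the slides propose. The polymorphic zero is not modelled.

```python
import math


class UnitError(TypeError):
    """Raised when an operation would mix incompatible units."""


def _combine(u, v, sign=1):
    # Multiply unit u by v**sign; drop exponents that cancel to zero.
    out = dict(u)
    for name, exp in v.items():
        out[name] = out.get(name, 0) + sign * exp
    return {n: e for n, e in out.items() if e != 0}


class Quantity:
    def __init__(self, value, unit=None):
        self.value = value
        self.unit = {n: e for n, e in (unit or {}).items() if e != 0}

    def _require_same(self, other):
        if self.unit != other.unit:
            raise UnitError(f"unit mismatch: {self.unit} vs {other.unit}")

    # + , - and < demand equal units, like 'u num * 'u num -> ...
    def __add__(self, other):
        self._require_same(other)
        return Quantity(self.value + other.value, self.unit)

    def __sub__(self, other):
        self._require_same(other)
        return Quantity(self.value - other.value, self.unit)

    def __lt__(self, other):
        self._require_same(other)
        return self.value < other.value

    # * and / combine units, like 'u num * 'v num -> ('u*'v) num
    def __mul__(self, other):
        return Quantity(self.value * other.value, _combine(self.unit, other.unit))

    def __truediv__(self, other):
        return Quantity(self.value / other.value, _combine(self.unit, other.unit, -1))


def sqrt(q):
    # Only defined when every exponent is even, like ('u^2) num -> 'u num
    if any(e % 2 for e in q.unit.values()):
        raise UnitError(f"cannot take sqrt of unit {q.unit}")
    return Quantity(math.sqrt(q.value), {n: e // 2 for n, e in q.unit.items()})
```

With `m = Quantity(2.0, {"kg": 1})` and `a = Quantity(3.0, {"m": 1, "s": -2})`, the product `m * a` carries the unit kg*m/s^2, while `m + a` raises `UnitError`.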
Formalising the type system • In the usual way, taking ML and adding: • syntax for base units (baseunit) and unit variables (unitvar) • a new syntactic category of units:
    unit ::= 1 | unitvar | baseunit | unit * unit | unit^-1
• other powers of units and the “/” syntax as derived forms • equations that define the algebra of units as an Abelian group:
    u * v = v * u               (commutativity)
    u * (v * w) = (u * v) * w   (associativity)
    u * 1 = u                   (identity)
    u * u^-1 = 1                (inverses)
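One way to see why these four equations are easy to decide: every unit expression can be reduced to a map from names to integer exponents, and two expressions are equal in the Abelian group exactly when their maps coincide. The sketch below assumes that representation; the constructor names `one`, `atom`, `mul`, `inv`, `power` and `div` are hypothetical and simply mirror the grammar, with `power` and `div` as the derived forms.

```python
def one():
    # The unit 1: no names, all exponents zero.
    return {}


def atom(name):
    # A base unit such as "m", or a unit variable such as "'u".
    return {name: 1}


def mul(u, v):
    # unit * unit: add exponents, dropping any that cancel to zero.
    out = dict(u)
    for name, exp in v.items():
        out[name] = out.get(name, 0) + exp
    return {n: e for n, e in out.items() if e != 0}


def inv(u):
    # unit^-1: negate every exponent.
    return {n: -e for n, e in u.items()}


def power(u, k):
    # Derived form: unit^k for any integer k.
    return {n: e * k for n, e in u.items() if e * k != 0}


def div(u, v):
    # Derived form: u / v is u * v^-1.
    return mul(u, inv(v))


# Each group law becomes plain dictionary equality on normal forms.
m, s, kg = atom("m"), atom("s"), atom("kg")
assert mul(m, inv(s)) == mul(inv(s), m)             # commutativity
assert mul(kg, mul(m, s)) == mul(mul(kg, m), s)     # associativity
assert mul(m, one()) == m                           # identity
assert mul(m, inv(m)) == one()                      # inverses
```

Because the normal form drops zero exponents, `div(mul(kg, m), power(s, 2))` and `mul(power(s, -2), mul(m, kg))` produce the same dictionary.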
New typing rules • Just add the rule: if Γ ⊢ e : τ1 and τ1 =E τ2, then Γ ⊢ e : τ2, where =E is the equational theory of Abelian groups over units of measure lifted to type expressions. • Extend the usual rules for polymorphism introduction & elimination to quantify over unit variables.
Type inference • ML and Haskell have the convenience of decidable type inference: if the programmer omits types then the type checker can infer them, e.g.
    fun head (x::xs) = x
  is assigned the type
    'a list -> 'a
• Fortunately the same is true for units-of-measure types, e.g.
    fun derivative (h,f) = fn x => (f(x+h) - f(x-h)) / (2.0*h)
  is assigned the type
    'u num * ('u num -> 'v num) -> ('u num -> ('v/'u) num)
Principal types • The units type system has the principal types property: • if e is typeable there exists a type τ such that all valid types can be derived by substituting unit expressions for unit variables in τ. • moreover, the inference algorithm will find the principal type. • Say that inference yields τ for e if type checking e produces a type that instantiates to τ. • The correctness of the algorithm is expressed as: • Soundness: if inference yields τ for e, then e : τ is derivable in the type system. • Completeness: if e : τ is derivable in the type system, then inference yields τ for e.
The inference algorithm • ML type inference uses a unification algorithm that, for any two types τ1 and τ2 finds a unifying substitution S (if one exists) such that S(τ1) = S(τ2). Moreover, it will find the most general unifier. • We want more: unification under the equational theory of Abelian groups. • Fortunately, unification in this theory is unitary (mgu’s exist) and decidable (there’s an algorithm to find them).
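To make unification modulo the Abelian group laws concrete, here is a sketch of one standard approach: rewrite the problem as "make this unit equal to 1" and repeatedly reduce exponents, much like Euclid's algorithm. It is an illustration under stated assumptions, not necessarily the algorithm the talk has in mind. Units are maps from names to integer exponents, names beginning with an apostrophe are unit variables, and `unify`, `unify_to_one`, `apply_subst` and `compose` are hypothetical names.

```python
class UnitError(Exception):
    """Raised when two unit expressions cannot be unified."""


def is_var(name):
    return name.startswith("'")


def normalise(u):
    return {n: e for n, e in u.items() if e != 0}


def mul(u, v, sign=1):
    out = dict(u)
    for n, e in v.items():
        out[n] = out.get(n, 0) + sign * e
    return normalise(out)


def apply_subst(s, u):
    # Replace each variable in u by its image under s (raised to its exponent).
    out = {}
    for n, e in u.items():
        image = s.get(n, {n: 1})
        out = mul(out, {k: x * e for k, x in image.items()})
    return out


def compose(s2, s1):
    # The substitution "first s1, then s2".
    out = {v: apply_subst(s2, t) for v, t in s1.items()}
    for v, t in s2.items():
        out.setdefault(v, t)
    return out


def unify_to_one(u):
    """Return a substitution s with apply_subst(s, u) == {} (the unit 1)."""
    u = normalise(u)
    var_exps = {n: e for n, e in u.items() if is_var(n)}
    if not var_exps:
        if u:
            raise UnitError(f"base units do not cancel: {u}")
        return {}
    # Pick the variable whose exponent is smallest in absolute value.
    a = min(var_exps, key=lambda n: (abs(var_exps[n]), n))
    x = u[a]
    rest = {n: e for n, e in u.items() if n != a}
    if all(e % x == 0 for e in rest.values()):
        # a's exponent divides everything else: solve for a directly.
        return {a: normalise({n: -(e // x) for n, e in rest.items()})}
    if len(var_exps) == 1:
        raise UnitError(f"no integer solution for {a} in {u}")
    # Otherwise shrink the other exponents modulo x and recurse.
    s1 = {a: normalise({a: 1, **{n: -(e // x) for n, e in rest.items()}})}
    return compose(unify_to_one(apply_subst(s1, u)), s1)


def unify(u1, u2):
    # u1 = u2 holds exactly when u1 * u2^-1 = 1.
    return unify_to_one(mul(u1, u2, sign=-1))
```

For instance, `unify({"'u": 2}, {"m": 2})` maps `'u` to `m`, whereas `unify({"'u": 2}, {"m": 1})` raises `UnitError`, because no integer power of a unit squares to m.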
The inference algorithm, cont. • Unfortunately, that’s not all. To type let x = e1 in e2, one usually finds a type for e1 and then quantifies over the variables that are free in its type but not present in the type environment. • This is sound but not complete for inference of units of measure. • Fix: first ‘normalise’ the types in the type environment, a procedure akin to a ‘change of basis’.
Normal forms for types • In ML, the principal type can be presented in more than one way, but only with respect to the names of type variables, e.g.
    'a * 'b -> 'a
    'f * 's -> 'f
• Principal units types can have many non-trivial equivalent forms, e.g.
    'u num * ('u num -> 'v num) -> ('u num -> ('v/'u) num)
    'u num * ('u num -> ('u*'v) num) -> ('u num -> 'v num)
• It’s desirable to present types consistently to the programmer. Fortunately every type has a normal form that corresponds to the Hermite Normal Form from linear algebra.
Semantics of polymorphism • The polymorphic type of a function says something about the behaviour of the function. This idea has become known as parametricity and is very powerful. • Examples: • if f : 'a list -> int then f cannot “look” at the elements of the list, so f (xs) = g (length(xs)) for some g : int -> int. • if f : 'a -> 'a then f must be the identity function (or else it diverges or raises an exception). • there are no total functions with type int -> 'a • in the polymorphic lambda calculus, T is isomorphic to (T -> 'a) -> 'a.
Semantics of units polymorphism • Parametricity is about “representation independence”: a polymorphic function is invariant under changes to the representation of its polymorphic arguments. • The analogue for units is “dimensional invariance”: a polymorphic function is invariant under changes to the units of measure used for its polymorphic arguments. • The idea can be formalised using binary relations to give a parametricity theorem in the style of Reynolds.
Consequences • “Theorems for free”, e.g. • if f : 'u num -> ('u^2) num then f(k*x) = k*k*f(x) for any positive k • There are types for which no non-trivial expressions can be defined, e.g. • if only the usual arithmetic primitives are available (+ - * /), then no expression of type ('u^2) num -> 'u num returns a non-zero result for any argument.
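The scaling theorem for the type 'u num -> ('u^2) num can be spot-checked numerically. This is a sketch on plain Python floats, so nothing here is enforced by types; `scales_as_square` and the three sample functions are hypothetical, and the comments record the types they would (or would not) receive in the units type system.

```python
import math


def scales_as_square(f, xs, ks):
    # Check f(k*x) == k*k*f(x) on sample points, up to rounding error.
    return all(
        math.isclose(f(k * x), k * k * f(x), rel_tol=1e-9, abs_tol=1e-12)
        for x in xs
        for k in ks
    )


def square_area(x):
    # Typable as 'u num -> ('u^2) num.
    return x * x


def circle_area(r):
    # Also typable: math.pi is a dimensionless constant.
    return math.pi * r * r


def ill_typed(x):
    # Adds a 'u^2 quantity to a 'u quantity, so it has no units type.
    return x * x + x


samples = [0.5, 2.0, 7.25]
factors = [0.1, 3.0, 1000.0]
```

Both `square_area` and `circle_area` pass the check on these samples, while `ill_typed` fails it: rescaling its argument by k does not rescale its result by k*k.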
Dimensional analysis • Old idea: given some physical system with known variables but unknown equations, use the dimensions of the variables to determine the form of the equations. Example: a pendulum. [Figure: a pendulum of length l and mass m, displaced by angle θ, under gravity g, with period t]
Worked example • Pendulum has five variables:
    quantity      symbol   dimension
    mass          m        M
    length        l        L
    gravity       g        L T^-2
    angle         θ        none
    time period   t        T
• Assume some relation f(m, l, g, θ, t) = 0 • Then by dimensional invariance f(Mm, Ll, LT^-2 g, θ, Tt) = 0 for any "scale factors" M, L, T • Let M = 1/m, L = 1/l, T = 1/t, so f(1, 1, t^2 g/l, θ, 1) = 0 • Assuming a functional relationship, we obtain t^2 g/l = h(θ) for some function h, i.e. t = sqrt(l/g) * φ(θ) for some function φ.
Dimensional analysis, formally • “Pi Theorem”. Any dimensionally-invariant relation f(x1,…,xn) = 0 for dimensioned variables x1,…,xn whose dimension exponents are given by an m by n matrix A is equivalent to some relation g(P1,…,P(n-r)) = 0, where r is the rank of A and P1,…,P(n-r) are dimensionless products of powers of x1,…,xn. Proof: Birkhoff.
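The dimensionless products in the Pi Theorem are the null space of the exponent matrix A, so they can be computed by ordinary Gaussian elimination. The sketch below uses exact rational arithmetic from Python's standard `fractions` module and applies it to the pendulum, with rows for the dimensions M, L, T and columns for the variables m, l, g, θ, t. The helper names `rref` and `nullspace` are hypothetical.

```python
from fractions import Fraction


def rref(rows):
    # Reduced row echelon form over the rationals; returns (matrix, pivot columns).
    m = [[Fraction(x) for x in row] for row in rows]
    pivots, r = [], 0
    for c in range(len(m[0])):
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        pivot = m[r][c]
        m[r] = [x / pivot for x in m[r]]
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                factor = m[i][c]
                m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
        if r == len(m):
            break
    return m, pivots


def nullspace(rows):
    # One basis vector per free column; each is the exponent vector
    # of a dimensionless product of the variables.
    m, pivots = rref(rows)
    ncols = len(rows[0])
    basis = []
    for free in (c for c in range(ncols) if c not in pivots):
        v = [Fraction(0)] * ncols
        v[free] = Fraction(1)
        for row_index, pivot_col in enumerate(pivots):
            v[pivot_col] = -m[row_index][free]
        basis.append(v)
    return basis


# Columns: m, l, g, theta, t.  Rows: exponents of M, L, T.
pendulum = [
    [1, 0, 0, 0, 0],   # M
    [0, 1, 1, 0, 0],   # L
    [0, 0, -2, 0, 1],  # T
]
products = nullspace(pendulum)
```

Here A has rank 3 and n = 5, so `products` holds two exponent vectors: one selects θ alone, and the other is l^(-1/2) * g^(1/2) * t, whose square is the product t^2 g/l that appears in the pendulum derivation.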
Dimensional analysis • New idea: express Pi Theorem by isomorphisms between polymorphic functions of several arguments and dimensionless functions of fewer arguments, e.g.
    'M num * 'L num * ('L/'T^2) num * 1 num -> ('T^2) num
  is isomorphic to
    1 num -> 1 num
What's left? • Extending the system to support multiple systems of units and automatic insertion of unit conversions • Generalising some of the semantic results (e.g. a Pi Theorem at higher types) • Using the parametric relations to construct a model of the language that accurately reflects semantic equivalences (i.e. is fully abstract w.r.t. the underlying semantics) • Practical implementation in real languages