Parametric Polymorphism

Parametric Polymorphism Antonio Cisternino, Giuseppe Attardi, Università di Pisa

Parametric Polymorphism • C++ templates implement a form of parametric polymorphism • PP is implemented in many flavors and many languages: Eiffel, Mercury, Haskell, Ada, ML, C++, … • Improves the expressivity of a language • May improve the performance of programs • It is a form of universal polymorphism

C++ templates and macros
• Macros are handled by the preprocessor
• C++ templates are implemented on the syntax tree
• The instantiation strategy is lazy
• The following class compiles unless the method foo is used:
    template <class T>
    class Foo {
    public:
        T x;
        int foo() { return x + 2; }   // ill-formed when T is a pointer type
    };
    Foo<const char*> f;
    f.x = "";
    f.foo();   // the error is reported only here, when foo is instantiated

A more semantic approach • Parametric polymorphism has been introduced also in Java and C# • Java Generics and Generic C# for .NET • In both cases the compiler is able to check parametric classes just looking at their definition • Parametric types are more than macros on AST • Syntax for generics is similar

Generics in a Nutshell
• Type parameterization for classes, interfaces, and methods, e.g.
    class Set<T> { ... }                          // parameterized class
    class Dict<K,D> { ... }                       // two-parameter class
    interface IComparable<T> { ... }              // parameterized interface
    struct Pair<A,B> { ... }                      // parameterized struct ("value class")
    T[] Slice<T>(T[] arr, int start, int count)   // generic method; in GJ it is <T> T[] Slice(…)
• Very few restrictions on usage:
• Type instantiations can be primitive (only C#) or class, e.g.
    Set<int>   Dict<string,List<float>>   Pair<DateTime, MyClass>
• Generic methods of all kinds (static, instance, virtual); virtual generic methods only in GC#!
• Inheritance through instantiated types, e.g.
    class Set<T> : IEnumerable<T>
    class FastIntSet : Set<int>

More on generic methods
• Generic methods are similar to template methods in C++
• As in C++, JG tries to infer the type parameters from the method invocation
• The Generic C# prototype required the type arguments to be specified (C# since version 2.0 infers them in most calls)
• Example:
    // C++
    template <class T> T sqr(T x) { return x*x; }
    std::cout << sqr(2.0) << std::endl;
    // JG
    class F { static <T> void sort(T[] a) {…} }
    String[] s; F.sort(s);
    // C#
    class F { static void sort<T>(T[] a) {…} }
    string[] s; F.sort<string>(s);
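The inference described above can be tried in current Java. The following is a minimal sketch, not taken from the slides: the class `GenericMethodDemo` and the method `max` are hypothetical names chosen for illustration. It shows the same generic method called once with an inferred type argument and once with an explicit one.

```java
// Hypothetical demo class; the names are assumptions made for this sketch.
class GenericMethodDemo {
    // Generic method: the type parameter list comes before the return type.
    static <T extends Comparable<T>> T max(T[] items) {
        T best = items[0];
        for (T item : items) {
            if (item.compareTo(best) > 0) {
                best = item;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        String[] words = {"pear", "apple", "quince"};
        String inferred = max(words);                            // T inferred as String
        String explicit = GenericMethodDemo.<String>max(words);  // T supplied explicitly
        System.out.println(inferred + " " + explicit);
    }
}
```

The explicit form is rarely needed in practice; it is shown only to make visible what the compiler fills in.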

Generic Stack
    class Stack<T> {
        private T[] items;
        private int nitems;
        Stack() { nitems = 0; items = new T[50]; }
        T Pop() {
            if (nitems == 0) throw new Empty();
            return items[--nitems];
        }
        bool IsEmpty() { return (nitems == 0); }
        void Push(T item) {
            if (items.Length == nitems) {
                // grow the backing array when it is full
                T[] temp = items;
                items = new T[nitems*2];
                Array.Copy(temp, items, nitems);
            }
            items[nitems++] = item;
        }
    }
How does the compiler check the definition?

The semantic problem • The C++ compiler cannot make assumptions about type parameters • The only way to type-check a C++ class is to wait for the type arguments to be specified (instantiation): only then is it possible to check the operations used (e.g. the comp method in sorting) • From the standpoint of the C++ compiler's semantic module, no type is parametric

Checking class definition • To type-check a parametric class just by looking at its definition, the notion of bound is introduced • Just as method arguments have a type, type arguments are bound to other types • The compiler allows values of such types to be used as if upcast to the bound type • Example: class Vector<T: Sortable> • Elements of the vector must implement (or inherit from) Sortable

Example
    interface Sortable<T> { int compareTo(T a); }
    class Vector<T: Sortable<T>> {
        T[] v;
        int sz;
        Vector() { sz = 0; v = new T[15]; }
        void addElement(T e) {…}
        void sort() { … if (v[i].compareTo(v[j]) > 0) … }
    }
• Not possible in Java as written: type T is lost at run time, so new T[15] is rejected (the bound itself can be written T extends Sortable<T>, even though Sortable is an interface)
• The compiler can type-check this because v contains values that implement Sortable<T>
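A bounded vector along these lines can be sketched in Java 5 or later, under two assumptions that are not in the slide: the bound is spelled `T extends Sortable<T>`, and the backing store is an `Object[]`, because Java does not accept `new T[15]`. The names `BoundedVector`, `Version` and `elementAt` are hypothetical and exist only for this illustration.

```java
import java.util.Arrays;

interface Sortable<T> {
    int compareTo(T other);
}

// Hypothetical element type used only for this illustration.
class Version implements Sortable<Version> {
    final int number;
    Version(int number) { this.number = number; }
    public int compareTo(Version other) { return Integer.compare(number, other.number); }
}

class BoundedVector<T extends Sortable<T>> {
    private Object[] v = new Object[15];   // Java rejects `new T[15]`
    private int sz = 0;

    void addElement(T e) {
        if (sz == v.length) {
            v = Arrays.copyOf(v, sz * 2);
        }
        v[sz++] = e;
    }

    @SuppressWarnings("unchecked")
    T elementAt(int i) {
        return (T) v[i];   // safe here: only addElement stores into v
    }

    // Insertion sort; compareTo may be called because of the bound on T.
    void sort() {
        for (int i = 1; i < sz; i++) {
            T key = elementAt(i);
            int j = i - 1;
            while (j >= 0 && elementAt(j).compareTo(key) > 0) {
                v[j + 1] = v[j];
                j--;
            }
            v[j + 1] = key;
        }
    }
}
```

The call to `compareTo` inside `sort` is checked against the bound when `BoundedVector` is compiled, without reference to any instantiation.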

Pros and Cons • A parameterized type is checked also if no instantiation is present • Assumptions on type parameters are always explicit (if no bound is specified Object is assumed) • Is it possible to make assumptions beyond bound? • Yes, you can always cheat by upcasting to Object and then do whatever you want: class Foo<T : Button> { void foo(T b) { String s = (String)(Object)b; } } • Still the assumption made by the programmer is explicit

Implementation • Different implementations of parametric polymorphism: • C++ only records the template when it is defined; code is produced at first instantiation • Java deals with generic types at compile time: the JVM is not aware of parametric types • C# exploits the support for parametric types in the CLR (Common Language Runtime) • the CIL (Common Intermediate Language) has special instructions for dealing with type parameters

Java Generics strategy • The Java compiler verifies that generic types are used correctly • Type parameters are dropped and the bound is used instead; downcasts are inserted in the right places • Generated code is a normal class file unaware of parametric polymorphism

Example
Source with generics:
    class Vector<T> {
        T[] v; int sz;
        Vector() { v = new T[15]; sz = 0; }
        <U extends Comparer<T>> void sort(U c) { … c.compare(v[i], v[j]); … }
    }
    …
    Vector<Button> v;
    v.addElement(new Button());
    Button b = v.elementAt(0);
After translation:
    class Vector {
        Object[] v; int sz;
        Vector() { v = new Object[15]; sz = 0; }
        void sort(Comparer c) { … c.compare(v[i], v[j]); … }
    }
    …
    Vector v;
    v.addElement(new Button());
    Button b = (Button)v.elementAt(0);
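The translation can be observed from inside a running program. The sketch below is an illustration under assumed names (`Cell`, `ErasedCell`, `ErasureSketch` are hypothetical): it places a generic class next to a hand-written approximation of its erased form, and uses reflection to show that the generic getter is compiled with return type `Object`.

```java
// Generic version: the compiler checks uses of T and then drops it.
class Cell<T> {
    private T value;
    void set(T value) { this.value = value; }
    T get() { return value; }
}

// Hand-written approximation of what is left after erasure (an assumption
// made for illustration, not the compiler's literal output).
class ErasedCell {
    private Object value;
    void set(Object value) { this.value = value; }
    Object get() { return value; }
}

class ErasureSketch {
    static String viaGenerics() {
        Cell<String> c = new Cell<>();
        c.set("hello");
        return c.get();            // the compiler inserts the downcast
    }

    static String byHand() {
        ErasedCell c = new ErasedCell();
        c.set("hello");
        return (String) c.get();   // the same downcast, written explicitly
    }

    // Reflection sees the erased signature of Cell.get.
    static Class<?> erasedReturnType() {
        try {
            return Cell.class.getDeclaredMethod("get").getReturnType();
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }
}
```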

Wildcard
    class Pair<X,Y> { X first; Y second; }
    public String pairString(Pair<?, ?> p) {
        return p.first + ", " + p.second;
    }

Expressivity vs. efficiency • JG does not improve execution speed; though it helps to express genericity better than inheritance • Major limitation in JG expressivity: exact type information is lost at runtime • All instantiations of a generic type collapse to the same class • Consequences are: no virtual generic methods and pathological situations • Benefit: old Java < 5 classes could be seen as generic types! Reuse of the existing codebase

Generics and Java System Feature

Compilation Strategies • Java compiler compiles to Java bytecode: • Java bytecode is loaded and compiled at run time by the JIT + HotSpot • C# (and other CLR) compilers generate CIL code which is compiled to binary at load time

Problem with Java Generics
    Stack<String> s = new Stack<String>();
    s.push("Hello");
    Stack<Object> o = s;
    Stack<Button> b = (Stack<Button>)o;
    Button mb = b.pop();   // Class cast exception
• Cast authorized: both Stack<String> and Stack<Button> map to class Stack
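The same late failure can be reproduced in current Java by going through a raw type, which is the route the language leaves open for legacy code. This is a minimal sketch under that assumption. It uses `java.util.List` instead of the slide's `Stack`, and `HeapPollutionSketch` is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.List;

class HeapPollutionSketch {
    @SuppressWarnings({"rawtypes", "unchecked"})
    static boolean failsOnRead() {
        List<String> strings = new ArrayList<>();
        List raw = strings;              // raw type: the element type is no longer checked
        raw.add(Integer.valueOf(42));    // accepted, since at run time both are plain List
        try {
            String s = strings.get(0);   // the compiler-inserted cast to String fails here
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }
}
```

The exception surfaces at the read, far from the statement that stored the wrong element, which is the pathological aspect of erasure.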

Type Erasure
    ArrayList<Integer> li = new ArrayList<Integer>();
    ArrayList<Float> lf = new ArrayList<Float>();
    if (li.getClass() == lf.getClass()) {   // evaluates to true
        System.out.println("Equal");
    }

Generic C# • The CLR supports parametric types • In IL, placeholders are used to indicate type arguments (!0, !1, …) • When the program needs an instantiation of a generic type, the loader generates the appropriate type • The JIT can share the implementation of reference instantiations (Stack<String> has essentially the same code as Stack<Object>)

Generic C# compiler • Template-like syntax with notation for bounds • NO type inference on generic methods in the Generic C# prototype: the type must be specified in the call (C# since version 2.0 infers it in most calls) • The compiler relies on GCLR to generate the code • Exact runtime types are provided by the CLR, so virtual generic methods are allowed • All type constructors can be parameterized: structs, classes, interfaces and delegates.

Example
    using System;
    namespace n {
        public class Foo<T> {
            T[] v;
            Foo() { v = new T[15]; }
            public static void Main(string[] args) {
                Foo<string> f = new Foo<string>();
                f.v[0] = "Hello";
                string h = f.v[0];
                Console.Write(h);
            }
        }
    }
Generated IL:
    .field private !0[] v
    .method private hidebysig specialname rtspecialname
            instance void .ctor() cil managed {
        .maxstack 2
        ldarg.0
        call instance void [mscorlib]System.Object::.ctor()
        ldarg.0
        ldc.i4.s 15
        newarr !0
        stfld !0[] class n.Foo<!0>::v
        ret
    } // end of method Foo::.ctor

Performance of CLR Generics • Despite instantiation being performed at load time, the overhead is minimal • Code sharing reduces instantiations, improving execution speed • A technique based on dictionaries is employed to keep track of previously instantiated types

Expressive power of Generics • System F is a typed λ-calculus with polymorphic types • While Turing-equivalence is a trivial property of programming languages, equivalence of a type system to System F is not • Polymorphic languages such as ML and Haskell cannot fully express System F (both languages have been extended to fill the gap) • System F can be transposed into C# http://www.cs.kun.nl/~erikpoll/ftfjp/2002/KennedySyme.pdf

Liskov Substitution Principle • Sub-typing/sub-classing defines the class relation "B is a sub-type of A", written B <: A. • According to the substitution principle, if B <: A, then an instance of B can be substituted for an instance of A. • Therefore, it is legal to assign an instance b of B to a variable of type A: A a = b;

Inheritance as Subtyping • Simple assumption: • If class B derives from class A then: B <: A

Generics and Subtyping
• Do the rules for sub-types and assignment work for generics? If B <: A, then G<B> <: G<A>?
• Counterexample:
    List<String> ls = new ArrayList<String>();
    List<Object> lo = ls;   // Since String <: Object, so far so good.
    lo.add(new Object());
    String s = (String)ls.get(0);   // Error!
• The rule B <: A ⇒ G<B> <: G<A> defies the principle of substitution!

Other example
    class B extends A { … }
    class G<E> { public E e; }
    G<B> gb = new G<B>();
    G<A> ga = gb;
    ga.e = new A();
    B b = gb.e;   // Error!
• Given B <: A, and assuming G<B> <: G<A>, then G<A> ga = gb; would be legal.
• In Java, type is erased.
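Java arrays make a useful contrast, because they do follow the rule B <: A ⇒ B[] <: A[] and pay for it with a check on every store. The sketch below uses arrays rather than the slide's class G, so it is an analogy and not the author's example. `ArrayCovarianceSketch` is a hypothetical name.

```java
class ArrayCovarianceSketch {
    static boolean storeIsRejected() {
        String[] strings = new String[1];
        Object[] objects = strings;            // legal: Java arrays are covariant
        try {
            objects[0] = Integer.valueOf(1);   // type-checks statically, verified at run time
            return false;
        } catch (ArrayStoreException e) {
            return true;                       // the array remembers it holds Strings
        }
    }
}
```

Erased generics carry no element type at run time, so the corresponding check cannot be performed, and the assignment G<A> ga = gb is rejected by the compiler instead.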

Bounded Wildcard
• A wildcard does not allow doing much
• To provide operations with wildcard types, one can specify bounds:
    Bound         Meaning                     Java              C#
    Upper bound   the ancestor of unknown     G<? extends X>    G<T> where T : X
    Lower bound   the descendant of unknown   G<? super Y>      G<T> where Y : T

Bounded Wildcards Subtyping Rules For any B such that B <: A: • G<B> <: G<? extends A> • G<A> <: G<? super B>

Bounded Wildcards - Example
    G<A> ga = new G<A>();
    G<B> gb = new G<B>();
    G<? extends A> gea = gb;   // G<B> <: G<? extends A>, hence legal
    A a = gea.e;               // can read from
    G<? super B> gsb = ga;     // G<A> <: G<? super B>, hence legal
    gsb.e = new B();           // can write to

Wildcard subtyping in Java By Vilhelm.s - CC BY-SA 3.0

Generics and Polymorphism
    class Shape { void draw() {…} }
    class Circle extends Shape { void draw() {…} }
    class Rectangle extends Shape { void draw() {…} }
    public void drawAll(Collection<Shape> shapes) {
        for (Shape s: shapes) s.draw();
    }
• Does not work. Why?
• Cannot be used on Collection<Circle>

Bounded Polymorphism
• Bind the wildcard: replace the type Collection<Shape> with Collection<? extends Shape>:
    public void drawAll(Collection<? extends Shape> shapes) {
        for (Shape s: shapes) s.draw();
    }
• Now drawAll() will accept collections of any subclass of Shape
• The ? stands for an unknown subtype of Shape
• The type Shape is the upper bound of the wildcard

Bounded Wildcard
• There is a problem when using wildcards:
    public void addCircle(Collection<? extends Shape> shapes) {
        shapes.add(new Circle());
    }
• What will happen? Why?
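The Java compiler rejects the call to add in addCircle: the ? may stand for Rectangle, and a Circle must not end up in a Collection<Rectangle>. A method that only writes Circles can ask for a lower bound instead. The sketch below shows both directions. The class name `ShapeSketch`, the String-returning `draw` and the counting return value are assumptions made so that the example is self-contained and checkable.

```java
import java.util.Collection;

class Shape {
    String draw() { return "shape"; }
}

class Circle extends Shape {
    @Override String draw() { return "circle"; }
}

class ShapeSketch {
    // Reads Shapes out of the collection: an upper bound is enough.
    static int drawAll(Collection<? extends Shape> shapes) {
        int drawn = 0;
        for (Shape s : shapes) {
            s.draw();
            drawn++;
        }
        return drawn;
    }

    // Writes Circles into the collection: a lower bound is needed.
    static void addCircle(Collection<? super Circle> shapes) {
        shapes.add(new Circle());
    }
}
```

With these signatures, addCircle accepts a Collection<Circle>, a Collection<Shape> or a Collection<Object>, and drawAll accepts a Collection<Circle> or a Collection<Shape>.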

Covariance, Contravariance, Invariance Given types A and B such that B <: A, a type constructor G is said to be: • Covariant: if G<B> <: G<A> • Contravariant: if G<A> <: G<B> • Invariant: if neither covariant nor contravariant

C# Variance Declaration
    interface IEnumerator<out T> {
        T Current { get; }
        bool MoveNext();
    }
    public delegate void Action<in T>(T obj);
    Action<Shape> b = (shape) => { shape.draw(); };
    Action<Circle> d = b;   // Action<Shape> <: Action<Circle>
    d(new Circle());
    Action<Object> o = b;   // illegal
• A covariant type parameter can be used as the return type of a delegate
• A contravariant type parameter can be used as a parameter type
• The type of a result is covariant; the type of a function argument is contravariant

Substitutability Principle • If S is a subtype of T, then objects of type T may be replaced with objects of type S without altering any of the desirable properties of that program (e.g. correctness).

Liskov Substitution Principle Let φ(x) be a true property of objects x of type T. Then φ(y) should be true for objects y of type S where S is a subtype of T. • Behavioral subtyping is a stronger notion than nominal or structural subtyping

Duck Typing • If objects of class A can handle all of the messages that objects of class B can handle (that is, if they define all the same methods), then A is a subtype of B regardless of inheritance. • "If it walks like a duck and swims like a duck and quacks like a duck, I call it a duck."

Structural Subtyping • In structural typing, an element is considered to be compatible with another if, for each feature within the second element's type, there is a corresponding and identical feature in the first element's type. • Subtype polymorphism is structural subtyping • Inheritance is not subtyping in structurally-typed OO languages: • if a class defines a method that takes arguments or returns values of its own type

Liskov Signature Requirements • Matching function or method types involves deciding on subtyping between method signatures • Method argument types must obey contravariance • Return types must obey covariance

Examples • Assuming • Cat <: Animal • Enumerable<T> is covariant on T • Action<T> is contravariant on T • Enumerable<Cat> is a subtype of Enumerable<Animal>. The subtyping is preserved. • Action<Animal> is a subtype of Action<Cat>. The subtyping is reversed. • Neither List<Cat> nor List<Animal> is a subtype of the other, because List<T> is invariant on T.
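Java has no declaration-site variance, but the same two relations can be expressed at the use site with wildcards. The following is a minimal sketch under that substitution: `Iterable` plays the role of Enumerable and `java.util.function.Consumer` the role of Action. The classes `Animal`, `Cat` and `VarianceSketch` are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

class Animal {
    String name() { return "animal"; }
}

class Cat extends Animal {
    @Override String name() { return "cat"; }
}

class VarianceSketch {
    static String demo() {
        List<Cat> cats = new ArrayList<>();
        cats.add(new Cat());

        // Covariant use: a source of Cats is a source of Animals.
        Iterable<? extends Animal> animals = cats;

        // Contravariant use: an action on Animals is an action on Cats.
        StringBuilder log = new StringBuilder();
        Consumer<Animal> logAnimal = a -> log.append(a.name());
        Consumer<? super Cat> catAction = logAnimal;

        for (Animal a : animals) {
            log.append(a.name()).append(' ');
        }
        catAction.accept(new Cat());
        return log.toString();
    }
}
```

Assigning `cats` to a plain `List<Animal>` is rejected, matching the invariance of List<T> stated above.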

A Typical Violation • Deriving class Square from class Rectangle violates the principle if, for example, methods of class Rectangle are allowed to change width and height independently
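The violation can be made concrete with a small sketch. The setter names and the client method `areaIsAsExpected` are assumptions chosen for illustration: the client relies on a property that holds for every Rectangle and stops holding once a Square is substituted.

```java
class Rectangle {
    protected int width;
    protected int height;
    void setWidth(int w) { width = w; }
    void setHeight(int h) { height = h; }
    int area() { return width * height; }
}

// A Square keeps its sides equal, so each setter changes both fields.
class Square extends Rectangle {
    @Override void setWidth(int w) { width = w; height = w; }
    @Override void setHeight(int h) { width = h; height = h; }
}

class LspSketch {
    // Client code written against Rectangle: width and height are independent.
    static boolean areaIsAsExpected(Rectangle r) {
        r.setWidth(4);
        r.setHeight(5);
        return r.area() == 20;
    }
}
```

The compiler accepts Square as a subtype of Rectangle, since all signatures match. The failure is behavioral, which is why behavioral subtyping is the stronger notion.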

Contravariance of Argument Types
    public class SuperType {
        public virtual string AgeString(short age) {
            return age.ToString();
        }
    }
    public class LSPLegalSubType : SuperType {
        // Legal under the contravariance requirement: widening the argument type is allowed
        public override string AgeString(int age) {
            return age.ToString();
        }
    }
    public class LSPIllegalSubType : SuperType {
        // Illegal under the contravariance requirement: the argument type is narrowed
        public override string AgeString(byte age) {
            return base.AgeString((short)age);
        }
    }
• Illustrates the signature rule only: C# itself requires an override to repeat the parameter types exactly

Covariance of Return Type
    public class SuperType {
        public virtual int DaysSinceLastLogin(User user) {
            return int.MaxValue;
        }
    }
    public class LSPLegalSubType : SuperType {
        public override short DaysSinceLastLogin(User user) {
            return short.MaxValue;   // Legal because it will always fit into an int
        }
    }
    public class LSPIllegalSubType : SuperType {
        public override long DaysSinceLastLogin(User user) {
            return long.MaxValue;   // Illegal because it may not fit into an int
        }
    }
• Illustrates the signature rule only: short, int and long are not related by subtyping in C#
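Java implements one half of the signature requirements. The sketch below, with hypothetical class names, shows that an override may narrow a reference return type, while a method with a widened parameter type is treated as a separate overload and not as an override.

```java
class Report {
    String title() { return "report"; }
}

class DetailedReport extends Report {
    @Override String title() { return "detailed report"; }
}

class BaseSource {
    Report latest() { return new Report(); }
    String describe(String s) { return "base"; }
}

class DerivedSource extends BaseSource {
    // Covariant return type: a true override, accepted by the compiler.
    @Override DetailedReport latest() { return new DetailedReport(); }

    // Wider parameter type: this is an overload, not an override.
    // Adding @Override here would be a compile-time error.
    String describe(Object o) { return "derived"; }
}

class SignatureSketch {
    static String latestTitle() {
        BaseSource src = new DerivedSource();
        return src.latest().title();   // dispatches to DerivedSource.latest
    }

    static String describeThroughBase() {
        BaseSource src = new DerivedSource();
        return src.describe("x");      // still runs BaseSource.describe(String)
    }
}
```

Contravariant parameter types would be type-safe, but Java resolves overloads on the static parameter types, so it keeps the two describe methods distinct.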

To wildcard or not to wildcard?
• That is the question:
    interface Collection<E> {
        public boolean containsAll(Collection<?> c);
        public boolean addAll(Collection<? extends E> c);
    }
    interface Collection<E> {
        public <T> boolean containsAll(Collection<T> c);
        public <T extends E> boolean addAll(Collection<T> c);
    }

Lower Bound Example
    interface Sink<T> { void flush(T t); }
    public <T> T flushAll(Collection<T> col, Sink<T> sink) {
        T last = null;   // initialized so that the method compiles for an empty collection
        for (T t : col) {
            last = t;
            sink.flush(t);
        }
        return last;
    }

Lower Bound Example (2)
    Sink<Object> s;
    Collection<String> cs;
    String str = flushAll(cs, s);   // Error! No single T fits both Collection<String> and Sink<Object>
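One way to make the call type-check is to give the sink parameter a lower bound, so that T can be String while the sink accepts any supertype of String. This is a minimal sketch of that fix. The wrapper class `FlushSketch` is a hypothetical name, and flushAll is made static so it can be called without an instance.

```java
import java.util.Collection;

interface Sink<T> {
    void flush(T t);
}

class FlushSketch {
    // The sink only consumes elements, so any Sink of a supertype of T will do.
    static <T> T flushAll(Collection<T> col, Sink<? super T> sink) {
        T last = null;
        for (T t : col) {
            last = t;
            sink.flush(t);
        }
        return last;   // null when the collection is empty
    }
}
```

With this signature, the call from the slide infers T as String: the Sink<Object> matches Sink<? super String>, and the result can be assigned to a String.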
