Parametric Polymorphism
Antonio Cisternino, Giuseppe Attardi
Università di Pisa
Parametric Polymorphism
• C++ templates implement a form of parametric polymorphism
• Parametric polymorphism (PP) is implemented in many flavors and many languages: Eiffel, Mercury, Haskell, Ada, ML, C++…
• It improves the expressivity of a language
• It may improve the performance of programs
• It is a form of universal polymorphism
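C++ templates and macros
• Macros are handled by the preprocessor
• C++ templates are implemented on the syntax tree
• The instantiation strategy is lazy
• The following class compiles as long as the method foo is not used:

    template <class T>
    class Foo {
    public:                  // x must be accessible for the assignment below
        T x;
        int foo() { return x + 2; }
    };

    Foo<const char*> f;      // fine: foo() has not been instantiated yet
    f.x = "";
    f.foo();                 // error: x + 2 is a pointer, not convertible to int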
A more semantic approach
• Parametric polymorphism has also been introduced in Java and C#
• Java Generics (JG) and Generic C# (GC#) for .NET
• In both cases the compiler is able to check parametric classes just by looking at their definition
• Parametric types are more than macros on the AST
• The syntax for generics is similar in both JG and C#
Generics in a Nutshell
• Type parameterization for classes, interfaces, and methods, e.g.

    class Set<T> { ... }                          // parameterized class
    class Dict<K,D> { ... }                       // two-parameter class
    interface IComparable<T> { ... }              // parameterized interface
    struct Pair<A,B> { ... }                      // parameterized struct ("value class")
    T[] Slice<T>(T[] arr, int start, int count)   // generic method; in GJ it is <T> T[] Slice(…)

• Very few restrictions on usage:
  • Type instantiations can be primitive (only C#) or class, e.g.

    Set<int>    Dict<string,List<float>>    Pair<DateTime, MyClass>

  • Generic methods of all kinds (static, instance, virtual); virtual generic methods only in GC#!
  • Inheritance through instantiated types, e.g.

    class Set<T> : IEnumerable<T>
    class FastIntSet : Set<int>
More on generic methods
• Generic methods are similar to template methods in C++
• As in C++, JG tries to infer the type parameters from the method invocation
• C# requires specifying the type arguments
• Example:

  C++:
    template <class T> T sqr(T x) { return x*x; }
    std::cout << sqr(2.0) << std::endl;

  JG:
    class F { static <T> void sort(T[] a) {…} }
    String[] s;
    F.sort(s);

  C#:
    class F { static void sort<T>(T[] a) {…} }
    string[] s;
    F.sort<string>(s);
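The inference described above can be tried in today's Java. This is a minimal sketch, not taken from the slides: the class InferenceDemo and the methods firstOf and maxOf are hypothetical names chosen for illustration.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class; none of these names come from the slides.
class InferenceDemo {
    // Generic method: the type parameter T is declared before the return type.
    static <T> T firstOf(List<T> items) {
        return items.get(0);
    }

    // Generic method with a bound, so compareTo may be called on T.
    static <T extends Comparable<T>> T maxOf(List<T> items) {
        T best = items.get(0);
        for (T item : items) {
            if (item.compareTo(best) > 0) {
                best = item;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("pear", "apple", "zebra");
        String inferred = firstOf(words);                        // T inferred as String
        String explicit = InferenceDemo.<String>firstOf(words);  // type argument written out
        System.out.println(inferred + " " + explicit + " " + maxOf(words));
    }
}
```

The explicit form is rarely needed in Java; it is shown only to mirror the C# call on the slide.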
Generic Stack

    class Stack<T> {
        private T[] items;
        private int nitems;
        Stack() { nitems = 0; items = new T[50]; }
        T Pop() {
            if (nitems == 0) throw new Empty();   // Empty: exception class assumed to be defined elsewhere
            return items[--nitems];
        }
        bool IsEmpty() { return (nitems == 0); }
        void Push(T item) {
            if (items.Length == nitems) {         // array is full: double its capacity
                T[] temp = items;
                items = new T[nitems*2];
                Array.Copy(temp, items, nitems);
            }
            items[nitems++] = item;
        }
    }

How does the compiler check the definition?
Tip
• C++ (before C++11) requires a space in nested parameter types: vector<vector<int> > to avoid ambiguity with operator >>
• GJ (and C#) fixed the problem with the following grammar:

    ReferenceType        ::= ClassOrInterfaceType | ArrayType | TypeVariable
    ClassOrInterfaceType ::= Name | Name < ReferenceTypeList1
    ReferenceTypeList1   ::= ReferenceType1 | ReferenceTypeList , ReferenceType1
    ReferenceType1       ::= ReferenceType > | Name < ReferenceTypeList2
    ReferenceTypeList2   ::= ReferenceType2 | ReferenceTypeList , ReferenceType2
    ReferenceType2       ::= ReferenceType >> | Name < ReferenceTypeList3
    ReferenceTypeList3   ::= ReferenceType3 | ReferenceTypeList , ReferenceType3
    ReferenceType3       ::= ReferenceType >>>
    TypeParameters       ::= < TypeParameterList1
    TypeParameterList1   ::= TypeParameter1 | TypeParameterList , TypeParameter1
    TypeParameter1       ::= TypeParameter > | TypeVariable extends ReferenceType2 | TypeVariable implements ReferenceType2
The semantic problem
• The C++ compiler cannot make assumptions about type parameters
• The only way to type-check a C++ class is to wait for the arguments to be specified (instantiation): only then is it possible to check the operations used (e.g. the comp method in sorting)
• From the standpoint of the semantic module of the C++ compiler, no type is parametric
Checking class definition
• To be able to type-check a parametric class just by looking at its definition, we introduce the notion of bound
• Just as method arguments have a type, type arguments are bound to other types
• The compiler allows values of such types to be used as if they were upcast to the bound
• Example: class Vector<T : Sortable>
• Elements of the vector should implement (or inherit from) Sortable
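As a rough illustration of a bound in current Java syntax (extends instead of the ":" used on the slide), the sketch below uses the standard Comparable interface in place of Sortable. BoundDemo and SortedPair are hypothetical names.

```java
// Hypothetical demo class; Comparable stands in for the slide's Sortable.
class BoundDemo {
    // T is bounded: every type argument must implement Comparable<T>.
    static final class SortedPair<T extends Comparable<T>> {
        final T low;
        final T high;

        SortedPair(T a, T b) {
            // compareTo is accepted here only because the bound guarantees it;
            // the class is checked once, without looking at any instantiation.
            if (a.compareTo(b) <= 0) {
                low = a;
                high = b;
            } else {
                low = b;
                high = a;
            }
        }
    }

    public static void main(String[] args) {
        SortedPair<Integer> p = new SortedPair<>(7, 3);
        System.out.println(p.low + " " + p.high);
    }
}
```

Removing the bound makes the call to compareTo a compile-time error in the definition itself, which is the difference from the C++ approach.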
Example

    interface Sortable<T> { int compareTo(T a); }

    class Vector<T : Sortable<T>> {
        T[] v;
        int sz;
        Vector() { sz = 0; v = new T[15]; }
        void addElement(T e) {…}
        void sort() {
            …
            if (v[i].compareTo(v[j]) > 0)
            …
        }
    }

• Not possible in plain Java (without generics), because Sortable is an interface and the type T is lost
• The compiler can type-check this because v contains values that implement Sortable<T>
Pros and Cons
• A parameterized type is checked even if no instantiation is present
• Assumptions on type parameters are always explicit (if no bound is specified, Object is assumed)
• Is it possible to make assumptions beyond the bound?
• Yes, you can always cheat by upcasting to Object and then casting to whatever you want:

    class Foo<T : Button> {
        void foo(T b) {
            String s = (String)(Object)b;
        }
    }

• Still, the assumption made by the programmer is explicit
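The cheat can be reproduced in Java. In this sketch Button is a hypothetical empty class standing in for a GUI button, and CheatDemo, Foo and cheatFails are illustrative names. The double cast compiles, and the wrong assumption surfaces only at run time.

```java
// Hypothetical demo class; Button is a stand-in, not a real GUI class.
class CheatDemo {
    static class Button { }

    static class Foo<T extends Button> {
        String cheat(T b) {
            // A direct cast (String) b is rejected by the compiler;
            // going through Object is accepted and checked at run time.
            return (String) (Object) b;
        }
    }

    // Returns true if the cheat is caught at run time.
    static boolean cheatFails() {
        try {
            new Foo<Button>().cheat(new Button());
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(cheatFails());
    }
}
```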
Implementation
• Alternative implementations of parametric polymorphism:
  • C++ generates an abstract syntax tree for methods and classes
  • GJ implements generic types at compile time: the JVM is not aware of parametric types
  • C# assumes that the CLR is aware of parametric types: the IL has been extended with generic instructions to handle type parameters
Java Generics strategy
• JG is an extension of Java
• The compiler verifies that generic types are used correctly
• Type parameters are dropped and the bound is used instead; downcasts are inserted in the right places
• The output is a normal class file, unaware of parametric polymorphism
Example

  Generic source:

    class Vector<T> {
        T[] v;
        int sz;
        Vector() { v = new T[15]; sz = 0; }
        <U implements Comparer<T>> void sort(U c) {
            …
            c.compare(v[i], v[j]);
            …
        }
    }
    …
    Vector<Button> v;
    v.addElement(new Button());
    Button b = v.elementAt(0);

  Translation produced by the compiler:

    class Vector {
        Object[] v;
        int sz;
        Vector() { v = new Object[15]; sz = 0; }
        void sort(Comparer c) {
            …
            c.compare(v[i], v[j]);
            …
        }
    }
    …
    Vector v;
    v.addElement(new Button());
    Button b = (Button)v.elementAt(0);   // downcast inserted by the compiler
Wildcard

    class Pair<X,Y> {
        X first;
        Y second;
    }

    public String pairString(Pair<?, ?> p) {
        return p.first + ", " + p.second;
    }
Expressivity vs. efficiency
• JG doesn't improve execution speed, though it expresses genericity better than inheritance does
• Major limitation in JG expressivity: exact type information is lost at runtime
• All instantiations of a generic type collapse to the same class
• Consequences: no virtual generic methods, and pathological situations
• Benefit: existing Java classes can be seen as generic types, which allows reuse of the large existing codebase
• JG isn't the only implementation of generics for Java
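The collapse of all instantiations into a single class can be observed directly. A small sketch using the standard ArrayList follows; ErasureDemo and sameRuntimeClass are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class showing that type arguments are absent at run time.
class ErasureDemo {
    static boolean sameRuntimeClass() {
        List<String> strings = new ArrayList<>();
        List<Integer> numbers = new ArrayList<>();
        // Both objects are instances of the one class ArrayList.
        return strings.getClass() == numbers.getClass();
    }

    public static void main(String[] args) {
        System.out.println(sameRuntimeClass());
        System.out.println(new ArrayList<String>().getClass().getName());
    }
}
```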
Generics and Java System Feature
Problem with JG

    Stack<String> s = new Stack<String>();
    s.push("Hello");
    Stack<Object> o = s;
    Stack<Button> b = (Stack<Button>)o;   // cast authorized
    Button mb = b.pop();                  // class cast exception

• Cast authorized: both Stack<String> and Stack<Button> map to class Stack
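A runnable variant of this situation in current Java is sketched below. It makes some assumptions: ArrayDeque replaces the slide's Stack, the upcast goes through Object (Java rejects a direct assignment of a stack of String to a stack of Object), and Button is a hypothetical empty class. The unchecked cast succeeds, and the failure appears only when an element is read.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical demo class; Button is a stand-in, not a real GUI class.
class PollutionDemo {
    static class Button { }

    @SuppressWarnings("unchecked")
    static boolean popFailsAsButton() {
        Deque<String> s = new ArrayDeque<>();
        s.push("Hello");
        Object o = s;                          // the type argument is forgotten
        Deque<Button> b = (Deque<Button>) o;   // unchecked cast: only Deque is checked
        try {
            Button mb = b.pop();               // compiler-inserted downcast fails here
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(popFailsAsButton());
    }
}
```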
Generic C# Strategy: GCLR
• Kennedy and Syme have extended the CLR to support parametric types (the same proposal has been made for PolyJ by Cartwright and Steele)
• In IL, placeholders are used to indicate type arguments (!0, !1, …)
• The verifier, JIT and loader have been changed
• When the program needs an instantiation of a generic type, the loader generates the appropriate type
• The JIT can share the implementation of reference instantiations (Stack<String> has essentially the same code as Stack<Object>)
Generic C# compiler
• The GC# compiler implements a JG-like notation for parametric types
• Bounds are the same as in JG
• NO type inference on generic methods: the type must be specified in the call
• The compiler relies on GCLR to generate the code
• Exact runtime types are guaranteed by the CLR, so virtual generic methods are allowed
• All type constructors can be parameterized: structs, classes, interfaces and delegates
Example

    using System;
    namespace n {
        public class Foo<T> {
            T[] v;
            Foo() { v = new T[15]; }
            public static void Main(string[] args) {
                Foo<string> f = new Foo<string>();
                f.v[0] = "Hello";
                string h = f.v[0];
                Console.Write(h);
            }
        }
    }

  Generated IL for the field and the constructor:

    .field private !0[] v
    .method private hidebysig specialname rtspecialname
            instance void .ctor() cil managed
    {
        .maxstack 2
        ldarg.0
        call instance void [mscorlib]System.Object::.ctor()
        ldarg.0
        ldc.i4.s 15
        newarr !0
        stfld !0[] class n.Foo<!0>::v
        ret
    } // end of method Foo::.ctor
Performance
• The idea of extending the CLR with generic types seems good, but what about performance?
• Although instantiation is performed at load time, the overhead is minimal
• Moreover, code sharing reduces the number of instantiations, improving execution speed
• A technique based on dictionaries is employed to keep track of previously instantiated types
Expressive power of Generics
• System F is a typed λ-calculus with polymorphic types
• While Turing-equivalence is a trivial property of programming languages, being equivalent to System F is not a trivial property of a type system
• Polymorphic languages such as ML and Haskell cannot fully express System F (both languages have been extended to fill the gap)
• System F can be transposed into C#: http://www.cs.kun.nl/~erikpoll/ftfjp/2002/KennedySyme.pdf
Reminder: substitutivity
• Subtyping/subclassing defines the class relation "B is a subtype of A", written B <: A
• According to the substitution principle, if B <: A, then an instance of B can be substituted for an instance of A
• Therefore, it is legal to assign an instance b of B to a variable of type A:

    A a = b;
Generics and Subtyping
• Do the rules for subtypes and assignment work for generics? If B <: A, is G<B> <: G<A>?
• Counterexample:

    List<String> ls = new List<String>();
    List<Object> lo = ls;     // Since String <: Object, so far so good.
    lo.add(new Object());
    String s = ls.get(0);     // Error!

• The rule B <: A ⇒ G<B> <: G<A> defies the principle of substitution!
Other example

    class B extends A { … }
    class G<E> { public E e; }

    G<B> gb = new G<B>();
    G<A> ga = gb;
    ga.e = new A();
    B b = gb.e;     // Error!

• Given B <: A, and assuming G<B> <: G<A>, then G<A> ga = gb; would be legal
• Actually, the type is erased
Bounded Wildcard
• A wildcard alone does not allow doing much
• To provide operations with wildcard types, one can specify bounds:
  • Upper bound: the ancestor of the unknown type: G<? extends X>
  • Lower bound: the descendant of the unknown type: G<? super Y>
Bounded Wildcards Subtyping Rules
For any B such that B <: A:
• G<B> <: G<? extends A>
• G<A> <: G<? super B>
Bounded Wildcards - Example

    G<A> ga = new G<A>();
    G<B> gb = new G<B>();
    G<? extends A> gea = gb;   // G<B> <: G<? extends A>, hence legal
    A a = gea.e;               // can read from
    G<? super B> gsb = ga;     // G<A> <: G<? super B>, hence legal
    gsb.e = new B();           // can write to
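The same two rules, wrapped into methods so they can be run. This is only a sketch: the classes A, B and G mirror the slide, while WildcardDemo and the helper methods read and write are hypothetical names.

```java
// Hypothetical demo class; A, B and G mirror the names used on the slide.
class WildcardDemo {
    static class A { }
    static class B extends A { }

    static class G<E> {
        E e;
        G(E e) { this.e = e; }
    }

    // Upper bound: whatever the unknown type is, its values can be read as A.
    static A read(G<? extends A> source) {
        return source.e;
    }

    // Lower bound: whatever the unknown type is, a B can be stored into it.
    static void write(G<? super B> target, B value) {
        target.e = value;
    }

    public static void main(String[] args) {
        G<B> gb = new G<>(new B());
        G<A> ga = new G<>(new A());
        A a = read(gb);        // G<B> <: G<? extends A>
        write(ga, new B());    // G<A> <: G<? super B>
        System.out.println((a instanceof B) + " " + (ga.e instanceof B));
    }
}
```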
Generics and Polymorphism

    class Shape { void draw() {…} }
    class Circle extends Shape { void draw() {…} }
    class Rectangle extends Shape { void draw() {…} }

    public void drawAll(Collection<Shape> shapes) {
        for (Shape s: shapes)
            s.draw();
    }

• Does not work. Why?
• Cannot be used on Collection<Circle>
Bounded Polymorphism
• Bound the wildcard: replace the type Collection<Shape> with Collection<? extends Shape>:

    public void drawAll(Collection<? extends Shape> shapes) {
        for (Shape s: shapes)
            s.draw();
    }

• Now drawAll() will accept collections of any subclass of Shape
• The ? stands for an unknown subtype of Shape
• The type Shape is the upper bound of the wildcard
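A runnable version of drawAll, under one assumption that departs from the slide: draw() returns a String instead of printing, so the result can be inspected. ShapesDemo is a hypothetical wrapper class.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

// Hypothetical demo class; draw() returns a String here (assumption for testability).
class ShapesDemo {
    static abstract class Shape { abstract String draw(); }
    static class Circle extends Shape { String draw() { return "circle"; } }
    static class Rectangle extends Shape { String draw() { return "rectangle"; } }

    // Accepts a collection of Shape or of any subtype of Shape.
    static List<String> drawAll(Collection<? extends Shape> shapes) {
        List<String> drawn = new ArrayList<>();
        for (Shape s : shapes) {
            drawn.add(s.draw());
        }
        return drawn;
    }

    public static void main(String[] args) {
        List<Circle> circles = Arrays.asList(new Circle(), new Circle());
        List<Shape> mixed = Arrays.asList(new Circle(), new Rectangle());
        System.out.println(drawAll(circles));   // a List<Circle> is accepted
        System.out.println(drawAll(mixed));
    }
}
```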
Bounded Wildcard
• There is a problem when using wildcards:

    public void addCircle(Collection<? extends Shape> shapes) {
        shapes.add(new Circle());
    }

• What will happen? Why?
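The slide leaves the question open. In current Java the call to add is rejected at compile time, because the unknown type might be, for instance, Rectangle. One possible way out, sketched here under the assumption that the method only needs to write, is a lower bound. AddCircleDemo is a hypothetical name, and Shape and Circle are reduced to empty classes.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

// Hypothetical demo class; Shape and Circle are empty stand-ins.
class AddCircleDemo {
    static class Shape { }
    static class Circle extends Shape { }

    // Any collection whose element type is a supertype of Circle can receive a Circle.
    static void addCircle(Collection<? super Circle> shapes) {
        shapes.add(new Circle());
    }

    public static void main(String[] args) {
        List<Shape> shapes = new ArrayList<>();
        addCircle(shapes);
        System.out.println(shapes.size());
    }
}
```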
To wildcard or not to wildcard?
• That is the question:

  With wildcards:

    interface Collection<E> {
        public boolean containsAll(Collection<?> c);
        public boolean addAll(Collection<? extends E> c);
    }

  With generic methods:

    interface Collection<E> {
        public <T> boolean containsAll(Collection<T> c);
        public <T extends E> boolean addAll(Collection<T> c);
    }
Lower Bound Example

    interface Sink<T> {
        void flush(T t);
    }

    public <T> T flushAll(Collection<T> col, Sink<T> sink) {
        T last = null;             // must be initialized before the loop
        for (T t: col) {
            last = t;
            sink.flush(t);
        }
        return last;
    }
Lower Bound Example (2)

    Sink<Object> s;
    Collection<String> cs;
    String str = flushAll(cs, s);   // Error!
Lower Bound Example (3)

    public <T> T flushAll(Collection<? extends T> col, Sink<T> sink) { … }
    …
    String str = flushAll(cs, s);   // Error!

• T is now solvable as Object, but it is not the correct type: it should be String
Lower Bound Example (4)

    public <T> T flushAll(Collection<T> col, Sink<? super T> sink) { … }
    …
    String str = flushAll(cs, s);   // OK!
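Put together, the final signature can be exercised as follows. This is a sketch: FlushDemo is a hypothetical wrapper class, flushAll is made static, and the sink simply appends what it receives to a list.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

// Hypothetical demo class wrapping the slide's Sink and flushAll.
class FlushDemo {
    interface Sink<T> {
        void flush(T t);
    }

    // T is fixed by the collection; the sink may accept T or any supertype of T.
    static <T> T flushAll(Collection<T> col, Sink<? super T> sink) {
        T last = null;
        for (T t : col) {
            last = t;
            sink.flush(t);
        }
        return last;
    }

    public static void main(String[] args) {
        List<Object> received = new ArrayList<>();
        Sink<Object> s = received::add;               // a sink of Object
        Collection<String> cs = Arrays.asList("a", "b", "c");
        String str = flushAll(cs, s);                 // T is String, so the result is a String
        System.out.println(str + " " + received.size());
    }
}
```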
Combining generics and inheritance
• The inheritance relation must be extended with a new subtyping rule: given

    class C<T1,...,Tn> extends B

  we have

    C<t1,...,tn> <: B[t1/T1, ..., tn/Tn]

• Can now cast up and down to Object safely
• Note: types must be substituted because the superclass can be parametric
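An instance of the rule in Java, as a sketch with hypothetical names (SubstitutionDemo, Box, ListBox): with class ListBox<T> extends Box<List<T>>, substituting String for T gives ListBox<String> <: Box<List<String>>.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; Box plays the role of B and ListBox the role of C.
class SubstitutionDemo {
    static class Box<U> {
        U content;
    }

    // The superclass is itself parametric and mentions T.
    static class ListBox<T> extends Box<List<T>> {
        void add(T item) {
            if (content == null) {
                content = new ArrayList<>();
            }
            content.add(item);
        }
    }

    // Expects the superclass type with String substituted for T.
    static int size(Box<List<String>> box) {
        return box.content.size();
    }

    public static void main(String[] args) {
        ListBox<String> names = new ListBox<>();
        names.add("x");
        System.out.println(size(names));   // ListBox<String> is accepted as Box<List<String>>
    }
}
```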
Manipulating types
• Grouping values into types has helped us to build better compilers
• Could we do the same with types?
• Types can be grouped by means of inheritance, which represents the union of type sets
• Parametric types combined with inheritance allow expressing functions on types:

    class Stack<T:object> : Container

  Here Stack is the function name, <T:object> the function arguments, and Container the result type.
Example: generic containers

    class Row<T : Control> : Control
    { /* row of graphic controls */ }

    class Column<T : Control> : Control
    { /* column of graphic controls */ }

    class Table<T : Control> : Row<Column<T>>
    { /* table of graphic controls */ }
    …
    // It generates the keypad of a calculator
    Table<Button> t = new Table<Button>(3, 3);
    for (int i = 0; i < 3; i++)
        for (int j = 0; j < 3; j++)
            t[i, j].Text = (i * 3 + j + 1).ToString();