On Decidability of Nominal Subtyping with Variance
Andrew Kennedy (Microsoft Research Cambridge), Benjamin Pierce (University of Pennsylvania)
Compiler demo
• Java 1.6
    class N<Z> { }
    class C<X> extends N<N<? super C<C<X>>>> {
      N<? super C<Object>> cast(C<Object> c) { return c; }
    }
• Scala 2.3.1
    class N[-Z]
    class C extends N[N[C]] { def cast(c:C): N[C] = c }
• .NET 2.0
    .class interface N<-Z> { }
    .class C implements class N<class N<class C>> {
      .method static class N<class C> cast(class C c) cil managed {
        .maxstack 1
        ldarg.0
        ret
      }
    }
Run compilers!
Two features in common
• Generic inheritance
    class C<X,Y> extends D<E<X>> implements I<Y>                         (Java)
    class C[X,Y] extends D[E[X]] with I[Y]                               (Scala)
    .class C<X,Y> extends class D<class E<!X>> implements class I<!Y>    (.NET)
• Generic variance
    interface Func<X,Y> ... Func<? super C, ? extends D> ...             (Java)
    trait Func[-X,+Y]                                                    (Scala)
    .class interface Func<-X,+Y>                                         (.NET)
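Generic inheritance
• An inheritance declaration has the form C<X1,...,Xn> <:: V, …, where
  • <:: is short for “extends or implements”
  • C is a generic class name
  • X1,...,Xn are its formal type parameters
  • V, … are the supertypes, which may use X1,...,Xn
• Syntax-directed subtyping rule (Super), with side condition C ≠ D:
  $$\frac{C<X_1,\ldots,X_n> <:: V \qquad V[T_1/X_1,\ldots,T_n/X_n] <: D<U_1,\ldots,U_n>}{C<T_1,\ldots,T_n> <: D<U_1,\ldots,U_n>}\;(\text{Super}) \qquad C \neq D$$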
Generic variance
• Variance declaration e.g. C<+X,-Y,Z> <:: ...
• Subtyping rule (Var), for a class declared as C<v1 X1,...,vn Xn>, where each variance annotation vi is one of +, -, ±:
  $$\frac{\forall i,\ T_i <:_{v_i} U_i}{C<T_1,\ldots,T_n> <: C<U_1,\ldots,U_n>}\;(\text{Var})$$
  The subtyping direction is taken from vi:
  • T <:+ U means T <: U
  • T <:- U means U <: T
  • T <:± U means T = U
• Java has wildcards, a.k.a. use-site variance. These can mimic declaration-site variance, e.g.
  $$\frac{T <: U \qquad U' <: T'}{C<?\ \mathrm{extends}\ T,\ ?\ \mathrm{super}\ T',\ V> <: C<?\ \mathrm{extends}\ U,\ ?\ \mathrm{super}\ U',\ V>}$$
What goes wrong? Example 1
    class N<-X>
    class C <:: NNC          (NNC is short for N<N<C>>)
Question: C <: NC ?
• (by inheritance rule) NNC <: NC ?
• (by variance rule) C <: NC ?
Oops. We’re back where we started.
What goes wrong? Example 2
    class N<-X>
    class C<Y> <:: NNCCY     (NNCCY is short for N<N<C<C<Y>>>>)
Question: CA <: NCB ?
• (by inheritance rule) NNCCA <: NCB ?
• (by variance rule) CB <: NCCA ?
• (by inheritance rule) NNCCB <: NCCA ?
• (by variance rule) CCA <: NCCB ?
Oops. Types are growing forever...
Even when it goes right... Example 3
    class N<-X>
    class C0<Y> <:: NNY
    class C1<Y> <:: C0C0Y
    ...
    class Cn<Y> <:: C_{n-1}C_{n-1}Y
Question: CnNA <: NCnA ?
Answer: yes, by a derivation that uses 2^(n+1) instances of variance.
Research outline
• Start with the “essence of generic Java/Scala/.NET subtyping”:
  • ground subtyping only (future: open types with bounds on type parameters)
  • declaration-site variance (the same issues, and more, arise in wildcards/use-site variance)
• Investigate the algorithmics of subtyping.
  • Presentations of Java-style subtyping are typically declarative, e.g. FGJ, Wild FJ, Viroli/Igarashi Variant FGJ
  • So the first step is to present syntax-directed (a.k.a. algorithmic) rules and prove transitivity; equivalence of the declarative and algorithmic systems follows
  • Not trivial: see the Appendix of the paper for the proof
Start with General Problem
• Just two restrictions on inheritance:
  • Acyclicity: if C<T> <:: ... <:: D<U> then C ≠ D
  • Variance-respecting: e.g. C<+X> <:: N<X> is illegal if N is contravariant
Theorem. Subtyping is undecidable.
• (Java, Scala, and .NET all impose further restrictions on inheritance, so this result does not transfer directly to them)
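Post Correspondence Problem
• Given a sequence of pairs (u1,v1),...,(un,vn) of words over a finite alphabet, find an index sequence i1,...,im such that ui1...uim = vi1...vim
Example problem: (table of word pairs not preserved in this transcript)
Solution: index sequence 1 2 3 1 4, which spells abcaaabc from the u words and abcaaabc from the v words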
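To make the problem statement concrete, here is a minimal brute-force sketch in Java. It is only an illustration: the class name `PcpSearch`, the length bound `maxLen`, and the three-pair instance in `main` are assumptions for this example (the instance is a standard textbook one, not the example from the slide). Because PCP is undecidable, any such search has to be cut off at some bound.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

class PcpSearch {
    /**
     * Breadth-first search for a 1-based index sequence of length <= maxLen
     * whose u-concatenation equals its v-concatenation; null if none is found.
     * Assumes all words are non-empty.
     */
    static List<Integer> solve(String[] u, String[] v, int maxLen) {
        Deque<List<Integer>> queue = new ArrayDeque<>();
        queue.add(new ArrayList<>());
        while (!queue.isEmpty()) {
            List<Integer> seq = queue.remove();
            if (seq.size() == maxLen) continue;
            for (int i = 0; i < u.length; i++) {
                List<Integer> next = new ArrayList<>(seq);
                next.add(i + 1);
                String top = concat(u, next);
                String bottom = concat(v, next);
                if (top.equals(bottom)) return next;
                // Keep only candidates where one side is still a prefix of the other.
                if (top.startsWith(bottom) || bottom.startsWith(top)) queue.add(next);
            }
        }
        return null;
    }

    static String concat(String[] words, List<Integer> seq) {
        StringBuilder sb = new StringBuilder();
        for (int i : seq) sb.append(words[i - 1]);
        return sb.toString();
    }

    public static void main(String[] args) {
        // Hypothetical instance: pairs (a, baa), (ab, aa), (bba, bb).
        String[] u = {"a", "ab", "bba"};
        String[] v = {"baa", "aa", "bb"};
        System.out.println(solve(u, v, 6)); // prints [3, 2, 3, 1]
    }
}
```

A `null` result only means that no solution exists within the bound; it says nothing about longer index sequences.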
Undecidability of subtyping: proof
• The Post Correspondence Problem is undecidable
• Reduce an instance of PCP to an instance of subtyping under some inheritance declarations
• Represent the letters of the alphabet by unary generic classes, and define a non-generic class E for “end-of-word”
    class a<X>
    class b<X>
    class c<X>
    class E
• Words are represented by repeated type application: abca is represented as a<b<c<a<E>>>>
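The word representation can be sketched as a small Java helper. The class name `WordEncoding` and the method `encode` are hypothetical names chosen for this illustration; the helper only produces the textual form of the type.

```java
class WordEncoding {
    /** Encodes a word as nested unary type applications ending in the end-of-word class E. */
    static String encode(String word) {
        StringBuilder sb = new StringBuilder();
        for (char letter : word.toCharArray()) sb.append(letter).append('<');
        sb.append('E');
        for (int i = 0; i < word.length(); i++) sb.append('>');
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(encode("abca")); // prints a<b<c<a<E>>>>
    }
}
```

Under this encoding, prepending a word u to an accumulated word X corresponds to wrapping the type for X in the letters of u, which is what supertypes of the form C<uX, vY> rely on.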
Undecidability of subtyping: proof
• The state of the search for a solution is encoded by the subtype problem
      C<u,v> <: NC<u,v> ?
  where u and v are the currently-accumulated words, N is contravariant, and the choice of next word is encoded by multiple supertypes of C. Class B is used to choose the very first word. All Ni are contravariant, S is invariant.
    class C<X,Y> <:: NN1C<u1X, v1Y>
                 <:: N1NC<u1X, v1Y>
                 ...
                 <:: NNnC<unX, vnY>
                 <:: NnNC<unX, vnY>
                 <:: NSX
                 <:: SY
    class B      <:: NN1C<u1E, v1E>
                 <:: N1NC<u1E, v1E>
                 ...
                 <:: NNnC<unE, vnE>
                 <:: NnNC<unE, vnE>
• It turns out that B <: NB iff ui1...uim = vi1...vim for some i1,...,im
Example
    class C<X,Y> <:: NN1C<aX,abY>      (C1)
                 <:: N1NC<aX,abY>      (C1’)
                 ...
                 <:: NN4C<abcX,cY>     (Cn)
                 <:: N4NC<abcX,cY>     (Cn’)
                 <:: NSX               (L)
                 <:: SY                (R)
Steps of subtyping derivation:
• B <: NB
• …
• C<bcaaabcE,caaabcE> <: NC<bcaaabcE,caaabcE>
• (by C1) NN1C<abcaaabcE,abcaaabcE> <: NC<bcaaabcE,caaabcE>
• (by Var) C<bcaaabcE,caaabcE> <: N1C<abcaaabcE,abcaaabcE>
• (by C1’) N1NC<abcaaabcE,abcaaabcE> <: N1C<abcaaabcE,abcaaabcE>
• (by Var) C<abcaaabcE,abcaaabcE> <: NC<abcaaabcE,abcaaabcE>
• (by L) NSabcaaabcE <: NC<abcaaabcE,abcaaabcE>
• (by Var) C<abcaaabcE,abcaaabcE> <: SabcaaabcE
• (by R) SabcaaabcE <: SabcaaabcE
• (by reflexivity) QED.
Ingredients of undecidability
• Contravariance (used to send a term to the other side and back again)
• Unbounded growth in the size of the subtype assertion (used to accumulate the concatenation of words)
• Multiple instantiation inheritance (used to encode the choice of words)
• Idea: investigate the contribution of each of these ingredients by eliminating them, one at a time.
Ingredient 1: contravariance
• Theorem. If no parameters are contravariant, then subtyping is decidable.
• Proof. Define a well-founded order on subtype assertions:
  $$(T_1 <: U_1) < (T_2 <: U_2) \iff \mathit{size}(U_1) < \mathit{size}(U_2) \ \text{ or } \ \big(\mathit{size}(U_1) = \mathit{size}(U_2) \text{ and } T_2 <::^+ T_1\big)$$
  This order decreases from conclusion to premises in the subtyping rules, so the rule-based algorithm terminates.
Ingredient 2: unbounded growth of types
Definitions.
• A set of types S is inheritance closed if the following conditions hold:
  • Inheritance: if $T \in S$ and $T <:: U$ then $U \in S$, and
  • Decomposition: if $C<T_1,\ldots,T_n> \in S$ then $T_1,\ldots,T_n \in S$
• The inheritance closure of a set is the least superset that is inheritance closed.
• Class declarations are finitary if the inheritance closure of any finite set of types is finite.
Example 2.
    class N<-X>
    class C<Y> <:: NNCCY
The inheritance closure of { CA } is infinite: it includes { A, CA, CCA, CCCA, ... }, so these declarations are not finitary.
Theorem. For finitary inheritance, subtyping is decidable.
Proof. The algorithm simply maintains a list of “visited” goals to detect cycles. As the inheritance closure is finite, the algorithm explores only a finite set of types, and hence terminates.
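The visited-goals idea from the proof can be sketched in Java as follows. This is a minimal illustration, not the authors' implementation: the names `SubtypeChecker`, `Ty`, `Decl`, `declare` and `isSubtype` are invented here, type parameters are assumed to have single-letter names distinct from all class names, variances are written '+', '-' and '=' (invariant), and a goal that repeats on the current search path is treated as a failed branch, on the assumption that subtyping is defined by finite derivations. On infinitary declarations such as Example 2 this sketch would still diverge.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class SubtypeChecker {
    /** A class name applied to arguments; a nullary name may also be a type variable. */
    static final class Ty {
        final String name;
        final List<Ty> args;
        Ty(String name, Ty... args) { this.name = name; this.args = Arrays.asList(args); }
        /** Replaces type variables bound in env by their instantiations. */
        Ty subst(Map<String, Ty> env) {
            if (args.isEmpty() && env.containsKey(name)) return env.get(name);
            Ty[] out = new Ty[args.size()];
            for (int i = 0; i < out.length; i++) out[i] = args.get(i).subst(env);
            return new Ty(name, out);
        }
        @Override public String toString() { return args.isEmpty() ? name : name + args; }
    }

    /** A declaration: one variance and one single-letter name per parameter, plus supertypes. */
    static final class Decl {
        final String variances;
        final String params;
        final Ty[] supers;
        Decl(String variances, String params, Ty[] supers) {
            this.variances = variances; this.params = params; this.supers = supers;
        }
    }

    final Map<String, Decl> table = new HashMap<>();

    static Ty ty(String name, Ty... args) { return new Ty(name, args); }

    void declare(String name, String variances, String params, Ty... supers) {
        table.put(name, new Decl(variances, params, supers));
    }

    boolean isSubtype(Ty t, Ty u) { return check(t, u, new HashSet<>()); }

    private boolean check(Ty t, Ty u, Set<String> onPath) {
        String goal = t + " <: " + u;
        if (!onPath.add(goal)) return false;   // goal repeats on this path: give up on this branch
        try {
            Decl d = table.get(t.name);
            if (t.name.equals(u.name)) {       // rule (Var)
                for (int i = 0; i < t.args.size(); i++) {
                    char v = d.variances.charAt(i);
                    Ty a = t.args.get(i), b = u.args.get(i);
                    boolean ok = v == '+' ? check(a, b, onPath)
                               : v == '-' ? check(b, a, onPath)
                               : a.toString().equals(b.toString());
                    if (!ok) return false;
                }
                return true;
            }
            Map<String, Ty> env = new HashMap<>();   // rule (Super): try each declared supertype
            for (int i = 0; i < d.params.length(); i++) {
                env.put(String.valueOf(d.params.charAt(i)), t.args.get(i));
            }
            for (Ty s : d.supers) if (check(s.subst(env), u, onPath)) return true;
            return false;
        } finally {
            onPath.remove(goal);
        }
    }

    public static void main(String[] args) {
        SubtypeChecker ex1 = new SubtypeChecker();   // Example 1: class N<-X>, class C <:: NNC
        ex1.declare("N", "-", "X");
        ex1.declare("C", "", "", ty("N", ty("N", ty("C"))));
        System.out.println(ex1.isSubtype(ty("C"), ty("N", ty("C")))); // prints false
    }
}
```

For Example 1 the goal C <: NC comes back to itself after one inheritance step and one variance step, so the sketch answers false instead of looping.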
Characterizing infinitary inheritance
• Syntactic characterization (due to Viroli): create a type parameter dependency graph which represents uses of formal type parameters in inheritance declarations
  • Nodes are formal parameters
  • Non-expansive edges represent “naked” uses of type parameters
  • Expansive edges represent “nested” uses of type parameters
• Inheritance is infinitary iff a cycle contains an expansive edge
Example 1a (finitary; graph nodes X, Z)
    class N<-X>
    class D<Z> <:: NNDZ
Example 2 (infinitary; graph nodes W, Y)
    class N<-W>
    class C<Y> <:: NNCCY
Characterizing infinitary inheritance (continued)
Example 1a (finitary; graph nodes X, Z)
    class N<-X>
    class D<Z> <:: NNDZ
Example 2b (infinitary; graph nodes W, Y, V)
    class N<-W>
    class C<Y> <:: NNECY
    class E<V> <:: C<V>
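The graph test can be sketched in Java under one reading of the construction, which is an assumption here rather than a quotation of Viroli's definition: for every type D<...> occurring in a supertype of a class with parameter X, an argument that is exactly X gives a non-expansive edge from X to the corresponding formal parameter of D, and an argument that properly contains X gives an expansive edge. The names `DependencyGraph`, `declare` and `isInfinitary` are hypothetical, and parameter names are assumed to be single letters that are unique across all classes.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class DependencyGraph {
    static final class Ty {
        final String name;
        final List<Ty> args;
        Ty(String name, Ty... args) { this.name = name; this.args = Arrays.asList(args); }
        boolean mentions(String var) {
            if (args.isEmpty()) return name.equals(var);
            for (Ty a : args) if (a.mentions(var)) return true;
            return false;
        }
    }

    static Ty ty(String name, Ty... args) { return new Ty(name, args); }

    final Map<String, String> params = new HashMap<>();      // class -> single-letter formals
    final Map<String, List<Ty>> supers = new HashMap<>();    // class -> declared supertypes
    final Map<String, Set<String>> edges = new HashMap<>();  // all edges, expansive or not
    final List<String[]> expansive = new ArrayList<>();      // expansive edges only

    void declare(String name, String formals, Ty... declaredSupers) {
        params.put(name, formals);
        supers.put(name, Arrays.asList(declaredSupers));
    }

    /** Adds the edges contributed by occurrences of parameter x inside type t. */
    private void scan(Ty t, String x) {
        String formals = params.get(t.name);
        if (formals == null) return;                          // t is a type variable
        for (int j = 0; j < t.args.size(); j++) {
            Ty arg = t.args.get(j);
            String target = String.valueOf(formals.charAt(j));
            if (arg.args.isEmpty() && arg.name.equals(x)) {   // "naked" use
                edges.computeIfAbsent(x, k -> new HashSet<>()).add(target);
            } else if (arg.mentions(x)) {                     // "nested" use
                edges.computeIfAbsent(x, k -> new HashSet<>()).add(target);
                expansive.add(new String[] {x, target});
            }
            scan(arg, x);
        }
    }

    /** True if some cycle of the graph contains an expansive edge. */
    boolean isInfinitary() {
        edges.clear();
        expansive.clear();
        for (String cls : params.keySet())
            for (Ty s : supers.get(cls))
                for (char x : params.get(cls).toCharArray()) scan(s, String.valueOf(x));
        for (String[] e : expansive) if (reaches(e[1], e[0])) return true;
        return false;
    }

    private boolean reaches(String from, String to) {
        Deque<String> stack = new ArrayDeque<>();
        Set<String> seen = new HashSet<>();
        stack.push(from);
        while (!stack.isEmpty()) {
            String node = stack.pop();
            if (node.equals(to)) return true;
            Set<String> out = edges.get(node);
            if (seen.add(node) && out != null) for (String next : out) stack.push(next);
        }
        return false;
    }

    public static void main(String[] args) {
        DependencyGraph ex2b = new DependencyGraph();   // Example 2b
        ex2b.declare("N", "W");
        ex2b.declare("C", "Y", ty("N", ty("N", ty("E", ty("C", ty("Y"))))));
        ex2b.declare("E", "V", ty("C", ty("V")));
        System.out.println(ex2b.isInfinitary()); // prints true
    }
}
```

An expansive edge lies on a cycle exactly when its source is reachable from its target, which is what `isInfinitary` checks for each expansive edge in turn.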
Ingredient 3: multiple instantiation inheritance
• C# permits implementation of the same generic interface at different instantiations, e.g.
      class E : IEnumerator<int>, IEnumerator<string>
• Instantiations must be non-overlapping, e.g.
      class C<X> : I<X>, I<object>
      class D<Y,Z> : I<Y>, I<Z>
  are illegal.
• Java outlaws multiple instantiation inheritance (it can’t be implemented by type erasure)
• Question: does this make subtyping decidable?
No back-tracking
• In the absence of multiple instantiation inheritance, we have
  $$T <::^* C<U_1,\ldots,U_n> \;\wedge\; T <::^* C<V_1,\ldots,V_n> \;\Rightarrow\; \forall i,\ U_i = V_i$$
  (<::* is the reflexive transitive closure of single-step inheritance), i.e. instantiations are uniquely determined by inheritance.
• We can then reformulate subtyping so that derivations are unique; the algorithm can proceed without back-tracking. We combine inheritance and variance into a single rule (SuperVar), for a class declared as D<v1 X1,...,vn Xn>:
  $$\frac{T <::^* D<T_1,\ldots,T_n> \qquad \forall i,\ T_i <:_{v_i} U_i}{T <: D<U_1,\ldots,U_n>}\;(\text{SuperVar})$$
Accessibility
Example 2a
    class N<-X>
    class D<Z>
    class C<Y> <:: NNCDY
Question: CA <: NCB ?
• (by SuperVar) CB <: NCDA ?
• (by SuperVar) CDA <: NCDB ?
• (by SuperVar) CDB <: NCDDA ?
• (by SuperVar) CDDA <: NCDDB ?
• ...
Observation 1. Types A and B do not affect validity: they are “inaccessible”. In fact, everything underneath C is inaccessible. So by checking equivalence “up to accessibility”, we can detect looping.
Observation 2. The inaccessible region of the assertion grows unboundedly. The accessible region is bounded in size.
Characterizing accessibility
Example 2a
    class N<-X>
    class D<Z>
    class C<Y> <:: NNCDY
(Type parameter dependency graph over nodes X, Y, Z.)
Parameter Y is “expansive-recursive”: it appears in an expansive cycle in the type parameter dependency graph.
Instantiations of Y are inaccessible because
• Invariance of Y => the variance rule does not “uncover” an instantiation
• Recursion through Y => inheritance always instantiates Y with another type involving C (in more complex examples, in mutual recursion with C)
Definition
• C<T1,...,Tn> ~ D<U1,...,Un> (“equivalent up to accessibility”) when C=D and for each i, either the i’th parameter of C is expansive-recursive or Ti ~ Ui
• (T <: T’) ~ (U <: U’) when T~U and T’~U’
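The definition of equivalence up to accessibility translates almost directly into a recursive check. The Java sketch below is illustrative only: the names `Accessibility`, `markExpansiveRecursive`, `equivalent` and `equivalentGoals` are hypothetical, and the set of expansive-recursive parameters is assumed to be supplied by the caller (for example from a dependency-graph analysis) rather than computed here.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class Accessibility {
    static final class Ty {
        final String name;
        final List<Ty> args;
        Ty(String name, Ty... args) { this.name = name; this.args = Arrays.asList(args); }
    }

    static Ty ty(String name, Ty... args) { return new Ty(name, args); }

    // Entries like "C#0" record that parameter 0 of class C is expansive-recursive.
    final Set<String> expansiveRecursive = new HashSet<>();

    void markExpansiveRecursive(String cls, int index) {
        expansiveRecursive.add(cls + "#" + index);
    }

    /** T ~ U: same head class, and arguments equivalent except under expansive-recursive parameters. */
    boolean equivalent(Ty t, Ty u) {
        if (!t.name.equals(u.name) || t.args.size() != u.args.size()) return false;
        for (int i = 0; i < t.args.size(); i++) {
            if (expansiveRecursive.contains(t.name + "#" + i)) continue;  // inaccessible position
            if (!equivalent(t.args.get(i), u.args.get(i))) return false;
        }
        return true;
    }

    /** (T <: T') ~ (U <: U') when T ~ U and T' ~ U'. */
    boolean equivalentGoals(Ty t, Ty tPrime, Ty u, Ty uPrime) {
        return equivalent(t, u) && equivalent(tPrime, uPrime);
    }

    public static void main(String[] args) {
        Accessibility acc = new Accessibility();
        acc.markExpansiveRecursive("C", 0);   // parameter Y of C in Example 2a
        // Goals from Example 2a: CB <: NCDA and CDB <: NCDDA.
        boolean same = acc.equivalentGoals(
            ty("C", ty("B")), ty("N", ty("C", ty("D", ty("A")))),
            ty("C", ty("D", ty("B"))), ty("N", ty("C", ty("D", ty("D", ty("A"))))));
        System.out.println(same); // prints true
    }
}
```

Because everything underneath C is skipped, the growing goals of Example 2a compare as equivalent, which is how a checker could notice that the search is looping.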
Decidability argument
• Lemma. Suppose J1 ~ J2 for subtype judgments J1 and J2. If $J_1 \to J_1'$ then $J_2 \to J_2'$ for some $J_2'$ such that $J_1' \sim J_2'$.
  Corollary: if $J \to^+ J'$ and $J \sim J'$ then $J \to^\infty$, i.e. J has an infinite reduction sequence.
• Lemma. For a given set of inheritance declarations, there exists some bound k such that:
  $$\text{accessible-depth}(J) \le k \ \text{ and } \ J \to J' \;\Rightarrow\; \text{accessible-depth}(J') \le k$$
• Corollary. If all expansive-recursive parameters are invariant and used exactly once, then subtyping is decidable.
Discussion
• Subtyping in .NET is decidable
  • .NET outlaws infinitary inheritance to ensure termination of eager supertype loading
  • Our decidability result applies to ground subtyping; we believe it’s easy to extend the result to open subtyping with type parameter bounds
• Decidability of subtyping in Scala and Java is still open
  • It would be nice to generalize the last result to remove the variance/linearity restriction
  • This would imply decidability of Scala subtyping (we think)
  • Java wildcards are more complex: even the context can grow unboundedly