Scalable Pluggable Types Michael Ernst MIT Dagstuhl Seminar “Scalable Program Analysis” April 15, 2008
Scalability of type systems • Scalable can mean: • Big programs • Real languages, expressive programs • Type checking scales to big programs • Type checking sometimes scales to real languages • Not necessarily scalable: • The utility of a type system • Type inference
Evaluating type systems • Some common evaluation strategies: • Soundness proof for core calculus • Toy challenge examples • A type system is valuable only if it helps developers to find and prevent errors • This is difficult to establish! • Robust implementation • Case studies
Background • Javari type system for reference immutability • OOPSLA 2005 paper: 160KLOC case studies • IGJ type system for reference and object immutability • FSE 2007 paper: 106KLOC case studies • Case studies provided essential preliminary insight • Building compilers was tedious • Users are unlikely to use a special-purpose compiler
Goal: Better type system evaluation • Improve the evaluation of type systems • By making it easier • Legacy code, not just new code • In a real programming language (Java) • Handles type system refinements, not incompatible changes
Contributions • Programming language support for syntax • Framework for writing type checkers • 5 checkers written using the framework • Case studies enabled by the infrastructure • Insights about the type systems
Programming language support for syntax • Extension to Java annotation syntax List<@NonNull String> strings; myGraph = (@Immutable Graph) tmpGraph; class UnmodifiableList<T> implements @Readonly List<@Readonly T> {} • Carried through to classfile • Planned for inclusion in Java 7 • JSR 308: Annotations on Java types • Backward-compatible publicly-available implementation
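The three declarations on this slide put annotations in positions that Java 5 annotations cannot occupy. The following is a minimal sketch, assuming a compiler that accepts type annotations (the feature eventually shipped in Java 8 rather than Java 7). The annotations NonNull, Immutable, and Readonly are hypothetical stand-ins declared locally so that the file compiles on its own; they are not the ones distributed with the checkers. Without a pluggable checker, javac only parses them and attaches no meaning.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Target;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in qualifiers, declared here so the sketch is self-contained.
@Target(ElementType.TYPE_USE) @interface NonNull {}
@Target(ElementType.TYPE_USE) @interface Immutable {}
@Target(ElementType.TYPE_USE) @interface Readonly {}

class Graph {}

// Annotation on an implemented interface and on its type argument.
abstract class UnmodifiableList<T> implements @Readonly List<@Readonly T> {}

class TypeAnnotationSyntax {
    static int demo() {
        // Annotation on a generic type argument.
        List<@NonNull String> strings = new ArrayList<>();
        strings.add("a");

        // Annotation inside a cast.
        Graph tmpGraph = new Graph();
        Graph myGraph = (@Immutable Graph) tmpGraph;

        // The annotations do not change run-time behavior.
        return strings.size() + (myGraph == tmpGraph ? 1 : 0);
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```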
Checkers Framework for writing type checkers • Just override a few methods for special checks • Use Sun’s Tree API to access AST • Declarative syntax for common cases • Type qualifier hierarchy • Implicit types • Simple type systems require no code • But sophisticated type systems are possible • Handles both super- and sub-type qualifiers • Generic type inference • Flow-sensitive type inference • Plug-in to Java compiler (a.k.a. annotation processor)
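A type qualifier hierarchy is, at its core, a subtyping relation over qualifiers that the checker consults at every assignment. The sketch below is a toy model of that idea, not the Checkers Framework API: the names Qualifier, isSubtypeOf, and assignmentAllowed are hypothetical, and it assumes the usual nullness ordering in which @NonNull is a subtype of @Nullable.

```java
// Toy model of a two-element qualifier hierarchy; not the framework's API.
enum Qualifier {
    NULLABLE,   // top: the value may be null
    NON_NULL;   // bottom: the value is never null

    // NON_NULL is a subtype of NULLABLE, and every qualifier is a subtype of itself.
    boolean isSubtypeOf(Qualifier other) {
        return this == other || (this == NON_NULL && other == NULLABLE);
    }
}

class QualifierHierarchyDemo {
    // An assignment "lhs = rhs" is allowed when rhs's qualifier is a subtype of lhs's.
    static boolean assignmentAllowed(Qualifier lhs, Qualifier rhs) {
        return rhs.isSubtypeOf(lhs);
    }

    public static void main(String[] args) {
        // Storing a non-null value in a nullable variable is fine.
        System.out.println(assignmentAllowed(Qualifier.NULLABLE, Qualifier.NON_NULL));
        // Storing a possibly-null value in a non-null variable is rejected.
        System.out.println(assignmentAllowed(Qualifier.NON_NULL, Qualifier.NULLABLE));
    }
}
```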
Type checkers • Null dereferences (@NonNull, @Nullable) • Errors in equality testing (@Interned) • Reference immutability (Javari) • Reference & object immutability (IGJ) In several cases the first complete implementation (at least 3 previous tries for Javari), in other cases the best available.
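The equality errors targeted by @Interned arise when == compares two objects that are equal in value but are not the same object. A minimal sketch using String, where the class and method names are hypothetical; it shows only the run-time behavior that such a checker is meant to catch at compile time, not the checker itself.

```java
class InternedEquality {
    // Reference comparison: correct only when both arguments are interned.
    static boolean sameReference(String a, String b) {
        return a == b;
    }

    public static void main(String[] args) {
        String a = new String("node");
        String b = new String("node");

        // Equal contents, distinct objects: == gives the wrong answer.
        System.out.println(sameReference(a, b));
        System.out.println(a.equals(b));

        // After interning, both references denote the canonical copy.
        System.out.println(sameReference(a.intern(), b.intern()));
    }
}
```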
Case studies • Using all 4 checkers: • As of summer 2007 (see TR for details): • 360KLOC • 75 bugs verified by a human and fixed
Comparison with other tools • Checking a 4KLOC program • Type checking finds all the null pointer bugs • The other tools also find other bugs
Good defaults reduce user effort • Nullable default: • Dictated by backward compatibility • Many instances of @NonNull – too verbose • Flow-sensitive inference of NonNull for local variables • NonNull default (chosen by many designers) • Less verbose in signatures • Draws attention to exceptions rather than the rule • More annotations in method bodies • Our system: Non-null except locals (which are Nullable) • An alternative to full local type inference; similar effect • Natural for programmers (based on observation of case studies) • Ratio of number of annotations: [figure not preserved]
Usability of pluggable checkers • Scales to >200,000 LOC • Found bugs in every codebase • Few false positives • Programmers found the checkers easy to use • Is it too verbose? • @NonNull: 1 per 75 lines • @Interned: 124 annotations in 220KLOC revealed 11 bugs • Possible to annotate part of program • Fewer annotations in new code
Lessons learned: type systems • New approach to finding equality errors • Purely type system, fully backward compatible • Ambiguous rule in published description of Javari • Sound but unnecessarily imprecise rules in JQual • Uses of class, object, and reference immutability in the same program
Lessons learned: Polymorphism • Qualifier polymorphism is necessary • Javari has 3 types that are mutually incomparable • Qualifier polymorphism • No need to rewrite code to use generics • Generics are inadequate due to context sensitivity • Java generics • Wildcards extension makes collections more usable • Limited qualifier polymorphism dominates fully general • Inference is linear, not exponential • Found no examples (yet) requiring fully general • Supporting generics dominated all other problems • Not just engineering • Supporting arrays is a qualitative difference
Lessons learned: NonNull • New default scheme for qualifiers • Generalizes beyond NonNull • NonNull is not a good example • Even though it’s the most popular example! • Programmers overload null with much meaning • Many more application invariants • “Null” carries much semantic meaning • Each application invariant yields a false warning • Suppress with null checks or an annotation • Not clear how to fix (e.g., dependent types) • Requires flow sensitivity • Other checkers benefit much less
Lessons learned: using a checker • Annotations on receiver are useful (and distinct from method annotations) • Interned, Javari, IGJ • Integration • Tool support is critical for users • Some refused to use our tool without Eclipse support • Checker and compiler should be integrated but decoupled • Can focus on just part of the program
Lessons learned: writing a checker • Purely declarative syntax is not practical for specifying type hierarchy or rules • Even for relatively simple type systems • Examples: • Javari & IGJ hierarchy • Interned heuristics • Clean integration with an API is essential • Not hard to use: Javari & IGJ written by a sophomore and a junior with no experience • Declarative syntax may be essential when writing dozens of type systems
Lessons learned: inference • Not necessary in many cases • Follow existing comments • Focus on part of program • Existing tools are inadequate • Hard to scale: shows necessity for a usable framework • Gave up and wrote our own partial Nullable inference in a weekend • We need better frameworks for inference, too • See ECOOP 2008 paper on Javari inference
Demos • Demos in Thursday demo session • Or catch me anytime • I will be away from Dagstuhl from Tue night to Wed night
Conclusion • Framework for writing type checkers • Robust, scalable, easy to use, reveals bugs • http://pag.csail.mit.edu/jsr308 • Or, web search for “JSR 308” or “Annotations on Java types” • Aids programmers in preventing errors • Yields insight into type systems • Enables evaluation of new type systems
Annotations on Java types (JSR 308) Two problems with Java: • Syntactic limitation on annotations • Can only be written on declarations • Semantic limitation of the type system • Doesn't prevent enough bugs JSR 308 solves these problems: • Extends Java syntax: permits annotations in more locations • Enables creation of more powerful annotation processors
Syntactic problem: Annotations can only be written on declarations • Classes: package java.security; @Deprecated class Signer { ... } • Methods: @Test void additionWorks() { assert 1 + 1 == 2; } @Override boolean equals(MyClass other) { ... } // warning • Fields: @CommandLineArg(name="input", required=true) private String inputFilename; • Locals/statements: List<Object> objs = ...; @SuppressWarnings List<String> strings = objs; • Goal: write annotations on type uses
Generics and arrays • Generics: List<@NonNull String> strings; class UnmodifiableList<T> implements @Readonly List<@Readonly T> { ... } • Arrays are treated analogously • Separately annotate the element type and the array itself
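The distinction between annotating the element type and annotating the array itself can be observed through reflection. This sketch assumes the array syntax that was eventually standardized in Java 8, which may differ from the proposal current at the time of this talk: an annotation before the element type applies to the elements, and an annotation immediately before the brackets applies to the array. NonNull and Readonly are hypothetical stand-ins with run-time retention, and ArrayAnnotations and inspect are illustrative names.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.AnnotatedArrayType;
import java.lang.reflect.AnnotatedType;

// Hypothetical stand-in qualifiers, kept at run time so reflection can see them.
@Retention(RetentionPolicy.RUNTIME) @Target(ElementType.TYPE_USE) @interface NonNull {}
@Retention(RetentionPolicy.RUNTIME) @Target(ElementType.TYPE_USE) @interface Readonly {}

class ArrayAnnotations {
    // The elements are @NonNull Strings; the array itself is @Readonly.
    @NonNull String @Readonly [] names;

    // Returns { array is @Readonly, element is @NonNull, array is @NonNull }.
    static boolean[] inspect() {
        try {
            AnnotatedType array =
                ArrayAnnotations.class.getDeclaredField("names").getAnnotatedType();
            AnnotatedType element =
                ((AnnotatedArrayType) array).getAnnotatedGenericComponentType();
            return new boolean[] {
                array.isAnnotationPresent(Readonly.class),
                element.isAnnotationPresent(NonNull.class),
                array.isAnnotationPresent(NonNull.class)
            };
        } catch (NoSuchFieldException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        for (boolean b : inspect()) {
            System.out.println(b);
        }
    }
}
```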
Local variables @Interned String s = getName().intern(); @NonEmpty List<String> strings = ...; • Possible to annotate in Java 5 • Annotations are not preserved in the class file
Casts
Graph g = new Graph();
... // add nodes and edges
// Now, g will not be changed any more
g = (@Immutable Graph) g;

// Both variables are null, or neither is.
Pattern startRegex, endRegex;
...
if (startRegex != null) {
  endRegex = (@NonNull Pattern) endRegex;
  ...
}
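The second cast on this slide records an application invariant ("both are null, or neither is") that a nullness type system cannot derive by itself. A runnable sketch of that pattern follows, assuming a locally declared stand-in NonNull annotation; the class RegexRange and its methods are hypothetical. The constructor enforces the invariant, and the cast documents where the code relies on it.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Target;
import java.util.regex.Pattern;

// Hypothetical stand-in qualifier, declared here so the sketch is self-contained.
@Target(ElementType.TYPE_USE) @interface NonNull {}

class RegexRange {
    // Invariant: both fields are null, or neither is.
    private final Pattern startRegex;
    private final Pattern endRegex;

    RegexRange(String start, String end) {
        if ((start == null) != (end == null)) {
            throw new IllegalArgumentException("start and end must both be set or both be null");
        }
        startRegex = (start == null) ? null : Pattern.compile(start);
        endRegex = (end == null) ? null : Pattern.compile(end);
    }

    boolean delimits(String first, String last) {
        if (startRegex != null) {
            // The invariant implies endRegex is non-null here; the cast states
            // that fact where a checker could not infer it.
            Pattern end = (@NonNull Pattern) endRegex;
            return startRegex.matcher(first).matches() && end.matcher(last).matches();
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(new RegexRange("a+", "b+").delimits("aaa", "bb"));
        System.out.println(new RegexRange(null, null).delimits("aaa", "bb"));
    }
}
```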
Receiver • It is possible to annotate the explicit parameters package javax.xml.bind; class Marshaller { void marshal(@Readonly Object jaxbElement, @Mutable Writer writer) { } } • It should be possible to annotate the receiver, too package javax.xml.bind; class Marshaller { void marshal(@Readonly Object jaxbElement, @Mutable Writer writer) @Readonly { } }
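The slide writes the receiver annotation after the parameter list, as proposed at the time. In the syntax that was eventually standardized in Java 8, the receiver is instead declared as an explicit first parameter named this. The sketch below uses that later syntax, which is an assumption relative to this talk. Readonly and Mutable are hypothetical stand-ins, and this Marshaller is a local class rather than the javax.xml.bind interface.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.Writer;
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

// Hypothetical stand-in qualifiers, kept at run time so reflection can see them.
@Retention(RetentionPolicy.RUNTIME) @Target(ElementType.TYPE_USE) @interface Readonly {}
@Retention(RetentionPolicy.RUNTIME) @Target(ElementType.TYPE_USE) @interface Mutable {}

class Marshaller {
    // "this" is declared explicitly only so that it can carry an annotation;
    // callers still pass two arguments.
    void marshal(@Readonly Marshaller this,
                 @Readonly Object jaxbElement,
                 @Mutable Writer writer) throws IOException {
        writer.write(String.valueOf(jaxbElement));
    }
}

class ReceiverDemo {
    static String marshalToString(Object element) {
        StringWriter out = new StringWriter();
        try {
            new Marshaller().marshal(element, out);
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return out.toString();
    }

    // Checks through reflection that the receiver type carries @Readonly.
    static boolean receiverIsReadonly() {
        try {
            return Marshaller.class
                .getDeclaredMethod("marshal", Object.class, Writer.class)
                .getAnnotatedReceiverType()
                .isAnnotationPresent(Readonly.class);
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(marshalToString("doc"));
        System.out.println(receiverIsReadonly());
    }
}
```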
Semantic problem: Weak type checking • Type checking prevents many bugs int i = "42"; // compile-time type error • Type checking doesn't prevent enough bugs getValue().toString(); // NullPointerException • Cannot express important properties about code • Non-null, interned, immutable, encrypted, tainted, ... • Solution: pluggable type systems • Design a type system to solve a specific problem • Annotate your code with type qualifiers • Type checker warns about violations (bugs)
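The null dereference on this slide passes the standard Java type checker and fails only when executed. A small sketch of that gap, where getValue is a hypothetical method that happens to return null:

```java
class WeakTypeChecking {
    // Hypothetical method; its declared type says nothing about nullness.
    static Object getValue() {
        return null;
    }

    // Returns true if the dereference fails at run time.
    static boolean dereferenceFails() {
        try {
            getValue().toString();   // accepted by javac, fails when executed
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(dereferenceFails());
    }
}
```

A nullness qualifier on the return type of getValue is the kind of information a pluggable checker would use to reject the dereference before the program runs.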
Pluggable checkers in practice • Scales to >200,000 LOC • Found bugs in every codebase • Comparison to other null dereference checkers:

Tool      Errors found  Errors missed  False warnings  Annotations
JSR 308   7             0              7               45
FindBugs  0             7              1               0
Jlint     0             7              8               0
PMD       0             7              0               0
Usability • Programmers found the checkers easy to use • Is it too verbose? • @NonNull: 1 per 75 lines • @Interned: 124 annotations in 220KLOC revealed 11 bugs • Possible to annotate part of program • Fewer annotations in new code • Is it hard to build a new checker? • Most users don't have to • Basic functionality: mention annotation on command line • More advanced functionality: using the Checkers Framework, just override a few methods
How to get involved • Webpage: Google “JSR 308” or “Annotations on Java types” (http://groups.csail.mit.edu/pag/jsr308/) • JSR 308 compiler (patch to OpenJDK) • 5 checkers • @NonNull • @Interned • @Readonly • @Immutable • Command line (basic semantics, for any annotation name) • Checkers Framework (for writing new checkers) • Other tools supporting JSR 308 • JSR 308 specification document • Mailing list • Go forth and prevent bugs!