Evaluating and Tuning a Static Analysis to Find Null Pointer Bugs

Evaluating and Tuning a Static Analysis to Find Null Pointer Bugs Dave Hovemeyer Bill Pugh Jaime Spacco

How hard is it to find null-pointer exceptions? • Large body of work • academic research • too much to list on one slide • commercial applications • PREFix / PREFast • Coverity • Polyspace

Lots of hard problems • Aliasing • Infeasible paths • Resolving call targets • Providing feedback to developers under what conditions an error can happen

Can we use simple techniques to find NPE? • Yes, when you have code like: // Eclipse 3.0.1 if (in == null) try { in.close(); } catch (IOException e) {} • Easy to confuse == and !=

Easy to confuse && with || // JBoss 4.0.0RC1 if (header != null || header.length > 0) { ... } • This type of error (and less obvious bugs) occur in production mode more frequently than you might expect

The FindBugs Project • Open-Source static bug finder • http://findbugs.sourceforge.net • 127,394 downloads as of Saturday • Java bytecode • Used at several companies • Goldman-Sachs • Bug-driven bug finder • start with a bug • What’s the simplest analysis to find the bug?

FindBugs null pointer analysis • Intra-procedural analysis • Compute all reaching paths for a value • Take conditionals into account • Use value numbering analysis to update all copies of updated value • No modeling of heap values • Don’t report warnings that might be false positives due to infeasible paths • Extended basic analysis with limited inter-procedural analysis using annotations

DataflowLattice

Null on a Simple Path (NSP) • Merge null with anything else • We only care that there is control flow where the value is null • We don’t try to identify infeasible paths • The NPE happens if the program achieves full branch coverage

Null on a Simple Path (NSP)

Null on a Complex Path (NCP) • Most conservative approximation • Tell the analysis we lack sufficient information to justify issuing a warning when the value is dereferenced • so we don’t issue any warnings • Used for: • method parameters • Instance variables • NSP values that reach a conditional

No Kaboom Non-Null • Definitely non-null because the pointer was dereferenced • Suspicious when programmer compares a No-Kaboom value against null • Confusion about program specification or contracts

// Eclipse 3.0.1 // fTableViewer is method parameter property = fTableViewer.getColumnProperties(); ... if (fTableViewer != null) { ... }

// Eclipse 3.0.1 // fTableViewer is method parameter // fTableViewer : NCP property = fTableViewer.getColumnProperties(); ... if (fTableViewer != null) { ... }

// Eclipse 3.0.1 // fTableViewer is method parameter // fTableViewer : NCP property = fTableViewer.getColumnProperties(); // fTableViewer : NoKaboom nonnull ... if (fTableViewer != null) { ... }

// Eclipse 3.0.1 // fTableViewer is method parameter // fTableViewer : NCP property = fTableViewer.getColumnProperties(); // fTableViewer : NoKaboom nonnull ... // redundant null-check => warning! if (fTableViewer != null) { ... }

Redundant Checks for Null (RCN) • Compare a value statically known to be null (or non-null) with null • Does not necessarily indicate a problem • Defensive programming • Assume programmers don’t intend to write (non-trivial) dead code

Extremely Defensive Programming // Eclipse 3.0.1 File dir = new File(...); if (dir != null && dir.isDirectory()) { ... }

Non-trivial dead code x = null … does not assign x… if (x!=null) { // non-trivial dead code x.importantMethod() }

What do we report? • Dereference of value known to be null • Guaranteed NPE if dereference executed • Highest priority • Dereference of value known to be NSP • Guaranteed NPE if the path is ever executed • Exploitable NPE assuming full branch coverage • Medium priority • If paths can only be reached if an exception occurs • lower priority

Reporting RCNs • No-Kaboom RCNs • higher priority • RCNs that create dead code • medium priority • other RCNs • low priority

Evaluate our analysis using: • Production software • jdk1.6.0-b48 • glassfish-9.0-b12 (Sun's application server) • Eclipse 3.0.1 • Manually classified each warning • Student programming projects

Production Results

Eclipse Results with Manual Inspection of warnings

How many of the existing NPEs are we detecting? • Difficult question for production software • Student code base allows us to study all NPE produced by a large code base covered by fairly complete unit tests • How many NP Warnings correspond with a run-time fault? • False Positives • How many NPE do we issue a warning for? • False Negatives

The Marmoset Project • Automated snapshot, submission and testing system • Eclipse plug-in captures snapshots of all saves to central repository • Students submit code to a central server for testing against suite of unit tests • End of semester we run all snapshots against tests • Also run FindBugs on all intermediate snapshots

Overall numbers, Fall 2004, 2nd semester OOP course

Analyzing Marmoset results • Analyze two projects • Binary Search Tree • WebSpider • Difficult to decide what to count • per snapshot, per warning, per NPE? • false positives persist and get over-counted • multiple warnings / NPEs per snapshot • exceptions can mask each other • difficult to match warnings and NPEs

Marmoset Results

What are we missing? • Projects have javadoc specifications about which parameters and return values can be null • Encode specifications into a format FindBugs can use for limited inter-procedural analysis • Easy to add annotations to the interface students were to implement • Though we did this after the semester

Annotations • Lightweight way to communicate specifications about method parameters or return values • @NonNull • issue warning if ever passed a null value • @CheckForNull • issue warning if unconditionally dereferenced • @Nullable • null in a complicated way • no warnings issued

@CheckForNull vs @Nullable • By default, all values are implicitly @Nullable • Mark an entire class or package @NonNull or @CheckForNull by default • Must explicitly mark some values as @Nullable • Map.get() can return null • Not every application needs to check every call to Map.get()

Marmoset Results with Annotations

Related Work • Lint (Evans) • Metal (Engler et al) • “Bugs as Deviant Behavior” • ESC Java • more general annotations • Fahndrich and Leino • Non-null types for C#

Conclusions • We can find bugs with simple methods • in student code • in production code • student bug patterns can often be generalized into patterns found in production code • Annotations look promising • lightweight way of simplifying inter-procedural analysis • helpful when assigning blame

Thank you! Questions?

Difficult to decide what to count • False positives tend to persist • over-counted • Students fix NPEs quickly • under-count • Multiple warnings / exceptions per snapshot • Some exceptions can mask other exceptions

Evaluating and Tuning a Static Analysis to Find Null Pointer Bugs