Dynamically Discovering Likely Program Invariants to Support Program Evolution

Michael D. Ernst, Jake Cockrell, William G. Griswold, David Notkin. Presented by: Nick Rutar.


Presentation Transcript


  1. Dynamically Discovering Likely Program Invariants to Support Program Evolution Michael D. Ernst, Jake Cockrell, William G. Griswold, David Notkin Presented by: Nick Rutar

  2. Program Invariants • Useful in software development • Protect programmers from making errant changes • Verify properties of a program • Can be explicitly stated in programs • Programmers can annotate code with invariants • This can take time and effort • Many important invariants will be missed

  3. Daikon - Dynamic Invariant Detector • Dynamic -- From Program Executions • Step 1: Instrument Source Program • Trace Variables of Interest • Step 2: Run Instrumented Program Over Test Suite • Step 3: Infer Invariants from • Instrumented Variables • Derived Variables

  4. Example Program (taken from “The Science of Programming”) • Program: i, s := 0, 0; do i ≠ n → i, s := i + 1, s + b[i] od • Precondition: n ≥ 0 • Postcondition: s = (Σ j : 0 ≤ j < n : b[j]) • Loop Invariant: 0 ≤ i ≤ n and s = (Σ j : 0 ≤ j < i : b[j])
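The example program on this slide can be sketched in Python with its precondition, loop invariant, and postcondition written as assertions. This is only an illustration of what the stated invariants mean; the function name `array_sum` is hypothetical, and the update of `i` and `s` is assumed to be simultaneous (so `b[i]` uses the old value of `i`).

```python
def array_sum(b, n):
    """Sum b[0..n-1], checking the slide's invariants with assertions."""
    assert n >= 0  # precondition
    i, s = 0, 0
    while True:
        # Loop invariant: 0 <= i <= n and s == sum of b[0..i-1]
        assert 0 <= i <= n and s == sum(b[:i])
        if i == n:
            break
        # Simultaneous assignment: the right-hand side uses the old i
        i, s = i + 1, s + b[i]
    assert s == sum(b[:n])  # postcondition
    return s
```

A dynamic detector such as Daikon works in the opposite direction: it starts from observed values at these program points and proposes properties like the ones asserted here.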

  5. Daikon results from the program (100 randomly generated input arrays of length 7-13) • ENTER • N = size(B) • N in [7 … 13] • B - all elements ≥ -100 • EXIT • N = I = orig(N) = size(B) • B = orig(B) • S = sum(B) • N in [7 … 13] • B - all elements ≥ -100 • LOOP • N = size(B) • S = sum(B[0 … I-1]) • N in [7 … 13] • I in [0 … 13] • I ≤ N • B - all elements in [-100 … 100] • sum(B) in [-556 … 539] • B[0] nonzero in [-99 … 96] • B[-1] in [-88 … 99] • N != B[-1] (negative) • B[0] != B[-1] (negative)

  6. Instrumentation • Insert instrumentation points • Procedure Entry • Procedure Exit • Loop Heads • Writes values to a file for • All variables in scope • Global Variables • Procedure arguments • Local Variables • Procedure’s return value • Available for languages • LISP • C/C++ • Java (from Daikon website) • Eclipse plug-in available • Perl (from Daikon website)
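The idea of instrumentation points can be sketched with a small Python decorator that records argument values at procedure entry and exit, plus the return value. This is not Daikon's front end or its trace format: the names `TRACE`, `instrument`, and `total` are hypothetical, only procedure arguments are captured (no globals, locals, or loop heads), and an in-memory list stands in for the trace file.

```python
import copy
import functools
import inspect

TRACE = []  # in-memory stand-in for the trace file (an assumption)


def instrument(func):
    """Record argument values at entry and exit, plus the return value."""
    sig = inspect.signature(func)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        bound = sig.bind(*args, **kwargs)
        bound.apply_defaults()
        # Snapshot values at entry so later mutation does not change them
        TRACE.append((func.__name__, "ENTER", copy.deepcopy(dict(bound.arguments))))
        result = func(*args, **kwargs)
        # Snapshot the (possibly mutated) arguments and the result at exit
        exit_values = copy.deepcopy(dict(bound.arguments))
        exit_values["return"] = copy.deepcopy(result)
        TRACE.append((func.__name__, "EXIT", exit_values))
        return result

    return wrapper


@instrument
def total(b, n):
    return sum(b[:n])
```

Running the instrumented procedure over a test suite fills `TRACE` with one ENTER and one EXIT record per call, which is the raw material for the inference step.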

  7. Inferring invariants • System checks for the following (x, y, z are variables; a, b, c are computed constants): • Any variable • constant or small number of values • Numeric variable • range (a ≤ x ≤ b) • modulus & nonmodulus • Multiple numbers • linear relationship (such as x = ay + bz + c) • functions (all those in the standard library, e.g. x = abs(y)) • comparisons (x < y, x ≥ y, x == y) • invariants over x + y and x - y • Sequence: • sortedness • invariants over all elements (e.g., every element < 100) • Multiple sequences • subsequence & lexicographic relationship • Sequence and scalar • membership

  8. Inferring invariants (continued) • Each potential invariant is tested against the samples • Once an invariant fails to hold, it is not tested again • Negative Invariants • Relationships that might be expected but never occur in the input • Probability limit decides if invariants are included • Derived Variables • Expressions treated the same as regular variables • Include: • From any array: first and last elements, length • From numeric array: sum, min, max • From array and scalar: element at that index (a[i]), subarray up to, and subarray beyond, that index • From function invocation: number of calls so far
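The falsification idea from the two slides above can be sketched in a few lines of Python. This is a simplified illustration, not Daikon's algorithm: the helper names `derive` and `infer` are hypothetical, only two derived variables (size and sum of an array) and three invariant kinds (==, <=, range) are covered, and the probability limit is not implemented.

```python
def derive(sample):
    """Add derived variables (size and sum) for each list-valued variable."""
    out = dict(sample)
    for name, value in sample.items():
        if isinstance(value, list):
            out[f"size({name})"] = len(value)
            out[f"sum({name})"] = sum(value)
    return out


def infer(samples):
    """Return the candidate invariants that no sample falsified."""
    samples = [derive(s) for s in samples]
    scalars = [k for k, v in samples[0].items() if isinstance(v, int)]
    candidates = {}
    for x in scalars:
        for y in scalars:
            if x == y:
                continue
            candidates[f"{x} <= {y}"] = lambda s, x=x, y=y: s[x] <= s[y]
            if x < y:  # emit each equality only once
                candidates[f"{x} == {y}"] = lambda s, x=x, y=y: s[x] == s[y]
    for s in samples:
        # A falsified candidate is dropped and never tested again
        candidates = {n: c for n, c in candidates.items() if c(s)}
    result = sorted(candidates)
    for x in scalars:
        values = [s[x] for s in samples]
        result.append(f"{x} in [{min(values)} .. {max(values)}]")
    return result
```

With only a handful of samples this sketch also reports accidental properties, and it reports invariants that are implied by others (for example `N <= size(B)` alongside `N == size(B)`). Those are the cases the probability limit and the relevance work listed under future work are meant to address.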

  9. Using Invariants • Modified the Siemens replace program (~500 LOC) • Takes a regular expression and a replacement string as input • Copies input stream to output stream, replacing matched strings • Task: add the input pattern <pat>+ (equivalent to <pat><pat>*) • Used invariants to get a glimpse of how the program runs • Found occurrences where initial belief was contradicted • Prevented introducing bugs based on flawed knowledge of the code • Found an instance of an unreported array bounds error

  10. Using invariants (continued) • Everything learned from “replace” could have been learned by a combination of • Reading the code • Static Analyses • Selected Program Instrumentation • Invariants give benefits that other approaches do not • Inferred invariants are an abstraction of a larger amount of data • Flags are raised by unexpected invariants or by expected invariants not appearing • Queries against the database build intuition about the source of an invariant • Inferred invariants provide a basis for programmer inferences • Invariants provide a beneficial degree of serendipity

  11. Results - Time • Ran tests with 500 to 3000 test inputs for replace • Inferred ~71 variables per instrumentation point in replace • 6 original and 65 derived; 52 scalars and 19 sequences • On average, 10 derived for every original • 1000 test cases • Produce 10,120 samples per instrumentation point • System takes 220 seconds to infer invariants • 3000 test cases • 33,801 samples • Processing takes 540 seconds • Invariant detection time grows quadratically with the number of variables over which invariants are checked • Time grows linearly with test suite size

  12. Invariant Stability • Relationship between test suite size and invariants • Across test suites, an invariant is • Identical - the same in both test suites • Missing - present in one test suite, but not the other • Different - differs between the two test suites • Interesting - worthy of further study to determine relevance • Uninteresting - a peculiarity in the data, e.g. S1 in [0 … 98] (99 values) vs. S1 >= 0 (96 values)
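The identical / missing / different classification can be sketched as a comparison of two invariant sets. The representation is an assumption made for illustration: each suite's output is a dict from a hypothetical key (which property of which variable) to the reported invariant text. Deciding whether a difference is interesting or uninteresting still takes human judgment and is not attempted here.

```python
def compare(inv_a, inv_b):
    """Classify invariants reported for two test suites."""
    report = {"identical": [], "missing": [], "different": []}
    for key in sorted(inv_a.keys() | inv_b.keys()):
        if key not in inv_a or key not in inv_b:
            report["missing"].append(key)  # present in only one suite
        elif inv_a[key] == inv_b[key]:
            report["identical"].append(key)
        else:
            report["different"].append(key)
    return report


# Hypothetical output for two suites, using the slide's S1 example
suite_a = {"S1 range": "S1 in [0 .. 98]", "I vs N": "I <= N", "B sorted": "B sorted"}
suite_b = {"S1 range": "S1 >= 0", "I vs N": "I <= N"}
report = compare(suite_a, suite_b)
```

Here the S1 entry lands in "different", which corresponds to the uninteresting case on the slide: both suites saw nonnegative values, and only the observed upper bound changed.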

  13. Invariant differences (2500-element test suite)

  14. Invariants and Program Correctness • Compare invariants detected across programs • Correct versions of programs have more invariants than incorrect ones • Examination of 424 intro C programs from U of Washington • Given # of students, amount of money, and # of pizzas, the program calculates whether the students can afford the pizzas • Chose eight relevant invariants • people in [1 … 50] • pizzas in [1 … 10] • pizza_price in {9, 11} • excess_money in [0 … 40] • slices = 8 * pizzas • slices = 0 (mod 8) • slices_per in {0, 1, 2, 3} • slices_left ≤ people - 1
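The eight goal invariants can be written as predicates and counted over recorded values, which gives a rough feel for "invariants detected" per program. This is a checking sketch, not how Daikon detects them. The record layout and the names `GOAL_INVARIANTS` and `goal_invariants_detected` are hypothetical, `slices = 8 * pizza` is assumed to refer to the `pizzas` variable, and the last check assumes the relation is slices_left ≤ people - 1.

```python
GOAL_INVARIANTS = {
    "people in [1 .. 50]": lambda r: 1 <= r["people"] <= 50,
    "pizzas in [1 .. 10]": lambda r: 1 <= r["pizzas"] <= 10,
    "pizza_price in {9, 11}": lambda r: r["pizza_price"] in (9, 11),
    "excess_money in [0 .. 40]": lambda r: 0 <= r["excess_money"] <= 40,
    "slices = 8 * pizzas": lambda r: r["slices"] == 8 * r["pizzas"],
    "slices = 0 (mod 8)": lambda r: r["slices"] % 8 == 0,
    "slices_per in {0, 1, 2, 3}": lambda r: r["slices_per"] in (0, 1, 2, 3),
    # Assumed relation: fewer leftover slices than people
    "slices_left <= people - 1": lambda r: r["slices_left"] <= r["people"] - 1,
}


def goal_invariants_detected(records):
    """Return the goal invariants that hold on every recorded sample."""
    return [
        name
        for name, holds in GOAL_INVARIANTS.items()
        if all(holds(r) for r in records)
    ]
```

A program with a bug in its slice computation would fail some of these checks on at least one run, so fewer goal invariants would be reported for it.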

  15. Relationship of Grade and Goal Invariants (figure label: Invariants Detected)

  16. Future Work (from 2001 paper) • Increasing Relevance • An invariant is relevant if it assists the programmer • Suppress invariants logically implied by others • Viewing and Managing Invariants • The full set is overwhelming for a programmer to sort through • Various tools for selective reporting of invariants • Improving Performance • Balance between invariant quality and runtime • Number of derived variables used • Richer Invariants • Invariants over pointer-based data structures • Computing conditional invariants

  17. Resources • Daikon website • http://pag.csail.mit.edu/daikon/ • Contains links to • Papers • Source Code • User Manual • Developers Manual

  18. Questions???
