Delta Debugging and Model Checkers for Fault Localization
Amin Alipour
Note: Some slides and figures in this presentation have been used or adapted from presentations by Andreas Zeller, Tevfik Bultan, and Alex Groce.
Outline
• Software Fault – some facts
• Delta debugging
• Simplifying test cases
• Isolating failure-inducing parts in test cases
• Search in space
• Model checking
• Background
• Distance metrics
• Conclusion
Software faults
• A software fault (flaw, bug) perturbs the state of a program into an error state.
• An error state can propagate through the execution of the program and cause a failure.
• A failure is the manifestation of an error.
Software debugging
• What do we have for debugging?
  • The program
  • A set of test cases
  • …
• For maintainable debugging of failures:
  • We need to understand the test case and the failure.
  • We need to identify the location of faults (fault localization).
• Can we automate it?
Approaches to Fault Localization
• Program Slicing
• Program Spectra
• Statistical Reasoning
• Delta Debugging
• Model Checking
Delta Debugging
• Goal:
  • Remove components that are irrelevant to the failure from test cases.
  • This can improve comprehension of the failure.
• Delta debugging comes with two techniques:
  • Simplification (minimization) of test cases, and
  • Isolation of failure-inducing parts of test cases.
Delta Debugging
• Failing test cases are usually cluttered with unnecessary or irrelevant content.
…
<td align=left valign=top><SELECT NAME="op sys" MULTIPLE SIZE=7><OPTION VALUE="All">All<OPTION VALUE="Windows 3.1">Windows 3.1<OPTION VALUE="Windows 95">Windows 95<OPTION VALUE="Windows 98">Windows 98<OPTION VALUE="Windows ME">Windows ME<OPTION VALUE="Windows 2000">Windows 2000<OPTION VALUE="Windows NT">Windows NT<OPTION VALUE="Mac System 7">Mac System 7<OPTION VALUE="Mac System 7.5">Mac System 7.5<OPTION VALUE="Mac System 7.6.1">Mac System 7.6.1<OPTION VALUE="Mac System 8.0">Mac System 8.0<OPTION VALUE="Mac System 8.5">Mac System 8.5<OPTION VALUE="Mac System 8.6">Mac System 8.6<OPTION VALUE="Mac System 9.x">Mac System 9.x<OPTION VALUE="MacOS X">MacOS X<OPTION VALUE="Linux">Linux<OPTION VALUE="BSDI">BSDI<OPTION VALUE="FreeBSD">FreeBSD<OPTION VALUE="NetBSD">NetBSD<OPTION VALUE="OpenBSD">OpenBSD<OPTION VALUE="AIX">AIX<OPTION VALUE="BeOS">BeOS<OPTION VALUE="HP-UX">HP-UX<OPTION VALUE="IRIX">IRIX<OPTION VALUE="Neutrino">Neutrino<OPTION VALUE="OpenVMS">OpenVMS<OPTION VALUE="OS/2">OS/2<OPTION VALUE="OSF/1">OSF/1<OPTION VALUE="Solaris">Solaris<OPTION VALUE="SunOS">SunOS<OPTION VALUE="other">other</SELECT></td>
<td align=left valign=top><SELECT NAME="priority" MULTIPLE SIZE=7><OPTION VALUE="--">--<OPTION VALUE="P1">P1<OPTION VALUE="P2">P2<OPTION VALUE="P3">P3<OPTION VALUE="P4">P4<OPTION VALUE="P5">P5</SELECT></td>
<td align=left valign=top><SELECT NAME="bug severity" MULTIPLE SIZE=7><OPTION VALUE="blocker">blocker<OPTION VALUE="critical">critical<OPTION VALUE="major">major<OPTION VALUE="normal">normal<OPTION VALUE="minor">minor<OPTION VALUE="trivial">trivial<OPTION VALUE="enhancement">enhancement</SELECT></tr>
</table>
…
Simplification of test cases
• Goal: minimize the size of a failing test case, $c_F$.
• $c_F = \delta_1 \cup \delta_2 \cup \dots \cup \delta_n$
• Fully minimizing a test case requires checking all subsets of the $\delta_i$ (exponentially many).
• Delta debugging instead simplifies a failing test case $c_F$ to a 1-minimal test case.
• 1-minimal failing test case: a failing test case is 1-minimal if removing any single part $\delta_i$ makes the failure disappear.
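The 1-minimality condition can be checked directly by removing one part at a time. The following is a minimal sketch, assuming the test case is a list of characters and that `fails` is a hypothetical failure predicate (here: the input contains a complete SELECT tag); neither name comes from the slides.

```python
import re

def fails(chars):
    """Hypothetical failure predicate: the input 'fails' if it contains a SELECT tag."""
    return re.search(r"<SELECT[^>]*>", "".join(chars)) is not None

def is_one_minimal(parts, fails):
    """True if `parts` fails and removing any single part makes the failure disappear."""
    if not fails(parts):
        return False
    return all(not fails(parts[:i] + parts[i + 1:]) for i in range(len(parts)))

print(is_one_minimal(list("<SELECT>"), fails))                  # True
print(is_one_minimal(list('<SELECT NAME="priority">'), fails))  # False
```

The second input is not 1-minimal because, for example, the space after SELECT can be removed and the predicate still reports a failure.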
Simplification Algorithm
• $\nabla_i = c_F - \delta_i$
• Test each $\delta_1, \delta_2, \dots, \delta_n$ and each $\nabla_1, \nabla_2, \dots, \nabla_n$.
• There are four possible outcomes:
  • Some $\delta_i$ causes failure: partition $\delta_i$ into two and continue with $\delta_i$ as the test set.
  • Some $\nabla_i$ causes failure: continue with $\nabla_i$ as the test set, with $n - 1$ subsets.
  • No test causes failure: increase granularity by generating a partition with $2n$ subsets.
  • The granularity can no longer be increased: done, the 1-minimal subset has been found.
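The four outcomes above can be sketched in a few lines of Python. This is an illustrative sketch rather than a reference implementation: it assumes the test case is a list, that every test result is either pass or fail, and that `fails`, `split`, and the sample HTML input are hypothetical names and data chosen for the example.

```python
import re

def split(items, n):
    """Split `items` into n contiguous, non-empty chunks (requires n <= len(items))."""
    size, extra = divmod(len(items), n)
    chunks, start = [], 0
    for i in range(n):
        end = start + size + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks

def ddmin(failing, fails):
    """Reduce the failing list `failing` to a 1-minimal failing list."""
    assert fails(failing)
    n = 2  # current granularity
    while len(failing) >= 2:
        chunks = split(failing, n)
        complements = [sum(chunks[:i] + chunks[i + 1:], []) for i in range(n)]
        # Try every subset first (granularity resets to 2), then every complement.
        candidates = [(c, 2) for c in chunks] + [(c, max(n - 1, 2)) for c in complements]
        for candidate, next_n in candidates:
            if fails(candidate):
                failing, n = candidate, next_n
                break
        else:
            if n >= len(failing):   # granularity can no longer be increased
                break
            n = min(2 * n, len(failing))
    return failing

def fails(chars):
    """Hypothetical failure predicate: the input contains a complete SELECT tag."""
    return re.search(r"<SELECT[^>]*>", "".join(chars)) is not None

html = '<td align=left><SELECT NAME="priority" MULTIPLE SIZE=7></td>'
minimal = "".join(ddmin(list(html), fails))
print(minimal)
```

Working on characters keeps the sketch short; a practical tool would usually start from coarser units (lines or tokens) to reduce the number of test runs.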
Simplification: Example
[Figure: a simplification run, annotated with the granularity used at each step: n = 2, n = 4, n = 3, n = 2, n = 4, n = 3]
Simplification Example 2 (F = test fails, P = test passes)
1  <SELECT NAME="priority" MULTIPLE SIZE=7>  F
2  <SELECT NAME="priority" MULTIPLE SIZE=7>  P
3  <SELECT NAME="priority" MULTIPLE SIZE=7>  P
4  <SELECT NAME="priority" MULTIPLE SIZE=7>  P
5  <SELECT NAME="priority" MULTIPLE SIZE=7>  F
6  <SELECT NAME="priority" MULTIPLE SIZE=7>  F
7  <SELECT NAME="priority" MULTIPLE SIZE=7>  P
8  <SELECT NAME="priority" MULTIPLE SIZE=7>  P
9  <SELECT NAME="priority" MULTIPLE SIZE=7>  P
10 <SELECT NAME="priority" MULTIPLE SIZE=7>  F
11 <SELECT NAME="priority" MULTIPLE SIZE=7>  P
12 <SELECT NAME="priority" MULTIPLE SIZE=7>  P
13 <SELECT NAME="priority" MULTIPLE SIZE=7>  P
14 <SELECT NAME="priority" MULTIPLE SIZE=7>  P
15 <SELECT NAME="priority" MULTIPLE SIZE=7>  P
16 <SELECT NAME="priority" MULTIPLE SIZE=7>  F
17 <SELECT NAME="priority" MULTIPLE SIZE=7>  F
18 <SELECT NAME="priority" MULTIPLE SIZE=7>  F
19 <SELECT NAME="priority" MULTIPLE SIZE=7>  P
20 <SELECT NAME="priority" MULTIPLE SIZE=7>  P
21 <SELECT NAME="priority" MULTIPLE SIZE=7>  P
22 <SELECT NAME="priority" MULTIPLE SIZE=7>  P
23 <SELECT NAME="priority" MULTIPLE SIZE=7>  P
24 <SELECT NAME="priority" MULTIPLE SIZE=7>  P
25 <SELECT NAME="priority" MULTIPLE SIZE=7>  P
26 <SELECT NAME="priority" MULTIPLE SIZE=7>  F
Simplification result
• The full HTML input shown earlier is simplified to the single tag: <SELECT>
Isolation of the failure-inducing part of a test case
• Even a minimal test case still contains elements that are not directly related to the failure.
• E.g., a minimal test case for a C compiler still needs symbols such as { and }, or variable declarations, to remain valid input, even though they may be irrelevant to the failure.

  #define SIZE 20
  double mult(double z[], int n)
  {
      int i, j;
      i = 0;
      for (j = 0; j < n; j++) {
          i = i + j + 1;
          z[i] = z[i] * (z[0] + 1.0);
      }
      return z[n];
  }
Isolation of the failure-inducing part of a test case
• How to isolate failure-related parts?
• Find a pair of passing and failing inputs that are very similar and contrast them.

Passing test case:

  #define SIZE 20
  double mult(double z[], int n)
  {
      int i, j;
      i = 0;
      for (j = 0; j < n; j++) {
          i + j + 1;
          z[i] = z[i] * (z[0] + 1.0);
      }
      return z[n];
  }

Failing test case:

  #define SIZE 20
  double mult(double z[], int n)
  {
      int i, j;
      i = 0;
      for (j = 0; j < n; j++) {
          i = i + j + 1;
          z[i] = z[i] * (z[0] + 1.0);
      }
      return z[n];
  }
Isolation Algorithm
• Narrow the gap between the passing and the failing test case by removing their differences step by step, making the two more similar.
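One way to picture this narrowing is a binary search over the difference between a passing and a failing configuration. The sketch below is a simplified stand-in for the full isolation algorithm: it assumes every test outcome is either pass or fail (no unresolved runs), starts from the empty input as the passing case, and uses a hypothetical `fails` predicate.

```python
import re

def fails(chars):
    """Hypothetical failure predicate: the input contains a complete SELECT tag."""
    return re.search(r"<SELECT[^>]*>", "".join(chars)) is not None

def isolate(items, fails):
    """Return (passing, failing) inputs that differ in exactly one element."""
    build = lambda indices: [items[i] for i in sorted(indices)]
    c_pass, c_fail = set(), set(range(len(items)))
    assert not fails(build(c_pass)) and fails(build(c_fail))
    while len(c_fail - c_pass) > 1:
        delta = sorted(c_fail - c_pass)
        # Add half of the remaining difference to the passing configuration.
        candidate = c_pass | set(delta[:len(delta) // 2])
        if fails(build(candidate)):
            c_fail = candidate      # failing case moves closer to the passing one
        else:
            c_pass = candidate      # passing case moves closer to the failing one
    return build(c_pass), build(c_fail)

tag = '<SELECT NAME="priority" MULTIPLE SIZE=7>'
passing, failing = isolate(list(tag), fails)
print("".join(passing))
print("".join(failing))
```

Under this predicate the two results differ only in the closing `>`, which is the single element whose presence turns a passing input into a failing one.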
Isolation Example (F = test fails, P = test passes)
2  <SELECT NAME="priority" MULTIPLE SIZE=7>  F
4  <SELECT NAME="priority" MULTIPLE SIZE=7>  F
7  <SELECT NAME="priority" MULTIPLE SIZE=7>  P
6  <SELECT NAME="priority" MULTIPLE SIZE=7>  P
5  <SELECT NAME="priority" MULTIPLE SIZE=7>  P
3  <SELECT NAME="priority" MULTIPLE SIZE=7>  P
1  <SELECT NAME="priority" MULTIPLE SIZE=7>  P
Cause for a failure
• Can we use the isolation technique to find causes of the failure?
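Cause Transitions
[Figure: a passing run (rp) and a failing run (rf) compared along program locations; the cause is a at the first locations up to li, b from the next location up to lj, and c after that. Each point where the cause changes is a cause transition.]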
Discussion on delta debugging
• It scales well.
• It requires minimal information about the program and its specification.
• There are several extensions to it:
  • Hierarchical delta debugging
  • Isolating schedules in concurrent systems
  • Isolating failure-inducing changes in repositories
Model Checking Problem
[Figure: a program/model and a specification/assertions are given to the model checker, which reports either "satisfied" or a counterexample]
Fault Localization with Model Checkers
• Model checkers can perform different queries on program paths and states.
• These queries can be used for fault localization:
  • Contrasting
  • Distance metrics
  • Max-SAT
Explanation with Distance Metrics
• How it's done:
  1. The program (P) and specification (spec) are sent to the model checker.
  2. The model checker finds a counterexample, C.
  3. The explanation tool uses P, spec, and C to generate (via bounded model checking) a formula whose solutions are executions of P that are not counterexamples.
  4. Constraints are added to this formula to form an optimization problem: find a solution that is as similar to C as possible under the distance metric d. The formula plus the optimization problem is S.
  5. An optimization tool (PBS, the Pseudo-Boolean Solver) finds a solution to S: an execution of P that is not a counterexample and is as similar as possible to C. Call this execution -C.
  6. The differences (Δs) between C and -C are reported to the user as explanation and fault localization.
[Figure: P+spec → model checker → C → BMC/constraint generator → S → optimization tool → -C; the Δs between C and -C are reported]
“SSA” Transformation

Original program:

  int main () {
    int x, y;
    int z = y;
    if (x > 0)
      y--;
    else
      y++;
    z++;
    assert (y == z);
  }

After the transformation:

  int main () {
    int x0, y0;
    int z0 = y0;
    y1 = y0 - 1;
    y2 = y0 + 1;
    guard1 = x0 > 0;
    y3 = guard1?y1:y2;
    z1 = z0 + 1;
    assert (y3 == z1);
  }
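As a sanity check on the transformation, the two programs can be mirrored in Python and compared on a small range of inputs. This is only a sketch: the function names are hypothetical, each function returns the value of the assertion instead of aborting, and Python integers do not overflow the way C `int` does.

```python
def original(x, y):
    """Mirror of the original program; returns whether the assertion holds."""
    z = y
    if x > 0:
        y -= 1
    else:
        y += 1
    z += 1
    return y == z

def ssa(x0, y0):
    """Mirror of the SSA form: each variable is assigned exactly once."""
    z0 = y0
    y1 = y0 - 1
    y2 = y0 + 1
    guard1 = x0 > 0
    y3 = y1 if guard1 else y2
    z1 = z0 + 1
    return y3 == z1

inputs = [(x, y) for x in range(-3, 4) for y in range(-3, 4)]
agree = all(original(x, y) == ssa(x, y) for x, y in inputs)
print(agree)  # True
```

Both versions violate the assertion exactly when `x > 0`, since only then does `y` end up one below its initial value while `z` ends up one above it.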
Transformation to Equations

  int main () {
    int x0, y0;
    int z0 = y0;
    y1 = y0 - 1;
    y2 = y0 + 1;
    guard1 = x0 > 0;
    y3 = guard1?y1:y2;
    z1 = z0 + 1;
    assert (y3 == z1);
  }

is translated to the conjunction

  (z0 == y0 ∧ y1 == y0 - 1 ∧ y2 == y0 + 1 ∧ guard1 == x0 > 0 ∧ y3 == guard1?y1:y2 ∧ z1 == z0 + 1 ∧ y3 == z1)

• Uninitialized variables in CBMC are unconstrained inputs.
• CBMC (1) negates the assertion, so the last conjunct becomes y3 != z1:

  (z0 == y0 ∧ y1 == y0 - 1 ∧ y2 == y0 + 1 ∧ guard1 == x0 > 0 ∧ y3 == guard1?y1:y2 ∧ z1 == z0 + 1 ∧ y3 != z1)

• then (2) translates to SAT and uses a fast solver to find a counterexample.
Execution Representation

  (z0 == y0 ∧ y1 == y0 - 1 ∧ y2 == y0 + 1 ∧ guard1 == x0 > 0 ∧ y3 == guard1?y1:y2 ∧ z1 == z0 + 1 ∧ y3 != z1)

• Remove the assertion (the last conjunct, y3 != z1) to get an equation for any execution of the program.
Execution Representation: Counterexample

  (z0 == y0 ∧ y1 == y0 - 1 ∧ y2 == y0 + 1 ∧ guard1 == x0 > 0 ∧ y3 == guard1?y1:y2 ∧ z1 == z0 + 1 ∧ y3 != z1)

  x0 == 1
  y0 == 5
  z0 == 5
  y1 == 4
  y2 == 6
  guard1 == true
  y3 == 4
  z1 == 6

• An execution is represented by assignments to all variables in the equations.
Execution Representation: Passing Trace

  (z0 == y0 ∧ y1 == y0 - 1 ∧ y2 == y0 + 1 ∧ guard1 == x0 > 0 ∧ y3 == guard1?y1:y2 ∧ z1 == z0 + 1 ∧ y3 == z1)

  x0 == 0
  y0 == 5
  z0 == 5
  y1 == 4
  y2 == 6
  guard1 == false
  y3 == 6
  z1 == 6

• Use the assertion (not negated) to find a passing trace.
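Because an execution is just an assignment to every variable, checking whether an assignment is a counterexample or a passing trace amounts to evaluating the conjunction. The sketch below writes the conjunction as a Python predicate; the function name and the `assertion_holds` flag are hypothetical, and the two assignments are the ones shown on the slides.

```python
def constraints(v, assertion_holds):
    """The slides' conjunction; the last conjunct is y3 == z1 or its negation."""
    program = (
        v["z0"] == v["y0"]
        and v["y1"] == v["y0"] - 1
        and v["y2"] == v["y0"] + 1
        and v["guard1"] == (v["x0"] > 0)
        and v["y3"] == (v["y1"] if v["guard1"] else v["y2"])
        and v["z1"] == v["z0"] + 1
    )
    return program and ((v["y3"] == v["z1"]) == assertion_holds)

counterexample = dict(x0=1, y0=5, z0=5, y1=4, y2=6, guard1=True, y3=4, z1=6)
passing_trace = dict(x0=0, y0=5, z0=5, y1=4, y2=6, guard1=False, y3=6, z1=6)

print(constraints(counterexample, assertion_holds=False))  # True
print(constraints(passing_trace, assertion_holds=True))    # True
```

A SAT solver searches for such an assignment instead of being handed one; the predicate only shows what it means for an assignment to satisfy the formula.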
Execution Representation: Counterexample and Successful Execution

  Variable   Counterexample   Successful execution
  x0         1                0                      (differs)
  y0         5                5
  z0         5                5
  y1         4                4
  y2         6                6
  guard1     true             false                  (differs)
  y3         4                6                      (differs)
  z1         6                6

• Each execution is represented by assignments to all variables in the equations.

The Distance Metric d
• d = number of changes (Δs) between two executions.
• Here d = 3: the executions differ in x0, guard1, and y3.
• 3 is the minimum possible distance between the counterexample and a successful execution.
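The count can be reproduced by comparing the two assignments variable by variable. This is a minimal sketch using the values from the slide; the dictionary representation is an assumption made for the example.

```python
counterexample = dict(x0=1, y0=5, z0=5, y1=4, y2=6, guard1=True, y3=4, z1=6)
successful = dict(x0=0, y0=5, z0=5, y1=4, y2=6, guard1=False, y3=6, z1=6)

# A variable contributes to d when its value differs between the two executions.
differing = [name for name in counterexample if counterexample[name] != successful[name]]
d = len(differing)

print(differing)  # ['x0', 'guard1', 'y3']
print(d)          # 3
```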
The Distance Metric d
• To compute the metric, add a new SAT variable for each potential Δ:

  Counterexample    New SAT variable
  x0 == 1           Δx0 == (x0 != 1)
  y0 == 5           Δy0 == (y0 != 5)
  z0 == 5           Δz0 == (z0 != 5)
  y1 == 4           Δy1 == (y1 != 4)
  y2 == 6           Δy2 == (y2 != 6)
  guard1 == true    Δguard1 == !guard1
  y3 == 4           Δy3 == (y3 != 4)
  z1 == 6           Δz1 == (z1 != 6)

• Then minimize the sum of the Δ variables (treated as 0/1 values): a pseudo-Boolean problem.
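The optimization step can be illustrated by brute force on this tiny program. The sketch below is not how a pseudo-Boolean solver works: it simply enumerates inputs from a small, arbitrarily chosen domain, keeps the executions that satisfy the assertion, and picks one with the smallest sum of Δ values. The helper names and the domain bounds are assumptions made for the example.

```python
from itertools import product

def execute(x0, y0):
    """Run the SSA program and return the assignment to all variables."""
    z0 = y0
    y1 = y0 - 1
    y2 = y0 + 1
    guard1 = x0 > 0
    y3 = y1 if guard1 else y2
    z1 = z0 + 1
    return dict(x0=x0, y0=y0, z0=z0, y1=y1, y2=y2, guard1=guard1, y3=y3, z1=z1)

def deltas(counterexample, execution):
    """One 0/1 value per variable: 1 if it differs from the counterexample."""
    return {name: int(execution[name] != counterexample[name]) for name in counterexample}

counterexample = execute(1, 5)
domain = range(-8, 9)  # hypothetical small input domain
runs = (execute(x, y) for x, y in product(domain, repeat=2))
passing = [e for e in runs if e["y3"] == e["z1"]]  # executions satisfying the assertion

closest = min(passing, key=lambda e: sum(deltas(counterexample, e).values()))
d = sum(deltas(counterexample, closest).values())
changed = [name for name, bit in deltas(counterexample, closest).items() if bit]

print(d)        # 3
print(changed)  # ['x0', 'guard1', 'y3']
```

Several passing executions tie at distance 3 (any `x0 <= 0` with `y0 == 5`); the enumeration returns the first one it meets, whereas the slide shows the one with `x0 == 0`.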
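Explanation with Distance Metrics
[Figure: the same pipeline with concrete tools: CBMC as the model checker, explain as the BMC/constraint generator, and PBS as the optimization tool; inputs P+spec, intermediate results C and S, outputs -C and the Δs]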
Discussion
• Usefulness of fault localization techniques
• Effectiveness:
  • Precision: few false negatives
  • Informativeness: enough clues to make a fix or to refute the report
• Efficiency:
  • Performance: it should run within the budget constraints.
  • Scalability: the ability to run on real-size programs.
• Information usage: making the most of the information available.
Discussion
[Figure: fault localization diagram; labels: Input, Test Cases, Program, Specification, Comments, Development History, Developers, Fault Localization, Suspicious components]