
Locating Causes of Program Failures

Locating Causes of Program Failures. Texas State University CS 5393 Software Quality Project Yin Deng. Topics. Introduction What is the problem? Overview of major solutions A Sample Failure Case Study Complexity and other issues Conclusion Related Material. Introduction.



  1. Locating Causes of Program Failures Texas State University CS 5393 Software Quality Project Yin Deng

  2. Topics • Introduction • What is the problem? • Overview of major solutions • A Sample Failure • Case Study • Complexity and other issues • Conclusion • Related Material

  3. Introduction • Locating Causes of Program Failures • Holger Cleve and Andreas Zeller • ICSE 2005, research papers on Fault Localization • Holger Cleve is a member of the software engineering research group at Saarland University in Germany. • Andreas Zeller is a full professor and the chair of the software engineering research group at Saarland University. His research in SE focuses on analyzing why large, complex software systems fail to work as they should. http://www.st.cs.uni-sb.de/~cleve/ http://www.st.cs.uni-sb.de/zeller/

  4. What’s the Problem? • Definitions • Failure: A program’s behavior doesn’t satisfy its requirement specification. • Fault / Infection: An incorrect intermediate state that may be entered during program execution. • Failure ⇒ Infection ⇒ Defect in code, but not vice versa. • Problem • Why does the program fail? • How do we find the defects that cause a software failure?

  5. Overview of major solutions • Searching in Space • Search across a program state to find the infected variable(s), often among thousands. • Focus on the difference between the program states where the failure occurs and the states where it does not. • Using Delta Debugging, those initial differences can be systematically narrowed down to a small set of variables. • Searching in Time • Search over millions of program states to find the moment when the defect was executed. • Focus on cause transitions (CTS).

  6. Searching in Space • Compare the program states of a passing run r✔ and a failing run r✘ at a certain moment. • Of all the state differences, only some may be relevant for the failure. • How do we find a subset of relevant variables that is as small as possible? • Delta Debugging, which behaves very much like a binary search.
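The narrowing step can be sketched in C as follows. This is a simplified minimizing variant of Delta Debugging (often called ddmin), not necessarily the exact isolation algorithm the authors use. The names `fails_fn`, `dd_minimize`, `example_fails`, and `MAX_DIFFS` are hypothetical, and the oracle simply assumes that one particular difference (number 3) is the only failure-inducing one.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_DIFFS 64 /* assumed upper bound on the number of state differences */

/* Hypothetical oracle: true if applying this subset of state differences
 * to the passing run makes it fail. */
typedef bool (*fails_fn)(const int *diffs, size_t n);

/* Copy diffs[] without the chunk [start, end) into out[]; returns the new length. */
static size_t complement(const int *diffs, size_t n,
                         size_t start, size_t end, int *out)
{
    size_t m = 0;
    for (size_t i = 0; i < n; i++)
        if (i < start || i >= end)
            out[m++] = diffs[i];
    return m;
}

/* Shrinks diffs[] in place to a subset that still fails and from which no
 * single difference can be removed. Requires n <= MAX_DIFFS and that the
 * full set fails. Returns the new length. */
size_t dd_minimize(int *diffs, size_t n, fails_fn fails)
{
    size_t granularity = 2;
    while (n >= 2) {
        size_t chunk = (n + granularity - 1) / granularity;
        bool reduced = false;
        for (size_t start = 0; start < n && !reduced; start += chunk) {
            size_t end = start + chunk < n ? start + chunk : n;
            int rest[MAX_DIFFS];
            size_t m = complement(diffs, n, start, end, rest);
            if (fails(diffs + start, end - start)) {
                /* The chunk alone fails: keep only the chunk. */
                memmove(diffs, diffs + start, (end - start) * sizeof *diffs);
                n = end - start;
                granularity = 2;
                reduced = true;
            } else if (m > 0 && fails(rest, m)) {
                /* Everything except the chunk fails: drop the chunk. */
                memcpy(diffs, rest, m * sizeof *diffs);
                n = m;
                granularity = granularity > 2 ? granularity - 1 : 2;
                reduced = true;
            }
        }
        if (!reduced) {
            if (granularity >= n)
                break; /* already testing single differences: done */
            granularity = granularity * 2 < n ? granularity * 2 : n;
        }
    }
    return n;
}

/* Toy oracle for illustration: the run fails iff difference 3 is applied. */
bool example_fails(const int *diffs, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (diffs[i] == 3)
            return true;
    return false;
}
```

With eight differences numbered 0 to 7 and the toy oracle, `dd_minimize` halves the candidate set three times and is left with the single difference 3, which is the binary-search-like behavior the slide describes.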

  7. Searching in Time • A cause transition is where a cause originates. It points to the program code that causes the transition and hence the failure. • During transitions, some variables cease to be a failure cause and others begin to be one. • Cause transitions are not only good locations for fixes, they actually locate the defects that cause the failure.
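A minimal C sketch of the search in time, assuming a hypothetical oracle `cause_at(step)` that reports which variable is the failure cause at a given execution step (in the real technique this answer would come from a Delta Debugging run on the two states at that step). The names `cause_at_fn`, `find_cause_transition`, and `toy_cause_at`, as well as the step numbers in the toy oracle, are illustrative assumptions.

```c
#include <stddef.h>

/* Hypothetical oracle: id of the variable that is the failure cause at a step. */
typedef int (*cause_at_fn)(size_t step);

/* Binary search for a cause transition. Requires lo < hi and
 * cause_at(lo) != cause_at(hi). Returns a step t in (lo, hi] such that
 * the cause at t - 1 equals the cause at lo and the cause at t differs. */
size_t find_cause_transition(size_t lo, size_t hi, cause_at_fn cause_at)
{
    int cause_lo = cause_at(lo);
    while (hi - lo > 1) {
        size_t mid = lo + (hi - lo) / 2;
        if (cause_at(mid) == cause_lo)
            lo = mid; /* transition lies after mid */
        else
            hi = mid; /* transition lies at or before mid */
    }
    return hi;
}

/* Toy oracle: variable 0 is the cause before step 10, variable 1 from
 * step 10 to 29, variable 2 from step 30 on. Purely illustrative. */
int toy_cause_at(size_t step)
{
    if (step < 10)
        return 0;
    if (step < 30)
        return 1;
    return 2;
}
```

Searching the toy oracle between steps 1 and 44 returns step 10; repeating the search between steps 10 and 44 returns step 30. Each further cause transition costs one more binary search over the remaining interval.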

  8. Example – Source Code
      1  /* sample.c -- Sample C program */
      2
      3  #include <stdio.h>
      4  #include <stdlib.h>
      5
      6  static void shell_sort(int a[], int size)
      7  {
      8      int i, j;
      9      int h = 1;
     10      do {
     11          h = h * 3 + 1;
     12      } while (h <= size);
     13      do {
     14          h /= 3;
     15          for (i = h; i < size; i++)
     16          {
     17              int v = a[i];
     18              for (j = i; j >= h && a[j - h] > v; j -= h)
     19                  a[j] = a[j - h];
     20              if (i != j)
     21                  a[j] = v;
     22          }
     23      } while (h != 1);
     24  }
     25
     26  int main(int argc, char *argv[])
     27  {
     28      int i = 0;
     29      int *a = NULL;
     30
     31      a = (int *)malloc((argc - 1) * sizeof(int));
     32      for (i = 0; i < argc - 1; i++)
     33          a[i] = atoi(argv[i + 1]);
     34
     35      shell_sort(a, argc);
     36
     37      for (i = 0; i < argc - 1; i++)
     38          printf("%d ", a[i]);
     39      printf("\n");
     40
     41      free(a);
     42      return 0;
     43  }

  9. Example – Running Result • A passing run r✔: $ sample 9 8 7 prints 7 8 9 • A failing run r✘: $ sample 11 14 prints 0 11 • What’s wrong?

  10. Example – Searching in Space State differences between the passing run r✔ and the failing run r✘. One of these differences causes sample to fail.

  11. Example – Searching in Space (cont.) • Procedure • Run the passing run r✔ up to Line 9. • Apply half of the differences to r✔. • Resume execution and determine the outcome. • Result • At Line 9, a[2] being zero causes the sample failure. • What causes a[2] to be zero?
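The effect of a zero in a[2] can be reproduced deterministically with a small sketch. In sample.c the array holds argc - 1 elements but shell_sort is called with argc, so for the run `sample 11 14` the sort also touches a[2], which lies outside the allocated array. To keep the behavior well-defined, the sketch below deliberately uses a three-element array and assumes the extra element holds zero, as the slide reports. The helper `run_sample` is hypothetical; `shell_sort` is copied unchanged from the sample program.

```c
/* shell_sort copied from sample.c, unchanged. */
static void shell_sort(int a[], int size)
{
    int i, j;
    int h = 1;
    do {
        h = h * 3 + 1;
    } while (h <= size);
    do {
        h /= 3;
        for (i = h; i < size; i++) {
            int v = a[i];
            for (j = i; j >= h && a[j - h] > v; j -= h)
                a[j] = a[j - h];
            if (i != j)
                a[j] = v;
        }
    } while (h != 1);
}

/* Hypothetical helper: sorts {11, 14, 0} using the given size and reports
 * the first two elements, which are the ones sample.c would print.
 * The third element stands in for the memory just past the real array;
 * its value 0 is an assumption taken from the slide. */
void run_sample(int size_passed, int out[2])
{
    int a[3] = {11, 14, 0};
    shell_sort(a, size_passed);
    out[0] = a[0];
    out[1] = a[1];
}
```

Calling `run_sample(3, out)` mirrors the call with argc and yields 0 and 11, matching the failing run. Calling `run_sample(2, out)` sorts only the two real elements and yields 11 and 14.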

  12. Example – Searching in Time

  13. Example – Searching in Time (cont.) • Procedure • Find an interval of matches to start with; • there was a cause transition between argc in Step 1 and a[0] in Step 44; • use Delta Debugging to find relevant variables between argc and a[0] (function calls are preferred); a[2] is isolated; • CTS: Step 26 (a[2] again); • CTS: Step 35 (v). • Result • argc → a[2] in Lines 32–35 (Steps 8–11); • a[2] → v in Line 17 (Step 29); • v → a[0] in Line 21 (Step 36).

  14. Example – Debugging Result

  15. Case Study: The GCC Failure The program that crashes GCC

  16. Complexity • Searching in space • Best case: Delta Debugging needs 2s log k test runs to isolate s failure-inducing variables from k state differences. • Worst case: k² + 3k test runs. • In practice, Delta Debugging behaves closer to logarithmic than to linear. • Searching in time • A simple binary search over n program steps, repeated for each cause transition. • For m cause transitions, we need m log n runs of Delta Debugging.
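To get a feel for these bounds, here is a short worked calculation. The values k = 32, s = 1, m = 3, and n = 44 are illustrative assumptions, and the logarithms are taken to base 2, which the slide does not state explicitly.

```latex
% Searching in space, k = 32 state differences, s = 1 failure-inducing variable
\text{best case: } 2s\log_2 k = 2 \cdot 1 \cdot \log_2 32 = 10 \text{ test runs}
\qquad
\text{worst case: } k^2 + 3k = 32^2 + 3 \cdot 32 = 1120 \text{ test runs}

% Searching in time, m = 3 cause transitions over n = 44 program steps
m \log_2 n = 3 \cdot \log_2 44 \approx 3 \cdot 5.46 \approx 16.4 \text{ runs of Delta Debugging}
```

Under these assumptions the gap between best and worst case in space is two orders of magnitude, which is why the practical, near-logarithmic behavior matters.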

  17. Practical Issues • Accessing state • Currently uses GDB, which is painfully slow; • more efficient ways need to be explored. • Capturing accurate states • Several heuristics are used to determine how state is transferred; • when such heuristics fail, the state cannot be transferred. • Incomparable states • When control flow reaches different points in the passing and the failing run, the resulting states are not comparable, simply because the set of local variables is different. • Some effort is required to determine when the control flows of the two runs diverge and converge.

  18. Conclusion • Cause transitions locate the software defect that causes a given failure, performing twice as well as any other technique previously known. • The technique requires an automated test, a means to observe and manipulate the program state, and at least one alternate passing test run. • The technique could be used as an add-on to running an automated test suite; we then know not only that a test has failed, but also why and where it failed.

  19. Related Material • Isolating cause-effect chains from computer programs. • A. Zeller. In W. G. Griswold, editor, Proc. Tenth ACM SIGSOFT Symposium on the Foundations of Software Engineering (FSE-10), pages 1–10, Charleston, South Carolina, Nov. 2002. ACM Press. • Simplifying and isolating failure-inducing input. • A. Zeller and R. Hildebrandt. IEEE Transactions on Software Engineering, 28(2):183–200, Feb. 2002. • Visualizing memory graphs. • T. Zimmermann and A. Zeller. In S. Diehl, editor, Proc. of the International Dagstuhl Seminar on Software Visualization, volume 2269 of Lecture Notes in Computer Science, pages 191–204, Dagstuhl, Germany, May 2002. Springer-Verlag. • Why Programs Fail: A Guide to Systematic Debugging. • A. Zeller. Morgan Kaufmann Publisher, October, 2005. • ISBN 1558608664.

  20. Any Questions?
