Pruning Dynamic Slices With Confidence
Presented by: David Carrillo
Original by: Xiangyu Zhang, Neelam Gupta, Rajiv Gupta (The University of Arizona)
Dynamic Slicing
    ……
    10. A = …
    ……
    20. B = …
    ……
    30. P = …
    31. If (P<0) {
    ……
    35.   A = A + 1
    36. }
    37. B = B + 1
    ……
    40. Error(A)
• A dynamic slice is the set of statements that did affect the value of a variable at a program point for a specific program execution. [Korel and Laski, 1988]
• Dynamic Slice (A@40) = {10, 30, 31, 35, 40}

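The slice on this slide can be reproduced with a small sketch: a backward traversal over the dependences that were exercised in the run. The `deps` table is an assumption written by hand for the execution in which P < 0 held (it is not produced by a real tracer), and it works at statement level rather than per execution instance.

```python
def dynamic_slice(deps, criterion):
    """Collect every statement reachable backward from `criterion`."""
    seen, work = set(), [criterion]
    while work:
        node = work.pop()
        if node not in seen:
            seen.add(node)
            work.extend(deps.get(node, ()))
    return seen

# Hypothetical dynamic dependences for the run where P < 0 was true.
deps = {
    40: [35],      # Error(A) uses the A defined at 35
    35: [10, 31],  # A = A + 1 uses A from 10 and is control dependent on 31
    31: [30],      # if (P < 0) uses the P defined at 30
    37: [20],      # B = B + 1 uses the B defined at 20
}

slice_a = dynamic_slice(deps, 40)
print(sorted(slice_a))  # [10, 30, 31, 35, 40]
```

Statements 20 and 37 stay out of the slice because nothing that reaches Error(A) depends on B.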
Effectiveness of Dynamic Slicing
• Dynamic slicing is very effective at containing the faulty statement, but it usually produces oversized slices [AADEBUG'05].
• Problem: how can dynamic slices be pruned automatically?
• Many approaches exist. This paper presents fine-grained pruning of a backward slice by using confidence analysis.

Types of Evidence Used in Pruning
[Figure: a buggy execution with inputs (input0, input2, input_x), a predicate (predicate_x), and outputs (output0, output1, output2, output_x)]
• Classical dynamic slicing algorithms investigate bugs through the evidence of the wrong output.
• The literature uses many different types of evidence; this paper studies "partially correct output".
• Benefits of more evidence:
  • Narrow the search for the faulty statement.
  • Broaden the applicability of the tool.

Fine-grained Pruning by Exploiting Correct Outputs
• Correct outputs are produced in addition to the wrong output.
• DS(Owrong) – DS(Ocorrect) contains the statements that affect the wrong output but not the correct output.
    ……
    10. A = 1 (Correct: A=3)
    ……
    20. B = A % 2
    ……
    30. C = A + 2
    ……
    40. Print (B)
    41. Print (C)
• DS(C@41) = {10, 30, 41}
• DS(B@40) = {10, 20, 40}
• DS(C@41) – DS(B@40) = {30, 41}
• The faulty statement 10 is in both slices, so the plain set difference removes it.
• What happens when a statement affects both correct and incorrect output?

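The example above can be checked with a short sketch. The function `run` is a hypothetical stand-in for statements 20 and 30; the two slices are copied from the slide.

```python
def run(a):
    """Statements 20 and 30 of the example, for a given value of A."""
    b = a % 2  # statement 20
    c = a + 2  # statement 30
    return b, c

buggy_b, buggy_c = run(1)        # statement 10 as written: A = 1
expected_b, expected_c = run(3)  # statement 10 as intended: A = 3

# B happens to be correct, C is wrong.
print(buggy_b == expected_b, buggy_c == expected_c)  # True False

ds_wrong = {10, 30, 41}    # DS(C@41), from the slide
ds_correct = {10, 20, 40}  # DS(B@40), from the slide

pruned = ds_wrong - ds_correct
shared = ds_wrong & ds_correct
print(sorted(pruned))  # [30, 41]
print(sorted(shared))  # [10]
```

The shared statement is exactly the faulty one, which is why a finer-grained measure than set difference is needed.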
Confidence Analysis
• The value produced at node n can reach only wrong output nodes. There is no evidence that n is correct, so it should be in the pruned slice. Confidence(n) = 0
• The value produced at node n can reach only correct outputs. There is no evidence of incorrectness of n, so it does not need to be in the pruned slice. Confidence(n) = 1
• The value produced at node n can reach both the correct and wrong output nodes. Should we include n in the slice? Confidence(n) = ?, with 0 ≤ ? ≤ 1

Confidence Analysis
[Figure: node n taking the values Value(n) = a, Value(n) = b, Value(n) = c]
• Range(n) is the set of all values taken by n during the buggy run, e.g. Range(n) = {a, b, c, d, e, f, g}.
• Alt(n) is the set of possible values of the variable defined by n that, when propagated through the dynamic dependence graph, produce the same values for the correct outputs, e.g. Alt(n) = {a, c}.
• |Range(n)| ≥ |Alt(n)| ≥ 1
• When |Alt(n)| = 1, we have the highest confidence (= 1) in the correctness of n.
• When |Alt(n)| = |Range(n)|, we have the lowest confidence (= 0).
• Between these extremes, confidence is given by an experimentally determined function.

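The slide only fixes the two endpoints and says the function in between was determined experimentally. As an illustration, the sketch below assumes a logarithmic form, 1 − log|Alt(n)| / log|Range(n)|, which meets both endpoints; the name `confidence` and this exact formula are assumptions, not something the slides state.

```python
import math

def confidence(alt_size, range_size):
    """Assumed form: 1 - log(|Alt|) / log(|Range|), clamped to the slide's endpoints."""
    if not 1 <= alt_size <= range_size:
        raise ValueError("need 1 <= |Alt(n)| <= |Range(n)|")
    if alt_size == 1:
        return 1.0  # only one value is consistent with the correct outputs
    return 1.0 - math.log(alt_size) / math.log(range_size)

high = confidence(1, 7)  # |Alt| = 1
low = confidence(7, 7)   # |Alt| = |Range|
mid = confidence(2, 7)   # e.g. Alt(n) = {a, c} within a 7-value range
print(high, low, 0.0 < mid < 1.0)
```

Any monotonically decreasing function with the same endpoints would fit the slide equally well.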
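Confidence Analysis: Example
    ……
    10. A = ...
    ……
    20. B = A % 2
    ……
    30. C = A + 2
    ……
    40. Print (B)
    41. Print (C)
• Followed backward from B to A, A % 2 is a one-to-many (2) mapping: a given value of B is consistent with many values of A.
• A + 2 is a one-to-one mapping: a given value of C is consistent with exactly one value of A.
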
Confidence Analysis: Two Problems
• How to decide the Range of values for a node n?
  • Based on variable type (e.g., Integer).
  • Static range analysis.
  • Our choice: dynamic analysis based on value profiles. The Range of values for a statement is the set of values defined by all of the execution instances of the statement during the program run.
• How to compute Alt(n)?
  • Consider the set of correct output values as constraints.
  • Compute Alt(n) by backward propagation of constraints through the dynamic dependence subgraph corresponding to the slice.

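The value-profile choice for Range(n) amounts to collecting, per statement, every value its execution instances defined. A minimal sketch follows; the trace format (a list of `(statement, value)` pairs) and the sample trace are made up for illustration.

```python
from collections import defaultdict

def collect_ranges(trace):
    """Map each statement to the set of values its instances defined."""
    ranges = defaultdict(set)
    for stmt, value in trace:
        ranges[stmt].add(value)
    return dict(ranges)

# Hypothetical trace: S1 executes three times, S2 twice.
trace = [("S1", 9), ("S2", 10), ("S1", 5), ("S2", 6), ("S1", 9)]
ranges = collect_ranges(trace)
print(sorted(ranges["S1"]))  # [5, 9]
print(sorted(ranges["S2"]))  # [6, 10]
```

Repeated values collapse because Range is a set, so |Range(n)| counts distinct values rather than execution instances.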
Computing Alt(n) Along Data Dependence
• Value profiles observed during the run:
  • S1: (T,...) = (1,...) (3,...) (5,...) (8,...) (9,...)
  • S2: (X,T) = (6,5) (9,8) (10,9)
  • S3: (Y,T) = (0,3) (0,9) (1,1) (2,5) (2,8)
• S1: T = ... (value 9)
• S2: X = T + 1 (value 10), alt(S2) = {10}, so alt(T@S2) = {9}
• S3: Y = T % 3 (value 0), alt(S3) = {0,1}, so alt(T@S3) = {1,3,9}
• alt(S1) = alt(T@S2) ∩ alt(T@S3) = {9}

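The propagation step on this slide can be sketched as a lookup in the value profiles: keep every operand value that was observed together with a result in the statement's alt set, then intersect across uses. The helper name `alt_of_operand` is hypothetical; the profiles and alt sets are the ones on the slide.

```python
def alt_of_operand(profile, alt_result):
    """profile holds (result, operand) pairs observed at one statement."""
    return {operand for result, operand in profile if result in alt_result}

profile_s2 = {(6, 5), (9, 8), (10, 9)}                 # (X, T) at S2: X = T + 1
profile_s3 = {(0, 3), (0, 9), (1, 1), (2, 5), (2, 8)}  # (Y, T) at S3: Y = T % 3

alt_t_s2 = alt_of_operand(profile_s2, {10})    # alt(S2) = {10}
alt_t_s3 = alt_of_operand(profile_s3, {0, 1})  # alt(S3) = {0, 1}
alt_s1 = alt_t_s2 & alt_t_s3                   # T must satisfy both uses

print(sorted(alt_t_s2))  # [9]
print(sorted(alt_t_s3))  # [1, 3, 9]
print(sorted(alt_s1))    # [9]
```

Because the lookup only consults observed pairs, alt sets never contain values outside the statement's Range.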
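Computing Alt(n) Along Control Dependence
• Value profiles observed during the run:
  • S2: (X,T) = (6,5) (9,8) (10,9)
  • S3: (Y,T) = (0,3) (0,9) (1,1) (2,5) (2,8)
• S1: if (P) … (value True)
• S2: X = T + 1 (value 10), alt(S2) = {10}
• S3: Y = T % 3 (value 0), alt(S3) = {0,1}
• alt(S1) = {True}
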
Characteristics of Siemens Suite Programs
• Each faulty version has a single manually injected error.
• Not all versions are included. A version is excluded when:
  • No output is produced.
  • The faulty statement is not contained in the backward slice.
• For each version, three tests were selected.

Results of Pruning
• On average, PDSmax = 41.1% of DS.

Confidence Based Prioritization
[Figure: executed statement instances examined (%), comparing DD (dependence distance) with CV (confidence values)]
• Prior work has shown that dependence distance is an effective way to prioritize statements in order to locate faulty code.
• Experiments in this paper show that prioritizing by confidence values outperforms prioritizing by dependence distance.

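The two orderings compared on this slide can be sketched as two sort keys over the statements in a slice. The candidate records, their field names (`dd`, `cv`) and all numbers below are invented for illustration and are not results from the paper.

```python
def rank_by_distance(candidates):
    """DD: examine statements closest to the wrong output first."""
    return [c["stmt"] for c in sorted(candidates, key=lambda c: c["dd"])]

def rank_by_confidence(candidates):
    """CV: examine lowest-confidence statements first, ties broken by distance."""
    return [c["stmt"] for c in sorted(candidates, key=lambda c: (c["cv"], c["dd"]))]

# Hypothetical slice: statement 12 is far from the output but has low confidence.
candidates = [
    {"stmt": 40, "dd": 0, "cv": 0.0},
    {"stmt": 35, "dd": 1, "cv": 0.9},
    {"stmt": 31, "dd": 2, "cv": 0.8},
    {"stmt": 12, "dd": 3, "cv": 0.1},
]

print(rank_by_distance(candidates))    # [40, 35, 31, 12]
print(rank_by_confidence(candidates))  # [40, 12, 31, 35]
```

In this made-up case a fault at statement 12 would be reached second under CV and last under DD.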
The Potential of Confidence Analysis (1)
• Interactive pruning: incorporate user input into pruning.
[Figure: buggy code and input feed a dynamic slicer with confidence, which produces pruned slices; statements the user has verified as correct are fed back into the slicer]

The Potential of Confidence Analysis (2)
• Relevant slicing (gzip v3 run r1)
[Figure: dependence graph distinguishing potential dependences from data dependences]
• Dynamic slices do not capture bugs where a data dependence is incorrect because of incorrect control flow.
• Relevant slicing does, but it generates slices that are too large. It may become effective when combined with effective pruning.

Conclusions
• Confidence analysis exploits the correct output values produced in an execution to prune the dynamic slice of an incorrect output.
• This novel dynamic-analysis-based implementation of confidence analysis effectively pruned backward dynamic slices in our experiments.
  • Pruned slices = 41.1% of dynamic slices, and they still contain the faulty statement.
• Our study shows that confidence analysis has additional applications beyond pruning: prioritization, interactive pruning, and relevant slicing.

Discussion
• Creating alternatives relies on a known mapping for each type of statement.
  • e.g. X = Y + 1 is one-to-one.
  • e.g. X = X % 3 is one-to-many.
  • How extensible is this approach (floats, objects, etc.)?
• This approach assumes that:
  • There is only one error (detected).
  • The error is detected before it propagates into its dependencies.
  • How realistic are these assumptions in real scenarios with incomplete test coverage?