This paper presents a statistical debugging algorithm for isolating multiple bugs in large-scale programs, outperforming previous algorithms. The algorithm selects and ranks bug predictors, allowing for better bug fixing. The experiments validate the algorithm and reveal bug frequencies and circumstances.
Scalable Statistical Bug Isolation Authors: B. Liblit, M. Naik, A.X. Zheng, A. Aiken, M. I. Jordan Presented by S. Li
Outline • Background • Definitions • Motivations • Contributions • Algorithm • Visualization of the algorithm • Experiments • Personal Comments
Definitions • A bug predicate is denoted by P • Each P is associated with a particular program point, for instance, if (ptr == NULL), or int a = 10; • , if P1, P2 are associated with the same program point. • R(P) = 1 if P is observed to be true at least once during run R; otherwise R(P) = 0 • A bug is denoted by B (or b) • A bug profile is denoted by
Definitions (cont.) • A bug profile is a set of failing runs that share b as the cause. • More than one bug may occur in a failing run • R(P) = 1 indicates that P is a bug predictor, Likely,
Motivations • A traditional technique used in statistical bug prediction • Regularized logistic regression, which selects the predicates that best predict the outcome of every run • Scalability problems arise in large-scale programs
Motivations cont. • Scalability problems of regularized logistic regression • The set of predicates P is logically redundant • It is difficult to identify the predicates that actually matter for the specific bugs causing different failures.
Contributions • To propose a statistical debugging algorithm • To isolate bugs in programs that contain multiple undiagnosed bugs • To outperform earlier algorithms of the same kind • To validate the algorithm by experiments • To reveal the circumstances under which bugs occur as well as the frequencies of failing runs
Statistical Debugging algorithm • To automatically isolate multiple bugs • To select a set S of predictors • To rank the predictors in S from most to least important • To make the predictors in S and their associated metrics available to help fix the most serious bugs
Statistical Debugging algorithm cont. • Steps: • Identify the most important bug B • In practice, not bug B itself but a predicate P closely correlated with its bug profile • Fix B, and repeat • Fixing is done by simulating the program's behavior without bug B
Statistical Debugging algorithm cont. • To identify the bug • To select the predicates that are most likely to correspond to its bug profile • P1, P2, P3, …, P, ranked in order of importance • R(P) = 1 • Bug profiles have unknown size and membership
Statistical Debugging algorithm cont. • To simulate fixing bug B and repeat • To discard every run R such that R(P) = 1 • To recursively apply the algorithm to the remaining runs • To prune P = {P1, P2, P3, …, PB} by: • Reducing the importance of the predictors of B • Re-ranking the predictors P, allowing other predictors to rise to the top in subsequent iterations.
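The select, discard, and repeat loop described on the last three slides can be sketched as follows. This is a minimal illustration, not the authors' implementation: the `Run` representation, the function names, and in particular `stand_in_score` (failing runs minus successful runs in which the predicate is true) are assumptions made for the sketch; any ranking metric can be passed in its place.

```python
from typing import Callable, FrozenSet, Iterable, List, Tuple

# Hypothetical run model: (failed?, predicates observed to be true in the run).
Run = Tuple[bool, FrozenSet[str]]


def stand_in_score(pred: str, runs: List[Run]) -> float:
    """Placeholder ranking, NOT the metric from the paper:
    failing runs with pred true minus successful runs with pred true."""
    f = sum(1 for failed, true_preds in runs if failed and pred in true_preds)
    s = sum(1 for failed, true_preds in runs if not failed and pred in true_preds)
    return f - s


def isolate_bugs(
    runs: Iterable[Run],
    predicates: Iterable[str],
    score: Callable[[str, List[Run]], float] = stand_in_score,
) -> List[str]:
    """Return predictors ranked from most to least important."""
    remaining = list(runs)
    candidates = set(predicates)
    ranked: List[str] = []
    # Stop when no candidates or no failing runs are left to explain.
    while candidates and any(failed for failed, _ in remaining):
        # sorted() makes tie-breaking deterministic (alphabetical).
        best = max(sorted(candidates), key=lambda p: score(p, remaining))
        if score(best, remaining) <= 0:
            break  # nothing left that predicts failure
        ranked.append(best)
        candidates.remove(best)
        # Simulate fixing the bug: discard every run R with R(best) = 1.
        remaining = [r for r in remaining if best not in r[1]]
    return ranked


example_runs: List[Run] = [
    (True, frozenset({"f == NULL", "x == 0"})),
    (True, frozenset({"f == NULL", "x == 0"})),
    (True, frozenset({"len > max"})),
    (False, frozenset()),
    (False, frozenset()),
    (False, frozenset()),
]
ranking = isolate_bugs(example_runs, ["f == NULL", "x == 0", "len > max"])
```

In this made-up data set, discarding the runs covered by the first predictor also removes every run in which the redundant predicate `x == 0` is true, so it never rises to the top, while the predictor of the second bug does.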
Statistical Debugging algorithm cont. • To analyze a simple piece of code in order to introduce the equations used in the algorithm. Consider the following C code: • f = …; Line (a) • if (f == NULL) { Line (b) • x = 0; Line (c) • *f; } Line (d) The bug in this example is deterministic, because…
Statistical Debugging algorithm cont. • For non-deterministic bugs, consider the following code: • f = …; Line (a) • if (f == NULL) { Line (b) • x = 0; Line (c) • if (….) f =.. // some valid pointer… • *f; } Line (d) The bug in this example is non-deterministic with respect to (b)
Statistical Debugging algorithm cont. • Failure(P) is the probability that P being true implies failure: Failure(P) = F(P) / (S(P) + F(P)). F(P): the number of failing runs in which P is observed to be true. S(P): the number of successful runs in which P is observed to be true.
Statistical Debugging algorithm cont. • Failure(P) = 1.0: the bug is deterministic with respect to P; equivalently, P is never observed to be true in a successful run, S(P) = 0 • Failure(P) < 1.0: the bug is non-deterministic with respect to P
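As a small sketch of the score on this slide, assuming Failure(P) is the fraction of runs with P true that fail (which agrees with Failure(P) = 1.0 exactly when S(P) = 0), the computation from the two counts looks like this. The function names are illustrative only.

```python
def failure(f_true: int, s_true: int) -> float:
    """Failure(P) = F(P) / (S(P) + F(P)).

    f_true: failing runs in which P was observed to be true, F(P)
    s_true: successful runs in which P was observed to be true, S(P)
    """
    total = f_true + s_true
    if total == 0:
        # The score is undefined when P was never observed to be true.
        raise ValueError("P was never observed to be true")
    return f_true / total


def is_deterministic(f_true: int, s_true: int) -> bool:
    """A bug is deterministic with respect to P when Failure(P) = 1.0."""
    return failure(f_true, s_true) == 1.0
```

For example, a predicate true in 3 failing runs and 1 successful run scores 0.75 and is therefore non-deterministic.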
Statistical Debugging algorithm cont. • Failure(P) alone is not enough. Consider… • f = …; Line (a) • if (f == NULL) { Line (b) • x = 0; Line (c) • *f; } Line (d) • Failure(f == NULL) = 1.0, good • However, Failure(x == 0) = 1.0 as well. Why? x == 0 is always true at line (c), and only failing runs reach it
Statistical Debugging algorithm cont. • Thus a high Failure(P) does not mean that P is the cause of a bug; it only means that the predicate is checked on a path taken by failing runs. • In the case of (x == 0), the decision that causes the failure is made earlier, at (f == NULL)
Statistical Debugging algorithm cont. • Context(P) and Increase(P) are introduced to address this issue: Context(P) = F(P observed) / (S(P observed) + F(P observed)), Increase(P) = Failure(P) - Context(P). A predicate is judged not only by the chance that it implies failure, but also by how much P being observed true adds over simply reaching the point where P is checked. • This eliminates predicates irrelevant to the bug, like (x == 0) in the above example
Statistical Debugging algorithm cont. • In the above example Failure(x == 0) = Context(x == 0) = 1.0, and so Increase(x == 0) = 0 • Conclusion: a predicate P with Increase(P) ≤ 0 is not useful for prediction and can be discarded.
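The worked example can be checked numerically. The sketch below assumes Context(P) is the failure fraction among runs that merely reach the point where P is checked, and Increase(P) = Failure(P) - Context(P), which matches Failure(x == 0) = Context(x == 0) = 1.0 above. The `Run` record, the run counts, and the choice to return 0.0 for an empty denominator are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Run:
    failed: bool
    # predicate -> its truth value if its program point was reached;
    # a predicate is absent if the run never reached that point.
    observations: Dict[str, bool]


def _ratio(f: int, s: int) -> float:
    # Assumption for the sketch: score 0.0 when there is no data at all.
    return f / (f + s) if (f + s) else 0.0


def scores(pred: str, runs: List[Run]) -> Tuple[float, float, float]:
    """Return (Failure(P), Context(P), Increase(P)) for one predicate."""
    f_true = sum(1 for r in runs if r.failed and r.observations.get(pred) is True)
    s_true = sum(1 for r in runs if not r.failed and r.observations.get(pred) is True)
    f_obs = sum(1 for r in runs if r.failed and pred in r.observations)
    s_obs = sum(1 for r in runs if not r.failed and pred in r.observations)
    fail = _ratio(f_true, s_true)
    ctx = _ratio(f_obs, s_obs)
    return fail, ctx, fail - ctx


# Made-up runs of the deterministic example: failing runs take the
# f == NULL branch and reach line (c); successful runs never do.
runs = (
    [Run(True, {"f == NULL": True, "x == 0": True}) for _ in range(2)]
    + [Run(False, {"f == NULL": False}) for _ in range(3)]
)

null_scores = scores("f == NULL", runs)  # Failure 1.0, Context 0.4
zero_scores = scores("x == 0", runs)     # Failure 1.0, Context 1.0

# Keep only predicates whose Increase is strictly positive.
kept = [p for p in ("f == NULL", "x == 0") if scores(p, runs)[2] > 0]
```

Both predicates have Failure = 1.0, but only `f == NULL` has a positive Increase, so `x == 0` is the one that gets discarded.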
Visualization of Algorithm • A thermometer is used to visualize the experimental results • The length of the thermometer: number of runs in which the predicate is observed • Black band on the left: Context(P) • Red band: Increase(P) • White band: number of successful runs, S(P)
Visualization of Algorithm cont. • It shows F(P) after discarding the predicates with negative Increase(P) • The large white bands reveal that these predicates are non-deterministic • The very narrow red bands indicate that the Increase scores are small • With high Increase scores • Super-bug predicate, combining multiple bugs
Visualization of Algorithm cont. • The above observations suggest the following metric for predicates
Experiments • To validate the statistical debugging algorithm in five case studies. • To determine how many runs are needed, let Importance_N(P) be the importance of P computed using N runs, and look for N such that Importance_32,000(P) - Importance_N(P) < 0.2
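The run-count criterion can be sketched as a simple search. The function name and the sample numbers below are hypothetical and are not results from the paper; the importance values are taken as given inputs, since this sketch does not compute the importance metric itself.

```python
from typing import Dict, Optional


def runs_needed(
    importance_by_n: Dict[int, float],
    full_importance: float,
    tolerance: float = 0.2,
) -> Optional[int]:
    """Smallest N with full_importance - Importance_N(P) < tolerance.

    importance_by_n maps a sample size N to Importance_N(P);
    full_importance is the value computed from all available runs.
    Returns None if no listed N satisfies the criterion.
    """
    for n in sorted(importance_by_n):
        if full_importance - importance_by_n[n] < tolerance:
            return n
    return None


# Made-up illustration values only.
sample = {100: 0.10, 500: 0.45, 1000: 0.60}
needed = runs_needed(sample, full_importance=0.62)
```

With these illustrative values the criterion is first met at N = 500, because 0.62 - 0.45 = 0.17 is below the 0.2 threshold.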
Personal Comments • Likes • Well structured: the problems are presented first, then the proposed solutions, step by step • Uses a real and simple example to explain the problems and difficulties that lie in the research • Gives statistical interpretation through visualization, using the observations to explain the abstract mathematical equations
Personal Comments • Dislikes • They do not mention whether their research could be extended to isolating potential bugs, e.g. bugs of lesser importance that will probably cause failures in the future • The dark (red) band and the grey (pink) band in the pictures are hard to tell apart if the paper is printed in black and white.