Has the Bug Really Been Fixed? Zhongxian Gu, Earl T. Barr, David J. Hamilton, Zhendong Su. University of California, Davis. ICSE 2010.
Has the Bug Really Been Fixed? Zhongxian Gu, Earl T. Barr, David J. Hamilton, Zhendong Su University of California, Davis ICSE 2010
Authors: Zhongxian Gu. Publications: • Has the Bug Really Been Fixed? Zhongxian Gu, Earl T. Barr, David J. Hamilton, Zhendong Su (ICSE 2010) • Effective Identification of Failure-Inducing Changes: A Hybrid Approach. Sai Zhang, Yu Lin, Zhongxian Gu, and Jianjun Zhao (PASTE 2008) • Change Impact Analysis for AspectJ Programs. Sai Zhang, Zhongxian Gu, Yu Lin, and Jianjun Zhao (ICSM 2008) • AutoFlow: An Automatic Debugging Framework for AspectJ Programs. Sai Zhang, Zhongxian Gu, Yu Lin, and Jianjun Zhao (ISSTA 2008) • Celadon: A Change Impact Analysis Tool for Aspect-Oriented Programs. Sai Zhang, Zhongxian Gu, Yu Lin, and Jianjun Zhao (ICSE 2008)
Authors: Zhendong Su. Publications: • Scalable and precise detection of buggy inconsistencies (OOPSLA'10) • How unique is source code? (FSE'10) • Perturbing numerical computation to detect instabilities (ISSTA'10) • Dynamic detection of unsafe component loadings (ISSTA'10, Distinguished Paper Award) • Has the bug really been fixed? (ICSE'10) • Simultaneously learning and enforcing temporal properties (ICSE'10). Current projects: • Automated debugging [ICSE'06, ASE'07] • Clone detection and similarity checking [ICSE'07, FSE'07] • Firewall modeling, analysis, and optimization [S&P'06, TNSM] • Malicious code detection, analysis, and prevention [CCS'05, ASPLOS'06, ACSAC'06] • Program analysis of numerical software [TACAS'04, TCS'05, ICSE'06] • Web and database application security and reliability [ICSE'04, SAVCBS'04, POPL'06, PLDI'07]. NEW: Please submit good papers to the following venues: TOSEM, SAS'11, ESEC/FSE'11, OOPSLA'11, ICSE'12, and ISSTA'12.
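Fixing a Bug [Figure: bug-fixing cycle with the steps detect bug, understand bug, fix code, and verify fix; a second chain runs from detect through f1, f2, …, fn and is labeled "bad fixes"] Outline: Motivation, Approach, Implementation, Evaluation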
Do Bad Fixes Exist? • Empirical study • Explore Bugzilla databases of Ant, AspectJ and Rhino • Focus on “reopened” bugs • Study the comment histories • Bad fixes do exist! • Of reopened bugs, 66% in Ant, 73% in AspectJ, 80% in Rhino are due to bad fixes • “Oh, I am sorry, I didn’t consider that possibility.”
Example Bug (Rhino): Continuations do not work for __noSuchMethod__
Earlier fix:
    // no idea what to do if it’s a TAIL_CALL
    if (fun instanceof NoSuchMethodShim && op != Icode_TAIL_CALL) {
        // get the shim and the actual method
        NoSuchMethodShim noSuchMethodShim = (NoSuchMethodShim) fun;
        ...
    }
Later fix, in which the guard "if (fun instanceof NoSuchMethodShim && op != Icode_TAIL_CALL) {" is replaced:
    // no idea what to do if it’s a TAIL_CALL
    if (fun instanceof NoSuchMethodShim) {
        // get the shim and the actual method
        NoSuchMethodShim noSuchMethodShim = (NoSuchMethodShim) fun;
        ...
        if (op == Icode_TAIL_CALL) { ... }
        ...
    }
Fix history shown on the slide: 1st fix, 2nd fix, 3rd fix
The Bad Fix Problem [Figure: regions labeled input domain, bug-triggering input domain, known bug-triggering input, and inputs covered by the fix] • Coverage: some inputs in the bug-triggering input domain are not covered by the fix • Disruption: the fix changes the behavior of unrelated inputs
Our Approach • Detect bad fixes as early as possible • Coverage: discover the bug-triggering input domain, then test the fixed program using that domain • Disruption: regression testing, plus random testing that uses the buggy program as the oracle
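The disruption check can be pictured as differential random testing. The following is a minimal sketch under stated assumptions, not FIXATION's code: programs are modeled as integer functions, the bug-triggering domain is given as a predicate, and every name (`DisruptionCheck`, `findDisruptions`, the toy "absolute value" bug) is hypothetical. On inputs outside the bug-triggering domain the buggy program is assumed to behave correctly, so any disagreement with the fixed program is reported as a disruption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import java.util.function.IntPredicate;
import java.util.function.IntUnaryOperator;

// Hypothetical sketch: every name below is illustrative, not part of FIXATION.
final class DisruptionCheck {

    /**
     * Feeds random inputs to both versions and collects the inputs outside
     * the bug-triggering domain on which their outputs differ.
     */
    static List<Integer> findDisruptions(IntUnaryOperator buggy,
                                         IntUnaryOperator fixed,
                                         IntPredicate triggersBug,
                                         long seed,
                                         int trials) {
        Random random = new Random(seed);
        List<Integer> disrupted = new ArrayList<>();
        for (int i = 0; i < trials; i++) {
            int input = random.nextInt(201) - 100; // uniform in [-100, 100]
            if (triggersBug.test(input)) {
                continue; // the buggy version is not a trustworthy oracle here
            }
            if (buggy.applyAsInt(input) != fixed.applyAsInt(input)) {
                disrupted.add(input);
            }
        }
        return disrupted;
    }

    public static void main(String[] args) {
        // Toy bug: an "absolute value" routine that forgets to negate negatives.
        IntUnaryOperator buggy = x -> x;
        IntPredicate triggersBug = x -> x < 0;
        IntUnaryOperator goodFix = Math::abs;
        IntUnaryOperator disruptiveFix = x -> -x; // repairs negatives, breaks positives

        System.out.println(findDisruptions(buggy, goodFix, triggersBug, 42L, 1000).isEmpty());
        System.out.println(findDisruptions(buggy, disruptiveFix, triggersBug, 42L, 1000).isEmpty());
    }
}
```

The oracle is only consulted outside the bug-triggering domain, which is why the coverage and disruption checks are kept separate.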
Discover the Bug-Triggering Input Domain • A known bug-triggering input induces a concrete path • Compute Dijkstra’s weakest precondition (WP) to generalize from that path • Challenges: path explosion and loop invariants
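As a worked illustration of WP along a single concrete path, the rules below use the path-restricted form in which a branch condition taken on the path is conjoined with the postcondition. The notation ($\pi$ for the path, $\varphi$ for the postcondition, $c$ for a branch condition) is assumed here and does not come from the slides. The instance is the simplified Rhino program from the evaluation example, with postcondition true at the failing assertion.

```latex
\begin{align*}
wp(x := e,\ \varphi) &= \varphi[e/x] \\
wp(\mathrm{assume}\ c,\ \varphi) &= c \land \varphi \\
wp(s_1; s_2,\ \varphi) &= wp(s_1,\ wp(s_2,\ \varphi))
\end{align*}
% Path \pi falls through all three type tests and reaches assert(false):
\begin{align*}
wp(\pi,\ \mathit{true}) &= (fun \neq InterpretedFun) \land (fun \neq Continuation) \land (fun \neq IdFunctionObject)
\end{align*}
```

The result agrees with the WPd predicate reported for distance budget d = 0 in the evaluation example.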
Path Neighborhood • Intuition: paths in the neighborhood of the concrete buggy path are more error-prone • Under-approximate the bug-triggering input domain by exploring neighboring paths
Distance-Bounded WP (WPd) • Inputs: program P, an initial predicate, a concrete path, and a distance budget d • Generate the candidate paths C • Restrict the computation of WP to the candidate paths C • The result is an under-approximation of WP
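Candidate generation can be sketched as a filter over paths. The slides do not say how path distance is measured, so the sketch below assumes Levenshtein edit distance over node sequences; that choice, the class name `PathNeighborhood`, and the node labels are all hypothetical. WPd would then be the disjunction of the per-path WPs over the returned candidates.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: the distance metric and all names are assumptions.
final class PathNeighborhood {

    /** Levenshtein edit distance between two paths given as node-label sequences. */
    static int editDistance(List<String> a, List<String> b) {
        int[][] dp = new int[a.size() + 1][b.size() + 1];
        for (int i = 0; i <= a.size(); i++) dp[i][0] = i;
        for (int j = 0; j <= b.size(); j++) dp[0][j] = j;
        for (int i = 1; i <= a.size(); i++) {
            for (int j = 1; j <= b.size(); j++) {
                int substitution = dp[i - 1][j - 1] + (a.get(i - 1).equals(b.get(j - 1)) ? 0 : 1);
                int deletion = dp[i - 1][j] + 1;
                int insertion = dp[i][j - 1] + 1;
                dp[i][j] = Math.min(substitution, Math.min(deletion, insertion));
            }
        }
        return dp[a.size()][b.size()];
    }

    /** Keeps the paths whose distance from the concrete path is within the budget d. */
    static List<List<String>> candidates(List<List<String>> allPaths, List<String> concrete, int d) {
        List<List<String>> selected = new ArrayList<>();
        for (List<String> path : allPaths) {
            if (editDistance(path, concrete) <= d) {
                selected.add(path);
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        List<String> concrete = Arrays.asList("begin", "b1", "b2", "b3", "assert", "end");
        List<String> near = Arrays.asList("begin", "b1", "b2", "call", "end");
        List<String> far = Arrays.asList("begin", "b1", "call", "end");
        List<List<String>> all = Arrays.asList(concrete, near, far);
        for (int d = 0; d <= 3; d++) {
            System.out.println("d=" + d + " candidates=" + candidates(all, concrete, d).size());
        }
    }
}
```

With d = 0 only the concrete path itself survives, which corresponds to computing WP along the single buggy path.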
Loop Invariants • Unroll the loop nodes • All paths are simple • Compute the distance • Compute the WP [Figure: unrolled CFG]
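To see why finite unrolling leaves only finitely many loop-free paths, consider the sketch below. It does not build an unrolled CFG the way the slides describe; as a stand-in, it enumerates entry-to-exit paths while capping how often each node may be visited, which gives the same finite path set for the toy loop used here. The CFG encoding and all names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: bounded path enumeration as a stand-in for CFG unrolling.
final class BoundedUnrolling {

    /** Lists every entry-to-exit path that visits each node at most maxVisits times. */
    static List<List<String>> paths(Map<String, List<String>> cfg,
                                    String entry, String exit, int maxVisits) {
        List<List<String>> result = new ArrayList<>();
        walk(cfg, entry, exit, maxVisits,
             new HashMap<String, Integer>(), new ArrayList<String>(), result);
        return result;
    }

    private static void walk(Map<String, List<String>> cfg, String node, String exit,
                             int maxVisits, Map<String, Integer> visits,
                             List<String> prefix, List<List<String>> out) {
        int seen = visits.getOrDefault(node, 0);
        if (seen >= maxVisits) {
            return; // unrolling budget for this node is exhausted
        }
        visits.put(node, seen + 1);
        prefix.add(node);
        if (node.equals(exit)) {
            out.add(new ArrayList<>(prefix));
        } else {
            for (String next : cfg.getOrDefault(node, Collections.<String>emptyList())) {
                walk(cfg, next, exit, maxVisits, visits, prefix, out);
            }
        }
        // Backtrack so sibling branches start from the same state.
        prefix.remove(prefix.size() - 1);
        visits.put(node, seen);
    }

    public static void main(String[] args) {
        Map<String, List<String>> cfg = new HashMap<>();
        cfg.put("entry", Arrays.asList("head"));
        cfg.put("head", Arrays.asList("body", "exit"));
        cfg.put("body", Arrays.asList("head")); // back edge of the loop
        for (List<String> path : paths(cfg, "entry", "exit", 2)) {
            System.out.println(path);
        }
    }
}
```

Each enumerated path is a finite node sequence, so a path distance and a per-path WP can be computed without a loop invariant.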
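Coverage Analysis • Collect the concrete path of the buggy input in the buggy program • Under-approximate the bug-triggering input domain using WPd with distance budget d • Symbolically execute the fixed program [Figure: the buggy input and the buggy program feed WPd with budget d; the result drives symbolic execution of the fixed program]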
FIXATION Architecture [Figure: components are the CFG generator, instrumentation module, WPd module, symbolic execution module, and post-processor; inputs are the buggy program, fixed program, buggy input, and distance budget; intermediate artifacts are the Pb-CFG and the concrete path; the output is the results]
FIXATION Architecture • CFG generator: instruments the WALA CFG generator to support generation of finitely unrolled CFGs
FIXATION Architecture • Instrumentation module: built on the WALA Shrike bytecode library
FIXATION Architecture • WPd module: a prototype implementation
FIXATION Architecture • Symbolic execution (SE) module and post-processor: built on Java PathFinder
Evaluation • Objective • Demonstrate the feasibility of our approach • Differentiate WPd from WP • Experimental setup • Dell XPS 630i • 2.4 GHz quad-core CPU • 3.2 GB of memory • Ubuntu 8.04
Program Transformation
Original program:
    ...
    if (fun instanceof InterpretedFun) { ... return; }
    if (fun instanceof Continuation) { ... return; }
    if (fun instanceof IdFunctionObject) { ... return; }
    ...
    assert(false); // should never execute
    return;
Simplified transformed buggy program:
    if (fun == InterpretedFun) { processInterFun(); return; }
    if (fun == Continuation) { processContinuation(); return; }
    if (fun == IdFunctionObject) { processIdFunObj(); return; }
    assert(false); // should never execute
    return;
Evaluation - Example
Simplified transformed buggy program:
    if (fun == InterpretedFun) { processInterFun(); return; }
    if (fun == Continuation) { processContinuation(); return; }
    if (fun == IdFunctionObject) { processIdFunObj(); return; }
    ...
    assert(false); // should never execute
    return;
• Known bug-triggering input: fun = NoSuchMethodShim • Distance budget: d = 0
[Figure: CFG from begin to end, annotated with the predicate computed at each point. At begin: fun != IdFunctionObject && fun != Continuation && fun != InterpretedFun. After the fun == InterpretedFun test: fun != IdFunctionObject && fun != Continuation. After the fun == Continuation test: fun != IdFunctionObject. After the fun == IdFunctionObject test: true. Each taken branch leads to call nodes and passes; falling through reaches assert(false) and fails.]
Evaluation - Example
WPd = (fun != InterpretedFun) && (fun != Continuation) && (fun != IdFunctionObject)
Simplified transformed fixed program:
    if (fun == InterpretedFun) { ... }
    if (fun == Continuation) { ... }
    if (fun == IdFunctionObject) { ... }
    if (fun == NoSuchMethodShim && op != Icode_TAIL_CALL) { ... return; }
    ...
    assert(false); // should never execute
FIXATION output: Bad fix! The assertion fails again. The new bug-triggering input is: fun == NoSuchMethodShim && op == Icode_TAIL_CALL
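The check in this example can be reproduced by brute force on a toy model. The sketch below is not how FIXATION works: it swaps symbolic execution for enumeration over a tiny, hypothetical input domain (four `fun` kinds and two `op` values), and the enum and method names are invented for illustration. It looks for inputs that satisfy WPd yet still reach the assertion in the simplified fixed program.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: enumeration stands in for symbolic execution.
final class RhinoBadFixDemo {
    enum Fun { INTERPRETED_FUN, CONTINUATION, ID_FUNCTION_OBJECT, NO_SUCH_METHOD_SHIM }
    enum Op { ICODE_TAIL_CALL, OTHER }

    /** WPd from the slide: the input falls through all three original branches. */
    static boolean inWpd(Fun fun) {
        return fun != Fun.INTERPRETED_FUN
                && fun != Fun.CONTINUATION
                && fun != Fun.ID_FUNCTION_OBJECT;
    }

    /** Simplified fixed program: returns true when assert(false) would be reached. */
    static boolean fixedProgramFails(Fun fun, Op op) {
        if (fun == Fun.INTERPRETED_FUN) return false;
        if (fun == Fun.CONTINUATION) return false;
        if (fun == Fun.ID_FUNCTION_OBJECT) return false;
        if (fun == Fun.NO_SUCH_METHOD_SHIM && op != Op.ICODE_TAIL_CALL) return false;
        return true; // fell through every handler
    }

    /** Inputs inside WPd on which the fixed program still fails. */
    static List<String> remainingFailures() {
        List<String> failures = new ArrayList<>();
        for (Fun fun : Fun.values()) {
            for (Op op : Op.values()) {
                if (inWpd(fun) && fixedProgramFails(fun, op)) {
                    failures.add(fun + " / " + op);
                }
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        System.out.println(remainingFailures());
    }
}
```

In this toy model the only surviving failure is the NoSuchMethodShim input with a tail call, matching the new bug-triggering input reported on the slide.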
WPd vs. WP (cont.) [Figure: number of paths explored plotted against distance budget; labels: PathExp, feasible paths, detect bad fix; values shown: 1021, 234, 81, 83]
Conclusion • Introduce and formalize the bad fix problem • Propose distance-bounded WP • Implement a prototype FIXATION
No Bad Fixes! [Figure: detect, followed by the fixes f1, f2, …, fn]
Soundness and Completeness • Under-approximation of real bug-triggering input domain • Sound: every bad fix we detect is a real bad fix. • Not complete: we may miss some bad fixes.
Threats to Validity • Choosing the distance budget (d) • FIXATION is currently not optimized • Benchmark selection • The approach inherits the limitations of its WP computation and symbolic execution components
Strengths & Weaknesses • It is unclear how to model a bug as an assertion failure • The distance budget is difficult to determine • Paths close to a buggy path are more likely to be error-prone • Although the approach is not complete, every detected bad fix is a real bad fix