310 likes | 384 Views
Mining Behavior Graphs for “Backtrace” of Noncrashing Bugs. Chao Liu, Xifeng Yan, Hwanjo Yu, Jiawei Han University of Illinois at Urbana-Champaign Philip S. Yu IBM T. J. Watson Research Presented by: Chao Liu. Outline . Motivations Related Work Classification of Program Executions
E N D
Mining Behavior Graphs for “Backtrace” of Noncrashing Bugs Chao Liu, Xifeng Yan, Hwanjo Yu, Jiawei Han University of Illinois at Urbana-Champaign Philip S. Yu IBM T. J. Watson Research Presented by: Chao Liu
Outline • Motivations • Related Work • Classification of Program Executions • Extract “Backtrace” from Classification Dynamics • Case Study • Conclusions
Motivations • Software is full of bugs • Windows 2000, 35M LOC • 63,000 known bugs at the time of release, 2 per 1000 lines • Software failure costs • Ariane 5 explosion is due to “errors in the software of the inertial reference system” (Ariaen-5 flight 501inquiry board report http://ravel.esrin.esa.it/docs/esa-x-1819eng.pdf) • A study by the National Institute of Standards and Technology found that software errors cost the U.S.economy about $59.5 billion annuallyhttp://www.nist.gov/director/prog-ofc/report02-3.pdf • Testing and debugging are laborious and expensive • “50% of my company employees are testers, and the rest spends 50% of their time testing!” --Bill Gates, in 1995 Courtesy to CNN.com
Bug Localization • Automatically circle out the most suspicious places • Two kinds of bugs w.r.t. symptoms • Crashing bugs • Typical symptoms: segmentation faults • Reasons: memory access violations • Noncrashing bugs • Typical symptoms: smooth executions but unexpected outputs • Reasons: logic or semantic errors • An example
Running Example void subline(char *lin, char *pat, char *sub) { int i, lastm, m; lastm = -1; i = 0; while((lin[i] != ENDSTR)) { m = amatch(lin, i, pat, 0); if (m >= 0){ putsub(lin, i, m, sub); lastm = m; } if ((m == -1) || (m == i)){ fputc(lin[i], stdout); i = i + 1; } else i = m; } } void subline(char *lin, char *pat, char *sub) { int i, lastm, m; lastm = -1; i = 0; while((lin[i] != ENDSTR)) { m = amatch(lin, i, pat, 0); if ((m >= 0) && (lastm != m) ){ putsub(lin, i, m, sub); lastm = m; } if ((m == -1) || (m == i)){ fputc(lin[i], stdout); i = i + 1; } else i = m; } } • Subject program • replace: perform regular expression matching and substitutions • 563 lines of C code • 17 functions are involved • Execution behaviors • 130 out of 5542 test cases fail to give correct outputs • No incorrect executions incur segmentation faults • Debug method • Step-by-step tracing
Bug Localization via Backtrace • Backtrace for noncrashing bugs? • Major challenges • No abnormality is visible on the surface. • When and where the abnormality happens.
Outline • Motivations • Related Work • Classification of Program Executions • Extract “Backtrace” from Classification Dynamics • Case Study • Conclusions
Related Work • Crashing bugs • Memory access monitoring • Purify [HJ92], Valgrind [SN00], GDB … • Noncrashing bugs • Static program analysis • Traditional model checking • Model checking source code
Static Program Analysis • Methodology • Examine source code directly • Enumerate all the possible execution paths without running the program • Check user-specified properties, e.g. • free(p) …… (*p) • lock(res) …… unlock(res) • receive_ack() … … send_data() • Strengths • Check all possible execution paths • Problems • Shallow semantics • Properties should be directly mapped to source code structure • Tools • ESC [DRL+98], LCLint [EGH+94], ESP [DLS02], MC Checker [ECC00] … ×
Traditional Model Checking • Methodology • Model program computation as finite state machines • It is described with a particular description language • Exhaustively explore all the reachable states in checking desired or undesired properties • Strengths • Model deeper semantics • Naturally fit in checking event-driven systems, like protocols • Problems • Significant amount of manual efforts in modeling • State space explosion • Tools • SMV [M93], SPIN [H97], Murphi [DDH+92] …
Model Checking Source Code • Methodology • Execute the real program in a sandbox (e.g., virtual machine) • Manipulate event happenings, e.g., • Message incomings • Return value of memory allocation • Strengths • Less significant manual specification • Problems • Application restrictions, e.g., • Event-driven programs (still) • Clear mapping between source code and logic event • Tools • CMC [MPC+02], Verisoft [G97], Java PathFinder [BHP+-00] …
Summary of Related Work • In summary, • Semantic inputs are necessary • Program model • Properties to be checked (all three methods) • Restricted application domain • Event-driven model • Properties are also event-related.
Example Revisited void subline(char *lin, char *pat, char *sub) { int i, lastm, m; lastm = -1; i = 0; while((lin[i] != ENDSTR)) { m = amatch(lin, i, pat, 0); if (m >= 0){ putsub(lin, i, m, sub); lastm = m; } if ((m == -1) || (m == i)){ fputc(lin[i], stdout); i = i + 1; } else i = m; } } void subline(char *lin, char *pat, char *sub) { int i, lastm, m; lastm = -1; i = 0; while((lin[i] != ENDSTR)) { m = amatch(lin, i, pat, 0); if (m > 0){ putsub(lin, i, m, sub); lastm = m; } if ((m == -1) || (m == i)){ fputc(lin[i], stdout); i = i + 1; } else i = m; } } void subline(char *lin, char *pat, char *sub) { int i, lastm, m; lastm = -1; i = 0; while((lin[i] != ENDSTR)) { m = amatch(lin, i, pat, 0); if (m >= 0){ putsub(lin, i, m, sub); lastm = m; } if ((m == -1) || (m == i)){ fputc(lin[i], stdout); i = i + 1; } else i = m; } } void subline(char *lin, char *pat, char *sub) { int i, lastm, m; lastm = -1; i = 0; while((lin[i] != ENDSTR)) { m = amatch(lin, i, pat, 0); if ((m >= 0) && (lastm != m) ){ putsub(lin, i, m, sub); lastm = m; } if ((m == -1) || (m == i)){ fputc(lin[i], stdout); i = i + 1; } else i = m; } } • No memory violations • Not event-driven program • No explicit error properties
Outline • Motivations • Related Work • Classification of Program Executions • Extract “Backtrace” from Classification Dynamics • Case Study • Conclusions
Synopsis of Program Execution • Program behavior graphs • Function-level abstraction of program behaviors • Function calls and transitions • First-order sequential information about function interactions int main(){ ... A(); ... B(); } int A(){ ... } int B(){ ... C() ... } int C(){ ... }
Identification of Incorrect Executions • A two-class classification problem • Every execution gives one behavior graph • Edges and closed frequent subgraphs as features • Is classification useful? • Classification itselfdoes not work for bug localization • Classifier only labels each run as either correct or incorrect as a whole • It does not tell when and where abnormality happens • Observations • Good classifiers know the differences between correct and incorrect execution • Difference, a kind of abnormality? • Where and when does abnormality happens? • Incremental classification ?
Outline • Motivations • Related Work • Classification of Program Executions • Extract “Backtrace” from Classification Dynamics • Case Study • Conclusions
Incremental Classification • Classification works only when instances from two classes are different. • Precision as a measure of the difference. • Incremental classification • Observe accuracy dynamics
main main A E A E B C F B C F H D D G G Illustration: Precision Boost One Correct Execution One Incorrect Execution
Bug Relevance • Precision boost • For each function F: • Precision boost = Exit precision - Entrance precision. • Intuition & heuristics • Differences take place within the execution of F • Abnormality happens while F is in the stack • The larger the boost, the more likely F is relevant to the bug • Bug-relevant function
Outline • Related Work • Classification of Program Executions • Extract “Backtrace” from Classification Dynamics • Case Study • Conclusions
Case Study void subline(char *lin, char *pat, char *sub) { int i, lastm, m; lastm = -1; i = 0; while((lin[i] != ENDSTR)) { m = amatch(lin, i, pat, 0); if (m >= 0){ putsub(lin, i, m, sub); lastm = m; } if ((m == -1) || (m == i)){ fputc(lin[i], stdout); i = i + 1; } else i = m; } } void subline(char *lin, char *pat, char *sub) { int i, lastm, m; lastm = -1; i = 0; while((lin[i] != ENDSTR)) { m = amatch(lin, i, pat, 0); if ((m >= 0) && (lastm != m) ){ putsub(lin, i, m, sub); lastm = m; } if ((m == -1) || (m == i)){ fputc(lin[i], stdout); i = i + 1; } else i = m; } } • Subject program • replace: perform regular expression matching and substitutions • 563 lines of C code • 17 functions are involved • Execution behaviors • 130 out of 5542 test cases fail to give correct outputs • No incorrect executions incur segmentation faults • Task • Can we circle out the backtrace for this bug?
Outline • Motivations • Related Work • Classification of Program Executions • Extract “Backtrace” from Classification Dynamics • Case Study • Conclusions
Mining into Software and Systems? Conclusions • Identify incorrect executions from program runtime behaviors. • Classification dynamics can give away “backtrace” for noncrashing bugs without any semantic inputs. • Data mining can contribute to software engineering and system researches in general.
References • [DRL+98] David L. Detlefs, K. Rustan, M. Leino, Greg Nelson and James B. Saxe. Extended static checking, 1998 • [EGH+94] David Evans, John Guttag, James Horning, and Yang Meng Tan. LCLint: A tool for using specifications to check code. In Proceedings of the ACM SIG-SOFT '94 Symposium on the Foundations of Software Engineering, pages 87-96, 1994. • [DLS02] Manuvir Das, Sorin Lerner, and Mark Seigle. Esp: Path-sensitive program verication in polynomial time. In Conference on Programming Language Design and Implementation, 2002. • [ECC00] D.R. Engler, B. Chelf, A. Chou, and S. Hallem. Checking system rules using system-specic, programmer-written compiler extensions. In Proceedings of the Fourth Symposium on Operating Systems Design and Implementation, October 2000. • [M93] Ken McMillan. Symbolic Model Checking. Kluwer Academic Publishers, 1993 • [H97] Gerard J. Holzmann. The model checker SPIN. Software Engineering, 23(5):279-295, 1997. • [DDH+92] David L. Dill, Andreas J. Drexler, Alan J. Hu, and C. Han Yang. Protocol verication as a hardware design aid. In IEEE International Conference on Computer Design: VLSI in Computers and Processors, pages 522-525, 1992. • [MPC+02] Madanlal Musuvathi, David Y.W. Park, Andy Chou, Dawson R. Engler and David L. Dill. CMC: A Pragmatic Approach to Model Checking Real Code. In Proceedings of the fifth Symposium on Operating Systems Design and Implementation, 2002.
References (cont’d) • [G97] P. Godefroid. Model Checking for Programming Languages using VeriSoft. In Proceedings of the 24th ACM Symposium on Principles of Programming Languages, 1997 • [BHP+-00] G. Brat, K. Havelund, S. Park, and W. Visser. Model checking programs. In IEEE International Conference on Automated Software Engineering (ASE), 2000. • [HJ92] R. Hastings and B. Joyce. Purify: Fast Detection of Memory Leaks and Access Errors. 1991. in Proceeding of the fthe Winter 1992 USENIX Conference, pages 125-138. San Francisco, California • [SN00] Julian Seward and Nick Nethercote. Valgrind, an open-source memory debugger for x86-GNU/Linux http://valgrind.org/ • [LLM+04] Zhenmin Li, Shan Lu, Suvda Myagmar, Yuanyuan Zhou. CP-Miner: A Tool for Finding Copy-paste and Related Bugs in Operating System Code, in Proceeding of the 6th Symposium of Operating Systems Design and Implementation, 2004 • [LCS+04] Zhenmin Li, Zhifeng Chen, Sudarshan M. Srinivasan, Yuanyuan Zhou. C-Miner: Mining Block Correlations in Storage Systems. In proceeding of the 3rd usenix conferences on file and storage technologies, 2004
Q & A Thank You!