Automated Test Generation and Repair • Darko Marinov • Escuela de Verano de Ciencias Informáticas RÍO 2011, Rio Cuarto, Argentina • February 14-19, 2011
Why Testing? • Goal: Increase software reliability • Software bugs cost US economy $60B/year [NIST’02] • Approach: Find bugs using testing • Estimated savings from better testing $22B/year • Challenge: Manual testing is problematic • Time-consuming, error-prone, expensive • Research: Automate testing • Reduce cost, increase benefit
Topics to Cover • Introduction: about bugs • Randoop: random generation of OO tests • Pex: dynamic symbolic generation of inputs • UDITA: generation of complex data inputs • ReAssert: repair of OO unit tests • JPF: systematic testing of Java code
Introduction • Why look for bugs? • What are bugs? • Where do they come from? • How do we find them?
Some Costly “Bugs” • NASA Mars space missions • Priority inversion (2004) • Different metric systems (1999) • BMW airbag problems (1999) • Recall of >15000 cars • Ariane 5 crash (1996) • Uncaught exception of numerical overflow • Sample Video • Your own favorite example?
Some “Bugging” Bugs • An example bug on my laptop • “Jumping” file after changing properties • Put a read-only file on the desktop • Change properties: rename and make not read-only • Your own favorite example? • What is important about software for you? • Correctness, performance, functionality
Terminology • Anomaly • Bug • Crash • Defect • Error • Failure, fault • Glitch • Hangup • Incorrectness • J…
Dynamic vs. Static • Incorrect (observed) behavior • Failure, fault • Incorrect (unobserved) state • Error, latent error • Incorrect lines of code • Fault, error
“Bugs” in IEEE 610.12-1990 • Fault • Incorrect lines of code • Error • Faults cause incorrect (unobserved) state • Failure • Errors cause incorrect (observed) behavior • Not used consistently in literature!
Correctness • Common (partial) properties • Segfaults, uncaught exceptions • Resource leaks • Data races, deadlocks • Statistics based • Specific properties • Requirements • Specification
Traditional Waterfall Model • Requirements / Analysis • Design / Checking • Implementation / Unit Testing • Integration / System Testing • Maintenance / Verification
Phases (1) • Requirements • Specify what the software should do • Analysis: eliminate/reduce ambiguities, inconsistencies, and incompleteness • Design • Specify how the software should work • Split software into modules, write specifications • Checking: check conformance to requirements
Phases (2) • Implementation • Specify how the modules work • Unit testing: test each module in isolation • Integration • Specify how the modules interact • System testing: test module interactions • Maintenance • Evolve software as requirements change • Verification: test changes, regression testing
Testing Effort • Reported to be >50% of development cost [e.g., Beizer 1990] • Microsoft: 75% time spent testing • 50% testers who spend all time testing • 50% developers who spend half time testing
When to Test • The later a bug is found, the higher the cost • Orders of magnitude increase in later phases • Also the smaller chance of a proper fix • Old saying: test often, test early • New methodology: test-driven development (write tests before code)
Software is Complex • Malleable • Intangible • Abstract • Solves complex problems • Interacts with other software and hardware • Not continuous
Software Still Buggy • Folklore: 1-10 (residual) bugs per 1000 non-blank, non-comment (NBNC) lines of code (after testing) • Consensus: total correctness impossible to achieve for (complex) software • Risk-driven finding/elimination of bugs • Focus on specific correctness properties
Approaches for Finding Bugs • Software testing • Model checking • (Static) program analysis
Software Testing • Dynamic approach • Run code for some inputs, check outputs • Checks correctness for some executions • Main questions • Test-input generation • Test-suite adequacy • Test oracles
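The two ingredients named above — a test input and a test oracle — can be seen in a minimal unit test. This is an illustrative sketch with a made-up module under test (`max3`), not an example from the lecture:

```java
// A minimal unit test: each check pairs a concrete test input with a
// test oracle (the expected output). Max3Test.max3 is a hypothetical
// module under test, introduced only for this illustration.
public class Max3Test {
    // Module under test.
    static int max3(int a, int b, int c) {
        return Math.max(a, Math.max(b, c));
    }

    // One test = one input plus one oracle.
    static boolean testAllOrders() {
        return max3(1, 2, 3) == 3   // input (1,2,3), oracle 3
            && max3(3, 2, 1) == 3   // same oracle, maximum moved
            && max3(2, 3, 1) == 3;
    }

    public static void main(String[] args) {
        System.out.println(testAllOrders() ? "PASS" : "FAIL");
    }
}
```

Test-input generation asks where the triples come from; test-suite adequacy asks whether three inputs are enough (e.g., to cover all branches); the oracle problem asks how we know the expected value is 3.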
Other Testing Questions • Maintenance • Selection • Minimization • Prioritization • Augmentation • Evaluation • Fault Characterization • …
Model Checking • Typically hybrid dynamic/static approach • Checks correctness for “all” executions • Some techniques • Explicit-state model checking • Symbolic model checking • Abstraction-based model checking
Static Analysis • Static approach • Checks correctness for “all” executions • Some techniques • Abstract interpretation • Dataflow analysis • Verification-condition generation
Comparison • Level of automation • Push-button vs. manual • Type of bugs found • Hard vs. easy to reproduce • High vs. low probability • Common vs. specific properties • Type of bugs (not) found
Soundness and Completeness • Do we find all bugs? • Impossible for dynamic analysis • Are reported bugs real bugs? • Easy for dynamic analysis • Most practical techniques and tools are both unsound and incomplete! • False positives • False negatives
Analysis for Performance • Static compiler analysis, profiling • Must be sound • Correctness of transformation: equivalence • Improves execution time • Programmer time is more important • Programmer productivity • Not only finding bugs
Combining Dynamic and Static • Dynamic and static analyses equal in limit • Dynamic: try exhaustively all possible inputs • Static: model precisely every possible state • Synergistic opportunities • Static analysis can optimize dynamic analysis • Dynamic analysis can focus static analysis • More discussions than results
Current Status • Testing remains the most widely used approach for finding bugs • A lot of recent progress (within last decade) on model checking and static analysis • Model checking: from hardware to software • Static analysis: from sound to practical • Vibrant research in the area • Gap between research and practice
Topics Related to Finding Bugs • How to eliminate bugs? • Debugging • How to prevent bugs? • Programming language design • Software development processes • How to show absence of bugs? • Theorem proving • Model checking, program analysis
Our Focus: Testing • More precisely, recent research on automated test generation and repair • More info in CS527 (Fall 2010) • Recommended general reading for research • How to Read an Engineering Research Paper by William G. Griswold • Writing Good Software Engineering Research Papers by Mary Shaw (ICSE 2003) • If you have already read that paper, read one from another area
Writing Good SE Papers Overview • Motivation • Guidelines for writing papers for ICSE • Approach • Analysis of papers submitted to ICSE 2002 • Distribution across three dimensions • Question (problem) • Result (solution) • Validation (evaluation) • Results • Writing matters, know your conferences!
Randoop • Feedback-directed random test generationby Carlos Pacheco, Shuvendu K. Lahiri, Michael D. Ernst, and Thomas Ball(ICSE 2007) • (optional) Finding Errors in .NET with Feedback-directed Random Testing by Carlos Pacheco, Shuvendu K. Lahiri & Thomas Ball (ISSTA 2008) • Website: Randoop • Slides courtesy of Carlos Pacheco
Randoop Paper Overview • Problem (Question) • Generate unit tests (with high coverage?) • Solution (Result) • Generate sequences of method calls • Random choice of methods and parameters • Publicly available tool for Java (Randoop) • Evaluation (Validation) • Data structures (JPF is next lecture) • Checking API contracts • Regression testing (lecture next week)
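The idea behind feedback-directed random generation can be sketched in a few lines. This toy is not Randoop's implementation: it fixes one class under test (`ArrayList`) and one hand-written contract, whereas Randoop picks methods via reflection and checks general API contracts (e.g., `equals`/`hashCode`):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Toy sketch of feedback-directed random test generation: grow a random
// sequence of method calls, and after each call check a generic contract
// as the oracle. A violating sequence would be reported as a failing test.
public class RandomGenSketch {
    static final Random RND = new Random(42); // fixed seed for reproducibility

    // Generic contract (stand-in for the API contracts Randoop checks).
    static boolean contractHolds(List<Integer> l) {
        return l.size() >= 0 && l.equals(l);
    }

    // Extend a sequence one random call at a time; the "feedback" is that
    // a call which throws prunes the sequence instead of extending it.
    static boolean runRandomSequence(int length) {
        List<Integer> obj = new ArrayList<>();
        for (int i = 0; i < length; i++) {
            try {
                switch (RND.nextInt(3)) {
                    case 0: obj.add(RND.nextInt(100)); break;
                    case 1: if (!obj.isEmpty()) obj.remove(0); break;
                    default: obj.contains(RND.nextInt(100)); break;
                }
            } catch (RuntimeException e) {
                return true; // illegal call: discard this sequence (feedback)
            }
            if (!contractHolds(obj)) return false; // contract violation found
        }
        return true; // no violation along this sequence
    }

    public static void main(String[] args) {
        boolean ok = true;
        for (int s = 0; s < 100; s++) ok &= runRandomSequence(20);
        System.out.println(ok ? "no contract violations" : "violation found");
    }
}
```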
Pex • Pex – White Box Test Generation for .NETby Nikolai Tillmann and Jonathan de Halleux (TAP 2008) • (optional) Moles: Tool-Assisted Environment Isolation with Closures by Jonathan de Halleux and Nikolai Tillmann (TOOLS 2010) • Websites: Pex, TeachPex • Slides courtesy of Tao Xie (and Nikolai Tillmann, Peli de Halleux, Wolfram Schulte)
Pex Paper Overview • Problem (Question) • Generate unit tests (with high coverage) • Solution (Result) • Describe test scenarios with parameterized unit tests (PUTs) • Dynamic symbolic execution • Tool for .NET (Pex) • Evaluation (Validation) • Found some issues in a “core .NET component”
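Pex targets .NET/C#; the following Java sketch only shows the shape of a parameterized unit test (PUT): the test takes its inputs as parameters and states a property for all of them, and a generation engine supplies concrete inputs. Here the inputs are hand-picked; Pex would derive them by dynamic symbolic execution, one per uncovered branch:

```java
// Sketch of a parameterized unit test (PUT). The code under test (abs)
// and the chosen inputs are illustrative, not from the Pex paper.
public class PutSketch {
    // Code under test.
    static int abs(int x) { return x < 0 ? -x : x; }

    // Parameterized unit test: a property expected to hold for any x.
    static boolean absIsNonNegative(int x) {
        return abs(x) >= 0;
    }

    public static void main(String[] args) {
        // A symbolic engine would generate inputs covering both branches
        // (x < 0 and x >= 0); we list representatives by hand.
        int[] inputs = { -7, 0, 7 };
        for (int x : inputs)
            System.out.println("abs(" + x + ") >= 0: " + absIsNonNegative(x));
    }
}
```

Note that a good input generator can also hit the corner case `Integer.MIN_VALUE`, where the property actually fails because `-Integer.MIN_VALUE` overflows back to a negative value — exactly the kind of boundary bug white-box generation is meant to expose.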
UDITA • Test Generation through Programming in UDITAby Milos Gligoric, Tihomir Gvero, Vilas Jagannath, Sarfraz Khurshid, Viktor Kuncak, and Darko Marinov (ICSE 2010) • (optional) Automated testing of refactoring engines by Brett Daniel, Danny Dig, Kely Garcia, and Darko Marinov (ESEC/FSE 2007) • Websites: UDITA, ASTGen • Slides partially prepared by Milos Gligoric
UDITA Paper Overview • Problem (Question) • Generate complex test inputs • Solution (Result) • Combines filtering approach (check validity) and generating approaches (valid by construction) • Java-based language with non-determinism • Tool for Java (UDITA) • Evaluation (Validation) • Found bugs in Eclipse, NetBeans, javac, JPF...
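The filtering-vs-generating contrast can be illustrated on a tiny input domain. This is plain Java, not the UDITA language (which expresses the non-deterministic choices below with library primitives); the example input class — permutations of {0,…,n-1} — is made up for illustration:

```java
import java.util.ArrayList;
import java.util.List;

// Two ways to generate complex (structurally constrained) test inputs,
// shown on permutations of {0,..,n-1}.
public class GenSketch {
    // Filtering: enumerate all n^n candidate arrays, keep the valid ones.
    static List<int[]> permsByFiltering(int n) {
        List<int[]> result = new ArrayList<>();
        int total = (int) Math.pow(n, n);
        for (int code = 0; code < total; code++) {
            int[] a = new int[n];
            for (int i = 0, c = code; i < n; i++, c /= n) a[i] = c % n;
            if (isPermutation(a)) result.add(a);   // validity check (filter)
        }
        return result;
    }

    static boolean isPermutation(int[] a) {
        boolean[] seen = new boolean[a.length];
        for (int x : a) {
            if (seen[x]) return false;
            seen[x] = true;
        }
        return true;
    }

    // Generating: construct only valid inputs, by never reusing an element.
    static List<int[]> permsByConstruction(int n) {
        List<int[]> result = new ArrayList<>();
        build(new int[n], new boolean[n], 0, result);
        return result;
    }

    static void build(int[] a, boolean[] used, int i, List<int[]> out) {
        if (i == a.length) { out.add(a.clone()); return; }
        for (int v = 0; v < a.length; v++)
            if (!used[v]) {               // non-deterministic choice point
                used[v] = true; a[i] = v;
                build(a, used, i + 1, out);
                used[v] = false;          // backtrack
            }
    }

    public static void main(String[] args) {
        System.out.println(permsByFiltering(3).size());     // 6 = 3!
        System.out.println(permsByConstruction(3).size());  // 6, no filtering
    }
}
```

Both styles yield the same inputs; the generating style avoids enumerating the invalid candidates, which is why UDITA lets the programmer combine both.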
ReAssert • ReAssert: Suggesting repairs for broken unit testsby Brett Daniel, Vilas Jagannath, Danny Dig, and Darko Marinov (ASE 2009) • (optional) On Test Repair using Symbolic Execution by Brett Daniel, Tihomir Gvero, and Darko Marinov (ISSTA 2010) • Website: ReAssert • Slides courtesy of Brett Daniel
ReAssert Paper Overview • Problem (Question) • When code evolves, passing tests may fail • How to repair tests that should be updated? • Solution (Result) • Find small changes that make tests pass • Ask the user to confirm proposed changes • Tool for Java/Eclipse (ReAssert) • Evaluation (Validation) • Case studies, user study, open-source evolution
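The kind of repair ReAssert suggests can be shown on a single assertion. The example below is hypothetical (the `format` method and its evolution are invented, and the repair is written by hand, not produced by the tool):

```java
// Suppose format() evolved to prefix a currency symbol, breaking a passing
// test. ReAssert would propose a small edit -- here, updating the expected
// literal -- and ask the user to confirm it.
public class RepairExample {
    // Evolved code under test: now prefixes "$".
    static String format(int cents) {
        return "$" + (cents / 100) + "." + String.format("%02d", cents % 100);
    }

    // Old assertion (now failing):  "1.50".equals(format(150))
    // Repaired assertion: expected value updated to the new output.
    static boolean repairedTest() {
        return "$1.50".equals(format(150));
    }

    public static void main(String[] args) {
        System.out.println(repairedTest() ? "PASS" : "FAIL");
    }
}
```

The user confirmation step matters: the same failure could instead mean the new `format` behavior is a bug, in which case the code, not the test, should change.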
Java PathFinder (JPF) • Model Checking Programsby W. Visser, K. Havelund, G. Brat, S. Park and F. Lerda (J-ASE, vol. 10, no. 2, April 2003) • Note: this is a journal paper, so feel free to skip/skim some sections (3.2, 3.3, 4) • Website: JPF • Slides courtesy of Peter Mehlitz and Willem Visser
JPF Paper Overview • Problem • Model checking of real code • Terminology: Systematic testing, state-space exploration • Solution • Specialized Java Virtual Machine • Supports backtracking, state comparison • Many optimizations to make it scale • Publicly available tool (Java PathFinder) • Evaluation/applications • Remote Agent Spacecraft Controller • DEOS Avionics Operating System
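JPF explores the state space of a real JVM with backtracking and state comparison; the core loop it shares with any explicit-state model checker can be sketched on a toy transition system (the system below, with transitions +3 and -2 over states 0..20, is invented for illustration):

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.Set;

// Sketch of explicit-state exploration: systematically visit states,
// store visited states to avoid re-exploration (state comparison), and
// check a property in each state. Not JPF itself, just the shared loop.
public class ExplicitStateSketch {
    // Returns true iff a state violating the property is reachable.
    static boolean violationReachable(int init, int bad) {
        Set<Integer> visited = new HashSet<>();   // state storage
        Deque<Integer> frontier = new ArrayDeque<>();
        frontier.add(init);
        while (!frontier.isEmpty()) {
            int s = frontier.poll();
            if (!visited.add(s)) continue;        // state matching: seen before
            if (s == bad) return true;            // property violated
            // Two non-deterministic transitions, kept within 0..20.
            for (int next : new int[] { s + 3, s - 2 })
                if (next >= 0 && next <= 20) frontier.add(next);
        }
        return false;   // frontier exhausted: property holds in all states
    }

    public static void main(String[] args) {
        // 7 is reachable from 0 (e.g., 0 -> 3 -> 6 -> 9 -> 7).
        System.out.println(ExplicitStateSketch.violationReachable(0, 7));
    }
}
```

Unlike plain testing, an exhausted frontier is a proof over the whole (bounded) state space — this is why the slide's terminology calls it systematic testing or state-space exploration.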