
Automated Test Generation and Repair


Presentation Transcript


  1. Automated Test Generation and Repair Darko Marinov Escuela de Verano de Ciencias Informáticas RÍO 2011 Rio Cuarto, Argentina February 14-19, 2011

  2. Why Testing? • Goal: Increase software reliability • Software bugs cost US economy $60B/year [NIST’02] • Approach: Find bugs using testing • Estimated savings from better testing $22B/year • Challenge: Manual testing is problematic • Time-consuming, error-prone, expensive • Research: Automate testing • Reduce cost, increase benefit

  3. Topics to Cover • Introduction: about bugs • Randoop: random generation of OO tests • Pex: dynamic symbolic generation of inputs • UDITA: generation of complex data inputs • ReAssert: repair of OO unit tests • JPF: systematic testing of Java code

  4. Introduction • Why look for bugs? • What are bugs? • Where do they come from? • How to find them?

  5. Some Costly “Bugs” • NASA Mars space missions • Priority inversion (2004) • Different metric systems (1999) • BMW airbag problems (1999) • Recall of >15000 cars • Ariane 5 crash (1996) • Uncaught exception of numerical overflow • Sample Video • Your own favorite example?

  6. Some “Bugging” Bugs • An example bug on my laptop • “Jumping” file after changing properties • Put a read-only file on the desktop • Change properties: rename it and clear the read-only flag • Your own favorite example? • What is important about software for you? • Correctness, performance, functionality

  7. Terminology • Anomaly • Bug • Crash • Defect • Error • Failure, fault • Glitch • Hangup • Incorrectness • J…

  8. Dynamic vs. Static • Incorrect (observed) behavior • Failure, fault • Incorrect (unobserved) state • Error, latent error • Incorrect lines of code • Fault, error

  9. “Bugs” in IEEE 610.12-1990 • Fault • Incorrect lines of code • Error • Faults cause incorrect (unobserved) state • Failure • Errors cause incorrect (observed) behavior • Not used consistently in literature!
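To make the IEEE 610.12 distinction concrete, here is a small hypothetical Java example (not from the slides): the fault is the incorrect line of code, the error is the incorrect intermediate state it produces, and the failure is the incorrect behavior the caller observes.

    public class Average {
        // Computes the average of two non-negative ints.
        public static int average(int a, int b) {
            // FAULT: this line is incorrect for large inputs because a + b can overflow.
            int sum = a + b;    // ERROR: for large a and b, sum holds a negative (incorrect) value
            return sum / 2;     // FAILURE: the caller observes a negative "average"
        }

        public static void main(String[] args) {
            // For small inputs the fault stays latent: no error, no failure.
            System.out.println(average(2, 4));                          // prints 3
            // For large inputs the fault causes an error that becomes a failure.
            System.out.println(average(2_000_000_000, 2_000_000_000));  // prints a negative number
        }
    }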

  10. Correctness • Common (partial) properties • Segfaults, uncaught exceptions • Resource leaks • Data races, deadlocks • Statistics based • Specific properties • Requirements • Specification

  11. Traditional Waterfall Model • Requirements – Analysis • Design – Checking • Implementation – Unit Testing • Integration – System Testing • Maintenance – Verification

  12. Phases (1) • Requirements • Specify what the software should do • Analysis: eliminate/reduce ambiguities, inconsistencies, and incompleteness • Design • Specify how the software should work • Split software into modules, write specifications • Checking: check conformance to requirements

  13. Phases (2) • Implementation • Specify how the modules work • Unit testing: test each module in isolation • Integration • Specify how the modules interact • System testing: test module interactions • Maintenance • Evolve software as requirements change • Verification: test changes, regression testing

  14. Testing Effort • Reported to be >50% of development cost [e.g., Beizer 1990] • Microsoft: 75% of development time spent on testing • 50% of staff are testers who spend all their time testing • 50% are developers who spend half their time testing

  15. When to Test • The later a bug is found, the higher the cost • Orders of magnitude increase in later phases • Also the smaller chance of a proper fix • Old saying: test often, test early • New methodology: test-driven development (write tests before code)
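As a minimal illustration of the "write tests before code" idea in slide 15, here is a hedged JUnit 4 sketch (the Counter class and test names are hypothetical, not from the slides): the test is written first and fails until the production code below it is implemented.

    import static org.junit.Assert.assertEquals;
    import org.junit.Test;

    // Step 1: write the test first; it fails (or does not even compile)
    // until Counter is implemented.
    public class CounterTest {
        @Test
        public void incrementTwiceGivesTwo() {
            Counter c = new Counter();
            c.increment();
            c.increment();
            assertEquals(2, c.value());
        }
    }

    // Step 2: write the simplest production code that makes the test pass.
    class Counter {
        private int value = 0;
        public void increment() { value++; }
        public int value() { return value; }
    }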

  16. Software is Complex • Malleable • Intangible • Abstract • Solves complex problems • Interacts with other software and hardware • Not continuous

  17. Software Still Buggy • Folklore: 1-10 (residual) bugs per 1000 non-blank, non-comment (NBNC) lines of code (after testing) • Consensus: total correctness impossible to achieve for (complex) software • Risk-driven finding/elimination of bugs • Focus on specific correctness properties

  18. Approaches for Finding Bugs • Software testing • Model checking • (Static) program analysis

  19. Software Testing • Dynamic approach • Run code for some inputs, check outputs • Checks correctness for some executions • Main questions • Test-input generation • Test-suite adequacy • Test oracles

  20. Other Testing Questions • Maintenance • Selection • Minimization • Prioritization • Augmentation • Evaluation • Fault Characterization • …

  21. Model Checking • Typically hybrid dynamic/static approach • Checks correctness for “all” executions • Some techniques • Explicit-state model checking • Symbolic model checking • Abstraction-based model checking

  22. Static Analysis • Static approach • Checks correctness for “all” executions • Some techniques • Abstract interpretation • Dataflow analysis • Verification-condition generation

  23. Comparison • Level of automation • Push-button vs. manual • Type of bugs found • Hard vs. easy to reproduce • High vs. low probability • Common vs. specific properties • Type of bugs (not) found

  24. Soundness and Completeness • Do we find all bugs? • Impossible for dynamic analysis • Are reported bugs real bugs? • Easy for dynamic analysis • Most practical techniques and tools are both unsound and incomplete! • False positives • False negatives

  25. Analysis for Performance • Static compiler analysis, profiling • Must be sound • Correctness of transformation: equivalence • Improves execution time • Programmer time is more important • Programmer productivity • Not only finding bugs

  26. Combining Dynamic and Static • Dynamic and static analyses equal in limit • Dynamic: try exhaustively all possible inputs • Static: model precisely every possible state • Synergistic opportunities • Static analysis can optimize dynamic analysis • Dynamic analysis can focus static analysis • More discussions than results

  27. Current Status • Testing remains the most widely used approach for finding bugs • A lot of recent progress (within last decade) on model checking and static analysis • Model checking: from hardware to software • Static analysis: from sound to practical • Vibrant research in the area • Gap between research and practice

  28. Topics Related to Finding Bugs • How to eliminate bugs? • Debugging • How to prevent bugs? • Programming language design • Software development processes • How to show absence of bugs? • Theorem proving • Model checking, program analysis

  29. Our Focus: Testing • More precisely, recent research on automated test generation and repair • More info at CS527 from Fall 2010 • Recommended general reading for research • How to Read an Engineering Research Paper by William G. Griswold • Writing Good Software Engineering Research Papers by Mary Shaw (ICSE 2003) • If you have already read that paper, read one from another area

  30. Writing Good SE Papers Overview • Motivation • Guidelines for writing papers for ICSE • Approach • Analysis of papers submitted to ICSE 2002 • Distribution across three dimensions • Question (problem) • Result (solution) • Validation (evaluation) • Results • Writing matters, know your conferences!

  31. Randoop • Feedback-directed random test generation by Carlos Pacheco, Shuvendu K. Lahiri, Michael D. Ernst, and Thomas Ball (ICSE 2007) • (optional) Finding Errors in .NET with Feedback-directed Random Testing by Carlos Pacheco, Shuvendu K. Lahiri, and Thomas Ball (ISSTA 2008) • Website: Randoop • Slides courtesy of Carlos Pacheco

  32. Randoop Paper Overview • Problem (Question) • Generate unit tests (with high coverage?) • Solution (Result) • Generate sequences of method calls • Random choice of methods and parameters • Publicly available tool for Java (Randoop) • Evaluation (Validation) • Data structures (JPF is next lecture) • Checking API contracts • Regression testing (lecture next week)
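The following is a rough plain-Java sketch of the feedback-directed idea summarized in slide 32, not Randoop's actual implementation: randomly extend a sequence of method calls on an object under test, execute each extension, discard extensions that throw, and check a generic API contract on the objects produced. Randoop works on arbitrary classes via reflection; this sketch hard-codes a few operations on java.util.ArrayList to stay short.

    import java.util.ArrayList;
    import java.util.List;
    import java.util.Random;

    public class RandomSequenceSketch {
        public static void main(String[] args) {
            Random rnd = new Random(42);
            List<Integer> obj = new ArrayList<>();      // object under test
            List<String> sequence = new ArrayList<>();  // the call sequence built so far

            for (int step = 0; step < 100; step++) {
                int choice = rnd.nextInt(3);            // randomly pick the next call
                try {
                    switch (choice) {
                        case 0: obj.add(rnd.nextInt(10));                     sequence.add("add");    break;
                        case 1: obj.remove(Integer.valueOf(rnd.nextInt(10))); sequence.add("remove"); break;
                        case 2: obj.get(rnd.nextInt(10));                     sequence.add("get");    break;
                    }
                } catch (RuntimeException e) {
                    // Feedback: this extension was illegal (e.g., get out of bounds);
                    // discard it and keep extending from the last legal state.
                    continue;
                }
                // Check a generic API contract (reflexive equals); it always holds
                // for ArrayList but could fail for a buggy class under test.
                if (!obj.equals(obj)) {
                    System.out.println("Contract violation after: " + sequence);
                }
            }
            System.out.println("Explored sequence of length " + sequence.size());
        }
    }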

  33. Pex • Pex – White Box Test Generation for .NET by Nikolai Tillmann and Jonathan de Halleux (TAP 2008) • (optional) Moles: Tool-Assisted Environment Isolation with Closures by Jonathan de Halleux and Nikolai Tillmann (TOOLS 2010) • Websites: Pex, TeachPex • Slides courtesy of Tao Xie (and Nikolai Tillmann, Peli de Halleux, Wolfram Schulte)

  34. Pex Paper Overview • Problem (Question) • Generate unit tests (with high coverage) • Solution (Result) • Describe test scenarios with parameterized unit tests (PUTs) • Dynamic symbolic execution • Tool for .NET (Pex) • Evaluation (Validation) • Found some issues in a “core .NET component”
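Pex's parameterized unit tests (PUTs) are written in C# for .NET; as a language-neutral illustration of the idea in slide 34 (the names below are hypothetical and this is not Pex's API), a PUT is a test method that takes parameters and asserts a property for all of them, while dynamic symbolic execution chooses concrete argument values that drive execution down different branches.

    public class PutSketch {

        // Code under test: the branch on len is what symbolic execution must cover.
        static String truncate(String s, int len) {
            if (s.length() <= len) {
                return s;
            }
            return s.substring(0, len);
        }

        // The PUT: states a property that should hold for every non-null s and non-negative len.
        static void truncateNeverExceedsLimit(String s, int len) {
            if (s == null || len < 0) return;   // assumed preconditions
            String result = truncate(s, len);
            if (result.length() > len) throw new AssertionError("property violated");
        }

        public static void main(String[] args) {
            // Concrete inputs a test-generation tool might pick: one per branch.
            truncateNeverExceedsLimit("ab", 5);      // s.length() <= len branch
            truncateNeverExceedsLimit("abcdef", 3);  // truncation branch
            System.out.println("PUT held for the sampled inputs");
        }
    }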

  35. UDITA • Test Generation through Programming in UDITA by Milos Gligoric, Tihomir Gvero, Vilas Jagannath, Sarfraz Khurshid, Viktor Kuncak, and Darko Marinov (ICSE 2010) • (optional) Automated testing of refactoring engines by Brett Daniel, Danny Dig, Kely Garcia, and Darko Marinov (ESEC/FSE 2007) • Websites: UDITA, ASTGen • Slides partially prepared by Milos Gligoric

  36. UDITA Paper Overview • Problem (Question) • Generate complex test inputs • Solution (Result) • Combines filtering approach (check validity) and generating approaches (valid by construction) • Java-based language with non-determinism • Tool for Java (UDITA) • Evaluation (Validation) • Found bugs in Eclipse, NetBeans, javac, JPF...
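To make the filtering-versus-generating distinction in slide 36 concrete, here is a hedged plain-Java sketch (UDITA itself extends Java with a nondeterministic-choice interface; the code below only illustrates the two approaches, not UDITA's API): both methods produce all sorted arrays of a given length over a small domain, one by enumerating every candidate and checking validity, the other by constructing only valid arrays.

    import java.util.ArrayList;
    import java.util.Arrays;
    import java.util.List;

    public class SortedArrayGen {

        // Filtering approach: enumerate every candidate, keep the valid (sorted) ones.
        static List<int[]> filtering(int n, int max) {
            List<int[]> result = new ArrayList<>();
            int total = (int) Math.pow(max, n);
            for (int code = 0; code < total; code++) {
                int[] a = new int[n];
                int c = code;
                for (int i = 0; i < n; i++) { a[i] = c % max; c /= max; }
                if (isSorted(a)) result.add(a);    // validity check after construction
            }
            return result;
        }

        // Generating approach: construct only valid arrays, so no check is needed.
        static void generating(int[] prefix, int pos, int min, int max, List<int[]> result) {
            if (pos == prefix.length) { result.add(prefix.clone()); return; }
            for (int v = min; v < max; v++) {      // never choose a value below the previous one
                prefix[pos] = v;
                generating(prefix, pos + 1, v, max, result);
            }
        }

        static boolean isSorted(int[] a) {
            for (int i = 1; i < a.length; i++) if (a[i - 1] > a[i]) return false;
            return true;
        }

        public static void main(String[] args) {
            List<int[]> byFilter = filtering(3, 3);
            List<int[]> byConstruction = new ArrayList<>();
            generating(new int[3], 0, 0, 3, byConstruction);
            System.out.println(byFilter.size() + " vs " + byConstruction.size());  // same count
            byConstruction.forEach(a -> System.out.println(Arrays.toString(a)));
        }
    }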

  37. ReAssert • ReAssert: Suggesting repairs for broken unit tests by Brett Daniel, Vilas Jagannath, Danny Dig, and Darko Marinov (ASE 2009) • (optional) On Test Repair using Symbolic Execution by Brett Daniel, Tihomir Gvero, and Darko Marinov (ISSTA 2010) • Website: ReAssert • Slides courtesy of Brett Daniel

  38. ReAssert Paper Overview • Problem (Question) • When code evolves, passing tests may fail • How to repair tests that should be updated? • Solution (Result) • Find small changes that make tests pass • Ask the user to confirm proposed changes • Tool for Java/Eclipse (ReAssert) • Evaluation (Validation) • Case studies, user study, open-source evolution
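As a small hedged illustration of the kind of repair described in slide 38 (the class, method, and values here are hypothetical, not from the paper): when the code under test legitimately changes behavior, a literal in the failing assertion is updated to the newly observed value, and the developer confirms the proposed change.

    import static org.junit.Assert.assertEquals;
    import org.junit.Test;

    public class GreetingTest {
        @Test
        public void greetsByName() {
            // Original assertion, written when greet() returned "Hello, Ana";
            // it fails after the code evolved:
            //   assertEquals("Hello, Ana", Greeting.greet("Ana"));
            // Repaired assertion: the expected literal is updated to the observed
            // value, and the developer confirms the change is intended.
            assertEquals("Hello, Ana!", Greeting.greet("Ana"));
        }
    }

    class Greeting {
        // Evolved implementation: now appends an exclamation mark.
        static String greet(String name) {
            return "Hello, " + name + "!";
        }
    }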

  39. Java PathFinder (JPF) • Model Checking Programs by W. Visser, K. Havelund, G. Brat, S. Park, and F. Lerda (J-ASE, vol. 10, no. 2, April 2003) • Note: this is a journal paper, so feel free to skip/skim some sections (3.2, 3.3, 4) • Website: JPF • Slides courtesy of Peter Mehlitz and Willem Visser

  40. JPF Paper Overview • Problem • Model checking of real code • Terminology: Systematic testing, state-space exploration • Solution • Specialized Java Virtual Machine • Supports backtracking, state comparison • Many optimizations to make it scale • Publicly available tool (Java PathFinder) • Evaluation/applications • Remote Agent Spacecraft Controller • DEOS Avionics Operating System
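JPF performs this exploration with a specialized JVM that supports backtracking and state comparison; as a self-contained stand-in (a generic explicit-state search over a toy transition system, not JPF's implementation), the sketch below enumerates all reachable states by breadth-first search and reports any state that violates a safety property.

    import java.util.ArrayDeque;
    import java.util.HashSet;
    import java.util.Queue;
    import java.util.Set;

    // Toy system: two counters that can each be incremented modulo 4.
    // Safety property (deliberately violated so the search finds it):
    // the counters are never both 3 at the same time.
    public class ExplicitStateSearch {
        public static void main(String[] args) {
            Set<Long> visited = new HashSet<>();          // states already explored (state comparison)
            Queue<long[]> frontier = new ArrayDeque<>();  // states whose successors are pending
            frontier.add(new long[] {0, 0});

            while (!frontier.isEmpty()) {
                long[] s = frontier.remove();
                long key = s[0] * 4 + s[1];               // encode the state for the visited set
                if (!visited.add(key)) continue;          // skip states seen before

                if (s[0] == 3 && s[1] == 3) {
                    System.out.println("Property violated in state (" + s[0] + ", " + s[1] + ")");
                }
                // Successors: increment either counter modulo 4 (two "transitions").
                frontier.add(new long[] {(s[0] + 1) % 4, s[1]});
                frontier.add(new long[] {s[0], (s[1] + 1) % 4});
            }
            System.out.println("Explored " + visited.size() + " states");
        }
    }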
