Lightweight Automated Testing with Adaptation-Based Programming. Alex Groce, Alan Fern, Jervis Pinto, Tim Bauer, Mohammad Amin Alipour, Martin Erwig and Camden Lopez, Oregon State University. Part I: Lightweight Testing. API-Based Testing. Stateful software system
Lightweight Automated Testing with Adaptation-Based Programming Alex Groce, Alan Fern, Jervis Pinto, Tim Bauer, Mohammad Amin Alipour, Martin Erwig and Camden Lopez Oregon State University
API-Based Testing • Stateful software system • Various actions (e.g. method calls) that cause state to change • Some properties to check (e.g., at minimum, doesn’t crash) • Typical examples: container classes, file systems, databases, spacecraft command modules…
Need for Lightweight Methods • What is a lightweight automated testing method? • 1. Easy enough to implement that it is essentially available for all languages and environments: if it doesn’t exist, anyone can code it up in an afternoon
Need for Lightweight Methods • What is a lightweight automated testing method? • 2. Easy enough to use that any programmer interested in writing automated tests can quickly code up a harness for small, moderate-complexity modules
Need for Lightweight Methods • What is a lightweight automated testing method? • 3. Fast enough to produce results quickly, so automated testing can be pursued if useful and abandoned if not productive • And debugged if useful but not done right yet!
Need for Lightweight Methods • What is not a lightweight automated testing method? • Model checking and concolic testing typically fail all three tests, to date • In particular, they fail the first test (easy to implement) badly, and fail the second test (easy to use) much of the time
Typical Lightweight Automated Testing • The archetypal lightweight automated testing method is random testing
Random Test Harness
import java.util.EnumSet;
import java.util.Objects;
import java.util.Set;
import static java.util.Collections.unmodifiableSet;

public enum TestOp implements java.io.Serializable {
    INSERT, REMOVE, FIND;
    public static final Set<TestOp> AllVals = unmodifiableSet(EnumSet.allOf(TestOp.class));
}

// SUT, Oracle, r1, r2, TestVal, and randomElement are declared elsewhere in the harness
for (int i = 0; i < NUM_ITERATIONS; i++) {
    SUT = new SplayTree();           // Create an empty container at beginning of each test case
    Oracle = new BinarySearchTree(); // Empty oracle container
    for (int j = 0; j < M; j++) {
        TestOp o = randomElement(TestOp.AllVals);
        TestVal v = randomElement(TestVal.AllVals);
        switch (o) {
            case INSERT: r1 = SUT.insert(v); r2 = Oracle.insert(v); break;
            case REMOVE: r1 = SUT.remove(v); r2 = Oracle.remove(v); break;
            case FIND:   r1 = SUT.find(v);   r2 = Oracle.find(v);   break;
        }
        assert Objects.equals(r1, r2); // Behavior should match (null-safe comparison)
    }
}
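The harness above calls a `randomElement` helper that the slide does not show. A minimal sketch of what such a helper could look like follows; the class name `RandomChoice` and the fixed seed are assumptions made for illustration, not part of the original harness.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Random;
import java.util.Set;

// Hypothetical helper class; the slide only shows calls to randomElement(...).
final class RandomChoice {
    // Fixed seed (an assumption) so that a failing run can be replayed exactly.
    private static final Random RNG = new Random(42L);

    private RandomChoice() {}

    /** Returns one element of a non-empty set, chosen uniformly at random. */
    static <T> T randomElement(Set<T> values) {
        if (values.isEmpty()) {
            throw new IllegalArgumentException("cannot choose from an empty set");
        }
        List<T> indexed = new ArrayList<>(values); // copy so we can index into it
        return indexed.get(RNG.nextInt(indexed.size()));
    }

    public static void main(String[] args) {
        Set<String> ops = new LinkedHashSet<>(Arrays.asList("INSERT", "REMOVE", "FIND"));
        for (int i = 0; i < 5; i++) {
            System.out.println(randomElement(ops));
        }
    }
}
```

Seeding the generator is a design choice rather than a requirement: it trades some diversity across runs for the ability to reproduce a failing sequence of operations.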
Problems with Random Testing? • Works badly for, e.g., heap structures [Visser et al. ISSTA 2006, Sharma et al. FASE 2011] • Feedback can help, but if specialized to the system, it requires a great deal of engineering effort [Groce et al. ICSE 2007] • What if we used machine learning to learn feedback for each software system? • Reinforcement learning: system takes an action (chooses inputs), receives a reward based on how well that choice performed; iterates and refines its policy for making choices
Like Random Testing, but Different • Idea: replace calls to the pseudorandom number generator with calls to a library for reinforcement learning • Reward good tests to influence future policy choices: starts out behaving like random testing, eventually uses the learned feedback to guide choices • What rewards? Finding faults is too rare • Reward actions that improved total test coverage
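To make the "replace the random number generator with a learner" idea concrete, here is a small sketch of a chooser with a suggest/reward interface. It uses a simple epsilon-greedy rule over per-context average rewards, which is an assumption chosen for brevity; it is not the algorithm used by the ABP library, and the class name `GreedyChooser` and its methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Random;
import java.util.Set;

/** Hypothetical epsilon-greedy chooser; not the ABP library's algorithm. */
final class GreedyChooser<C, A> {
    // For each (context, action): {total reward, number of rewarded/unrewarded trials}.
    private final Map<C, Map<A, double[]>> stats = new HashMap<>();
    private final Random rng;
    private final double epsilon;
    private C lastContext;
    private A lastAction;

    GreedyChooser(double epsilon, long seed) {
        this.epsilon = epsilon;
        this.rng = new Random(seed);
    }

    /** Random action with probability epsilon, otherwise the best mean reward so far. */
    A suggest(C context, Set<A> options) {
        List<A> actions = new ArrayList<>(options);
        Collections.shuffle(actions, rng); // random order, so ties are broken randomly
        A choice = actions.get(0);         // exploration default: a uniformly random action
        if (rng.nextDouble() >= epsilon) {
            double best = Double.NEGATIVE_INFINITY;
            for (A a : actions) {
                double m = mean(context, a);
                if (m > best) { best = m; choice = a; }
            }
        }
        lastContext = context;
        lastAction = choice;
        return choice;
    }

    /** Credits the reward (possibly 0) to the most recent suggestion. */
    void reward(double r) {
        if (lastAction == null) return;
        double[] s = stats.computeIfAbsent(lastContext, k -> new HashMap<>())
                          .computeIfAbsent(lastAction, k -> new double[2]);
        s[0] += r;
        s[1] += 1;
    }

    /** Average reward observed for this action in this context (0 if never tried). */
    double mean(C context, A action) {
        Map<A, double[]> byAction = stats.get(context);
        double[] s = (byAction == null) ? null : byAction.get(action);
        return (s == null || s[1] == 0) ? 0.0 : s[0] / s[1];
    }

    public static void main(String[] args) {
        GreedyChooser<String, String> chooser = new GreedyChooser<>(0.1, 7L);
        Set<String> ops = new LinkedHashSet<>(Arrays.asList("INSERT", "REMOVE", "FIND"));
        for (int i = 0; i < 20; i++) {
            String op = chooser.suggest("empty", ops);
            chooser.reward(op.equals("INSERT") ? 1.0 : 0.0); // pretend only INSERT adds coverage
            System.out.println(op);
        }
    }
}
```

Before any reward has been recorded, every action has the same mean, so the chooser picks uniformly at random, which matches the "starts out behaving like random testing" behavior described above. In this sketch the harness is expected to call `reward` after every step, passing 0 when the step achieved nothing new.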
ABP Test Harness
import java.util.EnumSet;
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;
import static java.util.Collections.unmodifiableSet;

public enum TestOp implements java.io.Serializable {
    INSERT, REMOVE, FIND;
    public static final Set<TestOp> AllVals = unmodifiableSet(EnumSet.allOf(TestOp.class));
}

// SUT, Oracle, r1, r2, and TestVal are declared elsewhere in the harness
AdaptiveProcess test = AdaptiveProcess.init();
HashSet<String> states = new HashSet<String>(); // Store all states visited
Adaptive<String,TestOp> opChoice = test.initAdaptive(String.class, TestOp.class);
Adaptive<String,TestVal> valChoice = test.initAdaptive(String.class, TestVal.class);
for (int i = 0; i < NUM_ITERATIONS; i++) {
    SUT = new SplayTree();           // Create an empty container at beginning of each test case
    Oracle = new BinarySearchTree(); // Empty oracle container
    String context = SUT.toString(); // The state is simply a linearization of the SplayTree
    for (int j = 0; j < M; j++) {
        TestOp o = opChoice.suggest(context, TestOp.AllVals); // Used just like pseudo-random number generator
        TestVal v = valChoice.suggest(context, TestVal.AllVals);
        switch (o) {
            case INSERT: r1 = SUT.insert(v); r2 = Oracle.insert(v); break;
            case REMOVE: r1 = SUT.remove(v); r2 = Oracle.remove(v); break;
            case FIND:   r1 = SUT.find(v);   r2 = Oracle.find(v);   break;
        }
        assert Objects.equals(r1, r2); // Behavior should match (null-safe comparison)
        context = SUT.toString();      // Update the context
        if (!states.contains(context)) { // Is this a new state?
            states.add(context);
            test.reward(1000);         // Good work, AdaptiveProcess test, you found a new state!
        }
    }
    test.endEpisode();
}
Problems with Reinforcement Testing? • We’re coupon collectors, not planners • Want to hit many different coverage targets • Do not want to hit any particular target at minimum cost • RL assumes a stationary reward: • Hitting a coverage target “should be” worth as much the fifth time as the first time • In testing, only the first hit adds new coverage
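The mismatch with a stationary reward can be seen in a few lines. The sketch below mirrors the "new state" check in the ABP harness: the same coverage target pays a bonus the first time and nothing afterwards, so the reward for an identical action changes over time. The class name `NoveltyReward` and its methods are hypothetical, introduced only for illustration.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical tracker illustrating a coverage reward that is not stationary. */
final class NoveltyReward {
    private final Set<String> seen = new HashSet<>();
    private final double bonus;

    NoveltyReward(double bonus) {
        this.bonus = bonus;
    }

    /** Returns the bonus the first time a target is hit and 0 on every later hit. */
    double rewardFor(String target) {
        // Set.add returns true only if the target was not already present.
        return seen.add(target) ? bonus : 0.0;
    }

    /** Number of distinct targets collected so far (the "coupons"). */
    int distinctTargets() {
        return seen.size();
    }

    public static void main(String[] args) {
        NoveltyReward reward = new NoveltyReward(1000.0);
        String[] hits = {"stateA", "stateA", "stateB", "stateA"};
        for (String target : hits) {
            System.out.println(target + " -> " + reward.rewardFor(target));
        }
        System.out.println("distinct targets: " + reward.distinctTargets());
    }
}
```

A learner that assumes the first payoff will repeat keeps steering back toward targets that are already exhausted, which is one way to read the coupon-collector point above.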
Need to Adapt RL Algorithms • Use of off-the-shelf RL works as a good lightweight alternative to random testing • But maybe can do better with an algorithm tailored to the nature of software exploration • Need to adapt/create ML algorithms • Not just use off-the-shelf tools • We need more collaborations between verification experts and ML experts! • Paper has some ideas on where to go (e.g. MCTS)
Thank you! Questions?
How to Test? • Sequence of calls & checks • Generated how? • Model checking • Concolic testing • Also: is there a good model checker out there for Python? Concolic testing for Ruby? Does anyone have a good motivation to write one? How long would it take? • Model checking and concolic testing are too heavyweight for many purposes • Unfortunately: • Often hard to use, understand; • Not easy to scale, even for experts! • Fragile – once working on a complex codebase, often break with changes