Dynamic Analysis of Algebraic Structure to Optimize Test Generation and Test Case Selection. Anthony J H Simons and Wenwen Zhao. Overview. Lazy Systematic Unit Testing JWalk testing concept and methodology The JWalk 1.0 toolset JWalkTester, JWalkUtility, JWalkEditor , etc.
Dynamic Analysis of Algebraic Structure to Optimize Test Generation and Test Case Selection Anthony J H Simons and Wenwen Zhao
Overview • Lazy Systematic Unit Testing • JWalk testing concept and methodology • The JWalk 1.0 toolset • JWalkTester, JWalkUtility, JWalkEditor, etc. • Dynamic analysis and pruning • extending earlier work to full algebraic analysis • Comparison and evaluation • measure path pruning, before and after • test result prediction, before and after http://www.dcs.shef.ac.uk/~ajhs/jwalk/
Motivation • State of the art in agile testing • Test-driven development is good, but… • …no specification to inform the selection of tests • …manual test-sets are fallible (missing, redundant cases) • …reusing saved tests for conformance testing is fallible – state partitions hide paths, faults (Simons, 2005) • Lazy systematic testing method: the insight • Complete testing requires a specification (even in XP!) • Infer an up-to-date specification from a code prototype • Let tools handle systematic test generation and coverage • Let the programmer focus on novel/unpredicted results
Lazy Systematic Unit Testing • Lazy Specification • late inference of a specification from evolving code • semi-automatic, by static and dynamic analysis of code with limited user interaction • specification evolves in step with modified code • Systematic Testing • bounded exhaustive testing, up to the specification • emphasis on completeness, conformance, correctness properties after testing, repeatable test quality http://en.wikipedia.org/wiki/Lazy_systematic_unit_testing
JWalk 1.0 Toolset • JWalk Tester • JWalk Utility • JWalk Editor • JWalk Marker • JWalk Grapher • JWalk SOAR
JWalk Tester • Lazy systematic unit testing for Java • static analysis – extracts the public API of a compiled Java class • protocol walk (all paths) – explores, validates all interleaved methods to a given path depth • algebra walk (memory states) – explores, validates all observations on all mutator-method sequences • state walk (high-level states) – explores, validates n-switch transition cover for all high-level states http://www.dcs.shef.ac.uk/~ajhs/jwalk/
Baseline Approaches • Breadth-first generation • all constructors and all interleaved methods (eg JCrasher, DSD-Crasher, Jov) • generate-and-filter (eg Rostra, Java Pathfinder) by state equivalence class • Computational cost • exponential growth, memory issues, wasteful over-generation, even if filtering is later applied • $\#\mathrm{paths} = \sum_{k=0}^{n} c \cdot m^{k}$ • Key: c = #constructors, m = #methods, k = depth
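The cost formula above can be checked with a few lines of Java. This is a minimal sketch; the class and method names are hypothetical and are not part of JWalk. The arguments c = 1, m = 6 and c = 1, m = 8 are assumptions, chosen because they reproduce the baseline columns reported for Stack and Reservable Book at depth 5.

```java
final class BaselinePathCount {
    /** Sums c * m^k for k = 0..depth, i.e. the breadth-first baseline cost. */
    static long baseline(int constructors, int methods, int depth) {
        long total = 0;
        long pathsAtDepth = constructors; // c * m^0
        for (int k = 0; k <= depth; k++) {
            total += pathsAtDepth;
            pathsAtDepth *= methods;      // each path is extended by every method
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(baseline(1, 6, 5)); // 9331
        System.out.println(baseline(1, 8, 5)); // 37449
    }
}
```

The loop makes the exponential growth explicit: every extra cycle multiplies the number of new paths by m.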
Dynamic Pruning • Interleaved analysis • generate-and-evaluate, pruning active paths on the fly (eg JWalk, Randoop) • remove redundant prefix paths after each test cycle, don’t bother to expand in next cycle • Increasing sophistication • prune prefix paths ending in exceptions (fail again) • JWalk, Randoop (2007) • and prefixes ending in algebraic observers (unchanged) • JWalk 0.8 (2007) • and prefixes ending in algebraic transformers (reentrant) • JWalk 1.0 (2009)
Prune Exceptions… • Prune error-prefixes (JWalk 0.8, Randoop) [Figure: tree of method paths from new over push, pop, top. Key: novel state, exception]
…and Observers • Prune error- and observer-prefixes (JWalk 0.8) [Figure: tree of method paths from new over push, pop, top. Key: novel state, exception, unchanged state]
Algebraic Pruning • Prune error-, observer- and transformer-prefixes (JWalk 1.0) [Figure: tree of method paths from new over push, pop, top. Key: novel state, exception, unchanged state, reentrant state]
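The three pruning rules can be sketched as a generate-and-evaluate loop over a toy stack. This is an illustrative sketch only, not JWalk's implementation: the class name, the three-method API (push, pop, top) and the use of the stack contents as the state key are all assumptions, and push always pushes the same element so that states differ only in size.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class PruningWalk {
    /** Applies one method to a copy of the state; returns null if it raises an exception. */
    static List<Integer> apply(List<Integer> state, String method) {
        List<Integer> next = new ArrayList<>(state);
        switch (method) {
            case "push":
                next.add(0);                       // same element every time (assumption)
                return next;
            case "pop":
                if (next.isEmpty()) return null;   // exception on empty stack
                next.remove(next.size() - 1);
                return next;
            case "top":
                return next.isEmpty() ? null : next; // observer: state unchanged
            default:
                throw new IllegalArgumentException(method);
        }
    }

    /** Returns the cumulative number of paths evaluated up to maxDepth. */
    static int explore(int maxDepth) {
        String[] methods = {"push", "pop", "top"};
        Set<List<Integer>> seen = new HashSet<>();
        List<List<Integer>> frontier = new ArrayList<>(); // end states of retained paths
        List<Integer> initial = new ArrayList<>();
        seen.add(initial);
        frontier.add(initial);
        int explored = 1;                                  // the constructor path
        for (int depth = 1; depth <= maxDepth; depth++) {
            List<List<Integer>> next = new ArrayList<>();
            for (List<Integer> state : frontier) {
                for (String m : methods) {
                    explored++;
                    List<Integer> after = apply(state, m);
                    if (after == null) continue;           // prune: ends in exception
                    if (after.equals(state)) continue;     // prune: observer, unchanged state
                    if (!seen.add(after)) continue;        // prune: transformer, reentrant state
                    next.add(after);                       // novel state: expand next cycle
                }
            }
            frontier = next;
        }
        return explored;
    }
}
```

With all three rules active, only one path per cycle reaches a novel state in this toy example, so the cumulative count grows linearly with depth rather than exponentially.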
What is the Same State? • Some earlier approaches • distinguish observers, mutators by signature (Rostra) • intrusive state equality predicate methods (ASTOOT) • external (partial) state equality predicates (Rostra) • subsumption of execution traces in JVM (Pathfinder) • Some algebraic approaches • shallow, deep equality under all observers (TACCLE) • but assumes observations are also comparable • very costly to compute from first principles • serialise object states and hash (Henkel & Diwan) • but not all objects are serialisable • no control over depth of comparison
Smart State Inspection • Reflection-and-hash • extract state vector from objects • compute hash code for each field • order-sensitive combination hash code • Proper depth control • shallow or deep equality settings, to chosen depth • hash on pointer, or recursively invoke algorithm • Fast state comparison • each test evaluation stores posterior state code • fast comparison with preceding, or all prior states • possible to detect unchanged, or reentrant states
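A reflection-and-hash state code with depth control might look like the sketch below. It is not JWalk's code: the class names, the constants 17 and 31, and the example IntStack are assumptions made for illustration. The sketch targets user-defined classes, since reflective access to the private fields of JDK classes may be refused on recent Java versions.

```java
import java.lang.reflect.Array;
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;

/** Hypothetical class under test; pop clears the vacated slot so equal states hash equally. */
final class IntStack {
    private int[] items = new int[4];
    private int size = 0;
    void push(int e) { items[size++] = e; }
    int pop() { int e = items[--size]; items[size] = 0; return e; }
}

final class StateHash {
    /** Order-sensitive hash over the state vector; depth = levels of references to follow. */
    static int stateCode(Object obj, int depth) {
        if (obj == null) return 0;
        Class<?> type = obj.getClass();
        int code = 17;
        if (type.isArray()) {
            boolean primitive = type.getComponentType().isPrimitive();
            for (int i = 0; i < Array.getLength(obj); i++) {
                code = 31 * code + valueCode(Array.get(obj, i), primitive, depth);
            }
            return code;
        }
        // Field order is whatever the JVM reports; it only needs to be stable within one run.
        for (Class<?> c = type; c != null && c != Object.class; c = c.getSuperclass()) {
            for (Field f : c.getDeclaredFields()) {
                if (Modifier.isStatic(f.getModifiers()) || f.isSynthetic()) continue;
                try {
                    f.setAccessible(true);
                    code = 31 * code + valueCode(f.get(obj), f.getType().isPrimitive(), depth);
                } catch (IllegalAccessException e) {
                    throw new IllegalStateException(e);
                }
            }
        }
        return code;
    }

    private static int valueCode(Object value, boolean primitive, int depth) {
        if (value == null) return 0;
        if (primitive || value instanceof String || value instanceof Number
                || value instanceof Boolean || value instanceof Character) {
            return value.hashCode();                 // hash on value
        }
        return depth > 0
                ? stateCode(value, depth - 1)        // deep: recursively invoke the algorithm
                : System.identityHashCode(value);    // shallow: hash on pointer
    }
}
```

Storing the code returned after each test evaluation allows a cheap integer comparison with the preceding state (unchanged) or with all prior states (reentrant). As with any hash, distinct states can collide, so equal codes are evidence of equal states rather than proof.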
Pruning: Stack
Table 1: Cumulative paths explored after each test cycle
Stack (cycle)   baseline   except.   observ.   algebr.
0               1          1         1         1
1               7          7         7         7
2               43         31        13        13
3               259        139       25        19
4               1555       667       43        25
5               9331       3391      79        31
• Pruned: 9,300 redundant paths • Retained: 31 significant paths (best 0.33%)
Pruning: Reservable Book
Table 2: Cumulative paths explored after each test cycle
ResBook (cycle)   baseline   except.   observ.   algebr.
0                 1          1         1         1
1                 9          9         9         9
2                 73         73        25        25
3                 585        561       49        33
4                 4681       4185      97        41
5                 37449      memex     169       41
• Pruned: 37,408 redundant paths • Retained: 41 significant paths (best 0.12%)
Test Result Prediction • Semi-automatic validation • the user confirms or rejects key results • these constitute a test oracle, used in prediction • eventually > 90% test outcomes predicted • JWalk test result prediction rules • eg: predict repeat failure • new().pop().push(e) == new().pop() • eg: predict same state • target.size().push(e) == target.push(e) • eg: predict same result • target.push(e).pop().size() == target.size()
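The "predict same result" rule can be sketched as an oracle keyed by state and observer. This is a hedged illustration, not JWalk's oracle: the class name, the string key and the use of the stack contents as the state are assumptions.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

final class PredictingOracle {
    // Confirmed results, keyed by (state, observer); the key format is an assumption.
    private final Map<String, Object> confirmed = new HashMap<>();

    private static String key(List<Integer> state, String observer) {
        return state + "#" + observer;
    }

    /** Records a result that the user has confirmed for an observer in a given state. */
    void confirm(List<Integer> state, String observer, Object result) {
        confirmed.put(key(state, observer), result);
    }

    /** Predicts the result if the same observer was already confirmed in an equal state. */
    Optional<Object> predict(List<Integer> state, String observer) {
        return Optional.ofNullable(confirmed.get(key(state, observer)));
    }

    public static void main(String[] args) {
        PredictingOracle oracle = new PredictingOracle();
        List<Integer> target = new ArrayList<>();       // state of new()
        oracle.confirm(target, "size", 0);              // user confirms new().size() == 0

        List<Integer> afterPushPop = new ArrayList<>(target);
        afterPushPop.add(1);                            // push(e)
        afterPushPop.remove(afterPushPop.size() - 1);   // pop()

        // Same state as target, so size() is predicted rather than confirmed again.
        System.out.println(oracle.predict(afterPushPop, "size").isPresent()); // true
    }
}
```

A state that has never been confirmed yields an empty prediction, which is the point at which the user would be asked for a new confirmation.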
Kinds of Prediction • Strong prediction • From known results, guarantee further outcomes in the same equivalence class • eg: observer prefixes empirically checked before making any inference, unchanged state is guaranteed • target.push(e).size().top() == target.push(e).top() • Weak prediction • From known facts, guess further outcomes; an incorrect guess will be revealed in the next cycle • eg: methods with void type usually return no result, but may raise an exception • target.pop() predicted to have no result • target.pop().size() == -1 reveals an error
Test Confirmation – JWalk 0.8 • Confirm all observations, errors on all state-modifying paths [Figure: tree of method paths from new over push, pop, top. Key: confirm result, confirm error, predicted result]
Test Confirmation – JWalk 1.0 • Confirm all observations, errors on all primitive algebraic constructions [Figure: tree of method paths from new over push, pop, top. Key: confirm result, confirm error, predicted result]
Confirmations: Stack
Table 3: Confirmations per test cycle (new oracle)
Stack (cycle)   v0.8 alg   v0.8 pro   v1.0 alg   v1.0 pro
0               1          -          1          -
1               5          -          5          -
2               4          -          4          -
3               9          -          4          -
4               12         -          4          +4
5               26         -          4          +8
Total           57         57         22         34
• JWalk 0.8: trained oracle after 57 confirmations • JWalk 1.0: trained oracle after 34 confirmations
Confirmations: Reservable Book
Table 4: Confirmations per test cycle (inherited oracle)
ResBook (cycle)   v0.8 alg   v0.8 pro   v1.0 alg   v1.0 pro
0                 1          -          1          -
1                 2          -          2          -
2                 8          -          8          -
3                 12         -          6          -
4                 30         -          6          +20
5                 40         memex      -          memex
Total             93         93         23         43
• JWalk 0.8: trained oracle after 93 confirmations • JWalk 1.0: trained oracle after 43 confirmations
Why Residual Confirmations? • Prediction based on state equality • from state equivalence: • target.push(e).pop() == target • predict identical observations: • target.push(e).pop().size() == target.size() • Novel states occur in longer protocols • JWalk has deterministic argument synthesis: • elements generated in order: e1, e2, … en • algebraic reduction yields a novel state: • target.push(e1).pop().push(e2) == target.push(e2) • target.push(e2) != target.push(e1) from the oracle
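The residual case can be reproduced with a small replay sketch. The class name and the integer elements 1, 2, … standing in for e1, e2, … are assumptions. Each push receives the next synthesized element in order, as described above.

```java
import java.util.ArrayList;
import java.util.List;

final class ArgumentSynthesis {
    /** Replays a path of "push"/"pop" calls; the i-th push receives element i (e1, e2, ...). */
    static List<Integer> replay(String... path) {
        List<Integer> stack = new ArrayList<>();
        int nextElement = 1;
        for (String method : path) {
            if (method.equals("push")) {
                stack.add(nextElement++);          // deterministic argument synthesis
            } else if (method.equals("pop")) {
                stack.remove(stack.size() - 1);
            } else {
                throw new IllegalArgumentException(method);
            }
        }
        return stack;
    }

    public static void main(String[] args) {
        System.out.println(replay("push"));                // [1]
        System.out.println(replay("push", "pop", "push")); // [2]
    }
}
```

The longer path reduces algebraically to a single push, but the pushed element is e2, so its state is not one the oracle has already seen for push(e1) and the observations on it still need confirming.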
Conclusions … • Test path pruning • algebraic analysis effective at eliminating redundant paths • absolutely necessary when testing classes with large APIs • java.lang.Character: c = 1, m = 78; d3 base = 480,715 paths; alg = 79 paths, stable after 1 cycle • java.lang.String: c = 13, m = 64; d3 base = 54,093 paths; alg = 845 paths, stable after 1 cycle • More test automation • presents user with the ideal minimal test-set for judgement • user only has to confirm all errors and observations on all primitive algebraic constructions
Conclusions • Faster state exploration • algebra-walking finds the leaves of the algebra-tree faster • state-walking discovers high-level states faster, by growing only primitive state-modifying paths • can afford to search to greater test depths • Test result prediction • algebraic analysis improves predictive power as expected • but oracle must also have the reduction (and may not) • future idea: axiom generalisation? (Henkel & Diwan)
Thank You! • And thanks also to: • Wenwen Zhao – hashing on states for comparison • Mihai-Gabriel Glont – prototype UI for JWalkTester • Arne-Michael Toersel – case studies for JWalk Let’s go JWalking! http://www.dcs.shef.ac.uk/~ajhs/jwalk/
Confirmations: Library Book
Table 5: Confirmations per test cycle (new oracle)
LibBook (cycle)   v0.8 alg   v0.8 pro   v1.0 alg   v1.0 pro
0                 1          -          1          -
1                 2          -          2          -
2                 3          -          3          -
3                 2          -          -          -
4                 3          -          -          +3
5                 2          -          -          -
Total             13         13         6          9
• JWalk 0.8: depth-5 oracle after 13 confirmations • JWalk 1.0: depth-5 oracle after 9 confirmations