A study of time-aware test suite prioritization using knapsack solvers, evaluating how efficiently and effectively they find errors sooner in software testing. The experiments cover several algorithms and metrics.
Efficient Time-Aware Prioritization with Knapsack Solvers
Sara Alspaugh, Kristen R. Walcott, Mary Lou Soffa (University of Virginia)
Michael Belanich, Gregory M. Kapfhammer (Allegheny College)
ACM WEASEL Tech
Test Suite Prioritization
• Testing occurs throughout the software development life cycle
• Challenge: time consuming and costly
• Prioritization: reordering the test suite
  • Goal: find errors sooner in testing
  • Doesn’t consider the overall time budget
• Alternative: time-aware prioritization
  • Goal 1: find errors sooner in testing
  • Goal 2: execute within time constraint
Motivating Example
Original test suite with fault information:
• T1: 4 faults, 2 min.
• T2: 1 fault, 2 min.
• T3: 2 faults, 2 min.
• T4: 6 faults, 2 min.
Assume:
• Same execution time
• Unique faults found
Prioritized test suite:
• T4: 6 faults, 2 min.
• T1: 4 faults, 2 min.
• T3: 2 faults, 2 min.
• T2: 1 fault, 2 min.
Testing time budget: 4 minutes
The Knapsack Problem for Time-Aware Prioritization
Maximize: $\sum_{i=1}^{n} c_i x_i$, where $c_i$ is the code coverage of test $i$ and $x_i$ is either 0 or 1.
Subject to the constraint: $\sum_{i=1}^{n} t_i x_i \le t_{max}$, where $t_i$ is the execution time of test $i$ and $t_{max}$ is the time budget.
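The Knapsack Problem for Time-Aware Prioritization
Assume test cases cover unique requirements.
Time budget: 4 min.
• T1: 4 lines, 2 min.
• T2: 1 line, 2 min.
• T3: 2 lines, 2 min.
• T4: 5 lines, 2 min.
Knapsack state as test cases are added:
• Total value: 0, space remaining: 4 min.
• Total value: 5, space remaining: 2 min.
• Total value: 9, space remaining: 0 min.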
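The example above can be checked with a small 0/1 knapsack solver. The following is a minimal sketch, assuming integer execution times in minutes; the function name `knapsack_01` is hypothetical and this is a textbook dynamic program, not the implementation used in the study.

```python
from typing import List, Tuple


def knapsack_01(coverage: List[int], minutes: List[int], budget: int) -> Tuple[int, List[int]]:
    """Return (best total coverage, indices of the selected tests)."""
    n = len(coverage)
    # best[i][b]: maximum coverage using only the first i tests within b minutes
    best = [[0] * (budget + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        c, t = coverage[i - 1], minutes[i - 1]
        for b in range(budget + 1):
            best[i][b] = best[i - 1][b]  # option 1: skip test i
            if t <= b:  # option 2: run test i if it fits
                best[i][b] = max(best[i][b], best[i - 1][b - t] + c)
    # Walk the table backwards to recover which tests were selected.
    chosen: List[int] = []
    b = budget
    for i in range(n, 0, -1):
        if best[i][b] != best[i - 1][b]:
            chosen.append(i - 1)
            b -= minutes[i - 1]
    chosen.reverse()
    return best[n][budget], chosen


# Slide example: T1..T4 cover 4, 1, 2, and 5 lines, 2 minutes each, 4-minute budget.
value, chosen = knapsack_01([4, 1, 2, 5], [2, 2, 2, 2], 4)
print(value, chosen)  # 9 [0, 3], i.e. T1 and T4
```

The table needs one row per test and one column per minute of budget, which hints at why table-based solvers can become memory-hungry on larger inputs.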
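The Extended Knapsack Problem
• Value of each test case depends on test cases already in prioritization
• Test cases may cover same requirements
Time budget: 4 min.
• T1: 4 lines, 2 min.
• T2: 1 line, 2 min.
• T3: 2 lines, 2 min.
• T4: 5 lines, 2 min.
UPDATE: T1 changes from 4 lines, 2 min. to 0 lines, 2 min.
Knapsack state as test cases are added:
• Total value: 0, space remaining: 4 min.
• Total value: 5, space remaining: 2 min.
• Total value: 7, space remaining: 0 min.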
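The update step can be illustrated with a greedy loop that recomputes each remaining test's value after every selection. This is only a sketch: the concrete line sets below are assumptions chosen so that T1 overlaps T4 completely, as the slide's numbers suggest, and `greedy_with_update` is a hypothetical helper, not one of the solvers evaluated in the study.

```python
from typing import Dict, List, Set, Tuple


def greedy_with_update(
    covers: Dict[str, Set[int]], minutes: Dict[str, int], budget: int
) -> Tuple[List[str], int]:
    """Repeatedly pick the fitting test that adds the most uncovered lines."""
    covered: Set[int] = set()
    order: List[str] = []
    remaining = budget
    candidates = set(covers)
    while True:
        fitting = [t for t in sorted(candidates) if minutes[t] <= remaining]
        if not fitting:
            break
        # UPDATE: a test's value is the number of lines it adds beyond `covered`.
        best = max(fitting, key=lambda t: len(covers[t] - covered))
        if not covers[best] - covered:
            break  # no remaining test contributes new coverage
        order.append(best)
        covered |= covers[best]
        remaining -= minutes[best]
        candidates.remove(best)
    return order, len(covered)


# Assumed line sets: every line T1 covers is also covered by T4.
covers = {
    "T1": {1, 2, 3, 4},
    "T2": {6},
    "T3": {7, 8},
    "T4": {1, 2, 3, 4, 5},
}
minutes = {"T1": 2, "T2": 2, "T3": 2, "T4": 2}
order, total = greedy_with_update(covers, minutes, budget=4)
print(order, total)  # ['T4', 'T3'] 7
```

Without the update, a solver that treats coverage as fixed would pair T4 with T1 and report a value of 9, although only 5 distinct lines would be covered under these assumed sets.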
Goals and Challenges
• Evaluate traditional and extended knapsack solvers for use in time-aware prioritization
  • Effectiveness
    • Coverage-based metrics
  • Efficiency
    • Time overhead
    • Memory overhead
• How does overlapping code coverage affect results of traditional techniques?
• Is the cost of extended knapsack algorithms worthwhile?
The Knapsack Solvers
• Random: select test cases at random
• Greedy by Ratio: order by coverage/time
• Greedy by Value: order by coverage
• Greedy by Weight: order by time
• Dynamic Programming: break problem into sub-problems; use sub-problem results for main solution
• Generalized Tabular: use large tables to store sub-problem solutions
The Knapsack Solvers (continued)
• Core: compute optimal fractional solution, then exchange items until optimal integral solution found
• Overlap-Aware: uses a genetic algorithm to solve the extended knapsack problem for time-aware prioritization
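The three greedy solvers differ only in their sort key. A minimal sketch follows, assuming that Greedy by Ratio and Greedy by Value sort in descending order, that Greedy by Weight sorts by ascending time, and that a test which does not fit is skipped rather than ending the scan. The slides do not spell out these details, and the test data is made up for illustration.

```python
from typing import Callable, Dict, List, Tuple

Test = Tuple[str, float, float]  # (name, coverage, execution time in minutes)

# Sort keys; negated where a larger quantity should come first.
SORT_KEYS: Dict[str, Callable[[Test], float]] = {
    "ratio": lambda t: -(t[1] / t[2]),  # highest coverage per minute first
    "value": lambda t: -t[1],           # highest coverage first
    "weight": lambda t: t[2],           # shortest execution time first (assumed)
}


def greedy_prioritize(tests: List[Test], budget: float, strategy: str) -> List[str]:
    """Order tests by the chosen key, keeping each one that still fits the budget."""
    chosen: List[str] = []
    remaining = budget
    for name, _, minutes in sorted(tests, key=SORT_KEYS[strategy]):
        if minutes <= remaining:
            chosen.append(name)
            remaining -= minutes
    return chosen


# Hypothetical test suite with a 4-minute budget.
tests = [("A", 6, 3), ("B", 5, 1), ("C", 2, 1), ("D", 7, 4)]
by_ratio = greedy_prioritize(tests, 4, "ratio")    # ['B', 'A']
by_value = greedy_prioritize(tests, 4, "value")    # ['D']
by_weight = greedy_prioritize(tests, 4, "weight")  # ['B', 'C']
```

On this made-up suite the three strategies return three different prioritizations, which illustrates why the choice of solver can depend on the time and coverage profile of the test cases.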
The Scaling Heuristic
• Order the test cases by their coverage-to-execution-time ratio such that: $\frac{c_1}{t_1} \ge \frac{c_2}{t_2} \ge \cdots \ge \frac{c_n}{t_n}$
• If $c_1 \times \lfloor t_{max} / t_1 \rfloor \ge c_2 \times (t_{max} / t_2)$, then it is possible to find an optimal solution that includes $T_1$.
• Check the inequality for each test case until it no longer holds.
• The test cases $T_1, \ldots, T_{i-1}$ for which the inequality held belong in the final prioritization.
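One way to read the scaling heuristic is as a pre-processing pass that fixes tests before a solver runs. The sketch below rests on several assumptions that the slide does not confirm: the inequality is read as c_1 × ⌊t_max / t_1⌋ ≥ c_2 × (t_max / t_2), each test is compared with the next one in ratio order, and the budget shrinks by the execution time of each fixed test. The name `scale` and the sample data are hypothetical.

```python
from math import floor
from typing import List, Tuple

Test = Tuple[str, float, float]  # (name, coverage, execution time in minutes)


def scale(tests: List[Test], budget: float) -> Tuple[List[str], List[Test], float]:
    """Return (tests fixed by the check, tests left for the solver, budget left)."""
    # Order by coverage-to-time ratio, highest first.
    ordered = sorted(tests, key=lambda t: t[1] / t[2], reverse=True)
    fixed: List[str] = []
    remaining = budget
    i = 0
    while i + 1 < len(ordered):
        name, c1, t1 = ordered[i]
        _, c2, t2 = ordered[i + 1]
        if t1 > remaining:
            break  # the current test no longer fits
        # Assumed form of the check from the slide.
        if c1 * floor(remaining / t1) >= c2 * (remaining / t2):
            fixed.append(name)
            remaining -= t1
            i += 1
        else:
            break  # stop at the first test for which the inequality fails
    return fixed, ordered[i:], remaining


# Hypothetical suite: the check holds for A and B, leaving only C for the solver.
fixed, rest, left = scale([("A", 10, 2), ("B", 3, 2), ("C", 1, 2)], budget=5)
print(fixed, rest, left)  # ['A', 'B'] [('C', 1, 2)] 1
```

Under this reading, the solver only has to handle `rest` within the budget `left`, which is how the heuristic could reduce overhead when the check holds for many tests.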
Implementation Details
Diagram components: TestSuite (T), Program Under Test (P), Test Transformer, CoverageCalculator, Knapsack Solver, New TestSuite (T’)
• Knapsack Solver Parameters
  1. Selected Solver
  2. Reduction Preference
  3. Knapsack Size
Evaluation Metrics
• Code coverage: percentage of requirements executed when the prioritization is run
  • Basic block coverage used
• Coverage preservation: proportion of code covered by the prioritization versus code covered by the entire original test suite
• Order-aware coverage: considers both the order in which test cases execute and the overall code coverage
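The first two metrics can be written down directly from their definitions. This is a minimal sketch, assuming coverage is available as a set of basic-block identifiers per test; the function names and sample data are hypothetical. Order-aware coverage is left out because the slide does not give its formula.

```python
from typing import Dict, Iterable, Set


def blocks_covered(names: Iterable[str], covers: Dict[str, Set[int]]) -> Set[int]:
    """Union of the basic blocks executed by the named tests."""
    result: Set[int] = set()
    for name in names:
        result |= covers[name]
    return result


def code_coverage(prioritized: Iterable[str], covers: Dict[str, Set[int]], total_blocks: int) -> float:
    """Percentage of all basic blocks executed by the prioritization."""
    return 100.0 * len(blocks_covered(prioritized, covers)) / total_blocks


def coverage_preservation(prioritized: Iterable[str], covers: Dict[str, Set[int]]) -> float:
    """Blocks covered by the prioritization relative to the whole original suite."""
    full = blocks_covered(covers.keys(), covers)
    if not full:
        return 0.0  # an original suite that covers nothing preserves nothing
    return len(blocks_covered(prioritized, covers)) / len(full)


# Hypothetical coverage data for a program with 10 basic blocks.
covers = {"T1": {1, 2, 3, 4}, "T2": {6}, "T3": {7, 8}, "T4": {1, 2, 3, 4, 5}}
cc = code_coverage(["T4", "T3"], covers, total_blocks=10)  # 70.0
cp = coverage_preservation(["T4", "T3"], covers)           # 0.875
```

Both functions ignore execution order, which is the gap the order-aware metric is meant to fill.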
Experiment Design
• Goals of experiment:
  • Measure efficiency of algorithms and scaling in terms of time and space overhead
  • Measure effectiveness of algorithms and scaling in terms of three coverage-based metrics
• Case studies:
  • JDepend
  • Gradebook
• Knapsack size:
  • 25, 50, and 75% of execution time of original test suite
Summary of Experimental Results
• Prioritizer Effectiveness:
  • Overlap-aware solver had highest overall coverage for each time limit
  • Greedy by Value solver good for Gradebook
  • All Greedy solvers good for JDepend
• Prioritizer Efficiency:
  • All algorithms took small amount of time and memory except for Dynamic Programming, Generalized Tabular, and Core
  • Overlap-aware solver required hours to run
  • Generalized Tabular had prohibitively large memory requirements
  • Scaling heuristic reduced overhead in some cases
Conclusions
• Most sophisticated algorithm not necessarily most effective or most efficient
• Trade-off: effectiveness versus efficiency
• Efficiency or effectiveness most important?
  • Effectiveness: overlap-aware prioritizer
  • Efficiency: low-overhead prioritizer
• Prioritizer choice depends on test suite nature
  • Time versus coverage of each test case
  • Coverage overlap between test cases
Future Research
• Use larger case studies with bigger test suites
• Use case studies written in other languages
• Evaluate other knapsack solvers such as branch-and-bound and parallel solvers
• Incorporate other metrics such as APFD
• Use synthetically generated test suites
Thank you! Questions? http://www.cs.virginia.edu/walcott/weasel.html