Adaptive Random Test Case Prioritization. Speaker: Bo Jiang*. Co-authors: Zhenyu Zhang*, W.K. Chan†, T.H. Tse*. *The University of Hong Kong, †City University of Hong Kong.
Contents • Background • Motivation • Adaptive Random Test Case Prioritization • Experiments and Results Analysis • Related Works • Conclusion & Future work
Regression Testing Techniques • Regression testing accounts for 50% of the cost of software maintenance. • Obsolete Test Case Elimination • Test Case Reduction • Test Case Augmentation • Test Case Selection • Test Case Prioritization [Diagram: Program P with Test Suite T; modified Program P’ with Test Suite T’]
Test Case Prioritization • Definition • Test case prioritization permutes a test suite T for execution to meet a chosen testing goal. • Typical testing goals • Rate of code coverage • Rate of fault detection • Rate of requirement coverage • Merits • No impact on the fault detection ability
Coverage-based Test Case Prioritization Techniques • Total-statement/function/branch • Highest code coverage first • Resolve ties randomly • Additional-statement/function/branch • Highest additional code coverage first • Reset when no more coverage can be achieved • Resolve ties randomly • Disadvantages • Hard to scale to larger programs
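The total and additional strategies above can be sketched as follows. This is a minimal illustration, not the implementation evaluated in the talk: the `coverage` mapping (test id to the set of covered statements, functions, or branches) and both function names are assumptions made for the example.

```python
import random

def total_prioritization(coverage, rng=None):
    """Order tests by total coverage, highest first; ties are broken randomly."""
    rng = rng or random.Random()
    tests = sorted(coverage)
    rng.shuffle(tests)  # shuffle first so the stable sort breaks ties randomly
    return sorted(tests, key=lambda t: len(coverage[t]), reverse=True)

def additional_prioritization(coverage, rng=None):
    """Repeatedly pick the test that adds the most not-yet-covered elements."""
    rng = rng or random.Random()
    remaining = set(coverage)
    covered = set()
    order = []
    while remaining:
        gains = {t: len(coverage[t] - covered) for t in remaining}
        best_gain = max(gains.values())
        if best_gain == 0 and covered:
            covered = set()  # reset: no remaining test adds coverage
            continue
        # Sort the tied tests so the random choice is reproducible for a seeded rng.
        tied = sorted(t for t in remaining if gains[t] == best_gain)
        chosen = rng.choice(tied)
        order.append(chosen)
        remaining.remove(chosen)
        covered |= coverage[chosen]
    return order

# Hypothetical coverage data for four tests.
example = {"t1": {1, 2, 3}, "t2": {1, 2}, "t3": {4}, "t4": {3, 4, 5, 6}}
print(total_prioritization(example, random.Random(0)))
print(additional_prioritization(example, random.Random(0)))
```

The additional strategy recomputes the coverage gain of every remaining test on each iteration, which is one way to see why it becomes costly on larger programs and test suites.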
Problem With Total (Greedy) Techniques • The total strategy may NOT be effective for real-life programs. [Figure: APFD on GREP and FLEX] Elbaum et al. @ TSE 2002
Problems with Additional Techniques [Figure: time used for prioritization; series: Random Siemens, Random Unix, Total Siemens, Total Unix, Additional Siemens, Additional Unix] • Additional techniques may NOT be efficient for real-life programs. • Can we find a prioritization technique that is both effective and efficient for real-life programs?
Adaptive Random Testing (ART) • A technique for test case generation • Evenly spreads randomly generated test cases across the input domain. • In empirical studies, ART can detect failures using up to 50% fewer test cases than random testing.
Fixed-Sized-Candidate-Set ART Algorithm • Randomly generate a test case and execute it. • Randomly generate a set of candidate test cases. • For each candidate test case, find its nearest neighbor among the executed test cases. • Select the candidate that is farthest from its nearest neighbor and execute it. • Repeat until a failure is encountered.
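The Fixed-Sized-Candidate-Set procedure can be sketched as below. This is only an illustration under stated assumptions: the input domain is taken to be the one-dimensional interval [0, 1), distance is the absolute difference, and the names `fscs_art`, `is_failure`, `k`, and `max_tests` are hypothetical, not from the slides.

```python
import random

def fscs_art(is_failure, k=10, max_tests=1000, rng=None):
    """Run FSCS-ART on [0, 1); return the executed inputs in order.

    The run stops at the first failing input or after max_tests inputs.
    """
    rng = rng or random.Random()
    executed = [rng.random()]  # the first test case is purely random
    if is_failure(executed[0]):
        return executed
    while len(executed) < max_tests:
        # Randomly generate a fixed-size set of candidate test cases.
        candidates = [rng.random() for _ in range(k)]
        # Keep the candidate whose nearest executed neighbor is farthest away.
        best = max(candidates, key=lambda c: min(abs(c - e) for e in executed))
        executed.append(best)
        if is_failure(best):
            break
    return executed

# Hypothetical failure region: inputs in [0.40, 0.45) fail.
run = fscs_art(lambda x: 0.40 <= x < 0.45, rng=random.Random(0))
print(len(run), run[-1])
```

Each new test requires a distance computation against every executed test for every candidate, so the cost per selection grows with the number of tests already executed.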
Adaptive Random Testing (ART) • ART is based on the observation that failures tend to cluster across the input domain. • Intuitively, evenly spreading the test cases may increase the probability of exposing the first fault sooner. • In test case prioritization, we also want to increase the rate of fault detection.
Use ART directly for test case prioritization? • The variety of black-box input information makes it hard to define a general distance metric. • Video streams • Images • XML • … • The white-box coverage information of the previously executed test cases is readily available • Statement coverage • Branch coverage • Function coverage • …
MDS Display of Distribution of Failures in Profile Space on LilyPond • Failures tend to cluster together. William Dickinson et al. @ FSE, 2001.
MDS Display of Distribution of Failures in Profile Space on GCC • Failures tend to cluster together. William Dickinson et al. @ FSE, 2001.
Use ART directly for test case prioritization? • The variety of black-box input information makes it hard to define a uniform distance metric. • Video streams • Images • XML • … • The white-box coverage information of the previously executed test cases is readily available • Statement coverage • Branch coverage • Function coverage • … Why NOT use such low-cost white-box information to evenly spread test cases across the code coverage space?
Adaptive Random Test Case Prioritization • Generate candidate set • Randomly select a test case into the candidate set • If code coverage improves, continue; otherwise, stop. • Merits: no magic number, non-parametric • Select the candidate farthest from the prioritized set • Distance between test cases • Distance between a candidate test case and the already prioritized test cases • Repeat until all test cases are prioritized
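The candidate-set step can be sketched as follows. The slide does not say whether the test case that fails to improve coverage is kept, so this sketch assumes it is discarded, and it also assumes the set must contain at least one test; `generate_candidate_set` and the `coverage` mapping (test id to set of covered elements) are hypothetical names.

```python
import random

def generate_candidate_set(unprioritized, coverage, rng=None):
    """Randomly add tests while each one improves the candidate set's coverage."""
    rng = rng or random.Random()
    pool = sorted(unprioritized)  # sort first so a seeded rng is reproducible
    rng.shuffle(pool)
    candidates = []
    covered = set()
    for test in pool:
        if coverage[test] <= covered:
            break  # coverage did not improve: stop (assumption: drop this test)
        candidates.append(test)
        covered |= coverage[test]
    if not candidates and pool:
        candidates.append(pool[0])  # assumption: never return an empty set
    return candidates

# Hypothetical coverage data.
example = {"a": {1, 2}, "b": {2}, "c": {3}, "d": {1, 2, 3}}
print(generate_candidate_set(example, example, random.Random(0)))
```

Because the stopping rule depends only on coverage, the size of the candidate set does not need to be chosen in advance, which matches the "no magic number" merit listed on the slide.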
Adaptive Random Test Case Prioritization • How to measure the distance between test cases • Jaccard distance • General distance metric for binary data • Other distance metrics can be substituted. • How to select the test case from the candidate set that is farthest away from the already prioritized test cases? • Maximize the minimum distance (maxmin for short) • Chen et al. @ ASIAN '04, LNCS 2004 • Maximize the average distance (maxavg for short) • Ciupa et al. @ ICSE 2008 • Maximize the maximum distance (maxmax for short)
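The distance and selection rules above might look like the sketch below. It assumes each test's coverage is a set of covered elements and that at least one test has already been prioritized; the function names and the `strategy` strings are illustrative assumptions, not the authors' code.

```python
def jaccard_distance(a, b):
    """Jaccard distance between two coverage sets: 1 - |a & b| / |a | b|."""
    union = a | b
    if not union:
        return 0.0  # assumption: two empty coverage sets are treated as identical
    return 1.0 - len(a & b) / len(union)

def set_distance(candidate, prioritized, coverage, strategy):
    """Distance from one candidate to the set of already prioritized tests."""
    dists = [jaccard_distance(coverage[candidate], coverage[p]) for p in prioritized]
    if strategy == "maxmin":
        return min(dists)               # distance to the nearest prioritized test
    if strategy == "maxavg":
        return sum(dists) / len(dists)  # average distance to prioritized tests
    if strategy == "maxmax":
        return max(dists)               # distance to the farthest prioritized test
    raise ValueError(f"unknown strategy: {strategy}")

def select_farthest(candidates, prioritized, coverage, strategy="maxmin"):
    """Pick the candidate that maximizes the chosen test-set distance."""
    return max(candidates,
               key=lambda c: set_distance(c, prioritized, coverage, strategy))

# Hypothetical coverage data: p1 and p2 are prioritized, c1..c3 are candidates.
cov = {"p1": {1, 2, 3}, "p2": {5, 6}, "c1": {1, 2}, "c2": {7, 8}, "c3": {3, 4, 5}}
print(select_farthest(["c1", "c2", "c3"], ["p1", "p2"], cov, "maxmin"))
```

In this toy data `c1` is far from `p2` but close to `p1`, so maxmax scores it as highly as `c2` while maxmin does not; this illustrates how the three definitions can rank the same candidates differently.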
Research Questions • Do different levels of coverage information have a significant impact on ART techniques? • Do different definitions of test set distance have a significant impact on ART techniques? • Are ART techniques efficient?
Experiment Setup • Dynamic coverage information collection • gcov tool • Effectiveness metric • APFD: weighted average of the percentage of faults detected over the life of the suite • Process • For each of the 11 subject programs, randomly select 20 test suites, and repeat 50 times for each ART technique.
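The slide does not spell out the APFD formula. The sketch below uses the commonly cited formulation APFD = 1 - (TF_1 + ... + TF_m) / (n * m) + 1 / (2n), where n is the number of test cases, m the number of faults, and TF_i the position of the first test case that reveals fault i. The function name, the argument layout, and the requirement that every fault is detected by at least one test in the order are assumptions for the example.

```python
def apfd(order, detecting_tests):
    """Compute APFD for a prioritized order.

    order: list of test ids in execution order.
    detecting_tests: dict mapping each fault to the set of tests that reveal it.
    Assumes every fault is revealed by at least one test in the order.
    """
    n = len(order)
    m = len(detecting_tests)
    position = {test: i + 1 for i, test in enumerate(order)}  # 1-based positions
    first_positions = [
        min(position[t] for t in tests if t in position)
        for tests in detecting_tests.values()
    ]
    return 1.0 - sum(first_positions) / (n * m) + 1.0 / (2 * n)

# Hypothetical fault matrix: f1 is revealed by t3, f2 by t1 or t4.
faults = {"f1": {"t3"}, "f2": {"t1", "t4"}}
print(apfd(["t1", "t2", "t3", "t4", "t5"], faults))
print(apfd(["t5", "t4", "t3", "t2", "t1"], faults))
```

A higher value means faults are revealed earlier in the order, so the first ordering in the usage example scores better than its reverse.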
Do different levels of coverage information have a significant impact on ART techniques? • Fix the other variable: the definition of test set distance. • Perform multiple comparisons between each pair of coverage levels and gather the statistics. • As confirmed by previous research: Branch > Statement > Function
Research Questions • Do different levels of coverage information have a significant impact on ART techniques? • Branch > Statement > Function • Do different definitions of test set distance have a significant impact on ART techniques? • Are ART techniques efficient?
The Impact of Test Set Distance • Fix the other variable: the level of coverage information. • Perform multiple comparisons between each pair of test set distances and gather the statistics. • Max-Min > Max-Avg ≈ Max-Max
Best ART Technique • ART-br-maxmin is the best ART prioritization technique.
Research Questions • Do different levels of coverage information have a significant impact on ART techniques? • Branch > Statement > Function • Do different definitions of test set distance have a significant impact on ART techniques? • Max-Min > Max-Avg ≈ Max-Max • How does ART-br-maxmin compare with greedy? • Are ART techniques efficient?
Multiple Comparisons for ART-br-maxmin on Siemens • There is only a marginal difference between ART-br-maxmin and traditional coverage-based techniques, and it is not statistically significant.