Algorithms For Optimizing Test Cases Presented by team 4 Jim Kile Don Little Samir Shah
Software Testing – So What? • July 22, 1962 – Mariner I space probe • Mission control destroys the rocket • 1985-1987 – Therac-25 medical accelerator • At least five patients die • Others seriously injured • November 2000 – National Cancer Institute Panama City • At least eight patients die • Another 20 receive significant overdoses • Physicians indicted for murder
What Is Test Case Optimization? • Typically applies to unit test cases where coverage approaches 100% • Implies ordering execution such that: • Rate of fault detection is increased • Amount of time to perform regression is reduced • Elimination of unnecessary test cases during regression runs
What Are Goals For Test Case Execution? • Increase the rate of early fault detection and correction • Find bugs early so they can be corrected early • Regression test only those areas that have changed • Reduce the amount of time to execute full unit regression test suites
Why Is Test Case Optimization Important? • Execution complexity • Non-linear problem • Elapsed time to execute • Computing resources • Human resources • Wait time for test completion • Late defect detection and correction
Who Develops Test Cases? • Developers • Dedicated quality assurance staff • Automated test generation techniques
What Are Benefits Of Automated Test Generation Techniques? • No cognitive bias • Better able to: • Generate test cases that concentrate on error prone areas • Produce highly novel test cases • Better coverage overall
Why Automated Test Optimization? • Generating set of basic test cases - easy • Easily cover 50–70% of faults • Improving a test set’s quality - hard • Improving to 90–100% • Time consuming and expensive
What Is Difference Between Optimization And Test Case Generation? • Optimization problem • Single goal is sought • Test case generation • No single goal • Optimized coverage of code under test
What Are Benefits Of GA Vs. Random Test Generation? • Random test generation • Uniform distribution • GA-based search process • More focused test set • Set focuses on identified flaws • Some highly novel test cases
How Do Genetic Algorithms Work? • A GA operates on strings of digits called chromosomes • Each digit that makes up the chromosome is called a gene • Collection of such chromosomes makes up a population
How Do Genetic Algorithms Work? • Each chromosome has a fitness value • Fitness value determines probability of survival in next generation
How Do Genetic Algorithms Work? • Algorithm begins with random population • Algorithm evolves incrementally – generation • Produces a structure from iterative development • Reproduction • Combine with another chromosome (crossover) • Adjusted slightly (mutation) • Original chromosome may have a poor/low fitness • Create offspring with much higher fitness
Uses of GAs In Testing • Generate a range of effective test data with fault revealing power • Both papers used this technique • Introduce faults in software under test to determine effectiveness of test cases • Only one paper used this technique
How Is Quality Of Test Cases Evaluated? • Number of detected injected faults • Killed by the test case • Otherwise, it’s alive • Develop a score • Errors killed by test case • Divided by test set
How Does GA Work Specifically? Example • Genetic algorithm • Generates random population of binary digits • For example each chromosome may be 36 bits long • Each twelve bit segment representing one of the sides of a triangle • Chromosomes cross between genes 5 and 32
How Does GA Work Specifically? Crossover
Before crossover between genes 5 and 32:
1) 111001110101 100101100110 001010111000
2) 111101011010 100101101010 101110110100
After crossover:
1) 111001011010 100101101010 101110111000
2) 111101110101 100101100110 001010110100
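The crossover on this slide can be reproduced with a short Python sketch. The function name `two_point_crossover` and its parameters are illustrative assumptions, not taken from the presentation. The cut points are assumed to fall after gene 5 and after gene 32, which is the reading that matches the bit strings shown.

```python
def two_point_crossover(parent_a: str, parent_b: str,
                        first_cut: int, second_cut: int) -> tuple[str, str]:
    """Swap the genes between two cut points of two equal-length bit strings.

    Genes 1..first_cut and second_cut+1..end stay with their original parent;
    genes first_cut+1..second_cut are exchanged (1-indexed, as on the slide).
    """
    if len(parent_a) != len(parent_b):
        raise ValueError("chromosomes must have equal length")
    if not 0 <= first_cut <= second_cut <= len(parent_a):
        raise ValueError("cut points out of range")
    child_a = parent_a[:first_cut] + parent_b[first_cut:second_cut] + parent_a[second_cut:]
    child_b = parent_b[:first_cut] + parent_a[first_cut:second_cut] + parent_b[second_cut:]
    return child_a, child_b


# The two 36-bit chromosomes from the slide, written as three 12-bit segments.
chromosome_1 = "111001110101" "100101100110" "001010111000"
chromosome_2 = "111101011010" "100101101010" "101110110100"

child_1, child_2 = two_point_crossover(chromosome_1, chromosome_2, 5, 32)
```

With these inputs the two children equal the "after crossover" strings on the slide: each child keeps its own first 5 and last 4 genes and takes the middle 27 from the other parent.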
How Does Mutation Work Specifically? • Mutation will randomly switch genes in population • Gene 23 in chromosome 1 was switched from a 1 to 0
How Does Mutation Work?
Before mutation of gene 23 in chromosome 1:
1) 111001110101 100101100110 001010111000
After mutation:
1) 111001110101 100101100100 001010111000
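A minimal Python sketch of the mutation step follows. `flip_gene` reproduces the single flip from the slide, and `mutate` applies a random per-gene flip. Both names and the `rate` parameter are assumptions made for illustration; the presentation does not state a mutation rate.

```python
import random


def flip_gene(chromosome: str, position: int) -> str:
    """Return a copy of the bit string with the gene at a 1-indexed position flipped."""
    index = position - 1
    flipped = "0" if chromosome[index] == "1" else "1"
    return chromosome[:index] + flipped + chromosome[index + 1:]


def mutate(chromosome: str, rate: float, rng: random.Random) -> str:
    """Flip each gene independently with probability `rate` (hypothetical parameter)."""
    return "".join(
        ("0" if gene == "1" else "1") if rng.random() < rate else gene
        for gene in chromosome
    )


chromosome_1 = "111001110101" "100101100110" "001010111000"
mutated = flip_gene(chromosome_1, 23)  # gene 23 changes from 1 to 0
```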
Generation Of Next Population • Based on a “roulette wheel” • Fitness determines the probability of selection • Chromosomes with higher fitness have a greater chance of producing offspring in the next generation than their less fit companions
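Roulette-wheel selection can be sketched in a few lines of Python. This is one common way to implement fitness-proportionate selection, not necessarily the one used in either paper; the function name and signature are assumptions.

```python
import random
from typing import Sequence, TypeVar

T = TypeVar("T")


def roulette_select(population: Sequence[T], fitnesses: Sequence[float],
                    rng: random.Random) -> T:
    """Pick one individual with probability proportional to its fitness."""
    total = sum(fitnesses)
    if total <= 0:
        raise ValueError("at least one fitness must be positive")
    spin = rng.random() * total  # where the wheel stops, in [0, total)
    running = 0.0
    for individual, fitness in zip(population, fitnesses):
        running += fitness  # each individual owns a slice as wide as its fitness
        if spin < running:
            return individual
    return population[-1]  # guards against floating-point round-off


rng = random.Random(0)
picks = [roulette_select(["weak", "strong"], [1.0, 9.0], rng) for _ in range(1000)]
```

Over many spins the fitter individual is picked far more often, yet the weaker one is never excluded outright. Python's standard `random.choices` with its `weights` argument gives the same behavior.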
Dynamic Software Testing Techniques • Structural – first paper • Code coverage • Boundary conditions • Individual or combined statement traversal • Path coverage • Functional – second paper • Confirms that a function from specification is correctly implemented • No analysis of the structure of the program
First paper: Automatic test case optimization: A bacteriologic algorithm. Benoit Baudry, Franck Fleurey, Jean-Marc Jézéquel and Yves Le Traon
Contribution • Finding an optimal set of test cases through revealing a test case’s “fault revealing power” • Building confidence in the test suite through “mutation analysis”
Bacteriologic Algorithm: Theoretical Basis • Adapted from genetic algorithms • Inspired by evolutionary ecology and bacteriologic adaptation • Similarities in this problem domain • Can’t generate a single perfect test suite
Bacteriologic Algorithm: Basic Functions • Initialization • Iterate incrementally creating new generation • Limitation • Only works on test cases of similar size
Bacteriologic Algorithm: Initialization • Initial test cases either written by hand or automatically generated • For the experiment test cases were randomly generated • Initial size set to 25 nodes
Bacteriologic Algorithm: Computing Fitness • Tool used to generate mutants of the code under test • Uses the mutation score of a set of test cases as that set’s fitness function: MS(T) = 100 × d / (m − equiv), where d is the number of mutants killed by T, m is the total number of mutants and equiv is the number of equivalent mutants • Test cases are executed to determine how many mutants they can kill • Global mutation score computed
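The mutation score formula translates directly into code. The sketch below assumes the standard reading of the symbols (killed mutants, total mutants, equivalent mutants); the function name and the sample numbers are invented for illustration and do not come from the paper.

```python
def mutation_score(killed: int, total_mutants: int, equivalent: int) -> float:
    """MS(T) = 100 * d / (m - equiv), expressed as a percentage."""
    killable = total_mutants - equivalent  # equivalent mutants can never be killed
    if killable <= 0:
        raise ValueError("there must be at least one non-equivalent mutant")
    if not 0 <= killed <= killable:
        raise ValueError("killed count must lie between 0 and the killable total")
    return 100.0 * killed / killable


# Hypothetical numbers: 55 mutants, 5 of them equivalent, 48 killed by the test set.
score = mutation_score(killed=48, total_mutants=55, equivalent=5)
```

Here the score is 96, because 48 of the 50 killable mutants are detected.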
Bacteriologic Algorithm: Memorization • Used to compute relative fitness • Test case mutation score relative to the solution set’s mutation score • Test cases are selected whose relative score exceeds the memorization threshold
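One way to picture memorization is with kill sets: for each test case, the set of mutant identifiers it kills. The sketch below assumes that a test case's relative score is the gain in mutation score it would bring to the current solution set. That interpretation, the function names and the threshold values are assumptions, not details confirmed by the slides.

```python
def relative_score(candidate_kills: set[int], solution_kills: set[int],
                   killable_mutants: int) -> float:
    """Percentage of killable mutants that only the candidate kills."""
    return 100.0 * len(candidate_kills - solution_kills) / killable_mutants


def memorize(candidates: dict[str, set[int]], solution: dict[str, set[int]],
             killable_mutants: int, threshold: float) -> dict[str, set[int]]:
    """Return the solution set extended with candidates that beat the threshold."""
    updated = dict(solution)
    solution_kills = set().union(*updated.values())
    for name, kills in candidates.items():
        if relative_score(kills, solution_kills, killable_mutants) > threshold:
            updated[name] = kills
            solution_kills |= kills  # later candidates are judged against the new set
    return updated


# Hypothetical kill sets for three candidate test cases and 10 killable mutants.
candidates = {"t1": {1, 2, 3}, "t2": {2, 3}, "t3": {4}}
memorized = memorize(candidates, solution={}, killable_mutants=10, threshold=15.0)
```

In this example only `t1` is memorized: it adds 30 points, `t2` adds nothing new once `t1` is in the set, and `t3` adds 10 points, which is below the threshold of 15.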
Bacteriologic Algorithm: Mutation • Randomly selects test cases • Selection is weighted by relative fitness of the test case • Selected cases and code are mutated to create new cases for the next generation • Code is represented by an abstract syntax tree • Nodes in the tree are replaced
Bacteriologic Algorithm: Mutation – Abstract Syntax Tree • A finite, labeled directed tree • Nodes are labeled by operators • Edges represent operands • Leaves contain variables or constants • Used in a parser • Range of all possible structures defined by the syntax
Bacteriologic Algorithm: Mutation – Abstract Syntax Tree Example: x = a + b; y = a * b; while (y > a) { a++; x = a + b; }
Bacteriologic Algorithm: Filtering • Filtering = removing • Two different implementations • Delete any test case whose relative mutation score is equal to 0 • That is, the test case kills no mutant that the test cases in the solution set haven’t already killed • Reduce the coverage matrix by deleting redundant test cases
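Both filtering variants can be sketched with kill sets (the mutant identifiers each test case kills). The slides do not say how the coverage matrix is reduced, so `drop_redundant` below is one simple greedy interpretation; all names are assumptions.

```python
def filter_zero_contribution(population: dict[str, set[int]],
                             solution_kills: set[int]) -> dict[str, set[int]]:
    """Variant 1: drop test cases that kill nothing beyond the solution set."""
    return {name: kills for name, kills in population.items()
            if kills - solution_kills}


def drop_redundant(tests: dict[str, set[int]]) -> dict[str, set[int]]:
    """Variant 2 (assumed greedy pass): drop a test whose kills are all
    covered by the other tests that are still kept."""
    kept = dict(tests)
    for name in list(kept):
        others = set().union(*(kills for other, kills in kept.items() if other != name))
        if kept[name] <= others:
            del kept[name]
    return kept


# Hypothetical kill sets.
population = {"a": {1, 2}, "b": {3}, "c": set()}
survivors = filter_zero_contribution(population, solution_kills={1, 2})
reduced = drop_redundant({"a": {1, 2}, "b": {2}, "c": {3}})
```

In the first call only `b` survives, since `a` and `c` kill nothing new. In the second, `b` is dropped because mutant 2 is already killed by `a`.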
Bacteriologic Algorithm: Results • Comparison with genetic algorithm • Both ran 50 times • Genetic algorithm results • 200 generations created • Average mutation score of 85 • Required executing an average of 480,000 test cases
Bacteriologic Algorithm: Results • Bacteriologic algorithm results • 30 generations created • Average mutation score of 96 • Required executing an average of 46,375 test cases
Second paper: Breeding software test cases with genetic algorithms. D. Berndt, J. Fisher, L. Johnson, J. Pinglikar and A. Watkins
Focus • Breeding software test cases using genetic algorithms as part of a software testing cycle • Uses automated test generation techniques • Evolving fitness function • Relies on fossil record of organisms • Search behaviors • Novelty • Proximity • Severity
Genetic Algorithm: Simple Triangle Classification Program (TRITYP) • Classify triangle by type • Three sides of the triangle • Parameters x, y, and z • Range 0 – 2000 • Search space • Illegal / legal triangle
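For readers unfamiliar with TRITYP, a triangle classifier of this kind might look like the sketch below. The slide only names the legal/illegal distinction, so the three legal categories (equilateral, isosceles, scalene) and the function name are assumptions based on the usual form of this benchmark, not on the paper's actual program.

```python
def classify_triangle(x: int, y: int, z: int) -> str:
    """Classify three side lengths; returns 'illegal' when no triangle exists."""
    # A legal triangle needs positive sides and must satisfy the triangle inequality.
    if min(x, y, z) <= 0 or x + y <= z or x + z <= y or y + z <= x:
        return "illegal"
    if x == y == z:
        return "equilateral"
    if x == y or y == z or x == z:
        return "isosceles"
    return "scalene"


# A few points from the 0 – 2000 parameter range.
results = [classify_triangle(*sides)
           for sides in [(3, 4, 5), (2, 2, 3), (5, 5, 5), (1, 2, 3), (0, 4, 4)]]
```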
Genetic Algorithm: Approach • Flaws were intentionally introduced into data for testing purposes • Errors introduced for specific ranges of x and y parameters • X coordinate between 500 – 1000 • Y coordinate between 0 – 500 • Inputs in this region result in an error
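The seeded fault region can be expressed as a simple predicate, which is handy as a test oracle when checking whether generated test cases land in the faulty area. The sketch assumes the bounds are inclusive, which the slide does not state; the names are illustrative.

```python
# Fault region from the slide; inclusive bounds are an assumption.
FAULT_X_RANGE = (500, 1000)
FAULT_Y_RANGE = (0, 500)


def in_fault_region(x: int, y: int) -> bool:
    """True when (x, y) falls inside the region where errors were seeded."""
    return (FAULT_X_RANGE[0] <= x <= FAULT_X_RANGE[1]
            and FAULT_Y_RANGE[0] <= y <= FAULT_Y_RANGE[1])


hits = [in_fault_region(750, 250), in_fault_region(100, 250), in_fault_region(750, 900)]
```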
Genetic Algorithm: How Does It Work Specifically? • Generates random population of x, y and z coordinates as binary digits • Each chromosome is 36 bits long • Each twelve bit segment representing one of the sides of a triangle
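The 36-bit encoding can be decoded into the three parameters as follows. Twelve bits can hold values up to 4095, which is wider than the 0 – 2000 range mentioned for TRITYP; the slides do not say how out-of-range values are handled, so this sketch only converts between bits and raw integers. Function names are assumptions.

```python
GENE_BITS = 12      # bits per triangle side
SIDES = 3           # x, y and z


def decode(chromosome: str) -> tuple[int, int, int]:
    """Split a 36-bit string into three unsigned 12-bit integers (x, y, z)."""
    if len(chromosome) != GENE_BITS * SIDES or set(chromosome) - {"0", "1"}:
        raise ValueError("expected a 36-character string of 0s and 1s")
    x, y, z = (int(chromosome[start:start + GENE_BITS], 2)
               for start in range(0, GENE_BITS * SIDES, GENE_BITS))
    return x, y, z


def encode(x: int, y: int, z: int) -> str:
    """Inverse of decode: pack three values in 0..4095 into a 36-bit string."""
    if not all(0 <= value < 2 ** GENE_BITS for value in (x, y, z)):
        raise ValueError("each side must fit in 12 bits")
    return "".join(format(value, "012b") for value in (x, y, z))


# Chromosome 1 from the earlier crossover slide.
sides = decode("111001110101" "100101100110" "001010111000")
```

Decoding that chromosome gives the raw values 3701, 2406 and 696.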
Genetic Algorithm: Fitness • Relative fitness function • Compares • Particular chromosome’s fitness • Historical information from the fossil record
Genetic Algorithm: Generating Software Test Cases • Variety of sources into test case breeding with genetic algorithms • Powerful evolutionary • Naturally parallel computational engine • Balance fitness with diversity • Wide variety of test cases can be bred • Concepts of novelty, proximity and severity • Used to create a relative or changing fitness function
Genetic Algorithm: Breeding Software Test Cases • Using genetic algorithms • Evolving fitness function • Fossil record of organisms • Interesting search behaviors • Novelty • Proximity • Severity
Genetic Algorithm: Novelty • Measure of the uniqueness of particular test case • Quantified by measuring distance in parameter space from previous invocations stored in the fossil record
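Novelty as a distance in parameter space could be computed as in the sketch below. The slide does not say which distance is used or how distances to many past test cases are combined, so Euclidean distance to the nearest fossil-record entry is an assumption, as are the names.

```python
import math
from typing import Iterable, Sequence


def novelty(candidate: Sequence[float],
            fossil_record: Iterable[Sequence[float]]) -> float:
    """Distance from a candidate test case to its nearest previously run test case.

    Returns infinity when the fossil record is empty, since everything is novel.
    """
    distances = [math.dist(candidate, past) for past in fossil_record]
    return min(distances) if distances else math.inf


# Hypothetical fossil record of (x, y, z) test cases that were already executed.
fossil_record = [(3, 4, 0), (10, 0, 0)]
score = novelty((0, 0, 0), fossil_record)
```

Here the nearest past test case is (3, 4, 0), so the novelty is 5. A candidate that repeats an earlier test case scores 0.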
Genetic Algorithm: Proximity • Measure of closeness to other test cases that resulted in system failures
Genetic Algorithm: Severity • Measure of the seriousness of a system error
Genetic Algorithm: Diversity • Used to avoid being trapped by local maxima • In test case generation, diversity means • Emphasizing novelty • Downplaying proximity • Simple rules, complex behavior • Explorers • Prospectors • Miners
Genetic Algorithm: Explorer • Highly novel test case • Spread across the lightly populated regions of the test space • Once an error is discovered, the fitness function encourages more thorough testing of the region