Investigating the Performance of Genetic Algorithm-Based Software Test Case Generation
Cristian Urs and Ben Riveira
Introduction
• The article we chose focuses on improving the performance of genetic algorithms by:
  • Using predictive models to efficiently perform repetitive test case executions.
  • Directly improving the efficiency of the internal workings of the genetic algorithm itself.
The Genetic Algorithm, Defined
• A GA is a search algorithm with the following key features:
  • A population of individuals, where each individual represents a possible solution to the problem.
  • A fitness function, which scores each individual so that individuals can be selected for reproduction based on their fitness.
  • Genetic operators, which cross over or mutate the selected individuals, creating new individuals for testing.
Example GA Pseudocode
• Choose the initial population of individuals.
• Evaluate the fitness of each individual in that population.
• Repeat for each generation until termination:
  • Select the best-fit individuals for reproduction.
  • Breed new individuals (offspring) through crossover and mutation operations.
  • Evaluate the fitness of the new individuals.
  • Replace the least-fit individuals in the population with the new individuals.
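The pseudocode above can be sketched in Python as follows. This is a minimal illustration rather than the algorithm used in the article: the bit-string encoding, the truncation selection of the top half, the one-point crossover, the parameter defaults, and the toy fitness function (count of 1 bits) are all assumptions chosen for brevity.

```python
import random

def run_ga(fitness, chrom_len=12, pop_size=20, generations=50,
           mutation_rate=0.05, seed=0):
    """Evolve bit-string chromosomes and return the fittest one found.

    All parameter names and defaults are illustrative assumptions.
    """
    rng = random.Random(seed)
    # Choose the initial population of individuals.
    population = [[rng.randint(0, 1) for _ in range(chrom_len)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Evaluate fitness and select the best-fit half for reproduction.
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[:pop_size // 2]
        # Breed offspring through one-point crossover and bit-flip mutation.
        offspring = []
        while len(offspring) < pop_size - len(parents):
            mother, father = rng.sample(parents, 2)
            cut = rng.randint(1, chrom_len - 1)
            child = mother[:cut] + father[cut:]
            child = [bit ^ 1 if rng.random() < mutation_rate else bit
                     for bit in child]
            offspring.append(child)
        # Replace the least-fit individuals with the new offspring.
        population = parents + offspring
    return max(population, key=fitness)

# Toy usage: fitness is simply the number of 1 bits in the chromosome.
best = run_ga(fitness=sum)
```

Because the fittest parents are carried over unchanged, the best fitness in this sketch never decreases from one generation to the next.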
Advantages of Genetic Algorithms
• The population of a GA allows it to:
  • Explore a search space without completely losing partial solutions that have already been found.
  • Perform parallel searches in multiple regions of the solution space.
• In the area of software verification and validation, GAs have become useful for automatically generating large volumes of software test cases.
Improving the Performance of Genetic Algorithms: Two Approaches
Neural Network-Based Oracles
• Use of a system oracle:
  • Avoids expensive execution costs for evaluating test cases.
  • Provides efficient execution of repetitive testing tasks after deployment.
  • Dramatically reduces the burden of evaluating test cases in each genetic algorithm generation.
Neural Network-Based Oracles: Use of a System Oracle
[Figure: block diagram. Labels: Tester, Input Domain Data, Genetic Algorithm, Selected Individual, Input, Test Oracle (Output), Software Under Test (Result), Failure Intensity Evaluation, Failed Test Cases.]
Neural Network-Based Oracles
• A neural network is an algorithm for optimization and learning based loosely on the nature of the brain. It consists of:
  • A directed graph, known as the network topology, whose arcs we refer to as links.
  • A state variable and a real-valued bias associated with each node.
  • A real-valued weight associated with each link.
  • A transfer function for each node.
Neural Network-Based Oracles
[Figure: a simple feed-forward neural network with inputs x and y and output z. The link weights and biases shown in the diagram include the values 1, 2, and -2.]
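The components of a neural network (weighted links, node biases, and a transfer function per node) can be illustrated with a short forward pass. This is a sketch only: the 2-2-1 topology, the weights and biases, and the step transfer function below are hypothetical values chosen so that the network computes XOR, and they are not read from the figure.

```python
import math

def sigmoid(s):
    """A common smooth transfer function."""
    return 1.0 / (1.0 + math.exp(-s))

def forward(inputs, layers, transfer=sigmoid):
    """Propagate inputs through a feed-forward network.

    layers is a list of layers; each layer is a list of (weights, bias)
    pairs, one pair per node. The weights belong to the incoming links,
    the bias belongs to the node.
    """
    activations = list(inputs)
    for layer in layers:
        activations = [
            transfer(sum(w * a for w, a in zip(weights, activations)) + bias)
            for weights, bias in layer
        ]
    return activations

def step(s):
    """Threshold transfer function (assumed here for a crisp example)."""
    return 1.0 if s > 0 else 0.0

# Hypothetical 2-2-1 network: hidden nodes act as OR and AND, output as XOR.
hidden = [([1.0, 1.0], -0.5), ([1.0, 1.0], -1.5)]
output = [([1.0, -2.0], -0.5)]

z = forward([1, 0], [hidden, output], transfer=step)
```

A trained oracle would learn its weights and biases from recorded input/output pairs instead of having them set by hand as in this sketch.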
Neural Network-Based Oracles
[Figure: block diagram. Labels: Tester, Input Domain Data, Genetic Algorithm, Selected Individual, Input, Random Trained Oracle (Output), GA Trained Oracle (Result), Failure Intensity Evaluation, Failed Test Cases.]
Improving the Fitness Function Calculation
• The second strategy for improving the performance of genetic algorithms in automated test case generation is to make the fitness function calculation more efficient.
Fitness Function Calculations
• What is fitness?
  • The probability that an individual chromosome survives into the next generation.
• What is a chromosome?
  • Chromosome = a string of digits.
  • Gene = each digit that makes up the chromosome.
• Example chromosome: 111001110101 100101100110 001010111000 (1363, 801, 299)
• Example of use: this chromosome encodes the triangle side values x, y, z.
Fitness Function Calculations
• How do we calculate the overall fitness? Based on:
  • Likelihood of occurrence
  • Failure intensity
  • Similarity to other individuals in the population
A. Likelihood of Occurrence
• Highly fit individuals = high probability of being used.
• Poorly fit individuals = low probability of being used.
• How is the likelihood of input data calculated?
  • By multiplying the probabilities of occurrence of the individual input values.
  • Example: the likelihood that the user would select input value 1 and input value 3 is 0.75 × 0.005 = 0.00375.
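As a small worked example of the multiplication rule, the sketch below computes the likelihood of a combination of input values as the product of their individual probabilities. The function name is hypothetical, and treating the input values as independent is an assumption of the sketch. With the probabilities from the slide, 0.75 and 0.005, the product is 0.00375.

```python
import math

def likelihood(probabilities):
    """Likelihood of a set of input values, assuming they occur independently."""
    return math.prod(probabilities)

# Probabilities of input value 1 and input value 3 from the example.
p = likelihood([0.75, 0.005])
```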
B. Failure Intensity
• A combination of failure density and severity.
• Examples:
  • Low density, high severity: a single failure that crashes the software.
  • High density, low severity: the system does not crash but gives erroneous output.
C. Niche Size
• What is a niche?
  • The number of individuals in the population that share common attributes.
  • A situation that is very likely to occur and results in high failure intensity.
  • Situations that are similar but not identical.
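One way to make "individuals that share common attributes" concrete is to count how many members of the population lie within a small distance of a given individual. The sketch below does this for bit-string chromosomes. The use of Hamming distance and the radius parameter are assumptions made for illustration; the slides do not specify the similarity measure.

```python
def hamming(a, b):
    """Number of positions at which two equal-length chromosomes differ."""
    return sum(x != y for x, y in zip(a, b))

def niche_size(individual, population, radius=2):
    """Count population members within `radius` of `individual`.

    The individual itself is counted if it is part of the population.
    """
    return sum(1 for other in population
               if hamming(individual, other) <= radius)

# Hypothetical population of 6-bit chromosomes.
population = ["111000", "111001", "000111", "110000"]
size = niche_size("111000", population, radius=1)
```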
How to improve the fitness function calculation?
1. Use a sample of the fossil record
2. Summarize the fossil record
1. Use a sample of the fossil record
• Fossil record = data warehouse
• Advantages:
  • Large reduction in computation time.
  • Fixed-size samples make the process predictable.
  • Easy to implement.
• Examples:
  • Sample A of size 500 (6% of the fossil record size)
  • Sample B of size 5000 (17% of the fossil record size)
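The sampling idea can be sketched as follows: instead of comparing an individual against every entry in the fossil record, compare it against a fixed-size random sample and scale the count up. The function name, the `similar` predicate, and the scaling step are assumptions of this sketch and are not taken from the article.

```python
import random

def sampled_niche_size(individual, fossil_record, sample_size, similar,
                       rng=random):
    """Estimate how many fossil-record entries are similar to `individual`.

    Only `sample_size` entries are examined, so the cost per individual
    is fixed no matter how large the fossil record grows.
    """
    if not fossil_record:
        return 0.0
    if len(fossil_record) <= sample_size:
        sample = fossil_record
    else:
        sample = rng.sample(fossil_record, sample_size)
    matches = sum(1 for other in sample if similar(individual, other))
    # Scale the sample count to the size of the whole record.
    return matches * len(fossil_record) / len(sample)

# Toy usage: integers stand in for test cases; "similar" means within 5.
record = list(range(10_000))
estimate = sampled_niche_size(5_000, record, sample_size=500,
                              similar=lambda a, b: abs(a - b) <= 5,
                              rng=random.Random(0))
```

The estimate is noisier for smaller samples, which is the trade-off against the reduced computation time.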
2. Summarize the fossil record
• Adopt a higher level of abstraction.
• Advantages:
  • Reduced and predictable computation time.
• Disadvantages:
  • The strategy is complex and requires frequent recalculation.
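The slides do not say how the fossil record is summarized. As one possible interpretation, the sketch below replaces individual entries with counts per coarse grid cell, so that a niche size lookup costs a single dictionary access. The class name, the grid-cell abstraction, and the cell size are all hypothetical, and this is not presented as the authors' method.

```python
from collections import Counter

class FossilSummary:
    """Summarize numeric test cases as counts per coarse grid cell."""

    def __init__(self, cell_size):
        self.cell_size = cell_size
        self.counts = Counter()

    def _cell(self, values):
        # Map each value to the index of the cell that contains it.
        return tuple(v // self.cell_size for v in values)

    def add(self, values):
        """Record one test case in the summary."""
        self.counts[self._cell(values)] += 1

    def niche_size(self, values):
        """Number of recorded test cases that fall in the same cell."""
        return self.counts[self._cell(values)]

# Toy usage with triangle side values (x, y, z).
summary = FossilSummary(cell_size=100)
for case in [(1363, 801, 299), (1370, 850, 210), (100, 100, 100)]:
    summary.add(case)
neighbours = summary.niche_size((1399, 899, 200))
```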
Conclusion (1)
• GA-based software test case generation can be improved by using oracles or models and by changing the way the fitness function is calculated.
Conclusion (2)
• Though the methods for improving the performance of GAs discussed in this paper sound feasible, not enough evidence was presented to corroborate any of the authors’ claims.
• Much of the information presented here was actually found in other articles, such as:
  • “Breeding Software Test Cases with Genetic Algorithms” by D. Berndt (2002) http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=1174917