140 likes | 233 Views
An Analysis and Survey of the Development of Mutation Testing by Yue Jia and Mark Harmon. A Quick Summary For SWE6673. Introductory Comments.
E N D
An Analysis and Survey of the Developmentof Mutation TestingbyYueJia and Mark Harmon A Quick Summary For SWE6673
Introductory Comments • Mutation testing is a fault-based technique (testing to show existence or absence of specific faults)of developing “mutants” to be tested by a set of test cases. • The type of faults are mostly ---- syntax based faults • Test case is ran against several mutant programs. The result is kept and a “Mutation Adequacy Score” is kept: Mutation Adequacy Score = (# of defect found)/( total # of seeded defects) or = (# of mutants killed)/(# of seeded non-equivalent mutants) ≤ 1 • In a sense, Mutation Test is evaluating how good is the set of test cases • An area that was started in 1970’s by Lipton, DeMillo & Hamlet
Mutation Testing (Theory & Process) • 2 Underpinning Hypotheses (assumptions): • Programmers are competent and make simple errors • Errors and faults (defects) are coupled – detection of simple faults lead to detection of many complex faults • Mutation Analysis & Execution Process: • For a program P, generate (develop) mutation P’ of P • Run the original program P with test cases T, fixing all the bugs in P. • Run the mutation program P’ with test cases T • Consider P’ “killed” (mutation detected) if results of running T against P and P’ are different • Continue running all the P’s and score the “killed” P’s versus all the developed P’s----- this “score” gives us the Mutation Adequacy Score. • But --- Mutation Testing has Problems/Weaknesses: • There may be a “high number” of mutants and the cost of running them all. • High amount of “human effort” required in the analysis of mutants is costly.
Mutation Test Case Problem: Mutants Reduction • Given a set of Mutants, M and a set of test cases T • let MST(M) stand for mutation score of M with T • Then the problem is to find a subset of mutants M’ from M where: MST(M) ≈ MST(M’) or (# of killed M/non-equivalent M)≈ (# of killed M’/non-equivalent M’) What do you think ---? Is this really the same?
Mutation Test Case Problem: Mutants Reduction(cont.) • Many different approaches have been tried to reduce the number of mutants for testing: • Random Sampling : found that randomly selecting 10% of mutants only reduces the effectiveness 16% • Mutation clustering: grouping mutants into clusters by traditional clustering techniques such as K-means and selecting representative mutants from each cluster • Selective Mutation: reducing the number of mutation operators to reduce the number of mutants generated without significant loss of effectiveness. • High Order Mutation: use higher order mutants (via multiple applications of mutation operators) to replace number of first order mutants
Techniques for Reducing Execution (Running) time and effort • Consider Strong, Weak, and Firm ways to analyze the killing of mutants during the execution process: • Strong: execute the whole program and if the mutant results differently, then it is “killed” • Weak: execute up to and include the part that is mutant and check the result • Firm: execute somewhere between Strong and Weak
Techniques for Reducing Execution (Running) time and effort (cont.) • Use different tools : • interpreter (expensive) • compiler (fast) • Compiler based (compile the whole original program P, but only compile the mutation part for each program from M • byte code translation to different platforms (for platform testing) • aspect oriented mutation • generate & compile different mutants • Use multiple platform such as distributed processors to simultaneously execute the mutants
Detection of “Equivalent Mutant” Problem • Detecting that a program P and one of its mutant is equivalent is theoretically undecidable: Program P “Equivalent” Mutant P’ for (inti=0; i<5; i++) for (inti=0; i != 5; i++) { no change to { no change to value of i within loop value of i within loop } } • P’ is a mutant of program P if P’ is syntactically different from P but is functionally same as P (e.g. produces same results). • Mutation Score based on non-equivalent mutants without complete detection of “equivalent” mutants implies that we can never get to 100% on mutation score. How much does this bother you ---?
Mutation Testing May be Applied: to Various Artifacts • To Program Source Code in following languages : • FROTRAN (22 mutant operators) • ADA • C (77 mutant operators) • JAVA (include OO mutant operators) • C++; SQL; PHP; AOP; COBOL; Spreadsheet Language • To Specifications in : • Finite State Machine • Star Chart • Petri Net • Web services in XML • Description of runtime environments
Mutation Testing May also be Applied to ---- • Constraint-based test data generation (found that 75% of the mutants can be killed with automatically generated test data.) [e.g. conditions in which mutants will die are written as algebraic constraints on test cases and then generate the test cases] • Regression testing where we reuse the test data: • Reschedule the test sequence based on the mutant killing score • Minimize the test cases that need to be re-run for regression based on the mutant killing score of the test cases
Some Empirical Results • Most of the earlier work dealt with code size of < 50 loc and increased in size as non-academic code was used • Mutation testing developed test cases, when compared against “all-use” data flow test cases, actually subsumes the all use-test cases. (16% more defects were found with mutation generated test cases than all-use data flow) • “Real world” software errors were compared against mutants, and found 85% of mutants were also “real world” faults. • Human errors and mutants are different enough that both automated and human generated faults are needed for testing
Some Concluding Remarks • Tools for Mutation Testing have increased in number since 2000. (30 years after its inception ---- technology maturation takes that long) • Continued Barrier: • Perception of complexity and high cost • Decreasing (can not be solved)Equivalent Mutants problem is gaining some momentum Does thinking about Mutants provide us insight into defects and help us generate better test cases? ----- Your thoughts?
Now that you have read about Mutation Testing --- consider the following pseudo – code Input pairs as test cases Integer m, n, max ; {5,2 } max = 5 Input m, n; {10, 305} max = 305 If (m ≥ n) {20, 20} max = 20 max = m; {-23, -5} max = -5 else { 0, -4} max = 0 max = n; print max; • Is this test set pretty good --- what mutant will not be killed? ----- what type of defect was this?