Learn about mutation testing, creating mutants, testing processes, and detecting equivalent mutants automatically. Dive into statistics on mutant types and distribution in programs.
Automatically Detecting Equivalent Mutants and Infeasible Paths
By Jefferson Offutt and Jie Pan
Lectured by Oren Matza
Introduction
Mutation testing is a technique proposed by DeMillo et al. (1978) and Hamlet (1977). The idea is to take a program that behaves correctly on a test case, make a small change to it (hence the term mutant), and look for a test case on which the changed, faulty program fails. Many mutants are created from the original program, and each is run on the test cases in the hope that its result differs from the result of the original program. A mutant that behaves differently is considered faulty; it is then “killed” (it is “dead”) and does not remain in the testing system. The test case that killed it is called an “effective” test case and is saved in the testing system.
Example:

    FUNCTION Min (I, J : Integer) RETURN Integer IS
        MinVal : Integer;
    BEGIN
        MinVal := I;
        IF (J < I) THEN
            MinVal := J;
        END IF;
        RETURN (MinVal);
    END Min

5 mutants (each numbered line is a mutated version of the unnumbered statement that precedes it):

    FUNCTION Min (I, J : Integer) RETURN Integer IS
        MinVal : Integer;
    BEGIN
        MinVal := I;
    1   MinVal := J;
        IF (J < I) THEN
    2   IF (J > I) THEN
    3   IF (J < MinVal) THEN
            MinVal := J;
    4       TRAP
    5       MinVal := I;
        END IF;
        RETURN (MinVal);
    END Min

Mutants 1, 3, 5: changing operands
Mutant 2: changing an operator
Mutant 4: statement insert/change
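The Min example can be tried out directly. The following is a minimal Python sketch, not part of the original slides: the function names, the `Trap` exception and the `killed_by` helper are hypothetical, and TRAP is assumed to mean "the mutant is killed as soon as the statement is reached".

```python
class Trap(Exception):
    """Assumed model of TRAP: raised when the trapped statement is reached."""


def min_original(i, j):
    min_val = i
    if j < i:
        min_val = j
    return min_val


def mutant_1(i, j):
    min_val = j              # mutant 1: MinVal := J
    if j < i:
        min_val = j
    return min_val


def mutant_2(i, j):
    min_val = i
    if j > i:                # mutant 2: IF (J > I)
        min_val = j
    return min_val


def mutant_3(i, j):
    min_val = i
    if j < min_val:          # mutant 3: IF (J < MinVal)
        min_val = j
    return min_val


def mutant_4(i, j):
    min_val = i
    if j < i:
        raise Trap()         # mutant 4: TRAP
    return min_val


def mutant_5(i, j):
    min_val = i
    if j < i:
        min_val = i          # mutant 5: MinVal := I
    return min_val


MUTANTS = {1: mutant_1, 2: mutant_2, 3: mutant_3, 4: mutant_4, 5: mutant_5}


def run(program, i, j):
    """Run a program on one test case; a trap counts as a distinct result."""
    try:
        return program(i, j)
    except Trap:
        return "TRAP"


def killed_by(i, j):
    """Numbers of the mutants whose result differs from the original's."""
    expected = run(min_original, i, j)
    return [n for n, mutant in MUTANTS.items() if run(mutant, i, j) != expected]


print(killed_by(1, 2))   # test case I=1, J=2
print(killed_by(2, 1))   # test case I=2, J=1
```

In this sketch no test case kills mutant 3: MinVal always equals I when the comparison is evaluated, so the mutated predicate has the same value as the original one.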
The process of testing
1. Automatic creation of mutants (types seen in the example).
2. Automatic or manual creation of test cases.
3. Run the original program on one of the test cases. If it fails, fix the bug and run again.
4. Otherwise, run the test case on each “live” mutant. If the result differs from the result of the original program, the mutant is considered incorrect and is “killed” (the test case is kept as an effective one).

At the end, 2 types of mutants remain:
1. Mutants that are “killable”, but the set of test cases is not wide enough to kill them.
2. Mutants that are equivalent in behavior to the original program, i.e. Equivalent Mutants (EM).

The testing process cannot finish until all the mutants are killed, so the EM have to be identified. Detecting which of the mutants are EM was done manually. Doing this is:
- time consuming
- difficult: it is hard to see at a glance which mutant is an EM and which is not, which leads to incorrect marking of EM as non-EM and vice versa (Acree 1980)

For this reason, considerable effort has to be spent to find out which of the mutants are EM.
Mutants Automatic Creation
The Mothra mutation system (DeMillo et al. 1988) uses 22 mutation operators to test Fortran 77 programs.

    Mutation Operator   Description
    AAR                 array reference for array reference replacement
    ABS                 absolute value insertion
    ACR                 array reference for constant replacement
    AOR                 arithmetic operator replacement
    ASR                 array reference for scalar variable replacement
    CAR                 constant for array reference replacement
    CNR                 comparable array name replacement
    CRP                 constant replacement
    CSR                 constant for scalar variable replacement
    DER                 DO statement END replacement
    DSA                 DATA statement alterations
    GLR                 GOTO label replacement
    LCR                 logical connector replacement
    ROR                 relational operator replacement
    RSR                 RETURN statement replacement
    SAN                 statement analysis
    SAR                 scalar variable for array reference replacement
    SCR                 scalar for constant replacement
    SDL                 statement deletion
    SRC                 source constant replacement
    SVR                 scalar variable replacement
    UOI                 unary operator insertion
Distribution of EM
Previous work, and the research done in this particular work, found that EM are not evenly distributed among mutant types. They tend to cluster around only a few types. The next table summarizes statistics from the 11 programs used in this work.

    Mutant Type   % of equivalent   % of all mutants
    ABS           47.19             4.30
    ACR           14.10             1.28
    SCR            7.05             0.64
    UOI            6.04             0.55
    SRC            4.89             0.45
    SVR            4.46             0.41
    ROR            3.60             0.33
    SDL            2.16             0.20
    CRP            1.58             0.14
    AAR            1.44             0.13
    RSR            1.44             0.13
    LCR            1.15             0.10
    ASR            1.15             0.10
    CSR            1.01             0.09
    SAR            1.01             0.09
    All others     1.73             0.16
    Total        100                9.10
The ABS type has many more equivalent mutants than any other type. It is divided into 3 unary operators:
ABS -- computes the absolute value of the expression
NEGABS -- computes the negative of the absolute value
ZPUSH -- kills the mutant if the expression is zero (if the expression can never be zero, the mutant is an EM). This forces the tester to cause the expression to be zero, a common testing heuristic.

Do procedures for automatically detecting EM exist?
Budd and Angluin (1982) showed that the problem of determining whether two programs are equivalent is undecidable. A mutant and the original program are not two arbitrary programs; they are syntactically very similar. Budd and Angluin showed, however, that the problem is undecidable in this case as well. Still, because of the way EM are distributed among mutant types, and because a mutant differs from the original program in only one statement, in many cases (statistically) it can be determined that a mutant is an EM.
Previous work
Baldwin and Sayward (1979) describe 6 types of compiler optimization techniques that can be used to identify EM. The motivation for using compiler optimization techniques was:
- programs after optimization are mutants of the original program
- mutants can be optimizations or de-optimizations of the original program

In 1994 Offutt and Craft designed algorithms for those 6 techniques, built a tool, and succeeded in finding 10% of the EM for 15 programs.
In his Ph.D. thesis (1988) Offutt presented a technique called Constraint-Based Testing (CBT), which uses mathematical constraints for testing.
DeMillo and Offutt (1991) presented how constraints can be used to generate test cases that satisfy mutation testing, but did not give details on how to do it.
This work develops the idea, gives strategies and algorithms for detecting EM, and shows some results from implementing those algorithms.
Feasible path problem
The original problem of FPA (feasible path analysis) is defined like this: given a description of a set of control flow paths through a procedure, feasible path analysis determines if there is input data that causes execution to flow down some path.
The generalized feasible path problem (FTP) is: given a requirement for a test case, determine if there is input data that can satisfy the requirement (constraints). This problem is undecidable (Goldberg et al. 1994, DeMillo and Offutt 1991).
This work focuses on determining feasibility with a heuristic-based set of transformations, and thereby determining the equivalence of mutants.

Using Constraints to detect EM
In our context, a constraint is a mathematical expression that restricts the input space of the program to the portion of the space that satisfies a certain property. For example, the constraint (x > 0) restricts the input space to positive inputs. (Complex constraints can be used to restrict the input space to inputs that represent, say, a rectangle or a sorted array.)
We will use CBT to define constraints that represent the conditions under which the mutant is killed. If there is an input that satisfies the constraints, then the mutant is not equivalent, and that input kills it. If there is no such input (the constraints are infeasible), then the mutant is an EM.
The CBT technique
Notation: P = program, M = mutant of P, S = statement, TC = test case. The state of the program is the values of all data items and the program counter.
To kill M, TC must have the following 3 characteristics:
Reachability (Cr): TC must execute the mutated statement. If it does not execute it, it definitely will not kill M.
Necessity (Cn): TC must cause M to have an incorrect state once it reaches the mutated statement. If S is in a loop, the necessity condition must hold after each iteration. (The necessity constraint requires that two predicates/expressions evaluate to different results.)
Sufficiency (Cs): the final state of M is different from the final state of P.
(Cn is necessary but not sufficient. Cs is an "if and only if" condition.)

Let D represent the entire domain of all TC for P, and let $D_r$, $D_n$, $D_s$ be the sets of test cases that satisfy Cr, Cn, Cs. For each mutant, D can be divided in several ways:
$D = D_r \cup \overline{D_r}$
$D = D_n \cup \overline{D_n}$
$D = D_s \cup \overline{D_s}$
Some facts
Fact 1 - TC is an effective test case that will kill M $\Leftrightarrow$ $TC \in D_s$ for M (trivial)
Fact 2 - If TC is an effective test case that will kill M, then $TC \in D_r \cap D_n$
(The converse does not hold: there are TC that satisfy Cr and also Cn but not Cs. An example comes later.)
Fact 3 - $D_s \subseteq D_r \cap D_n$

Unfortunately, finding TC such that $TC \in D_r$ is an undecidable problem, because determining whether TC executes S is reducible to the halting problem. Thus a weaker condition is defined: CR is defined such that if S is executed, then CR is true. Since $C_r \Rightarrow C_R$, the following fact is clear:
Fact 4 - $D_r \subseteq D_R$, and therefore $D_R \cap D_n \supseteq D_r \cap D_n \supseteq D_s$

CBT uses path expressions to describe the reachability condition (the weaker condition), CR, for a statement. A path expression for a statement S in P is an algebraic expression that describes a condition on test cases that will be true when P reaches S. Path expressions usually describe multiple paths to S by using a disjunctive formula, where each clause represents a separate path. Path expressions are automatically derived from the program by extracting the predicate expressions on the program’s control flow graph.
Example

    FUNCTION Mid (X, Y, Z : Integer) RETURN Integer IS
        MidVal : Integer;
    BEGIN
        MidVal := Z;
        IF (Y < Z) THEN
            IF (X < Y) THEN
                MidVal := Y;
            ELSE IF (X < Z) THEN
    1       ELSE IF (X <= Z) THEN
                MidVal := X;
            END IF
        ELSE
            IF (X > Y) THEN
                MidVal := Y;
            ELSE IF (X > Z) THEN
                MidVal := X;
            END IF
        END IF
        RETURN (MidVal);
    END Mid

When X = Z, Cn is true because (X < Z) != (X <= Z), but Cs is not satisfied (P returns Z and M returns X, which are equal). The mutant is not killed.
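The gap between necessity and sufficiency in the Mid example can be checked with a small Python sketch. This is an illustration only: the `mutated` flag, the helper names and the small search range are assumptions, not part of the slides.

```python
from itertools import product


def mid(x, y, z, mutated=False):
    """Mid from the example; mutated=True applies mutant 1 (X <= Z for X < Z)."""
    mid_val = z
    if y < z:
        if x < y:
            mid_val = y
        elif (x <= z) if mutated else (x < z):
            mid_val = x
    else:
        if x > y:
            mid_val = y
        elif x > z:
            mid_val = x
    return mid_val


def reaches_mutated_predicate(x, y, z):
    """Reachability: the path condition leading to the mutated ELSE IF."""
    return y < z and not (x < y)


def necessity_holds(x, y, z):
    """Necessity: original and mutated predicates evaluate differently."""
    return (x < z) != (x <= z)


tc = (5, 3, 5)                      # X = Z, Y < Z, X >= Y
print(reaches_mutated_predicate(*tc), necessity_holds(*tc))
print(mid(*tc), mid(*tc, mutated=True))

# Exhaustive search over a small range for a test case that kills the mutant.
killing = [t for t in product(range(-4, 5), repeat=3)
           if mid(*t) != mid(*t, mutated=True)]
print(killing)
```

For the test case (5, 3, 5) both the reachability and the necessity checks hold, yet the two versions return the same value. Within the small range searched here, no test case distinguishes the mutant from the original.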
Formalization of what we saw until now
Constraints and detecting EM
P is a program, M a mutant of P. P(TC) and M(TC) are the outputs of P and M on TC.

Definition: M is an EM of P $\Leftrightarrow$ P(TC) = M(TC) for every $TC \in D$

This says that if a mutant is functionally equivalent to the original program, it is impossible to find any test case to kill the mutant:
$\neg(\exists TC \mid TC \in D \wedge P(TC) \neq M(TC)) \Leftrightarrow (\forall TC \mid TC \in D \Rightarrow P(TC) = M(TC))$

This leads to the following theorems:

THEOREM 1: $D_r = \emptyset$ (Cr is infeasible) $\Rightarrow$ M is an EM
Proof
(1) M is equivalent $\Leftrightarrow D_s = \emptyset$ -- Definition, Fact 1
(2) $D_s \subseteq D_r \cap D_n$ -- Fact 3
(3) $D_r = \emptyset \Rightarrow D_s = \emptyset$ -- Rules of sets, (2)
(4) $D_r = \emptyset \Rightarrow$ M is equivalent -- Substitution of (1) in (3)
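THEOREM 2: $D_n = \emptyset$ (Cn is infeasible) $\Rightarrow$ M is an EM
Proof
(1) M is equivalent $\Leftrightarrow D_s = \emptyset$ -- Definition, Fact 1
(2) $D_s \subseteq D_r \cap D_n$ -- Fact 3
(3) $D_n = \emptyset \Rightarrow D_s = \emptyset$ -- Rules of sets, (2)
(4) $D_n = \emptyset \Rightarrow$ M is equivalent -- Substitution of (1) in (3)

THEOREM 3: $D_r \cap D_n = \emptyset$ ($C_r \wedge C_n$ is infeasible) $\Rightarrow$ M is an EM
Proof
(1) M is equivalent $\Leftrightarrow D_s = \emptyset$ -- Definition, Fact 1
(2) $D_s \subseteq D_r \cap D_n$ -- Fact 3
(3) $D_r \cap D_n = \emptyset \Rightarrow$ M is equivalent -- Substitution of (1) in (2)

From the fact that $C_r \Rightarrow C_R$, the following claims can be derived:
a) $D_R = \emptyset \Rightarrow$ M is an EM
b) $D_R \cap D_n = \emptyset \Rightarrow$ M is an EM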
All the above leads to the following conclusions:
(a) If the path expression constraint system (CR) for the mutated statement of M is infeasible, then the set $D_R$, which contains every test case that can kill M, is empty. M is never killed, so M is equivalent.
(b) If the necessity constraint system (Cn) for a mutant M is infeasible, then the set $D_n$, which contains every test case that can kill M, is empty. M is never killed, so M is equivalent.
(c) If the constraint system that is the conjunction of CR and Cn is infeasible, then the set $D_R \cap D_n$, which contains every test case that can kill M, is empty. M is never killed, so M is equivalent.

A constraint system is infeasible when it contains a contradiction, for example the constraint system $(X > 0) \wedge (X < 0)$. If M has a constraint system like the one in the example, then it is an EM.
So far we saw how the problem of detecting an EM can be translated into the problem of finding contradictions in a system of mathematical constraints. Test case generation and EM detection now both use constraints.

Representation of the constraints
A constraint is an expression composed of variables and operators from the programming language of P; it comes from the right-hand sides of assignments and from the decision statements of P itself. It evaluates to true or false.
A clause is a list of constraints connected with logical AND and OR. A conjunctive clause uses only AND.
All constraints are kept in disjunctive normal form (DNF), which is a list of conjunctive clauses connected only by ORs. A DNF formula is referred to as a constraint system, in which each conjunctive clause represents a path expression to a statement. During constraint satisfaction only one clause needs to be satisfied.
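The DNF representation can be illustrated with a short Python sketch. It is an assumption-laden illustration, not the representation used in the work described here: constraints are modelled as Python predicates over a variable environment, and feasibility is checked by brute force over a small finite range, which is not a decision procedure.

```python
from itertools import product


def clause_feasible(clause, variables, domain):
    """True if some assignment from `domain` satisfies every constraint (AND)."""
    for values in product(domain, repeat=len(variables)):
        env = dict(zip(variables, values))
        if all(constraint(env) for constraint in clause):
            return True
    return False


def system_feasible(dnf, variables, domain):
    """A DNF system is feasible if at least one conjunctive clause is (OR)."""
    return any(clause_feasible(clause, variables, domain) for clause in dnf)


domain = range(-5, 6)

# One clause containing a contradiction: (X > 0) AND (X < 0).
contradictory = [[lambda e: e["X"] > 0, lambda e: e["X"] < 0]]

# Two clauses (two paths); the second one can be satisfied.
mixed = [
    [lambda e: e["X"] > 0, lambda e: e["X"] < 0],
    [lambda e: e["X"] > 0, lambda e: e["Y"] <= e["X"]],
]

print(system_feasible(contradictory, ["X"], domain))
print(system_feasible(mixed, ["X", "Y"], domain))
```

The sketch shows why every clause of a constraint system must contain a contradiction before the system as a whole can be called infeasible.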
Note: we said that a constraint includes variables from the program. Unfortunately this includes “internal” variables. For test case generation, symbolic evaluation (King 1976; Offutt 1991) is used to rewrite the variables in terms of input variables.

Finally, what we have all been waiting for: the techniques to find the EM
Because the problem is undecidable, it cannot be solved completely by an algorithm; but because EM are currently detected manually, even a partial solution is valuable.
There are some off-the-shelf theorem provers for the infeasible-constraint problem, but we will not use them because:
(1) such a theorem prover gives much more than we need,
(2) it is difficult to integrate one into the already-existing software used for testing this work.
Because EM fall into more common and less common types, and because they differ from the original program in a well-defined way, we can use special techniques to deal with these cases. 3 techniques will be shown: Negation, Constraint splitting, Constant comparison.
Negation
Definition 1: constraint C1 is the negation of C2 $\Leftrightarrow$ the domains they describe (a) do not overlap and (b) cover the entire domain of the variables in C1 and C2.
Definition 2: constraint C1 is a partial negation of C2 $\Leftrightarrow$ the domains they describe (a) do not overlap and (b) do not cover the entire domain.
Definition 3: two constraints are semantically equal if they describe the same domain.
Definition 4: two constraints are syntactically equal if they describe the same domain and also have the same string of symbols. (Clearly, two syntactically equal constraints are also semantically equal.)

Examples
(1) A is the constraint x > 1, B the constraint x <= 1. A is the negation of B and B is the negation of A.
(2) A is the constraint x > 1, B the constraint x < 1. A is a partial negation of B and B is a partial negation of A.
(3) A is the constraint x > 0, B the constraint x > 0. They are syntactically equal, so also semantically equal.
(4) A is the constraint x > 0, B the constraint x >= 1 (x integer). They are semantically equal but not syntactically equal.
The negation technique is the basic technique for recognizing infeasible constraints. Negate one of the constraints and see whether the two constraints are now syntactically equal. If so, the constraints conflict (and the mutant is an EM). For example, A is (x+y) > z and B is (x+y) <= z.
The following table shows how to negate or partially negate a constraint:

    Constraint C     Negation of C    Partial negation 1   Partial negation 2
    exp1 > exp2      exp1 <= exp2     exp1 < exp2          exp1 = exp2
    exp1 >= exp2     exp1 < exp2      --                   --
    exp1 < exp2      exp1 >= exp2     exp1 > exp2          exp1 = exp2
    exp1 <= exp2     exp1 > exp2      --                   --
    exp1 = exp2      exp1 != exp2     exp1 > exp2          exp1 < exp2
    exp1 != exp2     exp1 = exp2      --                   --
    true             false            --                   --
    false            true             --                   --

What about an algorithm?
Algorithm: Negation (A, B)
Precondition: A and B are properly initialized constraints.
Postcondition: Returns conflict if A and B conflict, no-conflict otherwise.

    begin
        neg-A = Negate (A)                        -- use the negation table
        if (neg-A is syntactically equal to B)
            return conflict
        else if (the relational operator in A is one of {>, <, =})
            partial1-A = PartialNegate1 (A)       -- use the negation table
            if (partial1-A is syntactically equal to B)
                return conflict
            else
                partial2-A = PartialNegate2 (A)   -- use the negation table
                if (partial2-A is syntactically equal to B)
                    return conflict
                else
                    return no-conflict
                end-if
            end-if
        else
            return no-conflict
        end-if
    end Negation
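The Negation algorithm translates almost directly into code. The following Python sketch is a hedged illustration: it assumes a constraint is a hypothetical `(lhs, rop, rhs)` triple of strings, and it approximates "syntactically equal" by comparing the triples after removing whitespace. The true/false rows of the table are left out.

```python
# Negation and partial negations, taken from the negation table.
NEGATION = {">": "<=", ">=": "<", "<": ">=", "<=": ">", "=": "!=", "!=": "="}
PARTIAL_NEGATIONS = {">": ("<", "="), "<": (">", "="), "=": (">", "<")}


def normalize(constraint):
    """Strip whitespace so that 'x + y' and 'x+y' compare as the same string."""
    lhs, rop, rhs = constraint
    return ("".join(lhs.split()), rop, "".join(rhs.split()))


def negation_conflict(a, b):
    """True if the negation or a partial negation of A is syntactically B."""
    lhs, rop, rhs = normalize(a)
    b = normalize(b)
    candidates = [NEGATION[rop], *PARTIAL_NEGATIONS.get(rop, ())]
    return any((lhs, candidate, rhs) == b for candidate in candidates)


print(negation_conflict(("x + y", ">", "z"), ("x+y", "<=", "z")))  # negation
print(negation_conflict(("x", ">", "1"), ("x", "<", "1")))         # partial negation
print(negation_conflict(("x", ">", "0"), ("x", ">=", "1")))        # no conflict
```

Because the comparison is purely syntactic, the sketch misses conflicts between constraints that are written differently, which is what the next two techniques address.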
Constraint Splitting
This technique is also used to recognize infeasible constraints. If C and D are two constraints and one of them has the form ((V1 AOP V2) ROP K), then we can split this constraint. Suppose C has this form. We split C into two new constraints A and B such that $C \Rightarrow A \vee B$. It will be shown that if both A and B conflict with D, then the original C also conflicts with D. If so, the constraints conflict (and the mutant is an EM).
Note: usually A and B are weaker than C, but it is easier to decide whether each of them conflicts with D.

Proof:
$C \Rightarrow A \vee B$
$\neg C \vee (A \vee B)$ -- implication
$(A \vee B) \vee \neg C$ -- commutativity
$\neg\neg(A \vee B) \vee \neg C$ -- negation
$\neg(\neg A \wedge \neg B) \vee \neg C$ -- De Morgan
$\neg A \wedge \neg B \Rightarrow \neg C$ -- implication

Now we assume that A and B conflict with D, so:
(1) $\neg A \wedge \neg B \wedge D$ -- assumption
(2) $\neg A \wedge \neg B$ -- (1), AND property
(3) $\neg A \wedge \neg B \Rightarrow \neg C$ -- assumption
(4) $\neg C$ -- implication elimination, (2), (3)
(5) $D$ -- (1), AND property
(6) $\neg C \wedge D$ -- (1), (4), (5), AND property
The following table shows how to split a constraint C into constraints A and B (such that $C \Rightarrow A \vee B$). A "?" marks a relational operator that is missing from the table.

    Original Constraint   New Constraint 1      New Constraint 2
    (x + y) > 0           x > 0                 y > 0
    (x + y) >= 0          x >= 0                y >= 0
    (x + y) < 0           x < 0                 y < 0
    (x + y) <= 0          x <= 0                y <= 0
    (x + y) = 0           x ? 0                 y ? 0
    (x + y) != 0          x != -y
    (x - y) > 0           x > 0                 y < 0
    (x - y) >= 0          x >= 0                y <= 0
    (x - y) < 0           x < 0                 y > 0
    (x - y) <= 0          x <= 0                y >= 0
    (x - y) = 0           x ? 0                 y ? 0
    (x - y) != 0          x != y
    (x * y) > 0           x > 0 AND y > 0       x < 0 AND y < 0
    (x * y) >= 0          x >= 0 AND y >= 0     x <= 0 AND y <= 0
    (x * y) < 0           x > 0 AND y < 0       x < 0 AND y > 0
    (x * y) <= 0          x >= 0 AND y <= 0     x <= 0 AND y >= 0
    (x * y) = 0           x = 0                 y = 0
    (x * y) != 0          x != 0                y != 0
    (x / y) > 0           x > 0 AND y > 0       x < 0 AND y < 0
    (x / y) >= 0          x >= 0 AND y > 0      x <= 0 AND y < 0
    (x / y) <= 0          x >= 0 AND y < 0      x <= 0 AND y > 0
    (x / y) < 0           x > 0 AND y < 0       x < 0 AND y > 0
    (x / y) = 0           x = 0
    (x / y) != 0          x != 0
The Algorithm
Algorithm: Splitting Constraints (NecConst, PEConst)
Precondition: NecConst and PEConst are properly initialized constraints.
Postcondition: Returns conflict if NecConst and PEConst conflict, no-conflict otherwise.

    begin
        -- V1 and V2 are variables, K is a constant,
        -- aop is an arithmetic operator, rop is a relational operator.
        if (the format of NecConst is not ((V1 aop V2) rop K))
            return no-conflict
        else
            -- use the table to split NecConst
            A = NewConstraint1 (NecConst)
            B = NewConstraint2 (NecConst)
        end if
        if (Negation (A, PEConst) == conflict) AND (Negation (B, PEConst) == conflict)
            return conflict
        else if (CompareConstants (A, PEConst) == conflict) AND (CompareConstants (B, PEConst) == conflict)
            return conflict
        else
            return no-conflict
        end if
    end
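A reduced version of the splitting check can be sketched in Python. Several things here are assumptions made for illustration: constraints are hypothetical tuples, only the + and - rows with > and < and K = 0 are covered, PEConst is treated as a conjunctive list of simple constraints, and "conflict" is decided with the negation table alone (the constant-comparison fallback is omitted).

```python
NEGATION = {">": "<=", ">=": "<", "<": ">=", "<=": ">", "=": "!=", "!=": "="}
PARTIAL_NEGATIONS = {">": ("<", "="), "<": (">", "="), "=": (">", "<")}

# (aop, rop) of the original constraint -> rops of the two new constraints.
SPLIT_TABLE = {
    ("+", ">"): (">", ">"),
    ("+", "<"): ("<", "<"),
    ("-", ">"): (">", "<"),
    ("-", "<"): ("<", ">"),
}


def simple_conflict(a, b):
    """Negation check on (variable, rop, constant) triples."""
    var, rop, k = a
    rops = (NEGATION[rop],) + PARTIAL_NEGATIONS.get(rop, ())
    return any((var, candidate, k) == b for candidate in rops)


def split(nec):
    """Split ((v1 aop v2) rop 0) into two constraints, or None if unsupported."""
    v1, aop, v2, rop, k = nec
    if k != 0 or (aop, rop) not in SPLIT_TABLE:
        return None
    rop1, rop2 = SPLIT_TABLE[(aop, rop)]
    return (v1, rop1, 0), (v2, rop2, 0)


def splitting_conflict(nec, path_clause):
    """Conflict only if BOTH new constraints conflict with the path clause."""
    parts = split(nec)
    if parts is None:
        return False
    return all(any(simple_conflict(part, pe) for pe in path_clause)
               for part in parts)


# (x + y) > 0 against the path expression x <= 0 AND y <= 0.
print(splitting_conflict(("x", "+", "y", ">", 0), [("x", "<=", 0), ("y", "<=", 0)]))
# Only one half conflicts, so no conflict is reported.
print(splitting_conflict(("x", "+", "y", ">", 0), [("x", "<=", 0)]))
```

Requiring both halves to conflict mirrors the proof: the necessity constraint implies A OR B, so it is refuted only when neither A nor B can hold on the path.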
Constant Comparison
This technique works when both constraints have the form (v rop k). v must be the same in both, so the constraints are (v rop k1) and (v rop k2). (This strategy is also known as grounding.) If a constraint has the format ((v aop k1) rop k2), we can rewrite it as (v rop (k2 aop' k1)), so that it is in the format we want (aop' is the inverse operation of aop). Then the following table helps us determine whether the two constraints conflict (T for conflict, F for no conflict).

    Constraint A   Constraint B   Predicate (pred)   Conclusion
    x > k1         x > k2         ---                F
    x > k1         x >= k2        ---                F
    x > k1         x < k2         k1 >= k2 - 1       if pred T, else F
    x > k1         x <= k2        k1 >= k2           if pred T, else F
    x > k1         x = k2         k1 >= k2           if pred T, else F
    x > k1         x != k2        ---                F
    x >= k1        x > k2         ---                F
    x >= k1        x >= k2        ---                F
    x >= k1        x < k2         k1 >= k2           if pred T, else F
    x >= k1        x <= k2        k1 > k2            if pred T, else F
    x >= k1        x = k2         k1 > k2            if pred T, else F
    x >= k1        x != k2        ---                F
    x < k1         x > k2         k1 <= k2 + 1       if pred T, else F
    x < k1         x >= k2        k1 <= k2           if pred T, else F
    x < k1         x < k2         ---                F
    x < k1         x <= k2        ---                F
    x < k1         x = k2         k1 <= k2           if pred T, else F
    x < k1         x != k2        ---                F
    x <= k1        x > k2         k1 <= k2           if pred T, else F
    x <= k1        x >= k2        k1 < k2            if pred T, else F
    x <= k1        x < k2         ---                F
    x <= k1        x <= k2        ---                F
    x <= k1        x = k2         k1 < k2            if pred T, else F
    x <= k1        x != k2        ---                F
    x = k1         x > k2         k1 <= k2           if pred T, else F
    x = k1         x >= k2        k1 < k2            if pred T, else F
    x = k1         x < k2         k1 >= k2           if pred T, else F
    x = k1         x <= k2        k1 > k2            if pred T, else F
    x = k1         x = k2         k1 != k2           if pred T, else F
    x = k1         x != k2        k1 = k2            if pred T, else F
    x != k1        x > k2         ---                F
    x != k1        x >= k2        ---                F
    x != k1        x < k2         ---                F
    x != k1        x <= k2        ---                F
    x != k1        x = k2         k1 = k2            if pred T, else F
    x != k1        x != k2        ---                F
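The constant-comparison table can be written as a lookup from operator pairs to predicates. The Python sketch below is an illustration under stated assumptions: the variable is an integer (which is what the k2 - 1 and k2 + 1 entries suggest), the function names are hypothetical, and `brute_force_conflict` is only a sanity check over a finite range, not part of the technique.

```python
import operator

OPS = {">": operator.gt, ">=": operator.ge, "<": operator.lt,
       "<=": operator.le, "=": operator.eq, "!=": operator.ne}

# (rop of A, rop of B) -> predicate on (k1, k2) that signals a conflict.
# Pairs that are absent never conflict (conclusion F in the table).
CONFLICT_PREDICATES = {
    (">", "<"): lambda k1, k2: k1 >= k2 - 1,
    (">", "<="): lambda k1, k2: k1 >= k2,
    (">", "="): lambda k1, k2: k1 >= k2,
    (">=", "<"): lambda k1, k2: k1 >= k2,
    (">=", "<="): lambda k1, k2: k1 > k2,
    (">=", "="): lambda k1, k2: k1 > k2,
    ("<", ">"): lambda k1, k2: k1 <= k2 + 1,
    ("<", ">="): lambda k1, k2: k1 <= k2,
    ("<", "="): lambda k1, k2: k1 <= k2,
    ("<=", ">"): lambda k1, k2: k1 <= k2,
    ("<=", ">="): lambda k1, k2: k1 < k2,
    ("<=", "="): lambda k1, k2: k1 < k2,
    ("=", ">"): lambda k1, k2: k1 <= k2,
    ("=", ">="): lambda k1, k2: k1 < k2,
    ("=", "<"): lambda k1, k2: k1 >= k2,
    ("=", "<="): lambda k1, k2: k1 > k2,
    ("=", "="): lambda k1, k2: k1 != k2,
    ("=", "!="): lambda k1, k2: k1 == k2,
    ("!=", "="): lambda k1, k2: k1 == k2,
}


def constants_conflict(rop_a, k1, rop_b, k2):
    """True if (x rop_a k1) and (x rop_b k2) cannot both hold for an integer x."""
    predicate = CONFLICT_PREDICATES.get((rop_a, rop_b))
    return predicate is not None and predicate(k1, k2)


def brute_force_conflict(rop_a, k1, rop_b, k2, search=range(-50, 51)):
    """Sanity check: no integer in the search range satisfies both constraints."""
    return not any(OPS[rop_a](x, k1) and OPS[rop_b](x, k2) for x in search)


print(constants_conflict(">", 5, "<", 6))   # no integer lies strictly between 5 and 6
print(constants_conflict(">", 5, "<", 7))   # x = 6 satisfies both
```

For small constants the lookup agrees with the brute-force check on every operator pair, which gives some confidence that the predicates are transcribed consistently.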
Algorithm
Algorithm: CompareConstants (A, B)
Precondition: A and B are properly initialized constraints.
Postcondition: Returns conflict if A and B conflict, no-conflict otherwise.

    begin
        -- V is a variable, K, K1, K2 are constants, rop is a relational operator,
        -- aop is an arithmetic operator, aop' is the inverse operation of aop
        if (the format of A is (V rop K))
            keep the format the same
        else if (the format of A is (K rop V))
            modify format to (V rop K)
        else if (the format of A is ((V aop K1) rop K2))
            modify format to (V rop (K2 aop' K1))
        else if (the format of A is (K1 rop (V aop K2)))
            modify format to (V rop (K1 aop' K2))
        else
            return no-conflict
        end if
        if (the format of B is (V rop K))
            keep the format the same
        else if (the format of B is (K rop V))
            modify format to (V rop K)
        else if (the format of B is ((V aop K1) rop K2))
            modify format to (V rop (K2 aop' K1))
        else if (the format of B is (K1 rop (V aop K2)))
            modify format to (V rop (K1 aop' K2))
        else
            return no-conflict
        end if
        if (the V's in A and B are not the same)
            return no-conflict
        end if
        if (the constant comparison table reports a conflict for A and B)
            return conflict
        else
            return no-conflict
        end if
    end CompareConstants

A proof of concept tool
To test the techniques developed above, a tool called Equivalencer was created. Equivalencer:
- is integrated with Godzilla, a test data generator
- is inserted into the Mothra mutation tool set
- is implemented in C
- works on a Sun Sparc workstation running SunOS 4.1.3
- contains more than 2000 executable lines of code
- uses some of the Mothra and Godzilla libraries

It implements the 3 strategies for detecting EM. First it applies Negation; if there is no conflict, Constant comparison; if there is still no conflict, Constraint Splitting.

Assertions
In Equivalencer, assertions are constraints that the user inserts into the test program to restrict the input domain of some variables manually.
Assertions on parameter variables are preconditions and are derived by hand from the specifications.
Assertions on internal variables can be derived automatically, using slicing (Weiser 1984) and control flow analysis (Fischer and LeBlanc 1988).
When Godzilla generates constraints related to an array, it does not take the array index expression into account. This means that A(i) > 0 and A(j) < 0 will generate A() > 0 and A() < 0, which leads to a false conflict. To avoid this, Equivalencer checks an array constraint only if it is an assertion constraint that relates to the whole array. This is called array-extension.

Equivalencer Design
1. Initialization - open all the files it needs and load data into memory.
2. Consult Godzilla's failure information about simple EM cases. “Exit” with conflict if found.
3. Get the path expressions from Godzilla and combine them with the assertions. Check with all 3 techniques. “Exit” with conflict if found.
4. Take the necessity constraints and check with all 3 techniques. “Exit” with conflict if found.
5. Combine path expressions and necessity constraints and check with all 3 techniques. “Exit” with conflict if found.
6. Combine necessity constraints and array-extension and check with all 3 techniques. “Exit” with conflict if found.
7. Combine path expressions and array-extension and check with all 3 techniques. “Exit” with conflict if found.
8. Exit with no-conflict (no EM).

Note about efficiency: as a proof-of-concept tool it was not meant to be efficient, so every path expression is checked against every necessity constraint. This increases execution time.
Empirical results

    Program     Statements   Mutants   Equivalents   Equiv. Detected   Percent Detected
    Bsearch     20           299       27            19                70.37
    Bub         11           338       35            24                68.57
    Cal         29           3010      236           37                15.67
    Euclid      11           196       24            18                75.00
    Find        28           1022      75            63                84.00
    Insert      14           460       46            32                69.57
    Mid         16           183       13            3                 23.08
    Pat         17           513       61            29                47.54
    Quad        10           359       31            4                 12.90
    Trityp      28           951       109           80                73.39
    Warshall    11           305       35            22                62.86
    Total/Avg   185          7636      695           331               47.63

Discussion about the results
Although the total is less than 50%, the percentage varies dramatically between the programs (from 84% down to 12.9%). There are several reasons for this, but perhaps the most important one is the limitations of the tools that Equivalencer is based upon.
First is the problem with array references discussed before. If a constraint such as A(4) = 0 could be taken into account, then the mutant could have been killed if A(4) was not 0.
Also, Godzilla associates every variable in each statement with a statement number, which propagates to the next statement if the value of the variable is not changed. This makes checking the constraints complicated, especially because Godzilla has a bug in it.
Feasible Path results
Although this research focused on EM, the same technique can be used on the problem of infeasible paths. Specifically, a result of Theorem 1 is that if the reachability condition for a statement is infeasible, then the statement is unreachable. To test this, some programs were artificially created such that some of their statements were unreachable, and Equivalencer was tried on them. (A mutation operator was defined such that S is unreachable if and only if M is an EM.) The results are shown in the table below:

    Program      Unreachable   Detected   Percentage detected
    prog 1       2             1          50.00
    prog 2       1             0          0.00
    prog 3       1             1          100.00
    prog 4       1             1          100.00
    prog 5       1             1          100.00
    prog 6       1             1          100.00
    prog 7       2             2          100.00
    prog 8       3             3          100.00
    prog 9       2             0          0.00
    Total/Avg.   14            10         71.43
Improving the software
- Equivalencer relies very heavily on Godzilla. Godzilla implements symbolic evaluation as a step separate from infeasible-constraint detection, and throws away considerable information that Equivalencer needs, for example references to internal variables.
- Godzilla generates array constraints without indexes. Analysis of the CAL program found that if the indexes were also taken into account, Equivalencer could find 69 more EM, increasing detection from 15.67% to 44.92%.
- Give the tester the opportunity to help in recognizing difficult constraints. This is easier for the tester than finding EM by hand, and it helps the tool find EM.
- The software analyzes the program only up to the mutated statement, but the analysis could be continued to the end of the program.

Conclusions
A partial solution to the EM problem was shown here. Algorithms were given, and a proof-of-concept tool was built. It was shown that this is an effective partial solution.
The technique is general and could be applied to any feasible path problem.
Compared to finding EM by hand, even this non-optimized tool was fast.
This type of system allows the programmer to submit a software module and, after a few minutes of computation, receive a set of test cases as inputs and a set of outputs to be examined for failures in the software. These inputs and outputs can later be used for debugging when a failure is found.