Testing Procedures (You MUST bring Binder chapter 10)
Class Testing?
We don’t really test a class, do we?
We can test each method of a class: method scope reuses experience in procedural testing
And we can test sequences of methods of that class; we will be using control flow graphs
And we can consider inheritance by testing the flattened statechart of a class
Why not just test instances in context? In other words, why not just scenario-based testing?
  See the 5 reasons on p.350, esp. the 2nd one
Bottom line: acontextual testing must precede contextual testing, but the question remains as to how much is worth investing in acontextual testing.
There are different ways of performing class testing: see the 4 bullets on p.353
Control (or code) Coverage
A First Look at Code Coverage
Definition (p.357): A code coverage model calls out the parts of an implementation that must be exercised to satisfy an implementation-based test model.
Coverage, as a metric, is the percentage of these parts exercised by a test suite.
Branch coverage is said to subsume statement coverage: exercising all branches necessarily exercises all statements.
A subsumption hierarchy is an analytical ranking of coverage strategies. No generalizable results about relative bug-finding effectiveness have been established…
For Binder, a code coverage model is NOT a test model: test from responsibility-based models, and then use coverage reports to analyze the adequacy of a test suite.
Coverage models typically ignore deferred methods, class scope (focusing on method scope instead) and inheritance.
Introducing CFGs
Code in fig 10.2 p.363; corresponding Control Flow Graph (CFG) in fig 10.3 p.364
A segment is 1 or more lexically contiguous statements with no conditionally executed statements. That is, once a segment is entered, all statements in the segment will execute.
The last statement in a segment must be a predicate, a loop control, a break, a goto, or a method exit.
Corresponding paths in 10.4
Dealing with if statements: fig 10.5
Dealing with switch statements: fig 10.6
  Watch out for fall-through behavior…
In fig. 10.7, thinking in terms of Boolean expressions rather than individual predicates ignores lazy evaluation and does not considerably simplify the CFG.
Another Example
read(x); read(y);
while x ≠ y loop
  if x > y then
    x := x – y;
  else
    y := y – x;
  end if;
end loop;
gcd := x;
Does this CFG correspond to the code?
[CFG figure not reproduced; its only surviving label is "y = 0"]
Incompleteness
Statement coverage typically leads to incompleteness:
if x < 0 then
  x := -x;
else
  null;
end if;
z := x;
A negative x would result in the coverage of all statements. But not exercising x >= 0 would not cover all cases (the implicit code, shown in green italic on the slide, is the "else null;" branch).
And doing nothing for the case x >= 0 may turn out to be wrong and needs to be tested.
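A minimal Python sketch of the same point, assuming a hand-instrumented copy of the fragment (the function name absolute and the trace labels are illustrative, not from the slides). A single negative input runs every statement, yet only half of the branch outcomes:

```python
def absolute(x, trace):
    """Mirror of the slide's fragment; records which branch outcome runs."""
    if x < 0:
        trace.add("x<0:true")
        x = -x
    else:
        trace.add("x<0:false")  # the implicit 'else null' branch
    z = x
    return z


ALL_BRANCHES = {"x<0:true", "x<0:false"}


def branch_coverage(inputs):
    """Fraction of branch outcomes exercised by the given inputs."""
    trace = set()
    for value in inputs:
        absolute(value, trace)
    return len(trace & ALL_BRANCHES) / len(ALL_BRANCHES)


print(branch_coverage([-3]))     # 0.5
print(branch_coverage([-3, 4]))  # 1.0
```

Adding one non-negative input is enough to exercise the hidden else edge.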
Uncovering Hidden Edges
if c1 and c2 then
  st;
else
  sf;
end if;
expands to:
if c1 then
  if c2 then
    st;
  else
    sf;
  end if;
else
  sf;
end if;
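The hidden edges can be made visible with a small Python sketch (function names and edge labels are illustrative assumptions). Two tests that give the compound predicate both outcomes still leave one edge of the expanded form untouched:

```python
def compound(c1, c2):
    """Compound form: one predicate, two visible outcomes."""
    return "st" if (c1 and c2) else "sf"


def edges_taken(c1, c2):
    """Expanded nested-if form; returns the result and the edges exercised."""
    taken = []
    if c1:
        taken.append("c1:true")
        if c2:
            taken.append("c2:true")
            result = "st"
        else:
            taken.append("c2:false")
            result = "sf"
    else:
        taken.append("c1:false")  # c2 is never evaluated on this edge
        result = "sf"
    return result, taken


suite = [(True, True), (False, True)]  # both outcomes of 'c1 and c2'
covered = set()
for c1, c2 in suite:
    result, taken = edges_taken(c1, c2)
    assert result == compound(c1, c2)  # both forms agree on the result
    covered.update(taken)

missing = {"c1:true", "c1:false", "c2:true", "c2:false"} - covered
print(sorted(missing))  # ['c2:false']
```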
Path Coverage
Path coverage criterion: select a test suite T such that, by executing P for each test in T, all paths leading from the initial to the final node of P’s control flow graph are traversed
In practice, however, the number of paths is too large, if not infinite
It is key to determine “critical paths”
Example
if x ≠ 0 then
  y := 5;
else
  z := z – x;
end if;
if z > 1 then
  z := z / x;
else
  z := 0;
end if;
T1 = {<x=0, z=1>, <x=1, z=3>}
  Executes all edges but does not show the risk of division by 0
T2 = {<x=0, z=3>, <x=1, z=1>}
  Would find the problem by exercising the remaining possible flows of control through the program fragment
T1 ∪ T2 -> all paths covered
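The two suites can be replayed with a short Python sketch. It assumes the first predicate reads x ≠ 0 (the operator is lost in the extracted text, but this reading is consistent with T1 and T2); the names fragment and run are illustrative:

```python
def fragment(x, z):
    """Python transcription of the slide's fragment (first test assumed x != 0)."""
    y = None
    if x != 0:
        y = 5
    else:
        z = z - x
    if z > 1:
        z = z / x  # fails when x == 0 and z > 1
    else:
        z = 0
    return z


def run(suite):
    """Execute every <x, z> pair and report the result or the failure."""
    outcomes = []
    for x, z in suite:
        try:
            outcomes.append(fragment(x, z))
        except ZeroDivisionError:
            outcomes.append("division by zero")
    return outcomes


T1 = [(0, 1), (1, 3)]
T2 = [(0, 3), (1, 1)]
print(run(T1))  # [0, 3.0]
print(run(T2))  # ['division by zero', 0]
```

T1 takes every edge without failing; only the false-then-true path exercised by <x=0, z=3> exposes the fault.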
FAQ about Coverage
Is 100% coverage the same as exhaustive testing?
Are branch and path coverage the same?
Can path coverage be achieved?
Is every path in a flow graph testable?
Is less than 100% coverage acceptable?
Can I trust a test suite without measuring coverage?
When can I stop testing?
Coverage Strategies (1)
C1 coverage (i.e. line coverage) is not enough:
  Typically does not exercise all true/false combinations for the predicates, which entails that errors in predicates may be missed
  Common loop bugs are missed if the loop is iterated over only once
C2 coverage (i.e. branch coverage): every path from a node is executed at least once
  Unfortunately treats a compound predicate as a single statement (with n clauses there are 2^n combinations, but only 2 are tested…)
  This is a serious oversimplification! See example at top of p.375
  C2 can be achieved in D+1 paths with D 2-way branching nodes
  For 10.7 the paths are at top of p.374
Multiple condition coverage
  Condition coverage: evaluate each condition as true and as false at least once…
  Branch/condition coverage: ditto + each branch at least once
  Multiple condition coverage: all true-false COMBINATIONS of simple conditions at least once…
    Subsumes all of the above (and similar to All-variants of ch.6)
    N.B.: not necessarily achievable (e.g. due to lazy evaluation)
Object code coverage (!!): for safety critical applications!
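To make the 2 versus 2^n gap concrete, here is a small Python sketch. The predicate (a and b) or c is a made-up example with n = 3 simple conditions, not one taken from Binder:

```python
from itertools import product


def predicate(a, b, c):
    """Hypothetical compound predicate with n = 3 simple conditions."""
    return (a and b) or c


def multiple_condition_suite(n):
    """All true/false combinations of n simple conditions (2**n tuples)."""
    return list(product([False, True], repeat=n))


# Two tests are enough for branch coverage of the compound predicate...
branch_suite = [(True, True, False), (False, False, False)]
outcomes = {predicate(*test) for test in branch_suite}
print(outcomes == {True, False})  # True

# ...but multiple condition coverage asks for every combination.
print(len(multiple_condition_suite(3)))  # 8
```

With lazy evaluation some of the 8 combinations cannot be told apart at run time (c is never evaluated when a and b are both true), which is one reason the criterion is not always achievable.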
Coverage Strategies (2)
McCabe’s basis-path test model: C = e – n + 2 (cyclomatic complexity metric)
  Supported by most coverage tools
  Not proven useful, esp. in OO where 90% of methods typically have fewer than 10 lines of code…
  Unreliable: see fig 10.9 p.379
    The metric is not correlated to the number of entry/exit paths… and collapses compound predicates into a single node.
    In fig 10.3, C is 5 (using X, Y and Z) and this is less than 1/3 of the expanded entry-exit paths
    It is possible to select C paths and achieve NEITHER statement NOR branch coverage!!!!
Bottom line:
  CFGs are limited to small members and classes
  Some authors test only the canonical form of a class
  Coverage does not address inheritance
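The formula C = e – n + 2 is easy to compute from an edge list. The sketch below assumes a single connected flow graph with one entry and one exit; the graph itself is a hypothetical CFG for a loop containing one if/else, not fig 10.3:

```python
def cyclomatic_complexity(edges):
    """C = e - n + 2 for a connected, single-entry, single-exit flow graph."""
    nodes = {node for edge in edges for node in edge}
    return len(edges) - len(nodes) + 2


# Hypothetical CFG: entry -> while; the loop body is an if/else.
loop_with_if = [
    ("entry", "while"),
    ("while", "if"),
    ("while", "exit"),
    ("if", "then"),
    ("if", "else"),
    ("then", "while"),
    ("else", "while"),
]

print(cyclomatic_complexity(loop_with_if))  # 3
```

Here e = 7 and n = 6, so C = 3, which matches the usual shortcut of counting the two decisions (while, if) plus one.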
Dealing with Loops
A loop can be thought of as a path with several special cases!
Fig 10.10 summarizes the representation of different kinds of loops
  Fixed-size loops can be translated (unrolled) into a single segment
You need to test at least 0, 1, and 2 iterations
Fig 10.11: coverage can be hopeless for a method that contains more than 1 loop
  Breaking and nesting are highly problematic!
Fig 10.12: tests for ONE loop control variable
  Test: min, min +1, typical, max –1, max + exclusions
Nested loops can be tested according to the strategy of table 10.3 (which proceeds from Procedure 4 p.383)
  See table 10.3 p.384
Spaghetti loops with several entry or exit points should be rejected!!
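As a rough sketch of the min, min +1, typical, max –1, max rule for one loop control variable, the Python helper below lists the iteration counts to try. The function name is an assumption, and the "exclusions" of fig 10.12 are deliberately not modeled:

```python
def loop_control_values(minimum, maximum, typical):
    """Iteration counts to test: min, min+1, typical, max-1, max (deduplicated)."""
    if not minimum <= typical <= maximum:
        raise ValueError("typical must lie between minimum and maximum")
    ordered = [minimum, minimum + 1, typical, maximum - 1, maximum]
    unique = []
    for value in ordered:
        # Keep only legal counts, and drop repeats that occur in tiny ranges.
        if minimum <= value <= maximum and value not in unique:
            unique.append(value)
    return unique


print(loop_control_values(0, 10, 4))  # [0, 1, 4, 9, 10]
print(loop_control_values(0, 1, 0))   # [0, 1]
```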
Data Flow Coverage
Data Flow Coverage
Considers how data gets modified in the system and how it can get corrupted
Recast in OO terms, at the method scope, Binder suggests a list of 7 typical errors (p.385)
  It is unlikely the DUK method catches the last 2 bullets…
Three kinds of actions are considered:
  Define: changes the value of an instance variable
  Use:
    C-use: really r-value
    P-use: used inside a predicate
  Kill: “kills” instance variable
Table 10.4 p.387: all in/valid D/K/U pairs: really not conclusive!!
Example p.386: J and K are interesting!
This strategy is to complement control coverage!! (see bottom of p.388)
Binder’s recommendation (see quote p.389): the all-uses criterion, i.e. at least one DU path must be exercised for every definition-use pair.
Aliasing of variables causes serious problems!!
Working things out by hand for anything but small methods is hopeless…
Another Example
main() /* find longest line */
{
    int len;
    extern int max;
    extern char save[];
    max = 0;                        /* a definition of max */
    while ((len = getline()) > 0)   /* a definition of len */
        if (len >= max) {           /* a p-use of max */
            max = len;              /* a definition of max; a c-use of len */
            copy();                 /* max = len; copy(); form a basic block */
        }
    if (max > 0) /* there was a line */  /* a decision predicate involving max */
        printf("%s", save);
}
Corresponding CFG
[CFG figure not reproduced. Node labels: max = 0; while((len = …; if (len >= ..; max = len; if (max > ..]
Legend:
  Definition DEF(v, n): v = …
  P-use USE(v, n): v >= …
  C-use USE(v, n): … = … v …
Interestingly enough, the notions of l-value and r-value would be better!!
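Factorial Example
int FACTORIAL (int N)
  int RESULT = 1;
begin
  for int I in 2 .. N loop
    RESULT = RESULT * I;
  end loop;
  return RESULT;
end;
DU pairs for RESULT, as (definition line, use line) with lines counted from the signature as line 1: (2, 5), (2, 7), (5, 5), (5, 7)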
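The four DU pairs can be recomputed mechanically. The Python sketch below is one possible way to do it, under stated assumptions: the CFG is hand-built with node numbers equal to the statement lines of the factorial code (signature = line 1), only RESULT is tracked, and a node that both uses and defines the variable is taken to use it first (as in RESULT = RESULT * I):

```python
def du_pairs(successors, defs, uses):
    """Pairs (d, u) such that a definition-clear path leads from d to u."""
    pairs = set()
    for d in defs:
        stack = list(successors.get(d, []))
        seen = set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            if node in uses:
                pairs.add((d, node))
            if node in defs:
                continue  # redefinition: paths beyond are not definition-clear
            stack.extend(successors.get(node, []))
    return pairs


# Hypothetical CFG for the factorial: 2 = init, 4 = loop header,
# 5 = loop body, 7 = return.
successors = {2: [4], 4: [5, 7], 5: [4]}
result_defs = {2, 5}
result_uses = {5, 7}

print(sorted(du_pairs(successors, result_defs, result_uses)))
# [(2, 5), (2, 7), (5, 5), (5, 7)]
```

The pair (5, 5) only shows up because of the loop's back edge, which is why at least two iterations are needed to exercise it.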
Class Control Flow Coverage
The Class Flow Graph
“Test design at the class scope must produce method activation sequences to exercise intraclass (i.e. intermethod) control and data flow.”
The Class Flow Graph is constructed by joining the method flow graph with a state transition diagram for each method!
  It shows how paths in each method may be followed by paths in other methods, thereby supporting intraclass path analysis.
  It represents all possible intraclass control flow paths.
  If no sequential constraint exists for method activation, then the exit node of each method is effectively connected to the entry node of EVERY other method, including its own.
Consider figs 10.13 through 10.17
Can apply a coverage model to it.
  Alpha-omega paths are numerous, even if loops are limited to 0 and 1 iteration!
  Choose by inspection…
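A quick Python sketch shows why choosing by inspection becomes necessary. It assumes a made-up class with three methods and no sequential constraints, so every exit node connects to every entry node:

```python
from itertools import product

# Hypothetical class interface; any method may follow any other.
methods = ["push", "pop", "top"]


def intermethod_edges(names):
    """Exit-to-entry edges when no sequential constraint exists."""
    return [(src, dst) for src in names for dst in names]


def activation_sequences(names, length):
    """All method activation sequences of the given length."""
    return list(product(names, repeat=length))


print(len(intermethod_edges(methods)))        # 9
print(len(activation_sequences(methods, 3)))  # 27
```

With n methods there are n^k activation sequences of length k, before even counting the paths inside each method.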
Coverage and Path Sensitization
Consider the code p.398
  Last line is wrong
  Need a specific test case to find this!
  Neither coverage nor all-DU will find this…
Marick (p.399) suggests another example:
  Missing a condition to check when to raise the landing gear
  Preferably must be in flight ;-)
Path sensitization is the process of determining argument and instance variable values that will cause a particular path to be taken:
  It is undecidable… must be solved heuristically
  Gets complicated as size increases…
  Most of the work may have been done through UCMs
Know the 6 cases (pp.400-1) of infeasible paths
Bottom line: know about equivalence partitioning and choosing test values using boundary value analysis!
Boundary Value Analysis
Only through domain analysis can you find the required equivalence classes and boundary values!!
Fig 10.18: domain fault model
Fig 10.19 for function at bottom of p.404
  Ignores stack…
On, off, in and out points defined p.405, open/closed boundaries p.406
  e.g., table 10.5 for our functions
Applies to primitive data types. Not classes…
  Relevant to instance variables and parameters…
The input domain of a method considers the flattened preconditions and flattened class invariant
Back to 7.16 and 7.17: you have to define the states of an instance if you want to apply boundary value analysis
  Abstract state on/off/in points pp.408-9
  E.g., for a stack in table 10.6 p.409
Basic strategy: one-by-one
  One on point and one off point for each boundary
  Rules pp.411-2
Back to the domain matrix… fig 10.21
Definitions
An on point is a value that lies on the boundary
An off point is a value not on a boundary
An in point is a value that satisfies all boundary conditions and does not lie on a boundary
An out point is a value that satisfies NO boundary condition and does not lie on any boundary
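A sketch of the one-by-one strategy for a single integer variable, in Python. The domain low <= x <= high is a made-up example, and the choice of off point (one step outside a closed boundary) is an assumed convention; Binder's actual rules on pp.411-2 are not reproduced here:

```python
def in_domain(x, low, high):
    """Hypothetical input domain with two closed boundaries."""
    return low <= x <= high


def one_by_one_points(low, high):
    """One on point and one off point per boundary of low <= x <= high."""
    return [
        ("x >= low", "on", low),
        ("x >= low", "off", low - 1),   # assumed: just outside a closed boundary
        ("x <= high", "on", high),
        ("x <= high", "off", high + 1),
    ]


for boundary, kind, value in one_by_one_points(1, 100):
    print(boundary, kind, value, in_domain(value, 1, 100))
```

Each on point is accepted and each off point is rejected, so a boundary that has shifted by one in the implementation changes the outcome of at least one test.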
Appendix 1 Formal Definitions for Data Flow Coverage
Basic Idea
Generates test data according to the way data is manipulated in the program
Helps define intermediary criteria between all-edges testing (possibly too weak) and all-paths testing (often impossible)
But needs effective tool support
Basic Definitions (1)
Node n ∈ CFG(P) is a defining node of the variable v ∈ V, written as DEF(v,n), iff the value of the variable v is defined in the statement corresponding to node n
Node n ∈ CFG(P) is a usage node of the variable v ∈ V, written as USE(v,n), iff the value of the variable v is used in the statement corresponding to node n
A usage node USE(v,n) is a predicate use (denoted as P-use) iff the statement n is a predicate statement, otherwise USE(v,n) is a computation use (denoted as C-use)
Not the same as l-value and r-value unfortunately…
Basic Definitions (2)
A definition-use (sub)path with respect to a variable v (denoted du-path) is a path in PATHS(P) such that, for some v ∈ V, there are define and usage nodes DEF(v, m) and USE(v, n) such that m and n are respectively the initial and final nodes of the path.
A definition-clear (sub)path with respect to a variable v (denoted dc-path) is a definition-use path in PATHS(P) with initial and final nodes DEF(v, m) and USE(v, n) such that no other node in the path is a defining node of v.
Formal Definitions (1)
The set T satisfies the all-Defs criterion for the program P iff for every variable v ∈ V, T contains definition-clear paths from every defining node of v to a use of v
The set T satisfies the all-Uses criterion for the program P iff for every variable v ∈ V, T contains definition-clear paths from every defining node of v to every use of v, and to the successor node of each USE(v,n)
The set T satisfies the all-P-Uses/Some C-Uses criterion for the program P iff for every variable v ∈ V, T contains definition-clear paths from every defining node of v to every predicate use of v, and if a definition of v has no P-Uses, there is a definition-clear path to at least one computation use.
Formal Definitions (2)
The set T satisfies the all-C-Uses/Some P-Uses criterion for the program P iff for every variable v ∈ V, T contains definition-clear paths from every defining node of v to every computation use of v, and if a definition of v has no C-Uses, there is a definition-clear path to at least one predicate use.
The set T satisfies the all-DU-Paths criterion for the program P iff for every variable v ∈ V, T contains definition-clear paths from every defining node of v to every use of v, and to the successor node of each USE(v,n), and these paths are either single loop traversals or cycle free.
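Subsumption
[Hierarchy figure not reproduced; the arrows are lost. Criteria shown, in the order they appear on the slide:]
All paths
All definition-use paths
All uses
All computational/some predicate uses
All predicate/some computational uses
All predicate uses
All computational uses
Edge (Branch)
All definitions
Statement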
Appendix 2 Another White-Box Approach: Mutant Testing
Mutation Testing (!!)
Basic idea:
  Take a program and test data generated for that program
  Create a number of similar programs (mutants), each differing from the original in one small way, i.e., each possessing a fault
    E.g., replace addition operator by multiplication operator
  The original data are then run through the mutants
  If test data detect differences in mutants, then the mutants are said to be dead
  If they do not, the test data are deemed inadequate and need to be re-examined, possibly augmented to kill the live mutant
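A toy Python sketch of the idea, using the slide's own example of replacing addition by multiplication. The functions and the test data are made up for illustration; real mutation tools generate and run the mutants automatically:

```python
def original(a, b):
    return a + b


def mutant(a, b):
    return a * b  # the single seeded fault: '+' replaced by '*'


def killed(test_data):
    """A mutant is dead if some test input makes it differ from the original."""
    return any(original(a, b) != mutant(a, b) for a, b in test_data)


weak = [(2, 2), (0, 0)]   # 2 + 2 == 2 * 2 and 0 + 0 == 0 * 0
strong = weak + [(2, 3)]  # 2 + 3 != 2 * 3

print(killed(weak))    # False: the mutant stays alive, the data is inadequate
print(killed(strong))  # True: the augmented data kills the mutant
```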
Underlying Hypotheses
Hypotheses:
  Competent programmers: they write programs that are nearly correct
  Coupling effect: test data that distinguish all programs differing from a correct one by only simple errors are so sensitive that they also implicitly distinguish more complex errors
Observations:
  What about more complex errors, involving several statements?
  Do you really believe the coupling effect?
  There is some empirical evidence for these hypotheses!!