A Generic Method for Statistical Testing A. Denise, M.-C. Gaudel and S.-D. Gouraud {denise, mcg, gouraud}@lri.fr L.R.I, Université Paris XI, 91400 ORSAY, FRANCE ISSRE 2004
Outline • Context • Statistical testing and test quality • Combinatorial Structures • Our new approach to Statistical Testing • Draw paths • Optimise test quality • Validation of our approach • Application to structural statistical testing • Experimental results • Conclusion and future work
Combinatorial structures specification • The specification of a class of Combinatorial Structures is a set of productions built from • basic objects: ε and Atom, of size 0 and 1 respectively • constructions: union (+), product (×), sequence, etc. • cardinality constraints • Example: • Complete (non-empty) binary tree: T = L + T × T, where L is an Atom that represents a leaf • Complexity of counting and generating • Linear in our case • n log n in general for combinatorial structures of size n
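The binary tree example can be turned into a counting recurrence directly from its production. The sketch below is only an illustration, assuming that the size of a tree is its number of leaves (the atoms L); the function name `count_trees` is hypothetical and not part of the presented tool.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def count_trees(n: int) -> int:
    """Number of complete binary trees of size n for T = L + T x T.

    Assumption: size = number of atoms, i.e. number of leaves L.
    """
    if n <= 0:
        return 0
    if n == 1:
        return 1  # union, left alternative: the single leaf L
    # union, right alternative: product T x T, the n leaves are split
    # between a left subtree of size k and a right subtree of size n - k
    return sum(count_trees(k) * count_trees(n - k) for k in range(1, n))


print([count_trees(n) for n in range(1, 7)])  # [1, 1, 2, 5, 14, 42]
```

Union becomes a sum and product becomes a convolution over sizes, which is what makes counting (and then uniform generation) mechanical once the specification is written.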
Statistical (or random) Testing • Test data are selected at random from the input domain of the program, uniformly or according to an operational profile • It is possible to test more intensively than with other methods • Poor coverage of particular cases such as exception cases • Solution? A combination with another testing method [Thévenod-Fosse, Waeselynck, LAAS, 1991].
Quality of statistical testing [TF,Wa] • Let E be the set of elements to be covered and N the number of tests • The test quality $q_N$ is the smallest probability that any element of E has of being covered when N tests are exercised: $q_N = 1 - (1 - p_{\min})^N$, where $p_{\min} = \min\{p(e),\ e \in E\}$ • To maximise $q_N$, we need to maximise $p_{\min}$ • A solution (not always possible): uniform drawing among E
Our approach to Statistical Testing • Random drawing of paths • The set of paths of a graph can easily be represented by a combinatorial structure specification • Random generation has linear complexity • 2 steps: 1) Draw an adequate set of paths 2) Find input data which ensure the execution of these paths (of length ≤ n)
Graph and Combinatorial Structures • Atoms = edges • Sequence of edges = path • S = v.S + v.e0.C.e7 • C = e1.e2 + e3.B.e6 • B = e4.I + ε • I = e5.B [Figure: graph with nodes INIT, I0, C1, I2, B3, I4, I5, EXIT and edges labelled v and e0 to e7]
Generation: counting From S, there are 3 paths of length 7
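The count of 3 can be checked by translating each production of the example grammar into a recurrence on the path length. This is a sketch written for this one grammar, not the counting routine of the authors' tool; the function name `count_paths` is hypothetical.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def count_paths(symbol: str, n: int) -> int:
    """Number of edge sequences of length exactly n derivable from `symbol`."""
    if n < 0:
        return 0
    if symbol == "S":  # S = v.S + v.e0.C.e7
        return count_paths("S", n - 1) + count_paths("C", n - 3)
    if symbol == "C":  # C = e1.e2 + e3.B.e6
        return (1 if n == 2 else 0) + count_paths("B", n - 2)
    if symbol == "B":  # B = e4.I + epsilon
        return count_paths("I", n - 1) + (1 if n == 0 else 0)
    if symbol == "I":  # I = e5.B
        return count_paths("B", n - 1)
    raise ValueError(f"unknown symbol {symbol!r}")


print(count_paths("S", 7))   # 3
print(count_paths("S", 11))  # 5
```

Each atom (edge) contributes 1 to the length, a union adds the counts of its alternatives, and the memoisation keeps the work proportional to the number of (symbol, length) pairs.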
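Generation: drawing (length = 7) • S = v.S + v.e0.C.e7 • C = e1.e2 + e3.B.e6 • B = e4.I + ε • I = e5.B
Derivation tree, with the probability of each choice in parentheses (subscripts give the length still to be generated):
S7 → v.S6 (2/3) | v.e0.C4.e7 (1/3)
v.S6 → v.v.S5 (1) | v.v.e0.C3.e7 (0)
v.e0.C4.e7 → v.e0.e3.B2.e6.e7 (1)
v.v.S5 → v.v.v.S4 (0) | v.v.v.e0.C2.e7 (1)
v.v.v.e0.C2.e7 → v.v.v.e0.e1.e2.e7 (1/2) | v.v.v.e0.e3.e6.e7 (1/2)
Paths of length 7: v.v.v.e0.e1.e2.e7, v.v.v.e0.e3.e6.e7, v.e0.e3.e4.e5.e6.e7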
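The drawing step can be sketched as follows: at each non-terminal, an alternative is chosen with probability proportional to the number of paths of the required length it can still produce, which makes every complete path equally likely. This is an illustrative sketch of that counting-based idea for the example grammar; the names `GRAMMAR`, `count_seq` and `draw` are hypothetical, and no claim is made that the authors' tool is organised this way.

```python
import random
from functools import lru_cache

# Each non-terminal maps to its alternatives; an alternative is a tuple of symbols.
# Any symbol that is not a key of GRAMMAR is an atom (an edge) of size 1.
GRAMMAR = {
    "S": [("v", "S"), ("v", "e0", "C", "e7")],
    "C": [("e1", "e2"), ("e3", "B", "e6")],
    "B": [("e4", "I"), ()],
    "I": [("e5", "B")],
}


@lru_cache(maxsize=None)
def count_seq(symbols: tuple, n: int) -> int:
    """Number of words of length exactly n derivable from a sequence of symbols."""
    if n < 0:
        return 0
    if not symbols:
        return 1 if n == 0 else 0
    head, rest = symbols[0], symbols[1:]
    if head not in GRAMMAR:  # atom: consumes one unit of length
        return count_seq(rest, n - 1)
    return sum(count_seq(alt + rest, n) for alt in GRAMMAR[head])


def draw(start: str, n: int, rng: random.Random) -> list:
    """Draw uniformly at random one word of length n derivable from `start`."""
    symbols = (start,)
    if count_seq(symbols, n) == 0:
        raise ValueError("no word of this length")
    word = []
    while symbols:
        head, rest = symbols[0], symbols[1:]
        if head not in GRAMMAR:
            word.append(head)
            symbols, n = rest, n - 1
            continue
        alternatives = GRAMMAR[head]
        # Weight of an alternative = number of words it can still complete.
        weights = [count_seq(alt + rest, n) for alt in alternatives]
        symbols = rng.choices(alternatives, weights=weights)[0] + rest
    return word


print(".".join(draw("S", 7, random.Random())))
```

At S with 7 edges to generate, the weights are 2 for v.S and 1 for v.e0.C.e7, which gives the probabilities 2/3 and 1/3 of the first branching.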
Combinatorial Structures & Statistical Testing • If the criterion consists in covering a set of paths: the corresponding combinatorial structure specification is built. Examples: all paths passing through the edge a, all paths passing through the node B3 then the node I4, … • If the criterion consists in covering a set of elements: ??? Examples: all nodes, all edges, … How can a uniform drawing among paths ensure a good test quality for the coverage of the elements of the criterion?
Drawing paths Let N be the number of tests. We want to: • Pick, with a suitable distribution, N elements e1,…,eN among the elements to be covered • For each ei, draw uniformly a path (of length ≤ n) among those which pass through this element ei.
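The two steps above can be sketched on the running example. Several things here are assumptions made for illustration: the paths are enumerated explicitly rather than drawn from the grammar, only paths of length exactly 11 are used (as in the example slides), and each node is recognised by one edge read off the flattened figure (`MARKER_EDGE`). All names are hypothetical.

```python
import random

# Assumed edge-to-node mapping: a path visits the node if it contains this edge.
MARKER_EDGE = {"I0": "e0", "I2": "e1", "I4": "e4", "I5": "e7"}


def paths_of_length(n: int) -> list:
    """All paths v^k.v.e0.C.e7 of length n, unrolled by hand from the grammar."""
    paths = []
    for loops in range(n):  # number of e4.e5 iterations inside C
        bodies = [["e3"] + ["e4", "e5"] * loops + ["e6"]]
        if loops == 0:
            bodies.append(["e1", "e2"])
        for body in bodies:
            extra_v = n - 3 - len(body)  # leading v's beyond the mandatory one
            if extra_v >= 0:
                paths.append(["v"] * (extra_v + 1) + ["e0"] + body + ["e7"])
    return paths


def draw_test_paths(paths: list, p1: dict, n_tests: int, rng: random.Random) -> list:
    """Step 1: pick an element with distribution p1. Step 2: pick uniformly
    a path among those passing through that element."""
    elements = list(p1)
    weights = [p1[e] for e in elements]
    drawn = []
    for _ in range(n_tests):
        element = rng.choices(elements, weights=weights)[0]
        candidates = [p for p in paths if MARKER_EDGE[element] in p]
        drawn.append((element, rng.choice(candidates)))
    return drawn


paths = paths_of_length(11)
p1 = {"I2": 0.5, "I4": 0.5, "I0": 0.0, "I5": 0.0}
for element, path in draw_test_paths(paths, p1, 3, random.Random()):
    print(element, ".".join(path))
```

In practice the second step would use the uniform generator on the specification of "paths through ei" instead of filtering an explicit list, which only works here because the example has five paths.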
Example: all red nodes [Figure: graph with nodes INIT, I0, C1, I2, B3, I4, I5, EXIT] • E = {I2, I0, I4, I5} • 5 paths of length 11 • Uniform distribution: $p(I2) = 1/4 + 1/4 \cdot 1/5 + 1/4 \cdot 0 + 1/4 \cdot 1/5 = 7/20 = 0.35$ • And: $p(I4) = 11/20$, $p(I0) = 1$, $p(I5) = 1$ • $p_{\min} = p(I2) = 0.35$: $p_{\min}$ is not optimal!
Example: all red nodes [Figure: graph with nodes INIT, I0, C1, I2, B3, I4, I5, EXIT] • E = {I2, I0, I4, I5} • 5 paths of length 11 • Distribution: $p_1(I0) = p_1(I5) = 0$, $p_1(I2) = p_1(I4) = 0.5$ • $p_{\min} = 0.5$ • How could we automatically maximise $p_{\min}$?
Probability of an element • The probability p(e) that the element e is exercised by one run is $p(e) = p_1(e) + p_2(e)$ • Probability of drawing this element (step 1): $p_1(e)$ • Probability of drawing a path passing through this element when another element was drawn (step 2): $p_2(e) = \sum_{e' \in E \setminus \{e\}} p_1(e')\, \frac{c(e,e')}{c(e')}$, where e' is the element drawn in step 1, c(e') is the number of paths passing through e', and c(e,e') is the number of paths passing through e and e'
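The per-run probability can be checked on the four-node example with exact fractions: each drawn element e' contributes p1(e') times c(e,e')/c(e'), and the term e' = e contributes p1(e) itself. The counts in `PAIR_COUNTS` are an assumption: they were recomputed from the example grammar for paths of length 11 and reproduce the values 7/20 and 11/20 given in the example. All names are hypothetical.

```python
from fractions import Fraction

ELEMENTS = ["I2", "I0", "I4", "I5"]

# PAIR_COUNTS[(e, f)] = number of length-11 paths through both e and f;
# the diagonal entries (e, e) are c(e). Assumed values, see the lead-in.
PAIR_COUNTS = {
    ("I2", "I2"): 1, ("I0", "I0"): 5, ("I4", "I4"): 3, ("I5", "I5"): 5,
    ("I2", "I0"): 1, ("I2", "I4"): 0, ("I2", "I5"): 1,
    ("I0", "I4"): 3, ("I0", "I5"): 5, ("I4", "I5"): 3,
}


def c(e: str, f: str) -> int:
    """Symmetric lookup of the number of paths through both e and f."""
    return PAIR_COUNTS[(e, f)] if (e, f) in PAIR_COUNTS else PAIR_COUNTS[(f, e)]


def coverage(p1: dict) -> dict:
    """p(e) for every element, given the step-1 distribution p1."""
    return {
        e: sum(p1[f] * Fraction(c(e, f), c(f, f)) for f in ELEMENTS)
        for e in ELEMENTS
    }


uniform = {e: Fraction(1, 4) for e in ELEMENTS}
print(coverage(uniform)["I2"])         # 7/20
print(min(coverage(uniform).values())) # 7/20
```

With the distribution that puts 1/2 on I2 and 1/2 on I4, the same function gives p(I2) = p(I4) = 1/2, the improved p_min of the example.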
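A method to calculate the distribution • To optimise the test quality, we have to maximise $p_{\min}$ • However, for all e in E, $p_{\min} \le p(e) = p_1(e) + \sum_{e' \in E \setminus \{e\}} p_1(e')\, \frac{c(e,e')}{c(e')}$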
Maximise $p_{\min}$ under these constraints • This optimisation problem is solved with the simplex method, and the $p_1(e_i)$ are deduced
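To show what the optimisation is looking for, the sketch below replaces the simplex method with a plain exhaustive search over step-1 distributions whose probabilities are multiples of 1/20. That substitution is a deliberate simplification that is only workable for the four-element example; the coefficient table `COEFF` (the ratios c(e,e')/c(e') for paths of length 11) is an assumption recomputed from the example grammar, as is the requirement that the p1 values sum to 1. All names are hypothetical.

```python
from fractions import Fraction
from itertools import product

ELEMENTS = ["I2", "I0", "I4", "I5"]
F = Fraction

# COEFF[e][j] = c(e, e_j) / c(e_j), columns in the order of ELEMENTS (assumed values).
COEFF = {
    "I2": [F(1), F(1, 5), F(0), F(1, 5)],
    "I0": [F(1), F(1), F(1), F(1)],
    "I4": [F(0), F(3, 5), F(1), F(3, 5)],
    "I5": [F(1), F(1), F(1), F(1)],
}


def p_min(p1: list) -> Fraction:
    """Smallest coverage probability p(e) over all elements, for distribution p1."""
    return min(sum(k * p for k, p in zip(COEFF[e], p1)) for e in ELEMENTS)


def best_distribution(steps: int = 20):
    """Exhaustive search over distributions with probabilities in multiples of 1/steps."""
    best_value, best_p1 = F(-1), None
    for a, b, c in product(range(steps + 1), repeat=3):
        d = steps - a - b - c
        if d < 0:
            continue  # the four probabilities must sum to 1
        p1 = [F(x, steps) for x in (a, b, c, d)]
        value = p_min(p1)
        if value > best_value:
            best_value, best_p1 = value, p1
    return best_value, dict(zip(ELEMENTS, best_p1))


value, p1 = best_distribution()
print(value, p1)
```

The objective and the constraints are linear in the p1 values, which is why a linear programming solver handles the general case where a grid search would not scale.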
Example: all red nodes [Figure: graph with nodes INIT, I0, C1, I2, B3, I4, I5, EXIT] • Uniform distribution on paths • 5 paths of length 11 • $p_1(I2) = p_1(I4) = 0.5$ • $p_1(I0) = p_1(I5) = 0$ • $p_{\min} = 0.5$
From paths to Input Data • Find the input data that will cause the execution of each drawn path • In case of finite models: FSA, FSM • Input data = sequence of inputs labelling the edges + additional inputs (for observation)
From paths to Input Data • General case of infinite models: any description including non-trivial data types and guards like EFSM, state-charts, CFG • Build the path predicates • Solve them (semi-decidable problem) • Input data = any data satisfying the predicate
Experimental validation • Application to Structural Statistical testing • Prototype AuGuSTe • Same set of programs and mutants as Thévenod-Fosse, Waeselynck and Crouzet • Compare the detection power of our approach (drawing paths and solving the predicates) with theirs (building the input distribution explicitly) • Experimental results are quite similar
Experiments with $q_N = 0.9999$ • Fct4: • more than $10^{16}$ paths of length ≤ 234 • long predicates
Mutation scores • More than 10000 runs performed on 2914 mutants • Fct3: non independence of the test experiments
Conclusion • New application of some results in combinatorics to testing: a generic and automated statistical testing method • Good experimental results for structural statistical testing • Quite similar to those of [TFWaCr] • It is very likely that the method scales up well
Future Work • Combinatorial structures tools can deal with more complex languages • More elaborate combinatorial structures: elimination of some major sources of infeasible paths • Random generation based on Boltzmann models [DuFlLoSc]: the bound on the length of the considered paths can be avoided