Data Flow Testing (DFT)

• Data flow testing is NOT the same as constructing design diagrams such as data-flow diagrams (DFDs) or E-R diagrams.
• It is a form of structural testing, and mostly a white-box technique, that focuses on program variables and their data paths:
  - from the point where a variable, v, is defined or assigned a value
  - to the point where that variable, v, is used
• Remember: to generate a path for testing, we need to set up the data that drives execution along that path.

Static Analysis of Data
• Static analysis allows us to check (test or find faults) without running the actual code. Applied to variables, it looks for:
  - a data item (variable) that is defined but never used
  - a data item that is used but never defined
  - a data item that is defined multiple times prior to usage
• While these are danger signs, they may or may not lead to defects:
  - A defined but never used variable may just be extra stuff.
  - Some compilers assign an initial value of zero or blank, based on the data type, to every variable that was never explicitly defined.
  - Multiple definitions prior to usage may just be bad and wasteful logic.
• We are more interested in “executing” the code than in static analysis alone, though.
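The three static checks can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the program is straight-line code already reduced to a list of ("def" or "use", variable) events, and the function name `find_anomalies` and the sample events are hypothetical.

```python
from collections import defaultdict

def find_anomalies(events):
    """Scan ("def" | "use", variable) events in program order (straight-line code only)."""
    defined, used = set(), set()
    pending_defs = defaultdict(int)  # definitions not yet followed by a use
    anomalies = []
    for kind, var in events:
        if kind == "def":
            if pending_defs[var] >= 1:
                anomalies.append((var, "redefined before use"))
            pending_defs[var] += 1
            defined.add(var)
        else:  # "use"
            if var not in defined:
                anomalies.append((var, "used without a prior definition"))
            pending_defs[var] = 0
            used.add(var)
    # Anything defined somewhere but never read is reported last.
    for var in sorted(defined - used):
        anomalies.append((var, "defined but never used"))
    return anomalies

# Hypothetical event list: a is defined twice before its use, b is used
# without a definition, and c is defined but never used.
events = [("def", "a"), ("def", "a"), ("use", "a"), ("use", "b"), ("def", "c")]
print(find_anomalies(events))
```

As the slide notes, each reported item is a warning sign to inspect, not a confirmed defect.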

Data Dependencies and Data Flow Testing (DFT)
• In Data Flow Testing (DFT) we are interested in the “dependencies” or “relationships” among data. Consider a data item, X:
• Data Definition (value assignment) of X: via 1) initialization, 2) input, or 3) some assignment.
  - Integer X; (the compiler initializes X to 0, or X holds “trash”)
  - X = 3;
  - Input X;
• Data Usage (accessing the value) of X: 1) for computation and assignment (C-Use), or 2) for decision making in a predicate (P-Use).
  - Z = X + 25; (C-Use)
  - If ( X > 0 ) then ... (P-Use)
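To make the terms concrete, here is a small Python function with each definition and use marked in a comment. The function name `classify` and its return strings are made up for illustration; only the `x + 25` and `x > 0` lines mirror the slide.

```python
def classify(x: int) -> str:           # definition of x (value supplied as input)
    z = x + 25                         # C-use of x; definition of z
    if x > 0:                          # P-use of x
        return f"positive, z={z}"      # C-use of z
    return f"non-positive, z={z}"      # C-use of z

print(classify(3))    # exercises the definition of x through the true branch
print(classify(-25))  # exercises the definition of x through the false branch
```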

Data “Dependencies” or Data “Relationships”
• There are basically 4 possible combinations of “relationships” between data Definition (D) and data Usage (U). For example, for a data item, X:
  - D-U: X is defined and then used afterwards (**this is the main relationship of concern for Data Flow Testing).
  - D-D: X is defined and then redefined with no usage in between (a potential error or, when multiple paths execute in parallel, a potential race condition).
  - U-D: X is used first and then defined afterwards (a potential error).
  - U-U: X is used and then used again later (there is no impact on X, so it is not considered for testing).
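A minimal sketch of how the four relationships could be read off a sequence of definitions and uses. It assumes straight-line code given as ("def" or "use", variable) events; the helper name `relationships` and the sample trace are hypothetical.

```python
def relationships(events, var):
    """Return the D/U relationship between each pair of consecutive events on var."""
    ops = ["D" if kind == "def" else "U" for kind, name in events if name == var]
    return [f"{first}-{second}" for first, second in zip(ops, ops[1:])]

# Hypothetical trace for X: defined, redefined, used, used again, then defined.
trace = [("def", "X"), ("def", "X"), ("use", "X"), ("use", "X"), ("def", "X")]
print(relationships(trace, "X"))
```

Only the D-U pairs become test targets; D-D and U-D pairs are flagged for inspection, and U-U pairs are ignored.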

Main Steps in Data Flow Testing
• The main steps in Data Flow Testing are:
  - Build and verify the Data Dependency Graph (DDG).
  - Define and select the data slice of interest to cover when developing the test case.
  - Develop the test case by deciding what input values to use (the key to DFT).
  - Execute the test case and analyze the result.

A Simple Data Dependency Graph (DDG)
• Instead of control flow, as in execution paths, we use the data and depict the “flow of data” or the “relation among data”.
• Example:
    integer x, y, z;
    input x, y;
    z = y + x;
[Figure: DDG with nodes x and y, each with an arrow to node z]
  - The nodes (circles) represent data items.
  - The links (arrows) represent the flow of x and y to z, or the dependency of z on x and y.
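One possible way to hold this small DDG in code is an adjacency mapping, where an edge from a to b means that a “is used by” b. This is a sketch only; the representation and the helper name `depends_on` are assumptions, not part of the slides.

```python
# DDG for: input x, y; z = y + x;
# Each key is a data item; its list holds the items that use it.
ddg = {"x": ["z"], "y": ["z"], "z": []}

def depends_on(graph, node):
    """Return the data items that flow into node (its direct dependencies)."""
    return sorted(src for src, targets in graph.items() if node in targets)

print(depends_on(ddg, "z"))  # the items z is computed from
```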

Characterizing the Data Dependency Graph (DDG)
• We are mostly interested in the D-U relationship when performing DFT.
• Each node in a Data Dependency Graph (DDG) represents a data item, x, and the nodes may be classified in 3 ways:
  - Output or result node of some computation or assignment. This node will most likely express x in terms of (be linked from) some other node.
  - Input or constant node, which represents x as user-provided input or a pre-defined constant. It usually links to some other node.
  - Intermediate or storage node, where x is neither an input nor an output; x is most likely intermediate storage (C-Use) that facilitates some computation. It is usually both linked from some other node and linked to some other node.
• The relation modeled is D-U, and the arrow from x to z depicts that x “is used by” z.
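The three node kinds follow directly from whether a node has incoming and outgoing links. The sketch below assumes the DDG is stored as an adjacency mapping (an edge from a to b means a “is used by” b); the function name and the sample graph are hypothetical.

```python
def classify_nodes(graph):
    """Label each DDG node by its incoming and outgoing links."""
    linked_to = {t for targets in graph.values() for t in targets}
    kinds = {}
    for node, targets in graph.items():
        has_in, has_out = node in linked_to, bool(targets)
        if has_in and has_out:
            kinds[node] = "intermediate"
        elif has_in:
            kinds[node] = "output"
        elif has_out:
            kinds[node] = "input/constant"
        else:
            kinds[node] = "disconnected"  # likely a dead node
    return kinds

# Hypothetical graph for: t = a + b; r = t * c;
ddg = {"a": ["t"], "b": ["t"], "c": ["r"], "t": ["r"], "r": []}
print(classify_nodes(ddg))
```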

Data Used in a Predicate Node (P-Use)
• Data used for P-use is depicted a little differently. Consider a “segment”:
    input w;
    if w ≥ 3 then z = x;
    else z = y;
[Figure: DDG with nodes x, y, w, 3, and z; the arrows into z are labeled w ≥ 3 (from x) and w < 3 (from y); dotted arrows mark the P-use of w and 3]
• Note that data item w is used mainly for P-use.
• Note that the constant 3 is also used for P-use.
• The dotted arrows depict the P-use relation.

Generating Inputs to Drive the Data Flow Test
• The basic concept in DFT is to design test cases (with the appropriate inputs) to cover the D-U relationships in the DDG:
  - C-use
  - P-use
[Figure: DDG for the segment “input w; if w ≥ 3 then z = x; else z = y;”, with x, y, w, and the constant 3 feeding z]
1) We need to design test cases with data items ‘x’ and ‘y’ “defined” for C-use, and with different values of data item ‘w’ defined for P-use.
2) Then display ‘z’ for the test result analysis.
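A minimal sketch of such test cases for the segment “input w; if w ≥ 3 then z = x; else z = y;”. It assumes x and y are supplied as parameters (the slide only shows w as input), and the function name `segment` and the chosen values are illustrative. Giving x and y different values lets the displayed z reveal which definition reached it.

```python
def segment(w, x, y):
    """The slide's segment: z takes x when w >= 3, otherwise y."""
    if w >= 3:      # P-use of w and of the constant 3
        z = x       # C-use of x
    else:
        z = y       # C-use of y
    return z

# Inputs chosen to cover both outcomes of the predicate, including the boundary.
test_cases = [
    {"w": 3, "x": 10, "y": 20, "expected_z": 10},  # boundary, w >= 3
    {"w": 5, "x": 10, "y": 20, "expected_z": 10},  # w >= 3
    {"w": 2, "x": 10, "y": 20, "expected_z": 20},  # w < 3
]

for case in test_cases:
    z = segment(case["w"], case["x"], case["y"])
    print(f"w={case['w']}: z={z}, expected {case['expected_z']}")
```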

Generic Procedure for DDG Construction
1) Identify the output or result data items of interest.
2) Backward chain to resolve (trace) these data items in terms of other data items (both variables and constants) by consulting the source (program, pseudo code, specification).
  - This backward chain is often called a “slice”, or a “data slice”.
3) If any data item is left unresolved during the trace, repeat steps 1 and 2 for that item. Continue until no unresolved data item is left.
Note that:
  i) with this construction mechanism, every “leaf” node at the “top” of a DDG must be an “input/assignment” data item node or a “constant” data item node;
  ii) we may need to construct several DDGs stepwise for a complete specification;
  iii) a node that is not connected to any DDG or to any other node is likely a “dead node”, which is extraneous and potentially an error.
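The backward-chaining procedure can be sketched as a simple worklist loop. This is an illustration under assumptions: the dependencies have already been extracted from the source into a mapping from each data item to the items it is computed from, and the names `backward_slice`, `total`, `tax`, and so on are hypothetical.

```python
def backward_slice(definitions, target):
    """Collect every data item that target depends on, directly or indirectly."""
    slice_items = set()
    unresolved = [target]
    while unresolved:                       # repeat until nothing is unresolved
        item = unresolved.pop()
        if item in slice_items:
            continue                        # already traced
        slice_items.add(item)
        unresolved.extend(definitions.get(item, []))
    return slice_items

# Hypothetical program: tax = price * rate; total = price * qty + tax;
# Inputs and constants map to an empty list; "unused" is never referenced.
definitions = {
    "total": ["price", "qty", "tax"],
    "tax": ["price", "rate"],
    "price": [], "qty": [], "rate": [], "unused": [],
}

data_slice = backward_slice(definitions, "total")
print(sorted(data_slice))
print(sorted(set(definitions) - data_slice))  # candidates for "dead nodes"
```

Items that appear in no slice of any output are the disconnected nodes that note iii) warns about.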

An “Awkward” Data Dependency Graph Example
• Pseudo code example:
    int limit = 10;
    input y;
    input x;
    for (int i = 1; i < limit; i++)
    {   x = x + i;
        y = y + i * i;   }
    print("x = ", x, "y = ", y);
[Figure: DDG with nodes 10, limit, i, i++, i < limit, x, y, i * i, x + i, y + i * i, print x, and print y]
• Note the “awkwardness” of the DDG when there is a loop; also look at “i++” more carefully.

Picking up a “Data Slice” Related to “print y”
• Pseudo code example:
    int limit = 10;
    input y;
    input x;
    for (int i = 1; i < limit; i++)
    {   x = x + i;
        y = y + i * i;   }
    print("x = ", x, "y = ", y);
[Figure: the DDG for the loop example, with the nodes that lead to print y picked out as the data slice]
• Use the DDG to follow the D-U paths in a data slice.
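A runnable sketch of the loop example shows what the “print y” slice implies for test design. It assumes that “i2” in the pseudo code means i squared and that the loop runs i from 1 to limit - 1; the function name `run_loop` is hypothetical. Because x is not in the slice for y, changing the input x should leave the printed y unchanged.

```python
def run_loop(x, y, limit=10):
    """Python version of the slide's loop; returns the final (x, y)."""
    for i in range(1, limit):   # i = 1 .. limit - 1
        x = x + i
        y = y + i * i           # assumption: "i2" means i squared
    return x, y

# Test inputs for the "print y" slice: vary y, and vary x to confirm it has no effect on y.
_, y_a = run_loop(x=0, y=5)
_, y_b = run_loop(x=100, y=5)
print(y_a, y_b)                 # both reflect only the input y and the loop counter
```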

Picking up a “Slice” Related to “print y”
• Pseudo code example:
    int limit = 10;
    input y;
    input x;
    for (int i = 1; i < limit; i++)
    {   x = x + i;
        y = y + i * i;   }
    print("x = ", x, "y = ", y);
[Figure: control flow graph with nodes limit = 10, input y, input x, i = 1, i < limit, x = x + i, y = y + i * i, i = i + 1, print y, and print x]
** Some find the “control flow” a little easier to follow than the “data dependency” when looking at the D-U “paths” of data.