A System to Generate Test Data and Symbolically Execute Programs. Lori A. Clarke. Presented by: Xia Cheng. Motivation. Program testing, and the need for automated systems to aid in it, is a growing and important problem. The limitations of the usual testing approach
A System to Generate Test Data and Symbolically Execute Programs Lori A. Clarke Presented by: Xia Cheng
Motivation • Program testing, and the need for automated systems to aid in it, is a growing and important problem • The usual testing approach has limitations • A novel system that generates test data and symbolically executes programs contributes to this area
Limitations of the Usual Approach • Relies solely on the intuition of the programmer • Creation of program assertions • Requires human interaction • Results may be questionable • Flaws in the assertions or limitations in the theorem prover, human or machine
Goal of this work • Implement a system to aid the selection of test data and the detection of program errors • Techniques used: • Symbolically execute programs • Generate test data as a solution to the set of path constraints
System Capabilities • Generates test data to drive execution down a program path • Detects nonexecutable program paths • Creates symbolic representations of the program’s output variables as functions of the program’s input variables. • Detects certain types of program errors.
Generating Test Data • Problem 1: • Generating test data to execute any specified statement in a program is analogous to the halting problem • Solution • Analyze only paths that are restricted to a maximum loop count or number of statements • The system requires that the paths be completely specified but leaves the criteria of path selection to the user
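One way to picture the loop-count restriction is a path enumeration that abandons any path revisiting a node too often. This is only a sketch: the slides say path selection is left to the user, so the enumeration itself, the graph `CFG`, and the names `bounded_paths` and `max_visits` are assumptions made for illustration.

```python
from collections import Counter

# Hypothetical control flow graph: node -> list of successor nodes.
# Node 2 heads a loop (2 -> 3 -> 2); node 4 is the exit.
CFG = {1: [2], 2: [3, 4], 3: [2], 4: []}


def bounded_paths(cfg, start, exit_node, max_visits):
    """Enumerate start-to-exit paths that visit no node more than max_visits times."""
    paths = []

    def walk(node, path, visits):
        if visits[node] >= max_visits:
            return  # bound reached: abandon this extension of the path
        visits[node] += 1
        path.append(node)
        if node == exit_node:
            paths.append(list(path))
        else:
            for succ in cfg[node]:
                walk(succ, path, visits)
        # Undo the bookkeeping before returning to the caller.
        path.pop()
        visits[node] -= 1

    walk(start, [], Counter())
    return paths


paths = bounded_paths(CFG, 1, 4, max_visits=2)
```

Without the bound the loop 2 -> 3 -> 2 would yield infinitely many paths; with `max_visits=2` only two remain, one skipping the loop body and one executing it once.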
Generating Test Data • Problem 2 • Satisfying the conditional statements on the path requires that the system be capable of solving arbitrary systems of inequalities • Solution • The inequalities will usually be relatively simple and often linear • Conjugate gradient method
Generating Test Data • Problem 3 • Array subscripts may depend on input data: A(1)=10 A(2)=0 IF(A(J).LT.5.)… • Solution • Ignore input and output statements except for the read and write variable lists
System Overview • The subject program is represented by a directed graph called the control flow graph • To generate test data for a control path, the relationships among the variables must be determined • To generate the constraints, the path is symbolically executed • Whenever a conditional transfer of control is encountered, one or more constraints are generated
Generating Test Data • A solution to the set of constraints is test data that will drive execution down the given path • If the set of constraints is inconsistent, then the given path is nonexecutable • Artificial constraints are temporarily created to increase the chance of detecting common programming errors
Symbolic execution • Variables are not assigned executed values but expressions denoting the evolution of the variables • Each constraint is passed to an inequality solver to check its consistency • For example
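The link between constraints and test data can be sketched with a brute-force search over a small integer box. This is not the system's inequality solver: the search, the bounds `lo` and `hi`, the variable names `X` and `Y`, and both constraint sets are assumptions chosen for illustration. A `None` result only shows that no solution lies inside the box, so it suggests rather than proves inconsistency.

```python
from itertools import product


def find_test_data(constraints, names, lo=-5, hi=5):
    """Return the first assignment in [lo, hi]^n satisfying every constraint, or None."""
    for values in product(range(lo, hi + 1), repeat=len(names)):
        env = dict(zip(names, values))
        if all(check(env) for check in constraints):
            return env
    return None


# Hypothetical path constraints over two integer inputs X and Y.
feasible = [
    lambda e: e["X"] + e["Y"] >= 3,
    lambda e: e["X"] - e["Y"] <= 0,
]
# X > Y and X < Y can never hold together, so no test data exists.
infeasible = [
    lambda e: e["X"] > e["Y"],
    lambda e: e["X"] < e["Y"],
]

data = find_test_data(feasible, ["X", "Y"])
none_found = find_test_data(infeasible, ["X", "Y"])
```

Any assignment returned for the first set is usable test data for its path; the second set has no solution at all, which corresponds to a nonexecutable path.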
Symbolic execution • Input values: I1->J, I2->K
      SUBROUTINE SUB(J,K)
      J = J + 1
      IF (J.GT.K) GO TO 10
      J = K - J
      GO TO 20
   10 J = J - K
   20 IF (J.GT.-1) GO TO 30
      J = -J
   30 RETURN
      END
Symbolic execution • Control path (consistent): 1-5, 7, 9 • I1+1<=I2
Symbolic execution • Control path (consistent): 1-5, 7, 9 • J=I2-I1-1
Symbolic execution • Control path (consistent): 1-5, 7, 9 • I2-(I1+1) > -1
Symbolic execution • Control path (inconsistent): 1-3, 6-9 • I1+1>I2
Symbolic execution • Control path (inconsistent): 1-3, 6-9 • J=I1+1-I2
Symbolic execution • Control path (inconsistent): 1-3, 6-9 • I1+1-I2<=-1
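The two walkthroughs of SUB can be reproduced with a short sketch that carries symbolic values instead of numbers. The representation is an assumption made for illustration: each value is a linear form a*I1 + b*I2 + c stored as the tuple (a, b, c), and each constraint is a pair (form, op) read as "form op 0". The function name `execute_path` and its two boolean arguments are likewise hypothetical.

```python
# A linear form a*I1 + b*I2 + c is stored as the tuple (a, b, c).
def add(p, q):
    return tuple(x + y for x, y in zip(p, q))


def neg(p):
    return tuple(-x for x in p)


def sub(p, q):
    return add(p, neg(q))


def execute_path(first_branch_taken, second_branch_taken):
    """Symbolically execute SUB along the path fixed by the two branch outcomes.

    Returns (constraints, final symbolic value of J).
    """
    j, k = (1, 0, 0), (0, 1, 0)      # on entry J = I1 and K = I2
    one = (0, 0, 1)
    constraints = []

    j = add(j, one)                  # J = J + 1
    diff = sub(j, k)                 # J - K, compared against 0
    if first_branch_taken:           # IF (J.GT.K) GO TO 10
        constraints.append((diff, ">"))
        j = sub(j, k)                # 10 J = J - K
    else:
        constraints.append((diff, "<="))
        j = sub(k, j)                # J = K - J

    test = add(j, one)               # J.GT.-1 is the same as J + 1 > 0
    if second_branch_taken:          # 20 IF (J.GT.-1) GO TO 30
        constraints.append((test, ">"))
    else:
        constraints.append((test, "<="))
        j = neg(j)                   # J = -J
    return constraints, j


consistent, j_a = execute_path(False, True)     # path 1-5, 7, 9
inconsistent, j_b = execute_path(True, False)   # path 1-3, 6-9
```

For the first path the forms read I1 - I2 + 1 <= 0 and I2 - I1 > 0, matching I1+1<=I2 and I2-(I1+1) > -1. For the second they read I1 - I2 + 1 > 0 and I1 - I2 + 2 <= 0, that is I1 - I2 > -1 together with I1 - I2 <= -2, which no input can satisfy.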
Structure of the Analysis Program • The system consists of a preprocessor, symbolic execution, constraint simplification, and an inequality solver • Preprocessor
Preprocessor • Built on DAVE, a data flow analysis program • DAVE translates the subject program into a list of tokens • DAVE creates a database of information about each program unit • Symbol table, COMMON table, label table, statement flow table
Intermediate Code Phase • What does this phase do? • Before the subject program is analyzed the token list is translated into an intermediate code similar to an assembly language • The intermediate code for each statement is stored in a doubly linked list that is attached to the corresponding node of the control flow graph • Intermediate code representing a conditional statement is attached to the corresponding edge of the graph.
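The storage layout described above can be sketched as follows. The slides do not show the actual intermediate code, so the instruction format (`op`, `args`), the class names `Instr`, `Node`, and `Edge`, and the opcodes in the example are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(eq=False)
class Instr:
    """One intermediate-code instruction in a doubly linked list (hypothetical format)."""
    op: str
    args: tuple
    prev: Optional["Instr"] = field(default=None, repr=False)
    next: Optional["Instr"] = field(default=None, repr=False)


def link(instrs):
    """Chain instructions into a doubly linked list and return its head."""
    for a, b in zip(instrs, instrs[1:]):
        a.next, b.prev = b, a
    return instrs[0] if instrs else None


@dataclass(eq=False)
class Node:
    """Control flow graph node holding the code list for one statement."""
    label: int
    code: Optional[Instr] = None
    edges: list = field(default_factory=list)  # outgoing Edge objects


@dataclass(eq=False)
class Edge:
    """Control flow graph edge; cond holds the code of a conditional transfer, if any."""
    target: Node
    cond: Optional[Instr] = None


# Statement "J = J + 1" followed by "IF (J.GT.K) GO TO 10" (invented opcodes).
assign = Node(2, code=link([Instr("ADD", ("t1", "J", 1)), Instr("STORE", ("J", "t1"))]))
branch = Node(3)
taken, fallthrough = Node(6), Node(4)
assign.edges.append(Edge(branch))
branch.edges.append(Edge(taken, cond=Instr("GT", ("J", "K"))))
branch.edges.append(Edge(fallthrough, cond=Instr("LE", ("J", "K"))))
```

Keeping the condition on the edge rather than on the node means that walking a path collects exactly the constraints of the branches that path takes.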
Intermediate Code Phase • For example
Intermediate Code Phase • Advantages • Allows the analysis to be more easily adapted to other languages • Makes it easy to fold constants and simplify the variable representation during analysis • Enables future optimization and detection of parallelism in the code
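Constant folding on a tree-shaped representation can be sketched in a few lines. The expression format used here, where an expression is an integer, a variable name, or a tuple (op, left, right), is an assumption for illustration and not the system's intermediate code.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def fold(expr):
    """Fold constant subexpressions bottom-up.

    expr is an int, a variable name (str), or a tuple (op, left, right).
    """
    if not isinstance(expr, tuple):
        return expr
    op, left, right = expr
    left, right = fold(left), fold(right)
    if isinstance(left, int) and isinstance(right, int):
        return OPS[op](left, right)  # both operands known: compute now
    return (op, left, right)


folded = fold(("+", "J", ("*", 2, 3)))
constant = fold(("-", ("+", 1, 2), 4))
```

Here `J + 2*3` shrinks to `J + 6`, while an expression made only of constants collapses to a single number, so the symbolic values carried along a path stay small.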
Path Selection • Static selection • A path is designated by a sequence of subprogram names, statement numbers, and loop counts. • Each path must satisfy the conditions: • It must be a control path • It can enter or return from a subprogram only when the corresponding code contains a procedure reference or return • Whenever a path enters a program unit the initial statement must be the first executable statement in the program unit • For example
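The first and third conditions can be checked mechanically for a single program unit. The sketch below does so for the SUB example, numbering its statements 1 to 9 by position so that the numbers agree with the paths 1-5, 7, 9 and 1-3, 6-9 used earlier. That numbering, the graph `SUB_CFG`, and the name `is_control_path` are assumptions, and subprogram entry and return are left out.

```python
# Successors of each statement of SUB, numbered by position:
# 1 SUBROUTINE, 2 J=J+1, 3 IF, 4 J=K-J, 5 GO TO 20,
# 6 (label 10) J=J-K, 7 (label 20) IF, 8 J=-J, 9 (label 30) RETURN.
SUB_CFG = {1: [2], 2: [3], 3: [4, 6], 4: [5], 5: [7], 6: [7], 7: [8, 9], 8: [9], 9: []}


def is_control_path(cfg, path, entry):
    """True if path starts at entry and every consecutive pair is an edge of cfg."""
    if not path or path[0] != entry:
        return False
    return all(b in cfg.get(a, []) for a, b in zip(path, path[1:]))


ok_a = is_control_path(SUB_CFG, [1, 2, 3, 4, 5, 7, 9], entry=1)
ok_b = is_control_path(SUB_CFG, [1, 2, 3, 6, 7, 8, 9], entry=1)
bad = is_control_path(SUB_CFG, [1, 2, 4], entry=1)
```

Both example paths pass this check, including the one whose constraints are inconsistent: being a control path is necessary for a path to be executable, not sufficient.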
Path Selection • Interactive selection (human oriented) • The user designates the starting subprogram unit • The user chooses one of the exit nodes