Chapter 6: Software Verification. Prof. Steven A. Demurjian Computer Science & Engineering Department The University of Connecticut 371 Fairfield Road, Box U-2155 Storrs, CT 06269-2155. steve@engr.uconn.edu http://www.engr.uconn.edu/~steve (860) 486 – 4818 (860) 486 – 3719 (office).
Overview of Chapter 6 • Motivation: Goals and Requirements of Verification • Approaches to Verification • Testing • Goals • Theoretical Foundations • Empirical Testing Principles • Testing in the Small/Large • Separation of Concerns and Testing • Concurrent and Real-Time Systems • Object-Oriented Systems • Informal Analysis Techniques • Debugging/Role of Source Code Control • Verifying Software Properties
Motivation: Goals and Requirements • What kind of Assurance do we get through Testing? • Information Assurance (Info Used as Expected) • Security Assurance (Info not Misused) • What Happens in Other Engineering Fields? • Civil Engineering – Design and Build a Bridge • Requirements (Mathematical Formulas) • Modeling (Wind Tunnels + Prototypes) • Practical (Tensile Strength of Steel, Weight Bearing of Concrete/Cement) • When the Bridge is Built and Loaded with (Worst Case) Semis Filled with Cargo in both Directions, it Must Not Fail • Verify Product (Bridge) and Process (Construction) • Reality: All Parties in the Process are Fallible!!
Motivation • Verification in Computing Typically Accomplished by Running Test Cases • Not All Possible Executions of Software Tested • Evaluation of Documentation, User Friendliness, Other Software Characteristics Often Ignored • Verification of Running Deployed Code Difficult (If Not Impossible) • State of CT Insurance Dept. Project • Most Divisions Do Alpha/Beta Testing • One Division Wants to Just Jump Right to System without any Testing • Their Current System is Full of Holes and Allows Incorrect/Inconsistent Data to be Entered • Is this Reasonable in Some Situations?
Motivation • Verification Process Itself Must be Verified • This Means that the Software Process Itself Must be Verified • Consider CMU’s Software Engineering Institute • www.sei.cmu.edu/ • Capability Maturity Model Integration (CMMI) • Many Companies Strive to Attain Certain SEI Level in their Software Management/Development • In addition, Recall Software Qualities • Correctness, Performance, Robustness, Portability, Visibility, etc. • How are These Verified? Empirically? Qualitatively? • Consider Portability: Web App – How do you Make sure it Works for all Browser/OS Platforms?
Motivation • Results of Verification Likely NOT Binary • Don’t get 0 or 1 result – Often Must Assess Result • Errors in Large Systems are Unavoidable • Some Errors are Left and are “Tolerable” • Others are Critical and Must be Repaired • Correctness is Relative Term • Consider the Application, Market, Cost, Impact, etc. • Verification is Subjective and Objective • Subjective: Reusable, Portable, etc. • Objective: • Correctness (Perform Tests) • Performance (Response Time, Resource Usage, etc.) • Portable (Can you try all Compiler/OS Combos?) • Mobile: Does it Work on Every Device?
Approaches to Verification • Testing: Experimenting with Product Behavior • Explore the Dynamic Behavior • Execute SW Under Different Conditions • Seek Counter-examples/Potential Failure Cases • Detail Scenarios of Usage/Test Scenarios • Involve Customer – Who Knows Domain, Business Logic, etc., to Formulate Test Cases • Analysis: Examining Product w.r.t. Design, Implementation, Testing Process, etc. • Deduce Correct SW Operation as a Logical Consequence of Design Decisions, Input from Customer, etc. • Static Technique – But Impossible to Verify if SW Engineers Correctly and Precisely Translated Design to Working Error-Free Code
Testing • Brief Motivation – Content and Techniques • Four Goals of Testing • Theoretical Foundations • Formalizing Program Behavior • Testing as Input that Produces Output • Empirical Testing Principles • How Does Programming Language Influence Testing • Testing in the Small/Large • Separation of Concerns and Testing • Concurrent and Real-Time Systems • Object-Oriented Systems • Users and Domain Specialists – Role in Testing • Case Study of CT Insurance Dept. Project
Motivating Testing • Testing Can Never Consider All Possible Operating Conditions • Approaches Focus on Identifying Test Cases and Scenarios of Access for Likely Behavior • If a Bridge is OK at 1 Ton, it is OK at < 1 Ton • What is an Analogy in Software? • If a System Works with 100,000 Data Items, it may be Reasonable to Assume it Works for < 100,000 Items • Problems with Software? • Hard to Identify Other Scenarios that Completely Cover All Possible Interactions and Behavior • Software Doesn’t have Continuity of Behavior • May Exhibit Correct Behavior in Infinitely Many Cases, but Still be Incorrect in Some Cases • Ex: C and bitwise or in If Statement Story
Motivating Testing • What’s a Realistic Example?
procedure binary-search (key: in element; table: in elementTable; found: out Boolean) is
begin
  bottom := table'first; top := table'last;
  while bottom < top loop
    if (bottom + top) rem 2 ≠ 0 then middle := (bottom + top - 1) / 2;
    else middle := (bottom + top) / 2;
    end if;
    if key ≤ table (middle) then top := middle;
    else bottom := middle + 1;
    end if;
  end loop;
  found := key = table (top);
end binary-search;
• Note on the First else Branch (middle := (bottom + top) / 2): if it is Omitted, the Routine Still Works Whenever that else is Never Hit (i.e., if the Size of the Table is a Power of 2)
Four Goals of Testing • Dijkstra: “Program testing can be used to show the presence of bugs, but never to show their absence.” • “Notes on Structured Programming,” 1970 • www.cs.utexas.edu/users/EWD/ewd02xx/EWD249.PDF • Classic Article Still True Today • Simply Point to any Major Software Release from OS to Game Boy Games • Pure Testing Cannot Absolutely Prove Correctness • Testing Must be Used in Conjunction with Other Techniques • Need Sound and Systematic Principles
Four Goals of Testing • Goal 1: Testing Must be Based on Sound and Systematic Techniques • Test Different Execution Paths • Provides Understanding of Product Reliability • May Require the Insertion of Testing Code (e.g., Timing, Conditional Compilation, etc.) • Goal 2: Testing Must Help Locate Errors • Test Cases Must be Organized to Assist in the Isolation of Errors • This Facilitates Debugging • This Requires Well Thought Out Testing Paradigm as Software is Developed
What is Conditional Compilation? • Use Preprocessor Directives #define and #ifdef • If debugflag is Defined – Prints are in the Executable • Prints in Every Procedure/Function to Track Flow and Locate Where an Error Occurs – Enter w/o Exit
#include <stdio.h>
#define debugflag 1
int main(void)
{
#ifdef debugflag
  /* compiled in only when debugflag is defined */
  printf("Entering main code\n"); fflush(stdout);
#endif
  /* CODE FOR MAIN */
#ifdef debugflag
  printf("Exiting main code\n"); fflush(stdout);
#endif
  return 0;
}
Four Goals of Testing • Goal 3: Testing Must be Repeatable • Same Input Produces the Same Output • Execution Environment Influences Repeatability • if x=0 then return true else return false; • If x is not Initialized, Behavior Can’t be Predicted • In C, the Memory Location is Accessed for a Value (Which Could have Data from a Prior Execution) • Goal 4: Testing Must be Accurate • Depends on the Precision of SW Specification • Test Cases Created from Scenarios of Usage • Mathematical Formulas for Expressing SW can Greatly Assist in Testing Process • Remember Logic Specs in Chapter 5…
Theoretical Foundations of Testing • Let P be a Program with Domain D and Range R • P: D → R (may be partial) • P is a Function • Let OR Denote Output Requirements as Stated in P’s Specification • Correctness Defined by: OR ⊆ D × R • Let d be an Element in D • P(d) correct if <d, P(d)> ∈ OR • P correct if all P(d) are correct • Failure: For some d, P(d) does not Satisfy OR which Indicates an Error or Defect (P returns Wrong Result) • Fault: Incorrect Intermediate State of Program Execution (P Fails/Crashes)
Theoretical Foundations of Testing • Failure: For some d, P(d) does not Satisfy OR which Indicates an Error or Defect – Possibilities Include: • Undefined Result (Error State) • Wrong/Incorrect Result • Error: There is a Defect that Causes a Failure • Typing Mistake (x typed instead of y) • Programmer Forgot a Condition (x=0) • Fault: Incorrect Intermediate State of Program Execution
Theoretical Foundations of Testing • Test Cases • Key Issue is to Identify the Individual Test Cases and the Set of Tests that are in Domain D • Tests can be Designed to both: • Test Successful Outcomes (Expected Positives) • Test Incorrect Outcomes (Expected Failures for d in D) • Test Case t is an Element of D • Test Set T is a Finite Subset of D • Test t is Successful if P(t) is Correct • Test Set is Successful if P is Correct for all t in T • Test Set T is Ideal if, Whenever P is Incorrect, there Exists d in T such that P(d) is Incorrect
Theoretical Foundations of Testing • Test Criterion C Defines Finite Subsets of D: C ⊆ 2^D • Test Set T Satisfies C if it is an Element of C, e.g.: • C = {<x1, x2, ..., xn> | n ≥ 3 ∧ ∃ i, j, k (xi < 0 ∧ xj = 0 ∧ xk > 0)} • Two Test Sets that Satisfy C: • <-5, 0, 22> • <-10, 2, 8, 33, 0, -19> • <1, 3, 99> Does not Satisfy C – Why?
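The example criterion C can be read as a predicate over test sets. The following is a minimal Python sketch, assuming a test set is represented as a tuple of integers; the function name `satisfies_c` is hypothetical and not part of the slides.

```python
def satisfies_c(test_set):
    """Return True if the test set has at least three values and
    contains a negative value, a zero, and a positive value."""
    return (
        len(test_set) >= 3
        and any(x < 0 for x in test_set)
        and any(x == 0 for x in test_set)
        and any(x > 0 for x in test_set)
    )


print(satisfies_c((-5, 0, 22)))              # True
print(satisfies_c((-10, 2, 8, 33, 0, -19)))  # True
print(satisfies_c((1, 3, 99)))               # False: no negative value and no zero
```

The third set has three elements, so it meets the size requirement; it is rejected only because it lacks a negative value and a zero.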
Theoretical Foundations of Testing • Properties of Criteria • C is Consistent • For any Pair T1, T2 Satisfying C, T1 is Successful iff T2 is Successful • Either of them Provides the “same” Information • C is Complete • If P is Incorrect, there is a Test Set T of C that is not Successful • C is Complete and Consistent • Identifies an Ideal Test Set • Allows Correctness to be Proved! • Very Difficult to Achieve in Practice for “Reasonably” Sized Complex Applications • May be Required for Some Domains
Theoretical Foundations of Testing • What are Potential Problems/Roadblocks? • Impossible to Derive Algorithm that States Whether a Program, Test Set, or Criterion Satisfies the Prior Definitions • Impossible to Determine if d is in a Test Set T • Impossible to Determine an Ideal Test Set • Not Decidable (CSE3500) Whether a Test Set Satisfies Criteria or Not • As a Result, Full Mechanization (Automation) of Testing is Not Possible • Instead, Testing Requires Common Sense, Ingenuity, Domain Knowledge (User Community), and Most Critically: A Methodological Approach!
Empirical Testing Principles • Leverage Theoretical Concepts for a Systematic Methodological Approach to Empirical Testing • Only Exhaustive Testing can Prove Correctness with Absolute Certainty (Usually Infeasible) • Overall Goal: Run “Sufficient” Number of Tests that have the Potential to Uncover Errors
if X > Y then max := X; else max := X; end if;   /* Error: the else branch should assign Y */
/* Test Set Below Detects the Error */ {<x = 3, y = 2>, <x = 2, y = 3>}
/* Test Set Below Does Not */ {<x = 3, y = 2>, <x = 4, y = 3>, <x = 5, y = 1>}
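The difference between the two test sets can be checked mechanically. The sketch below is a Python rendering that assumes the defect is an else branch that assigns x instead of y, which is what makes the second test set (where x > y in every case) unable to detect it; `buggy_max` and `failing_cases` are hypothetical names.

```python
def buggy_max(x, y):
    # Assumed defect for illustration: the else branch returns x instead of y.
    if x > y:
        return x
    else:
        return x


def failing_cases(test_set):
    """Return the test cases whose output differs from the built-in max."""
    return [(x, y) for (x, y) in test_set if buggy_max(x, y) != max(x, y)]


detecting = [(3, 2), (2, 3)]
non_detecting = [(3, 2), (4, 3), (5, 1)]

print(failing_cases(detecting))      # [(2, 3)]
print(failing_cases(non_detecting))  # []
```

The larger test set finds nothing because none of its cases ever enters the else branch, so the number of tests matters less than which parts of the program they exercise.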
Empirical Testing Principles • Testing Criteria are Needed in Practice to Define Significant Test Cases • Group Domain Elements by Expected Behavior • Representative Test Case from Each Group • Complete Coverage Principle: Choose Test Cases so that the Union of all Test Sets Covers D • Example, where n is the Input Value:
n < 0: print error message
0 ≤ n < 20: print n!
20 ≤ n ≤ 200: print n! in FP format
n > 200: print error message
Complete Coverage Principle • Try to Group Elements of D into Subdomains D1, D2, …, Dn where any Element of each Di is Likely to have Similar Behavior • D = D1 ∪ D2 ∪ … ∪ Dn • Select one Test as a Representative of the Subdomain • If Dj ∩ Dk = ∅ for all j ≠ k (Partition), any Element can be Chosen from each Subdomain • Otherwise Choose Representatives to Minimize Number of Tests, yet Fulfilling the Principle
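As an illustration of the principle, the four input ranges of the factorial example (n < 0, 0 ≤ n < 20, 20 ≤ n ≤ 200, n > 200) form a partition of the integers, so one representative per subdomain is enough to satisfy it. The Python sketch below makes that selection explicit; the subdomain labels and the chosen values are assumptions made for illustration, and the boundary values are an optional addition rather than something the principle itself requires.

```python
def classify(n):
    """Return a label for the subdomain of the factorial example that n falls into."""
    if n < 0:
        return "error_negative"
    if n < 20:
        return "exact_factorial"
    if n <= 200:
        return "floating_point_factorial"
    return "error_too_large"


# One representative per subdomain (the values are arbitrary picks).
representatives = [-5, 7, 100, 500]

# Values on either side of each boundary between subdomains.
boundaries = [-1, 0, 19, 20, 200, 201]

covered = {classify(n) for n in representatives}
print(sorted(covered))
print([(n, classify(n)) for n in boundaries])
```

Because the subdomains do not overlap, every input maps to exactly one label, and the set of labels reached by the representatives shows at a glance whether any subdomain was left untested.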
Empirical Testing Focus • Testing in the Small: Test Individual “Small” Pieces • Testing a “Small” Module, a Class, or Methods of a Class • Testing in the Large: Test Larger Scale Modules (Collections of Pieces) • Testing an Inheritance Hierarchy or Set of Related Classes or a “Java Bean” or a Component • Both are Achieved via: • BLACK BOX (Functional) Testing • Partition based on the Module’s Specification • Tests what the program is supposed to do • WHITE BOX (Structural) Testing • Partition based on Module’s Internal Code • Tests what the program does
Testing in the Small • WHITE BOX (Structural) Testing • Testing Software Using Information about Internal Structure – May Ignore the Specification • Tests what the Software Actually Does • Performed Incrementally During Creation of Code • Relies on Statements, Structure of Code Itself • BLACK BOX (Functional) Testing • Testing without Relying on the Way the Code was Designed and Coded – User/Domain Testing • Evaluated Against the Specification • Tests what the Software is Supposed to Do • Tests by Software Engineer and Domain User • May be Performed as Part of Verification
Testing in the Small – White Box Testing • Focus on Structural Coverage Testing of the Code Itself w.r.t. Various Statements and Execution Paths • Consider the Code:
begin
  read (x); read (y);
  while x ≠ y loop
    if x > y then x := x - y;
    else y := y - x;
    end if;
  end loop;
  gcd := x;
end;
• Testing Must Consider • While Loop • Conditional • Each May Require Different Test Cases • We’ll Consider Control Flow Coverage Criteria • Statement coverage • Edge coverage • Condition coverage • Path coverage
Statement Coverage Criterion • Test Software to Execute All Statements at Least Once • Formally: • Select a Test Set T s.t. every Elementary Stmt. in P is Executed at Least Once by some d in T • Objective: Try to Minimize the Number of Test Cases while still Preserving the Desired Coverage
read (x); read (y);
if x > 0 then write ("1");
else write ("2");
end if;
if y > 0 then write ("3");
else write ("4");
end if;
{<x = 2, y = 3>, <x = -13, y = 51>, <x = 97, y = 17>, <x = -1, y = -1>} covers all statements
{<x = -13, y = 51>, <x = 2, y = -3>} is minimal
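Statement coverage of the fragment above can be measured by recording which write statements run. The following is a small Python sketch under the assumption that each write is tracked by a label; `program` and `covers_all_statements` are hypothetical helper names.

```python
ALL_STATEMENTS = {"write 1", "write 2", "write 3", "write 4"}


def program(x, y, executed):
    """Python rendering of the two-conditional fragment; records which write ran."""
    if x > 0:
        executed.add("write 1")
    else:
        executed.add("write 2")
    if y > 0:
        executed.add("write 3")
    else:
        executed.add("write 4")


def covers_all_statements(test_set):
    """Return True if running every test case executes all four writes."""
    executed = set()
    for x, y in test_set:
        program(x, y, executed)
    return executed == ALL_STATEMENTS


print(covers_all_statements([(2, 3), (-13, 51), (97, 17), (-1, -1)]))  # True
print(covers_all_statements([(-13, 51), (2, -3)]))                     # True
print(covers_all_statements([(2, 3)]))                                 # False
```

Two test cases are enough because each one executes one branch of each conditional, and the pair is chosen so that the branches taken are complementary.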
Problem with Statement Coverage • A Particular Test Case While Covering All Statements May not Fully Test the Software
if x < 0 then x := -x; end if; z := x;
• {x = -3} Covers All Statements but Does Not Exercise the Case when x is Positive and the then Branch is Not Entered • Solution: Rewrite the Code as:
if x < 0 then x := -x; else null; end if; z := x;
• Coverage Now Requires Testing both x < 0 and x >= 0 for Completeness
Edge Coverage Criterion • Informally: • Consider Each Program as a Control Flow Graph that Represents the Overall Program Structure • Edges Represent Statements • Nodes at the Ends of an Edge Represent Entry into the Statement and Exit • Intent: Examine Various Execution Paths to Make sure that Every Edge is Visited at Least Once • Formally: • Select Test set T such that Every Edge (Branch) of the Control Flow is Exercised at least once by some d in T • Overall: Edge Coverage is Finer Grained than Statement Coverage
Edge Coverage Criterion Graphs [Figure: control flow graph fragments for an I/O, assignment, or procedure call statement; if-then-else; if-then; two sequential statements; and while loop, where G1 and G2 denote the subgraphs of nested statements]
Simplification Possible A Sequence of Edges can be Collapsed into just one edge
Example: Euclid’s Algorithm • Code and its Corresponding Control Flow Graph
begin
  read (x); read (y);
  while x ≠ y loop
    if x > y then x := x - y;
    else y := y - x;
    end if;
  end loop;
  gcd := x;
end;
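To see which edges of the control flow graph a given input exercises, the algorithm can be instrumented to record the branch taken on each loop iteration. This is a Python sketch, assuming positive integer inputs (the subtraction-based loop does not terminate if one input is 0 and the other is not); returning the branch list alongside the result is an addition made for illustration.

```python
def gcd(x, y):
    """Subtraction-based Euclid for positive integers.
    Returns the gcd and the list of branches taken inside the loop."""
    branches = []
    while x != y:
        if x > y:
            x = x - y
            branches.append("then")
        else:
            y = y - x
            branches.append("else")
    return x, branches


print(gcd(4, 4))  # (4, []): the loop body is skipped entirely
print(gcd(6, 4))  # (2, ['then', 'else'])
print(gcd(3, 9))  # (3, ['else', 'else'])
```

The input (6, 4) alone traverses both branches of the conditional as well as the loop entry and exit, while (4, 4) covers the case where the loop executes zero times.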
Problem with Edge Coverage
found := false; counter := 1;
while (not found) and counter < number_of_items loop
  if table (counter) = desired_element then found := true;
  end if;
  counter := counter + 1;
end loop;
if found then write ("the desired element is in the table");
else write ("the desired element is not in the table");
end if;
• Test Cases that Satisfy Edge Coverage: • (1) Empty Table • (2) Table with 3 Items, Second of which is the Item to Find • These DO NOT DISCOVER THE ERROR (< Instead of ≤ in the Loop Condition)
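The weakness of these two test cases can be reproduced directly. The sketch below is a Python rendering of the search loop that deliberately keeps the `<` defect; it assumes the table is a Python list and keeps the 1-based counter of the original, so `table[counter - 1]` is read.

```python
def search(table, desired_element):
    """Python rendering of the search loop, keeping the '<' defect."""
    found = False
    counter = 1
    number_of_items = len(table)
    while (not found) and counter < number_of_items:
        if table[counter - 1] == desired_element:
            found = True
        counter = counter + 1
    return found


# The two edge-coverage test cases both give the expected answer.
print(search([], 5))         # False (correct)
print(search([1, 2, 3], 2))  # True (correct)

# A case outside that test set: the desired element is the last item.
print(search([1, 2, 3], 3))  # False, although 3 is in the table
```

The last position is never examined because the loop stops once the counter reaches `number_of_items`, so any test whose target sits at the end of the table (including a one-element table) exposes the defect.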
Condition Coverage Criterion • Informally: • Utilize the Control Flow Graph Expanded with Testing of Boolean Expressions in Conditionals • Intent: Expand Execution Paths with Values • Formally: • Select a test set T s.t. every edge of P’s Control Flow is traversed (Edge Coverage) and • All Possible Values of the Constituents of Compound Conditions are Exercised at Least Once • Overall: Condition Coverage is Finer Grained than Edge Coverage
Condition Coverage
found := false; counter := 1;
while (not found) and counter < number_of_items loop
  if table (counter) = desired_element then found := true;
  end if;
  counter := counter + 1;
end loop;
if found then write ("the desired element is in the table");
else write ("the desired element is not in the table");
end if;
• Expand with Test Cases Related to found, counter < number_of_items, etc. • (1) counter Less Than, Equal to, and Greater Than number_of_items • (2) Equality Satisfied or Not • (3) etc.
Problem with Condition Coverage
if x ≠ 0 then y := 5;
else z := z - x;
end if;
if z > 1 then z := z / x;
else z := 0;
end if;
• {<x = 0, z = 1>, <x = 1, z = 3>} Causes the Execution of All Edges for Each Condition, but Fails to Expose the Risk of a Division by Zero
Path Coverage Criterion • Informally: • Utilize the Control Flow Graph Expanded with Testing of Boolean Expressions in Conditionals • Expanded to Include All Possible “Paths” of Execution through Control Flow Graph • Don’t Just Cover Every Edge, but Explore all Alternate Paths from Start to Finish • Formally: • Select a Test Set T which Traverses all Paths from Initial to the Final Node of P’s Control Flow • Overall: Path Coverage is Finer than All Others so far… • But: • Number of Possibilities is Prohibitively Large • Impossible to Check All Possibilities - Exponential
Example of Path Coverage
if x ≠ 0 then y := 5;
else z := z - x;
end if;
if z > 1 then z := z / x;
else z := 0;
end if;
• {<x = 0, z = 1>, <x = 1, z = 3>} Covers Edges but Not All Paths • Adding {<x = 0, z = 3>, <x = 1, z = 1>} Covers the Two Remaining Paths, so that All Execution Paths are Tested • <x = 0, z = 3> Exposes the Division by Zero
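The four test cases in this example can be traced to see which path each one follows. This is a Python sketch of the fragment; the path labels are an assumption added for illustration, and Python's `ZeroDivisionError` stands in for whatever the original language would do on division by zero.

```python
def fragment(x, z):
    """Python rendering of the two-conditional fragment.
    Returns the pair of branches taken and the final value of z."""
    path = []
    if x != 0:
        y = 5  # assigned as in the original; not used afterwards
        path.append("then")
    else:
        z = z - x
        path.append("else")
    if z > 1:
        z = z / x
        path.append("then")
    else:
        z = 0
        path.append("else")
    return tuple(path), z


for x, z in [(0, 1), (1, 3), (1, 1), (0, 3)]:
    try:
        print((x, z), fragment(x, z))
    except ZeroDivisionError:
        print((x, z), "path else/then divides by zero")
```

The first two inputs follow the else/else and then/then paths, which together touch every edge; only the else/then path, reached by x = 0 with z > 1, triggers the division by zero.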
What is a Strategy for a Search of a Table? • Skip the Loop - number_of_items = 0 • Execute the Loop Once or Twice (Find Element Early) • Execute the Loop to Search the Entire Table
found := false; counter := 1;
while (not found) and counter < number_of_items loop
  if table (counter) = desired_element then found := true;
  end if;
  counter := counter + 1;
end loop;
if found then write ("the desired element is in the table");
else write ("the desired element is not in the table");
end if;
Guidelines for White-Box Testing • Testing Loops • Execute Loop Zero Times • Execute Loop Maximum Number of Times • Execute Loop Average Number of Times • Think about Individual Loops, How they Work (test condition at top or bottom), and are Used • Testing Conditionals (If and Case Statements) • Always Cover all Edges • Expand to Test Critical Paths of Interest • Other Considerations • Choose Criterion and Then Select Input Values • Select Criterion Based on The Code Itself • Different Criteria may be Applied to Different Portions of Code
Summary: Problems with White-Box Testing • Syntactically Indicated Behaviors (Statements, Edges, Etc.) are Often Impossible • Unreachable Code, Infeasible Edges, Paths, Etc. • An Unreachable Statement Means 100% Coverage Never Attained! • Adequacy Criteria May be Impossible to Satisfy • Manual Justification for Omitting Each Impossible Test Case • Adequacy “Scores” Based on Coverage • Example: 95% Statement Coverage • Other Possibilities: • What if Code Omits Implementation of Some Part of the Specification? • White Box Test Cases Derived from the Code Will Ignore that Part of the Specification!
Module and Unit Testing Tools • Myriad of Products Available • JUnit junit.sourceforge.net/ • Objective-C OCUnit cocoadev.com/wiki/OCUnit • Check for C check.sourceforge.net/ • C/C++ Google Test • MS Test for Visual Studio • Mobile Platforms • Titanium Recommends jsunity (JavaScript) • Sencha Touch (Selenium, CasperJS, Siesta)
Testing in the Small – Black Box Testing • Treat Class, Procedure, Function, as a Black Box • Given “What” Box is Supposed to Do • Understand its Inputs and Expected Outputs • Execute Tests and Assess Results • Formulate Test Cases Based on What Program is Supposed to Do without Knowing • Programming Paradigm (OO, Functional, etc.) • Code Structure (Modularity, Inheritance, etc.) [Figure: Inputs → Class / Procedure / Function → Expected Outputs]
Consider Sample Specification • The program receives as input a record describing an invoice. (A detailed description of the format of the record is given.) • The invoice must be inserted into a file of invoices that is sorted by date. • The invoice must be inserted in the appropriate position • If other invoices exist in the file with the same date, then the invoice should be inserted after the last one. • Also, some consistency checks must be performed • The program should verify whether the customer is already in a corresponding file of customers, whether the customer’s data in the two files match, etc.
What are the Potential Test Cases? • An Invoice Whose Date is the Current Date • An Invoice Whose Date is Before the Current Date (This Might Even Be Forbidden By Law) • Possible Sub-test Cases: • An Invoice Whose Date is the Same as Some Existing Invoice • An Invoice Whose Date Does Not Exist in Any Previously Recorded Invoice • Several Incorrect Invoices, Checking Different Types of Inconsistencies
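Black-box test cases like these are written against the specification alone. As a hedged illustration, the sketch below assumes an invoice is a `(date, id)` pair and the invoice file is a Python list sorted by date; `insert_invoice` is a hypothetical stand-in for the program under test, and the customer consistency checks from the specification are left out.

```python
from bisect import bisect_right
from datetime import date


def insert_invoice(invoices, invoice):
    """Insert a (date, id) pair into a list sorted by date, placing it
    after any existing invoices with the same date. Returns a new list."""
    dates = [d for d, _ in invoices]
    position = bisect_right(dates, invoice[0])
    return invoices[:position] + [invoice] + invoices[position:]


existing = [
    (date(2024, 1, 1), "A"),
    (date(2024, 1, 5), "B"),
    (date(2024, 1, 5), "C"),
    (date(2024, 1, 9), "D"),
]

# Test case: same date as existing invoices, expected after the last of them.
same_date = insert_invoice(existing, (date(2024, 1, 5), "X"))
print([i for _, i in same_date])  # ['A', 'B', 'C', 'X', 'D']

# Test case: a date that no recorded invoice has.
new_date = insert_invoice(existing, (date(2024, 1, 7), "Y"))
print([i for _, i in new_date])   # ['A', 'B', 'C', 'Y', 'D']
```

Each expected position follows from the specification text, not from how the insertion is coded, which is what makes these checks black-box tests.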
Test Scenarios and Cases • Participants in Testing • User/Domain Specialists to Formulate Test Cases • Software Engineers Involved in Specification and Design • Software Developers • Software Testers • Sample Testing • State of CT Insurance Department Project • Constant Renewal of Agents and Agencies • Renewal Scenarios to Process “Batches” • Single vs. Multiple Renewals • Scan Slip of Paper (1/3 sheet with Bar Code) + Check • Develop Scenarios, Testing Procedures, and Cases • See Course Web Page…
Four Types of Black Box Testing • Testing Driven by Logic Specifications • Utilizes Pre and Post Conditions • Syntax-Driven Testing • Assumes Presence of Underlying Grammar that Describes What’s in Box • Focus on Testing Based on Grammar • Decision Table Based Testing • Input/Output or Input/Action Combinations Known in Advance – Outcome Based • Cause-Effect Graph Based Testing • If X and Y and … then A and B and … • Advance Knowledge of Expected Behavior in Combination
Logic-Specification Based Testing • Consider Logic Specification of Inserting Invoice Record into a File • As Written, Difficult to Discern What to Test
for all x in Invoices, f in Invoice_Files
{sorted_by_date(f) and not exist j, k (j ≠ k and f(j) = f(k))}
insert(x, f)
{sorted_by_date(f) and
 for all k (old_f(k) = z implies exists j (f(j) = z)) and
 for all k (f(k) = z and z ≠ x) implies exists j (old_f(j) = z) and
 exists j (f(j).date = x.date and f(j) ≠ x) implies j < pos(x, f) and
 result ≡ x.customer belongs_to customer_file and
 warning ≡ (x belongs_to old_f or x.date < current_date or ....)}
Logic-Specification Based Testing • Apply Coverage Criterion to Post Condition • Rewrite as Below – Easier to Formulate Tests
TRUE implies sorted_by_date(f)
 and for all k old_f(k) = z implies exists j (f(j) = z)
 and for all k (f(k) = z and z ≠ x) implies exists j (old_f(j) = z)
and (x.customer belongs_to customer_file) implies result
and not (x.customer belongs_to customer_file and ...) implies not result
and x belongs_to old_f implies warning
and x.date < current_date implies warning
and ....
Syntax-Driven Testing • Applicable to Software Whose Input is Formally Described by a Grammar • Compiler, ATM Machine, etc. • Recall State Machines – Know Allowable Combinations in Advance • Requires a Complete Formal Specification of Language Syntax • Specification is Utilized to Generate Test Sets • Consider ATM Machine with Formal Steps:
<validate pin> ::= <scan card> <enter pin>
<withdraw> ::= <enter amt> <check balance> <dispense> or <enter amt> <check balance> <deny>
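One way to generate a test set from such a grammar is to enumerate every sequence of steps it can derive, so that each production is exercised at least once. The Python sketch below assumes the two ATM rules are encoded as a dictionary of alternatives; the encoding and the `expand` function are hypothetical, and the approach only works as written for grammars without recursion.

```python
from itertools import product

# Hypothetical encoding of the two ATM rules: each nonterminal maps to a
# list of alternatives, and each alternative is a list of symbols.
GRAMMAR = {
    "<validate pin>": [["<scan card>", "<enter pin>"]],
    "<withdraw>": [
        ["<enter amt>", "<check balance>", "<dispense>"],
        ["<enter amt>", "<check balance>", "<deny>"],
    ],
}


def expand(symbol):
    """Return every sequence of terminal steps derivable from symbol.
    A symbol with no rule in GRAMMAR is treated as a terminal step."""
    if symbol not in GRAMMAR:
        return [[symbol]]
    sequences = []
    for alternative in GRAMMAR[symbol]:
        parts = [expand(s) for s in alternative]
        for combo in product(*parts):
            sequences.append([step for part in combo for step in part])
    return sequences


for sequence in expand("<withdraw>"):
    print(" ".join(sequence))
```

For `<withdraw>` this yields two step sequences, one ending in `<dispense>` and one ending in `<deny>`, which correspond to the two alternatives of the rule.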