370 likes | 646 Views
Testing. Xiaojun Qi. Testing. There are two basic types of testing Execution-based testing Non-execution-based testing “V & V” Verification: Determine if the workflow was completed correctly Validation: Determine if the product as a whole satisfies its requirements (specification)
E N D
Testing Xiaojun Qi
Testing • There are two basic types of testing • Execution-based testing • Non-execution-based testing • “V & V” • Verification: Determine if the workflow was completed correctly • Validation: Determine if the product as a whole satisfies its requirements (specification) • Term “Testing” is used instead of “V & V”
Software Quality • The quality of software is the extent to which the product satisfies its specifications. • Defect is a generic term for • Fault: It happens when a human makes a mistake • Failure: It is the observed incorrect behavior of the software product as a consequence of a fault • Error: It is the amount by which a result is incorrect. • Every software professional is responsible for ensuring that his or her work is correct • Quality must be built in from the beginning
Quality Issues:Software Quality Assurance • The members of the SQA group must ensure the quality of the software process and thereby ensure the quality of the product. • Test at the end of each workflow • Test when the product is complete • Quality assurance must be applied to the process itself • Develop various standards to which the software must conform • Establish the monitoring procedures for assuring compliance with those standards.
Quality Issues:Managerial Independence • There must be managerial independence between the development group and the SQA group • Neither group should have power over the other • More senior management must decide whether to • Deliver the faulty product on time, or • Test further and deliver the product late • The decision must take into account the interests of the client and the development organization
Non-Execution-Based Testing • Non-execution-based testing means that testing software without running test cases. • Review software (Walkthrough or Inspection) is a common non-execution-based testing method. Its underlying principles are: • We should not review our own work • Group synergy: A document is painstakingly checked by a team of software professionals with a broad range of skills.
Non-Execution-Based Testing: Walkthroughs • A walkthrough team consists of four to six members • It includes representatives of • The team responsible the current workflow • The manager responsible for the current workflow • The team responsible for the next workflow • The SQA group • The clients
Managing Walkthroughs • The walkthrough is preceded by preparation • A list of items • Items not understood • Items that appear to be incorrect • The walkthrough team is chaired by the SQA representative since • The quality of the product is a direct reflection of the professional competence of the SQA group.
Managing Walkthroughs (Cont.) • In a walkthrough we detect faults for later correction, not correct them • A correction produced by a committee is likely to be of low quality • The cost of a committee correction is too high • Not all items flagged as faults are actually incorrect • A walkthrough should not last longer than 2 hours so there is no time to correct faults
Managing Walkthroughs (contd) • Since verbalization leads to fault finding, there are two ways of conducting a walkthrough: • Participant-driven • Document-driven • Document-driven is likely to be more thorough since: • The majority of faults are spontaneously detected by the presenter • A walkthrough should never be used for performance appraisal
Non-Execution-Based Testing: Inspections • An inspection has five formal steps • An overview of the document to be inspected • The preparation: List the statistics of fault types ranked by frequency • The inspection: Find and document the faults • The rework: The individual responsible for the document resolves all faults and problems noted in the written report. • The follow-up: Every issue has been resolved and all fixes must be checked to ensure that no new faults have been introduced.
Inspections (Cont.) • An inspection team has four members • Moderator: Both manager and leader • A member of the team performing the current workflow • A member of the team performing the next workflow • A member of the SQA group • IEEE standard also recommends: • Reader: Leads the team through the design • Recorder: Produce a written report of the detected faults.
An Essential Component of an Inspection: Fault Statistics • Faults are recorded by severity • Major faults: They cause premature termination or damage a database • Minor faults • Faults are recorded by fault type • Examples of design faults: • Not all specification items have been addressed • Interface faults: Actual and formal arguments do not correspond • Logic faults:
Use of Fault Statistics • We compare current fault rates with those of previous products for a given workflow. • We take action if there are a disproportionate number of faults of a particular type in an artifact • Check other code artifacts and take corrective action • Redesigning from scratch is a good alternative • We carry forward fault statistics to the next workflow • We may not detect all faults of a particular type in the current inspection
Successful Stories on Inspections • IBM inspections showed up • 82 percent of all detected faults (1976); 70 percent of all detected faults (1978); 93 percent of all detected faults (1986) • Switching system • 90 percent decrease in the cost of detecting faults (1986) • JPL • Four major faults, 14 minor faults per 2 hours (1990) ; Savings of $25,000 per inspection; The number of faults decreased exponentially by phase (1992) • However, fault statistics should never be used for performance appraisal
Comparison of Inspections and Walkthroughs • Walkthrough: Two-step, informal process • Preparation • Analysis • Inspections: Five-step, formal process • Overview • Preparation • Inspection • Rework • Follow-up
Strengths and Weaknesses of Reviews (Walkthrough or Inspection) • Strengths: • A review is an effective way to detect a fault. • Faults are detected early in the process • Weaknesses: • Reviews are less effective if the process is inadequate. The two solutions to this weakness are: • Large-scale software should consist of smaller, largely independent pieces • The documentation of the previous workflows has to be complete and available online
Metrics for Inspections • Inspection rate: design pages inspected per hour • Fault density: faults per page inspected or faults per 1000 lines of code (KLOC) inspected • Fault detection rate: the number of faults detected per hour • Fault detection efficiency: number of major and minor faults detected per person-hour • Question: Does a 50 percent increase in the fault detection rate mean that • Quality has decreased? Or • The inspection process is more efficient?
Execution-Based Testing • Organizations spend up to 50 percent of their software budget on testing • But delivered software is frequently unreliable • “Program testing can be a very effective way to show the presence of bugs, but it is hopelessly inadequate for showing their absence” (Dijkstra; 1972) • The only information that can be deduced from the particular test is that the product runs correctly on that particular set of test data.
What Should Be Tested? • Definition of execution-based testing • “The process of inferring certain behavioral properties of the product based, in part, on the results of executing the product in a known environment with selected inputs” • This definition has troubling implications • “Inference”: We have a fault report, the source code, and often nothing else. This makes the inference difficult. • “Known environment”: We never really can know our environment, either the hardware or the software • “Selected inputs”: Sometimes we cannot provide the inputs we want • Solution: • Simulator is needed: A simulator is a working model of the environment in which the product executes.
Items Should be Tested: Utility • The extent to which a user’s needs are met when a correct product is used under conditions permitted by its specifications • Ease of use • Useful functions • Cost effectiveness
Items Should be Tested: Reliability • A measure of the frequency and criticality of product failure • Mean time between failures: How often the product fails • Mean time to repair: How long it takes, on average, to repair the failure • Time (and cost) to repair the results of a failure
Items Should be Tested: Robustness • A function of • The range of operating conditions • The possibility of unacceptable results with valid input • The effect of invalid input
Items Should be Tested: Performance • The extent to which the product meets its constraints with regard to response time or space requirements. • Real-time software is characterized by hard real-time constraints • If data are lost because the system is too slow • There is no way to recover those data
Items Should be Tested: Correctness • A product is correct if it satisfies its output specifications, independent of its use of computing resources, when operated under permitted conditions. • Technically, correctness is • Not necessary • Example: C++ compiler • Not sufficient • Example:trickSort
Importance of the Correctness of Specifications • Incorrect specification for a sort: • Function trickSort which satisfies this specification: Figure 6.1
Importance of the Correctness of Specifications (Cont.) • Corrected specification for the sort:
Testing versus Correctness Proofs • A correctness proof is a mathematical technique for showing that a product is correct, in other words, that it satisfies its specification. • It is an alternative to execution-based testing
Example of a Correctness Proof • Its corresponding flowchart • The code segment to be proven correct
Add the followings: • Input specification • Output specification • Loop invariant • Assertions
Example of a Correctness Proof (Cont.) • Loop Invariant: A mathematical expression that holds irrespective of whether the loop has been executed 0, 1, or many times. • An informal proof (using induction) appears in Section 6.5.1. That is: given the input specification, it can be proven that loop invariant holds. Furthermore, it can be proven that after n iterations the loop terminates and the output specification is satisfied. The code segment therefore is mathematically proven to be correct.
Correctness Proof Mini Case Study • Dijkstra (1972): • “The programmer should let the program proof and program grow hand in hand” • “Naur text-processing problem” (1969) • Episode 1: Naur constructed a 25-line procedure and informally proved its correctness • Episode 2: Reviewer in Computing Reviews found 1 fault in 1970. • Episode 3: London found 3 more faults in 1971. • Episode 4: Goodenough and Gerhart found 3 further faults in 1975. • Lesson: Even if a product has been proved correct, it must STILL be tested.
Three Myths of Correctness Proving • Software engineers do not have enough mathematics for proofs • Most computer science majors either know or can learn the mathematics needed for proofs • Proving is too expensive to be practical • Economic viability is determined from cost–benefit analysis • Proving is too hard • Many nontrivial products have been successfully proved • Tools like theorem provers can assist us
Difficulties with Correctness Proving • Can we trust a theorem prover? • How do we find input and output specifications, and loop invariants or their equivalents in other logics? • What if the specifications are wrong? • We can never be sure that specifications or a verification system are correct
Correctness Proofs and Software Engineering • Correctness proofs are a vital software engineering tool. They are appropriate: • When human lives are at stake • When indicated by cost–benefit analysis • When the risk of not proving is too great • Also, informal proofs can markedly improve the quality of the product • Use theassertstatement
Who Should Perform Execution-Based Testing? • Programmers should not test their own code artifacts since: • Programming is constructive • Testing is destructive • A successful test finds a fault • Solution: • The programmer debugs the module • The programmer does informal testing • The SQA group then does systematic testing
Test Cases • All test cases must be • Planned beforehand, including the expected output, and • Retained afterwards for regression testing • Testing stops only when the product has been irrevocably discarded