Informatics 43 Introduction to Software Engineering

Informatics 43 Introduction to Software Engineering, Lecture 8. Duplication of course material for any commercial purpose without the explicit written permission of the professor is prohibited.

Today’s Lecture • Quality assurance • Testing • Structural Testing • Specification-based Testing

What Do These Have in Common? • Airbus 320 • Audi 5000 • Mariner 1 launch • AT&T telephone network • Ariane 5 • Word 3.0 for Mac • Radiation therapy machine • NSA • Y2K


They All Failed! • Airbus 320 • http://catless.ncl.ac.uk/Risks/10.02.html#subj1.1 • Audi 5000 • “Unintended” acceleration problem • A figure of speech: “We’re Audi 5000!” • Mariner 1 launch • http://catless.ncl.ac.uk/Risks/5.73.html#subj2.1 • AT&T telephone network • Ripple effect from switch to switch; network down/dark for 2-3 days • Ariane 5 • http://catless.ncl.ac.uk/Risks/18.24.html#subj2.1 • Word 3.0 for Mac • “Plagued with bugs”; later replaced for free by Word 3.0.1 • Radiation therapy machine • http://courses.cs.vt.edu/~cs3604/lib/Therac_25/Therac_5.html • NSA • Spy computer crash; system down/dark for a couple of days • Y2K

Impact of Failures • Not just “out there” • Space shuttle • Mariner 1 • Ariane 5 • NSA • But also “at home” • Your car • Your call to your mom • Your wireless network, social network, mobile app • Your homework • Your hospital visit
Peter Neumann’s Risks Digest: http://catless.ncl.ac.uk/Risks

Verification and Validation • Verification • Ensure software meets specifications • Internal consistency • “Are we building the product right?” • Validation • Ensure software meets customer’s intent • External consistency • “Are we building the right product?”

Software Qualities • Correctness • Reliability • Efficiency • Integrity • Usability • Maintainability • Testability • Flexibility • Portability • Reusability • Interoperability • Performance, etc.

Quality Assurance • Assure that each of the software qualities is met • Goals set in requirements specification • Goals realized in implementation • Sometimes easy, sometimes difficult • Portability versus safety • Sometimes immediate, sometimes delayed • Understandability versus evolvability • Sometimes provable, sometimes doubtful • Size versus correctness

An Idealized View of QA • Complete formal specification of problem to be solved → correctness-preserving transformation → Design, in formal notation → correctness-preserving transformation → Code, in verifiable language → correctness-preserving transformation → Executable machine code → correctness-preserving transformation → Execution on verified hardware

A Realistic View of QA • Mixture of formal and informal specifications → manual transformation → Design, in mixed notation → manual transformation → Code, in C++, Java, Ada, … → compilation by commercial compiler → (Intel Pentium-based) machine code → commercial firmware → Execution on commercial hardware

First Complication • Real needs • Actual specification • “Correct” specification
No matter how sophisticated the QA process, the problem of creating the initial specification remains

Second Complication • Complex data communications • Electronic fund transfer • Distributed processing • Web search engine • Stringent performance objectives • Air traffic control system • Complex processing • Medical diagnosis system
Sometimes the software system is extremely complicated, making it tremendously difficult to perform QA

Third Complication • Project Management • Quality Assurance Group • Development Group
It is difficult to divide the particular responsibilities involved when performing quality assurance

Fourth Complication • Quality assurance lays out the rules • You will check in your code every day • You will comment your code • You will… • Quality assurance also uncovers the faults • Raps developers on the knuckles • Creates image of “competition” • Quality assurance is viewed as cumbersome, “heavy” • “Just let me code”
Quality assurance has a negative connotation

Available Techniques • Formal program verification • Static analysis of program properties • Concurrent programs: deadlock, starvation, fairness • Performance: min/max response time • Code reviews and inspections • Testing
Most techniques are geared towards verifying correctness

Reminder: Use the Principles • Rigor and formality • Separation of concerns • Modularity • Abstraction • Anticipation of change • Generality • Incrementality

Testing • Exercise a module, collection of modules, or system • Use predetermined inputs (“test case”) • Capture actual outputs • Compare actual outputs to expected outputs • Actual outputs equal to expected outputs → test case succeeds • Actual outputs not equal to expected outputs → test case fails
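The capture-and-compare steps above can be sketched in a few lines. This is a minimal illustration, not course code: `average`, `run_test_case`, and the sample inputs are hypothetical names chosen for the example.

```python
def average(scores):
    """Hypothetical unit under test: mean of a non-empty list of scores."""
    return sum(scores) / len(scores)


def run_test_case(unit, inputs, expected):
    """Run one test case: apply the predetermined inputs, capture the
    actual output, and compare it to the expected output."""
    actual = unit(inputs)
    return "succeeds" if actual == expected else "fails"


# Each test case pairs predetermined inputs with an expected output.
test_cases = [
    ([80, 90, 100], 90),  # typical values
    ([70], 70),           # a single score
]
results = [run_test_case(average, inputs, expected)
           for inputs, expected in test_cases]
```

Keeping the expected output next to the input makes the comparison mechanical, which is what later allows the run step to be automated.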

V-Model of Development and Testing • Requirements level: Develop Requirements, Requirements Review, Develop Acceptance Tests, Acceptance Test Review, Execute System Tests • Design level: Design, Design Review, Develop Integration Tests, Integration Tests Review, Execute Integration Tests • Code level: Code, Code Review, Develop Unit Tests, Unit Tests Review, Execute Unit Tests

Testing Terminology • Failure • Incorrect or unexpected output • Symptom of a fault • Fault • Invalid execution state • Symptom of an error • May or may not produce a failure • Error • Defect or anomaly in source code • Commonly referred to as a “bug” • May or may not produce a fault
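A small sketch of why an error "may or may not" lead to a failure, using the definitions on this slide. The function `passes` and the passing threshold of 60 are assumptions made up for the illustration.

```python
def passes(score):
    """Intended behavior: a score of 60 or more passes.
    Error (defect in the source): '>' was written instead of '>='."""
    return score > 60


# For most inputs the error stays hidden: the output is still correct.
assert passes(75) is True
assert passes(40) is False

# Only the boundary input turns the error into a visible failure:
# the observed output is False, although the intended output is True.
observed = passes(60)
```

A test suite that never tries the value 60 would report no failures even though the error is present in the code.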

Testing Goals • Reveal failures/faults/errors • Locate failures/faults/errors • Show system correctness • Within the limits of optimistic inaccuracy • Improve confidence that the system performs as specified (verification) • Improve confidence that the system performs as desired (validation)
“Program testing can be used to show the presence of bugs, but never to show their absence” [Dijkstra]

Levels of Testing • Unit testing • Testing of a single code unit • Requires use of test drivers • Integration testing • Testing of interfaces among integrated units • Incremental • “Big bang” • Often requires test drivers and test stubs • Acceptance testing • Testing of complete system for satisfaction of requirements

Test Tasks • Devise test cases • Target specific areas of the system • Create specific inputs • Create expected outputs • Choose test cases • Not all need to be run all the time • Regression testing • Run test cases • Can be labor intensive • Opportunity for automation
All in a systematic, repeatable, and accurate manner

Test Automation • Opportunities • Test execution • Scaffolding • Executing test cases • Most repetitive, non-creative aspect of the test process • Design once, execute many times • Tool support available • JUnit for Java, xUnit in general
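As one illustration of xUnit-style tool support, here is a sketch using Python's standard `unittest` module, which follows the same pattern as JUnit. The unit under test, `average`, and the test names are hypothetical.

```python
import io
import unittest


def average(scores):
    """Hypothetical unit under test."""
    return sum(scores) / len(scores)


class AverageTest(unittest.TestCase):
    def test_typical_scores(self):
        self.assertEqual(average([80, 90, 100]), 90)

    def test_single_score(self):
        self.assertEqual(average([70]), 70)

    def test_empty_list_is_rejected(self):
        # Dividing by len([]) == 0 raises ZeroDivisionError.
        with self.assertRaises(ZeroDivisionError):
            average([])


def run_suite():
    """Design once, execute many times: load and run every test in the class."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(AverageTest)
    runner = unittest.TextTestRunner(stream=io.StringIO())
    return runner.run(suite)
```

Calling `run_suite()` returns a result object; its `wasSuccessful()` method reports whether every test case succeeded, so the whole suite can be rerun after each change.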

Scaffolding • Term borrowed from construction, civil engineering • Additional code to support development • But usually not included or visible in the deployed/shipped code • Not experienced by the end user • Test driver • A function or program (“main”) for driving a test • Test stub • A replacement of the “real code” that’s being called by the program • Test harness • A replacement of any (possibly many) other parts of deployed system
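The driver and stub roles described above might look as follows. This is a sketch under assumptions: `homework_average`, `fetch_scores_stub`, and the student names are invented for the example, and the "real code" being replaced is imagined to be a database lookup.

```python
def fetch_scores_stub(student_id):
    """Test stub: stands in for the 'real code' (say, a database lookup)
    that the unit under test would normally call. Returns canned data."""
    canned = {"alice": [80, 90, 100], "bob": [70]}
    return canned[student_id]


def homework_average(student_id, fetch_scores):
    """Unit under test. The score source is a parameter so that a stub
    can be swapped in during testing."""
    scores = fetch_scores(student_id)
    return sum(scores) / len(scores)


def driver():
    """Test driver: a small 'main' that feeds inputs to the unit and
    checks the outputs against expected values."""
    cases = [("alice", 90), ("bob", 70)]
    return all(homework_average(student, fetch_scores_stub) == expected
               for student, expected in cases)
```

Neither `driver` nor `fetch_scores_stub` would ship with the product; they exist only to exercise `homework_average` in isolation.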

Test Oracles • Provide a mechanism for deciding whether a test case execution succeeds or fails • Critical to testing • Used in white box testing • Used in black box testing • Difficult to automate • Typically relies on humans • Typically relies on human intuition • Formal specifications may help

Oracle Example: Cosine • Your test execution shows cos(0.5) = 0.87758256189 • You have to decide whether this answer is correct • You need an oracle • Draw a triangle and measure the sides • Look up cosine of 0.5 in a book • Compute the value using Taylor series expansion • Check the answer with your desk calculator
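The Taylor series option can be turned into an automated oracle. The sketch below is one possible version; the function names, the number of terms, and the tolerance of 1e-9 are assumptions chosen for illustration.

```python
import math


def cos_taylor(x, terms=10):
    """Approximate cos(x) with the first `terms` terms of its Taylor
    series around 0: the sum over n of (-1)^n * x^(2n) / (2n)!."""
    return sum((-1) ** n * x ** (2 * n) / math.factorial(2 * n)
               for n in range(terms))


def oracle_accepts(actual, x, tolerance=1e-9):
    """Oracle: accept the observed output if it lies within `tolerance`
    of the independently computed reference value."""
    return abs(actual - cos_taylor(x)) <= tolerance


# The output observed on the slide, checked against the oracle.
verdict = oracle_accepts(0.87758256189, 0.5)
```

The oracle compares with a tolerance rather than exact equality, because the observed value is rounded and the series is truncated.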

Two Approaches • White box testing • Structural testing • Test cases designed, selected, and run based on structure of the code • Scale: tests the nitty-gritty • Drawback: needs access to source • Black box testing • Specification-based testing • Test cases designed, selected, and run based on specifications • Scale: tests the overall system behavior • Drawback: less systematic

Structural Testing • Use source code to derive test cases • Build a graph model of the system • Control flow • Data flow • State test cases in terms of graph coverage • Choose test cases that guarantee different types of coverage • Node coverage • Edge coverage • Loop coverage • Condition coverage • Path coverage

Example: Building the program graph [Figure: program graph with nodes 1 to 7]

Example: Averaging homework grades! [Figure: program graph with nodes 1 to 10]

Node Coverage • Select test cases such that every node in the graph is visited • Also called statement coverage • Guarantees that every statement in the source code is executed at least once • Selects minimal number of test cases • Test case: { 2 } [Figure: program graph with nodes 1 to 10]

Edge Coverage • Select test cases such that every edge in the graph is visited • Also called branch coverage • Guarantees that every branch in the source code is executed at least once • More thorough than node coverage • More likely to reveal logical errors • Test case: { 1, 2 } [Figure: program graph with nodes 1 to 10]
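The difference between node and edge coverage shows up in an `if` without an `else`. The sketch below is not the homework-grades program from the figure; `apply_late_penalty` is a hypothetical unit, and the `trace` set is a hand-rolled stand-in for a coverage tool.

```python
def apply_late_penalty(score, late, trace):
    """Hypothetical unit: deduct 10 points from a late submission.
    `trace` records which outcome of the decision was taken."""
    trace.add(("late", bool(late)))
    if late:
        score -= 10
    return score


def branches_covered(test_inputs):
    """Run every test input and collect the branch outcomes exercised."""
    trace = set()
    for score, late in test_inputs:
        apply_late_penalty(score, late, trace)
    return trace


node_suite = [(90, True)]               # executes every statement
edge_suite = [(90, True), (90, False)]  # also takes the edge that skips the if body
```

The single test in `node_suite` reaches every statement, yet the edge that bypasses the `if` body is only exercised once the second test is added.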

Other Coverage Criteria • Loop coverage • Select test cases such that every loop boundary and interior is tested • Boundary: 0 iterations • Interior: 1 iteration and > 1 iterations • Watch out for nested loops • Less precise than edge coverage • Condition coverage • Select test cases such that all conditions are tested • if (a > b || c > d) … • More precise than edge coverage
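For the decision `a > b || c > d`, the following sketch contrasts a suite that satisfies edge coverage with one that also satisfies condition coverage. The helper names and the concrete input tuples are assumptions; both conditions are evaluated eagerly here so their values can be recorded.

```python
def decision(a, b, c, d):
    """The decision from the slide: returns the value of each
    individual condition and the overall outcome."""
    first, second = a > b, c > d
    return first, second, first or second


# Edge coverage only needs the whole decision to be True once and False once.
edge_suite = [(2, 1, 0, 0), (0, 0, 0, 0)]
# Condition coverage also needs each individual condition to be True and False.
condition_suite = [(2, 1, 0, 0), (0, 0, 2, 1), (0, 0, 0, 0)]


def condition_values(suite):
    """Collect the truth values each condition took across the suite."""
    firsts = {decision(*inputs)[0] for inputs in suite}
    seconds = {decision(*inputs)[1] for inputs in suite}
    return firsts, seconds
```

With `edge_suite`, the condition `c > d` is never true, so a defect in that comparison could go unnoticed even though both branches of the decision were taken.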

Other Coverage Criteria • Path coverage • Select test cases such that every path in the graph is visited • Loops are a problem • 0, 1, average, max iterations • Most thorough… • …but is it feasible?
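A rough sense of the feasibility question comes from counting paths. The sketch below assumes a simplified program shape (sequential two-way decisions, or one loop whose body holds a single two-way decision); the function names are hypothetical.

```python
from itertools import product


def count_paths(num_decisions):
    """Paths through `num_decisions` sequential two-way decisions:
    each path is one combination of True/False outcomes."""
    return len(list(product([True, False], repeat=num_decisions)))


def count_loop_paths(max_iterations):
    """Paths through one loop whose body contains a single two-way
    decision, for 0 up to `max_iterations` iterations: an execution
    with k iterations can follow 2**k distinct paths."""
    return sum(2 ** k for k in range(max_iterations + 1))
```

Ten sequential decisions already give 1024 paths, and a loop without a fixed bound has no finite path count at all, which is why the slide suggests sampling 0, 1, average, and max iterations instead.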

Challenges • Structural testing can cover all nodes or edges without revealing obvious faults • No matter what input, program always returns 0 • Some nodes, edges, or loop combinations may be infeasible • Unreachable/unexecutable code • “Thoroughness” • A test suite that guarantees edge coverage also guarantees node coverage… • …but it may not find as many faults as a different test suite that only guarantees node coverage
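The "always returns 0" case can be made concrete. In this sketch the faulty `average` function and both test inputs are invented for illustration.

```python
def average(scores):
    """Faulty hypothetical unit: ignores its input and always returns 0."""
    return 0


# This single test executes every statement and edge of `average`, and it passes.
covering_test_passes = average([0, 0, 0]) == 0

# A test with a different expected output exposes the problem.
revealing_test_passes = average([80, 90, 100]) == 90
```

Coverage measures which code ran, not whether the chosen inputs and expected outputs were able to distinguish correct from incorrect behavior.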

More Challenges • Interactive programs • Listeners or event-driven programs • Concurrent programs • Exceptions • Self-modifying programs • Mobile code • Constructors/destructors • Garbage collection

Specification-Based Testing • Use specifications to derive test cases • Requirements • Design • Function signature • Based on some kind of input domain • Choose test cases that guarantee a wide range of coverage • Typical values • Boundary values • Special cases • Invalid input values

“Some Kind of Input Domain” • Determine a basis for dividing the input domain into subdomains • Subdomains may overlap • Possible bases • Size • Order • Structure • Correctness • Your creative thinking • Select test cases from each subdomain • One test case may suffice

Example

Possible Bases • Array length • Empty array • One element • Two or three elements • Lots of elements
Input domain: float[] • Basis: array length • Subdomains: empty, one, small, large

Possible Bases • Position of minimum score • Smallest element first • Smallest element in middle • Smallest element last
Input domain: float[] • Basis: position of minimum • Subdomains: first, somewhere in middle, last

Possible Bases • Number of minima • Unique minimum • A few minima • All minima
Input domain: float[] • Basis: number of minima • Subdomains: 1 minimum, 2 minima, all data equal
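The three bases above (array length, position of the minimum, number of minima) can each be written as a function that maps an input to its subdomain. This is a sketch: the function names, the cutoff of three elements for "small", and the candidate inputs are assumptions, and Python lists of floats stand in for `float[]`.

```python
def length_subdomain(scores):
    """Basis: array length."""
    n = len(scores)
    if n == 0:
        return "empty"
    if n == 1:
        return "one"
    return "small" if n <= 3 else "large"


def minimum_position_subdomain(scores):
    """Basis: position of the (first) minimum. Requires a non-empty list."""
    i = scores.index(min(scores))
    if i == 0:
        return "first"
    return "last" if i == len(scores) - 1 else "middle"


def minima_count_subdomain(scores):
    """Basis: number of minima. Requires a non-empty list."""
    count = scores.count(min(scores))
    if count == len(scores):
        return "all equal"
    return "unique" if count == 1 else "a few"


# One candidate input per array-length subdomain.
candidates = [[], [85.0], [70.0, 90.0], [90.0, 60.0, 95.0, 60.0, 88.0]]
covered = {length_subdomain(c) for c in candidates}
```

Subdomains from different bases overlap: a one-element array is both "first" and "all equal" here, so a single test case can serve several bases at once.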

Testing Matrix

homeworkAverage 1

How to Avoid Problems of Structural Testing • Interactive programs • Listeners or event-driven programs • Concurrent programs • Exceptions • Self-modifying programs • Mobile code • Constructors/destructors • Garbage collection
