CS5103 Software Engineering

Lecture 15: System Testing, Testing Coverage

Higher-level testing
• Integration testing: testing the interaction among a number of interacting components
• System testing: testing the system as a whole, considering various environments and external exceptions
• Acceptance testing: final user testing, usually on the GUI

Integration testing strategies
• Big bang
• Top-down
• Bottom-up

Big Bang!
• Prepare all relevant components (data, global variables, …)
• Put them together
• Pray!

Usage scenario of Big Bang
• Quite common in small projects
• Requires no extra effort for integration
• May work well if all interfaces are well defined … which is not likely to happen

Top-down strategy
• Feature decomposition: a hierarchical structure of all software features
• May not require extra effort: you should already have it from requirements and design

An example feature decomposition

Top-down strategy
• Focus on the validation of the whole system first
• Test an empty system with no components at first
  – All components are replaced with test doubles, e.g., stubs
• Gradually add components until all components are added
• Advantages
  – The integration process is easier to understand
  – May have a working system earlier (or always have a working system): important for modern development methods such as XP

Top-down strategy: Illustration

Top-down strategy
• Disadvantages
  – Requires writing test doubles: since the depended-on component (DOC) matters in integration, more complex test doubles are usually required
  – Centralized, so the integration cannot be done in parallel
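A minimal sketch of one top-down integration step, assuming a hypothetical top-level `CheckoutService` whose payment component is first replaced by a stub and later by the real thing. None of these class names come from the lecture; they only illustrate how a test double lets the whole system run before all components exist.

```python
class PaymentStub:
    """Test double standing in for a payment component that is not integrated yet."""

    def __init__(self, approve=True):
        self.approve = approve
        self.calls = []  # record interactions so the test can inspect them

    def charge(self, amount):
        self.calls.append(amount)
        return self.approve


class RealPayment:
    """Hypothetical real component, added in a later integration step."""

    def __init__(self, balance):
        self.balance = balance

    def charge(self, amount):
        if amount > self.balance:
            return False
        self.balance -= amount
        return True


class CheckoutService:
    """Top-level component; it only assumes that `payment` has a charge() method."""

    def __init__(self, payment):
        self.payment = payment

    def checkout(self, amount):
        return "confirmed" if self.payment.charge(amount) else "declined"


# Step 1: the whole system runs with the stub in place.
stub = PaymentStub(approve=True)
assert CheckoutService(stub).checkout(30) == "confirmed"
assert stub.calls == [30]

# Step 2: swap in the real component and rerun the same scenario.
assert CheckoutService(RealPayment(balance=50)).checkout(30) == "confirmed"
assert CheckoutService(RealPayment(balance=10)).checkout(30) == "declined"
```

The stub here is trivial; the disadvantage noted above shows up once the stub has to mimic stateful behavior of the real component, such as the balance check.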

Bottom-up strategy
• Focus on the integration of the lowest-level units first
• Start from a unit that depends on nothing else
• When all sub-units of a component are integrated, integrate the component
• Continue until the whole system is built

Bottom-up strategy: Illustration (figure labels: Bottom Level Subtree, Middle Level Subtree, Top Level Subtree)

Bottom-up strategy
• Advantages
  – No test stub is required
  – Requires a test driver, but may reuse unit test cases (view the whole component as a unit)
  – Supports parallel integration
• Issues
  – No working system is available until the end
  – Higher risk
  – Requires more interaction among the sub-teams working on each component

System testing
• Test the system as a whole
• Usually test against specifications
• For each item in the specification
  – Work out a test case and a test oracle
  – Test boundary values
  – Test with invalid inputs
  – Test with environment errors
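As a rough illustration of deriving test cases and oracles from one specification item, assume a hypothetical requirement "the order quantity must be an integer from 1 to 10". The function `order_total` and its price are invented for this sketch; environment errors are left out because they depend on the deployment.

```python
def order_total(quantity, unit_price=2):
    """Hypothetical system entry point: quantity must be an integer from 1 to 10."""
    if not isinstance(quantity, int) or isinstance(quantity, bool):
        raise TypeError("quantity must be an integer")
    if not 1 <= quantity <= 10:
        raise ValueError("quantity out of range")
    return quantity * unit_price


# Each case pairs an input with its oracle: an expected value or an expected exception type.
cases = [
    (1, 2), (10, 20),                     # boundary values
    (0, ValueError), (11, ValueError),    # just outside the boundaries
    ("3", TypeError),                     # invalid input
]


def run_case(value, expected):
    """Return True when the observed behavior matches the oracle."""
    try:
        actual = order_total(value)
    except Exception as exc:
        return isinstance(expected, type) and isinstance(exc, expected)
    return actual == expected


results = [run_case(value, expected) for value, expected in cases]
```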

Environment issues
• Building
  – Compile options and configurations
  – Environment variables
  – Third-party dependencies
• Underlying platforms
  – OS, database, application server, browser
• Compatibility is a huge problem
  – Different platforms (if cross-platform)
  – Different versions of dependencies and platforms
  – Different configurations

Acceptance testing
• Command line
  – Write test scripts
  – Use different options and inputs
  – Try invalid cases: missing option parameter, invalid paths, etc.
• GUI testing
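A command-line test script along these lines might look as follows. The tool under test is a tiny, made-up line counter written to a temporary file so that the sketch is self-contained; in practice the script would call the real executable instead.

```python
import subprocess
import sys
import tempfile
import textwrap
from pathlib import Path

# Hypothetical tool under test: prints the number of lines in the file given by --input.
TOOL_SOURCE = textwrap.dedent("""
    import argparse, pathlib, sys
    parser = argparse.ArgumentParser()
    parser.add_argument("--input", required=True)
    args = parser.parse_args()
    path = pathlib.Path(args.input)
    if not path.is_file():
        print("error: no such file", file=sys.stderr)
        sys.exit(1)
    print(len(path.read_text().splitlines()))
""")


def run_tool(tool, *args):
    """Run the tool as a separate process and capture its exit code and output."""
    return subprocess.run(
        [sys.executable, str(tool), *args], capture_output=True, text=True
    )


with tempfile.TemporaryDirectory() as tmp:
    tool = Path(tmp) / "linecount.py"
    tool.write_text(TOOL_SOURCE)
    data = Path(tmp) / "data.txt"
    data.write_text("a\nb\nc\n")

    ok = run_tool(tool, "--input", str(data))                       # valid case
    missing_param = run_tool(tool, "--input")                       # option without its parameter
    bad_path = run_tool(tool, "--input", str(Path(tmp) / "nope.txt"))  # invalid path
```

The oracle for each run is the exit code plus the captured output, which is what makes command-line acceptance tests comparatively easy to automate.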

GUI testing
• Testing a graphical user interface is easier to do
  – But harder to automate
  – And harder to compare results with oracles
• Manual testing is still widely performed for GUI testing
  – Manually explore the user interface
  – Record the steps of the test for future test runs
  – Observe the GUI for errors

GUI testing
• Record and replay
• Screen record and replay
  – Record clicks on the screen and keyboard inputs
  – Replay all the clicks and inputs
  – Not robust: affected by screen size, resolution, resizing, …
• Event record and replay
  – Record all the UI events in the UI framework (Swing, MFC, Android, etc.) and the outputs (texts) in the UI
  – Re-trigger all the events and compare the UI texts
  – More robust, but requires more preparation and overhead

Today’s class
• Higher-level testing
• Software testing coverage
  – Code coverage
  – Input combination coverage
  – Mutation coverage

Test coverage
• After we have done some testing, how do we know the testing is enough?
• The most straightforward measure: input coverage
  – # of inputs tested / # of possible inputs
• Unfortunately, the # of possible inputs is typically infinite
• Not feasible, so we need approximations…

Test coverage
• Code coverage
• Specification coverage
• Model coverage
• Error coverage

Code coverage
• Basic idea
  – Bugs in code that has never been executed will not be exposed
  – So a test suite that leaves code unexecuted is definitely not sufficient
• Definition
  – Divide the code into elements
  – Calculate the proportion of elements that are executed by the test suite
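The definition can be sketched with hand-placed probes. Real coverage tools insert the equivalent instrumentation automatically; the `probe` helper, the block names, and the `classify` function below are assumptions made only for this illustration.

```python
executed = set()          # code elements hit by the test suite so far
ALL_BLOCKS = {"entry", "negative", "non_negative", "exit"}


def probe(block_id):
    """Mark one code element as executed."""
    executed.add(block_id)


def classify(x):
    probe("entry")
    if x < 0:
        probe("negative")
        label = "negative"
    else:
        probe("non_negative")
        label = "non-negative"
    probe("exit")
    return label


def coverage():
    """Proportion of code elements executed by the tests run so far."""
    return len(executed) / len(ALL_BLOCKS)


classify(-5)
after_one_test = coverage()    # 3 of 4 blocks executed
classify(7)
after_two_tests = coverage()   # all 4 blocks executed
```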

Code coverage criteria
• Statement (basic block) coverage: are they the same?
• Branch coverage (cover all edges in a control flow graph): is it the same as basic block coverage?
• Data flow coverage
• Class/method coverage

Control Flow Graph
• How many test cases to achieve full statement coverage?

Statement coverage in practice
• Microsoft reports 80-90% statement coverage
• Safety-critical software must achieve 100% statement coverage
• Usually about 85% coverage; 100% is usually very hard for large systems

Statement Coverage: Example

Branch coverage
• Cover the branches in a program
• A branch is considered executed when both (all) outcomes are executed
• Also called decision coverage (multiple-condition coverage is a stricter criterion over combinations of condition outcomes)
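A small sketch of why branch coverage asks for more than statement coverage. The function `absolute` and the hand-written outcome recording are hypothetical; a single test with a negative input executes every statement, yet only one of the two outcomes of the branch.

```python
ALL_OUTCOMES = {("x < 0", True), ("x < 0", False)}


def absolute(x, outcomes):
    taken = x < 0
    outcomes.add(("x < 0", taken))   # record which outcome of the branch was taken
    if taken:
        x = -x
    return x


seen = set()
absolute(-3, seen)   # executes every statement, including x = -x
branch_after_one = len(seen) / len(ALL_OUTCOMES)
absolute(4, seen)    # adds the missing False outcome
branch_after_two = len(seen) / len(ALL_OUTCOMES)
```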

Control Flow Graph
• How many test cases to achieve full branch coverage?

Branch Coverage: Example

Branch Coverage: Example
• An untested flow of data from an assignment to a use of the assigned value could hide an erroneous computation
• Even though we have 100% statement and branch coverage

Data flow coverage
• Cover all def-use pairs in a program
  – Def: a write to a variable
  – Use: a read of a variable
  – Use u and def d are paired when d is the most recent definition reaching u in some execution
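To make def-use pairs concrete, here is a hedged sketch with a made-up function `compute`. The labels `d1`, `d2`, and `u1` and the manual bookkeeping in `last_def` are assumptions for illustration. Two tests reach every outcome of both branches, yet one def-use pair is never exercised.

```python
ALL_PAIRS = {("d1", "u1"), ("d2", "u1")}


def compute(a, b, covered):
    x = 1
    last_def = "d1"                    # d1: first definition of x
    if a:
        x = 2
        last_def = "d2"                # d2: second definition of x
    if b:
        covered.add((last_def, "u1"))  # u1: the read of x on the next line
        y = x + 1
    else:
        y = 0
    return y


covered = set()
compute(True, True, covered)     # a: True,  b: True
compute(False, False, covered)   # a: False, b: False -> full branch coverage
missing = ALL_PAIRS - covered    # the pair (d1, u1) has not been exercised
```

Adding the test `compute(False, True, covered)` exercises the missing pair.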

Data flow coverage
• [Formula]
• Not easy to locate all use-def pairs
  – Easy for intra-procedural analysis (inside a method)
  – Very difficult for inter-procedural analysis
  – Consider a write to a field in one method and a read of it in another method

Path coverage
• The strongest code coverage criterion
• Try to cover all possible execution paths in a program
• Covers all previous coverage criteria?
• Usually not feasible
  – Exponentially many paths in acyclic programs
  – Infinitely many paths in some programs with loops

Path coverage
• N conditions: 2^N paths
• Many are not feasible
  – e.g., L1L2L3L4L6
  – X = 0 => L1L2L3L4L5L6
  – X = -1 => L1L3L4L6
  – X = -2 => L1L3L4L5L6
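The path count and the notion of an infeasible path can be sketched as below. The two correlated conditions `x > 0` and `x > 5` are a hypothetical example, not the program from the slide figure.

```python
from itertools import product


def enumerate_paths(n_conditions):
    """Each path is one tuple of True/False outcomes, one entry per condition."""
    return list(product([True, False], repeat=n_conditions))


# N sequential two-way conditions give 2**N candidate paths.
path_counts = {n: len(enumerate_paths(n)) for n in (1, 2, 3, 10)}


def outcomes(x):
    """Outcomes of two correlated conditions for a given input (hypothetical program)."""
    return (x > 0, x > 5)


# No input can make x > 0 false while x > 5 is true, so that path is infeasible.
feasible = {outcomes(x) for x in range(-10, 11)}
infeasible = set(enumerate_paths(2)) - feasible
```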

Control Flow Graph
• How many paths?
• How many test cases to cover?

Path coverage, not enough
    #include <stdio.h>

    int main(void) {
        int x, y, z, w;
        scanf("%d", &x);
        scanf("%d", &y);
        if (x != 0)
            z = x + 10;
        else
            z = 1;
        if (y > 0)
            w = y / z;   /* z is 0 when x = -10 */
        else
            w = 0;
        return 0;
    }
• Test requirements
  – 4 paths
• Test cases
  – (x = 1, y = 22)
  – (x = 0, y = 10)
  – (x = 1, y = -22)
  – (x = 0, y = -10)
• We are still not exposing the fault!
• Faulty if x = -10 (with y > 0, z is 0 and y / z divides by zero)
  – Structural coverage cannot reveal this error
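The same program can be replayed in Python to check the claim. This is a direct translation under the assumption that integer division is intended (`//` agrees with C's `/` for the positive operands used here). The four inputs take one path each, and all of them pass without exposing the fault.

```python
def program(x, y):
    """Python translation of the slide's program; returns w."""
    if x != 0:
        z = x + 10
    else:
        z = 1
    if y > 0:
        w = y // z
    else:
        w = 0
    return w


# One input per path: (then, then), (else, then), (then, else), (else, else).
path_tests = [(1, 22), (0, 10), (1, -22), (0, -10)]
path_results = [program(x, y) for x, y in path_tests]

# The fault needs a specific value, not a specific path: x = -10 with y > 0.
try:
    program(-10, 5)
    fault_exposed = False
except ZeroDivisionError:
    fault_exposed = True
```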

Method coverage
• So far, all examples are intra-method
  – Quite useful in unit testing
• It is very hard to achieve 100% statement coverage in system testing
  – Need a higher-level code element: method coverage
• Similar to statements
  – Node coverage: method coverage
  – Edge coverage: method invocation coverage
  – Path coverage: stack trace coverage

Method coverage

Code coverage: summary
• Coverage of code elements and their connections
  – Node coverage: class/method/statement/predicate coverage
  – Edge coverage: branch/data flow/method invocation coverage
  – Path coverage: path/use-def chain/stack trace coverage

Code coverage: limitations
• Not enough
  – Some bugs cannot be revealed even with full path coverage
  – Cannot reveal bugs due to missing code

Code coverage: practice
• Though not perfect, code coverage is the most widely used technique for test evaluation
• Also used to measure progress made in testing
• The criteria used in practice are mainly:
  – Method coverage
  – Statement coverage
  – Branch coverage
  – Loop coverage with a heuristic (0, 1, many iterations)
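The (0, 1, many) loop heuristic amounts to picking inputs that make a loop body run zero times, exactly once, and several times. A minimal sketch with a hypothetical `total` function that also reports its iteration count:

```python
def total(values):
    """Sum the values and report how many times the loop body ran."""
    result = 0
    iterations = 0
    for value in values:
        result += value
        iterations += 1
    return result, iterations


# One input per loop-coverage class.
loop_cases = {"zero": [], "one": [5], "many": [1, 2, 3]}
observed = {name: total(values) for name, values in loop_cases.items()}
```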

Code coverage: practice
• Far from perfect
  – The commonly used criteria are the weakest; recall our examples
  – Many corner cases can never be found (and a case is hardly a corner case if it is merely missed by statement coverage)
• 100% code coverage is rarely achieved
  – Some commercial software products are released with around 60% code coverage
  – Many open-source projects are even lower than 50%

Input combination coverage
• Basic idea
  – Originates from the most straightforward idea: input coverage
  – In theory, 100% coverage proves 100% correctness
  – In practice, feasible only for very trivial cases
• Main problems
  – Combinations are exponential
  – Possible values are infinite

Input combination coverage
• An example: a simple automatic sales machine
  – Accepts only one $1 bill at a time, and all beverages cost $1
  – Coke, Sprite, Juice, Water
  – Icy or normal temperature
  – Want receipt or not
• All combinations = 4*2*2 = 16 combinations
• Trying all 16 combinations makes sure the system works correctly

Combination explosion
• Combinations are exponential in the number of inputs
• Consider an annual report system with 100 yes/no questions that generates a customized form for you (questions you are not eligible for do not appear on the form)
• 2^100 combinations = about 10^30 test cases
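The arithmetic behind the explosion is easy to check. The rate of one billion test cases per second is an assumed, deliberately generous figure used only to give a sense of scale.

```python
combinations_total = 2 ** 100          # 100 independent yes/no questions

tests_per_second = 1_000_000_000       # assumed test throughput
seconds_per_year = 365 * 24 * 3600
years_needed = combinations_total / (tests_per_second * seconds_per_year)

print(f"{combinations_total:.3e} combinations, about {years_needed:.1e} years")
```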

Observation
• When there are many inputs, usually only a small number of them are related to each other
• The previous example: perhaps only the icy option interacts with coke and sprite, while the receipt option is independent
• A long-term study from NIST (National Institute of Standards and Technology)
  – A combination width of 4 to 6 is enough for detecting almost all errors

N-wise coverage
• Coverage of the N-wise combinations of the possible values of all inputs
• Example: 2-wise combinations
  – (coke, icy), (sprite, icy), (water, icy), (juice, icy)
  – (coke, normal), (sprite, normal), …
  – (coke, receipt), (sprite, receipt), …
  – (coke, no-receipt), (sprite, no-receipt), …
  – (icy, receipt), (normal, receipt)
  – (icy, no-receipt), (normal, no-receipt)
• 20 combinations in total

N-wise coverage
• Note: one test case may cover multiple N-wise combinations
  – E.g., (Coke, Icy, Receipt) covers 3 2-wise combinations: (Coke, Icy), (Coke, Receipt), (Icy, Receipt)
• Note: 100% N-wise coverage fully covers (N-1)-wise coverage
• For k boolean inputs
  – Full combination coverage = 2^k combinations: exponential
  – Full n-wise coverage = 2^n * k*(k-1)*…*(k-n+1)/n! combinations: polynomial in k for fixed n; for 2-wise combinations, 2*k*(k-1)
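For k boolean inputs, the number of n-wise combinations can be counted as "choose n of the k inputs, then one of the 2^n value assignments", i.e. C(k, n) · 2^n. The sketch below cross-checks that closed form against brute-force enumeration; for n = 2 it reduces to 2*k*(k-1). The function names are illustrative only.

```python
from itertools import combinations, product
from math import comb


def count_nwise_bruteforce(k, n):
    """Enumerate every choice of n inputs and every boolean assignment to them."""
    count = 0
    for _chosen_inputs in combinations(range(k), n):
        for _values in product([False, True], repeat=n):
            count += 1
    return count


def count_nwise_formula(k, n):
    """Closed form: C(k, n) ways to pick the inputs times 2**n assignments."""
    return comb(k, n) * 2 ** n


pairs_for_100_inputs = count_nwise_formula(100, 2)   # compare with 2**100 full combinations
```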

N-wise coverage: Example
• How many test cases for 100% 2-wise coverage of our example?
  – (coke, icy, receipt) covers 3 new 2-wise combinations
  – (sprite, icy, no-receipt) covers 3 new …
  – (juice, icy, receipt) covers 2 new …
  – (water, icy, receipt) covers 2 new …
  – (coke, normal, no-receipt) covers 3 new …
  – (sprite, normal, receipt) covers 3 new …
  – (juice, normal, no-receipt) covers 2 new …
  – (water, normal, no-receipt) covers 2 new …
• 8 test cases cover all 20 2-wise combinations
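The claim can be checked mechanically. The sketch below enumerates all 2-wise combinations of the sales machine example and confirms that the eight test cases listed above cover them; the dictionary layout and helper names are choices made for this illustration.

```python
from itertools import combinations, product

inputs = {
    "beverage": ["coke", "sprite", "juice", "water"],
    "temperature": ["icy", "normal"],
    "receipt": ["receipt", "no-receipt"],
}


def pairs_of(test):
    """All 2-wise (input, value) combinations covered by one test case."""
    return set(combinations(sorted(test.items()), 2))


# Every pair that appears in some full combination must be covered.
all_tests = [dict(zip(inputs, values)) for values in product(*inputs.values())]
required = set().union(*(pairs_of(test) for test in all_tests))

suite = [
    ("coke", "icy", "receipt"),
    ("sprite", "icy", "no-receipt"),
    ("juice", "icy", "receipt"),
    ("water", "icy", "receipt"),
    ("coke", "normal", "no-receipt"),
    ("sprite", "normal", "receipt"),
    ("juice", "normal", "no-receipt"),
    ("water", "normal", "no-receipt"),
]
covered = set().union(*(pairs_of(dict(zip(inputs, test))) for test in suite))

print(len(all_tests), len(required), covered == required)
```

Half as many test cases as the exhaustive suite suffice here; the saving grows quickly as the number of inputs increases.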

Combination coverage in practice
• 2-wise combination coverage is very widely used
  – Pair-wise testing
  – All-pairs testing
• Mostly used in configuration testing
• Example: configuration of gcc
  – A lot of variables
  – Several options for each variable
  – For command-line tools: add or remove an option
