System Testing (c) Henrik Bærbak Christensen
What is System Testing? • System testing is testing the system as a whole and consists of a set of test types, each with a different focus: • Functional testing • functional requirements, use cases, "doing the job" • Performance testing • coping with workload, latency, transaction counts • Stress testing • coping with above-limit workload, graceful degradation • Configuration testing • deployment and (re)configuration • Security testing • threat resistance • Recovery testing • loss of resources, recovery (c) Henrik Bærbak Christensen
Aim • System testing's aim is to demonstrate that the system performs according to its requirements. • thus it embodies the contradictory aims of • proving that there are defects • making us confident that there are none... • but the focus is more on the latter than at lower levels of testing • Next testing levels: • Acceptance test: includes the customer • Factory AT (FAT) and Site AT (SAT) • Alpha/Beta test: includes mass-market user groups • Alpha: at the shop; Beta: at the customer; Release Candidate (c) Henrik Bærbak Christensen
War stories • A major Danish supplier of hardware/software for military aircraft • 1000 hours of system testing • A small Danish company serving Danish airports • the 'factory acceptance test' area was called the "wailing wall" (grædemuren) (c) Henrik Bærbak Christensen
Functional Testing (c) Henrik Bærbak Christensen
Functional Test • Driven by the requirements / specifications • functional tests can thus be planned and test cases defined very early in a project • Often large overlap with top-level integration test cases • however, the changed focus often requires new test cases. (c) Henrik Bærbak Christensen
Coverage • The natural adequacy criteria / coverage metrics of functional tests are of course defined by the requirements • requirements coverage • all functional requirements must be achievable by the system • use case coverage • all use cases are covered • Technical adequacy may also be interesting • state coverage • all system states and state transitions are covered • function coverage • all "system functions" are covered (c) Henrik Bærbak Christensen
The story is the same • The basic story at the system level resembles all other levels: • Use systematic techniques to define test cases such that • coverage is high • the perceived chance of finding defects is high • a minimal number of test cases is defined • Remember: • Reliability: Probability that a software system will not cause the failure of the system for a specified time under specified conditions. (c) Henrik Bærbak Christensen
Defining test cases • Functional tests are black-box in nature. • Thus BB techniques are applicable (see the sketch below) • ECs of user input / required output • valid and invalid input handling • boundary value analysis (c) Henrik Bærbak Christensen
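A minimal JUnit 4 sketch of such black-box test cases. The `AgeValidator` and its 18..65 rule are invented purely to illustrate equivalence classes and boundary value analysis; at system level the same structure applies, only the call goes through the system's external interface instead of a local class.

```java
import org.junit.Test;
import static org.junit.Assert.*;

/** Sketch: black-box test cases from equivalence classes and boundary values,
 *  assuming a hypothetical rule "valid ages are 18..65 inclusive". */
public class AgeValidatorTest {

    // hypothetical stand-in for the system function under test
    static class AgeValidator {
        static boolean isValid(int age) { return age >= 18 && age <= 65; }
    }

    @Test
    public void representativeOfValidEquivalenceClass() {
        assertTrue(AgeValidator.isValid(40));      // valid EC: 18..65
    }

    @Test
    public void representativesOfInvalidEquivalenceClasses() {
        assertFalse(AgeValidator.isValid(10));     // invalid EC: below range
        assertFalse(AgeValidator.isValid(90));     // invalid EC: above range
    }

    @Test
    public void boundaryValues() {
        assertFalse(AgeValidator.isValid(17));     // just outside lower bound
        assertTrue(AgeValidator.isValid(18));      // on lower bound
        assertTrue(AgeValidator.isValid(65));      // on upper bound
        assertFalse(AgeValidator.isValid(66));     // just outside upper bound
    }
}
```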
Strategies for selection at system level • John D. McGregor and David A. Sykes, Practical Guide to Testing Object-Oriented Software, Addison Wesley, 2001. • McGregor and Sykes describe two approaches at system level: • Defect Hunting versus System Use • illustrates the contradictory aims of testing (c) Henrik Bærbak Christensen
Defect Hunting • Defect hunting is based upon the observation that developers often pay less attention to error handling than to normal processing. • Thus the idea is to go hunting for defects by trying to trigger failures: provide invalid input/conditions. • the classic is the monkey test: hammer away on the keyboard and see what happens (see the sketch below) • This has some overlap with performance, stress, and other types of testing mentioned later. (c) Henrik Bærbak Christensen
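A minimal sketch of a keyboard monkey test, assuming a plain Java desktop setting: `java.awt.Robot` fires random key events at whatever window currently has focus while the tester watches for crashes or hangs. The chosen keys, event count, and pacing are arbitrary assumptions, not part of any particular method, and the test obviously needs a display with the application under test in focus.

```java
import java.awt.Robot;
import java.awt.event.KeyEvent;
import java.util.Random;

/** Sketch of a "monkey test": hammer random keys at the focused application. */
public class KeyboardMonkey {
    public static void main(String[] args) throws Exception {
        Robot robot = new Robot();                 // requires a graphical environment
        Random rnd = new Random();
        int[] keys = { KeyEvent.VK_A, KeyEvent.VK_ENTER, KeyEvent.VK_ESCAPE,
                       KeyEvent.VK_BACK_SPACE, KeyEvent.VK_TAB, KeyEvent.VK_F1 };
        for (int i = 0; i < 10_000; i++) {         // arbitrary number of events
            int key = keys[rnd.nextInt(keys.length)];
            robot.keyPress(key);
            robot.keyRelease(key);
            robot.delay(5);                        // small pause between events
        }
    }
}
```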
System Use • Remember the definition of reliability: • Reliability: Probability that a software system will not cause the failure of the system for a specified time under specified conditions. • Thus: Reliability is determined by the way we use the system. (c) Henrik Bærbak Christensen
War Story: Recovery/Exceptions • Ragnarok encountered 'disk full' during a save operation, resulting in lost data in a version control repository (and that is b-a-d!) for a compiler… • 'disk full' had never been thought of… • [but] • given the scope of the system (a research prototype) the effort to get the system running again was lower than the effort of doing systematic testing… • (but this estimate was never made, I was just plain lucky!) (c) Henrik Bærbak Christensen
Operational Profile/Usage Model • If we can make a model of how we use the system, we can focus our testing effort (read: cost) on those functions that are used the most (read: benefit). • Thus: get as much reliability as cheaply as possible (c) Henrik Bærbak Christensen
Operational Profiles • Operational profile • a quantitative characterization of how a software system will be used in its intended environment • a specification of classes of inputs and the probability of their occurrence • [Burnstein 12.3] • Burnstein defines a five-step process to derive it: • customer profile, user profile, system-mode profile, functional profile, operational profile... (c) Henrik Bærbak Christensen
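A minimal sketch of how an operational profile might be represented and sampled. The three functions and their probabilities are invented; drawing the next operation in proportion to its probability is what lets a test driver spend most of its effort on the most frequently used functions.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Random;

/** Sketch: an operational profile as (function, probability) pairs plus a weighted draw. */
public class OperationalProfile {
    private final Map<String, Double> profile = new LinkedHashMap<>();
    private final Random rnd = new Random();

    public OperationalProfile() {
        // assumed probabilities; in practice they come from the profile-building process
        profile.put("search", 0.80);
        profile.put("update", 0.15);
        profile.put("configure", 0.05);
    }

    /** Draw the next operation to exercise, proportionally to its probability. */
    public String nextOperation() {
        double r = rnd.nextDouble();
        double cumulative = 0.0;
        for (Map.Entry<String, Double> e : profile.entrySet()) {
            cumulative += e.getValue();
            if (r < cumulative) return e.getKey();
        }
        return "search";   // fallback for rounding at the very end of the range
    }

    public static void main(String[] args) {
        OperationalProfile p = new OperationalProfile();
        for (int i = 0; i < 10; i++) System.out.println(p.nextOperation());
    }
}
```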
Another approach • Binder defines three system testing patterns: • Extended Use Case Test • focus on complete requirements coverage • Covered in CRUD • CRUD coverage for all problem domain abstractions • akin to Burnstein's "function coverage" • Allocate Tests by Profile • maximize requirements coverage under budget constraints (c) Henrik Bærbak Christensen
Extended Use Case Test (c) Henrik Bærbak Christensen
EUCT • Intent • Develop an application system test suite by modeling essential capabilities as extended use cases • Based on extended use cases: • a use case augmented with information on • the domain of each variable that participates in the use case • required input/output relationships among use case variables • (relative frequency of the use case) • sequential dependencies among use cases • that is, we need the exact input/output relation... (c) Henrik Bærbak Christensen
Process • Binder describes the process of defining test cases based upon the extended use cases: • identify operational variables • define variable domains • develop operational relations • develop test cases • This is actually just another way of saying EC and boundary value analysis testing, nothing new: • identify input dimensions • identify boundaries • develop equivalence classes • develop test cases (see the sketch below) (c) Henrik Bærbak Christensen
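A minimal sketch of a test derived this way, for a hypothetical 'withdraw cash' extended use case: the operational variables are the balance and the requested amount, their domains are split into equivalence classes, and the operational relation is tabulated as (input, expected outcome) rows. The `withdraw` method is only a stand-in for the real system.

```java
import org.junit.Test;
import static org.junit.Assert.assertEquals;

/** Sketch: extended-use-case test for a hypothetical "withdraw cash" capability. */
public class WithdrawCashTest {

    // stand-in for the system under test; a real test would go through the system interface
    static String withdraw(int balance, int amount) {
        if (amount <= 0) return "REJECT";
        return amount <= balance ? "DISPENSE" : "INSUFFICIENT_FUNDS";
    }

    @Test
    public void operationalRelation() {
        // one row per combination of equivalence classes of the operational variables:
        // { balance, amount, expected outcome }
        Object[][] rows = {
            { 1000,  200, "DISPENSE" },             // amount well within balance
            { 1000, 1000, "DISPENSE" },             // boundary: amount == balance
            { 1000, 1001, "INSUFFICIENT_FUNDS" },   // just above balance
            { 1000,   -5, "REJECT" },               // invalid input
        };
        for (Object[] row : rows) {
            assertEquals(row[2], withdraw((Integer) row[0], (Integer) row[1]));
        }
    }
}
```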
Entry/Exit • Entry criteria • that we have extended use cases • that the system has passed integration testing • Exit criteria • use case coverage / requirements coverage (c) Henrik Bærbak Christensen
Real Example (c) Henrik Bærbak Christensen
Covered in CRUD (c) Henrik Bærbak Christensen
CRUD • Intent • Covered in CRUD verifies that all basic operations are exercised for each problem domain object in the system under test. • Context • The anti-decomposition axiom once again: system-level testing does not achieve full coverage of its units. • CRUD focuses on exercising the domain objects through the central Create, Read, Update, Delete operations. (c) Henrik Bærbak Christensen
Strategy • 1) Make a matrix of use cases versus domain object CRUD • 2) Develop test cases for missing ops on any domain object. (c) Henrik Bærbak Christensen
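A minimal bookkeeping sketch for this strategy: record which CRUD operations each use case exercises on each domain object, then report the operations no use case covers so additional test cases can be written for them. The use cases and the Order domain object are invented.

```java
import java.util.*;

/** Sketch: "Covered in CRUD" bookkeeping over a use case / domain object matrix. */
public class CrudMatrix {
    // aggregated per domain object; which use case contributed is irrelevant
    // for the exit criterion (all of C, R, U, D exercised per object)
    private final Map<String, Set<Character>> covered = new HashMap<>();

    /** Record that a use case exercises the given CRUD letters on a domain object. */
    public void record(String useCase, String domainObject, String ops) {
        Set<Character> set = covered.computeIfAbsent(domainObject, k -> new HashSet<>());
        for (char op : ops.toCharArray()) set.add(op);
    }

    /** Return the CRUD letters not yet exercised for a domain object. */
    public String missing(String domainObject) {
        StringBuilder sb = new StringBuilder();
        for (char op : "CRUD".toCharArray())
            if (!covered.getOrDefault(domainObject, Set.of()).contains(op)) sb.append(op);
        return sb.toString();
    }

    public static void main(String[] args) {
        CrudMatrix m = new CrudMatrix();
        m.record("Place order",  "Order", "C");
        m.record("View order",   "Order", "R");
        m.record("Cancel order", "Order", "U");
        // 'D' is never exercised, so a test case deleting an Order must be developed
        System.out.println("Missing CRUD ops on Order: " + m.missing("Order"));
    }
}
```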
Coverage • Exit criteria • CRUD adequacy (c) Henrik Bærbak Christensen
Allocate Tests by Profile (c) Henrik Bærbak Christensen
Profile testing • Intent • Allocate the overall testing budget to each use case in proportion to its relative frequency. • Context • Rank use cases according to relative frequency • Allocate testing resources accordingly • Develop test cases for each use case until the resources allocated to it are exhausted. • Maximize reliability given the testing budget. • Example from previously • Word's "save" versus "configure button panel" (c) Henrik Bærbak Christensen
Strategy • Estimate use case frequency • difficult, but reportedly the relative ordering is more important than the exact frequencies • 90% search, 10% update • 80% search, 20% update • either way the testing is heavily skewed towards search testing... • Testing effort is proportional to • use case probability (c) Henrik Bærbak Christensen
Example • Design, set up, and run a test: 1h • A test finds a defect: 5% chance • Correcting a defect: 4 hours • Testing budget: 1000h • Total number of tests: T • Mean time per test: 1h + (0.05 * 4h) = 1.2h • Number of test cases: T = 1000h / 1.2h ≈ 833 • Allocate the 833 test cases according to the profile • a use case with 50% probability would get 833 * 50% ≈ 417 test cases (see the worked sketch below) (c) Henrik Bærbak Christensen
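The arithmetic above written out as a small sketch so the numbers can be reproduced; the 50% use case is taken from the example, while the remaining 30%/20% profile entries are assumed just to complete the allocation.

```java
/** Sketch: how far a 1000h budget stretches, and how the tests are split by profile. */
public class TestBudget {
    public static void main(String[] args) {
        double hoursPerTest = 1.0;          // design, set up, and run one test
        double defectProbability = 0.05;    // chance that a test finds a defect
        double hoursPerFix = 4.0;           // correcting a defect
        double budgetHours = 1000.0;

        // mean cost of one test = run cost + expected correction cost
        double meanHoursPerTest = hoursPerTest + defectProbability * hoursPerFix;  // 1.2h
        long totalTests = Math.round(budgetHours / meanHoursPerTest);              // ~833

        // allocate to use cases in proportion to their profile probability
        double[] profile = { 0.50, 0.30, 0.20 };   // first entry from the slide, rest assumed
        for (double p : profile) {
            System.out.printf("use case with probability %.0f%% gets about %d tests%n",
                              p * 100, Math.round(totalTests * p));
        }
    }
}
```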
Notes • Take care of • critical use cases (central for operation, like "save") • high-frequency use cases with a trivial implementation • many low-frequency use cases • merge them • (or maybe they are simply modelled wrongly) • Exit criteria • use case coverage (c) Henrik Bærbak Christensen
Non-functional testing types (c) Henrik Bærbak Christensen
Performance Testing • Does it perform OK? • coping with workload, latency, transaction counts per second, etc. • Test case generation based upon standard techniques • Load generators: • load = inputs that simulate a group of transactions • applications that will generate load (see the sketch below) • multiple users accessing a database/web service • capture/replay tactics for realistic load • "devilish" loads (c) Henrik Bærbak Christensen
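A minimal sketch of a load generator along these lines: a thread pool simulates a group of users, each issuing transactions against the system under test while latency is accumulated. The `callTheSystem()` body is a placeholder assumption; in a real test it would be an HTTP request, a database transaction, or similar.

```java
import java.util.concurrent.*;
import java.util.concurrent.atomic.AtomicLong;

/** Sketch: generate load from many simulated users and report mean latency. */
public class LoadGenerator {

    // placeholder for one real transaction against the system under test
    static void callTheSystem() throws InterruptedException {
        Thread.sleep(10);
    }

    public static void main(String[] args) throws Exception {
        final int users = 50, transactionsPerUser = 100;   // assumed load shape
        ExecutorService pool = Executors.newFixedThreadPool(users);
        AtomicLong totalNanos = new AtomicLong();

        for (int u = 0; u < users; u++) {
            pool.submit(() -> {
                for (int t = 0; t < transactionsPerUser; t++) {
                    long start = System.nanoTime();
                    try { callTheSystem(); } catch (InterruptedException e) { return; }
                    totalNanos.addAndGet(System.nanoTime() - start);
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.MINUTES);

        long transactions = (long) users * transactionsPerUser;
        System.out.printf("mean latency: %.2f ms over %d transactions%n",
                          totalNanos.get() / 1e6 / transactions, transactions);
    }
}
```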
Performance Testing • Probes • software units that collect performance data in the production code • the "built-in monitors" tactic • perhaps tools to analyse the probe output to judge whether the expected performance can be confirmed... (c) Henrik Bærbak Christensen
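A minimal sketch of such a probe (the 'built-in monitors' tactic): production code routes an operation through the probe, which accumulates call counts and elapsed time that a performance test can read out afterwards. The monitored operation here is just a placeholder.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Sketch: a built-in monitor that collects call counts and timing in production code. */
public class PerformanceProbe {
    private final AtomicLong calls = new AtomicLong();
    private final AtomicLong nanos = new AtomicLong();

    /** Run an operation under the probe, recording its duration even if it throws. */
    public void measure(Runnable operation) {
        long start = System.nanoTime();
        try {
            operation.run();
        } finally {
            calls.incrementAndGet();
            nanos.addAndGet(System.nanoTime() - start);
        }
    }

    /** Summary that a performance test (or an analysis tool) can inspect. */
    public String report() {
        long n = calls.get();
        return n == 0 ? "no calls recorded"
                      : String.format("%d calls, mean %.2f ms", n, nanos.get() / 1e6 / n);
    }

    public static void main(String[] args) {
        PerformanceProbe probe = new PerformanceProbe();
        probe.measure(() -> { /* placeholder for the monitored production operation */ });
        System.out.println(probe.report());
    }
}
```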
Stress Testing • Stress testing • allocate resources in maximum amounts • flood with requests, flood with processes, fill RAM, fill disk, fill data structures • [fill the event queue with mouse events; or hammer away on the keyboard (monkey test)] • accumulate data over a period of years • It is easy to find examples of defects • JHotDraw with 2000 figures… • Java with a couple of hundred threads… (c) Henrik Bærbak Christensen
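A minimal 'fill RAM' sketch in the same spirit: keep allocating until the JVM heap is exhausted and then observe how things degrade. Run it with a small heap (for example `-Xmx64m`) to keep it quick; a real stress test would of course exhaust resources in the system under test and assert graceful degradation rather than just print a message.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch: allocate until the heap is full, then check what happens afterwards. */
public class HeapStress {
    public static void main(String[] args) {
        List<byte[]> hog = new ArrayList<>();
        int blocks = 0;
        try {
            while (true) {
                hog.add(new byte[1_000_000]);   // grab roughly 1 MB at a time
                blocks++;
            }
        } catch (OutOfMemoryError e) {
            hog.clear();                        // release memory so we can still report
            System.out.println("heap exhausted after about " + blocks
                               + " MB; now check for graceful degradation");
        }
    }
}
```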
Configuration/Deployment • Configuration testing / deployment testing • verify product variants on the execution environment • i.e. how does the installation program work? Is it complete? Are any files missing? Are all combinations of options working? • What about dependencies on things that will change? • Burnstein • test all combinations/configurations • test all device combinations • poor Microsoft! • test that the performance level is maintained (c) Henrik Bærbak Christensen
Deployment testing • The Microsoft platform has given a name to the problem: • 'DLL hell' • The Pentium bug • early releases of the Pentium x86 performed incorrect floating-point calculations (!) • the defect was traced to a faulty script for transferring the design models to the machines that generate the hardware masks • To verify that an installation is correct • run the system regression tests for 'normal' operation • pruned for tests using options that are not installed (c) Henrik Bærbak Christensen
Security testing • War Story • the SAVOS weather system at Copenhagen Airport • "no escape to Windows from SAVOS" was a requirement • one day we saw a guy playing chess with SAVOS minimized! • we did not know that double-clicking the title bar minimizes the window • Standard security aspects apply, but also: • the ability of the application to access resources • for instance • system DLLs that the user does not have permission to access… • a wrong Java policy file (c) Henrik Bærbak Christensen
Recovery Testing • Recovery testing • induce loss of resources to verify that the system can properly recover • losing the network connection, losing a server mid-transaction, losing data input from the field • many availability techniques • cold standby, warm standby, hot standby, exception handling, etc. • Areas of interest • Restart: pending transactions and system state are properly re-established (see the sketch below) • Switch-over: from master to slave system (c) Henrik Bærbak Christensen
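A minimal sketch of a recovery test for the 'Restart' concern: a stub transactional store is forced to lose its server mid-transaction (commit is never reached), and the test asserts that no half-done update survives after recovery. All classes here are invented stand-ins for the system under test.

```java
import org.junit.Test;
import static org.junit.Assert.assertEquals;

/** Sketch: verify that a pending transaction is discarded when the system recovers. */
public class RecoveryTest {

    // hypothetical transactional store that buffers a write until commit
    static class Store {
        private String committed = "initial";
        private String pending;
        void begin(String newValue) { pending = newValue; }
        void commit()               { committed = pending; pending = null; }
        void recover()              { pending = null; }   // discard half-done work on restart
        String value()              { return committed; }
    }

    @Test
    public void pendingTransactionIsRolledBackOnRecovery() {
        Store store = new Store();
        store.begin("updated");
        // simulate losing the server mid-transaction: commit() is never reached
        store.recover();
        assertEquals("initial", store.value());   // system state properly re-established
    }
}
```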
Regression Testing • Not a special testing technique but retesting software after a change. • Very expensive for manual tests • researchers try to find the minimal test set to run given some change in a module A • Automated tests fare better • if they are not too heavy to execute... (c) Henrik Bærbak Christensen
Acceptance tests • System tests performed together with the customer (factory acceptance test) • does the system meet the users' expectations? • rehearsal by testers and developers is important • all defects must be documented • Installation test (site acceptance test) • perform the same tests at the customer's site (c) Henrik Bærbak Christensen
Alpha/Beta tests • The accept and installation tests for the mass-market • α-test: Users at developers’ site • β-test: Users at their own site (c) Henrik Bærbak Christensen
Summary • System testing is concerned with • end-user behavior verification, and • performance, stress, configurations, etc. • The BB techniques and WB coverage criteria apply at all levels • consider equivalence classes and boundaries • consider the coverage required (c) Henrik Bærbak Christensen