Testing Product Lines of Industrial Size: Advancements in Combinatorial Interaction Testing

Martin Fagereng Johansen, PhD Thesis Defense, 2013-11-05

Industrial Motivation

TOMRA's Reverse Vending Machines • Finale's Financial Reporting Systems • ABB's Configurable Safety Module • Eclipse IDEs – Free and Open Source

About the Eclipse IDE • Initiated and funded by IBM • Widely used by software engineers to develop software • Major competitor to Microsoft Visual Studio • Many third-party extensions

Eclipse IDE – v3.7.0 (Indigo) – An Example of a Software Product Line • The problem: Can we gain confidence that any product will work?

Which products are possible? → Model its features and their relationships in a feature model: 356,352 possibilities!

Today: A Test Suite for Each Feature http://wiki.eclipse.org/Eclipse/Testing http://archive.eclipse.org/eclipse/downloads/drops/R-3.7-201106131736/testResults.php

Overview

Sampling

Faulty Features • Unit tests may find faults inside a single feature. • n test suites required for a product line with n features. • What about faulty cooperation between features? • What if they interact incorrectly?

Interaction Faults • 2-wise interaction fault • reproducible by including 2 specific features • the others do not matter

Interaction Faults • 3-wise interaction fault • reproducible by including 3 specific features • the others do not matter

Empirics Show: • Kuhn et al. 2004: • Almost all bugs can be attributed to the interaction of a few features.

Covering Arrays • Only a few products are needed to cover all simple interactions. • i.e. testing a few well-selected products might reveal almost all bugs • Examples (2-wise testing): • For the "e-shop product line" with 287 features: 21 products • For the Linux kernel with almost 7,000 features: 480 products
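To make "cover all simple interactions" concrete, here is a minimal Python sketch. It assumes a toy product line of three Boolean features with no constraints between them (an assumption for illustration; real feature models restrict which products are valid). The helper name `uncovered_pairs` is hypothetical and not part of any tool mentioned in these slides.

```python
from itertools import combinations, product

def uncovered_pairs(products, n_features):
    """Return the 2-wise interactions not covered by any product.

    A 2-wise interaction fixes two distinct features to included (1)
    or excluded (0); ((0, 1), (2, 0)) means f0 included, f2 excluded.
    """
    missing = set()
    for i, j in combinations(range(n_features), 2):
        for vi, vj in product((0, 1), repeat=2):
            if not any(p[i] == vi and p[j] == vj for p in products):
                missing.add(((i, vi), (j, vj)))
    return missing

# Toy assumption: 3 unconstrained features give 8 possible products,
# yet 4 well-chosen products cover all 12 pairwise interactions.
covering_array = [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0)]
all_products = list(product((0, 1), repeat=3))

print(len(all_products))                        # 8
print(len(uncovered_pairs(covering_array, 3)))  # 0
```

The gap between the number of possible products and the size of a covering array widens quickly as features are added, which is what makes the product counts quoted above plausible.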

Configuring Feature Models • Feature models can be solved by configuration: • …or by satisfying the corresponding Boolean formula: • R ∧ (A ⇒ R) ∧ (B ⇒ R) ∧ (C ⇒ A) ∧ (D ⇒ A) ∧ (C ∨ D) ∧ ¬(C ∧ D) ∧ (E ⇒ B) ∧ (F ⇒ B) ∧ (E ∨ F) ∧ (D ⇒ E) • e.g. R = 1, A = 1, B = 1, C = 0, D = 1, E = 1, F = 0 • The SAT problem.
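The formula on this slide is small enough to check by hand or by brute force. The following Python sketch encodes it directly and enumerates all assignments; the names `is_valid` and `valid_products` are hypothetical helpers. Exhaustive enumeration is only an illustration for seven features; it is not how a SAT solver works and would not scale to industrial models.

```python
from itertools import product

FEATURES = ("R", "A", "B", "C", "D", "E", "F")

def implies(p, q):
    """Material implication p => q."""
    return (not p) or q

def is_valid(cfg):
    """Evaluate the slide's Boolean formula on one assignment."""
    R, A, B, C, D, E, F = (cfg[f] for f in FEATURES)
    return (
        R
        and implies(A, R) and implies(B, R)
        and implies(C, A) and implies(D, A)
        and (C or D) and not (C and D)
        and implies(E, B) and implies(F, B)
        and (E or F)
        and implies(D, E)
    )

def valid_products():
    """Yield every satisfying assignment by trying all 2^7 candidates."""
    for bits in product((False, True), repeat=len(FEATURES)):
        cfg = dict(zip(FEATURES, bits))
        if is_valid(cfg):
            yield cfg

# The example assignment from the slide.
example = dict(R=True, A=True, B=True, C=False, D=True, E=True, F=False)
print(is_valid(example))                  # True
print(sum(1 for _ in valid_products()))   # 5
```

Of the 128 possible assignments, only 5 satisfy the formula, so this toy feature model describes 5 products.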

State-of-the-Art Argument • SAT is the classic NP-complete problem. • worst-case analysis (Cook 1971) • Configuring basic feature models • i.e., finding a single product of a product line • SPLE-SAT – Software Product Line Engineering Boolean SATisfiability • Includes only feature models that occur in SPLE. • Argument • SPLE-SAT = SAT, and SAT is NP-complete • i.e., SPLE-SAT is NP-complete • i.e., SPLE-SAT is impractical (unless P=NP, due to Cobham's thesis) • i.e., because sampling involves SPLE-SAT, sampling is impractical.

Our Argument • If SPLE-SAT is impractical: • Configuring a feature model is impractical. • i.e., testing product lines is of no concern. • If we cannot find any products, why care about their quality? • However, if we have a product line with products: • Finding them was practical. • We care about their quality. • i.e., SPLE-SAT is practical. • Also: • If a feature model is too hard to configure, then it cannot serve its purpose as an SPLE artifact. • A customer cannot use it to customize a product to their needs. • i.e., SPLE-SAT is practical.

Empirical Investigation: SAT time • SPLE-SAT is very quick. • Even for the largest models. • E.g. The Linux Kernel • Routinely configured by hand.

Conclusions as Venn Diagrams • State of the Art Conclusion: SAT = SPLE-SAT • Our Conclusion: [Venn diagram; region labels: Hard SAT, SAT]

Sampling: Impractical in Practice

A New, Efficient Algorithm: ICPL

What makes ICPL quick? • Based on a greedy polynomial-time approximation algorithm for the set covering problem (SCP) • Chvátal's algorithm (Chvátal 1979) • We know SPLE-SAT is quick. • Strategically run SPLE-SAT often and infer as much as possible. • Utilize modern hardware. • large amounts of memory (128 GB) • truly parallel processing (64 concurrent executions) • Separate out data-parallel sub-algorithms. • ++
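For readers unfamiliar with the greedy set-cover heuristic mentioned above, here is a minimal Python sketch of the textbook version. It is not ICPL itself and omits everything that makes ICPL fast (SAT-based inference, parallelism); the function name `greedy_set_cover` and the toy data are assumptions for illustration. In the covering-array setting, the universe would be the valid t-wise interactions and each subset would be the interactions covered by one candidate product.

```python
def greedy_set_cover(universe, subsets):
    """Greedy heuristic: repeatedly pick the subset that covers the most
    still-uncovered elements. Returns the chosen indices into `subsets`."""
    uncovered = set(universe)
    chosen = []
    while uncovered:
        # max() returns the first index among ties.
        best = max(range(len(subsets)),
                   key=lambda i: len(subsets[i] & uncovered))
        if not subsets[best] & uncovered:
            raise ValueError("universe cannot be covered by the given subsets")
        chosen.append(best)
        uncovered -= subsets[best]
    return chosen

# Toy instance (hypothetical data).
universe = range(1, 8)
subsets = [{1, 2, 3, 4}, {4, 5, 6}, {6, 7}, {1, 5}, {2, 7}]
picked = greedy_set_cover(universe, subsets)
print(picked)  # [0, 1, 2]
```

The greedy choice does not guarantee a minimum cover, but it runs in polynomial time and its result is provably within a logarithmic factor of the optimum.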

Comparison • State of the art: • Our new algorithm (ICPL):


Market-focused Sampling

Industrial Context • TOMRA's Reverse Vending Machines:

Feature Model of TOMRA RVM 435,808 possibilities!

The 12 Products in Their Test-lab

Full Sampling was Too Costly • The problem • Too many test-products • Their Need: • Optimize the selection of 12 products. • Our answer: • Model the market situation. • Select the most relevant products according to that model.

Our Model of the Market Situation: "Weighted sub-product lines"

Better Coverage with 12 Products [Figure: coverage of all interactions vs. interactions of market t…]

Interactions • With a 3-wise covering array, we get a few products with: • With a 2-wise covering array, we get a few products with: • TestCSV succeeds for both. • Does CSV work without GEF? • CSV works with and without Web Tools. • Does CSV work with CDT? • Etc…

What Eclipse Tests Today: 2-Wise Covering Array:

Test Results – Pair-wise Testing

Possible Causes • Two (or more) features … • access the same resource • have overlapping GUI elements • SWTBot tests • have dependencies that interact wrongly • wait for each other (deadlock) • +++

Potential Faults Found using Existing Test Cases • Strategic application of existing tests revealed potential faults. • Relatively inexpensive to apply. • Raises confidence when the tests succeed. • Such a large-scale, fully reproducible and documented application of a product line testing technique is not found in the existing literature.
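The idea of reusing existing per-feature tests on sampled products can be sketched as follows. This is a simplified illustration in Python, not the allocation technique of the thesis: it assumes each feature has a named test suite and that a suite applies to every product containing that feature. The feature names CSV, GEF and CDT and the test name TestCSV appear in these slides; the product names, `TestGEF`, `TestCDT` and the function `allocate_tests` are hypothetical.

```python
def allocate_tests(products, tests_by_feature):
    """Map each product to the test suites of the features it includes."""
    plan = {}
    for name, features in products.items():
        plan[name] = sorted(
            test
            for feature in features
            for test in tests_by_feature.get(feature, ())
        )
    return plan

# Hypothetical sample of products, each given as a set of included features.
products = {
    "product-1": {"CSV", "GEF"},
    "product-2": {"CSV", "CDT"},
    "product-3": {"GEF"},
}
# Hypothetical feature-to-test mapping.
tests_by_feature = {"CSV": ["TestCSV"], "GEF": ["TestGEF"], "CDT": ["TestCDT"]}

plan = allocate_tests(products, tests_by_feature)
print(plan["product-2"])  # ['TestCDT', 'TestCSV']
```

Running the same feature test on several products with different feature combinations is what lets an unchanged test expose an interaction fault: a test that passes on one product and fails on another points at the features that differ.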

Also Applied to the ABB-case • Two bugs identified with 5 test cases.

SPLCATool

Future Work • Further empirical study of faults in software product lines. • Complete application to the Eclipse IDE • With test cases for all features; it is possible today! • A good source of further empirics. • A good basis for further improvements. • Even quicker algorithms for covering array generation. • Less memory usage. • Higher degree of parallelism. • Improved test allocation. • Based on specification, model or implementation. • Based on meta-data such as versions.

Summary • SPLE-SAT was investigated. • Realistic feature models are readily configurable. • Encourages the investigation into faster algorithms. • A fast algorithm for sampling. • Enables the use of sampling for product line testing. • Theory and algorithms for market-focused sampling. • One approach for automatic allocation of test cases. • Enables the production of a test report from (1) an implementation, (2) a test case collection and (3) a feature model. • An automatic and scalable technique for software product line testing supported by free, open source tooling. • SPLCATool
