Establishing Theoretical Minimal Sets of Mutants
ICST 2014
Paul Ammann
Joint work with Marcio Eduardo Delamaro and Jeff Offutt
April 1, 2014

Outline
• The situation
  • Researchers use mutation analysis to evaluate test selection strategies
• The problem
  • What do mutation scores mean?
• The model
  • Motivating idea: Minimal mutant sets don’t have redundant mutants
  • Need to define notion of redundancy
  • Main result: Dynamic subsumption = Minimal mutant sets
• Reduced mutation: Is it close to minimal?
  • Apply model to Siemens suite
  • Result: Huge gap
  • Good news: That’s an opportunity!

Researchers Use Mutation Analysis to Evaluate Test Selection Strategies
[Diagram labels: Carefully Chosen Artifacts; Select Test Sets with Test Selection Strategies (Test Selection Strategies A, B, C and Test Sets A, B, C); Deep Analysis; Measure “Good” Tests]
Exactly what does a score of 91% mean?

The Problem With Mutation Scores
Evaluate 3 test sets with 4 mutants:
• A: {t1, t2}
• B: {t2, t5}
• C: {t3}
[Chart: Mutation Scores for 3 Test Sets]
B scores 75%. Is that good?

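A mutation score is the fraction of mutants killed by at least one test in the set. The Python sketch below computes it for the three test sets on this slide. The kill matrix `KILLS` is a hypothetical assumption, since the slide's chart data is not preserved here; it was chosen only so that B scores 75% as the slide states.

```python
# Hypothetical kill matrix: for each mutant, the set of tests that kill it.
# Illustrative assumption only; not the data behind the slide's chart.
KILLS = {
    "m1": {"t1", "t4", "t5"},
    "m2": {"t1", "t2", "t4", "t5"},
    "m3": {"t2", "t3", "t4"},
    "m4": {"t1", "t4"},
}

# The three test sets named on the slide.
TEST_SETS = {"A": {"t1", "t2"}, "B": {"t2", "t5"}, "C": {"t3"}}


def mutation_score(test_set, kills):
    """Fraction of mutants killed by at least one test in test_set."""
    killed = sum(1 for killers in kills.values() if killers & test_set)
    return killed / len(kills)


SCORES = {name: mutation_score(ts, KILLS) for name, ts in TEST_SETS.items()}
print(SCORES)
```

Under this assumed matrix, B misses only m4, which is why its score is 3 out of 4.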
Let’s Add Some More Mutants
Evaluate 3 test sets with 10 mutants:
• A: {t1, t2}
• B: {t2, t5}
• C: {t3}
[Chart: Mutation Scores for 3 Test Sets]
Callouts:
• The same tests kill m3 and m6. We say that T does not distinguish m3 from m6.
• Every test kills m8. What’s the point?
• Ditto for m5 and m9.
Now B scores 90%! Did B just get better?

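The inflation effect can be reproduced in a few lines. This is a sketch under stated assumptions: the base kill matrix, the duplicate mutant `m3_dup`, and the trivially killed mutant `m_trivial` are all hypothetical names and values, and the example does not reproduce the slide's 10-mutant table or its 90% figure.

```python
def mutation_score(test_set, kills):
    """Fraction of mutants killed by at least one test in test_set."""
    killed = sum(1 for killers in kills.values() if killers & test_set)
    return killed / len(kills)


ALL_TESTS = {"t1", "t2", "t3", "t4", "t5"}

# Hypothetical base kill matrix (an assumption, not the slide's data).
BASE = {
    "m1": {"t1", "t4", "t5"},
    "m2": {"t1", "t2", "t4", "t5"},
    "m3": {"t2", "t3", "t4"},
    "m4": {"t1", "t4"},
}

# Pad with a mutant killed by exactly the same tests as m3 (indistinguished)
# and a mutant that every test kills.
PADDED = dict(BASE, m3_dup=set(BASE["m3"]), m_trivial=set(ALL_TESTS))

B = {"t2", "t5"}
before = mutation_score(B, BASE)    # 3 of 4 mutants killed
after = mutation_score(B, PADDED)   # 5 of 6 mutants killed
print(before, after)
```

The test set B is unchanged, yet its score rises because the added mutants contribute nothing that the original mutants did not already demand.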
Let’s Throw Away Some Mutants
Evaluate 3 test sets with 2 mutants:
• A: {t1, t2}
• B: {t2, t5}
• C: {t3}
[Chart: Mutation Scores for 3 Test Sets]
Now B scores 100%. Did B get even better?

All Together Now
Evaluate 3 test sets with various mutants:
• A: {t1, t2}
• B: {t2, t5}
• C: {t3}
[Chart: Cumulative Scores]
Is B lousy or good? What about C?

What Makes a Mutant Redundant?
Basic idea: Throwing away a redundant mutant has no effect on the minimal test sets.
• Choose M = {m1, m2, m3, m4}
• Choose T = {t1, t2, t3, t4, t5}
• Minimal test sets wrt M and T: {t1, t2}, {t1, t3}, {t4}
• Try removing m4: M4 = M - {m4}
  • Minimal test sets wrt M4 and T: {t1, t2}, {t1, t3}, {t2, t5}, {t3, t5}, {t4}
  • A change, so m4 is not redundant
• Try removing m3: M3 = M - {m3}
  • Minimal test sets wrt M3 and T: {t1}, {t4}
  • A change, so m3 is not redundant
• Try removing m1: M1 = M - {m1}
  • Minimal test sets wrt M1 and T: {t1, t2}, {t1, t3}, {t4}
  • No change, so m1 is redundant
• Ditto for M2 = M - {m2}

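The redundancy check above can be sketched by brute force in Python. The slide does not preserve the kill matrix, so `KILLS` below is a hypothetical reconstruction, chosen because it yields exactly the minimal test sets listed on the slide; the original table may differ. The enumeration is exponential in the number of tests and is meant only to make the definition concrete.

```python
from itertools import combinations

TESTS = ["t1", "t2", "t3", "t4", "t5"]

# Hypothetical kill matrix, consistent with the minimal test sets on the slide.
KILLS = {
    "m1": {"t1", "t4", "t5"},
    "m2": {"t1", "t2", "t4", "t5"},
    "m3": {"t2", "t3", "t4"},
    "m4": {"t1", "t4"},
}


def kills_all(test_set, kills):
    """True if every mutant is killed by some test in test_set."""
    return all(killers & test_set for killers in kills.values())


def minimal_test_sets(kills, tests):
    """All adequate test sets that have no adequate proper subset."""
    found = []
    # Visiting candidates by increasing size guarantees that any smaller
    # adequate subset has already been recorded in `found`.
    for size in range(1, len(tests) + 1):
        for combo in combinations(tests, size):
            candidate = frozenset(combo)
            if kills_all(candidate, kills) and not any(p < candidate for p in found):
                found.append(candidate)
    return set(found)


def is_redundant(mutant, kills, tests):
    """A mutant is redundant if removing it leaves the minimal test sets unchanged."""
    reduced = {m: k for m, k in kills.items() if m != mutant}
    return minimal_test_sets(reduced, tests) == minimal_test_sets(kills, tests)


REDUNDANT = {m: is_redundant(m, KILLS, TESTS) for m in KILLS}
print(REDUNDANT)
```

With this assumed matrix, m1 and m2 come out redundant while m3 and m4 do not, matching the slide's conclusions.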
Minimal Sets of Mutants
• Definition
  • M is minimal if it does not contain redundant mutants
• Minimal mutant sets from the definition
  • Requires computing all minimal test sets, which is NP-complete
• We need an efficient algorithm for finding minimal mutant sets
  • Turn to dynamic subsumption
  • Subsumption with respect to a test set

Dynamic Subsumption
[Diagram: within test set T, the regions of tests that kill mi, mj, and mk. It poses two questions, mi → mj? and mi → mk?, one marked ✔ and the other ✖.]

Efficiently Computing Minimal Sets of Mutants
• Formally: mx dynamically subsumes my wrt T iff
  • Some test in T kills mx
  • Every test in T that kills mx also kills my
• Main result: Mutant set M minimal wrt T = no dynamic subsumption in M
• Properties
  • Only need to consider mutants in pairs
  • Groups of mutants do not make another mutant redundant
  • Fast
  • Every minimal mutant set has the same cardinality
  • Contrast with minimal test sets

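The pairwise definition translates directly into set inclusion on kill sets. The Python sketch below is one possible reading, not the authors' implementation: the kill matrix is hypothetical, and two handling choices are assumptions of this sketch (mutants killed by no test are ignored, and one representative is kept from each group of mutants with identical kill sets).

```python
# Hypothetical kill matrix: mutant -> set of tests in T that kill it.
KILLS = {
    "m1": {"t1", "t4", "t5"},
    "m2": {"t1", "t2", "t4", "t5"},
    "m3": {"t2", "t3", "t4"},
    "m4": {"t1", "t4"},
}


def dynamically_subsumes(mx, my, kills):
    """mx subsumes my: some test kills mx, and every test that kills mx kills my."""
    return bool(kills[mx]) and kills[mx] <= kills[my]


def minimal_mutant_set(kills):
    """One minimal mutant set, found with pairwise kill-set comparisons only."""
    # Keep one representative per distinct, non-empty kill set (assumption).
    representatives = {}
    for m in sorted(kills):
        if kills[m]:
            representatives.setdefault(frozenset(kills[m]), m)
    # Drop every representative whose kill set strictly contains another's,
    # because the mutant with the smaller kill set subsumes it.
    return {
        m
        for kill_set, m in representatives.items()
        if not any(other < kill_set for other in representatives)
    }


MINIMAL = minimal_mutant_set(KILLS)
print(sorted(MINIMAL))
```

For this matrix, m4 subsumes both m1 and m2, so only m3 and m4 remain. The work is quadratic in the number of distinct kill sets, in contrast to enumerating all minimal test sets.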
What Does This Mean in Practice?
• Apply the definitions to the Siemens test bed
  • See what happens!
• 7 programs
  • print_tokens
  • print_tokens2
  • replace
  • schedule
  • schedule2
  • tcas
  • totinfo
• Extensive hand-crafted test set

Test Characteristics
Notes:
• 512 is an artifact of the Proteum tool
  • Our approach applies with any test set
• Most tests used were also distinguished
• Minimal test set size modest compared to number of tests used

Mutant Characteristics
• “Killed Mutants” means those killed by the test set of size 512
  • Vast majority of remainder are equivalent
• Most mutants are redundant!
  • Tiny fraction of mutants are actually minimal wrt 512 tests!
  • print_tokens: Killing the right 28 mutants guarantees killing all 3711

How Good Are Reduced Mutation Strategies?
• We considered five approaches to reduced mutation
  • STMT: Statement deletion (Proteum SSDL)
  • ROR: Relational operators (Proteum ORRN)
  • CON: Replace scalars with constants (Proteum CCSR)
  • 5RND: 5% random selection of all mutants
  • SELECT: “Selective” mutation (Proteum OOAN, OLLN, ORRN, OLNG)
• Method:
  • Choose test sets adequate for each reduced mutation approach
    • wrt test sets analyzed earlier
  • Compute mutation score
    • Against all mutants
    • Against minimal mutant set
  • Equivalent mutants hand-identified and removed

Reduced Mutation Scores: Raw vs. Minimal
Notes:
• Table entries: Raw Mutation Score : Minimal Mutation Score
• Raw reduced mutation scores make test strategies look good
• Minimal reduced mutation scores do not

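The two scores in each table entry differ only in the denominator: the raw score counts all mutants, the minimal score counts only the mutants in a minimal set. A small Python sketch makes the gap concrete. The kill matrix and test set are hypothetical assumptions, not Siemens data, and the one-line minimal-set computation is valid here only because no two mutants in the assumed matrix share a kill set.

```python
# Hypothetical kill matrix (an assumption, not data from the study).
KILLS = {
    "m1": {"t1", "t4", "t5"},
    "m2": {"t1", "t2", "t4", "t5"},
    "m3": {"t2", "t3", "t4"},
    "m4": {"t1", "t4"},
}

# Mutants not subsumed by any other mutant, i.e. those whose kill set has
# no proper subset among the other kill sets.
MINIMAL = [m for m in KILLS if not any(KILLS[o] < KILLS[m] for o in KILLS)]


def score(test_set, kills, mutants):
    """Fraction of the given mutants killed by at least one test in test_set."""
    killed = sum(1 for m in mutants if kills[m] & test_set)
    return killed / len(mutants)


TEST_SET = {"t2", "t5"}  # stands in for a test set produced by some strategy
raw = score(TEST_SET, KILLS, list(KILLS))
minimal = score(TEST_SET, KILLS, MINIMAL)
print(f"raw={raw:.2f} minimal={minimal:.2f}")
```

In this toy case the same test set scores 0.75 against all mutants but only 0.50 against the minimal set, because the redundant mutants it kills pad the raw score.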
Closer Look: Raw vs. Minimal for STMT
• Raw mutation scores show little variation
• Minimal mutation scores show a lot

Reduced Mutation: Mutants vs. Tests
Notes:
• Table entries: Number of Mutants : Size of Minimal Test Set
• Reduced approaches
  • Generate many more mutants than minimal
  • But not nearly enough tests

Closer Look: Mutants and Tests for STMT
• STMT usually generates too many mutants
• Unfortunately, they aren’t the right ones
• Hence, not nearly enough tests

Discussion
• Huge gap: Reduced mutation vs. minimal mutant sets
  • Research opportunity!
• The problem with reduced mutation
  • Reduced approaches don’t consider specific program under test
  • Maybe it’s time to change that
  • Can we analyze specific mutants in a specific program?
• Problem with minimal mutant sets for practical testing
  • Need mutation-adequate tests to compute minimal mutant sets!
  • Aren’t we done at that point?
• There is a lot we don’t know about minimal mutant sets
  • Let’s look at an example from yesterday’s Mutation workshop

Subsumption Graph Example: cal()
• 31 nodes of indistinguished mutants
• 7 nodes of minimal mutants
• muJava generated 145 non-equivalent mutants
  • We only need 7 for the given test set
• Static analysis can refine this graph

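A subsumption graph of this kind can be sketched in Python as follows. This is an illustrative construction, not the tool used for cal(): the kill matrix is hypothetical, each node groups mutants killed by exactly the same tests, and an edge runs from a node to every node it strictly subsumes (so transitive edges are included rather than reduced away). Nodes with no incoming edge hold the minimal mutants.

```python
from collections import defaultdict

# Hypothetical kill matrix; m5 is killed by the same tests as m3.
KILLS = {
    "m1": {"t1", "t4", "t5"},
    "m2": {"t1", "t2", "t4", "t5"},
    "m3": {"t2", "t3", "t4"},
    "m4": {"t1", "t4"},
    "m5": {"t2", "t3", "t4"},
}


def subsumption_graph(kills):
    """Return (nodes, edges, roots) of the dynamic subsumption graph."""
    # Group indistinguished mutants: same kill set means same node.
    groups = defaultdict(list)
    for m in sorted(kills):
        if kills[m]:  # assumption: mutants killed by no test are left out
            groups[frozenset(kills[m])].append(m)
    nodes = {kill_set: tuple(ms) for kill_set, ms in groups.items()}
    # Edge a -> b when a's kill set is a proper subset of b's (a subsumes b).
    edges = {(nodes[a], nodes[b]) for a in nodes for b in nodes if a < b}
    # Root nodes are subsumed by no other node.
    targets = {dst for _, dst in edges}
    roots = {n for n in nodes.values() if n not in targets}
    return set(nodes.values()), edges, roots


NODES, EDGES, ROOTS = subsumption_graph(KILLS)
print(sorted(ROOTS))
```

Picking one mutant from each root node gives a minimal mutant set for the test set behind the kill matrix.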
Questions?
Contact:
• {pammann, offutt}@gmu.edu
• delamaro@icmc.usp.br