Learn about regression testing, test case prioritization, coverage-based strategies, and mutation-based prioritization for efficient bug detection during software evolution. Examples and strategies explained in detail.
CS4723 Software Validation and Quality Assurance
Lecture 7: Regression Testing

Regression Testing
- So far: unit testing, system testing, test coverage
- All of these are about the first round of testing
- Testing is performed from time to time throughout the software life cycle
- Test cases and oracles can be reused in all rounds
- Testing during the evolution phase is regression testing

Regression Testing
- When we try to enhance the software, we may also bring in bugs
- When software that worked yesterday does not work today, it is called a “regression”
- Numbers: empirical study on Eclipse (2005)
  - 11% of commits are bug-inducing
  - 24% of fixing commits are bug-inducing

Regression Example
Version 1 (bug: origin[0] is never copied):
    public int[] reverse(int[] origin){
        int[] target = new int[origin.length];
        int index = 0;
        while(index < origin.length - 1){
            index++;
            // mirror position of index in the reversed array
            target[origin.length - 1 - index] = origin[index];
        }
        return target;
    }
Version 2 (fix for the missing origin[0]):
    public int[] reverse(int[] origin){
        int[] target = new int[origin.length];
        int index = 0;
        while(index < origin.length - 1){
            index++;
            target[origin.length - 1 - index] = origin[index];
        }
        target[origin.length - 1] = origin[0];
        return target;
    }
- Regression: version 2 now crashes when the length of origin is 0

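One way to keep the fix and remove the regression is sketched below. The empty-array guard is an assumption and does not appear on the slide, and the class name `ReverseRegression` is made up for illustration. The copy index is written as `origin.length - 1 - index` so that the result is an exact reversal. The two calls in `main` play the role of an old test case that must still pass and a new regression test case for the empty array.

```java
import java.util.Arrays;

class ReverseRegression {
    // Fixed version of reverse plus a guard for the empty array (the guard is an assumption).
    static int[] reverse(int[] origin) {
        int[] target = new int[origin.length];
        if (origin.length == 0) {
            return target; // nothing to copy; avoids reading origin[0] on an empty array
        }
        int index = 0;
        while (index < origin.length - 1) {
            index++;
            target[origin.length - 1 - index] = origin[index];
        }
        target[origin.length - 1] = origin[0];
        return target;
    }

    public static void main(String[] args) {
        // Old test case: must still pass after the fix.
        System.out.println(Arrays.toString(reverse(new int[] {1, 2, 3}))); // prints [3, 2, 1]
        // Regression test case: the empty array must not crash.
        System.out.println(Arrays.toString(reverse(new int[] {}))); // prints []
    }
}
```
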
Regression Testing
- Run old test cases on the new version of the software
- Running the whole suite each time costs a lot
- Try to save time and cost in new rounds of testing
  - Test prioritization
  - Test relevant code
  - Record and replay

Test prioritization
- Rank all the test cases
- Run the test cases according to the ranked sequence
- Stop when resources are used up
- How to rank test cases
  - To discover bugs sooner
  - Or, as an approximation, to achieve higher coverage sooner

Coverage-based test case prioritization
- Code-coverage based: requires code-coverage information recorded in previous testing
- Combination-coverage based: requires an input model
- Mutation-coverage based: requires recorded mutation-killing statistics

Total Strategy
- The simplest strategy
- Always select the unselected test case that has the highest coverage

Example (total strategy)
- Consider code coverage on five test cases:
    T1: s1, s3, s5
    T2: s2, s3, s4, s5
    T3: s3, s4, s5
    T4: s6, s7
    T5: s3, s5, s8, s9, s10
- Ranking: T5, T2, T1 / T3, T4 (T1 and T3 tie)

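The total strategy on this example can be sketched as a plain sort by coverage size. The class and method names are made up for illustration. Breaking the T1 / T3 tie by input order is an assumption, since the slide leaves those two unordered.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

class TotalStrategy {
    // Coverage data copied from the example (test case -> covered statements).
    static Map<String, Set<String>> slideExample() {
        Map<String, Set<String>> coverage = new LinkedHashMap<>();
        coverage.put("T1", Set.of("s1", "s3", "s5"));
        coverage.put("T2", Set.of("s2", "s3", "s4", "s5"));
        coverage.put("T3", Set.of("s3", "s4", "s5"));
        coverage.put("T4", Set.of("s6", "s7"));
        coverage.put("T5", Set.of("s3", "s5", "s8", "s9", "s10"));
        return coverage;
    }

    // Sort by total number of covered statements, largest first.
    // List.sort is stable, so tied test cases keep their input order (assumed tie-break).
    static List<String> rank(Map<String, Set<String>> coverage) {
        List<String> ranked = new ArrayList<>(coverage.keySet());
        ranked.sort((a, b) -> Integer.compare(coverage.get(b).size(), coverage.get(a).size()));
        return ranked;
    }

    public static void main(String[] args) {
        System.out.println(rank(slideExample())); // prints [T5, T2, T1, T3, T4]
    }
}
```
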
Additional Strategy
- An adaptation of the total strategy
- Instead of always choosing the test case with the highest coverage, choose the test case that results in the most extra coverage
- Start from the test case with the highest coverage

Example (additional strategy)
- Consider code coverage on five test cases:
    T1: s1, s3, s5
    T2: s2, s3, s4, s5
    T3: s3, s4, s5
    T4: s6, s7
    T5: s3, s5, s8, s9, s10
- Ranking: T5 (5), T2 (2: s2, s4) / T4 (2: s6, s7), T1 (1: s1), T3 (0)

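The additional strategy is a greedy loop: after each pick, the statements already covered stop counting. A minimal sketch on the same data follows. The names are illustrative, and picking the first test case in input order on a tie (T2 before T4) is an assumption.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

class AdditionalStrategy {
    // Coverage data copied from the example (test case -> covered statements).
    static Map<String, Set<String>> slideExample() {
        Map<String, Set<String>> coverage = new LinkedHashMap<>();
        coverage.put("T1", Set.of("s1", "s3", "s5"));
        coverage.put("T2", Set.of("s2", "s3", "s4", "s5"));
        coverage.put("T3", Set.of("s3", "s4", "s5"));
        coverage.put("T4", Set.of("s6", "s7"));
        coverage.put("T5", Set.of("s3", "s5", "s8", "s9", "s10"));
        return coverage;
    }

    // Repeatedly pick the test case that adds the most statements not yet covered.
    static List<String> rank(Map<String, Set<String>> coverage) {
        List<String> remaining = new ArrayList<>(coverage.keySet());
        Set<String> covered = new HashSet<>();
        List<String> ranked = new ArrayList<>();
        while (!remaining.isEmpty()) {
            String best = null;
            int bestGain = -1;
            for (String test : remaining) {
                Set<String> gain = new HashSet<>(coverage.get(test));
                gain.removeAll(covered);
                if (gain.size() > bestGain) { // strict ">" keeps the earlier test on a tie
                    best = test;
                    bestGain = gain.size();
                }
            }
            ranked.add(best);
            remaining.remove(best);
            covered.addAll(coverage.get(best));
        }
        return ranked;
    }

    public static void main(String[] args) {
        System.out.println(rank(slideExample())); // prints [T5, T2, T4, T1, T3]
    }
}
```
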
Combination-coverage based prioritization
- Use combination coverage instead of code coverage
- The total strategy does not work for combination coverage. Why?
- Use the additional strategy (for n-wise combinations)
- Example
  - Input model: (coke, sprite), (icy, normal), (receipt, not)
  - Test cases: {coke, icy, not}, {coke, normal, not}, {sprite, icy, receipt}, {sprite, normal, receipt}
  - Ranking for 2-wise prioritization: {coke, icy, not}, {sprite, icy, receipt} (+3), {coke, normal, not} (+2), {sprite, normal, receipt} (+2)

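A sketch of the additional strategy over 2-wise combinations is given below. Every test case over this input model covers exactly three value pairs, so a total ranking would have nothing to distinguish them, and only the extra pairs matter. The names are illustrative, the pair encoding assumes that no value name is shared between parameters, and ties go to the earlier test case (an assumption).

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class PairwisePrioritization {
    // Test cases copied from the example, in the order listed there.
    static List<List<String>> slideTests() {
        return List.of(
            List.of("coke", "icy", "not"),
            List.of("coke", "normal", "not"),
            List.of("sprite", "icy", "receipt"),
            List.of("sprite", "normal", "receipt"));
    }

    // All 2-wise combinations (value pairs) exercised by one test case.
    static Set<String> pairs(List<String> test) {
        Set<String> result = new HashSet<>();
        for (int i = 0; i < test.size(); i++) {
            for (int j = i + 1; j < test.size(); j++) {
                result.add(test.get(i) + "&" + test.get(j));
            }
        }
        return result;
    }

    // Additional strategy: pick the test case that adds the most uncovered pairs.
    static List<List<String>> rank(List<List<String>> tests) {
        List<List<String>> remaining = new ArrayList<>(tests);
        Set<String> covered = new HashSet<>();
        List<List<String>> ranked = new ArrayList<>();
        while (!remaining.isEmpty()) {
            List<String> best = null;
            int bestGain = -1;
            for (List<String> test : remaining) {
                Set<String> gain = pairs(test);
                gain.removeAll(covered);
                if (gain.size() > bestGain) {
                    best = test;
                    bestGain = gain.size();
                }
            }
            ranked.add(best);
            remaining.remove(best);
            covered.addAll(pairs(best));
        }
        return ranked;
    }

    public static void main(String[] args) {
        // Order: {coke, icy, not}, {sprite, icy, receipt}, {coke, normal, not}, {sprite, normal, receipt}
        rank(slideTests()).forEach(System.out::println);
    }
}
```
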
Combination-coverage based prioritization: multi-wise coverage
- Problem
  - It may not be reasonable to consider combinations at only one fixed N
  - After {coke, icy, not}, (sprite, normal, receipt) should rank above (sprite, icy, receipt), because it also covers the new value normal
- Multi-wise prioritization
  - Select the test case with the best additional 1-wise coverage
  - If there is a tie, go to 2-wise, and then 3-wise, …
- Results: {coke, icy, not}, {sprite, normal, receipt} (1-wise +3), {coke, normal, not} (2-wise +2, 3-wise +1), {sprite, icy, receipt} (2-wise +2, 3-wise +1)

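Multi-wise prioritization can be sketched by comparing gain vectors (1-wise gain, 2-wise gain, 3-wise gain, …) lexicographically. The code below is an illustration under the same assumptions as before: made-up names, value names unique across parameters, and remaining ties resolved by input order. It uses `Arrays.compare`, which requires Java 9 or later.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class MultiWisePrioritization {
    static List<List<String>> slideTests() {
        return List.of(
            List.of("coke", "icy", "not"),
            List.of("coke", "normal", "not"),
            List.of("sprite", "icy", "receipt"),
            List.of("sprite", "normal", "receipt"));
    }

    // All n-wise combinations of one test case, one encoded string per subset of n values.
    static Set<String> combos(List<String> test, int n) {
        Set<String> result = new HashSet<>();
        for (int mask = 0; mask < (1 << test.size()); mask++) {
            if (Integer.bitCount(mask) != n) continue;
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < test.size(); i++) {
                if ((mask & (1 << i)) != 0) sb.append(test.get(i)).append('&');
            }
            result.add(sb.toString());
        }
        return result;
    }

    static List<List<String>> rank(List<List<String>> tests) {
        int k = tests.get(0).size();
        List<Set<String>> covered = new ArrayList<>(); // covered.get(n - 1) holds n-wise combos
        for (int n = 1; n <= k; n++) covered.add(new HashSet<>());
        List<List<String>> remaining = new ArrayList<>(tests);
        List<List<String>> ranked = new ArrayList<>();
        while (!remaining.isEmpty()) {
            List<String> best = null;
            int[] bestGain = null;
            for (List<String> test : remaining) {
                int[] gain = new int[k]; // gain[n - 1] = newly covered n-wise combos
                for (int n = 1; n <= k; n++) {
                    Set<String> fresh = combos(test, n);
                    fresh.removeAll(covered.get(n - 1));
                    gain[n - 1] = fresh.size();
                }
                // Lexicographic comparison: 1-wise first, then 2-wise, then 3-wise, ...
                if (bestGain == null || Arrays.compare(gain, bestGain) > 0) {
                    best = test;
                    bestGain = gain;
                }
            }
            ranked.add(best);
            remaining.remove(best);
            for (int n = 1; n <= k; n++) covered.get(n - 1).addAll(combos(best, n));
        }
        return ranked;
    }

    public static void main(String[] args) {
        // Order: {coke, icy, not}, {sprite, normal, receipt}, {coke, normal, not}, {sprite, icy, receipt}
        rank(slideTests()).forEach(System.out::println);
    }
}
```
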
Mutation-coverage based prioritization
- Similar to code-coverage based prioritization
- Run mutation testing for the test suite
- Use the mutants killed by each test case as the criterion
- Works with both the total and the additional strategy

Setting the threshold
- Prioritization helps us find bugs earlier
- Due to resource limits, we do not want to execute all test cases
- Testing should stop at some point in the prioritized rank list
- Resource limit: money, time
- Coverage based
  - Cover all or a certain percentage of statements
  - Cover all or a certain percentage of n-wise combinations
  - Cover all or a certain percentage of mutants

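A coverage-based threshold can be sketched as cutting the ranked list at the first point where the target fraction of statements is covered. The sketch reuses the five-test example from the prioritization slides. Taking the union of everything the suite covers as the 100% baseline is an assumption, and all names are illustrative.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class CoverageThreshold {
    static Map<String, Set<String>> slideExample() {
        Map<String, Set<String>> coverage = new HashMap<>();
        coverage.put("T1", Set.of("s1", "s3", "s5"));
        coverage.put("T2", Set.of("s2", "s3", "s4", "s5"));
        coverage.put("T3", Set.of("s3", "s4", "s5"));
        coverage.put("T4", Set.of("s6", "s7"));
        coverage.put("T5", Set.of("s3", "s5", "s8", "s9", "s10"));
        return coverage;
    }

    // Shortest prefix of the ranked list whose combined coverage reaches the given
    // fraction of all statements covered by the whole suite (assumed baseline).
    static List<String> cutoff(List<String> ranked, Map<String, Set<String>> coverage, double fraction) {
        Set<String> all = new HashSet<>();
        for (Set<String> statements : coverage.values()) all.addAll(statements);
        Set<String> covered = new HashSet<>();
        List<String> selected = new ArrayList<>();
        for (String test : ranked) {
            if (covered.size() >= fraction * all.size()) break; // threshold reached
            selected.add(test);
            covered.addAll(coverage.get(test));
        }
        return selected;
    }

    public static void main(String[] args) {
        List<String> ranked = List.of("T5", "T2", "T4", "T1", "T3"); // additional-strategy order
        System.out.println(cutoff(ranked, slideExample(), 0.8)); // prints [T5, T2, T4]
        System.out.println(cutoff(ranked, slideExample(), 1.0)); // prints [T5, T2, T4, T1]
    }
}
```
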
Test Relevant Code
- Basic idea: only use test cases that cover the changed code
- Can be combined with test prioritization
  - Give higher priority to the test cases that cover more of the code affected by the change
  - Determine the affected code with program slicing

Which test case is better?
- Consider the following change and test cases
    void main() {
        int sum, i;
        sum = 0; -> sum = 1;
        i = read;
        if (i >= 12) {
            String rep = report(invalid, i);
            sendReport(rep);
        } else {
            while (i < 11) {
                sum = add(sum, i);
                i = add(i, 1);
            }
        }
    }
- Test case: 0
- Test case: 13
- Test case 0 is better because it covers more code in the forward slice

Program slicing
- Observation: the more of the code affected by a change a test case covers, the more likely its results are to change
- Only test the parts that are related to the revision
- Program slicing: locating all parts of the code base that are affected by the value of a variable
- Program slicing tools: Jslice, CodeSurfer

Program slicing
- Forward slice of variable v at statement s: all the code that is control or data dependent on v at statement s
- Backward slice of variable v at statement s: all the code that v at statement s depends on (through either control or data dependencies)

Data Dependencies
- A data dependency is the dependency of a use of a variable on the definition of that variable
- Example:
    s1: x = 3;
    s2: if(y > 5){
    s3:     y = y + x; //data dependent on x in s1
    s4: }

Control Dependencies
- A control dependency is the dependency of the statements in the basic blocks of a branch on the branch predicate
- Example:
    s1: x = 3;
    s2: if(y > 5){
    s3:     y = y + x; //control dependent on the predicate y > 5 in s2
    s4: }

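Given data and control dependencies, a forward slice can be approximated as reachability over dependence edges. The sketch below is simplified to statement level, so it ignores which variable is tracked. It uses the two edges from the examples above, s1 to s3 (data) and s2 to s3 (control). The map layout and names are assumptions made for illustration, not the representation used by any slicing tool.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class ForwardSlice {
    // dependents.get(s) lists the statements that are data or control dependent on s.
    static Set<String> forwardSlice(Map<String, List<String>> dependents, String start) {
        Set<String> slice = new LinkedHashSet<>();
        Deque<String> work = new ArrayDeque<>();
        work.add(start);
        while (!work.isEmpty()) {
            String statement = work.poll();
            if (!slice.add(statement)) continue; // already visited
            for (String dependent : dependents.getOrDefault(statement, List.of())) {
                work.add(dependent);
            }
        }
        return slice;
    }

    public static void main(String[] args) {
        Map<String, List<String>> dependents = Map.of(
            "s1", List.of("s3"),  // s3 uses x defined in s1 (data dependency)
            "s2", List.of("s3")); // s3 runs only if the predicate in s2 holds (control dependency)
        System.out.println(forwardSlice(dependents, "s1")); // prints [s1, s3]
    }
}
```

A backward slice would follow the same edges in the opposite direction.
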
Example: call-site -> actual arguments
    void main() {
        int sum, i;
        sum = 0;
        i = 1;
        while (i < 11) {
            sum = add(sum, i);
            i = add(i, 1);
        }
    }

Example: program slicing
    static int add(int a, int b){
        return a + b;
    }

Example: Inter-Procedure
    static int add(int a, int b){
        return a + b;
    }
    sum = add(sum, i);
    i = add(i, 1);

Program Slicing based Test Selection
- Retrieve the forward slice of the changed code
- Select test cases that cover more statements in the forward slice
- Test case: 0
- Test case: 13
    void main() {
        int sum, i;
        sum = 0; -> sum = 1;
        i = read;
        if (i >= 12) {
            String rep = report(invalid, i);
            sendReport(rep);
        }
        while (i < 11) {
            sum = add(sum, i);
            i = add(i, 1);
        }
    }
- Test case 0 is better because it covers more code in the forward slice

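The comparison of the two test cases can be sketched by counting how many executed statements fall inside the forward slice. The slice and the per-input coverage sets below were derived by hand from the listing on this slide and are restricted to `main()`. They are assumptions for illustration, not the output of a slicing or coverage tool.

```java
import java.util.HashSet;
import java.util.Set;

class SliceBasedSelection {
    // Hand-derived forward slice of sum at the changed statement, limited to main().
    static Set<String> slice() {
        return Set.of("sum = 1", "sum = add(sum, i)");
    }

    // Hand-traced coverage for input 0: the if branch is skipped, the loop body runs.
    static Set<String> input0() {
        return Set.of("sum = 1", "i = read", "if (i >= 12)",
                      "while (i < 11)", "sum = add(sum, i)", "i = add(i, 1)");
    }

    // Hand-traced coverage for input 13: the report branch runs, the loop body never does.
    static Set<String> input13() {
        return Set.of("sum = 1", "i = read", "if (i >= 12)",
                      "rep = report(invalid, i)", "sendReport(rep)", "while (i < 11)");
    }

    // Number of executed statements that lie inside the forward slice.
    static int sliceCoverage(Set<String> executed, Set<String> slice) {
        Set<String> hit = new HashSet<>(executed);
        hit.retainAll(slice);
        return hit.size();
    }

    public static void main(String[] args) {
        System.out.println(sliceCoverage(input0(), slice()));  // prints 2
        System.out.println(sliceCoverage(input13(), slice())); // prints 1
    }
}
```
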
Record and Replay
- A waste of resources in regression testing
  - We change the code only a little
  - Yet every test execution still runs all the unchanged code
- Record and replay
  - For all or some of the unchanged modules, do not run the modules
  - Use the results of previous tests instead

Record and Replay Example
- Testing an expert system for finance
  - It has two components: a UI and an interest calculator (based on the inputs from the UI)
- In the first round of testing, store the results of the interest calculator as a map: (a, b) -> 5%, (a, c) -> 10%, (d, e) -> 7.7%
- In regression testing, if the change is made to the UI, you can rerun the software with the data map
- Recording more objects saves more time in regression testing. Should we record every object?

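The data-map idea can be sketched as a pair of wrappers around the interest calculator: one records results during the first round, the other replays them without running the real module. The interface, class names, and the stand-in calculator are all hypothetical. Only the (a, b) -> 5% and (a, c) -> 10% entries come from the slide.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class RecordReplay {
    interface InterestCalculator {
        double rate(String x, String y);
    }

    // First round: run the real module and record every result in a data map.
    static class Recorder implements InterestCalculator {
        final InterestCalculator real;
        final Map<List<String>, Double> dataMap = new HashMap<>();

        Recorder(InterestCalculator real) {
            this.real = real;
        }

        public double rate(String x, String y) {
            double result = real.rate(x, y);
            dataMap.put(List.of(x, y), result);
            return result;
        }
    }

    // Regression round: answer from the data map; the real module is never called.
    static class Replayer implements InterestCalculator {
        final Map<List<String>, Double> dataMap;

        Replayer(Map<List<String>, Double> dataMap) {
            this.dataMap = dataMap;
        }

        public double rate(String x, String y) {
            Double result = dataMap.get(List.of(x, y));
            if (result == null) {
                throw new IllegalStateException("no recording for (" + x + ", " + y + ")");
            }
            return result;
        }
    }

    public static void main(String[] args) {
        // Hypothetical stand-in for the real, time-consuming interest calculator.
        InterestCalculator real = (x, y) -> y.equals("b") ? 0.05 : 0.10;
        Recorder recorder = new Recorder(real);
        recorder.rate("a", "b"); // first round of testing
        recorder.rate("a", "c");
        InterestCalculator replay = new Replayer(recorder.dataMap);
        System.out.println(replay.rate("a", "b")); // prints 0.05
    }
}
```

Throwing on an unrecorded input is a design choice in this sketch: a silent fallback to the real module would hide the fact that the recording no longer matches what the changed code asks for.
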
Pros & Cons
- Pros
  - Saves time in regression testing
- Cons
  - Be careful when recording non-deterministic components, e.g., a recorded getSystemTime() may conflict with another call
  - Recording data maps takes a lot of time
  - The stored data map can become very large
  - When the recorded object changes, the data map must be updated

Selection of recorded modules
- Rules
  - Record time-consuming modules, so that you save more time
  - The recorded module should be stable, e.g., libraries
  - The interface should carry a small data flow, e.g., numeric inputs and return values

Selection of recorded modules
- Recording UI components
- Recording Internet components
- Recording components that affect the real world
  - Sending an email
  - Transferring money from credit cards

Review of Regression Testing
- Test prioritization: try only the most important test cases
- Test relevant code: try the most relevant test cases
- Record and replay: reuse the execution results of previous test cases