Cooperative Developer Testing: How Human and Machine Cooperate to Get Job Done. Tao Xie, North Carolina State University. In collaboration with Xusheng Xiao @ NCSU ASE and Nikolai Tillmann, Peli de Halleux @ Microsoft Research and students
Why Automate Testing? • Software testing is important • Software errors cost the U.S. economy about $59.5 billion each year (0.6% of the GDP) [NIST 02] • Improving testing infrastructure could save about one third of that cost [NIST 02] • Software testing is costly • Accounts for as much as half the total cost of software development [Beizer 90] • Automated testing reduces manual testing effort • Test execution: JUnit, NUnit, xUnit, etc. • Test generation: Pex, AgitarOne, Parasoft Jtest, etc. • Test-behavior checking: Pex, AgitarOne, Parasoft Jtest, etc.
Software Testing Problems [Figure: test inputs + program → outputs; test oracles check whether outputs = expected outputs] • Test Generation (machine) • Generating high-quality test inputs (e.g., achieving high code coverage) • Test Oracles (human) • Specifying high-quality test oracles (e.g., guarding against various faults)
Test Generation • Human • Expensive, incomplete, … • Brute Force • Pairwise, predefined data, etc. • Random • Cheap, fast • “It passed a thousand tests” feeling • Dynamic Symbolic Execution: Pex, CUTE, EXE • Automated white-box • Not random: constraint solving
Dynamic Symbolic Execution
• Loop: Choose next path → Solve → Execute & Monitor
• Code to generate inputs for:
    void CoverMe(int[] a)
    {
        if (a == null) return;
        if (a.Length > 0)
            if (a[0] == 1234567890)
                throw new Exception("bug");
    }
• Run 1: constraints to solve: (none); data: null; observed constraints: a==null
• Run 2: constraints to solve: a!=null; data: {}; observed constraints: a!=null && !(a.Length>0)
• Run 3: constraints to solve: a!=null && a.Length>0; data: {0}; observed constraints: a!=null && a.Length>0 && a[0]!=1234567890
• Run 4: constraints to solve: a!=null && a.Length>0 && a[0]==1234567890; data: {123…}; observed constraints: a!=null && a.Length>0 && a[0]==1234567890
• Each new constraint set is obtained by negating the last observed condition of an earlier run
• Done: There is no path left.
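The choose/solve/execute loop on this slide can be sketched in a few lines of Python. This is only a toy under stated assumptions: `cover_me` is a hand-instrumented port of CoverMe, the condition names (`is_null`, `nonempty`, `first_is_magic`) are made up for illustration, and `solve` is a hand-written stand-in for a real constraint solver, which is what Pex actually relies on.

```python
from typing import List, Optional, Tuple

MAGIC = 1234567890
Trace = List[Tuple[str, bool]]  # (hypothetical condition name, observed outcome)


def cover_me(a: Optional[List[int]], trace: Trace) -> None:
    """Port of CoverMe that records the outcome of every branch it evaluates."""
    is_null = a is None
    trace.append(("is_null", is_null))
    if is_null:
        return
    nonempty = len(a) > 0
    trace.append(("nonempty", nonempty))
    if nonempty:
        hit = a[0] == MAGIC
        trace.append(("first_is_magic", hit))
        if hit:
            raise RuntimeError("bug")


def solve(constraints: Trace) -> Optional[List[int]]:
    """Stand-in solver: returns an input satisfying the given branch outcomes."""
    want = dict(constraints)
    if want.get("is_null", True):      # unconstrained start: try null first
        return None
    if not want.get("nonempty", False):
        return []
    return [MAGIC] if want.get("first_is_magic", False) else [0]


def explore():
    """Explore all paths by negating one observed condition at a time."""
    results = []
    worklist: List[Trace] = [[]]       # constraint sets still to solve
    while worklist:
        prefix = worklist.pop(0)       # choose next path
        data = solve(prefix)           # solve
        trace: Trace = []
        try:                           # execute and monitor
            cover_me(data, trace)
            outcome = "ok"
        except RuntimeError:
            outcome = "bug"
        results.append((data, outcome))
        # Negate each condition observed beyond the solved prefix.
        for i in range(len(prefix), len(trace)):
            name, value = trace[i]
            worklist.append(trace[:i] + [(name, not value)])
    return results
```

Calling `explore()` visits the same four inputs as the slide (null, an empty array, an array starting with 0, and an array starting with the magic value) and stops when the worklist is empty, which corresponds to "there is no path left".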
Pex: Visual Studio Power Tool http://research.microsoft.com/projects/pex/ • Download counts (20 months, Feb. 2008 - Oct. 2009) • Academic: 17,366 • Devlabs: 13,022 • Total: 30,388
Challenges of DSE • Loops/path explosion • Fitnex [Xie et al. DSN 09] • Method sequences • MSeqGen [Thummalapenta et al. ESEC/FSE 09] • External methods or environments e.g., file systems, network, db, … • Parameterized Mock Objects [Taneja et al. ASE 10-sp] Opportunities • Regression testing [Taneja et al. ICSE 09-nier] • Manually written unit tests [Thummalapenta et al. FASE 11] • Developer guidance (cooperative developer testing) [Xiao et al. ICSE 11]
Open Source Pex Extensions http://pexase.codeplex.com/ Publications: http://research.microsoft.com/en-us/projects/pex/community.aspx#publications
DSE Challenges - Preliminary Study • Two problem types: external-method call problems (EMCP) and object-creation problems (OCP) • Reported EMCPs: 44, Reported OCPs: 18 vs. Real EMCPs: 0, Real OCPs: 5
DSE Challenges - Preliminary Study • object-creation problems (OCP) - 64.79% • external-method call problems (EMCP) - 26.76% • boundary problems – 5.63% • limitations of the used constraint solver – 2.82% Preliminary results show that the total block coverage achieved is 49.87%, with the lowest coverage being 15.54%.
External-Method Call Problems (EMCP) Example • Example 1: • File.Exists has data dependencies on program input • The subsequent branch at Line 1 uses the return value of File.Exists • Example 2: • Path.GetFullPath has data dependencies on program input • Path.GetFullPath throws exceptions • Example 3: String.Format does not cause any problem
Object-Creation Problems (OCP) Example • To cover the true branch at Line 5, tools need to generate sequences of method calls: • Stack s1 = new Stack(); • s1.Push(new object()); • …… • s1.Push(new object()); • FixedSizeStack s2 = new FixedSizeStack(s1); • Most tools cannot generate such sequences • The true branch at Line 5 has data dependencies on stack.items (List<object>); stack.Count() returns the size of stack.items
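The method sequence above can be packaged as a factory method, which is the kind of guidance a developer can hand to the tool. The sketch below is a Python approximation, not the code from the slide: the class bodies are assumptions, and the capacity of 10 is taken from the path condition `s0 == 10` shown in the backup slides.

```python
class Stack:
    """Simplified stand-in for the Stack class; items is a private list."""

    def __init__(self):
        self._items = []

    def push(self, item):
        self._items.append(item)

    def count(self):
        return len(self._items)


class FixedSizeStack:
    """Wraps a Stack and refuses to grow beyond CAPACITY (assumed to be 10)."""

    CAPACITY = 10

    def __init__(self, stack):
        self._stack = stack

    def push(self, item):
        # The hard-to-cover "true branch": only reachable with a full stack.
        if self._stack.count() == self.CAPACITY:
            raise OverflowError("stack is full")
        self._stack.push(item)


def make_fixed_size_stack(size):
    """Factory method: builds the required object state via public methods."""
    s1 = Stack()
    for _ in range(size):
        s1.push(object())
    return FixedSizeStack(s1)
```

With such a factory, the tool only has to choose the integer `size` (10 reaches the guarded branch) instead of discovering the whole call sequence on its own.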
Cooperative Developer Testing • Developers provide guidance to help tools achieve higher structural coverage • Apply tools to generate tests • Tools report achieved coverage & problems • Developers provide guidance • EMCP: Instrumentation or Mock Objects • OCP: Factory Methods
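For the EMCP side of this guidance, a mock object replaces the external call so that its return value becomes something the tool can choose. A minimal Python sketch, assuming a hypothetical function `load_settings` whose only external dependency is a file-existence check (the names are illustrative and not from the talk):

```python
import os
from typing import Callable


def load_settings(path: str, exists: Callable[[str], bool] = os.path.exists) -> str:
    """Code under test: the branch depends on an external file-system call."""
    if exists(path):
        return "loaded"
    return "default"


def mock_exists(result: bool) -> Callable[[str], bool]:
    """Parameterized mock: the return value is a test input, not file-system state."""
    return lambda _path: result
```

A test generator can now cover both branches by choosing `result`, for example `load_settings("any.cfg", mock_exists(True))` and `load_settings("any.cfg", mock_exists(False))`, without touching the real file system.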
Existing Solution of Problem Identification • Existing solution (e.g., in Pex) • identify all external-method calls in the program • report all the non-primitive object types of program inputs and their fields • Limitations • the number of reported problems could be high • some identified problems are irrelevant: they are not the reason the tools fail to achieve high structural coverage
DSE Challenges - Preliminary Study • Reported EMCPs: 44, Reported OCPs: 18 vs. Real EMCPs: 0, Real OCPs: 5
Proposed Approach: Covana • Precisely identify problems faced by tools when achieving structural coverage • Insight • Not-covered branches have data dependency on real problem candidates • Three main steps: • Problem Candidate Identification • Forward Symbolic Execution • Data Dependence Analysis [Xiao et al. ICSE 2011]
Overview of Covana [Figure: Covana workflow. Inputs: Program / PUT, Generated Test Inputs. Stages: Problem Candidate Identification, Forward Symbolic Execution, Data Dependence Analysis. Intermediate data: Runtime Events, Problem Candidates, Coverage, Runtime Information. Output: Identified Problems.]
Problem Identification • EMCP Candidate Identification • External-method calls whose arguments have data dependencies on program inputs (e.g., NOT method calls that print constant strings or put a thread to sleep for some time) • OCP Candidate Identification • Only non-primitive argument types (e.g., NOT int, boolean, double)
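The two filters on this slide can be sketched as follows. This is an illustrative Python model, not Covana's implementation: the `ExternalCall` record, the `arg_sources` field, and the primitive-type list are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List, Set

PRIMITIVE_TYPES = {"int", "bool", "double"}  # assumed list, as on the slide


@dataclass
class ExternalCall:
    method: str
    # Names of program inputs that the call's arguments depend on (assumed
    # to be supplied by some upstream dependency tracking).
    arg_sources: Set[str] = field(default_factory=set)


def emcp_candidates(calls: List[ExternalCall], program_inputs: Set[str]) -> List[str]:
    """Keep only external calls whose arguments depend on a program input."""
    return [c.method for c in calls if c.arg_sources & program_inputs]


def ocp_candidates(argument_types: List[str]) -> List[str]:
    """Keep only non-primitive argument types."""
    return [t for t in argument_types if t not in PRIMITIVE_TYPES]
```

For instance, a call that prints a constant string has an empty `arg_sources` set and is dropped, while `File.Exists(path)` with `path` derived from a program input is kept.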
Example EMCP Candidate Identification Data Dependencies
Example OCP Candidate Identification OCP Candidates: • FixedSizeStack • FixedSizeStack.stack • Stack.items • object
Forward Symbolic Execution • Turn elements of problem candidates symbolic • EMCP: return values of external-method calls • OCP: non-primitive program inputs and their fields • Perform symbolic execution (e.g., DSE/Pex) • Collect runtime information • Symbolic expression in branches • Uncaught exceptions
Data Dependence Analysis • Symbolic Expression: return(File.Exists) == true • Element of EMCP Candidate: return(File.Exists) • Branch Statement Line 1 has data dependency on File.Exists at Line 1
EMCP Analysis • Data Dependence Analysis: • partially-covered branch statements have data dependencies on EMCP candidates for return values • Exception Analysis: • extract external-method calls from exception trace • the remaining parts of the program after the call site of the external-method call are not covered
Example EMCP Analysis • Branch Statement Line 1 has data dependency on File.Exists at Line 1 • False branch at Line 1 is not covered • File.Exists is reported • Code after Line 6 is not covered • Path.GetFullPath throws exceptions for all executions • Path.GetFullPath is reported
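The two EMCP checks (data dependence on a partially-covered branch, and exceptions thrown by an external call) can be sketched like this. The dictionary layout for branches and the string matching on `return(...)` are simplifications assumed for illustration; the actual analysis works on symbolic expressions collected at runtime.

```python
def analyze_emcps(branches, candidates, exception_calls):
    """Return the external methods reported as EMCPs, in discovery order."""
    reported = []
    # Data dependence analysis: only partially-covered branches are relevant.
    for branch in branches:
        if branch["covered"] == {True, False}:
            continue
        for method in candidates:
            symbol = f"return({method})"
            if symbol in branch["expression"] and method not in reported:
                reported.append(method)
    # Exception analysis: candidate calls found in uncaught exception traces.
    for method in exception_calls:
        if method in candidates and method not in reported:
            reported.append(method)
    return reported


# The example from the slides, encoded by hand.
branches = [
    {"line": 1, "expression": "return(File.Exists) == true", "covered": {True}},
]
candidates = ["File.Exists", "Path.GetFullPath", "String.Format"]
exception_calls = ["Path.GetFullPath"]
```

Here `analyze_emcps(branches, candidates, exception_calls)` reports File.Exists and Path.GetFullPath, while String.Format is pruned because no uncovered branch depends on it and it throws no exception.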
OCP Analysis • Data Dependence Analysis for partially-covered branch statements • data dependencies on a non-primitive program input: report the program input • data dependencies on fields of a program input: report the object type of the field directly??
Example OCP Analysis • stack.Count() returns the size of the field stack.items • The true branch at Line 5 is not covered • Naive report: List<object>, the object type of stack.items • False warning! An object of type List<object> cannot be used by the tools: it is not assignable to the field Stack.items by invoking a public constructor or a public setter method of its declaring class Stack
Field Declaration Hierarchy • FixedSizeStack • FixedSizeStack.stack • Stack.items • Reflection can build this hierarchy: first look at all fields of FixedSizeStack, then all fields of FixedSizeStack.stack, and finally Stack.items.
OCP Analysis Algorithm • Dependency only on a program input: report it directly • Dependency on a field: check whether the field is assignable for its declaring class • Not assignable: report its declaring class • Assignable: report the field itself
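A minimal sketch of this decision, assuming the field declaration hierarchy is already available as a list from outermost to innermost field and that assignability (a public constructor or public setter in the declaring class) has been precomputed. The `FieldInfo` record is hypothetical, and the sketch only makes the single decision listed on the slide for the innermost field.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FieldInfo:
    declaring_class: str
    name: str
    field_type: str
    assignable: bool  # settable via a public constructor or setter (precomputed)


def report_ocp(program_input_type: str, chain: List[FieldInfo]) -> str:
    """Pick the object type to report for an uncovered branch."""
    if not chain:
        # The branch depends only on the program input itself.
        return program_input_type
    innermost = chain[-1]
    if innermost.assignable:
        return innermost.field_type
    return innermost.declaring_class


fixed_size_stack_chain = [
    FieldInfo("FixedSizeStack", "stack", "Stack", assignable=True),
    FieldInfo("Stack", "items", "List<object>", assignable=False),
]
```

For the FixedSizeStack example this reports Stack rather than List<object>, which avoids the false warning described earlier.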
Implementation • An extension to Pex • identify problem candidates • turn elements of problem candidates symbolic • collect runtime information • Data dependence analyzer • analyze runtime information • identify problems • Graphical User Interface (GUI) component • show identified problems with detailed analysis information
Evaluation – Subjects and Setup • Subjects: • xUnit: unit testing framework for .NET • 223 classes and interfaces with 11.4 KLOC • QuickGraph: C# graph library • 165 classes and interfaces with 8.3 KLOC • Evaluation setup: • Pex with the implemented extension as our DSE test-generation tool • Apply Pex to generate tests for program under test • Collect coverage and runtime information for identifying EMCPs and OCPs
Evaluation – Research Questions • RQ1: How effective is Covana in identifying the two main types of problems, EMCPs and OCPs? • RQ2: How effective is Covana in pruning irrelevant problem candidates of EMCPs and OCPs?
Evaluations - RQ1: Problem Identification • Covana identifies • 43 EMCPs with only 1 false positive and 2 false negatives • 155 OCPs with 20 false positives and 30 false negatives.
Example Identified OCP • ClassStart: Pex achieved block coverage of 2/27 (7.14%) • Covering more requires the field typeUnderTest of TestClassCommand to be non-null and to implement at least one interface • typeUnderTest is assignable for TestClassCommand • Report ITypeInfo of typeUnderTest as OCP
Evaluations – RQ2: Irrelevant-Problem-Candidate Pruning • Covana prunes • 97.33% (1567 of 1610) of EMCP candidates with 1 false positive and 2 false negatives • 65.63% (296 of 451) of OCP candidates with 20 false positives and 30 false negatives
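The pruning rates follow directly from the candidate counts, and the candidates that remain after pruning match the numbers of identified problems reported for RQ1 (43 EMCPs and 155 OCPs). A small check, with a helper name chosen only for this example:

```python
def pruning_rate(pruned: int, total: int) -> float:
    """Percentage of problem candidates pruned, rounded to two decimals."""
    return round(100.0 * pruned / total, 2)


emcp_rate = pruning_rate(1567, 1610)   # EMCP candidates
ocp_rate = pruning_rate(296, 451)      # OCP candidates
emcp_remaining = 1610 - 1567           # candidates reported as EMCPs
ocp_remaining = 451 - 296              # candidates reported as OCPs
```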
Discussion • Assisting other structural test-generation approaches • automatic mock object generation: needs to deal only with external-method calls of EMCPs • random approach: assign higher probability to exploring object types of OCPs • advanced method-sequence-generation approaches (e.g., MSeqGen): need to deal only with object types of OCPs
Regression Test Generation • Given a method f(x) (old version) and g(x) (new version), synthesize a meta-program h(x) := Assert(f(x) == g(x)) • Expressed as a branch-coverage target: if (f(x) != g(x)) throw new Exception("changed behavior!"); • Complications: • What if x is a non-primitive type? deep clone, method-sequence generation, … • How to compare receiver objects? deep state comparison, … [Taneja and Xie. ASE 08 SP]
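The meta-program idea can be sketched in Python. Everything here is illustrative: `clamp_old` and `clamp_new` are made-up versions of a method, and the linear search over candidate inputs is a plain stand-in for the coverage-driven exploration a tool like Pex would perform.

```python
def clamp_old(x: int) -> int:
    """Old version f(x): clamp to the range [0, 100]."""
    return min(max(x, 0), 100)


def clamp_new(x: int) -> int:
    """New version g(x): the upper bound was changed to 99."""
    return min(max(x, 0), 99)


def meta_program(f, g, x):
    """h(x): covering the raising branch means a behavioral difference exists."""
    if f(x) != g(x):
        raise AssertionError(f"changed behavior for x={x}")


def find_behavior_change(f, g, candidate_inputs):
    """Return the first input that reaches the raising branch, or None."""
    for x in candidate_inputs:
        try:
            meta_program(f, g, x)
        except AssertionError:
            return x
    return None
```

Searching `range(-5, 106)` finds the input 100, the smallest value on which the two versions disagree. Comparing return values with `!=` only works for simple types, which is exactly the complication the slide raises for non-primitive inputs and receiver objects.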
Migrating Pex to the Web/Cloud Try it at http://www.pexforfun.com/ • Engineering Pex for serious games in computer science • Train problem solving/programming skills and abstraction skills Demo
Thank you! Questions ? https://sites.google.com/site/asergrp/
Observation of Path Condition • This path condition contains all the required fields, since all of them are assigned symbolic values • Path condition that leads to the true branch at Line 5: FixedSizeStack s3 = new ∧ Stack s2 = s3.stack ∧ List<object> s1 = s2.items ∧ int s0 = s1._size ∧ s0 == 10
Constructing Field Declaration Hierarchy • Extract fields from path conditions and construct a field declaration hierarchy • Path condition: FixedSizeStack s3 = new ∧ Stack s2 = s3.stack ∧ List<object> s1 = s2.items ∧ int s0 = s1._size ∧ s0 == 10 • Resulting hierarchy: FixedSizeStack • FixedSizeStack.stack • Stack.items
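Extracting the field chain from such a path condition can be sketched with simple text parsing. This assumes the path condition is available in exactly the textual form shown on the slide; the real tool works on Pex's internal symbolic expressions, so the regular expressions below are purely illustrative.

```python
import re

# "<type> <var> = <rhs>", e.g. "Stack s2 = s3.stack"
DECLARATION = re.compile(r"^(?P<type>\S+)\s+(?P<var>\w+)\s*=\s*(?P<rhs>.+)$")
# "<owner>.<field>", e.g. "s3.stack"
FIELD_ACCESS = re.compile(r"^(?P<owner>\w+)\.(?P<field>\w+)$")


def field_hierarchy(path_condition: str):
    """Return the accessed fields as 'DeclaringType.field', outermost first."""
    var_types = {}
    fields = []
    for conjunct in path_condition.split("∧"):
        decl = DECLARATION.match(conjunct.strip())
        if decl is None:
            continue  # plain constraints such as "s0 == 10"
        var_types[decl["var"]] = decl["type"]
        access = FIELD_ACCESS.match(decl["rhs"].strip())
        if access and access["owner"] in var_types:
            fields.append(f'{var_types[access["owner"]]}.{access["field"]}')
    return fields


PATH_CONDITION = (
    "FixedSizeStack s3 = new ∧ Stack s2 = s3.stack ∧ "
    "List<object> s1 = s2.items ∧ int s0 = s1._size ∧ s0 == 10"
)
```

On the slide's path condition this yields FixedSizeStack.stack, Stack.items, and List<object>._size; the last entry is the size field that the constraint `s0 == 10` ultimately depends on.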
Discussion cont. • Static fields • initialized inside the class • symbolic analysis is affected by side effects of previous tests • Concrete arguments for external-method calls • using a constant string to access the external environment • affecting achieved coverage
Discussion cont. • Other potential issues • argument side effects of external-method calls • control dependency • static analysis • Future work • carry out experiments to evaluate the effectiveness of incorporating these three additional features