Automated Developer Testing: Achievements and Challenges

Presentation Transcript


  1. Automated Developer Testing: Achievements and Challenges
     • Tao Xie, North Carolina State University
     • In collaboration with Nikolai Tillmann, Peli de Halleux, and Wolfram Schulte @ Microsoft Research, and students @ NCSU ASE

  2. Why Automate Testing?
     • Software testing is important
         • Software errors cost the U.S. economy about $59.5 billion each year (0.6% of GDP) [NIST 02]
         • Improving testing infrastructure could save 1/3 of that cost [NIST 02]
     • Software testing is costly
         • It can account for as much as half the total cost of software development [Beizer 90]
     • Automated testing reduces manual testing effort
         • Test execution: JUnit, NUnit, xUnit, etc.
         • Test generation: Pex, AgitarOne, Parasoft Jtest, etc.
         • Test-behavior checking: Pex, AgitarOne, Parasoft Jtest, etc.

  3. Automation in Developer Testing
     • Developer testing
         • http://www.developertesting.com/
         • Kent Beck’s 2004 talk on “Future of Developer Testing”: http://www.itconversations.com/shows/detail301.html
     • This talk focuses on tool automation in developer testing (e.g., unit testing)
         • Not system testing etc. conducted by testers

  4. Software Testing Setup
     [Diagram labels: Test inputs, Program, Outputs, Expected Outputs, Test Oracles; the outputs are checked (= ?) against the expected outputs]

  5. Software Testing Problems
     [Diagram labels: Test inputs, Program, Outputs, Expected Outputs, Test Oracles; the outputs are checked (= ?) against the expected outputs]
     • Test Generation
         • Generating high-quality test inputs (e.g., achieving high code coverage)

  6. Software Testing Problems
     [Diagram labels: Test inputs, Program, Outputs, Expected Outputs, Test Oracles; the outputs are checked (= ?) against the expected outputs]
     • Test Generation
         • Generating high-quality test inputs (e.g., achieving high code coverage)
     • Test Oracles
         • Specifying high-quality test oracles (e.g., guarding against various faults)

  7. The Recipe of Unit Testing
     • Three essential ingredients:
         • Data
         • Method Sequence
         • Assertions
         void Add() {
             int item = 3;                    // data
             var list = new List<int>();      // method sequence
             list.Add(item);
             var count = list.Count;
             Assert.AreEqual(1, count);       // assertion
         }

  8. The (problem with) Data
         list.Add(3);
     • Which value matters?
         • Bad choices cause incomplete test suites.
         • Hard-coded values get stale when product code changes.
         • Why pick a value if it doesn’t matter?

  9. Parameterized Unit Testing [Tillmann & Schulte ESEC/FSE 05]
     • Parameterized Unit Test = Unit Test with Parameters
     • Separation of concerns
         • Data is generated by a tool
         • Developer can focus on functional specification
         void Add(List<int> list, int item) {
             var count = list.Count;
             list.Add(item);
             Assert.AreEqual(count + 1, list.Count);
         }
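A minimal sketch of the same separation of concerns, written in Python rather than the slide's C#. The function name `add_put` and the hand-picked `generated_inputs` list are assumptions of this sketch; in the talk's setting a tool such as Pex would supply the inputs.

```python
def add_put(items, item):
    """Parameterized unit test: appending one item grows the list by exactly one."""
    count = len(items)
    items.append(item)
    assert len(items) == count + 1


# Hand-picked inputs standing in for tool-generated ones (an assumption of this sketch).
generated_inputs = [([], 3), ([1, 2], 0), ([7] * 10, -1)]

for items, item in generated_inputs:
    add_put(list(items), item)  # copy so every run starts from a fresh list
```

The test body states only the property; which lists and items to try is left entirely to whatever produces `generated_inputs`.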

  10. Parameterized Unit Tests are Algebraic Specifications
     • A Parameterized Unit Test can be read as a universally quantified, conditional axiom.
         void ReadWrite(string name, string data) {
             Assume.IsTrue(name != null && data != null);
             Write(name, data);
             var readData = Read(name);
             Assert.AreEqual(data, readData);
         }
     ∀ string name, string data:
         name ≠ null ∧ data ≠ null ⇒ equals(ReadResource(name, WriteResource(name, data)), data)

  11. Parameterized Unit Testing is going mainstream
     Parameterized Unit Tests (PUTs) are commonly supported by various test frameworks
     • .NET: supported by .NET test frameworks
         • http://www.mbunit.com/
         • http://www.nunit.org/
         • …
     • Java: supported by JUnit 4.X
         • http://www.junit.org/
     Generating test inputs for PUTs is supported by tools
     • .NET: supported by Microsoft Research Pex
         • http://research.microsoft.com/Pex/
     • Java: supported by Agitar AgitarOne
         • http://www.agitar.com/

  12. Test Generation
     • Human
         • Expensive, incomplete, …
     • Brute force
         • Pairwise, predefined data, etc.
     • Random
         • Cheap, fast
         • “It passed a thousand tests” feeling
     • Dynamic Symbolic Execution: Pex, CUTE, EXE
         • Automated white-box
         • Not random: constraint solving

  13. Dynamic Symbolic Execution
     Loop: Choose next path → Solve → Execute & Monitor
     Code to generate inputs for:
         void CoverMe(int[] a) {
             if (a == null) return;
             if (a.Length > 0)
                 if (a[0] == 1234567890)
                     throw new Exception("bug");
         }
     Constraints to solve                        | Data   | Observed constraints
     (none)                                      | null   | a==null
     a!=null                                     | {}     | a!=null && !(a.Length>0)
     a!=null && a.Length>0                       | {0}    | a!=null && a.Length>0 && a[0]!=1234567890
     a!=null && a.Length>0 && a[0]==1234567890   | {123…} | a!=null && a.Length>0 && a[0]==1234567890
     [Execution tree: branch nodes a==null, a.Length>0, a[0]==123…, each with T/F edges; the negated condition is marked at each step]
     Done: There is no path left.
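The explore loop on this slide can be sketched in a few lines of Python (used here instead of the slide's C#). This is a toy under stated assumptions: `solve` is a hand-written stand-in for a real constraint solver and only understands the three conditions of `cover_me`, and the condition names `is_null`, `nonempty`, and `magic` are invented for the sketch. Pex itself uses an actual solver.

```python
MAGIC = 1234567890


def cover_me(a, trace):
    """Instrumented CoverMe: records (condition, outcome) for every branch executed."""
    trace.append(("is_null", a is None))
    if a is None:
        return
    trace.append(("nonempty", len(a) > 0))
    if len(a) > 0:
        trace.append(("magic", a[0] == MAGIC))
        if a[0] == MAGIC:
            raise Exception("bug")


def solve(constraints):
    """Toy solver (assumption): build an input satisfying the requested branch outcomes."""
    want = dict(constraints)
    if want.get("is_null", True):
        return None
    if not want.get("nonempty", False):
        return []
    return [MAGIC] if want.get("magic", False) else [0]


def explore():
    """Execute, observe the path, negate a branch outcome, solve, repeat until no path is left."""
    worklist, seen, results = [[]], set(), []
    while worklist:
        prefix = worklist.pop()
        data = solve(prefix)
        trace = []
        try:
            cover_me(data, trace)
            outcome = "ok"
        except Exception as exc:
            outcome = str(exc)
        if tuple(trace) in seen:
            continue
        seen.add(tuple(trace))
        results.append((data, outcome))
        # Negate each branch outcome observed beyond the prefix that was requested.
        for i in range(len(prefix), len(trace)):
            name, taken = trace[i]
            worklist.append(trace[:i] + [(name, not taken)])
    return results


for data, outcome in explore():
    print(data, outcome)
```

Run as written, the loop visits the inputs `None`, `[]`, `[0]`, and `[1234567890]` in that order, mirroring the Data column of the slide, and only the last one raises.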

  14. Challenges of DSE
     • Loops
         • Fitnex [Xie et al. DSN 09]
     • Generic API functions, e.g., RegEx matching IsMatch(s1, regex1)
         • Reggae [Li et al. ASE 09-sp]
     • Method sequences
         • MSeqGen [Thummalapenta et al. ESEC/FSE 09]
     • Environments, e.g., file systems, network, db, …
         • Parameterized Mock Objects [Marri AST 09]
     Opportunities
     • Regression testing [Taneja et al. ICSE 09-nier]
     • Developer guidance (cooperative developer testing)

  15. NCSU Industry Tech Transfer • Loops • Fitnex [Xie et al. DSN 09] • Generic API functions e.g., RegEx matching IsMatch(s1,regex1) • Reggae [Li et al. ASE 09-sp] • Method sequences • MSeqGen [Thummalapenta et al. ESEC/FSE 09] • Environments e.g., file systems, network, db, … • Parameterized Mock Objects [Marri AST 09] Applications • Test network app at Army division@Fort Hood, Texas • Test DB app of hand-held medical assistant device at FDA

  16. Pex on MSDN DevLabs: Incubation Project for Visual Studio
     • Download counts (20 months, Feb. 2008 - Oct. 2009)
         • Academic: 17,366
         • DevLabs: 13,022
         • Total: 30,388

  17. NCSU Industry Tech Transfer • Loops • Fitnex [Xie et al. DSN 09] • Generic API functions e.g., RegEx matching IsMatch(s1,regex1) • Reggae [Li et al. ASE 09-sp] • Method sequences • MSeqGen [Thummalapenta et al. ESEC/FSE 09] • Environments e.g., file systems, network, db, … • Parameterized Mock Objects [Marri AST 09] Applications • Test network app at Army division@Fort Hood, Texas • Test DB app of hand-held medical assistant device at FDA

  18. Explosion of Search Space
     There are decision procedures for individual path conditions, but…
     • The number of potential paths grows exponentially with the number of branches
     • Reachable code is not known initially
     • Without guidance, the same loop might be unfolded forever
     Fitnex search strategy [Xie et al. DSN 09]

  19. DSE Example
     Test input: TestLoop(0, {0})
         public bool TestLoop(int x, int[] y) {
             if (x == 90) {
                 for (int i = 0; i < y.Length; i++)
                     if (y[i] == 15) x++;
                 if (x == 110) return true;
             }
             return false;
         }
     Path condition: !(x == 90)
     ↓
     New path condition: (x == 90)
     ↓
     New test input: TestLoop(90, {0})

  20. DSE Example
     Test input: TestLoop(90, {0})
         public bool TestLoop(int x, int[] y) {
             if (x == 90) {
                 for (int i = 0; i < y.Length; i++)
                     if (y[i] == 15) x++;
                 if (x == 110) return true;
             }
             return false;
         }
     Path condition: (x == 90) && !(y[0] == 15)
     ↓
     New path condition: (x == 90) && (y[0] == 15)
     ↓
     New test input: TestLoop(90, {15})

  21. Challenge in DSE
     Test input: TestLoop(90, {15})
         public bool TestLoop(int x, int[] y) {
             if (x == 90) {
                 for (int i = 0; i < y.Length; i++)
                     if (y[i] == 15) x++;
                 if (x == 110) return true;
             }
             return false;
         }
     Path condition: (x == 90) && (y[0] == 15) && !(x+1 == 110)
     ↓
     New path condition: (x == 90) && (y[0] == 15) && (x+1 == 110)
     ↓
     New test input: No solution!?

  22. A Closer Look
     Test input: TestLoop(90, {15})
         public bool TestLoop(int x, int[] y) {
             if (x == 90) {
                 for (int i = 0; i < y.Length; i++)
                     if (y[i] == 15) x++;
                 if (x == 110) return true;
             }
             return false;
         }
     Path condition: (x == 90) && (y[0] == 15) && (0 < y.Length) && !(1 < y.Length) && !(x+1 == 110)
     ↓
     New path condition: (x == 90) && (y[0] == 15) && (0 < y.Length) && (1 < y.Length)
     → Expand array size

  23. A Closer Look
     Test input: TestLoop(90, {15})
         public bool TestLoop(int x, int[] y) {
             if (x == 90) {
                 for (int i = 0; i < y.Length; i++)
                     if (y[i] == 15) x++;
                 if (x == 110) return true;
             }
             return false;
         }
     We can have infinitely many paths!
     Manual analysis → need at least 20 loop iterations to cover the target branch
     Exploring all paths up to 20 loop iterations is infeasible: 2^20 paths

  24. Fitnex: Fitness-Guided Exploration [Xie et al. DSN 2009]
     Test inputs: TestLoop(90, {15, 0}), TestLoop(90, {15, 15})
         public bool TestLoop(int x, int[] y) {
             if (x == 90) {
                 for (int i = 0; i < y.Length; i++)
                     if (y[i] == 15) x++;
                 if (x == 110) return true;
             }
             return false;
         }
     Key observations, with respect to the coverage target:
     • Not all paths are equally promising for branch-node flipping
     • Not all branch nodes are equally promising to flip
     Our solution:
     • Prefer to flip branch nodes on the most promising paths
     • Prefer to flip the most promising branch nodes on paths
     • Use a fitness function to measure how “promising” each is

  25. Fitness Function
     • The fitness function (FF) computes a fitness value: the distance between the current state and the goal state
     • The search tries to minimize the fitness value [Tracey et al. 98, Liu et al. 05, …]

  26. Fitness Function for (x == 110)
         public bool TestLoop(int x, int[] y) {
             if (x == 90) {
                 for (int i = 0; i < y.Length; i++)
                     if (y[i] == 15) x++;
                 if (x == 110) return true;
             }
             return false;
         }
     Fitness function: |110 - x|

  27. Compute Fitness Values for Paths
         public bool TestLoop(int x, int[] y) {
             if (x == 90) {
                 for (int i = 0; i < y.Length; i++)
                     if (y[i] == 15) x++;
                 if (x == 110) return true;
             }
             return false;
         }
     Fitness function: |110 - x|
     (x, y)                        | Fitness value
     (90, {0})                     | 20
     (90, {15})                    | 19
     (90, {15, 0})                 | 19
     (90, {15, 15})                | 18
     (90, {15, 15, 0})             | 18
     (90, {15, 15, 15})            | 17
     (90, {15, 15, 15, 0})         | 17
     (90, {15, 15, 15, 15})        | 16
     (90, {15, 15, 15, 15, 0})     | 16
     (90, {15, 15, 15, 15, 15})    | 15
     …
     Give preference to flipping on paths with better fitness values.
     We still need to address which branch node to flip on paths …
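The fitness values listed on this slide can be reproduced with a short Python sketch (Python is used here instead of the slide's C#). The helper names `final_x` and `fitness` are assumptions of the sketch, not Fitnex APIs; `final_x` mirrors the loop in TestLoop up to the `x == 110` check.

```python
def final_x(x, y):
    """Value of x when TestLoop reaches the `x == 110` check, or None if it is never reached."""
    if x != 90:
        return None
    for value in y:
        if value == 15:
            x += 1
    return x


def fitness(x, y):
    """Distance |110 - x| to the target branch; None when the check is not reached."""
    reached = final_x(x, y)
    return None if reached is None else abs(110 - reached)


for y in ([0], [15], [15, 0], [15, 15], [15, 15, 0], [15] * 20):
    print((90, y), fitness(90, y))
```

An array of twenty 15s drives the fitness to 0, which is the manual-analysis result that at least 20 loop iterations are needed to cover the target branch.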

  28. Compute Fitness Gains for Branches
         public bool TestLoop(int x, int[] y) {
             if (x == 90) {
                 for (int i = 0; i < y.Length; i++)
                     if (y[i] == 15) x++;
                 if (x == 110) return true;
             }
             return false;
         }
     Fitness function: |110 - x|
     (x, y)                        | Flip     | Fitness value
     (90, {0})                     |          | 20
     (90, {15})                    | flip b4  | 19
     (90, {15, 0})                 | flip b2  | 19
     (90, {15, 15})                | flip b4  | 18
     (90, {15, 15, 0})             | flip b2  | 18
     (90, {15, 15, 15})            | flip b4  | 17
     (90, {15, 15, 15, 0})         | flip b2  | 17
     (90, {15, 15, 15, 15})        | flip b4  | 16
     (90, {15, 15, 15, 15, 0})     | flip b2  | 16
     (90, {15, 15, 15, 15, 15})    | flip b4  | 15
     …
     Branch b1: i < y.Length
     Branch b2: i >= y.Length
     Branch b3: y[i] == 15
     Branch b4: y[i] != 15
     • Flipping branch b4 (b3) gives us an average fitness gain (loss) of 1 (-1)
     • Flipping branch b2 (b1) gives us an average fitness gain (loss) of 0

  29. Compute Fitness Gain for Branches (cont.)
     • For a flipped node leading to F_new, find the old fitness value F_old before flipping
     • Assign fitness gain (F_old - F_new) to the branch of the flipped node
     • Assign fitness gain (F_new - F_old) to the other branch of the flipped node
     • Compute the average fitness gain for each branch over time
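A small Python sketch of this bookkeeping (Python instead of the slides' C#), assuming a hypothetical `GainTracker` class; the names and data layout are illustrative and not taken from Fitnex or Pex. The sample data replays the first five flips of the TestLoop example, where flipping b4 lowers the fitness value by 1 and flipping b2 leaves it unchanged.

```python
from collections import defaultdict


class GainTracker:
    """Keeps a running average fitness gain per branch (hypothetical helper)."""

    def __init__(self):
        self._total = defaultdict(float)
        self._count = defaultdict(int)

    def record_flip(self, flipped, other, f_old, f_new):
        # The flipped branch is credited with F_old - F_new ...
        self._total[flipped] += f_old - f_new
        self._count[flipped] += 1
        # ... and the opposite branch of the same node with F_new - F_old.
        self._total[other] += f_new - f_old
        self._count[other] += 1

    def average_gain(self, branch):
        n = self._count.get(branch, 0)
        return self._total[branch] / n if n else 0.0


tracker = GainTracker()
fitness_sequence = [20, 19, 19, 18, 18, 17]
flips = [("b4", "b3"), ("b2", "b1"), ("b4", "b3"), ("b2", "b1"), ("b4", "b3")]
for (flipped, other), f_old, f_new in zip(flips, fitness_sequence, fitness_sequence[1:]):
    tracker.record_flip(flipped, other, f_old, f_new)

for branch in ("b1", "b2", "b3", "b4"):
    print(branch, tracker.average_gain(branch))
```

With this data the averages come out as 1 for b4, -1 for b3, and 0 for b1 and b2.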

  30. Search Frontier
     • Each branch node that is a candidate for flipping is prioritized by its composite fitness value:
         • (Fitness value of node - Fitness gain of its branch)
     • The candidate with the best composite fitness value is selected first
     • To avoid local optima and biases, the fitness-guided strategy is integrated with Pex’s other search strategies
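The prioritization can be sketched in Python (rather than the slides' C#) as a sort on the composite value. Two assumptions are made here: lower composite values are treated as better, in line with a search that minimizes fitness, and the candidate records with their `flip`, `fitness`, and `gain` fields are invented for illustration.

```python
def search_order(candidates):
    """Order flip candidates by composite value = node fitness - average gain of its branch."""
    return sorted(candidates, key=lambda c: c["fitness"] - c["gain"])


# Illustrative candidates from the TestLoop example (values assumed for the sketch).
candidates = [
    {"flip": "b2 on path of (90, {15, 15})", "fitness": 18, "gain": 0.0},
    {"flip": "b4 on path of (90, {0})", "fitness": 20, "gain": 1.0},
    {"flip": "b4 on path of (90, {15, 15, 0})", "fitness": 18, "gain": 1.0},
]

for candidate in search_order(candidates):
    print(candidate["fitness"] - candidate["gain"], candidate["flip"])
```

The b4 flip on the path with fitness 18 comes first (composite 17), ahead of the b2 flip on an equally fit path (composite 18), because b4 has historically produced a gain and b2 has not.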

  31. Object Creation
     • Pex normally uses public methods to configure non-public object fields
     • Heuristics are built in to deal with common types
     • The user can help if needed
         void Test(Foo foo) {   // method name is missing in the transcript; "Test" is a placeholder
             if (foo.Value == 123) throw …
         }
         [PexFactoryMethod]
         Foo Create(Bar bar) { return new Foo(bar); }

  32. QuickGraph Example
     • A graph example from the QuickGraph library
         interface IGraph
         {
             /* Adds given vertex to the graph */
             void AddVertex(IVertex v);
             /* Creates a new vertex and adds it to the graph */
             IVertex AddVertex();
             /* Adds an edge to the graph. Both vertices should
                already exist in the graph */
             IEdge AddEdge(IVertex v1, IVertex v2);
         }

  33. Method Under Test
     • Desired object state for reaching targets 1 and 2: the graph object should contain vertices and edges (built by a method sequence)
         class SortAlgorithm
         {
             IGraph graph;
             public SortAlgorithm(IGraph graph) {
                 this.graph = graph;
             }
             public void Compute(IVertex s) {
                 foreach (IVertex u in graph.Vertices)
                 {
                     // Target 1
                 }
                 foreach (IEdge e in graph.Edges)
                 {
                     // Target 2
                 }
             }
         }

  34. Method Under Test
     • Applying Randoop, a random testing approach that constructs test inputs by randomly selecting method calls
     Example sequence generated by Randoop:
         VertexAndEdgeProvider v0 = new VertexAndEdgeProvider();
         Boolean v1 = false;
         BidirectionalGraph v2 = new BidirectionalGraph((IVertexAndEdgeProvider)v0, v1);
         IVertex v3 = v2.AddVertex();
         IVertex v4 = v0.ProvideVertex();
         IEdge v15 = v2.AddEdge(v3, v4);
     v4 is not in the graph, so the edge cannot be added to the graph.
     • Achieved 31.82% (7 of 22) branch coverage
     • Reason for low coverage: not able to generate a graph with vertices and edges

  35. New MSeqGen Approach
     • Mine sequences from existing code bases
     • Reuse mined sequences for achieving desired object states
     A mined sequence from an existing codebase:
         VertexAndEdgeProvider v0;
         bool bVal;
         IGraph ag = new AdjacencyGraph(v0, bVal);
         IVertex source = ag.AddVertex();
         IVertex target = ag.AddVertex();
         IVertex vertex3 = ag.AddVertex();
         IEdge edg1 = ag.AddEdge(source, target);
         IEdge edg2 = ag.AddEdge(target, vertex3);
         IEdge edg3 = ag.AddEdge(source, vertex3);
     The graph object includes both vertices and edges.
     • Use mined sequences to assist Randoop and Pex
     • Both Randoop and Pex achieved 86.40% (19 of 22) branch coverage with assistance from MSeqGen

  36. Challenges Addressed by MSeqGen
     • Existing codebases are often large and complete analysis is expensive
         → Search and analyze only relevant portions
     • Concrete values in mined sequences may be different from desired values
         → Replace concrete values with symbolic values and use dynamic symbolic execution
     • Extracted sequences individually may not be sufficient to achieve desired object states
         → Combine extracted sequences to generate new sequences

  37. MSeqGen: Code Searching
     • Problem: Existing code bases are often large and complete analysis is expensive
     • Solution:
         • Use keyword search to identify relevant method bodies that use target classes
         • Analyze only those relevant method bodies
     Target classes:
     • System.Collections.Hashtable
     • QuickGraph.Algorithms.TSAlgorithm
     Keywords: Hashtable, TSAlgorithm
     Short names of target classes are used as keywords.

  38. MSeqGen: Sequence Generalization
     • Problem: Concrete values in mined sequences are different from the values needed to achieve target states
     • Solution: Generalize sequences by replacing concrete values with symbolic values
     Method under test:
         class A {
             int f1 { set; get; }
             int f2 { set; get; }
             void CoverMe() {
                 if (f1 != 10) return;
                 if (f2 > 25) throw new Exception("bug");
             }
         }
     Mined sequence for A:
         A obj = new A();
         obj.setF1(14);
         obj.setF2(-10);
         obj.CoverMe();
     The sequence cannot help in exposing the bug, since the desired values are f1 = 10 and f2 > 25.

  39. MSeqGen: Sequence Generalization
     • Replace concrete values 14 and -10 with symbolic values X1 and X2
     Mined sequence for A:
         A obj = new A();
         obj.setF1(14);
         obj.setF2(-10);
         obj.CoverMe();
     Generalized sequence for A:
         int x1 = *, x2 = *;
         A obj = new A();
         obj.setF1(x1);
         obj.setF2(x2);
         obj.CoverMe();
     • Use DSE to generate desired values for X1 and X2
     • DSE explores the CoverMe method and generates desired values (X1 = 10 and X2 = 35)
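The effect of generalization can be sketched in Python (instead of the slides' C#). One swap is made and should be read as such: a brute-force scan over a small candidate range stands in for dynamic symbolic execution, which would solve for the values directly. The helper names `generalized_sequence` and `find_bug_inputs` are assumptions of the sketch.

```python
class A:
    """Python rendering of the class under test from the slide."""

    def __init__(self):
        self.f1 = 0
        self.f2 = 0

    def cover_me(self):
        if self.f1 != 10:
            return
        if self.f2 > 25:
            raise RuntimeError("bug")


def generalized_sequence(x1, x2):
    """The mined sequence with its concrete values 14 and -10 replaced by parameters."""
    obj = A()
    obj.f1 = x1
    obj.f2 = x2
    obj.cover_me()


def find_bug_inputs(candidates):
    """Brute-force stand-in for DSE: first (x1, x2) that makes the sequence throw."""
    for x1 in candidates:
        for x2 in candidates:
            try:
                generalized_sequence(x1, x2)
            except RuntimeError:
                return x1, x2
    return None


generalized_sequence(14, -10)          # the mined values: no exception
print(find_bug_inputs(range(-10, 40)))
```

The scan returns x1 = 10 together with the first candidate above 25 for x2; any x2 > 25, such as the 35 quoted on the slide, exposes the same bug.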

  40. Improvement of State-of-the-Art
     • Randoop
         • Without assistance from MSeqGen: achieved 32% branch coverage → with assistance: achieved 86% branch coverage
         • In the evaluation, MSeqGen helped Randoop achieve 8.7% (maximum 20%) higher branch coverage
     • Pex
         • Without assistance from MSeqGen: achieved 45% branch coverage → with assistance: achieved 86% branch coverage
         • In the evaluation, MSeqGen helped Pex achieve 17.4% (maximum 22.5%) higher branch coverage

  41. Test Oracles
     • Write assertions and Pex will try to break them
     • Without assertions, Pex can only find violations of runtime contracts that cause NullReferenceException, IndexOutOfRangeException, etc.
     • Assertions are leveraged in both product and test code
     • Pex can leverage Code Contracts

  42. Summary: Automated Developer Testing
     [Diagram labels: Test inputs, Program, Outputs, Expected Outputs, Test Oracles; the outputs are checked (= ?) against the expected outputs]
     Division of labor
     • Test Generation
         • Test inputs for PUTs are generated by tools (e.g., Pex)
         • Fitnex: guided exploration of paths [DSN 09]
         • MSeqGen: exploiting real-usage sequences [ESEC/FSE 09]
     • Test Oracles
         • Assertions in PUTs are specified by developers

  43. Thank you
     • http://research.microsoft.com/pex
     • http://pexase.codeplex.com/
     • https://sites.google.com/site/asergrp/

  44. Code Contracts
     • http://research.microsoft.com/en-us/projects/contracts/
     • Library to state preconditions, postconditions, invariants
     • Supported by two tools:
         • Static Checker
         • Rewriter: turns Code Contracts into runtime checks
     • Pex analyzes the runtime checks
         • Contracts act as test oracles
         • Pex may find counterexamples for contracts
         • Missing contracts may be suggested

  45. Example: ArrayList
     Class invariant specification:
         public class ArrayList {
             private Object[] _items;
             private int _size;
             ...
             [ContractInvariantMethod] // attribute comes with Contracts
             protected void Invariant() {
                 Contract.Invariant(this._items != null);
                 Contract.Invariant(this._size >= 0);
                 Contract.Invariant(this._items.Length >= this._size);
             }
         }

  46. Parameterized Models

  47. Unit Testing vs. Integration Testing
     • Unit test: while it is debatable what a ‘unit’ is, a ‘unit’ should be small.
     • Integration test: exercises large portions of a system.
     • Observation: integration tests are often “sold” as unit tests
     • White-box test generation does not scale well to integration test scenarios.
     • Possible solution: introduce abstraction layers, and mock components not under test

  48. Example: Testing with Interfaces
         AppendFormat(null, "{0} {1}!", "Hello", "World");  →  "Hello World!"
     .NET implementation:
         public StringBuilder AppendFormat(
                 IFormatProvider provider, char[] chars, params object[] args) {
             if (chars == null || args == null)
                 throw new ArgumentNullException(…);
             int pos = 0;
             int len = chars.Length;
             char ch = '\x0';
             ICustomFormatter cf = null;
             if (provider != null)
                 cf = (ICustomFormatter)provider.GetFormat(typeof(ICustomFormatter));
             …

  49. Stubs / Mock Objects
     • Introduce a mock class which implements the interface.
     • Write assertions over expected inputs, provide concrete outputs
         public class MFormatProvider : IFormatProvider {
             public object GetFormat(Type formatType) {
                 Assert.IsTrue(formatType != null);
                 return new MCustomFormatter();
             }
         }
     • Problems:
         • Costly to write detailed behavior by example
         • How many and which mock objects do we need to write?

  50. Parameterized Mock Objects - 1
     • Introduce a mock class which implements the interface.
     • Let an oracle provide the behavior of the mock methods.
         public class MFormatProvider : IFormatProvider {
             public object GetFormat(Type formatType) {
                 …
                 object o = call.ChooseResult<object>();
                 return o;
             }
         }
     • Result: relevant result values can be generated by a white-box test input generation tool, just as other test inputs can be generated!
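A Python sketch of the idea (Python instead of the slide's C#), under loudly stated assumptions: `ChoiceOracle`, `choose_result`, and `uses_custom_formatter` are hypothetical names invented here, and a plain loop over two candidate return values plays the role of the test generator. It is not the Pex choice API.

```python
class ChoiceOracle:
    """Replays a fixed list of choices; a test generator would pick these values."""

    def __init__(self, choices):
        self._choices = iter(choices)

    def choose_result(self):
        return next(self._choices)


class MFormatProvider:
    """Mock whose return value comes from the oracle instead of being hard-coded."""

    def __init__(self, oracle):
        self._oracle = oracle

    def get_format(self, format_type):
        assert format_type is not None
        return self._oracle.choose_result()


def uses_custom_formatter(provider):
    """Hypothetical code under test: branches on what the provider returns."""
    custom = provider.get_format(str) if provider is not None else None
    return custom is not None


# The "generator" here simply tries both kinds of return value.
outcomes = set()
for choice in (None, object()):
    outcomes.add(uses_custom_formatter(MFormatProvider(ChoiceOracle([choice]))))
print(sorted(outcomes))
```

Because the mock's result is just another input, both branches of the code under test are reached without writing a separate mock class per scenario.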
