Transferring an Automated Test Generation Tool to Practice: From Pex to Fakes, Code Digger, and Pex4Fun. Tao Xie University of Illinois at Urbana-Champaign.
Transferring an Automated Test Generation Tool to Practice: From Pex to Fakes, Code Digger, and Pex4Fun • Tao Xie • University of Illinois at Urbana-Champaign • Part of the research work described in this talk was done in collaboration with the Pex team (Nikolai Tillmann, Peli de Halleux, et al.) @Microsoft Research, students @Illinois ASE, and other collaborators; part of the research work was done by the Pex team only
(Automated) Test Generation • Human • Expensive, incomplete, … • Brute Force • Pairwise, predefined data, etc… • Tool Automation!!
State-of-the-Art/Practice Test Generation Tools Running Symbolic PathFinder ... … ====================================================== results no errors detected ====================================================== statistics elapsed time: 0:00:02 states: new=4, visited=0, backtracked=4, end=2 search: maxDepth=3, constraints=0 choice generators: thread=1, data=2 heap: gc=3, new=271, free=22 instructions: 2875 max memory: 81MB loaded code: classes=71, methods=884 …
Successful Case of MSR Testing Tool: Pex & Relatives • Pex (released in May 2008) • 30,388 downloads (20 months, Feb 08-Oct 09) • Active user community: 1,436 forum posts during ~3 years (Oct 08-Nov 11) • Moles (released in Sept 2009) • Shipped with VS 12 as Fakes • “Provide Microsoft Fakes w/ all Visual Studio editions” got 1,457 community votes • Code Digger (released in Oct 2008 for VS 08/10, in Apr 2013 in VS Gallery for VS 12) • 22,466 downloads (10 months, Apr 13-Jan 14) http://research.microsoft.com/en-us/projects/pex/
Example Comments for Code Digger in VS Gallery • “Great tool to generate unit tests for parameter boundary tests. I like to see it integrated into Visual Studio and the testing features as far as in ReSharper! :)” • “What an awesome tool.. Help us to explore our logic by providing accurate input parameter for each logic branch.. You should try this as one of your ultimate tool :) It really saves a lot of our time to explore every logic branch in our apps..”
Example Comments for Code Digger in VS Gallery cont. • “What a fantastic tool. Whilst it’s not bullet proof, it shows amazing promise. I ran the Code Digger over a number of real-world methods and it immediately identified dozens of edge cases we hadn’t thought of. This is getting rolled-out to my team TODAY! Well done. Brilliant. Really brilliant.” • “Top stuff here. Very anxious for more of the Pex features that were available in VS 2010 Pex & Moles (like auto-gen unit tests). This tool is poised to become indispensable for anyone writing solid suites of unit tests.”
Pex4Fun http://pex4fun.com/ 1,462,489 clicked 'Ask Pex!'
Behind the Scenes of Pex4Fun • Behavior: Secret Impl == Player Impl • Player Implementation: class Player { public static int Puzzle(int x) { return x; } } • Secret Implementation: class Secret { public static int Puzzle(int x) { if (x <= 0) return 1; return x * Puzzle(x-1); } } • Test driver: class Test { public static void Driver(int x) { if (Secret.Puzzle(x) != Player.Puzzle(x)) throw new Exception("Mismatch"); } }
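A minimal sketch of the comparison the driver performs, using the two implementations from the slide. It assumes a brute-force scan over a small input range in place of the dynamic symbolic execution that Pex applies; the `Duel.FindMismatch` helper and the chosen ranges are hypothetical and only serve the illustration.

```csharp
using System;

static class Secret
{
    // Secret implementation from the slide (factorial-style recursion).
    public static int Puzzle(int x)
    {
        if (x <= 0) return 1;
        return x * Puzzle(x - 1);
    }
}

static class Player
{
    // The player's attempt from the slide: the identity function.
    public static int Puzzle(int x)
    {
        return x;
    }
}

static class Duel
{
    // Hypothetical helper: returns the first input in [lo, hi] on which the
    // two implementations disagree, or null if they agree on the whole range.
    public static int? FindMismatch(int lo, int hi)
    {
        for (int x = lo; x <= hi; x++)
        {
            if (Secret.Puzzle(x) != Player.Puzzle(x)) return x;
        }
        return null;
    }
}

static class Program
{
    static void Main()
    {
        int? witness = Duel.FindMismatch(1, 3);
        Console.WriteLine(witness.HasValue
            ? "Mismatch at x = " + witness.Value
            : "No mismatch found in range");
    }
}
```

Note that the two implementations agree on x = 1 and x = 2, so a player who only tries a couple of small positive inputs would not see the difference; a generated counterexample such as x = 3 or x = 0 is what reveals it.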
Example User Feedback on Pex4Fun • “I used to love the first person shooters and the satisfaction of blowing away a whole team of Noobies playing Rainbow Six, but this is far more fun.” • “I’m afraid I’ll have to constrain myself to spend just an hour or so a day on this really exciting stuff, as I’m really stuffed with work.” • “It really got me *excited*. The part that got me most is about spreading interest in teaching CS: I do think that it’s REALLY great for teaching | learning!”
Code Hunt: Redesigned as Game https://www.codehunt.com/
ICFP Programming Contest 2013 • August 8 – 11, 2013 • 300+ teams wrote tools to synthesize bit-vector programs • These tools were evaluated on a set of 1,800 benchmark problems • Main goal: • How would the top teams fare against the best SMT solutions? http://research.microsoft.com/~nswamy/icfpc.pptx http://research.microsoft.com/en-us/events/icfpcontest2013/
ICFP Programming Contest 2013 as Program Synthesis Game • GAME: “I have a secret program A. Can you guess what it is? You have 5 minutes.” • PLAYER: “Hmm. Ok, so what is A(11) and A(12) then?” • GAME: “Since you ask so nicely: A(11)=12 and A(12)=13” • PLAYER: “Ah. I bet A = λx. x+1” • GAME: “Let me check …” (query.smt2: A ≈ λx. x+1 ? No! Counterexample: A(9) <> (λx.x+1) 9) “Nope. A(9)=9.” • PLAYER: “Can you tell me what A(16), A(42), A(128) are?” • GAME: “A(16)=17, A(42)=43, A(128)=129.” • PLAYER: “Ah ha! I guess A = λx. if x & 1 = 0 then x else x + 1” • GAME: “Let me check …” (query.smt2: A ≈ λx. if x&1=0…? Yes!) “Yep! That's right! You score one point.” http://research.microsoft.com/~nswamy/icfpc.pptx http://research.microsoft.com/en-us/events/icfpcontest2013/
What Lies Behind Pex • NOT Random: • Cheap, Fast • “It passed a thousand tests” feeling • … • But Dynamic Symbolic Execution: e.g., Pex, CUTE, EXE • White box • Constraint Solving
Dynamic Symbolic Execution • Loop: Choose next path → Solve → Execute & Monitor • Code to generate inputs for: void CoverMe(int[] a) { if (a == null) return; if (a.Length > 0) if (a[0] == 1234567890) throw new Exception("bug"); } • Constraints to solve → Data → Observed constraints • (none) → null → a==null • a!=null → {} → a!=null && !(a.Length>0) • a!=null && a.Length>0 → {0} → a!=null && a.Length>0 && a[0]!=1234567890 • a!=null && a.Length>0 && a[0]==1234567890 → {123…} → a!=null && a.Length>0 && a[0]==1234567890 • Branch tree (negated conditions): a==null, a.Length>0, a[0]==123… • Done: There is no path left.
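A small sketch of the "Execute & Monitor" half of the loop for `CoverMe`. It does not include a constraint solver: the four inputs are the ones listed on the slide, supplied by hand. The `DseDemo.Trace` method is a hypothetical instrumented copy of `CoverMe` that returns the observed path constraints as text and, unlike the original, reports the bug path instead of throwing.

```csharp
using System;

static class DseDemo
{
    // Instrumented variant of CoverMe (an assumption of this sketch):
    // returns the conjunction of branch conditions observed on the path.
    public static string Trace(int[] a)
    {
        if (a == null) return "a==null";
        if (a.Length > 0)
        {
            if (a[0] == 1234567890)
                return "a!=null && a.Length>0 && a[0]==1234567890";
            return "a!=null && a.Length>0 && a[0]!=1234567890";
        }
        return "a!=null && !(a.Length>0)";
    }
}

static class Program
{
    static void Main()
    {
        // Inputs in the order the slide lists them. In a real DSE tool each
        // one would come from solving the previous path's constraints with
        // the last condition negated.
        int[][] inputs =
        {
            null,
            new int[0],
            new[] { 0 },
            new[] { 1234567890 }
        };
        foreach (int[] a in inputs)
        {
            Console.WriteLine(DseDemo.Trace(a));
        }
    }
}
```

Each input drives execution down a different path, so four runs are enough to cover every path of this method.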
Explosion of Search Space There are decision procedures for individual path conditions, but… • Number of potential paths grows exponentially with number of branches • Reachable code not known initially • Without guidance, same loop might be unfolded forever Fitnex search strategy [Xie et al. DSN 09] http://research.microsoft.com/apps/pubs/default.aspx?id=81089
DSE Example TestLoop(0, {0}) public bool TestLoop(int x, int[] y) { if (x == 90) { for (int i = 0; i < y.Length; i++) if (y[i] == 15) x++; if (x == 110) return true; } return false; } Path condition: !(x == 90) ↓ New path condition: (x == 90) ↓ New test input: TestLoop(90, {0})
DSE Example TestLoop(90, {0}) public bool TestLoop(int x, int[] y) { if (x == 90) { for (int i = 0; i < y.Length; i++) if (y[i] == 15) x++; if (x == 110) return true; } return false; } Path condition: (x == 90) && !(y[0] == 15) ↓ New path condition: (x == 90) && (y[0] == 15) ↓ New test input: TestLoop(90, {15})
Challenge in DSE TestLoop(90, {15}) public bool TestLoop(int x, int[] y) { if (x == 90) { for (int i = 0; i < y.Length; i++) if (y[i] == 15) x++; if (x == 110) return true; } return false; } Path condition: (x == 90) && (y[0] == 15) && !(x+1 == 110) ↓ New path condition: (x == 90) && (y[0] == 15) && (x+1 == 110) ↓ New test input: No solution!?
A Closer Look TestLoop(90, {15}) public bool TestLoop(int x, int[] y) { if (x == 90) { for (int i = 0; i < y.Length; i++) if (y[i] == 15) x++; if (x == 110) return true; } return false; } Path condition: (x == 90) && (y[0] == 15) && (0 < y.Length) && !(1 < y.Length) && !(x+1 == 110) ↓ New path condition: (x == 90) && (y[0] == 15) && (0 < y.Length) && (1 < y.Length) Expand array size
A Closer Look TestLoop(90, {15}) public bool TestLoop(int x, int[] y) { if (x == 90) { for (int i = 0; i < y.Length; i++) if (y[i] == 15) x++; if (x == 110) return true; } return false; } • We can have infinitely many paths! • Manual analysis shows that at least 20 loop iterations are needed to cover the target branch • Exploring all paths up to 20 loop iterations is infeasible: 2^20 paths
Fitnex: Fitness-Guided Exploration [Xie et al. DSN 2009] TestLoop(90, {15, 0}) TestLoop(90, {15, 15}) public bool TestLoop(int x, int[] y) { if (x == 90) { for (int i = 0; i < y.Length; i++) if (y[i] == 15) x++; if (x == 110) return true; } return false; } Key observations: with respect to the coverage target • not all paths are equally promising for branch-node flipping • not all branch nodes are equally promising to flip • Our solution: • Prefer to flip branch nodes on the most promising paths • Prefer to flip the most promising branch nodes on paths • Fitness function to measure how “promising” each one is
Fitness Function • FF computes fitness value (distance between the current state and the goal state) • Search tries to minimize fitness value [Tracey et al. 98, Liu et al. 05, …]
Fitness Function for (x == 110) public bool TestLoop(int x, int[] y) { if (x == 90) { for (int i = 0; i < y.Length; i++) if (y[i] == 15) x++; if (x == 110) return true; } return false; } Fitness function: |110 – x|
Compute Fitness Values for Paths public bool TestLoop(int x, int[] y) { if (x == 90) { for (int i = 0; i < y.Length; i++) if (y[i] == 15) x++; if (x == 110) return true; } return false; } Fitness function: |110 – x| • (x, y) → Fitness Value • (90, {0}) → 20 • (90, {15}) → 19 • (90, {15, 0}) → 19 • (90, {15, 15}) → 18 • (90, {15, 15, 0}) → 18 • (90, {15, 15, 15}) → 17 • (90, {15, 15, 15, 0}) → 17 • (90, {15, 15, 15, 15}) → 16 • (90, {15, 15, 15, 15, 0}) → 16 • (90, {15, 15, 15, 15, 15}) → 15 • … • Prefer to flip nodes on paths with better fitness values • We still need to address which branch node to flip on paths …
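The fitness values listed above can be reproduced with a short sketch. It assumes the fitness is evaluated on the value of `x` when execution reaches the `x == 110` check; `Fitness.Compute` is a hypothetical helper, and returning `null` when the outer `x == 90` branch is not taken is a convention of this sketch only.

```csharp
using System;

static class Fitness
{
    // Replays TestLoop's loop and returns |110 - x| at the target branch,
    // or null if the path never reaches it (outer condition x == 90 fails).
    public static int? Compute(int x, int[] y)
    {
        if (x != 90) return null;
        for (int i = 0; i < y.Length; i++)
        {
            if (y[i] == 15) x++;
        }
        return Math.Abs(110 - x);
    }
}

static class Program
{
    static void Main()
    {
        Console.WriteLine(Fitness.Compute(90, new[] { 0 }));       // 20
        Console.WriteLine(Fitness.Compute(90, new[] { 15 }));      // 19
        Console.WriteLine(Fitness.Compute(90, new[] { 15, 0 }));   // 19
        Console.WriteLine(Fitness.Compute(90, new[] { 15, 15 }));  // 18
    }
}
```

An array of twenty 15s yields a fitness of 0, which is exactly the input that covers the target branch.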
Compute Fitness Gains for Branches public bool TestLoop(int x, int[] y) { if (x == 90) { for (int i = 0; i < y.Length; i++) if (y[i] == 15) x++; if (x == 110) return true; } return false; } Fitness function: |110 – x| • Branch b1: i < y.Length • Branch b2: i >= y.Length • Branch b3: y[i] == 15 • Branch b4: y[i] != 15 • (x, y) [flip that produced it] → Fitness Value • (90, {0}) → 20 • (90, {15}) [flip b4] → 19 • (90, {15, 0}) [flip b2] → 19 • (90, {15, 15}) [flip b4] → 18 • (90, {15, 15, 0}) [flip b2] → 18 • (90, {15, 15, 15}) [flip b4] → 17 • (90, {15, 15, 15, 0}) [flip b2] → 17 • (90, {15, 15, 15, 15}) [flip b4] → 16 • (90, {15, 15, 15, 15, 0}) [flip b2] → 16 • (90, {15, 15, 15, 15, 15}) [flip b4] → 15 • … • Flipping branch b4 (b3) gives us average 1 (-1) fitness gain (loss) • Flipping branch b2 (b1) gives us average 0 fitness gain (loss)
Compute Fitness Gain for Branches cont. • For a flipped node leading to F_new, find the old fitness value F_old before flipping • Assign fitness gain (F_old – F_new) to the branch of the flipped node • Assign fitness gain (F_new – F_old) to the other branch of the flipped node • Compute the average fitness gain for each branch over time
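The bookkeeping in these bullets can be sketched as a running average per branch. This is an illustrative sketch, not Pex's implementation: the `GainTracker` class, its method names, and the use of string branch labels are all assumptions.

```csharp
using System;
using System.Collections.Generic;

class GainTracker
{
    private readonly Dictionary<string, double> sum = new Dictionary<string, double>();
    private readonly Dictionary<string, int> count = new Dictionary<string, int>();

    private void Add(string branch, double gain)
    {
        // TryGetValue leaves s and c at 0 when the branch is not yet known.
        sum.TryGetValue(branch, out double s);
        count.TryGetValue(branch, out int c);
        sum[branch] = s + gain;
        count[branch] = c + 1;
    }

    // Records one flip: the flipped branch gets (fOld - fNew),
    // the opposite branch at the same node gets (fNew - fOld).
    public void RecordFlip(string flipped, string other, double fOld, double fNew)
    {
        Add(flipped, fOld - fNew);
        Add(other, fNew - fOld);
    }

    // Average gain observed so far; 0 for a branch with no observations
    // (a convention of this sketch).
    public double Average(string branch)
    {
        return (count.TryGetValue(branch, out int c) && c > 0) ? sum[branch] / c : 0.0;
    }
}

static class Program
{
    static void Main()
    {
        var tracker = new GainTracker();
        tracker.RecordFlip("b4", "b3", 20, 19); // (90, {0})  -> (90, {15})
        tracker.RecordFlip("b2", "b1", 19, 19); // (90, {15}) -> (90, {15, 0})
        tracker.RecordFlip("b4", "b3", 19, 18); // (90, {15, 0}) -> (90, {15, 15})
        foreach (string b in new[] { "b1", "b2", "b3", "b4" })
        {
            Console.WriteLine(b + ": " + tracker.Average(b));
        }
    }
}
```

Replaying the first three flips of the TestLoop example gives averages of 1 for b4, -1 for b3, and 0 for b1 and b2, in line with the averages stated for that example.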
Search Frontier • Each branch node candidate for being flipped is prioritized based on its composite fitness value: • (Fitness value of node – Fitness gain of its branch) • Select first the one with the best composite fitness value http://research.microsoft.com/apps/pubs/default.aspx?id=81089
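A minimal sketch of the prioritization rule, assuming that "best" means the lowest composite value (consistent with a search that minimizes fitness). The `Candidate` and `Frontier` types, the candidate labels, and the numbers in `Main` are hypothetical; a real frontier would also be combined with other search strategies.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class Candidate
{
    public readonly string Name;
    public readonly double NodeFitness;  // fitness value of the path the node is on
    public readonly double BranchGain;   // average fitness gain of the node's branch

    public Candidate(string name, double nodeFitness, double branchGain)
    {
        Name = name;
        NodeFitness = nodeFitness;
        BranchGain = branchGain;
    }

    // Composite fitness value: fitness value of node minus fitness gain of its branch.
    public double Composite
    {
        get { return NodeFitness - BranchGain; }
    }
}

static class Frontier
{
    // Picks the candidate with the lowest composite value (assumed to be "best").
    public static Candidate SelectNext(IEnumerable<Candidate> frontier)
    {
        return frontier.OrderBy(c => c.Composite).First();
    }
}

static class Program
{
    static void Main()
    {
        var frontier = new List<Candidate>
        {
            new Candidate("b4 on path (90, {0})", 20, 1),          // composite 19
            new Candidate("b2 on path (90, {15, 15, 0})", 18, 0),  // composite 18
            new Candidate("b4 on path (90, {15, 15, 0})", 18, 1)   // composite 17
        };
        Candidate next = Frontier.SelectNext(frontier);
        Console.WriteLine(next.Name);
    }
}
```

With these illustrative numbers, the node on the fitter path whose branch has the higher average gain is flipped first.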
Successful Case of MSR Testing Tool: Pex & Relatives • Pex (released in May 2008): • 30,388 downloads (20 months, Feb 08-Oct 09) • Active user community: 1,436 forum posts during ~3 years (Oct 08-Nov 11) • Moles (released in Sept 2009) • Shipped with VS 12 as Fakes • “Provide Microsoft Fakes w/ all Visual Studio editions” got 1,457 community votes • Code Digger (released in Oct 2008 for VS 08/10, in Apr 2013 in VS Gallery for VS 12) • 22,466 downloads (10 months, Apr 13-Jan 14) • How did such a successful case come about?
Lesson 1. Started as (Evolved) Dream • Surrounding (Moles/Fakes) • Simplifying (Code Digger) • Retargeting (Pex4Fun/Code Hunt) • Parameterized Unit Tests Supported by Pex: void TestAdd(ArrayList a, object o) { Assume.IsTrue(a!=null); int i = a.Count; a.Add(o); Assert.IsTrue(a[i] == o); }
Lesson 2. Chicken and Egg Macro Perspective • Developer/manager: “Who is using your tool?” • Pex team: “Do you want to be the first?” • Developer/manager: “I love your tool but no.” Tool Adoption by (Mass) Target Users Tool Shipping with Visual Studio Micro Perspective
Lesson 3. Human Factors – Generated Data Consumed by Human • Developer: “Code digger generates a lot of “\0” strings as input. I can’t find a way to create such a string via my own C# code. Could any one show me a C# snippet? I meant zero terminated string.” • Pex team: “In C#, a \0 in a string does not mean zero-termination. It’s just yet another character in the string (a very simple character where all bits are zero), and you can create as Pex shows the value: “\0”.” • Developer: “Your tool generated “\0”” • Pex team: “What did you expect?” • Developer: “Marc.”
Lesson 3. Human Factors – Generated Name Consumed by Human • Developer: “Your tool generated a test called Foo001. I don’t like it.” • Pex team: “What did you expect?” • Developer: “Foo_Should_Fail_When_Bar_Is_Negative.”
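The Pex team's answer can be checked with a few lines of C#. The sketch below only uses the string literal escape `\0` and standard `string` members; the `NulCharDemo` class name is made up for the example.

```csharp
using System;

static class NulCharDemo
{
    // "\0" is an ordinary one-character string whose single char has value 0.
    public static string MakeNulString()
    {
        return "\0";
    }
}

static class Program
{
    static void Main()
    {
        string s = NulCharDemo.MakeNulString();
        Console.WriteLine(s.Length);                    // 1
        Console.WriteLine((int)s[0]);                   // 0
        Console.WriteLine("a\0b".Length);               // 3: the string does not end at \0
        Console.WriteLine(s == new string('\0', 1));    // True
    }
}
```

The third line is the point of the exchange: a .NET string carries its own length, so a `\0` in the middle does not terminate it.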
Lesson 3. Human Factors – Generated Results Consumed by Human Object Creation messages suppressed (related to Covana by Xiao et al. [ICSE’11]) Exception Tree View Exploration Tree View Exploration Results View
Lesson 4. Best vs. Worst Cases public bool TestLoop(int x, int[] y) { if (x == 90) { for (int i = 0; i < y.Length; i++) if (y[i] == 15) x++; if (x == 110) return true; } return false; } http://research.microsoft.com/apps/pubs/default.aspx?id=81089 Fitnex by Xie et al. [DSN’09] Key observations: with respect to the coverage target • not all paths are equally promising for branch-node flipping • not all branch nodes are equally promising to flip • To avoid local optima or biases, the fitness-guided strategy is integrated with Pex’s fairness search strategies • Our solution: • Prefer to flip branch nodes on the most promising paths • Prefer to flip the most promising branch nodes on paths • Fitness function to measure how “promising” each one is
Lesson 5. Tool Users’ Stereotypical Mindset or Habits • “Simply one mouse click and then everything would work just perfectly” • Often need environment isolation w/ Moles/Fakes or factory methods, … • “One mouse click, a test generation tool would detect all or most kinds of faults in the code under test” • Developer: “Your tool only finds null references.” • Pex team: “Did you write any assertions?” • Developer: “Assertion???” • “I do not need test generation; I already practice unit testing (and/or TDD). Test generation does not fit into the TDD process”
Lesson 6. Practitioners’ Voice Gathered feedback from target tool users • Directly, e.g., via • MSDN Pex forum, tech support, outreach to MS engineers and .NET user groups • Indirectly, e.g., via • interactions with MS Visual Studio team (a tool vendor to its huge user base) • Motivations of Moles • Refactoring for testability faced resistance in practice • Observation at Agile 2008: high attention on mock objects and tool support
Lesson 7. Collaboration w/ Academia • Win-win collaboration model • Win (Ind Lab): longer-term research innovation, man power, research impacts, … • Win (Univ): powerful infrastructure, relevant/important problems in practice, both research and industry impacts, … • Industry-located Collaborations • Faculty visits, e.g., Fitnex, Pex4Fun • Student internships, e.g., FloPSy, DyGen, state cov • Academia-located Collaborations http://research.microsoft.com/en-us/projects/pex/community.aspx#publications
Lesson 7. Collaboration w/ Academia Academia-located Collaborations • Immediate indirect impacts, e.g., • Reggae [ASE’09s] → Rex • MSeqGen [FSE’09] → DyGen • Guided Cov [ICSM’10] → state coverage • Long-term indirect impacts, e.g., • DySy by Csallner et al. [ICSE’08] • Seeker [OOPSLA’11] • Covana [ICSE’11] http://research.microsoft.com/en-us/projects/pex/community.aspx#publications
Summary • Pex practice impacts • Moles/Fakes, Code Digger, Pex4Fun/Code Hunt • Lessons in transferring tools • Started as (Evolved) Dream • Chicken and Egg • Human Factors • Best vs. Worst Cases • Tool Users’ Stereotypical Mindset or Habits • Practitioners’ Voice • Collaboration w/ Academia
Thank you http://research.microsoft.com/pex https://sites.google.com/site/asergrp/