1 / 42

Skoll: A System for Distributed Continuous Quality Assurance

Skoll is a system designed to tackle the challenges of quality assurance in large-scale, complex software systems. It enables distributed and continuous QA processes across geographically-distributed teams, allowing for massive parallelization and improved access to resources. Skoll's infrastructure and approach provide a coordinated and efficient way to achieve comprehensive QA in systems with numerous configuration options and evolving requirements.

lbrenton
Download Presentation

Skoll: A System for Distributed Continuous Quality Assurance

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Skoll: A System for Distributed Continuous Quality Assurance Atif Memon & Adam Porter University of Maryland {atif,aporter}@cs.umd.edu

  2. Quality Assurance for Large-Scale Systems • Modern systems increasingly complex • Run on numerous platform, compiler & library combinations • Have 10’s, 100’s, even 1000’s of configuration options • Are evolved incrementally by geographically-distributed teams • Run atop of other frequently changing systems • Have multi-faceted quality objectives • How do you QA systems like this? http://www.cs.umd.edu/{projects/skoll,~atif/GUITARWeb/}

  3. Distributed Continuous Quality Assurance • QA processes conducted around-the-world, around-the-clock on powerful, virtual computing grids • Grids can by made up of end-user machines, project-wide resources or dedicated computing clusters • General Approach • Divide QA processes into numerous tasks • Intelligently distribute tasks to clients who then execute them • Merge and analyze incremental results to efficiently complete desired QA process • Expected benefits • Massive parallelization allows more, better & faster QA • Improved access to resources/environs. not readily found in-house • Carefully coordinated QA efforts enables more sophisticated analyses http://www.cs.umd.edu/{projects/skoll,~atif/GUITARWeb/}

  4. Group Collaborators Doug Schmidt & Andy Gohkale Alex Orso Myra Cohen Murali Haran, Alan Karr, Mike Last, & Ashish Sanil Sandro Fouché, Alan Sussman, Cemal Yilmaz (now at IBM TJ Watson) & Il-Chul Yoon http://www.cs.umd.edu/{projects/skoll,~atif/GUITARWeb/}

  5. Skoll DCQA Infrastructure & Approach Clients See: A. Porter, C. Yilmaz, A. Memon, A. Nagarajan, D. C. Schmidt, and B. Natarajan, Skoll: A Process and Infrastructure for Distributed Continuous Quality Assurance. IEEE Transactions on Software Engineering. August 2007, 33(8), pp. 510-525. http://www.cs.umd.edu/{projects/skoll,~atif/GUITARWeb/}

  6. Skoll DCQA Infrastructure & Approach Clients 1. Model See: A. Porter, C. Yilmaz, A. Memon, A. Nagarajan, D. C. Schmidt, and B. Natarajan, Skoll: A Process and Infrastructure for Distributed Continuous Quality Assurance. IEEE Transactions on Software Engineering. August 2007, 33(8), pp. 510-525. http://www.cs.umd.edu/{projects/skoll,~atif/GUITARWeb/}

  7. Skoll DCQA Infrastructure & Approach Clients 2. Reduce Model See: A. Porter, C. Yilmaz, A. Memon, A. Nagarajan, D. C. Schmidt, and B. Natarajan, Skoll: A Process and Infrastructure for Distributed Continuous Quality Assurance. IEEE Transactions on Software Engineering. August 2007, 33(8), pp. 510-525. http://www.cs.umd.edu/{projects/skoll,~atif/GUITARWeb/}

  8. Test Request Test Resources Skoll DCQA Infrastructure & Approach Clients 3. Distribution See: A. Porter, C. Yilmaz, A. Memon, A. Nagarajan, D. C. Schmidt, and B. Natarajan, Skoll: A Process and Infrastructure for Distributed Continuous Quality Assurance. IEEE Transactions on Software Engineering. August 2007, 33(8), pp. 510-525. http://www.cs.umd.edu/{projects/skoll,~atif/GUITARWeb/}

  9. Test Results Skoll DCQA Infrastructure & Approach Clients 4. Feedback See: A. Porter, C. Yilmaz, A. Memon, A. Nagarajan, D. C. Schmidt, and B. Natarajan, Skoll: A Process and Infrastructure for Distributed Continuous Quality Assurance. IEEE Transactions on Software Engineering. August 2007, 33(8), pp. 510-525. http://www.cs.umd.edu/{projects/skoll,~atif/GUITARWeb/}

  10. Skoll DCQA Infrastructure & Approach Clients 5. Steering See: A. Porter, C. Yilmaz, A. Memon, A. Nagarajan, D. C. Schmidt, and B. Natarajan, Skoll: A Process and Infrastructure for Distributed Continuous Quality Assurance. IEEE Transactions on Software Engineering. August 2007, 33(8), pp. 510-525. http://www.cs.umd.edu/{projects/skoll,~atif/GUITARWeb/}

  11. The ACE+TAO+CIAO (ATC) System • ATC characteristics • 2M+ line open-source CORBA implementation • maintained by 40+, geographically-distributed developers • 20,000+ users worldwide • Product Line Architecture with 500+ configuration options • runs on dozens of OS and compiler combinations • Continuously evolving – 200+ CVS commits per week • Quality concerns include correctness, QoS, footprint, compilation time & more http://www.cs.umd.edu/{projects/skoll,~atif/GUITARWeb/}

  12. Define QA Space http://www.cs.umd.edu/{projects/skoll,~atif/GUITARWeb/}

  13. Nearest Neighbor Search http://www.cs.umd.edu/{projects/skoll,~atif/GUITARWeb/}

  14. Nearest Neighbor Search http://www.cs.umd.edu/{projects/skoll,~atif/GUITARWeb/}

  15. Nearest Neighbor Search http://www.cs.umd.edu/{projects/skoll,~atif/GUITARWeb/}

  16. Nearest Neighbor Search http://www.cs.umd.edu/{projects/skoll,~atif/GUITARWeb/}

  17. CORBA_MESSAGING | 1 0 AMI_POLLER AMI 1 0 1 0 AMI_CALLBACK OK ERR-1 ERR-3 1 0 OK ERR-2 Fault Characterization • We used machine learning techniques (classification trees) to model option & setting patterns that predict test failures http://www.cs.umd.edu/{projects/skoll,~atif/GUITARWeb/}

  18. Applications & Feasibility Studies • Compatibility testing of component-based systems • Configuration-level fault characterization • Test case generation & input space exploration http://www.cs.umd.edu/{projects/skoll,~atif/GUITARWeb/}

  19. Compatibility Testing of Comp-Based Systems Goal • Given a component-based system, identify components & their specific versions that fail to build Solution Approach • Sample the configuration space, efficiently test this sample & identify subspaces in which compilation & installation fails • Initial focus on building & installing components. Later work will add functional and performance testing See: I. Yoon, A. Sussman, A. Memon & A. Porter, Direct-Dependency-based Software Compatibility Testing. International Conference on Automated Software Engineering, Nov. 2007 (to appear). http://www.cs.umd.edu/{projects/skoll,~atif/GUITARWeb/}

  20. The InterComm (IC) Framework • Middleware for coupling large scientific simulations • Built from up to 14 other components (e.g., PVM, MPI, GCC, OS) • Each comp can have several actively maintained versions • There are complex constraints between components, e.g., • Requires GCC version 2.96 or later • When configured with multiple GNU compilers, all must have the same version number • When configured with multiple comps that use MPI, all must use the same implementation & version • http://www.cs.umd.edu/projects/hpsl/chaos/ResearchAreas/ic • Developers need help to • Identify working/broken configurations • Broaden working set (to increase potential user base) • Rationally manage support activities http://www.cs.umd.edu/{projects/skoll,~atif/GUITARWeb/}

  21. Annotated Comp. Dependency Graph • ACDG = (CDG, Ann) • CDG: DAG capturing inter-comp deps • Ann: comp. versions & constraints • Constraints for each cfg, e.g., • ver (gf) = x ver (gcr) = x • ver (gf) = 4.1.1  ver (gmp) ≥ 4.0 • Can generate cfgs from ACDG • 3552 total cfgs. Takes up to ~10,700 CPU hrs to build all http://www.cs.umd.edu/{projects/skoll,~atif/GUITARWeb/}

  22. fc 4.0 gcr v4.0.3 gmp v4.2.1 pf/6.2 Improving Test Execution • Cfgs often share common build subsequences. This build effort should be reusable across cfgs • Combine all cfgs into a data structure called a prefix tree • Execute implied test plan across grid by (1) assigning subpaths to clients, (2) building each subcfg in a VM & caching the VMs to enable reuse • Example: with 8 machines each able to cache up to 8 VMs exhaustive testing takes up to 355 hours http://www.cs.umd.edu/{projects/skoll,~atif/GUITARWeb/}

  23. Direct-Dependency (DD) Coverage • Hypothesis: A comp’s build process is most likely to be affected by the comps on which it directly depends • A directly depends on B iff there is a path (in CDG) from A to B containing no comp nodes • Sampling approach • Identify all DDs between every pair of components • Identify all valid instantiations of these DDs (ver. combs that violate no constraints) • Select a (small) set of cfgs that cover all valid instantiations of the DDs http://www.cs.umd.edu/{projects/skoll,~atif/GUITARWeb/}

  24. Executing the DD Coverage Test Suite • DD test suite much smaller than exhaustive • 211 cfgs with 649 comps vs 3552 cfgs with 9919 comps • For IC, no loss of test eff. (same build failures exposed) • Speedups achieved using 8 machines w/ 8 VM cache • Actual case: 2.54 (18 vs 43 hrs) • Best case: 14.69 (52 vs 355 hrs) http://www.cs.umd.edu/{projects/skoll,~atif/GUITARWeb/}

  25. Summary • Infrastructure in place & working • Complete client/server implementation using VMware • Simulator for large scale tests on limited resources • Initial results promising, but lots of work remains • Ongoing activities • Alternative algorithms & test execution policies • More theoretical study of sampling & test exec approaches • Apply to more software systems http://www.cs.umd.edu/{projects/skoll,~atif/GUITARWeb/}

  26. Configuration-Level Fault Characterization Goal • Help developers localize configuration-related faults Current Solution Approach • Use covering arrays to sample the cfg space to test for subspaces in which (1) compilation fails or (2) reg. tests fail • Build models that characterize the configuration options and specific settings that define the failing subspace See: C. Yilmaz, M. Cohen, A. Porter, Covering Arrays for Efficient Fault Characterization in Complex Configuration Spaces, ISSTA’04, TSE v32 (1) http://www.cs.umd.edu/{projects/skoll,~atif/GUITARWeb/}

  27. Covering Arrays • Compute test schedule from t-way covering arrays • a set of configurations in which all ordered t-tuples of option settings appear at least once • 2-way covering array example: http://www.cs.umd.edu/{projects/skoll,~atif/GUITARWeb/}

  28. Limitations • Must choose the strength covering array before computing it • No way to know, a priori, what the right value is • Our experience suggests failures patterns can change over time • Choose too high: • Run more tests than necessary • Testing might not finish before next release • Non-uniform sample negatively affects classification performance • Choose too low: • Non-uniform sample negatively affects classification techniques • Must repeat process at higher strength http://www.cs.umd.edu/{projects/skoll,~atif/GUITARWeb/}

  29. Incremental Covering Arrays • Start with traditional covering array(s) of low strength (usually 2) • Execute test schedule & classify observed failures • If resources allow or classification performance requires • Increment strength • Build new covering array using previously run array(s) as seeds See: S. Fouche, M. Cohen and A. Porter. Towards Incremental Adaptive Covering Arrays, ESEC/FSE 2007, (to appear) http://www.cs.umd.edu/{projects/skoll,~atif/GUITARWeb/}

  30. Incremental Covering Arrays (cont.) Multiple CAs at each level of t • Use t1 as a seed for the first t+1-way array (t+11) • To create the itht-way array (ti), create a seed of size = |t-11| using non-seeded cfgs from t-1i • If |seed| < |t-11| complete seed with cfgs from t+11 http://www.cs.umd.edu/{projects/skoll,~atif/GUITARWeb/}

  31. MySQL Case Study • Project background • Widely-used, 2M+ line open-source database project • Cont. evolving & maint. by geographically-distributed developers • Dozens of cfg opts & runs on dozens of OS/compiler combos • Case study using release 5.0.24 • Used 13 cfg opts with 2-12 settings each (> 110k unique cfgs) • 460 tests/per config across a grid of 50 machines • Executed ~50M tests total using ~25 CPU years http://www.cs.umd.edu/{projects/skoll,~atif/GUITARWeb/}

  32. Results • Built 3 trad and incr covering arrays for 2  t  4 • Traditional sizes : 108, 324, 870 • Incremental sizes: 113, 336 (223), 932 (596) • Incr appr exposed & classified the same failures as trad appr • Costs depend on t & failures patterns • Failures at level t : Inc > Trad (4-9%) • Failures at level < t : Inc < Trad (65-87%) • Failures at level > t : Inc < Trad (28-38%) http://www.cs.umd.edu/{projects/skoll,~atif/GUITARWeb/}

  33. Summary • New application driving infrastructure improvements • Initial results encouraging • Applied process to a configuration space with over 110K cfgs • Found many test failures corresponding to real bugs • Incremental approach more flexible than traditional approach. Appears to offer substantial savings in best case, while incurring minimal cost in worst case • Ongoing extensions • MySQL continuous build process • Community involvement starting • Want to volunteer? Go to http://www.cs.umd.edu http://www.cs.umd.edu/{projects/skoll,~atif/GUITARWeb/}

  34. JFCUnit Other interactions Exponential with length Capture/replay Tedious Test “common” sequences Bad Idea Model-based techniques GUITAR guitar.cs.umd.edu GUI Test Case – Executable By A “Robot” http://www.cs.umd.edu/{projects/skoll,~atif/GUITARWeb/}

  35. Sampling Follows Follows Follows Follows Modeling The Event-Interaction Space • Event flow graph (EFG) • Nodes: all GUI events • Starting events • Edges: Follows • relationship • Reverse Engineering • Obtained Automatically • Test case generation • Cover all edges See: Atif M. Memon and Qing Xie, Studying the Fault-Detection Effectiveness of GUI Test Cases for Rapidly Evolving Software. IEEE Transactions on Software Engineering, vol. 31, no. 10, 2005, pp. 884-896. http://www.cs.umd.edu/{projects/skoll,~atif/GUITARWeb/}

  36. Point to the CVS head Push the button Read error report What happens Gets code from CVS head Builds Reverse engineers the event-flow graph Generates test cases to cover all the edges 2-way covering Runs them SourceForge.net Four applications Lets See How It Works! http://www.cs.umd.edu/{projects/skoll,~atif/GUITARWeb/}

  37. Intuition Non-interacting events (e.g., Save, Find) Interacting events (e.g., Copy, Paste) Key Idea Identify interacting events Mark the EFG edges (Annotated graph) Generate 3-way, 4-way, … covering test cases for interacting events only Digging Deeper! EFG http://www.cs.umd.edu/{projects/skoll,~atif/GUITARWeb/}

  38. Identifying Interacting Events • High-level overview of approach • Observe how events execute on the GUI • Events interact if they influence one another’s execution • Execute event e2; execute event sequence <e1, e2> • Did e1 influence e2’s execution? • If YES, then they must be tested further; annotate the <e1, e2> edge in graph • Use feedback • Generate seed suite • 2-way covering test cases • Run test cases • Need to obtain sets of GUI states • Collect GUI run-time states as feedback • Analyze feedback and obtain interacting event sets • Generate new test cases • 3-way, 4-way, … covering test cases http://www.cs.umd.edu/{projects/skoll,~atif/GUITARWeb/}

  39. Did We Do Better? • Compare feedback-based approach to 2-way coverage http://www.cs.umd.edu/{projects/skoll,~atif/GUITARWeb/}

  40. Summary • Manually developed test cases • JFCUnit, Capture/replay • Can be deployed and executed by a “robot” • Too many interactions to test • Exponential • The GUITAR Approach • Develop a model of all possible interactions • Use abstraction techniques to “sample” the model • Develop adequacy criteria • Generate an “initial test suite”; Develop an “Execute tests – collect feedback – annotate model – generate tests” cycle • Feasibility study & results http://www.cs.umd.edu/{projects/skoll,~atif/GUITARWeb/}

  41. Future Work • Need volunteers for MySQL Build Farm Project • http://skoll.cs.umd.edu • Looking for more example systems (help!) • Continue improving Skoll system • New problem classes • Performance and robustness optimization • Improved use of test data • Test case ROI analysis • Configuration advice • Cost-aware testing (e.g., minimize power, network, disk) • Use source code analysis to further reduce state spaces • Extend test generation technology outside GUI applications • QA for distributed systems http://www.cs.umd.edu/{projects/skoll,~atif/GUITARWeb/}

  42. The End

More Related