SSS Validation and Testing
September 11, 2003, Rockville, MD
William McLendon, Neil Pundit, Erik DeBenedictis

Sandia is a multiprogram laboratory operated by Sandia Corporation, a Lockheed Martin Company, for the United States Department of Energy under contract DE-AC04-94AL85000.
Overview
• APItest
• Release testing experiences at Sandia
• Status daemon
Distributed Runtime System Testing
• Complex system of interactions
• Approach to testing:
  • Component testing
  • Benchmarks (performance / functionality)
  • Operational profile
  • Stress testing
• Users expect a high degree of quality in today's high-end systems!
APItest - Overview
• Unit-testing tool for network components
• Targeted at networked applications
• Extensible framework
• Dependency calculus for inter-test relationships
• Scriptable tests (XML schema grammar)
• Multi-protocol support: TCP/IP, SSSLib, Portals, HTTP
Accomplishments Since Last Meeting
• Spent a week at Argonne (July)
• Major rework of the APItest framework:
  • Individual tests are atomic.
  • The framework handles the hard work of checking tests, dependencies, and aggregate results.
• Extensibility: new test types are easy to create
• Dependency system:
  • Relationships are defined as a DAG encoded in XML.
  • Boolean dependencies on edges.
Supported Test Types
• sssTest: uses ssslib to communicate with ssslib-enabled components
• shellTest: executes a shell command
• httpTest: e.g., testing web interfaces (à la Globus, etc.)
• tcpipTest: raw socket communication over TCP/IP
Creating New Test Types is Easy

A simple test that will always pass:

```python
class passTest(Test):
    __attrfields__ = ['name']
    typemap = {'dependencies': TODependencies}

    def setup(self):
        pass

    def execute(self, scratch):
        self.expect['foo'] = [('REGEXP', 'a')]
        self.response['foo'] = 'a'
```
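As a hedged sketch of how such a test might be driven (the real `Test` base class, `__attrfields__`, and `TODependencies` come from the APItest framework and are not shown in the slides; the minimal base class and `run` helper below are assumptions for illustration):

```python
import re

class Test:
    """Minimal stand-in for APItest's Test base class (assumed interface)."""
    def __init__(self, name):
        self.name = name
        self.expect = {}    # field -> list of (match_type, pattern)
        self.response = {}  # field -> actual value produced by execute()

    def setup(self):
        pass

    def execute(self, scratch):
        raise NotImplementedError

class passTest(Test):
    """A trivial test whose response always satisfies its expectation."""
    def execute(self, scratch):
        self.expect['foo'] = [('REGEXP', 'a')]
        self.response['foo'] = 'a'

def run(test):
    """Run one test and return True iff every expected field matches."""
    test.setup()
    test.execute(scratch={})
    for field, expectations in test.expect.items():
        actual = test.response.get(field, '')
        for kind, pattern in expectations:
            if kind == 'REGEXP':
                if not re.search(pattern, actual):
                    return False
            elif kind == 'TXTSTR':   # exact string match
                if actual != pattern:
                    return False
    return True

print(run(passTest('always-pass')))  # -> True
```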
Matching and Aggregation
• An individual test can be executed many times in a sequence.
• PASS/FAIL can be determined from the percentage of runs that matched.
• The percent match can also be specified as a range.
• The expected result is specified as a regular expression (REGEXP) or a string for exact matching (TXTSTR).
• Notation: M[min:max] - percent matching
  • min/max are bounds on the percentage of runs where actual and expected results match.
  • If the actual percentage falls within the specified range the test PASSes; otherwise it FAILs.
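The M[min:max] rule above can be sketched in a few lines of Python (function and parameter names here are illustrative, not APItest's actual API):

```python
def aggregate(results, min_pct=0.0, max_pct=100.0):
    """PASS iff the percentage of matching runs falls within [min_pct, max_pct].

    results: list of booleans, one per run (True = actual matched expected).
    """
    if not results:
        return False
    pct = 100.0 * sum(results) / len(results)
    return min_pct <= pct <= max_pct

# M[90:] -- at least 90% of runs must match
print(aggregate([True] * 9 + [False], min_pct=90.0))  # -> True
# M[:0] -- the test is expected to fail on every run
print(aggregate([False, False], max_pct=0.0))         # -> True
print(aggregate([True, False], max_pct=0.0))          # -> False
```

Note that a range like M[:0] makes "expected failure" a first-class outcome: a test that never matches still PASSes its aggregate check.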
Test Dependencies

[Diagram: two dependency DAGs with percent-match ranges M[min:max] annotating the edges.]
• T iff A[40:90] OR B[0:0]
• T iff (A[100:] B[90:]) C[:]
• Notation: M[40:90] means >= 40% and <= 90% of test runs matched.
An Example Dependency

```xml
<dependencies>
  <AND>
    <dependency name='A' minPctMatch='100'/>
    <OR>
      <AND>
        <dependency name='B' minPctMatch='100'/>
        <dependency name='C'/>
      </AND>
      <dependency name='D' maxPctMatch='0'/>
    </OR>
  </AND>
</dependencies>
```

Equivalent notation: A[100:] AND ((B[100:] AND C[:]) OR D[:0])
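A hedged sketch of how such a dependency tree might be evaluated (the element and attribute names follow the example above; the recursive evaluator itself is an illustration, not APItest's implementation):

```python
import xml.etree.ElementTree as ET

XML = """
<dependencies>
  <AND>
    <dependency name='A' minPctMatch='100'/>
    <OR>
      <AND>
        <dependency name='B' minPctMatch='100'/>
        <dependency name='C'/>
      </AND>
      <dependency name='D' maxPctMatch='0'/>
    </OR>
  </AND>
</dependencies>
"""

def evaluate(node, pct):
    """Recursively evaluate AND/OR/dependency nodes against observed
    percent-match values (pct: test name -> percentage matched)."""
    if node.tag == 'dependency':
        lo = float(node.get('minPctMatch', 0))
        hi = float(node.get('maxPctMatch', 100))
        return lo <= pct[node.get('name')] <= hi
    children = [evaluate(child, pct) for child in node]
    # The root <dependencies> element is treated as an implicit AND.
    return all(children) if node.tag in ('AND', 'dependencies') else any(children)

root = ET.fromstring(XML)
# A matched fully and D failed every run, so the OR branch is satisfied.
print(evaluate(root, {'A': 100.0, 'B': 0.0, 'C': 100.0, 'D': 0.0}))  # -> True
```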
An Example Test Sequence

[Diagram: "test daemon" --M[:30]--> "reset daemon" --M[:]--> "test other stuff"; a daemon test with at most 30% of runs matching triggers a reset, after which testing of other components continues regardless of outcome (M[:]).]
Example Scripts

A simple shell execution test:

```xml
<shellTest name='test1' numReps='1' preDelay='5' postDelay='5.3' command='ls -ltr'/>
```

A test with a dependency and stdout matching:

```xml
<shellTest name='test2' command='apitest.py --test'>
  <output format='REGEXP' type='stdout'>.*stdout.*</output>
  <dependencies>
    <dependency name='test1' minPctMatch='100.0'/>
  </dependencies>
</shellTest>
```
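A sketch of how a shellTest element could be interpreted, assuming the attribute semantics shown above (the runner below is illustrative, not APItest's code, and uses `echo` rather than `ls` for a deterministic example):

```python
import re
import shlex
import subprocess
import time
import xml.etree.ElementTree as ET

def run_shell_test(elem):
    """Honor preDelay/postDelay, run the command numReps times, and
    return the percentage of runs whose output matched every <output> rule."""
    reps = int(elem.get('numReps', 1))
    matches = 0
    for _ in range(reps):
        time.sleep(float(elem.get('preDelay', 0)))
        proc = subprocess.run(shlex.split(elem.get('command')),
                              capture_output=True, text=True)
        time.sleep(float(elem.get('postDelay', 0)))
        ok = True
        for rule in elem.findall('output'):
            stream = proc.stdout if rule.get('type') == 'stdout' else proc.stderr
            if rule.get('format') == 'REGEXP':
                ok = ok and re.search(rule.text, stream) is not None
            else:  # TXTSTR: exact match
                ok = ok and stream == rule.text
        matches += ok
    return 100.0 * matches / reps

elem = ET.fromstring(
    "<shellTest name='t1' numReps='1' command='echo hello'>"
    "<output format='REGEXP' type='stdout'>hel+o</output>"
    "</shellTest>")
print(run_shell_test(elem))  # -> 100.0
```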
APItest Output

```
iterations  test name  % matched  Pass/Fail  message
----------  ---------  ---------  ---------  ----------
[1 of 1]    A          100.00%    PASS
[1 of 1]    K          100.00%    FAIL       m[0.0% : 0.0%]
[1 of 1]    J            0.00%    FAIL       m[90.0% : 90.0%]
[5 of 5]    M          100.00%    PASS
[1 of 1]    L          100.00%    FAIL       m[0.0% : 0.0%]
[1 of 1]    N          100.00%    PASS
[0 of 1]    T          DEPENDENCY FAILURE(S)
                         F expected [0.0% : 90.0%], got 100.0
                         J expected [90.0% : 100.0%], got 0.0
[0 of 1]    R          DEPENDENCY FAILURE(S)
                         J expected [90.0% : 100.0%], got 0.0
                         K expected [0.0% : 90.0%], got 100.0
[1 of 1]    S          100.00%    PASS
[1 of 1]    U1         100.00%    PASS
[1 of 1]    U2         100.00%    PASS
[0 of 1]    U3         DEPENDENCY FAILURE(S)
                         N expected [0.0% : 90.0%], got 100.0
                         S expected [0.0% : 90.0%], got 100.0
[1 of 1]    U4         100.00%    PASS
```
sssTest Outputs from Chiba City

```
iterations  test name        % matched  Pass/Fail  message
----------  ---------        ---------  ---------  ----------
[1 of 1]    add-location     100.00%    PASS
[1 of 1]    QuerySDComps     100.00%    PASS
[1 of 1]    QuerySDHost      100.00%    PASS
[1 of 1]    QuerySDProtocol  100.00%    PASS
[1 of 1]    QuerySDPort      100.00%    PASS
[1 of 1]    del-location     100.00%    PASS
[1 of 1]    val-removal      100.00%    PASS

iterations  test name        % matched  Pass/Fail  message
----------  ---------        ---------  ---------  ----------
[1 of 1]    sss-getproto     100.00%    PASS
[1 of 1]    sss-getport      100.00%    PASS
[1 of 1]    sss-gethost      100.00%    PASS
[1 of 1]    sss-getcomp      100.00%    PASS
[1 of 1]    sss-getproto     100.00%    PASS
[1 of 1]    sss-getport      100.00%    PASS
[1 of 1]    sss-gethost      100.00%    PASS
[1 of 1]    sss-getcomp      100.00%    PASS
```
Tales from Cplant Release Testing
• Methodical execution of production jobs and third-party benchmarks to identify system instabilities so they can be resolved, e.g.:
  • Rapid job turnover rate (caused mismatches between the scheduler and the allocator)
  • Heavy I/O (I/O that passes through the launch-node process instead of going directly to ENFS; "yod-io")
• Wrapping the above codes into the Ctest framework to enable portable compilation, launch, and analysis of synthetic workloads
Ctest
• Extension of Mike Carifio's work, presented at the SciDAC meeting in Houston in Fall 2002
• A Make-based structure that holds a suite of independent applications
• Tools to launch them as a reproducible workload
• Goal: 30 users and 60 concurrent applications
Issue Tracking
• SNL uses a program called RT.
• A centralized repository for issue tracking helps give an overall picture of what the problems are.
• It also helps summarize progress.
• Bugzilla is available on the SciDAC SSS website: http://bugzilla.mcs.anl.gov/scidac-sss/
• Who's using it?
Status Daemon
• Highly configurable monitoring infrastructure for clusters
• Does not need to run a daemon on the node being monitored
• XML-configurable
• Web interface
• "Cluster aware"
• Used on Cplant production clusters
• Contact: James Laros (jhlaros@sandia.gov)
Status Daemon Communication

[Diagram: a leader daemon on the admin node reads an XML config file, runs local and remote tests, and writes status to disk; daemons on the compute nodes run their own local and remote tests and return XML data and status updates to the leader.]
Summary
• New hire: Ron Oldfield
• APItest functionality and flexibility increases
• Release testing experience
• Status Daemon
Plans
• APItest
  • User / programmer manuals
  • User interface: GUI? HTTP?
  • Daemon mode for parallel testing
  • DB connectivity
• Test development
  • ssslib event tests
  • HTTPtest work
  • ptlTest (SNL)
• SWP integration
  • Port SWP to Chiba for SC2003?