Testing and Validating SAS Programs



  1. Testing and Validating SAS Programs Neil Howard i3 Data Services

  2. Raymond Kennington • “Act in haste and repent in leisure; code too soon and debug forever.”

  3. "Any sufficiently advanced bug is indistinguishable from a feature." Kulawiec Objectives • Define: • Debugging • Testing • Verification • Validation • How to program on purpose

  4. Gilb's Law of Unreliability: Undetectable errors are infinite in variety, in contrast to detectable errors which by definition are limited. IEEE definitions • Debugging - detect, locate and correct faults in system • Verification - ensure product fulfills requirements during development cycle

  5. OPPRESSION TESTING: Test this now!!! IEEE definitions • Testing - detect differences between existing and required conditions • Validation - evaluate end product to ensure compliance with software requirements

  6. AGGRESSION TESTING: If this doesn’t work, I’m gonna kill someone. Nomenclature • Debugging: finding and correcting root cause of unexpected results • Verification: checking results based on predetermined criteria

  7. OBSESSION TESTING: I’ll find this bug, even if it’s the last thing I do. Nomenclature • Testing: accepting or rejecting actual results by one or more testing types • Validation: documented evidence that system/program performs as expected

  8. Anything can be made to work if you fiddle with it long enough. Programming on purpose • Design • Deliberation • Simplicity • Style • Software Development Life Cycle (SDLC) • Testing/Validation Techniques

  9. Programming on Purpose • Efficiency • All types of resources • Including human capital • Readability • Maintainability • Reusability

  10. "It is easier to change the specification to fit the program than vice versa." Alan J. Perlis Basic SDLC • Requirements • Specifications • Systems developed • Final Acceptance • Production

  11. DIGRESSION TESTING: Well, it works fine, but let me tell you about my truck…. SOPs • Standard Operating Procedures • Guidelines • Checklists

  12. SUGGESTION TESTING: Well, it works, but wouldn’t it be better if….. Implications of Testing • 50-80% of total cost of development • cost over time of fixing errors • "writing for others" • can't kill legacy software • consequences of bad data

  13. Cost of Testing

  14. Meskimen's Law: There's never time to do it right, but there's always time to do it over. Myths of testing • size matters • all elements of test plan are applicable regardless of size of program • depth, breadth, time may vary • testing is intended to show that software works • it’s a destructive act • to confuse, maim, break, crash, gag

  15. Testing – within the structure of the SDLC • Requirements • Specifications • Program/system development • Final Acceptance • Production

  16. The Last Law: If several things that could have gone wrong have not gone wrong, it would have been beneficial for them to have gone wrong. Requirements • WHAT: • overall goals of system • descriptions of input and output records, tables and reports • BEST TIME TO: • ask questions • walk-through requirements with users/clients

  17. CONFESSION TESTING: OK. OK. I did program that bug. Specifications • WHAT • detailed plan to satisfy requirements • physical design of processes • identify conditions for: • macros • testing • “what if” situations

  18. Steinbach's Guideline for Systems Programming: Never test for an error condition you don't know how to handle. Test Plan / Program Development • coding program from specifications • debug • verification • testing • validation • documentation

  19. Debugging begins as soon as you see CONDITION CODE 0000. Validation • Debugging • Testing • Verification • Documentation

  20. Featherkile's Rule: Whatever you did, that's what you planned. Testing • Are we: • using appropriate test data • user supplied • generated • testing on an appropriate amount of data • obs= • random sample • collapse data into categories • using appropriate number of variables • DROP, KEEP
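The subsetting options above can be sketched as follows; the data set and variable names (MASTER, TESTDATA, TESTSAMPLE) and the seed are illustrative, not from the original slides:

```sas
/* Sketch: run a test pass on a subset of the input data */

/* Option 1: first 100 observations only */
data testdata;
   set master(obs=100);
run;

/* Option 2: an approximate 10% random sample,
   seeded so the sample is repeatable between runs */
data testsample;
   set master;
   if ranuni(12345) < 0.10;
run;
```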

  21. Scott's Second Law: When an error is detected and corrected, it will be found to have been correct in the first place. What should we test? • conditions expected from input data • extreme values expected • range checking • number of observations • handling of missing values • all pathways through the code, to find: • logical dead-ends • infinite loops • code never executed • algorithms / mathematical calculations

  22. If it ain't broke, look closer. Types of Errors • Requirements • Specifications • Interfaces • Syntax/coding • Data • Logic/numeric

  23. Profanity is the one language programmers know best. Data and Basic Syntax/Coding Errors • errors or omissions in DATA step coding • often fatal • array subscript out of range • numeric over- or under-flow • uninitialized variables • invalid data • hanging DOs or ENDs • invalid numeric operations • type conversions

  24. SECESSION TESTING: The bug is dead. Long live the bug. LOG inspection • warning and informational messages • generate notes in the LOG • tracks number of observations / variables • points to errors in: • DROP, KEEP • DELETE • BY • OUTPUT • MERGE • subsetting IFs

  25. It works, but it hasn’t been tested…... LOG enhancement • PUT statements • PUT _INFILE_ • PUT _ALL_ • PUTting your own error/tracing messages • ERROR statement • Examples: • PUT 'Started routine RANGE-EDIT'; • PUT 'Range error for sort key ' _INFILE_; • PUT 'After age calculation ' _ALL_; • ERROR 'Group and age do not match'; • _ERROR_ flag

  26. “Hey, it works on MY machine…….” Error File • for easier examination of errors • when data (or other) error is detected • create an error data set on the DATA statement • write the observation to the error file • include a character variable containing the error message • sort files by error type or ID number • or, create an error file for each error type • PROC PRINT of error data set(s)

  27. Zymurgy's First Law of Evolving Systems Dynamics: Once you open a can of worms, the only way to recan them is to use a larger can. Intermediate Results • data flow throughout the program • PROC PRINT before and after DATA steps • compare number of obs and variables • use OBS= to limit number of obs printed • use FIRSTOBS= to skip past a trouble spot • PROC CONTENTS • check number of obs and variables retained • check variable attributes • LENGTH !!!

  28. Osborn's Law: variables won't; constants aren't. Syntax checking • explained by compiler • check punctuation • syntax checking mode • compiles and executes subsequent steps

  29. A bug in the hand is better than one as yet undetected. Missing values • perfect data? • massaging data and intermediate checks uncover: • unexpected values • missing values • incorrect values • know how functions and PROCs handle missing values

  30. "In computing, the mean time to error keeps getting shorter." Alan J. Perlis Testing Considerations • "testing" code adds length to program and output • can introduce additional errors • when you remove it: • you will definitely get a change! • need to add it back

  31. Pierce's Law: When a compiler accepts a program without error on the first run, the program will not yield the desired output. Potential Solutions • conditional execution of "testing" codes and aids • IF-THEN structure for PUT statements • create a variable DEBUG • set at beginning of program • determines whether or not to execute PUTs
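A minimal sketch of the DEBUG-variable idea, with invented data set and variable names (IN, OUT, OLDVAR, RATE):

```sas
/* Sketch: a DEBUG flag guarding diagnostic PUT statements */
data out;
   set in;
   debug = 1;            /* set once at the top of the program;
                            change to 0 to silence the PUTs     */
   newvar = oldvar * rate;
   if debug then put 'After calculation ' _all_;
run;
```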

  32. "It's supposed to do that!!!!!" Anonymous help-line response 1982 A “debug” macro:
  %IF &DEBUG = DEBUG %THEN %DO;
     PROC PRINT DATA=TEST(OBS=500);
        VAR A B C;
     RUN;
  %END;
  where DEBUG has been defined as a macro variable

  33. Troutman's Programming Postulate: Not until a program has been in production for at least six months will the most harmful error be discovered. Debug macro uses • conditional calculation of intermediate statistics • using PROC MEANS or SUMMARY, etc. • creation of generalized diagnostic macros for broad use
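One way to generalize the slide's %IF block into a reusable diagnostic macro; the macro name CHKPRINT and its parameters are assumptions for illustration:

```sas
/* Sketch: a generalized conditional-print diagnostic macro */
%let debug = DEBUG;      /* set to blank to suppress test output */

%macro chkprint(ds, vars);
   %if &debug = DEBUG %then %do;
      proc print data=&ds(obs=500);
         var &vars;
      run;
   %end;
%mend chkprint;

%chkprint(test, a b c)
```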

  34. If something goes wrong, it won't be the thing you expected. Modularity • generalized routines and diagnostics • simplify maintenance • changes made only in one place • guaranteed performance • standardized categorization • FORMAT libraries, table look-up • reduces probability of errors • more straightforward logical flow • code re-use

  35. “It’s never done that before……” Coding Conventions • group declarative statements • LENGTH, RETAIN, etc. • arrays definition • drop, keep, label • meaningful variable names • macro naming (%macro __debug) • use macros • comments • spacing and indentation style • use variable labels in PROCs

  36. “Somebody must have changed my code!!” Data Conversions • character-to-numeric • use INPUT function • x = input (y,8.); • numeric-to-character • use PUT function • a = put (b,8.); • avoid default conversions • e.g., don't pass character variable to numeric function
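The INPUT/PUT calls from the slide, shown in a runnable context; the variable names and values are illustrative:

```sas
/* Sketch: explicit type conversions, avoiding the LOG NOTE about
   automatic character/numeric conversion */
data conv;
   ytext = '1234';
   x = input(ytext, 8.);   /* character-to-numeric */
   b = 98.6;
   a = put(b, 8.1);        /* numeric-to-character */
run;
```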

  37. Expression testing: #@$%^&*!!!, a bug!!! The Dreaded… • comment your SAS programs • COMMENT statement • * text.......... ; • /* text... */ • document your SAS data sets • PROC DATASETS • CONTENTS statement

  38. CONGRESSIONAL TESTING: Are you now, or have you ever, been a bug??? Comments • impact on testing and validation • re-use code • readability • maintainability • training • documentation • saves time in walkthroughs and debugging logic • code mimics specifications

  39. Manubay's Laws for Programmers: 1. If a programmer's modification of an existing program works, it's probably not what the user wants; 2. Users don't know what they really want, but they know for certain what they don't want. Commenting • elements for each DATA, SQL, or PROC step • purpose • input • output • processing • calculations • manipulations • derivations

  40. "A program doesn't fail because you wear it out; it fails because it didn't work to begin with and you finally have occasion to notice that fact." Final Acceptance • Users ensure final product meets specifications • Needed: • test plan and supporting documentation • sample of output reports • final specifications or reference to location • results of code review

  41. A computer program does what you tell it to do, not what you want it to do. Production • final acceptance from all users • programs moved into production environment • can’t be modified • unless going through production change control process

  42. Wethern's Law of Suspended Judgment: Assumption is the mother of all screw-ups. Simplicity or “dumbing down” • Recent SAS-L discussion • Gets to the heart of programming on purpose

  43. Whatever goes wrong, there’s always someone who knew it would. Simplicity or “dumbing down” • PROBLEM: the boss changes a programmer’s code: • Too complicated • Junior programmer can’t understand it

  44. Any sufficiently advanced technology is indistinguishable from magic. Simplicity or “dumbing down” • Actions boss took: • Eliminate MERGEs • Too hard to test • Substitute IF-THEN-ELSE • Hardcode variable transformations

  45. Considerations: • Efficiency trade offs – how long to: • Understand • Debug • Correct • Maintain/change control • Will vary by site and situation

  46. Considerations: • What is the message the manager is sending about the abilities of the senior and junior programmers? • What opportunities is the manager missing for training/staff development? • Why not program on purpose? • To solve the problem • Purposefully and elegantly

  47. 2 + 2 = 5 for extremely large values of 2 Conclusions • Programming on purpose: • Designing solutions • Planning for testing/validation • Design in the context of the goals of your organization

  48. The easier it is to get it into a program, the harder it is to get out. When I am working on a problem I never think about beauty; I only think about how to solve the problem. But when I have finished, if the solution is not beautiful, I know it is wrong. R. Buckminster Fuller: engineer, designer, architect

  49. Thank you!! Neil Howard i3 Data Services neil.howard@i3data.com
