8120 Programming 2: Testing and Debugging 2 & 3

8120 Programming 2: Testing and Debugging 2 & 3 Dr Mike Brayshaw

Testing Programs from an SE Perspective: Requirements
• How do you test requirements?
• In terms of raw input/output (functional)
• What are the real requirements?
• Usability
• Verification and Validation

Testing Programs from an SE Perspective: Rest of Lifecycle
• Specification: what criteria need to be met?
• Design: how do we want the thing built?
• Coding:
  • Implementation
  • Performance
  • Verification and validation
• Maintenance: updates and patches

Testing your code
• There may be high-level descriptions of how your program should perform, often derived from the requirements and specification
• The pragmatics...
• We must choose some test cases. How do we go about that?
• Boundaries, limits, extreme cases, common cases...
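As a rough illustration of choosing test cases, the sketch below picks values on and either side of a boundary, at the limits, and from the common range. The function `classify_mark`, its 0 to 100 range, and the pass mark of 40 are all hypothetical, invented only for this example.

```python
def classify_mark(mark):
    """Hypothetical function under test: grade an integer mark from 0 to 100."""
    if not 0 <= mark <= 100:
        raise ValueError("mark out of range")
    return "pass" if mark >= 40 else "fail"


# (input, expected result) pairs chosen around boundaries, limits and common values.
CASES = [
    (0, "fail"),    # lower limit
    (39, "fail"),   # just below the pass boundary
    (40, "pass"),   # exactly on the boundary
    (41, "pass"),   # just above the boundary
    (100, "pass"),  # upper limit
    (20, "fail"),   # common case
    (65, "pass"),   # common case
]


def run_cases():
    """Return (input, passed?) for every test case."""
    return [(mark, classify_mark(mark) == expected) for mark, expected in CASES]


def rejects(mark):
    """Extreme cases: values outside the valid range should be rejected."""
    try:
        classify_mark(mark)
    except ValueError:
        return True
    return False
```

Calling `run_cases()` gives one pass/fail flag per case, and `rejects(-1)` and `rejects(101)` check the extreme values just outside the limits.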

Basic Program Testing
• Basic block testing: a basic block contains no branches
• Alternative paths through code
    block1
    if (boolean1)
        while (boolean2)
            block2;
        block3;
    else
        block4;

Concept of Statement Coverage
• Test data should ensure that each statement in a program is executed by at least one test
• This in effect equates to block coverage
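A minimal sketch of measuring block coverage by hand, using the block structure shown under Basic Program Testing. The nesting (block3 following the loop inside the if branch) is an assumed reading of that slide, and the `executed` set is a device added purely for the illustration.

```python
ALL_BLOCKS = {"block1", "block2", "block3", "block4"}


def run(boolean1, loop_count, executed):
    """Mirror of the slide's block structure; records every block that runs."""
    executed.add("block1")
    if boolean1:
        while loop_count > 0:      # stands in for 'while (boolean2)'
            executed.add("block2")
            loop_count -= 1
        executed.add("block3")
    else:
        executed.add("block4")


def coverage(test_data):
    """Return the set of blocks executed by a list of (boolean1, loop_count) tests."""
    executed = set()
    for boolean1, loop_count in test_data:
        run(boolean1, loop_count, executed)
    return executed
```

A single test such as `(True, 1)` never reaches block4, so at least one more test, for example `(False, 0)`, is needed before every block has been executed.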

Concept of Branch Coverage (aka decision coverage)
• Test both sides of a Boolean
• Branch multiple condition coverage: test all sides of a branching conditional
    if (x > 0) and (y > 0)
        block1;
    else
        block2
  Full coverage is provided by the 4 combinations of x = 1 or x = 0 with y = 1 or y = 0
• All paths
    if (x > 0) block1 else block2
    if (y > 0) block3 else block4
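The four cases can be generated rather than written out by hand. The sketch below is one way to do that; the function names `decide` and `two_ifs` are made up for the example, and the blocks are represented by their names only.

```python
from itertools import product


def decide(x, y):
    """The compound conditional from the slide."""
    return "block1" if (x > 0 and y > 0) else "block2"


def two_ifs(x, y):
    """The two consecutive conditionals from the 'all paths' example."""
    first = "block1" if x > 0 else "block2"
    second = "block3" if y > 0 else "block4"
    return (first, second)


# Every combination of x in {1, 0} and y in {1, 0}: (1,1), (1,0), (0,1), (0,0).
cases = list(product([1, 0], repeat=2))

# Multiple condition coverage: the outcome of each combination.
outcomes = {(x, y): decide(x, y) for x, y in cases}

# All paths: the distinct routes taken through the two consecutive ifs.
paths = {two_ifs(x, y) for x, y in cases}
```

Plain branch coverage of the compound conditional needs only two of these cases, such as (1, 1) and (0, 0). Multiple condition coverage uses all four, and the same four inputs also drive all four paths through the pair of consecutive ifs.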

Instrumenting Your Code
• Placing write statements inside your code
• printf (C), Console.Write (C#), System.out.print (Java), format (Fortran/LISP)
    if (x > 0) and (y > 0)
        printf("x and y > 0")
        block1;
    else
        printf("else condition")
        block2
• Print a story of the program’s execution
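The same idea, sketched in Python. Passing the output routine in as a parameter (`log`, an addition made for this example) lets the messages go to the console by default or be collected in a list, so the "story" can be inspected afterwards.

```python
def classify(x, y, log=print):
    """Instrumented version of the conditional: reports which branch ran."""
    if x > 0 and y > 0:
        log("x and y > 0")
        return "block1"
    else:
        log("else condition")
        return "block2"


# Collect the story of the execution in a list instead of printing it.
story = []
classify(1, 1, log=story.append)
classify(0, 5, log=story.append)
```

After the two calls, `story` holds one message per call, in the order the branches were taken.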

Testing by hand
• Already talked about test suites
• Tracers and Debuggers
  • Code debuggers
  • Steppers and Tracers
  • Visualisation Systems
• Automatic Testing
  • From a code specification or other high-level description
  • From Data Flow Analysis, e.g. Programmer's Apprentice (Rich and Waters)
  • Program Synthesis

Types of Automatic Analysers
• Static Analysers
• Code Auditors
• Test File/Test Suite Programs
• Test Data Generators
• Test Harnesses
  • install the candidate program
  • feed it input
  • simulate the i/o behaviour of subordinate modules with stubs
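A very small test harness might look like the sketch below. Everything in it is hypothetical: `candidate` stands for the program being tested, and `price_stub` simulates a subordinate pricing module that is assumed to be unavailable.

```python
def candidate(order, price_lookup):
    """Hypothetical candidate program: totals an order via a subordinate module."""
    return sum(price_lookup(item) * quantity for item, quantity in order)


def price_stub(item):
    """Stub simulating the i/o behaviour of the subordinate pricing module."""
    return {"apple": 2, "pear": 3}[item]


def harness(program, tests, stub):
    """Install the program, feed it each input, and compare with the expected output."""
    results = []
    for test_input, expected in tests:
        actual = program(test_input, stub)
        results.append((test_input, actual == expected))
    return results


TESTS = [
    ([("apple", 2)], 4),
    ([("apple", 1), ("pear", 2)], 8),
    ([], 0),  # extreme case: empty order
]

report = harness(candidate, TESTS, price_stub)
```

Each entry of `report` pairs an input with a flag saying whether the candidate produced the expected output.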

Software Testing Strategies
• Who does the testing?
• The developers (are they impartial? Are they too close to the system?)
• An Independent Test Group (ITG)?
• Or a combination of the two

Top Down Testing
• Integration testing: does everything that works independently also work together?
• Do we need the full system before we can test?
• Use of stubs
  • Simulate the i/o behaviour of yet-to-be-written modules, e.g. by look-up table
• Allows a holistic view of the overall system prior to full implementation
• We see from a high-level perspective how all the parts fit together
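A look-up table stub can be sketched as follows. The top-level module `heating_report`, the room names, and the 18.0 degree threshold are invented for the example; the point is only that the top level can be exercised before the real sensor module exists.

```python
# Look-up table standing in for the yet-to-be-written sensor module (hypothetical data).
STUB_READINGS = {"kitchen": 21.5, "garage": 9.0}


def read_temperature_stub(room):
    """Stub: returns a canned reading instead of talking to real hardware."""
    return STUB_READINGS[room]


def heating_report(rooms, read_temperature):
    """Top-level module under test: flags rooms that need heating."""
    return {room: read_temperature(room) < 18.0 for room in rooms}


report = heating_report(["kitchen", "garage"], read_temperature_stub)
```

When the real module is written it replaces `read_temperature_stub` in the call, and the same top-level test can be run again.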

Bottom Up Testing
• We test the raw component parts
• Then we test them in combination with other basic building blocks
• When they are working we test the next level up, proceeding recursively till we reach the top level
• Good pragmatic approach, but we lack global vision till we get to the top
• The program as a final entity doesn’t exist till the very end
• Hybrid approach (aka sandwich testing)
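The order of work in bottom-up testing can be sketched with three made-up functions: two basic building blocks and one routine built from them. The names are illustrative only.

```python
def parse_number(text):
    """Basic building block: convert one piece of text to an integer."""
    return int(text.strip())


def total(numbers):
    """Basic building block: add up a sequence of numbers."""
    return sum(numbers)


def total_from_text(line):
    """Next level up: built from the two blocks above."""
    return total(parse_number(part) for part in line.split(","))


def bottom_up_tests():
    # Level 0: the raw component parts, each tested on its own.
    assert parse_number(" 7 ") == 7
    assert total([1, 2, 3]) == 6
    # Level 1: the parts in combination, tested only once level 0 has passed.
    assert total_from_text("1, 2, 3") == 6
    return True
```

If a level 1 test fails after level 0 has passed, the fault is more likely to lie in how the parts are combined than in the parts themselves.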

Verification and Validation (again)
• Verification: “Are we building the product right?”
• Validation: “Are we building the right product?”

Testing the requirements: validation testing
• What’s in the Software Requirements Document?
• Acceptance Tests
• Alpha and Beta Testing

Testing the systems engineering: systems testing
• Recovery Testing: can it fall over elegantly and mend itself?
• Security Testing
• Stress Testing
• Performance Testing

The Art of Debugging
• More of an intuitive art than a hard science
• Different to Software Testing
  • In testing we can identify a systematic process
  • Debugging is more problematic

Why is it a problem?
• Cause-Effect Chasm: the so-called geography problem
• The error may disappear
• The error might be a non-error (e.g. due to rounding issues)
• It may be a hard-to-trace human error
• It may be the result of timing
• It may be difficult to reproduce
• It may be intermittent (particularly when some hardware is involved with the software component)

Psychological Issues
• The last place to show our skills as hunter/gatherers!
• We make mistakes regularly
• Logic Mistakes vs Coding Mistakes
  • Is it our algorithm?
  • Is it our coding of the algorithm?
  • Which language am I using?
• Human Memory

Debugging Approaches
• Combination of systematic evaluation, intuition, and luck
• Brute Force Approach: generate as much info as possible and you’ll see the problem
• Backtracking: working backwards from manifestation to cause
• Cause Elimination: systematically go through what could be causing the problem
• Co-Rewriting or simplification
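One simple form of cause elimination combined with simplification is to keep halving a failing input until only the offending item is left. The sketch below assumes a single bad record is responsible; `process` is a hypothetical routine invented to fail on a zero.

```python
def process(records):
    """Hypothetical failing routine: breaks on a record it cannot handle."""
    return [10 / record for record in records]


def fails(records):
    """Does the routine fail on this input?"""
    try:
        process(records)
    except ZeroDivisionError:
        return True
    return False


def isolate(records):
    """Halve the input repeatedly, keeping a half that still fails.

    Assumes the input fails as a whole and that one record is to blame.
    """
    while len(records) > 1:
        mid = len(records) // 2
        left, right = records[:mid], records[mid:]
        records = left if fails(left) else right
    return records


culprit = isolate([5, 2, 8, 0, 4, 1])
```

Each step eliminates half of the possible causes, so even a large input is narrowed down in a handful of runs. If several records interact to cause the failure, this simple halving is not enough.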

Elimination
• Beware that your fix doesn’t introduce more problems
• Does the problem occur elsewhere in the program?
• What are the knock-on effects of the fix I have just made?
• What could have been done to prevent this problem in the first place?

A Systematic Study of Bugs
• Eisenstadt, Marc, “My Hairiest Bug” War Stories, Comms ACM, 1997
• An attempt to find out what it’s like “out there in the trenches”
• A trawl of various bulletin boards

c.language/tools #2842, from meisenstadt 771 chars, Tue Mar 3
I'm looking for some (serious) anecdotes describing debugging experiences. In particular, I want to know about particularly thorny bugs in LARGE pieces of software which caused you lots of headaches. It would be handy if the large piece of software were written in C or C++, but this is not absolutely essential. I'd like to know how you cracked the problem-- what techniques/tools you used: did you 'home in' on the bug systematically, did the solution suddenly come to you in your sleep, etc.

Example: Story B
...I once had a program that only worked properly on Wednesdays...The documentation claimed that the day of the week was returned in a doubleword, 8 bytes. In actual fact, Wednesday is 9 characters long, and the system routine actually expected 12 bytes of space to put the day of the week. Since I was supplying only 8 bytes, it was writing 4 bytes on top of storage area intended for another purpose. As it turned out, that space was where a "y" was supposed to be stored to compare to the users answer. Six days a week the system would wipe out the "y" with blanks, but on Wednesdays a "y" would be stored in its correct place.
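The mechanism in Story B can be imitated with a byte array. This is only a simulation under stated assumptions: the memory layout (an 8-byte buffer immediately followed by the stored "y"), the lower-case day names, and the blank padding to 12 bytes are guesses made to reproduce the effect, not details taken from the original system.

```python
DAYS = ["monday", "tuesday", "wednesday", "thursday",
        "friday", "saturday", "sunday"]


def system_day_routine(memory, offset, day):
    """Simulated system routine: always writes 12 bytes, blank padded."""
    memory[offset:offset + 12] = day.ljust(12).encode("ascii")


def answer_still_intact(day):
    """Does the stored 'y' survive the call on the given day?"""
    # Bytes 0-7: the 8-byte buffer the programmer supplied.
    # Byte 8: where the "y" is kept for comparing with the user's answer.
    memory = bytearray(b" " * 8 + b"y" + b" " * 3)
    system_day_routine(memory, 0, day)
    return chr(memory[8]) == "y"


working_days = [day for day in DAYS if answer_still_intact(day)]
```

Only a nine-letter day name ending in "y" puts a "y" back into byte 8, so in this simulation `working_days` contains Wednesday alone.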

[Story C, excerpt]
...The program only crashed after running about 45000 iterations of the main simulation loop... Somewhere, somehow, someone was walking over memory. But that somewhere could have been *anywhere* - writing in one of the many global arrays, for example....The bug turned out to be a case of an array of shorts (max value 32k) that was having certain elements incremented every time they were "used", the fastest use being about every 1.5 iterations of the simulator. So an element of an array would be incremented past 32k, back down to -32k. This value was then used as an array index. ....But of course the actual seg fault was happening several iterations after the error - the bogus write into memory. It took 3 hours for the program to crash, so creating test cases took forever. I couldn't use any of the heavier powered debugging malloc()s, or use watchpoints, because those slow a program down at least 10 fold, resulting in 30 hours to track a bug.
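The wrap-around in Story C can be reproduced with a 16-bit signed integer. The sketch uses Python's `ctypes.c_int16` as a stand-in for a C `short` on a typical two's complement machine; `safe_increment` is a hypothetical guard added to show one way the overflow could have been caught at its source.

```python
import ctypes

SHORT_MAX = 32767  # largest value a 16-bit signed short can hold


def increment_short(value):
    """Add 1 with 16-bit signed wrap-around, as happened to the array elements."""
    return ctypes.c_int16(value + 1).value


def safe_increment(value, limit=SHORT_MAX):
    """Guarded increment: fail loudly at the point of the error."""
    if value >= limit:
        raise OverflowError("counter would exceed the 16-bit range")
    return value + 1


wrapped = increment_short(SHORT_MAX)  # past 32k, back down to -32k
```

In the story the wrapped value was then used as an array index, so the damage appeared several iterations after the real error. A check like the one in `safe_increment` reports the problem at the increment itself, which narrows the cause-effect chasm.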

Why is it so difficult?
• Cause-effect chasm
• Tools inapplicable or hampered: “Heisenbugs”
• WYSIPIG (What you see is probably illusory, guv’nor): programmers just misreading what was there
• Faulty models or misdirected blame
• Spaghetti code (usually written by “somebody else”)

How Bugs are Found
• Gather Data
  • step and study
  • wrap and profile
  • print and peruse: instrument your code with print statements
  • dump and diff
  • conditional break and inspect
  • specialist profile tools (e.g. to spot memory leaks)
• Inspeculation (= inspection + simulation + speculation)
• Expert Recognised Cliché
• Controlled Experiments
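"Dump and diff" can be sketched with Python's standard `difflib` module: dump the program state from a good run and a bad run as text, then compare the two dumps. The state dictionaries below are made-up examples.

```python
import difflib


def dump(state):
    """Dump a dictionary of program state as sorted 'key=value' lines."""
    return [f"{key}={state[key]}" for key in sorted(state)]


def dump_and_diff(good_state, bad_state):
    """Return only the lines that differ between the two dumps."""
    comparison = difflib.ndiff(dump(good_state), dump(bad_state))
    return [line for line in comparison if line.startswith(("- ", "+ "))]


# Hypothetical state captured from a working run and a failing run.
good = {"mode": "fast", "retries": 3, "user": "ann"}
bad = {"mode": "fast", "retries": 30, "user": "ann"}

differences = dump_and_diff(good, bad)
```

Lines starting with "- " come from the good run and lines starting with "+ " from the bad run, so the one setting that changed stands out from everything that stayed the same.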

Then Root Causes
• mem: memory clobbered (can’t happen in managed memory systems like Java/C#)
• vendor: problems with what is supplied
• algorithm wrong
• erroneous initialisation (e.g. incorrect initialisation of a variable)
• wrong variable used
• lexical problem, ranging from a typo to not parsing syntax correctly
• language semantics ambiguous or misunderstood
• unsolved

In Conclusion
• Be systematic
• Identify likely causes
• Review recent changes
• Don’t just make changes and hack and hope
• Don’t be superstitious (the wearing yellow socks approach)
• Choose and use tools appropriately
• Be Lucky
