CSE403 Software Engineering Autumn 2000 Fixing the Bugs. Gary Kimura Lecture #13 October 29, 2001. Today. Finish with finding the bugs How do we fix bugs? Prioritize bugs Tracking bugs and their fixes. Programming is simple, getting it to work is hard.
CSE403 Software Engineering Autumn 2000 Fixing the Bugs Gary Kimura Lecture #13 October 29, 2001
Today • Finish with finding the bugs • How do we fix bugs? • Prioritize bugs • Tracking bugs and their fixes
Programming is simple, getting it to work is hard • Whereas good programming practices and standards have gained some legitimacy and helped popularize writing computer programs, debugging and fixing computer programs is still very much an art. There are a host of different “tricks” to use, but probably the best way to learn to discover and fix bugs is to do it a lot and be around people who have done it a lot.
How do we fix bugs (and avoid bugs with defensive programming) • Besides being easy or hard to find, some bugs are easy to fix and some are very hard to fix. • How hard a bug is to find does not necessarily relate to how easy or hard it is to fix. • Often fixing a bug raises other bugs because • a) The system can now run longer • b) The system behavior is changed by the previous fix • Bugs need to be prioritized. That is, we sometimes need to choose which bugs to fix and which bugs to ignore. We need to weigh the likelihood of the bug being encountered against the risk of the fix.
More on fixing bugs • Sometimes the only thing we can do is document the bug and warn users about it. • Whether you should fix a bug is largely a matter of risk analysis. The severity of the bug (both how bad it is and how likely it is to be encountered) must be weighed against the risk of doing the fix. • Some classic risks are: • Breaking something else • Slipping the schedule • The need to retest the system
Too close to shipping • The closer you get to wanting to ship the product, the choosier you are about which bugs to fix. • Even the simplest fix can have unforeseen consequences. • For example, a simple fix to allow larger heaps can cause problems by chewing up too much memory.
Some categories of bug fixes I have known • None of this is meant to be all-inclusive. It is merely meant as some examples of debugging and code fixing techniques I’ve used or seen used. • Some easy-to-fix bugs are simple “off-by-one” errors or cases of forgetting to take some end condition into account. The fix is usually localized and easily understood. I have sometimes looked at code that has been running for a few years without problems, only to see the bug and wonder how the system ever ran as long or as well as it did.
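As an illustration of the off-by-one and end-condition bugs described above, here is a minimal sketch in C. The functions `sum_values` and `find_last` are hypothetical names invented for this example and do not come from the lecture.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example: add up the first `count` elements of `values`.
 *
 * A classic off-by-one version of the loop would be
 *     for (i = 0; i <= count; i++)
 * which reads one element past the end of the array. The fix is
 * localized: use `<` instead of `<=`. */
long sum_values(const int *values, size_t count)
{
    assert(values != NULL || count == 0);

    long total = 0;
    for (size_t i = 0; i < count; i++) {
        total += values[i];
    }
    return total;
}

/* Hypothetical example: index of the last element equal to `key`,
 * or `count` if there is no such element.
 *
 * The forgotten end condition here is index 0. With an unsigned index,
 *     for (i = count - 1; i >= 0; i--)
 * never terminates, and it starts out of range when count is 0.
 * Testing `i > 0` and indexing with `i - 1` handles both cases. */
size_t find_last(const int *values, size_t count, int key)
{
    assert(values != NULL || count == 0);

    for (size_t i = count; i > 0; i--) {
        if (values[i - 1] == key) {
            return i - 1;
        }
    }
    return count;
}
```

Both fixes are one-token changes, which is typical of this category: easy to repair once seen, and easy to overlook for years if the bad index happens to land on harmless memory.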
Cross-module fixes • Some hard-to-fix bugs span multiple modules, sometimes in rather fragile code, where the fix really needs to be thoroughly considered because its ramifications are not always well understood. • For example, security fixes often expose other problems. • Speaking of security, security holes are bugs. A common security bug is not properly capturing and probing parameters. This can cause problems with both naive application programming errors and malicious applications. Simply capturing pointers and probing user buffers may not be enough to stop users from re-mapping memory.
Bugs older than sin • Sometimes a bug is located in very old code that everyone is reluctant to touch. • For example, the bitmap package on NT was written back in 1989 and optimized for memory manager (MM) and file system allocation. Recently some people wanted to use the bitmap in a different way when searching for zero bits. They really need to think long and hard before altering this code. Besides outright breaking things, it could have serious performance implications.
Deadlocks • Deadlocks are where two or more threads and/or processes are competing for multiple locks and have tied each other up. • Deadlock problems, once identified, should in principle be easy to fix. • Having a set order for acquiring locks is important. For example, mutex levels are a great help if using mutexes. • But having a set order of acquisition is not always practical. • For example, page fault recursions can cause MM resources and software cache manager resources to be acquired recursively and out of order.
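The idea of mutex levels can be sketched as a small debug-time checker. This is a simplified illustration under stated assumptions, not how any particular kernel implements it: the types `leveled_lock` and `lock_tracker` and the functions `tracker_acquire` and `tracker_release` are hypothetical, the tracker stands for the locks held by one thread, and no real locking is performed (a real version would wrap an actual mutex).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_HELD 8

/* Hypothetical lock type: each lock is assigned a fixed level, and the
 * rule is that locks must be acquired in strictly increasing level. */
typedef struct {
    int level;
} leveled_lock;

/* Hypothetical per-thread record of the levels currently held,
 * in the order they were acquired. */
typedef struct {
    int levels[MAX_HELD];
    int depth;
} lock_tracker;

/* Returns true if taking `lock` respects the level order.
 * A debug build might assert on failure instead of returning false. */
bool tracker_acquire(lock_tracker *t, const leveled_lock *lock)
{
    assert(t != NULL && lock != NULL);

    if (t->depth == MAX_HELD) {
        return false;                       /* tracker is full */
    }
    /* Held levels are strictly increasing, so the most recent entry is
     * the highest level this thread holds. */
    if (t->depth > 0 && lock->level <= t->levels[t->depth - 1]) {
        return false;                       /* out-of-order acquisition */
    }
    /* A real implementation would block on the underlying mutex here. */
    t->levels[t->depth++] = lock->level;
    return true;
}

/* Simplification: locks must be released in reverse order of acquisition. */
bool tracker_release(lock_tracker *t, const leveled_lock *lock)
{
    assert(t != NULL && lock != NULL);

    if (t->depth == 0 || t->levels[t->depth - 1] != lock->level) {
        return false;
    }
    t->depth--;
    return true;
}
```

Because every thread climbs the levels in the same direction, no cycle of waiters can form. A checker like this turns an ordering violation into an immediate, reproducible failure instead of an occasional hang.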
Priority Inversion • Priority inversion bugs are where a process of higher priority is waiting on a lock held by a process of lower priority. • To fix a priority inversion we sometimes need to identify the low priority thread holding a resource and give it a quantum priority boost.
Machine exceptions • Know the range of your parameters: underflow, overflow, and loss of precision can cause unanticipated problems. • For example, the triangle identification program mentioned in a previous lecture could suffer from an overflow problem if it is not written carefully. • Other common problems occur when the type of a counter is too small to hold all the possible values (e.g., a byte versus a short).
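Both problems above can be shown in a short C sketch. It assumes the triangle program takes integer side lengths, which the lecture does not specify, and the function names `is_triangle` and `increment_u8` are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical triangle test for integer side lengths.
 *
 * The obvious check `a + b > c` overflows when a and b are both close
 * to INT_MAX, and signed overflow is undefined behavior in C. With all
 * sides known to be positive, the equivalent form `a > c - b` cannot
 * overflow, because the difference of two positive ints always fits. */
bool is_triangle(int a, int b, int c)
{
    if (a <= 0 || b <= 0 || c <= 0) {
        return false;
    }
    return a > c - b      /* a + b > c */
        && b > a - c      /* b + c > a */
        && c > b - a;     /* a + c > b */
}

/* Hypothetical checked increment for a one-byte counter.
 * An unchecked `(*counter)++` silently wraps from 255 back to 0. */
bool increment_u8(uint8_t *counter)
{
    assert(counter != NULL);

    if (*counter == UINT8_MAX) {
        return false;     /* refuse to wrap; caller decides what to do */
    }
    (*counter)++;
    return true;
}
```

The alternative fix for the counter is simply a wider type, but that only moves the limit; knowing the real range of the value is what tells you which type is wide enough.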
Sometimes we break all the rules to fix a bug • We might need to look into some hidden data structures to make things work. For example, the NT kernel does not formally export its wait queue structures, but calling wait can sometimes have a disastrous effect. So every so often we have modules that glance at the queue behind the kernel’s back. • Fixes in this category are usually put in near the end of the project, where the risk of applying a more appropriate fix is too great. A good example of this is when simple “fix-up” code is added to readjust a data structure that has gone awry.
Sometimes the fix is simply to mask over someone else’s problem • We might also break a clean design just to add special purpose code. Running legacy applications causes this to happen a lot. For example, allocating zero bytes should be illegal, yet legacy applications that do it still have to keep working.
Defensive programming • Defensive programming can fix or hide a lot of faults, and also identify problems. For example, setting pointers to null after freeing them will catch a lot of problems while the program is being debugged. Note that this needs a shift in the usual programming API to handle freeing memory. You need to pass the address of the pointer to the memory being freed.
An example • So instead of Free( ptr ); it is Free( &ptr ); • Macros can help with this. After the product ships, this could be turned into a defensive mechanism. Another example is encoding pointers so that errant code would cause a fault.
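One way the `Free( &ptr )` convention could be written as a macro is sketched below. This is an assumed implementation built on the standard `free`, not the one used in the lecture or in any particular code base.

```c
#include <assert.h>
#include <stdlib.h>

/* Assumed implementation of the Free( &ptr ) convention: release the
 * memory, then null the caller's own pointer so that any later use
 * faults immediately instead of touching freed memory.
 *
 * The argument is the ADDRESS of the pointer. It is evaluated twice,
 * so it must not have side effects (no Free(&table[i++])). */
#define Free(pp)            \
    do {                    \
        free(*(pp));        \
        *(pp) = NULL;       \
    } while (0)
```

A caller writes `Free( &buffer );` and afterwards `buffer` is null. A second `Free( &buffer );` is then harmless, because the standard `free` does nothing when given a null pointer, which is how the same macro can serve as a defensive mechanism after shipping.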
Labeling bugs in your code • Oftentimes it is good to leave little bug labels in your code for things that are not complete. • Some people use “bugbug” to highlight areas where code needs to be added or fixed • Others use “****” and some use their initials “gdk” • Whatever you use, it is important to be consistent and communicate the standard.
Tracking bugs • Commonly tracked in a bug database (e.g., Microsoft uses what they call a “Raid” database) • Need a reproduction scenario • Need a history of fixes or resolutions • Other information • Who found the bug • How found • Severity • In which build • Priority of the bug (somewhat based on the severity) • Categorize bugs by module
More on tracking bugs • The database can give an indication of the quality of the system • Number of total bugs • Incoming bug rate • Resolution rate • But beware of extrapolating too much… • Daily bug reports • Keep the developers focused • Keep managers informed • There comes a time in a project when only bug fixes should be allowed
Next time • Group design reviews