Software Sermon: Management, Programming, Testing, CVS
Software is Hard • The difficulty of a program goes up roughly with the square of its size. • Since software is so flexible there is always the temptation to add a little more. • While some algorithms are difficult, usually the hardest part is getting all the pieces to work together, and keeping them working.
Every Software Project Needs: • A sense of the users • Wise managers • Pragmatic programming • Thorough testing
Constant Contact with Users • It’s best to be constantly in contact with three different users. • Strive to become a user yourself. Learn the users’ other tools. • A web browser provides a familiar user interface, and access to tools anywhere. • Users are a great help for testing as well as an endless source of ideas. • You can’t implement every idea.
Bioinformatics Users • Willing to read and dig a little • Biologists, medical doctors, programmers • Macintosh, PC, Unix • Nearly universally comfortable with the web • May be experts in some areas and know little of others.
Every User Needs • Reliability • Speed • Ability to exchange data with other programs • Consistent user interface
Wise Management • Many ways to do it right. • Mistakes can be quite costly. • Have to deal with people issues, resource issues, and project issues. • Read ‘The Mythical Man-Month,’ ‘Peopleware,’ ‘Extreme Programming,’ ‘The Pragmatic Programmer.’
People Issues • Finding the right people • Introducing new people to the work group • Training: the big picture and where to focus • Coaxing the silent to talk, the talkers to listen • Avoiding unnecessary interruptions • Reassigning jobs when (and only when) needed. • Credit where credit is due • Keeping work varied and interesting • Accepting there is life outside of work
Pragmatic Programming • Use consistent conventions. • Build and test from the bottom up. • Write for readability. • Keep everything as local as possible. • Defend against bad input. • Learn the fine art of debugging. • Modularize, but don’t over-modularize. • Sometimes computers need to reboot. • Sometimes programmers need to start over.
Consistency and Conventions • Code is constrained by few natural laws. • There are many ways to do things, so programmers make arbitrary decisions. • Arbitrary decisions are hard to remember. • Conventions make decisions less arbitrary. • varName vs. VarName vs. varname vs. var_name – pick one and stick to it. • variable vs. var vs. vrbl vs. vble vs. varible: if you need to abbreviate, keep it short.
Write for Readability • Use descriptive names, but try to keep them short. • Set your tab stops to 8 and indent cleanly! • Comment each module and subroutine with an overview of what it does and, if need be, how it does it. • Line-by-line comments matter less; save them for unusual situations, which are often best avoided entirely. • Most subroutines should take up less than a single screen. Break larger subroutines into logical blocks and comment the start of each block.
A Header File
/* boxClump - put together 2 dimensional boxes that
 * overlap with each other into clumps. */

#ifndef BOXCLUMP_H
#define BOXCLUMP_H

struct boxIn
/* Input to box clumper. */
    {
    struct boxIn *next;    /* Next in list. */
    int qStart, qEnd;      /* Range covered in query. */
    int tStart, tEnd;      /* Range covered in target. */
    void *data;            /* Some user-associated data. */
    };

struct boxClump *boxFindClumps(struct boxIn **pBoxList);
/* Convert list of boxes to a list of clumps. Clumps
 * are collections of boxes that overlap. Note that
 * the original boxList is overwritten as the boxes
 * are moved from it to the clumps. */

#endif /* BOXCLUMP_H */
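The header says that clumps are collections of boxes that overlap, but the slide does not show the overlap test itself. The following is a minimal sketch of one way to write it, assuming half-open coordinates (a box covers start up to but not including end) and assuming two boxes overlap only when both their query and target ranges overlap. The helpers `rangeOverlap` and `boxesOverlap` are hypothetical names for illustration, not part of the original header.

```c
#include <assert.h>
#include <stddef.h>

struct boxIn
/* Input to box clumper, as declared in the header on the slide. */
    {
    struct boxIn *next;    /* Next in list. */
    int qStart, qEnd;      /* Range covered in query. */
    int tStart, tEnd;      /* Range covered in target. */
    void *data;            /* Some user-associated data. */
    };

static int rangeOverlap(int start1, int end1, int start2, int end2)
/* Hypothetical helper.  Return 1 if the half-open ranges
 * [start1,end1) and [start2,end2) share at least one position. */
{
return start1 < end2 && start2 < end1;
}

int boxesOverlap(const struct boxIn *a, const struct boxIn *b)
/* Hypothetical helper.  Return 1 if boxes a and b overlap in both
 * the query and the target dimension (an assumed definition). */
{
assert(a != NULL && b != NULL);
return rangeOverlap(a->qStart, a->qEnd, b->qStart, b->qEnd)
    && rangeOverlap(a->tStart, a->tEnd, b->tStart, b->tEnd);
}
```

With half-open ranges, two boxes that merely touch at an edge do not count as overlapping. If the real coordinates are closed intervals, the comparisons would need `<=` instead.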
A Simple Short Routine
struct hashEl *hashLookup(struct hash *hash, char *name)
/* Looks for name in hash table. Returns associated element,
 * if found, or NULL if not. */
{
struct hashEl *start = hash->table[hashCrc(name) & hash->mask];
struct hashEl *el;
for (el = start; el != NULL; el = el->next)
    {
    if (sameString(el->name, name))
        break;
    }
return el;
}
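To try the lookup routine in isolation it needs a table to search. Here is a self-contained sketch of a small chained hash table around it. The struct layouts are assumptions inferred from the fields the slide uses (`table`, `mask`, `next`, `name`); the `val` field, `hashNew`, `hashAdd`, `needMem`, and the stand-in hash function `hashString` (used in place of `hashCrc`, which the slide does not show) are all hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct hashEl
/* One element in a hash bucket (layout assumed). */
    {
    struct hashEl *next;     /* Next element in same bucket. */
    char *name;              /* Key, owned by the element. */
    void *val;               /* Hypothetical: caller's value. */
    };

struct hash
/* Chained hash table whose size is a power of two (layout assumed). */
    {
    struct hashEl **table;   /* Array of buckets. */
    unsigned mask;           /* Table size minus one. */
    };

static void *needMem(size_t size)
/* Allocate memory or abort; hypothetical helper. */
{
void *pt = malloc(size);
if (pt == NULL)
    abort();
return pt;
}

static unsigned hashString(const char *name)
/* Stand-in for hashCrc: a simple multiply-by-31 string hash. */
{
unsigned h = 0;
while (*name != '\0')
    h = h * 31 + (unsigned char)*name++;
return h;
}

struct hash *hashNew(int powerOfTwoSize)
/* Make an empty table with 2^powerOfTwoSize buckets. */
{
struct hash *hash;
unsigned size, i;
assert(powerOfTwoSize >= 1 && powerOfTwoSize <= 24);
size = 1u << powerOfTwoSize;
hash = needMem(sizeof(*hash));
hash->mask = size - 1;
hash->table = needMem(size * sizeof(hash->table[0]));
for (i = 0; i < size; ++i)
    hash->table[i] = NULL;
return hash;
}

struct hashEl *hashAdd(struct hash *hash, const char *name, void *val)
/* Add name/val at the head of its bucket.  Does not check for duplicates. */
{
struct hashEl **bucket, *el;
assert(hash != NULL && name != NULL);
bucket = &hash->table[hashString(name) & hash->mask];
el = needMem(sizeof(*el));
el->name = needMem(strlen(name) + 1);
strcpy(el->name, name);
el->val = val;
el->next = *bucket;
*bucket = el;
return el;
}

struct hashEl *hashLookup(struct hash *hash, const char *name)
/* Looks for name in hash table.  Returns associated element,
 * if found, or NULL if not. */
{
struct hashEl *el;
assert(hash != NULL && name != NULL);
for (el = hash->table[hashString(name) & hash->mask]; el != NULL; el = el->next)
    {
    if (strcmp(el->name, name) == 0)
        break;
    }
return el;
}
```

Because the table size is a power of two, `& hash->mask` picks a bucket without a division, which is the same trick the slide's routine uses.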
Part of a Longer Routine
struct mafAli *mafNext(struct mafFile *mf)
/* Return next alignment in FILE or NULL if at end. */
{
struct lineFile *lf = mf->lf;
struct mafAli *ali;
char *line, *word;

/* Loop until get an alignment paragraph or reach end of file. */
for (;;)
    {
    /* Get header line. If it's not there assume end of file. */
    if (!nextLine(lf, &line))
        {
        lineFileClose(&mf->lf);
        return NULL;
        }

    /* Parse alignment header line. */
    word = nextWord(&line);
    if (word == NULL)
        continue;       /* Ignore blank lines. */
    if (sameString(word, "a"))
        {
Another Header File
struct dlNode
/* An element on a doubly linked list. */
    {
    struct dlNode *next;        /* Pointer to next element. */
    struct dlNode *prev;        /* Pointer to previous element. */
    void *val;                  /* Pointer to item in this node. */
    };

struct dlList
/* A doubly linked list. */
    {
    struct dlNode *head;        /* First member in list. */
    struct dlNode *nullMiddle;  /* Always NULL, shortens code. */
    struct dlNode *tail;        /* Last member in list. */
    };

void dlRemove(struct dlNode *node);
/* Removes a node from list. */

void dlAddTail(struct dlList *list, struct dlNode *newNode);
/* Add a node to tail of list. */

int dlCount(struct dlList *list);
/* Return length of list. */
Bottom Up Implementation • Start with lowest level code. It can be tested independently. • It’s much easier to write and test code if you can trust the routines the code calls. • Code needs to be tested in full immediately after it is written, even while it is written. • If not all of a new module’s capabilities are immediately used, write explicit test code for them or wait to implement them until needed.
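As an illustration of testing a low-level routine on its own, here is a sketch of a word-splitting helper together with explicit test code for it. It is loosely modeled on the `nextWord` call that appears in the longer routine, but `nextWordSketch` and `testNextWordSketch` are hypothetical names, and the behavior shown (whitespace-delimited words, input modified in place) is an assumption rather than the original library's implementation.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

char *nextWordSketch(char **pLine)
/* Return the next whitespace-delimited word in *pLine, or NULL if
 * there are no more words.  Writes a zero after the word and
 * advances *pLine past it, so the input string is modified. */
{
char *s, *e;
assert(pLine != NULL);
s = *pLine;
if (s == NULL)
    return NULL;
while (isspace((unsigned char)*s))      /* Skip leading white space. */
    ++s;
if (*s == '\0')
    {
    *pLine = s;
    return NULL;
    }
e = s;
while (*e != '\0' && !isspace((unsigned char)*e))   /* Find end of word. */
    ++e;
if (*e != '\0')
    *e++ = '\0';
*pLine = e;
return s;
}

int testNextWordSketch(void)
/* Explicit test code for the routine above.  Returns the number of
 * failed checks, so zero means everything passed. */
{
int failures = 0;
char buf[] = "  a score=10\t";
char blank[] = "   ";
char *line = buf, *word;

word = nextWordSketch(&line);
if (word == NULL || strcmp(word, "a") != 0)
    ++failures;
word = nextWordSketch(&line);
if (word == NULL || strcmp(word, "score=10") != 0)
    ++failures;
if (nextWordSketch(&line) != NULL)      /* Nothing left on the line. */
    ++failures;
line = blank;
if (nextWordSketch(&line) != NULL)      /* Blank line has no words. */
    ++failures;
return failures;
}
```

Once a helper like this passes its own tests, a higher-level parser that calls it can be debugged on the assumption that word splitting is not the problem.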
Keep It Local • It’s easier to understand and modify programs that primarily deal with local variables. • Keep things local to a subroutine, module or object whenever possible. • Occasionally a global object that is set in one place and read-only elsewhere is ok. • Putting data in an object as opposed to global or module-level variables is usually good. • Inheritance delocalizes object-oriented programs. Use inheritance with great care.
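One way to picture the advice about putting data in an object rather than in globals: the sketch below tallies DNA bases into a struct owned by the caller, so the routine touches nothing outside its arguments. The `baseCounts` struct and `countBases` function are hypothetical examples, not code from the project.

```c
#include <assert.h>
#include <stddef.h>

struct baseCounts
/* Hypothetical tally of bases seen.  The caller owns it. */
    {
    int a, c, g, t;     /* Counts of each base, either case. */
    int other;          /* Anything else, such as N. */
    };

void countBases(const char *dna, struct baseCounts *counts)
/* Add the bases in dna to counts.  All state lives in the
 * caller's struct, so two tallies can never interfere. */
{
const char *s;
assert(dna != NULL && counts != NULL);
for (s = dna; *s != '\0'; ++s)
    {
    switch (*s)
        {
        case 'a': case 'A': ++counts->a; break;
        case 'c': case 'C': ++counts->c; break;
        case 'g': case 'G': ++counts->g; break;
        case 't': case 'T': ++counts->t; break;
        default: ++counts->other; break;
        }
    }
}
```

Had the five counters been global variables, every caller would share them and would have to remember to reset them; with the struct, each caller starts from its own zeroed copy.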
Code Defensively • Check inputs, especially in library routines. • Sprinkle ‘asserts’ through your program to make sure it is behaving as you think it should. • It’s better to fail hard, fast, and consistently than to limp along erratically. Throw/catch. • Turn on compiler warnings, especially for uninitialized variables and missing returns.
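A small sketch of these points in C: a library-style routine that checks its input rather than trusting it, an assert on something the caller must never get wrong, and a wrapper that fails hard instead of returning a garbage value. The names `parseInt` and `needInt` are hypothetical. C has no throw/catch, so exiting with a message stands in for it here.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>

int parseInt(const char *text, int *pResult)
/* Return 1 and store the value in *pResult if text is a decimal
 * integer that fits in an int (leading white space is accepted,
 * as strtol allows).  Return 0 on empty input, trailing junk,
 * or overflow, leaving *pResult untouched. */
{
char *end;
long val;
assert(pResult != NULL);            /* Caller bug, not bad input. */
if (text == NULL || *text == '\0')
    return 0;
errno = 0;
val = strtol(text, &end, 10);
if (errno == ERANGE || val < INT_MIN || val > INT_MAX)
    return 0;
if (end == text || *end != '\0')    /* No digits, or junk after them. */
    return 0;
*pResult = (int)val;
return 1;
}

int needInt(const char *text)
/* Convert text to an int or fail hard: print a message and exit. */
{
int val;
if (!parseInt(text, &val))
    {
    fprintf(stderr, "Expected an integer, got '%s'\n",
            text != NULL ? text : "(null)");
    exit(EXIT_FAILURE);
    }
return val;
}
```

Compare this with `atoi`, which quietly returns 0 for "abc" and lets the program limp along with a wrong number.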
Debugging: the Easy Stuff • Single stepping through a program the first time it is written is very worthwhile. • Single stepping while debugging is often a waste of time. • The cause of many bugs is obvious once you find out where the program died. • Stack traces are quick, often helpful. • ‘top’ can tell you if the program is eating all memory or is stuck in a loop. • Check out the easy, obvious stuff first.
Debugging: the Hard Stuff • Isolate the minimum input needed to make the bug happen. • Run the program on different inputs to help define the boundaries of the bug. • Treat a bug as a logic puzzle, not an embarrassment. Nobody’s perfect! • Put in print statements (possibly to a log file if the volume is high) to make sure the program is behaving as you think it should. • If it’s not a typo, it may well be a conceptual bug of some sort.
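For the print-statement approach, a tiny logging helper keeps the debugging output easy to redirect or switch off. This is a sketch under assumptions: `debugLogTo` and `debugPrintf` are hypothetical names, and the destination is held in one file-level variable that is set in one place and only read elsewhere.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

static FILE *debugLog = NULL;   /* Where debug output goes; NULL means off. */

void debugLogTo(FILE *f)
/* Send later debugPrintf output to f, or turn it off with NULL. */
{
debugLog = f;
}

int debugPrintf(const char *format, ...)
/* Like printf, but writes to the debug log.  Returns what vfprintf
 * returns (characters written, negative on error), or 0 when
 * logging is off. */
{
va_list args;
int count;
assert(format != NULL);
if (debugLog == NULL)
    return 0;
va_start(args, format);
count = vfprintf(debugLog, format, args);
va_end(args);
fflush(debugLog);       /* Flush so the line survives a later crash. */
return count;
}
```

A typical use would be `debugLogTo(stderr);` near the start of the program while hunting a bug, or `debugLogTo(fopen("debug.log", "w"));` when the volume is high, followed by calls such as `debugPrintf("clump %d has %d boxes\n", clumpIx, boxCount);` at the points of interest (the variable names there are only illustrative).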
Debugging Dynamic Memory • Writing outside of bounds in dynamically allocated memory can create crashes later. • Writing to memory after it’s freed, or freeing memory twice, causes similar problems. • Making malloc/new put sentinel values at the start and end of each block, and keep a list of allocated blocks, can be invaluable. Free/delete can check these, as can a ‘checkHeap’ routine.
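The sentinel idea can be sketched as a thin wrapper around malloc and free. Everything here is illustrative rather than the project's actual allocator: `checkedAlloc`, `checkedFree`, the header layout, and the guard byte 0xA5 are assumptions, and only `checkHeap` takes its name from the slide. Each block is laid out as header, front guard, caller's memory, back guard.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define GUARD_BYTE 0xA5

union memHeader
/* Bookkeeping in front of each block.  The extra union members are
 * there only to force strict alignment of what follows. */
    {
    struct { union memHeader *next, *prev; size_t size; } info;
    long double alignLd;
    long long alignLl;
    void *alignPt;
    };

#define GUARD_SIZE sizeof(union memHeader)

static union memHeader *liveBlocks = NULL;   /* List of allocated blocks. */

static unsigned char *userStart(union memHeader *h)
/* Return the caller's memory: it follows the header and front guard. */
{
return (unsigned char *)(h + 1) + GUARD_SIZE;
}

static int guardsIntact(union memHeader *h)
/* Return 1 if both guards still hold the sentinel value. */
{
unsigned char *front = (unsigned char *)(h + 1);
unsigned char *back = userStart(h) + h->info.size;
size_t i;
for (i = 0; i < GUARD_SIZE; ++i)
    if (front[i] != GUARD_BYTE || back[i] != GUARD_BYTE)
        return 0;
return 1;
}

static int isLive(union memHeader *h)
/* Return 1 if h is on the list of allocated blocks. */
{
union memHeader *b;
for (b = liveBlocks; b != NULL; b = b->info.next)
    if (b == h)
        return 1;
return 0;
}

void *checkedAlloc(size_t size)
/* Allocate size bytes with a sentinel guard on each side. */
{
union memHeader *h = malloc(sizeof(*h) + GUARD_SIZE + size + GUARD_SIZE);
if (h == NULL)
    abort();
h->info.size = size;
memset(h + 1, GUARD_BYTE, GUARD_SIZE);
memset(userStart(h) + size, GUARD_BYTE, GUARD_SIZE);
h->info.prev = NULL;
h->info.next = liveBlocks;
if (liveBlocks != NULL)
    liveBlocks->info.prev = h;
liveBlocks = h;
return userStart(h);
}

int checkHeap(void)
/* Return the number of allocated blocks whose guards were overwritten. */
{
int bad = 0;
union memHeader *h;
for (h = liveBlocks; h != NULL; h = h->info.next)
    if (!guardsIntact(h))
        ++bad;
return bad;
}

void checkedFree(void *pt)
/* Free a block from checkedAlloc, checking it on the way out. */
{
union memHeader *h;
if (pt == NULL)
    return;
h = (union memHeader *)((unsigned char *)pt - GUARD_SIZE) - 1;
assert(isLive(h));          /* Trips on a double free or a stray pointer. */
assert(guardsIntact(h));    /* Trips if the block was overrun. */
if (h->info.prev != NULL)
    h->info.prev->info.next = h->info.next;
else
    liveBlocks = h->info.next;
if (h->info.next != NULL)
    h->info.next->info.prev = h->info.prev;
free(h);
}
```

Calling `checkHeap()` at a few points in a misbehaving program narrows down where the damage happens, since the first call that returns nonzero comes after the bad write. The asserts vanish when compiled with NDEBUG, so a debugging build is assumed.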
Modularize, But Not Too Much • Keeping track of 100,000 lines of code is hard. • Keeping track of 1,000 modules is even harder. • Ideally a module encompasses a logically related group of functions with a relatively simple unchanging interface. • If the interface needs constant changing it may reflect over-modularization or grouping together things that don’t really belong together. • It’s safer to add a new routine than to bend an old one.
When to Start Over • A rewrite is often easier than an extensive modification, at least if the interfaces are clean. • When doing a new thing of any complexity, frequently you have to throw away the first attempt or two. • Think twice about a rewrite if the interface is large or ill defined, and you have to maintain backwards compatibility.
Thorough Testing • Programmers test the code as it is written. • Crucial code should be reviewed and tested by a second programmer. • As soon as there’s a working skeleton, the program should be regularly tested by non-programming staff. • For several months before product launch it’s worthwhile to have 3-6 testers from the user community, and at least as many in-house testers as programmers.
Modifying Existing Code • Always start from a compiling/working base. • Before changing functionality: • Obtain a test suite of existing functionality; • Edit to increase locality as much as possible; • Consolidate code into smaller subroutines with defined inputs and outputs; • Read comments and parenthesize misleading ones. • As much as possible limit changes inside of existing routines to a few lines. • Rename things you modify to force yourself to visit all places they are used.
Implementing a Small Change • CVS update source tree to get in sync with everyone else. Some people have cron do this every morning at 3:00am. • Test code in your region before starting. • Implement small change onto hgwdev-user. • Test code in your region. • Do a quick overall test of program. • CVS commit. CVS update and resolve conflicts if necessary. • Test code in your region. • Do a quick overall test of program. • Tell people and/or RT about change and compile it to genome-test.
Implementing a Medium Change • Let genecats know what you’re planning to do. • CVS update at beginning. If it takes more than a day, carefully CVS update at least once a week, testing locally and globally before and after each update. • When you think you’re done and it passes all your tests, advise Heather and ask her to test on hgwdev-user. • When Heather thinks it’s ok, do one last CVS update. Test. CVS commit. Test. Make on genome-test. Announce on genecats.
Implementing a Large Change • Make sure that Donna and genecats in general agree that it is a good thing to do. • Create a branch under CVS to isolate what you’re working on from other developers. Contact Mark D. for help with branching. • Test as you go as always. Be careful not to mix up branch code with main code. • When the large feature is implemented, invite the usual suspects to admire and test it before merging in the branch. • Work with a senior developer (Angie, Matt, Mark D., Chuck or me) to merge the branch.
The Release Cycle • Most developers most of the time are working towards improving the code on ‘genome-test’. • Periodically a ‘release branch’ of the code is made, currently by Matt, tested extra hard, and then moved to ‘hgwbeta’, and eventually to ‘genome’. • Before the release branch is made, try to polish and test your code, and be sure to commit it. • Between branching and actual release please focus on helping test other people’s code at least as much as developing new features.
Conclusions • Building and maintaining a large multiprogrammer project is a skill nobody is born with. • Keeping people informed of what you are up to and treating them with respect is essential. • Readable modular code and sensible version control and testing can keep things working as they grow.