Static Analysis and Modeling

Tools that allow further checking of software systems

Dawson Engler, Benjamin Chelf, Andy Chou, and Seth Hallem. Checking System Rules Using System-Specific, Programmer-Written Compiler Extensions. OSDI 2000.
Madanlal Musuvathi, David Y.W. Park, Andy Chou, Dawson R. Engler, and David L. Dill. CMC: A Pragmatic Approach to Model Checking Real Code. OSDI 2002.

Issues
• Programming tools find (simple) static errors but are not useful for semantic errors.
• Brute-force testing methodologies are neither effective nor thorough for larger, more complex software systems.
• The effort needed to identify issues increases (exponentially?) as time goes on.

More Issues
• We really are not good at programming.
• The psychology of the “master” programmer
• Etc. (There are as many excuses for incorrect code as there are programmers.)
• Software cannot be “verified”. The best we can hope for is sophisticated checks that uncover more of the errors in our code.

Meta Compilation
• System implementers understand the semantics of their system better than a compiler can.
• Compilers are better enforcers of rules that map well to the source code.
• Therefore: MC integrates user-provided, system-specific (semantic) rules into the compilation process.

MC Extensions
• Extensions are written in “Metal”, a language for expressing a broad class of customized, static, bug-finding analyses.
• xgcc, the analysis engine, searches all execution paths and applies the extensions.
• Local analysis

Example
sm free_checker {
    state decl { any_ptr } p;

    start:
        { free(p) } ==> p.freed
    ;

    p.freed:
        { *p } ==> p.stop,
            { err("using %s after free!", mc_identifier(p)); }
      | { free(p) } ==> p.stop,
            { err("double free of %s!", mc_identifier(p)); }
    ;
}
From: Seth Hallem, Benjamin Chelf, Yichen Xie, and Dawson Engler. A System and Language for Building System-Specific Static Analyses. PLDI 2002.
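The state machine in the Metal example can be mimicked in plain C to see how it behaves along a single execution path. This is only a sketch under stated assumptions: the event and result enums and the function `check_path` are hypothetical names invented for illustration, and the real engine applies the extension to program paths rather than to a pre-built event array.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical events seen for one tracked pointer along one path. */
typedef enum { EV_FREE, EV_DEREF, EV_OTHER } event_t;

/* Hypothetical outcomes of running the checker down one path. */
typedef enum { RES_OK, RES_USE_AFTER_FREE, RES_DOUBLE_FREE } result_t;

/* Walks the events in order, mirroring the start and p.freed states.
 * Returning an error corresponds to the transition to p.stop. */
result_t check_path(const event_t *events, size_t n)
{
    assert(events != NULL || n == 0);

    int freed = 0; /* 0 = start state, 1 = p.freed state */
    for (size_t i = 0; i < n; i++) {
        if (!freed) {
            if (events[i] == EV_FREE)
                freed = 1;                  /* start: free(p) ==> p.freed */
        } else {
            if (events[i] == EV_DEREF)
                return RES_USE_AFTER_FREE;  /* p.freed: *p ==> p.stop */
            if (events[i] == EV_FREE)
                return RES_DOUBLE_FREE;     /* p.freed: free(p) ==> p.stop */
        }
    }
    return RES_OK;
}
```

Calling `check_path` on a path that frees the pointer and then dereferences it reports a use after free; a path that dereferences first and frees last is reported as clean.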

Rule Templates

Memory Management
• Null pointer checks
• Unreclaimed memory (leak) checks
• “Double free” checks
• Use-after-deallocation checks

Global Checks Extension
The authors suggest useful checks performed on the whole code input:
• Kernel code should not call blocking functions when holding a spin lock. (42/4)
• Library modules should not call blocking functions until after the reference count is set properly. (53/2)

Other uses
• Detection of race conditions and deadlocks: Dawson Engler and Ken Ashcraft. RacerX: Effective, Static Detection of Race Conditions and Deadlocks. In Proceedings of the Symposium on Operating Systems Principles, pages 237-253, October 2003.

Transitioning… Any questions? (yawns?)

“Conventional” Model Checking
• Modeling software is difficult at best and requires an abstract definition of the software system.
• Abstraction tends to leave out implementation details.
• Time-consuming, manual process.
• Memory intensive, usually exhausting system resources.

CMC – “C Model Checker”
• Works directly on the implementation code
• Process state includes global and local variables, heap, stack, and registers, as well as shared memory
• Optimizations to mitigate the “state explosion problem”
• Non-deterministic modeling supported
• Modeling work can carry over to successive systems

CMC Steps
• Correctness properties
• Environment specification
• Identify initialization code and event handlers
• Initial state generated using init functions
• State generation
• Correctness checks during model execution

State Space Explosion
• Managing it is key to prolonging model execution
• State caching to avoid re-exploring states already visited
• Hash compaction (store a small signature to represent each state)
• Accept missing a few errors in exchange for a smaller state space
• Down-scale model parameters
• Heuristics to remove uninteresting states
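State caching with hash compaction can be sketched as a visited set that stores only a fixed-size signature per state. The following is a minimal illustration, not CMC's implementation: the table size, the names `seen_set` and `seen_insert`, and the choice of 64-bit FNV-1a as the signature function are all assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TABLE_SLOTS 1024u /* power of two; arbitrary size for this sketch */

/* Visited set holding one 8-byte signature per state, not the state itself. */
typedef struct {
    uint64_t sig[TABLE_SLOTS];
    bool     used[TABLE_SLOTS];
    size_t   count;
} seen_set;

/* 64-bit FNV-1a over the raw bytes of a state image. */
static uint64_t state_signature(const void *state, size_t len)
{
    const unsigned char *p = state;
    uint64_t h = 0xcbf29ce484222325ULL;   /* FNV offset basis */
    for (size_t i = 0; i < len; i++) {
        h ^= p[i];
        h *= 0x100000001b3ULL;            /* FNV prime */
    }
    return h;
}

/* Returns true and records the signature if it has not been stored before.
 * Returns false if the signature is already present. Two different states
 * with the same signature are treated as one, so a state (and any error
 * behind it) can occasionally be skipped. */
bool seen_insert(seen_set *set, const void *state, size_t len)
{
    uint64_t h = state_signature(state, len);
    size_t i = (size_t)(h & (TABLE_SLOTS - 1));

    while (set->used[i]) {                /* linear probing */
        if (set->sig[i] == h)
            return false;
        i = (i + 1) & (TABLE_SLOTS - 1);
    }
    assert(set->count + 1 < TABLE_SLOTS); /* keep one slot free so probing ends */
    set->used[i] = true;
    set->sig[i] = h;
    set->count++;
    return true;
}
```

A zero-initialised `seen_set` is an empty set. The search loop would expand a successor state only when `seen_insert` returns true, which is where the trade of a few missed errors for a much smaller memory footprint shows up.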

The AODV Model
• AODV is the Ad hoc On-demand Distance Vector routing protocol.
• Its use of interrupt-driven event handlers fits well into the CMC modeling paradigm.
• 3 different implementations of the routing protocol were modeled.
• 34 distinct errors discovered, including one specification bug
• (Mostly) shared modeling code

AODV Correctness Properties
• General assertions (segmentation faults, memory leaks, dangling pointers)
• Routing tables contain no loops
• Routing table entries: (a) one per node, (b) no route to self, (c) valid hop count
• Messages have valid hop counts (can’t be infinity), and reserved fields are zeroed.
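The loop-freedom property can be phrased as a check the model checker evaluates in every explored state. The sketch below assumes a deliberately simplified representation that is not taken from the AODV implementations: one destination at a time, with a hypothetical array `next_hop` giving each node's next hop toward it and `NO_ROUTE` marking a missing entry.

```c
#include <assert.h>
#include <stdbool.h>

#define NO_ROUTE (-1) /* hypothetical marker: node has no route to dest */

/* next_hop[i] is node i's next hop toward dest, or NO_ROUTE.
 * Returns true if following next hops from every node either reaches
 * dest or runs out of route entries without revisiting a node. */
bool routes_loop_free(const int *next_hop, int n_nodes, int dest)
{
    assert(n_nodes > 0 && dest >= 0 && dest < n_nodes);

    for (int start = 0; start < n_nodes; start++) {
        int cur = start;
        int steps = 0;
        while (cur != dest && next_hop[cur] != NO_ROUTE) {
            cur = next_hop[cur];
            assert(cur >= 0 && cur < n_nodes);
            /* A loop-free path has at most n_nodes - 1 hops. */
            if (++steps >= n_nodes)
                return false;
        }
    }
    return true;
}
```

With four nodes and destination 3, the table `{1, 2, 3, NO_ROUTE}` passes, while `{1, 2, 1, NO_ROUTE}` fails because nodes 1 and 2 forward to each other.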

AODV Environment
• Uses an unordered message queue
• Message loss modeled with random queue deletions
• Alternate wrapper function provided to send network packets
• Stubs for 22 kernel functions and the user-space socket buffer library

AODV: Initialization and Event Handling
• The initialization code is clearly identified.
• Every signal handler is mapped to a CMC “transition”.

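Example
int c;       /* shared counter */
mutex_t m;   /* protects c */

/* lock, unlock, init_mutex, schedule, and wait are the slide's
   threading primitives, left as written. */

/* Prints and increments c only when it is odd. */
void Odd() {
    lock(m);
    if ((c % 2) == 1)
        printf("odd: %d\n", c++);
    unlock(m);
}

/* Prints and increments c only when it is even. */
void Even() {
    lock(m);
    if ((c % 2) == 0)
        printf("even: %d\n", c++);
    unlock(m);
}

int main()
{
    c = 0;
    init_mutex(m);
    schedule(Odd);
    schedule(Even);

    wait(5);
}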
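To see why the scheduling order matters in this example, the two possible orders can be enumerated by hand. The sketch below rests on assumptions that the slide does not state: each scheduled function runs exactly once, and the mutex makes each function body one atomic step, so a step can be modeled as a pure function on the counter. The names `odd_step`, `even_step`, `run_schedule`, and `explore_orders` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*handler_fn)(int c);

/* Atomic effect of Odd(): increment c only when it is odd. */
static int odd_step(int c)  { return (c % 2 == 1) ? c + 1 : c; }

/* Atomic effect of Even(): increment c only when it is even. */
static int even_step(int c) { return (c % 2 == 0) ? c + 1 : c; }

/* Runs the handlers once each in the given order, starting from c = 0. */
int run_schedule(const handler_fn *order, size_t n)
{
    assert(order != NULL);
    int c = 0;
    for (size_t i = 0; i < n; i++)
        c = order[i](c);
    return c;
}

/* Tries both orders, stores the final counter of each in out[],
 * and returns how many distinct final values were seen. */
int explore_orders(int out[2])
{
    const handler_fn odd_first[2]  = { odd_step, even_step };
    const handler_fn even_first[2] = { even_step, odd_step };

    out[0] = run_schedule(odd_first, 2);
    out[1] = run_schedule(even_first, 2);
    return (out[0] == out[1]) ? 1 : 2;
}
```

Under these assumptions the two schedules end in different states: running Odd first leaves the counter at 1, because Odd finds an even value and does nothing, while running Even first leaves it at 2. A model checker explores both outcomes systematically instead of relying on whichever order a test run happens to produce.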

Conclusions
• Static analysis tools are available that provide rule-based checking of code.
• Modeling can find more bugs under controlled execution when the program “fits” the framework well.
• “Finding bugs is easy, given the right approach”
• The search for better means to “validate” software should continue; more lessons to come.
