Bebop: A Path-sensitive Interprocedural Dataflow Analysis Engine Thomas Ball Sriram K. Rajamani http://research.microsoft.com/slam/
What is Bebop? • Model checker for boolean programs • A generic dataflow engine
What are Boolean Programs? • C program with just boolean variables • Used to represent abstractions of C programs • Language for encoding dataflow problems
Agenda • SLAM and boolean programs • Bebop – internals • Current status and future plans
Checking API Usage • Does an application follow the “proper usage” rules of an API? [Figure: Application layered on an API (C lib | DLL | COM | …)]
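State Machine For Locking
state {
    int locked = 0;
}
Lock.call {
    if (locked==1) abort;
    else locked = 1;
}
UnLock.call {
    if (locked==0) abort;
    else locked = 0;
}
[Figure: automaton with states Unlocked, Locked, Error. L moves Unlocked to Locked, U moves Locked to Unlocked; L in Locked or U in Unlocked leads to Error.]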
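The locking state machine above can be read as a runtime monitor. The following is a minimal Python sketch of that reading, not SLIC syntax: the names `LockMonitor`, `LockProtocolError`, and `check_trace` are hypothetical, and the exception stands in for the slide's `abort`.

```python
class LockProtocolError(Exception):
    """Stands in for `abort` in the slide's state machine (assumed mapping)."""


class LockMonitor:
    def __init__(self):
        self.locked = 0                      # state { int locked = 0; }

    def lock(self):                          # Lock.call
        if self.locked == 1:
            raise LockProtocolError("Lock called while already locked")
        self.locked = 1

    def unlock(self):                        # UnLock.call
        if self.locked == 0:
            raise LockProtocolError("UnLock called while unlocked")
        self.locked = 0


def check_trace(events):
    """Return True if a sequence of 'L'/'U' events never reaches Error."""
    monitor = LockMonitor()
    try:
        for event in events:
            if event == "L":
                monitor.lock()
            else:
                monitor.unlock()
    except LockProtocolError:
        return False
    return True
```

Running `check_trace` on one event sequence checks one path; the goal stated on the next slide is to do this for all paths of a program at once.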
Goal: Run the state machine through all paths in the program • Problem: Too many paths! • Solution: Dataflow analysis • Problem: False alarms • Solution: Better abstraction
False alarm
do {
    // get the write lock
    KeAcquireSpinLock(&devExt->writeListLock);
    nPacketsOld = nPackets;
    request = devExt->WriteListHeadVa;
    if (request && request->status) {
        devExt->WriteListHeadVa = request->Next;
        KeReleaseSpinLock(&devExt->writeListLock);
        ...
        nPackets++;
    }
} while (nPackets != nPacketsOld);
KeReleaseSpinLock(&devExt->writeListLock);
Abstraction
do {
    // get the write lock
    KeAcquireSpinLock(&devExt->writeListLock);
    nPacketsOld = nPackets;                       b := true;
    request = devExt->WriteListHeadVa;
    if (request && request->status) {
        devExt->WriteListHeadVa = request->Next;
        KeReleaseSpinLock(&devExt->writeListLock);
        ...
        nPackets++;                               b := b ? false : *;
    }
} while (nPackets != nPacketsOld);
KeReleaseSpinLock(&devExt->writeListLock);
[Slide annotations along the code: b, b, b, !b]
Boolean variable b represents the condition (nPacketsOld == nPackets)
Boolean program
do {
    // get the write lock
    KeAcquireSpinLock();
    b := true;
    if (*) {
        KeReleaseSpinLock();
        ...
        b := b ? false : *;
    }
} while (!b);
KeReleaseSpinLock();
[Slide annotations along the code: b, b, b, !b]
Boolean variable b represents the condition (nPacketsOld == nPackets)
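To see why tracking b removes the false alarm, one can enumerate every reachable state of this boolean program together with the lock state. The sketch below is an explicit-state search written for illustration only; it is not how Bebop works internally, and the program-point names (`acquire`, `branch`, `release_in`, `loop_test`, `release_out`) are made up. With `track_b=False` the loop test is treated as a nondeterministic choice, which mimics an analysis that ignores b.

```python
from collections import deque


def successors(state, track_b):
    """One step of the boolean program; state = (pc, locked, b)."""
    pc, locked, b = state
    if pc == "acquire":                      # KeAcquireSpinLock(); b := true;
        return [("error", locked, b)] if locked else [("branch", True, True)]
    if pc == "branch":                       # if (*): both branches are possible
        return [("release_in", locked, b), ("loop_test", locked, b)]
    if pc == "release_in":                   # KeReleaseSpinLock(); b := b ? false : *;
        if not locked:
            return [("error", locked, b)]
        new_bs = [False] if b else [False, True]
        return [("loop_test", False, nb) for nb in new_bs]
    if pc == "loop_test":                    # while (!b)
        again, leave = ("acquire", locked, b), ("release_out", locked, b)
        if track_b:
            return [leave] if b else [again]
        return [again, leave]                # b ignored: either exit is possible
    if pc == "release_out":                  # final KeReleaseSpinLock()
        return [("exit", False, b)] if locked else [("error", locked, b)]
    return []                                # "exit" and "error" are terminal


def error_reachable(track_b):
    """Breadth-first search over all reachable states for the Error state."""
    start = ("acquire", False, False)
    seen, work = {start}, deque([start])
    while work:
        state = work.popleft()
        if state[0] == "error":
            return True
        for nxt in successors(state, track_b):
            if nxt not in seen:
                seen.add(nxt)
                work.append(nxt)
    return False
```

Under these assumptions `error_reachable(True)` finds no locking violation, while `error_reachable(False)` does, which corresponds to the false alarm reported without the predicate.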
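[Figure: SLAM tool flow. C program + Spec. (SLIC) → GOLF (CFG + VFG) → c2bp (with predicates) → Boolean program → bebop → Pass, or Fail with path p → newton → new predicates (back to c2bp), or Error (shown in GUI)]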
Agenda • SLAM and boolean programs • Bebop – internals • Current status and future plans
Interprocedural dataflow analysis
• [Sharir-Pnueli, 1981]
    • “functional approach”
    • a function for each vertex v maps facts from “entry” to facts at v
    • inefficient algorithm
• [Reps-Horwitz-Sagiv, POPL’95]
    • construct “exploded supergraph”
    • product of CFG and dataflow facts
    • efficient refinement of Sharir-Pnueli
    • iterative addition of edges to graph
    • “path edges”: <entry,d1> -> <v,d2>
    • “summary edges”: <call,d1> -> <return,d2>
decl g;

void main()
    decl u,v;
    [1] u := !v;
    [2] equal(u,v);
    [3] if (g) then
            R: skip;
        fi
    [4] return;

void equal(a, b)
    [5] if (a = b) then
    [6]     g := 1;
        else
    [7]     g := 0;
        fi
    [8] return;

[Figure: exploded supergraph for this program; each node pairs a statement label with a dataflow fact (u!=v, u=v, g=0, g=1)]
Symbolic dataflow analysis • Do not construct explicit “exploded supergraph” • Represent “path edges” and “summary edges” implicitly (using BDDs) • [Ball, Rajamani, SPIN 2000]
Binary Decision Diagrams [Bryant] • Ordered decision tree for a function f over the variables a, b, c, d [Figure: complete decision tree testing a, b, c, d in that order; leaf values 0 1 1 0 1 0 0 1 1 0 0 1 0 1 1 0]
OBDD reduction [Figure: reduced OBDD for the same f over a, b, c, d; one a node, two nodes each for b, c, and d, and the terminals 0 and 1]
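The effect of OBDD reduction can be sketched in a few lines. The code below builds a reduced ordered BDD directly from a Boolean function by applying the two standard reduction rules (skip a node whose two children are equal, and share structurally identical nodes). It is a toy construction that enumerates all assignments, not a production BDD package, and the helper names are hypothetical. The usage lines take four-variable parity as the example function, which is an assumption made for illustration.

```python
def build_robdd(f, nvars):
    """Return (root, nodes) for the reduced OBDD of f under order x0 < x1 < ...

    Node ids 0 and 1 are the terminals; `nodes` maps (var, low, high) -> id.
    """
    nodes = {}

    def mk(var, low, high):
        if low == high:                  # redundant test: both branches agree
            return low
        key = (var, low, high)
        if key not in nodes:             # share isomorphic subgraphs
            nodes[key] = len(nodes) + 2
        return nodes[key]

    def build(i, assignment):
        if i == nvars:
            return int(bool(f(*assignment)))
        low = build(i + 1, assignment + (0,))
        high = build(i + 1, assignment + (1,))
        return mk(i, low, high)

    return build(0, ()), nodes


# Illustrative function (assumed): parity of four variables.
parity_root, parity_nodes = build_robdd(lambda a, b, c, d: a ^ b ^ c ^ d, 4)
tree_internal_nodes = 2 ** 4 - 1         # size of the unreduced decision tree
```

For this example the unreduced decision tree has 15 internal nodes, while `parity_nodes` holds 7: one for the first variable and two for each of the others.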
Difficulty • Do not want to encode control flow graph with BDDs • Solution: • No need to represent the CFG “source” • <entry,d1> -> <v,d2> • Partition path edges by their “target”! • PE(v) = { <d1,d2> | <entry,d1> -> <v,d2> } • use a BDD to represent PE(v)
decl g;

void main()
    decl u,v;
    [1] u := !v;
    [2] equal(u,v);
    [3] if (g) then
            R: skip;
        fi
    [4] return;

void equal(a, b)
    [5] if (a = b) then
    [6]     g := 1;
        else
    [7]     g := 0;
        fi
    [8] return;

Path edges (unprimed = value at procedure entry, primed = value at the label):
    PE(1): g=g’ & u=u’ & v=v’
    PE(2): g=g’ & v!=u’ & v=v’
    PE(3): g’=0 & v!=u’ & v=v’
    PE(4): g’=0 & v’!=u’ & v=v’
    PE(5): g=g’ & a=a’ & b=b’ & a!=b
    PE(7): g=g’ & a=a’ & b=b’ & a!=b
    PE(8): g’=0 & a=a’ & b=b’ & a!=b
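The path edges for this example can be reproduced with explicit sets of state pairs standing in for BDDs. The sketch below is hand-unrolled for this one program rather than a general worklist algorithm, and the encoding (tuples `(g, u, v)` for main and `(g, a, b)` for equal, the dictionary `PE`, the set `summary`) is an assumption made for illustration.

```python
from itertools import product

BOOLS = (False, True)
PE = {}

# main: states are (g, u, v). PE(1) is the identity relation on entry states.
PE[1] = {(d, d) for d in product(BOOLS, repeat=3)}
# [1] u := !v
PE[2] = {(d1, (g, not v, v)) for d1, (g, u, v) in PE[1]}

# [2] equal(u, v): seed the callee only with states that reach the call.
# equal: states are (g, a, b), with a bound to u and b bound to v.
PE[5] = {(d2, d2) for _, d2 in PE[2]}
PE[6] = {(d1, d2) for d1, d2 in PE[5] if d2[1] == d2[2]}    # a = b
PE[7] = {(d1, d2) for d1, d2 in PE[5] if d2[1] != d2[2]}    # a != b
PE[8] = ({(d1, (True, a, b)) for d1, (g, a, b) in PE[6]} |  # [6] g := 1
         {(d1, (False, a, b)) for d1, (g, a, b) in PE[7]})  # [7] g := 0

# Summary edges <call,d1> -> <return,d2>: only the global g flows back,
# the caller's locals u and v keep their values from the call state.
summary = {(d1, (d2[0], d1[1], d1[2])) for d1, d2 in PE[8]}

PE[3] = {(d1, ret) for d1, d2 in PE[2] for call, ret in summary if call == d2}
PE["R"] = {(d1, d2) for d1, d2 in PE[3] if d2[0]}           # then-branch needs g
PE[4] = {(d1, d2) for d1, d2 in PE[3] if not d2[0]} | PE["R"]
```

With this encoding `PE["R"]` comes out empty, so label R is unreachable, and every pair in `PE[3]` has g’=0 and u’ != v’, in line with the annotations listed above.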
Boolean Programs • A target language for encoding finite interprocedural program analyses • All the control-flow primitives of C, including procedures and recursion • direct mapping of control-flow • ease programming of DFA • Boolean variables represent finite sets of dataflow facts • implicit three-valued logic via BDDs
Nullness analysis

C program:
void foo( int *p, int *q)
    if( p != NULL)
        q = p;
    if( p != NULL)
        *q = *p + 1;

Boolean program:
void foo( bool bp, bool bq)
    if(bp)
        bq = bp;
    if(bp)
        skip
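A small sketch of what the boolean program buys here, assuming bp means "p != NULL" and bq means "q != NULL". The slide's second branch is `skip`; the version below instead returns whether both dereferences would be safe, which is an assertion added for illustration. `merged_facts_safe` is a deliberately crude path-insensitive comparison that forgets the correlation between bp and bq after the first `if`.

```python
from itertools import product


def foo_bool(bp, bq):
    """Run the boolean program; return False if a dereference would be unsafe."""
    if bp:
        bq = bp              # q = p
    if bp:
        return bp and bq     # *q = *p + 1 reads *p and writes *q
    return True


def path_sensitive_safe():
    """Check every input valuation, keeping bp and bq correlated."""
    return all(foo_bool(bp, bq) for bp, bq in product((False, True), repeat=2))


def merged_facts_safe():
    """Track the possible values of bq alone, merging both branches."""
    bq_values = {False, True}            # q unknown on entry
    bq_values = bq_values | {True}       # join of the "q = p" branch and the skipped branch
    return False not in bq_values        # is q definitely non-null at *q?
```

In this sketch `path_sensitive_safe()` reports no unsafe dereference, while `merged_facts_safe()` cannot rule one out.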
Performance
• Complexity: O(E * 2^O(N))
    • E is the size of the CFG
    • N is the max. number of variables in scope
• Some execution times
    • 300 line C driver in 9 seconds: 37 predicates, 1,100 line boolean program, 300 boolean variables
    • 7,000 line C driver in 8 seconds: 22 predicates, 16,000 line boolean program, 400 boolean variables
Conclusions • Boolean programs – a generic model • Abstractions in model checking software • Adding predicate sensitivity to dataflow • Bebop – a symbolic dataflow analyzer • Explicit representation of CFG • Implicit representation of path edges and summary edges • Generation of hierarchical error traces
In progress • Separate compilation • Combine top-down and bottom-up • Precision/efficiency tradeoff • Cartesian abstraction • Partition the BDDs • don’t track all correlations • large amount of work to draw upon from symbolic model checking
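As a rough illustration of the "don’t track all correlations" item, the sketch below applies a Cartesian abstraction to an explicit set of states: each variable's possible values are kept separately and the relationships between variables are dropped. The function names and the two-variable example are hypothetical and do not reflect how the planned BDD partitioning is implemented.

```python
from itertools import product


def cartesian_abstraction(states):
    """Project a non-empty set of equal-length tuples onto each position."""
    nvars = len(next(iter(states)))
    return [{s[i] for s in states} for i in range(nvars)]


def concretize(per_variable):
    """All states consistent with the per-variable value sets."""
    return set(product(*per_variable))


# Example: the two states in which u != v.
exact = {(0, 1), (1, 0)}
approx = concretize(cartesian_abstraction(exact))
```

Here `approx` contains all four valuations of (u, v), including the two with u = v that `exact` excludes: the representation is smaller per variable, but the u != v correlation is lost.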
Software Productivity Tools Microsoft Research http://research.microsoft.com/slam/