Summarizing Procedures in Concurrent Programs Shaz Qadeer Sriram K. Rajamani Jakob Rehof Microsoft Research
Motivation • How do you scale program analyses for sequential programs? • Summarize at procedure boundaries • Sharir-Pnueli ‘81, Reps-Horwitz-Sagiv ‘95 • Used in compiler dataflow analyses • Used in error detection tools • SLAM (Ball-Rajamani ‘00) • ESP (Das-Lerner-Seigle ‘02)
Summarization is efficient! • Boolean program with: • g globals • n procedures, each with at most m locals • |E| = size of the CFG of the program • Complexity: $O(|E| \cdot 2^{O(g+m)})$ • Complexity linear in the number of procedures!
Summarization gives termination! • Possibly recursive boolean programs • Infinite state systems • Checking terminates with summarization!
Question Can summarization help analysis of concurrent programs?
Difficulty Assertion checking for multithreaded programs is undecidable • Even if all variables are boolean • Further, even if there are only two threads! • Proof: reduce emptiness of the intersection of two CFLs to this problem (Ramalingam ’00)
Our work • New model checking algorithm using summarization • useful for concurrent programs • Summaries provide re-use and efficiency for analyzing concurrent programs • Enable termination of analysis in a large class of concurrent programs • includes programs with recursion, shared variables and concurrency
Difficulties in summarizing concurrent programs • What is a summary? • For sequential programs • Summary of procedure P = Set of all pre-post state pairs (s,s’) obtained by invoking P • This doesn’t work for concurrent programs • Does not model concurrent updates by other threads
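The sequential notion of a summary can be sketched in a few lines. This is a minimal illustration, not the algorithm from the talk: the procedure `swap_if_first_set` and the helper `summarize` are hypothetical names, and the procedure is assumed to be deterministic and to run without interference.

```python
from itertools import product

def summarize(procedure, num_bools):
    """Return the set of (pre, post) state pairs of a deterministic
    procedure over `num_bools` boolean variables."""
    summary = set()
    for pre in product((False, True), repeat=num_bools):
        summary.add((pre, procedure(pre)))
    return summary

def swap_if_first_set(state):
    # Hypothetical procedure body: swap the two variables when the first is set.
    a, b = state
    return (b, a) if a else (a, b)

SUMMARY = summarize(swap_if_first_set, 2)
```

Each pair is valid only if nothing else writes the variables while the procedure runs, which is exactly the assumption that fails once other threads can update shared state in the middle of the call.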
Insight • In a well synchronized concurrent program • A thread’s computation can be viewed as a sequence of transactions • While analyzing a transaction, interleavings with other threads need not be considered • Key idea: Summarize transactions!
How do you identify transactions? Lipton’s theory of reduction
Four atomicities • R: right movers • lock acquire • L: left movers • lock release • B: both right + left movers • variable access holding lock • N: non-movers • access unprotected variable
[Figure: commuting diagrams over states S0 to S7 for the actions acq(this), r=bal, and rel(this) against actions x, y, z of other threads]
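Transaction • Any sequence of actions whose atomicities are in R*(N+ε)L* is a transaction
[Figure: a transaction running through states S0 to S7, each action labeled R, N, or L; the part before the commit point is marked Precommit and the part after it Postcommit]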
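The transaction pattern can be checked mechanically. The sketch below assumes each action has already been classified as R, L, B, or N and that a thread's actions are given as a string of those letters; a B action may stand in for either an R or an L. The function names are illustrative, not taken from the talk.

```python
import re

# Right movers (or both-movers), at most one non-mover, then left movers
# (or both-movers).
TRANSACTION = re.compile(r"[RB]*N?[LB]*")

def is_transaction(atomicities):
    """True if the whole string of atomicities forms one transaction."""
    return TRANSACTION.fullmatch(atomicities) is not None

def split_transactions(atomicities):
    """Greedily cut a thread's action sequence into transactions.

    A transaction is committed once it has executed an N or an L; the next
    R or N after that point must start a new transaction.
    """
    transactions, current, committed = [], "", False
    for atom in atomicities:
        if committed and atom in "RN":
            transactions.append(current)
            current, committed = "", False
        current += atom
        if atom in "NL":
            committed = True
    if current:
        transactions.append(current)
    return transactions
```

For example, an acquire, a protected access, and a release repeated twice (`"RBLRBL"`) splits into two transactions, because the second acquire is a right mover that follows a left mover.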
Transactions and summaries Corollary of Lipton’s theorem: No need to schedule other threads in the middle of a transaction If a procedure body occurs in a transaction, we can summarize it!
Resource allocator (1)

bool available[N];
mutex m;

int getResource() {
    int i = 0;
L0: acquire(m);
L1: while (i < N) {
L2:     if (available[i]) {
L3:         available[i] = false;
L4:         release(m);
L5:         return i;
        }
L6:     i++;
    }
L7: release(m);
L8: return i;
}

Choose N = 2

Summaries:
<pc, i, m, (a[0],a[1])>    <pc’, i’, m’, (a[0]’,a[1]’)>
<L0, 0, 0, (0,0)>          <L8, 2, 0, (0,0)>
<L0, 0, 0, (0,1)>          <L5, 1, 0, (0,0)>
<L0, 0, 0, (1,0)>          <L5, 0, 0, (0,0)>
<L0, 0, 0, (1,1)>          <L5, 0, 0, (0,1)>
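Because the whole body of getResource() runs under the single mutex m, it is one transaction and can be executed atomically when computing summaries. The following sketch re-derives the four summaries for N = 2 under that assumption; the Python function names are illustrative.

```python
from itertools import product

N = 2

def get_resource(available):
    """Run getResource() atomically; return the post-state
    (pc, i, m, available) with m == 0 meaning the mutex is free."""
    available = list(available)
    i = 0
    # L0: acquire(m)
    while i < N:                 # L1
        if available[i]:         # L2
            available[i] = 0     # L3
            # L4: release(m)
            return ("L5", i, 0, tuple(available))
        i += 1                   # L6
    # L7: release(m)
    return ("L8", i, 0, tuple(available))

def summaries():
    """Map every pre-state <L0, 0, 0, a> to its post-state."""
    return {("L0", 0, 0, a): get_resource(a)
            for a in product((0, 1), repeat=N)}
```

Calling `summaries()` yields one entry per valuation of `available`, matching the four pairs listed on the slide.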
What if transaction boundaries and procedure boundaries do not coincide? Two level model checking algorithm
Two level algorithm • First level maintains stack • Second level maintains stack-less summaries • Summaries can start and end anywhere in a procedure
Resource allocator (2)

bool available[N];
mutex m[N];

int getResource() {
    int i = 0;
L0: while (i < N) {
L1:     acquire(m[i]);
L2:     if (available[i]) {
L3:         available[i] = false;
L4:         release(m[i]);
L5:         return i;
        } else {
L6:         release(m[i]);
        }
L7:     i++;
    }
L8: return i;
}

Choose N = 2

Summaries:
<pc, i, (m[0],m[1]), (a[0],a[1])>    <pc’, i’, (m[0]’,m[1]’), (a[0]’,a[1]’)>
<L0, 0, (0,0), (0,0)>                <L1, 1, (0,0), (0,0)>
<L0, 0, (0,0), (0,1)>                <L1, 1, (0,0), (0,1)>
<L0, 0, (0,0), (1,0)>                <L5, 0, (0,0), (0,0)>
<L0, 0, (0,0), (1,1)>                <L5, 0, (0,0), (0,1)>
<L1, 1, (0,0), (0,0)>                <L8, 2, (0,0), (0,0)>
<L1, 1, (0,0), (0,1)>                <L5, 1, (0,0), (0,0)>
<L1, 1, (0,0), (1,0)>                <L8, 2, (0,0), (1,0)>
<L1, 1, (0,0), (1,1)>                <L5, 1, (0,0), (1,0)>
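With one mutex per slot, a transaction ends inside the loop: once m[i] has been released (a left mover), the next acquire (a right mover) must begin a new transaction. The sketch below is a small interpreter for the labeled statements, under the assumptions that the atomicity of each label is assigned by hand and that `available` may hold any value at the start of a transaction, since other threads can run between transactions. All Python names are illustrative.

```python
from itertools import product

N = 2
RETURNS = {"L5", "L8"}
# Hand-assigned atomicities: acquire is R, release is L, everything else
# touches only locals or lock-protected data and is B.
ATOMICITY = {"L0": "B", "L1": "R", "L2": "B", "L3": "B",
             "L4": "L", "L6": "L", "L7": "B"}

def step(pc, i, m, a):
    """Execute the statement at label pc and return the next state."""
    m, a = list(m), list(a)
    if pc == "L0":
        pc = "L1" if i < N else "L8"
    elif pc == "L1":
        m[i] = 1
        pc = "L2"
    elif pc == "L2":
        pc = "L3" if a[i] else "L6"
    elif pc == "L3":
        a[i] = 0
        pc = "L4"
    elif pc == "L4":
        m[i] = 0
        pc = "L5"
    elif pc == "L6":
        m[i] = 0
        pc = "L7"
    elif pc == "L7":
        i += 1
        pc = "L0"
    else:
        raise ValueError(pc)
    return (pc, i, tuple(m), tuple(a))

def run_transaction(state):
    """Run from `state` to the end of the current transaction or a return."""
    committed = False
    while state[0] not in RETURNS:
        atom = ATOMICITY[state[0]]
        if committed and atom in "RN":
            break  # a right mover after the commit starts a new transaction
        state = step(*state)
        committed = committed or atom in "NL"
    return state

def all_summaries():
    """Summaries starting at L0 (i = 0) and at L1 (i = 1), all locks free."""
    result = {}
    for (pc, i), a in product([("L0", 0), ("L1", 1)],
                              product((0, 1), repeat=N)):
        pre = (pc, i, (0,) * N, a)
        result[pre] = run_transaction(pre)
    return result
```

The summaries that start at L1 begin and end in the middle of the procedure, which is why they cannot be keyed on procedure entry and exit alone.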
Two level model checking algorithm: in pictures Let’s first review the sequential CFL algorithm…
[Figure, three animation steps: control-flow graphs of main() and bar(), with a call to bar() in main()]
Three kinds of summaries: • MAX • MAXCALL • MAXRETURN
[Figure: main() and bar() with transactions T1 and T2; summary edges labeled MAXCALL, MAX (end of transaction), and MAXRETURN]
Concurrency + recursion

int g = 0;
mutex m;

void foo(int r) {
L0: if (r == 0) {
L1:     foo(r);
    } else {
L2:     acquire(m);
L3:     g++;
L4:     release(m);
    }
L5: return;
}

void main() {
    int q = choose({0,1});
M0: foo(q);
M1: acquire(m);
M2: assert(g >= 1);
M3: release(m);
M4: return;
}

P = main() || main()

Summaries for foo:
<pc, r, m, g>      <pc’, r’, m’, g’>
<L0, 1, 0, 0>      <L5, 1, 0, 1>
<L0, 1, 0, 1>      <L5, 1, 0, 2>

Summaries for main:
<pc, q, m, g>      <pc’, q’, m’, g’>
<M0, 1, 0, 0>      <M1, 1, 0, 1>
<M0, 1, 0, 1>      <M1, 1, 0, 2>
<M1, 1, 0, 1>      <M4, 1, 0, 1>
<M1, 1, 0, 2>      <M4, 1, 0, 2>
What if the same procedure is called from different phases of a transaction? Instrument the transaction phase into the state of the program
Transactional context

int gm = 0, gn = 0;
mutex m, n;

void bar() {
N0: acquire(m);
N1: gm++;
N2: release(m);
}

void foo1() {
L0: acquire(n);
L1: gn++;
L2: bar();
L3: release(n);
L4: return;
}

void foo2() {
M0: acquire(n);
M1: gn++;
M2: release(n);
M3: bar();
M4: return;
}

P = foo1() || foo2()
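The two callers reach bar() in different transaction phases, which can be seen by tracking the phase along each thread's actions. The sketch below flattens foo1 and foo2 into lists of (label, atomicity) pairs with hand-assigned atomicities (acquire is R, release is L, a lock-protected increment is B); the names and the "pre"/"post" encoding are assumptions made for illustration.

```python
BAR = [("N0", "R"), ("N1", "B"), ("N2", "L")]
# foo1 calls bar() while still holding n; foo2 calls it after releasing n.
FOO1 = [("L0", "R"), ("L1", "B")] + BAR + [("L3", "L")]
FOO2 = [("M0", "R"), ("M1", "B"), ("M2", "L")] + BAR

def phase_on_entry(trace):
    """Map each label to the phase ("pre" or "post" commit) that is in
    force just before the action at that label executes."""
    phases, committed = {}, False
    for label, atom in trace:
        phases[label] = "post" if committed else "pre"
        if committed and atom in "RN":
            committed = False   # this action starts a new transaction
        if atom in "NL":
            committed = True    # the transaction has passed its commit point
    return phases
```

In foo1 the acquire at N0 is reached in the precommit phase, so bar() extends the current transaction; in foo2 it is reached in the postcommit phase, so the same acquire starts a new transaction. A summary for bar() therefore has to record the phase as part of its pre-state.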
Recap of technical problems • How do you identify transactions? • Using the theory of reduction (Lipton ’75) • What if transaction boundaries do not coincide with procedure boundaries? • Two level model checking algorithm • First level maintains stack • Second level maintains stack-less summaries • What if a procedure is called from different phases of a transaction? • Instrument the transaction phase into the state of the program
Termination • A function is transactional if no transaction ends in the “middle” of its execution (including all of its transitive callees) • Theorem: For concurrent boolean programs, if all recursive functions are transactional, then the algorithm terminates.
Sequential case • If we feed a sequential program to our algorithm, it functions exactly like the Reps-Horwitz-Sagiv (RHS, POPL ’95) algorithm • Our algorithm generalizes the RHS algorithm to concurrent programs!
Related work • Summarizing sequential programs • Sharir-Pnueli ’81, Reps-Horwitz-Sagiv ’95, Ball-Rajamani ’00 • Concurrency+Procedures • Bouajjani-Esparza-Touili ’02 • Esparza-Podelski ’00 • Reduction • Lipton ’75 • Qadeer-Flanagan ’03
SLAM model checker • Source code: sequential C program • C data structures, pointers, procedure calls, parameter passing, scoping, control flow • Automatic abstraction • Boolean program • Push down model • Data flow analysis implemented using BDDs • FSM abstraction • Finite state machines
Zing model checker • Source code: device driver (taking concurrency into account), web services code • Abstraction • Rich control constructs: thread creation, function call, exception, objects, dynamic allocation • Model checking is undecidable!
What is Zing? • Zing is a framework for software model-checking • Language, compiler, runtime, tools • Supports key software concepts • Enables easier extraction of models from code • Supports research in exploring large state spaces • Operates seamlessly with the VS.Net design environment
Current status • Summarization: • Theory: to appear in POPL ’04 • Implementation: in progress • Zing: • Compiler, model checker and conformance checker operational • State-delta and transaction-based reduction implemented • Plans: • Symbolic reasoning • Automatic abstraction
BPEL4WS checking
[Figure: BPEL processes (Buyer, Seller, Auction House, Reg Service), the Zing model, and the Zing State Explorer]