Atomicity: A powerful concept for analyzing concurrent software
Shaz Qadeer
Microsoft Research
Concurrent programs
[Figure: Threads 1 to 4 running on Processors 1 and 2]
• Operating systems, databases, web servers, browsers, GUIs, web services
• Modern languages: Java, C#
• Cost of multiprocessing desktop < $2000
Reliable concurrent software?
• Correctness problem
  • does the program behave correctly for all inputs and all interleavings?
  • very hard to ensure with testing
• Bugs due to concurrency are insidious
  • non-deterministic, timing dependent
  • data corruption, crashes
  • difficult to detect, reproduce, eliminate
Multithreaded program execution
Thread 1: ... int t1 = hits; hits = t1 + 1; ...
Thread 2: ... int t2 = hits; hits = t2 + 1; ...
Three interleavings, each starting from hits=0:
  t1=hits, hits=t1+1, t2=hits, hits=t2+1    ends with hits=2
  t1=hits, t2=hits, hits=t1+1, hits=t2+1    ends with hits=1
  t1=hits, t2=hits, hits=t2+1, hits=t1+1    ends with hits=1
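The three executions shown are a sample of the possible schedules. As a minimal sketch (the class and method names are illustrative, not from the talk), the following enumerates every interleaving of the two threads' read and write steps and collects the final values of hits:

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper: simulates all interleavings of the two threads.
class Interleavings {
    // Returns the set of final values of hits over all interleavings.
    static Set<Integer> finalValues() {
        Set<Integer> results = new TreeSet<>();
        explore(0, 0, 0, 0, 0, results);
        return results;
    }

    // pc1 and pc2 count how many of its two steps each thread has executed.
    private static void explore(int hits, int t1, int t2,
                                int pc1, int pc2, Set<Integer> results) {
        if (pc1 == 2 && pc2 == 2) {
            results.add(hits);
            return;
        }
        if (pc1 == 0) explore(hits, hits, t2, 1, pc2, results);   // t1 = hits
        if (pc1 == 1) explore(t1 + 1, t1, t2, 2, pc2, results);   // hits = t1 + 1
        if (pc2 == 0) explore(hits, t1, hits, pc1, 1, results);   // t2 = hits
        if (pc2 == 1) explore(t2 + 1, t1, t2, pc1, 2, results);   // hits = t2 + 1
    }

    public static void main(String[] args) {
        System.out.println(finalValues());
    }
}
```

Only the schedules in which one thread finishes both steps before the other starts end with hits=2; every schedule in which both reads happen before either write loses an update.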
Races in action
• Power outage in northeastern grid in 2003
• Affected millions of people
• Race in Alarm and Event Processing code
• “We had in excess of three million online operational hours in which nothing had ever exercised that bug. I'm not sure that more testing would have revealed it.” -- GE Energy's Mike Unum
Race conditions
Thread 1: ... int t1 = hits; hits = t1 + 1; ...
Thread 2: ... int t2 = hits; hits = t2 + 1; ...
A race condition occurs if two threads access a shared variable at the same time, and at least one of the accesses is a write.
Preventing race conditions using locks
Thread 1: synchronized(lock) { int t1 = hits; hits = t1 + 1; }
Thread 2: synchronized(lock) { int t2 = hits; hits = t2 + 1; }
Execution starting from hits=0:
  acq, t1=hits, hits=t1+1, rel, acq, t2=hits, hits=t2+1, rel    ends with hits=2
• Lock can be held by at most one thread
• Race conditions are prevented using locks
  • associate a lock with each shared variable
  • acquire lock before accessing variable
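The locking discipline above can be sketched as a small runnable program. The class name LockedCounter and the iteration counts are assumptions made for illustration; only the synchronized block around the read and the write comes from the slide.

```java
// Hypothetical demo: two threads increment a shared counter under one lock.
class LockedCounter {
    private final Object lock = new Object();
    private int hits = 0;

    // The read and the write of hits happen while holding the lock.
    void hit() {
        synchronized (lock) {
            int t = hits;
            hits = t + 1;
        }
    }

    int get() {
        synchronized (lock) {
            return hits;
        }
    }

    // Runs two threads that each call hit() perThread times; returns the total.
    static int run(int perThread) {
        LockedCounter counter = new LockedCounter();
        Runnable body = () -> {
            for (int i = 0; i < perThread; i++) counter.hit();
        };
        Thread a = new Thread(body);
        Thread b = new Thread(body);
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(run(100_000));
    }
}
```

Because every access to hits is made while holding the same lock, no increment is lost and the total is always twice perThread. Removing the synchronized block in hit() reintroduces the race, and the total may then come out lower.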
Race detection
• Static: Sterling 93, Aiken-Gay 98, Flanagan-Abadi 99, Flanagan-Freund 00, Boyapati-Rinard 01, von Praun-Gross 01, Boyapati-Lee-Rinard 02, Grossman 03
• Dynamic: Savage et al. 97 (Eraser tool), Cheng et al. 98, Choi et al. 02
Race-free bank account
    int balance;

    void deposit(int n) {
        synchronized (this) {
            balance = balance + n;
        }
    }
Race-free bank account
    int balance;

    void deposit(int n) {
        synchronized (this) {
            balance = balance + n;
        }
    }

    int read() {
        int r;
        synchronized (this) {
            r = balance;
        }
        return r;
    }

    void withdraw(int n) {
        int r = read();
        synchronized (this) {
            balance = r - n;
        }
    }
Initially balance = 10
Thread 1: deposit(10);
Thread 2: withdraw(10);
Race-freedom not sufficient!
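To see why race-freedom is not enough here, consider the schedule in which deposit(10) runs between the read and the write inside withdraw(10). The sketch below replays that one schedule deterministically on a single thread. The Runnable hook parameter is an assumption added for illustration and is not part of the slide's code.

```java
// Hypothetical replay of one bad interleaving of deposit and withdraw.
class RaceFreeButNotAtomic {
    static class Account {
        private int balance;

        Account(int initial) {
            balance = initial;
        }

        void deposit(int n) {
            synchronized (this) {
                balance = balance + n;
            }
        }

        int read() {
            synchronized (this) {
                return balance;
            }
        }

        // Same logic as the slide's withdraw; the hook stands in for whatever
        // another thread does between the read and the write.
        void withdraw(int n, Runnable betweenReadAndWrite) {
            int r = read();
            betweenReadAndWrite.run();
            synchronized (this) {
                balance = r - n;
            }
        }
    }

    // Starts at 10, interleaves deposit(10) inside withdraw(10).
    static int lostUpdateDemo() {
        Account account = new Account(10);
        account.withdraw(10, () -> account.deposit(10));
        return account.read();
    }

    public static void main(String[] args) {
        System.out.println(lostUpdateDemo());
    }
}
```

Either serial order of deposit(10) and withdraw(10) leaves the balance at 10. In this schedule withdraw writes back 10 - 10 computed from the value it read before the deposit, so the balance ends at 0 and the deposit is lost, even though every access to balance is protected by the lock.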
Atomic bank account
    int balance;

    void withdraw(int n) {
        synchronized (this) {
            balance = balance - n;
        }
    }

    void deposit(int n) {
        synchronized (this) {
            balance = balance + n;
        }
    }

    int read() {
        int r;
        synchronized (this) {
            r = balance;
        }
        return r;
    }
java.lang.StringBuffer (jdk 1.4) “String buffers are safe for use by multiple threads. The methods are synchronized so that all the operations on any particular instance behave as if they occur in some serial order that is consistent with the order of the method calls made by each of the individual threads involved.”
java.lang.StringBuffer is buggy!
    public final class StringBuffer {
        private int count;
        private char[] value;
        ...
        public synchronized StringBuffer append(StringBuffer sb) {
            if (sb == null) sb = NULL;
            int len = sb.length();
            int newcount = count + len;
            if (newcount > value.length) expandCapacity(newcount);
            sb.getChars(0, len, value, count); // use of stale len !!
            count = newcount;
            return this;
        }
        public synchronized int length() { return count; }
        public synchronized void getChars(...) { ... }
    }
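The problem is that append holds the lock on this but not on sb, so sb can change between sb.length() and sb.getChars(...). The sketch below replays that schedule on a deliberately simplified model. Buf, setLength, the fixed capacity, and the Runnable hook are all assumptions for illustration and are not the JDK's code.

```java
// Hypothetical simplified model of the stale-length problem in append.
class StaleLengthDemo {
    static class Buf {
        private final char[] value = new char[16]; // fixed capacity, for brevity
        private int count;

        Buf(String s) {
            s.getChars(0, s.length(), value, 0);
            count = s.length();
        }

        synchronized int length() {
            return count;
        }

        synchronized void setLength(int n) {
            count = n;
        }

        synchronized void getChars(int begin, int end, char[] dst, int dstBegin) {
            if (end > count) throw new StringIndexOutOfBoundsException(end);
            System.arraycopy(value, begin, dst, dstBegin, end - begin);
        }

        // Locks this but not sb; the hook models another thread running
        // between sb.length() and sb.getChars(...).
        synchronized Buf append(Buf sb, Runnable otherThread) {
            int len = sb.length();
            otherThread.run();
            int newcount = count + len;
            sb.getChars(0, len, value, count); // len may be stale here
            count = newcount;
            return this;
        }
    }

    // Returns true if the stale length makes getChars throw.
    static boolean staleLengthThrows() {
        Buf target = new Buf("ab");
        Buf source = new Buf("cdef");
        try {
            target.append(source, () -> source.setLength(1));
            return false;
        } catch (StringIndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(staleLengthThrows());
    }
}
```

In this model source is truncated to length 1 after append has read length 4, so the copy of 4 characters fails. Each method is synchronized and race-free, yet append as a whole does not behave as if it ran in one step.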
Atomicity
• A method is atomic if it seems to execute “in one step” even in the presence of concurrently executing threads
• Common concept
  • “(strict) serializability” in databases
  • “linearizability” in concurrent objects
  • “thread-safe” multithreaded libraries
    • “String buffers are safe for use by multiple threads. …”
• Fundamental semantic correctness property
Definition of atomicity
• Serialized execution of deposit
    x y acq(this) r=bal bal=r+n rel(this) z
• Non-serialized executions of deposit
    acq(this) x r=bal y bal=r+n z rel(this)
    acq(this) x y r=bal bal=r+n z rel(this)
• deposit is atomic if for every non-serialized execution, there is a serialized execution with the same behavior
Reduction (Lipton 75)
Each row is an execution followed by the states it passes through. Adjacent commuting operations are swapped one pair at a time until the steps of deposit run without interruption.
    acq(this) x r=bal y bal=r+n z rel(this)      S0 S1 S2 S3 S4 S5 S6 S7
    acq(this) x y r=bal bal=r+n z rel(this)      S0 S1 S2 T3 S4 S5 S6 S7
    x acq(this) y r=bal bal=r+n z rel(this)      S0 T1 S2 T3 S4 S5 S6 S7
    x y acq(this) r=bal bal=r+n z rel(this)      S0 T1 T2 T3 S4 S5 S6 S7
    x y acq(this) r=bal bal=r+n rel(this) z      S0 T1 T2 T3 S4 S5 T6 S7
• Swapping r=bal with y: blue thread holds lock, red thread does not hold lock, so operation y does not access balance and the operations commute
• Swapping acq(this) with x: blue thread holds lock after acquire, so operation x does not modify lock and the operations commute
Four atomicities
• R: right commutes
  • lock acquire
  • acq(this) x (S0 S1 S2) becomes x acq(this) (S0 T1 S2)
• L: left commutes
  • lock release
  • z rel(this) (S5 S6 S7) becomes rel(this) z (S5 T6 S7)
• B: both right + left commutes
  • variable access holding lock
  • r=bal y (S2 S3 S4) becomes y r=bal (S2 T3 S4)
  • x r=bal (S2 S3 S4) becomes r=bal x (S2 T3 S4)
• N: atomic action, non-commuting
  • access unprotected variable
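Sequential composition
• Use atomicities to perform reduction
• Lipton: sequence (R+B)*; (N+ε); (L+B)* is atomic
[Figure: an execution R* ... N ... L* from S0 to S5 interleaved with other threads' steps x and Y, and the equivalent execution in which R*; N; L* runs without interruption]
Composition table (row ; column), where C marks a compound sequence that is not atomic:
    ;  | B  L  R  N  C
    ---+---------------
    B  | B  L  R  N  C
    R  | R  N  R  N  C
    L  | L  L  C  C  C
    N  | N  N  C  C  C
    C  | C  C  C  C  C
Examples:
    (R;B) ; (N;L)  =  R ; N  =  N
    (R;N;L) ; (R;N;L)  =  N ; N  =  C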
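The composition table is small enough to encode directly. This is a minimal sketch with illustrative names (AtomicityCalculus, seq, seqAll), not the checker described in the talk. It folds a sequence of statement atomicities into the atomicity of the whole sequence.

```java
// Hypothetical encoding of the sequential composition table.
class AtomicityCalculus {
    enum Atomicity { B, L, R, N, C }

    // TABLE[a].charAt(b) is the atomicity of "a ; b".
    // Rows and columns follow the enum order B, L, R, N, C.
    private static final String[] TABLE = {
        "BLRNC", // B ; _
        "LLCCC", // L ; _
        "RNRNC", // R ; _
        "NNCCC", // N ; _
        "CCCCC", // C ; _
    };

    static Atomicity seq(Atomicity a, Atomicity b) {
        char result = TABLE[a.ordinal()].charAt(b.ordinal());
        return Atomicity.valueOf(String.valueOf(result));
    }

    // Composes left to right; B is the identity of the table.
    static Atomicity seqAll(Atomicity... steps) {
        Atomicity acc = Atomicity.B;
        for (Atomicity step : steps) acc = seq(acc, step);
        return acc;
    }

    public static void main(String[] args) {
        // acquire; read variable; write variable; release
        System.out.println(seqAll(Atomicity.R, Atomicity.B, Atomicity.B, Atomicity.L));
        // an atomic call followed by acquire; write variable; release
        System.out.println(seqAll(Atomicity.N, Atomicity.R, Atomicity.B, Atomicity.L));
    }
}
```

A body of the form acquire, protected accesses, release composes to N, while placing any N step before a later acquire yields C, because N followed by R is not reducible.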
Bank account
    int balance; /*# guarded_by this */

    /*# atomicity N */
    void withdraw(int x) {
        int r = read();      // N
        acquire(this);       // R
        balance = r - x;     // B
        release(this);       // L
    }

    /*# atomicity N */
    void deposit(int x) {
        acquire(this);       // R
        int r = balance;     // B
        balance = r + x;     // B
        release(this);       // L
    }

    /*# atomicity N */
    int read() {
        int r;
        acquire(this);       // R
        r = balance;         // B
        release(this);       // L
        return r;            // B
    }
By the composition table, deposit (R;B;B;L) and read (R;B;L;B) compose to N, but withdraw (N;R;B;L) composes to C and so does not meet its declared atomicity N.
Bank account
    int balance; /*# guarded_by this */

    /*# atomicity N */
    void deposit(int x) {
        acquire(this);       // R
        int r = balance;     // B
        balance = r + x;     // B
        release(this);       // L
    }

    /*# atomicity N */
    int read() {
        int r;
        acquire(this);       // R
        r = balance;         // B
        release(this);       // L
        return r;            // B
    }

    /*# atomicity N */
    void withdraw(int x) {
        acquire(this);       // R
        int r = balance;     // B
        balance = r - x;     // B
        release(this);       // L
    }
With the read moved inside the lock, all three methods compose to N.
Soundness theorem • Suppose a non-serialized execution of a well-typed program reaches state S in which no thread is executing an atomic method • Then there is a serialized execution of the program that also reaches S
Atomicity checker for Java
• Leverage Race Condition Checker to check that protecting lock is held when variables are accessed
• Found several atomicity violations
  • java.lang.StringBuffer
  • java.lang.String
  • java.net.URL
“String buffers are atomic” “String buffers are safe for use by multiple threads. The methods are synchronized so that all the operations on any particular instance behave as if they occur in some serial order that is consistent with the order of the method calls made by each of the individual threads involved.”
More work on atomicity checking
• Dynamic analysis (Wang-Stoller 03, Flanagan-Freund 04)
• Model checking (Robby et al. 04)
So far… • Atomicity as a lightweight and checkable specification
Now…
• Atomicity for precise and efficient analysis of concurrent programs
Why is precise analysis of concurrent programs difficult?
The problem
• Given a concurrent boolean program with assertions, does the program ever go wrong by failing an assertion?
• Abstract interpretation
  • Cousot-Cousot 77, Graf-Saidi 97
  • unbounded data abstracted to boolean data
Concurrent boolean program without procedures
• k = size of CFG of the program
• n = number of threads
• Need to analyze all interleavings of various threads
  • complexity proportional to k^n
Concurrent boolean program with procedures • Ramalingam 00: The problem is undecidable, even with only two threads • two unbounded stacks • reduction from the undecidable problem “Is the intersection of two context-free languages empty?”
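Atomic blocks to the rescue!
[Figure: an execution R* ... N ... L* from S0 to S5 interleaved with other threads' steps x and Y, and the equivalent execution in which R*; N; L* runs without interruption]
• Lipton: any sequence (R+B)*; (N+ε); (L+B)* is an atomic block
• Other threads need not be scheduled in the middle of an atomic block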
First idea • Infer maximal atomic blocks
Concurrent boolean program without procedures
• k = size of CFG of the program
• n = number of threads
• a = size of CFG of inferred atomic blocks
• Need to analyze all interleavings of various threads but only at atomic block boundaries
  • complexity proportional to (k/a)^n
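As a worked instance with purely illustrative numbers (k = 100, n = 3 and a = 10 are assumptions, not figures from the talk), the two bounds compare as follows:

```latex
\[
k^{n} = 100^{3} = 10^{6}
\qquad\text{versus}\qquad
\left(\frac{k}{a}\right)^{n} = \left(\frac{100}{10}\right)^{3} = 10^{3}
\]
```

The saving factor is a^n, so it grows with both the size of the inferred atomic blocks and the number of threads.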
Second idea • Summarize inferred atomic blocks • Inspired by summarization of procedures in sequential programs
Summarization for sequential programs (Sharir-Pnueli 81, Reps-Horwitz-Sagiv 95)
    int x;

    void incr_by_2() {
        x++;
        x++;
    }

    void main() {
        …
        x = 0; incr_by_2();
        …
        x = 0; incr_by_2();
        …
        x = 1; incr_by_2();
        …
    }
Summary of incr_by_2:
    x | x’
    0 | 2
    1 | 3
• Bebop, ESP, Moped, MC, Prefix, …
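The idea can be sketched as a cache from a procedure's input state to its output state, so that the body is analyzed once per distinct input rather than once per call site. The class below is an illustrative assumption (it tracks a single integer x, not a boolean program) and is not how Bebop or the other listed tools are implemented.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical summary cache for the procedure incr_by_2.
class ProcedureSummary {
    private final Map<Integer, Integer> summary = new HashMap<>();
    private int bodyAnalyses = 0;

    // Stands in for analyzing the body of incr_by_2 on input x.
    private int analyzeBody(int x) {
        bodyAnalyses++;
        x++;
        x++;
        return x;
    }

    // Reuses a summary entry if this input has been seen before.
    int call(int x) {
        Integer cached = summary.get(x);
        if (cached == null) {
            cached = analyzeBody(x);
            summary.put(x, cached);
        }
        return cached;
    }

    int bodyAnalyses() {
        return bodyAnalyses;
    }

    public static void main(String[] args) {
        ProcedureSummary incrBy2 = new ProcedureSummary();
        System.out.println(incrBy2.call(0));
        System.out.println(incrBy2.call(0));
        System.out.println(incrBy2.call(1));
        System.out.println(incrBy2.bodyAnalyses());
    }
}
```

For the three call sites in main, the summary ends up with the two entries of the table (0 to 2 and 1 to 3), and the body is analyzed twice rather than three times.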
Assertion checking for sequential programs
• Given a sequential boolean program with assertions, does the program ever go wrong by failing an assertion?
• Boolean program with:
  • g = number of global vars
  • l = max. number of local vars in any scope
  • k = size of the CFG of the program
• Complexity is O(k · 2^O(g+l)), linear in the size of CFG
• Summarization enables termination in the presence of recursion
Summarization in concurrent programs
[Figure: Call P ... Return P]
• Unarticulated so far
• Naïve extension of summaries for sequential programs does not work
Second idea • Summarize inferred atomic blocks • Summary of procedure = summary of constituent atomic blocks • Often procedure is single atomic block • In Atomizer benchmarks (Flanagan-Freund 04), majority of procedures are atomic
Concurrent boolean program with procedures • Ramalingam 00: The problem is undecidable, even with only two threads. • Qadeer-Rajamani-Rehof (POPL 04): The problem is decidable, if all recursive procedures are atomic. • For a sequential program, the whole execution is an atomic block • Algorithm behaves exactly like classic interprocedural dataflow analysis (Sharir-Pnueli 81)
Model checker for concurrent software • Implementation of atomic block inference and summarization • Applications • Concurrent systems code, e.g., device drivers • Web services • Spec# (C# + specifications)
Atomicity as a language primitive • First proposed in the 70s • Tony Hoare • David Lomet • Hardware implementation • Rajwar-Goodman 02 • Software implementation • Herlihy-Luchangco-Moir-Scherer 03 • Harris-Fraser 03 • Welc-Jagannathan-Hosking 04 • MIT (Martin Rinard’s group)
Conclusions • Atomicity is a useful concept for analyzing concurrent programs • Lightweight specification • Simplifies formal and informal reasoning • Enables precise and efficient analysis • Perhaps the right synchronization primitive for future concurrent languages?