Goldilocks: A Transaction and Race-Aware Runtime for Java. Serdar Tasiran, Koç University, İstanbul, Turkey. Tayfun Elmas, Koç University, İstanbul, Turkey. Shaz Qadeer, Microsoft Research, Redmond, WA. Dagstuhl Seminar, January 3, 2007.
Concurrency will become more important because many, many cores in a single chip are coming • The VIEW and RAMP projects (Berkeley, Stanford, MIT, CMU, UW, UT Austin, processor companies, ...) • What will the future processor chip look like? • From white papers and web sites of these projects: • “Conventional wisdom is now to double the number of cores on a chip with each silicon generation.” • “The target should be 1000s of cores per chip, as this hardware is the most efficient in MIPS per watt, MIPS per area of silicon, and MIPS per development dollar.” • “To maximize programmer productivity, programming models should be independent of the number of processors.” • “To maximize application efficiency, programming models should support a wide range of data types and successful models of parallelism: data-level parallelism, independent task parallelism, and instruction-level parallelism.”
More cores ⇒ more concurrency • Programmers will be forced to make more intensive use of concurrency • More potential for concurrency-related errors • Transactions will not solve all concurrency-related problems • "Offline concurrency", i.e. concurrency that spans transactions, will always be needed [Martin Fowler, Patterns of Enterprise Application Architecture] • Other concurrent programming mechanisms will co-exist with transactions • Concurrency errors: • Non-deterministic, difficult to reproduce and diagnose • BUT: More computational power available for runtime checks! • Static pre-elimination of concurrency errors is expensive, sometimes not possible • Deployed code will contain concurrency bugs
Runtime Exceptions for Concurrency Errors • Research direction: Runtimes that • detect concurrency-related errors • race conditions, atomicity violations, refinement violations • bring errors to the attention of the programmer: • ConcurrencyException (a special kind of RuntimeException) • Programmer explicitly specifies what to do when a concurrency error is detected • Not as straightforward to define and implement as NullPointerException or ArrayIndexOutOfBoundsException
Our recent focus • Race conditions: Often symptomatic of concurrency-related errors • We built a runtime that provides a precise DataRaceException: • like NullPointerException, ArrayIndexOutOfBoundsException • thrown when a data race is about to happen
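A minimal sketch of what "the programmer specifies what to do" could look like in application code. Everything here is an assumption for illustration: the slides name DataRaceException but do not show its declaration, so the class below is a hypothetical stand-in, and racyRead merely simulates the runtime throwing at an access that is about to race.

```java
// Hypothetical stand-in: the slides name DataRaceException but do not
// show how the runtime declares it.
class DataRaceException extends RuntimeException {
    DataRaceException(String message) {
        super(message);
    }
}

class DataRaceHandlingSketch {
    // Simulates a shared-field read. A race-aware runtime would throw at
    // the access itself; here the flag stands in for the runtime's decision.
    static int racyRead(boolean raceDetected) {
        if (raceDetected) {
            throw new DataRaceException("unsynchronized access to a shared field");
        }
        return 42;
    }

    // The programmer chooses the recovery policy, exactly as with any
    // other RuntimeException.
    static int readWithFallback(boolean raceDetected) {
        try {
            return racyRead(raceDetected);
        } catch (DataRaceException e) {
            // Illustrative policy: return a sentinel. Real code might log,
            // retry under a lock, or shut the component down gracefully.
            return -1;
        }
    }
}
```

The point of the sketch is only that the race surfaces at a well-defined program point, so ordinary try/catch structure applies.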
Why do we check for races at runtime? • Precise static detection is undecidable • Conservative checks require manual effort to rule out false alarms • Not possible for open programs • The Java Memory Model [Manson et al., POPL ’05]: • Correctly synchronized programs = Race-free programs • Race-freedom ⇒ Program runs with sequentially-consistent semantics • Semantics for race-y programs: Complicated, causality-based definition • Only for compiler and virtual machine designers • DataRaceException: Warning to programmer about poor synchronization
APACHE FTP SERVER CONNECTION HANDLING

Connection thread: INITIALIZE THE CONNECTION, then while (connection is alive) { READ NEXT REQUEST; PERFORM REQUESTED ACTION (USING REQUEST HANDLER'S FIELDS) }

    public void run() {
        ...
        // dereference m_request, m_writer, m_reader
        // and m_controlSocket to initialize the connection
        ...
        while (!m_isConnectionClosed) {
            String commandLine = m_reader.readLine();
            if (commandLine == null) {
                break;
            }
            commandLine = commandLine.trim();
            if (commandLine.equals("")) {
                continue;
            }
            m_request.parse(commandLine);
            if (!hasPermission()) {
                m_writer.send(530, "permission", null);
                continue;
            }
            // execute command
            service(m_request, m_writer);
        }
    }

Close thread: CLOSE CONNECTION, then CLEAR FIELDS (SET FIELDS TO NULL)

    public void close() {
        // check whether already closed or not
        synchronized (this) {
            if (m_isConnectionClosed)
                return;
            m_isConnectionClosed = true;
        }
        ...
        m_request = null;
        m_writer = null;
        m_reader = null;
        m_controlSocket = null;
        ...
    }

Shared fields: m_request, m_writer, m_reader and m_controlSocket are shared between the two threads.
BUG MANIFESTATION
1. Connection thread, in run(): dereferences m_request, m_writer, m_reader and m_controlSocket to initialize the connection, then evaluates while (!m_isConnectionClosed) and detects that the connection is alive.
2. Close thread, in close(): inside synchronized (this), sees that m_isConnectionClosed is false and sets it to true, then sets m_request, m_writer, m_reader and m_controlSocket to null.
3. Connection thread, still inside the loop body (m_reader.readLine(), m_request.parse(commandLine), m_writer.send(530, "permission", null), service(m_request, m_writer)): dereferences a field that is now null.
Result: NullPointerException
A Precise, Efficient Race-Detection Algorithm • To implement DataRaceException • Need a precise race-detection algorithm • False alarms not acceptable in this context • Goldilocks: Efficiently Computing the Happens-Before Relation Using Locksets [Elmas, Qadeer, Tasiran, FATES/RV ’06]
The Goldilocks algorithm [FATES/RV ’06] • Novel lockset-based characterization of the happens-before relation • As efficient as other lockset algorithms • As precise as vector clocks • Uniformly captures all synchronization disciplines • Our locksets contain locks, volatile variables, thread ids • Theorem: When thread t accesses variable d, there is no race iff the lockset of d at that point contains t • Sound: Detects all apparent races that occur in the execution • Precise: Race reported ⇒ Two accesses not ordered by happens-before • No false alarms, no alarms about potential races
Goldilocks intuition • LS: (Variables) → ℘(Threads ∪ Locks ∪ Volatiles) • Update rules maintain invariants: • Thread t ∈ LS(d) ⇒ t is owner of d • Accesses to d by t are race-free • Lock l ∈ LS(d) ⇒ acquire l to become owner of d • Volatile v ∈ LS(d) ⇒ read v to become owner of d • When t accesses d: Race-free iff (t ∈ LS(d)) • After t accesses d: LS(d) = { t } • t is the only owner of d • Other threads: Must synchronize with t • In order to become an owner of d
Lockset update rules • Ownership transfer between threads • LS(d) grows through synchronization actions • release(l) by t: For each variable d: (t ∈ LS(d)) ⇒ (add l to LS(d)) • acquire(l) by t: For each variable d: (l ∈ LS(d)) ⇒ (add t to LS(d)) • volatile-write(v) by t: For each variable d: (t ∈ LS(d)) ⇒ (add v to LS(d)) • volatile-read(v) by t: For each variable d: (v ∈ LS(d)) ⇒ (add t to LS(d)) • fork(s) by t: For each variable d: (t ∈ LS(d)) ⇒ (add s to LS(d)) • join(s) by t: For each variable d: (s ∈ LS(d)) ⇒ (add t to LS(d))
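The six update rules share one shape: if some element is already in LS(d), add another element. The sketch below is a naive, eager rendering of the rules in Java, not the authors' implementation. The class name, the use of strings for thread ids, locks and volatiles (assumed to be pairwise distinct names), and treating a variable's first access as race-free are all assumptions made for illustration.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Naive, eager sketch of the lockset update rules. Names are hypothetical.
class GoldilocksSketch {
    // LS: variable name -> set of thread ids, locks and volatiles.
    private final Map<String, Set<String>> locksets = new HashMap<>();

    // Common shape of all six rules:
    // for each variable d, if `from` is in LS(d) then add `to` to LS(d).
    private void transfer(String from, String to) {
        for (Set<String> ls : locksets.values()) {
            if (ls.contains(from)) {
                ls.add(to);
            }
        }
    }

    void release(String t, String l)       { transfer(t, l); }
    void acquire(String t, String l)       { transfer(l, t); }
    void volatileWrite(String t, String v) { transfer(t, v); }
    void volatileRead(String t, String v)  { transfer(v, t); }
    void fork(String t, String s)          { transfer(t, s); }
    void join(String t, String s)          { transfer(s, t); }

    // Access to d by t: race-free iff t is in LS(d) (or d was never
    // accessed before). Afterwards LS(d) = {t}.
    boolean access(String t, String d) {
        Set<String> ls = locksets.get(d);
        boolean raceFree = (ls == null) || ls.contains(t);
        Set<String> fresh = new HashSet<>();
        fresh.add(t);
        locksets.put(d, fresh);
        return raceFree;
    }

    // Read-only view of LS(d), empty if d was never accessed.
    Set<String> lockset(String d) {
        Set<String> ls = locksets.get(d);
        return ls == null ? Collections.<String>emptySet() : Collections.unmodifiableSet(ls);
    }
}
```

Replaying a trace in which T1 accesses o1.x and releases L1, T2 acquires L1 and L2 and then releases both, and T3 acquires L2 leaves LS(o1.x) = {T1, L1, T2, L2, T3}, so a subsequent access by T3 is race-free. Each synchronization action here scans every variable, so the cost grows with the number of tracked variables.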
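Example
class IntBox { int x; }
Global variables: a, b: IntBox; o1.x, o2.x: int
T1: a := IntBox(); b := IntBox(); acquire(L1); a.x++; release(L1)
T2: acquire(L1); acquire(L2); tmp := a; a := b; b := tmp; release(L1); release(L2)
T3: acquire(L2); b.x++; release(L2)
[Figure: initially a points to o1 and b points to o2; after T2's swap, a points to o2 and b points to o1. L1 protects the accesses in T1, L1 and L2 are both held during the swap in T2, and L2 protects the access in T3.]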
Goldilocks (statement: rule applied; resulting lockset)
T1
a := IntBox(): LS(o1.x) = ∅
b := IntBox()
acquire(L1)
a.x++: First access; LS(o1.x) = {T1}
release(L1): (T1 ∈ LS) ⇒ (add L1 to LS); LS(o1.x) = {T1, L1}
T2
acquire(L1): (L1 ∈ LS) ⇒ (add T2 to LS); LS(o1.x) = {T1, L1, T2}
acquire(L2): (L2 ∈ LS) ⇒ (add T2 to LS); LS(o1.x) = {T1, L1, T2}
tmp := a
a := b
b := tmp
release(L1): (T2 ∈ LS) ⇒ (add L1 to LS); LS(o1.x) = {T1, L1, T2}
release(L2): (T2 ∈ LS) ⇒ (add L2 to LS); LS(o1.x) = {T1, L1, T2, L2}
T3
acquire(L2): (L2 ∈ LS) ⇒ (add T3 to LS); LS(o1.x) = {T1, L1, T2, L2, T3}
b.x++: (T3 ∈ LS) ⇒ (No race); LS(o1.x) = {T3}
release(L2): (T3 ∈ LS) ⇒ (add L2 to LS); LS(o1.x) = {T3, L2}
Extending the happens-before relation to transactions • Happens-before in JMM: →hb: Transitive closure of • Program orders of threads: →p • Synchronizes-with: →sw • release(l) →sw acquire(l) • vol-write(v) →sw vol-read(v) • fork(t) →hb (action of t) • (action of t) →hb join(t) • Transactions: • Two accesses a1 and a2 are race-free if they are both within transactions • Transaction t1 happens before transaction t2 iff t1 and t2 access a common variable • Transaction implementation provides the list of accessed variables at the commit point to the race-checker/Java runtime
Extending the happens-before relation to transactions • Happens-before in JMM: →hb • Transitive closure of • Program orders of threads: →p • Synchronizes-with: →sw • release(l) →sw acquire(l) • vol-write(v) →sw vol-read(v) • fork(t) →hb (action of t) • (action of t) →hb join(t) • Extended happens-before: →ehb • Transitive closure of • JMM’s hb: →hb • Transaction t1 →ehb t2 iff • t1 and t2 access at least one common variable • Extended locksets: LS: (Variables) → ℘(Threads ∪ Locks ∪ Volatiles ∪ Data Variables ∪ {TL}) • TL: “transaction lock”
Example with transactions (statement: rule applied; resulting lockset)
T1
t1 = new Foo(): LS(o.data) = ∅
t1.data = 42: First access; LS(o.data) = {T1}
begin_tr
t1.nxt = head
head = t1
end_tr: (T1 ∈ LS) ⇒ (add {o.nxt, &head} to LS); LS(o.data) = {T1, o.nxt, &head}
T2
begin_tr
iter = head
while iter != null: iter.data = 0; iter = iter.nxt (until iter == null)
end_tr, in three steps:
({&head, o.data, o.nxt} ∩ LS ≠ ∅) ⇒ (add T2 to LS); LS(o.data) = {T1, o.nxt, &head, T2}
({TL, T2} ∩ LS ≠ ∅) ⇒ (No race); LS(o.data) = {TL, T2}
(T2 ∈ LS) ⇒ (add {&head, o.data, o.nxt} to LS); LS(o.data) = {TL, T2, &head, o.data, o.nxt}
T3
begin_tr
t3 = head
head = t3.nxt
end_tr, in two steps:
({&head, o.nxt} ∩ LS ≠ ∅) ⇒ (add T3 to LS); LS(o.data) = {TL, T2, &head, o.data, o.nxt, T3}
(T3 ∈ LS) ⇒ (add {&head, o.nxt} to LS); LS(o.data) = {TL, T2, &head, o.data, o.nxt, T3}
t3.data++: (T3 ∈ LS) ⇒ (No race); LS(o.data) = {T3}
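The transactional trace suggests a three-step rule applied when a transaction commits: an acquire-like step, a race check on the variables the transaction accessed, and a release-like step. The sketch below is one reading of that trace, not the authors' code. The class and method names, the string encoding of variables, and applying all three steps at commit are assumptions.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Sketch of extended locksets with a transaction lock TL. Hypothetical names.
class TransactionLocksetSketch {
    static final String TL = "TL"; // the "transaction lock" element

    private final Map<String, Set<String>> locksets = new HashMap<>();

    // Non-transactional access: race-free iff t is in LS(d), or d is new.
    boolean access(String t, String d) {
        Set<String> ls = locksets.get(d);
        boolean raceFree = (ls == null) || ls.contains(t);
        locksets.put(d, new HashSet<>(Collections.singletonList(t)));
        return raceFree;
    }

    // Commit of a transaction by thread t that accessed `accessed`.
    // Returns true iff none of the transactional accesses race.
    boolean commit(String t, Set<String> accessed) {
        // Step 1 (acquire-like): any lockset that mentions a variable
        // accessed by this transaction gains t.
        for (Set<String> ls : locksets.values()) {
            if (!Collections.disjoint(ls, accessed)) {
                ls.add(t);
            }
        }
        // Step 2 (check): an accessed variable is race-free iff its
        // lockset contains TL or t; afterwards its lockset is {TL, t}.
        boolean raceFree = true;
        for (String d : accessed) {
            Set<String> ls = locksets.get(d);
            if (ls != null && !ls.contains(TL) && !ls.contains(t)) {
                raceFree = false;
            }
            locksets.put(d, new HashSet<>(Arrays.asList(TL, t)));
        }
        // Step 3 (release-like): every lockset containing t gains the
        // variables accessed by this transaction.
        for (Set<String> ls : locksets.values()) {
            if (ls.contains(t)) {
                ls.addAll(accessed);
            }
        }
        return raceFree;
    }

    // Read-only view of LS(d), empty if d was never accessed.
    Set<String> lockset(String d) {
        Set<String> ls = locksets.get(d);
        return ls == null ? Collections.<String>emptySet() : Collections.unmodifiableSet(ls);
    }
}
```

Under these assumptions, replaying the list example (T1 writes o.data and commits {o.nxt, &head}; T2 commits {&head, o.data, o.nxt}; T3 commits {&head, o.nxt} and then accesses o.data outside a transaction) reports no race, and a thread that never synchronized with T3 is flagged when it touches o.data.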
Implementation in Kaffe JVM • Naive implementation too inefficient • acquire(l) by thread t: For each variable d: (l ∈ LS(d)) ⇒ (add t to LS(d)) • Implementation features • Implicit, shared representation of locksets • Use temporary locksets only at access • Lazy evaluation of locksets • Apply update rules only at variable access • Keep synchronization actions in a global event list • Order of events consistent with →p and →sw • Short-circuit checks before lockset computation • Handle thread-locality, same lock protects object twice in a row, ...
[Figure: global event list with entries (T1, acquire, l), (T2, vol-write, v), (T1, release, l), (T2, acquire, l), (T1, vol-read, v), (T2, release, l); variables x and y are drawn beside the list.]
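A rough sketch of the lazy scheme described above, under simplifying assumptions that are not taken from the Kaffe implementation: because LS(d) = {t} right after an access, each variable only needs to remember its last accessor and its position in the global event list, and a temporary lockset is rebuilt by replaying later events when the variable is accessed again. All names are hypothetical, only lock events are shown, and the event list is never trimmed.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Lazy lockset evaluation over a global event list. Hypothetical names.
class LazyLocksetSketch {
    // One synchronization event in the form "if from is in LS, add to".
    private static final class Event {
        final String from;
        final String to;
        Event(String from, String to) {
            this.from = from;
            this.to = to;
        }
    }

    private final List<Event> events = new ArrayList<>();
    private final Map<String, String> lastAccessor = new HashMap<>();
    private final Map<String, Integer> position = new HashMap<>();

    // Synchronization actions only append to the list: O(1) each.
    void release(String t, String l) { events.add(new Event(t, l)); }
    void acquire(String t, String l) { events.add(new Event(l, t)); }

    boolean access(String t, String d) {
        String owner = lastAccessor.get(d);
        boolean raceFree;
        if (owner == null || owner.equals(t)) {
            // Short-circuit: first access, or same thread as last time.
            raceFree = true;
        } else {
            // Temporary lockset: start from {owner} and replay the events
            // recorded since the previous access, stopping once t is found.
            Set<String> ls = new HashSet<>();
            ls.add(owner);
            for (int i = position.get(d); i < events.size() && !ls.contains(t); i++) {
                Event e = events.get(i);
                if (ls.contains(e.from)) {
                    ls.add(e.to);
                }
            }
            raceFree = ls.contains(t);
        }
        lastAccessor.put(d, t);
        position.put(d, events.size());
        return raceFree;
    }
}
```

The design trade-off is that synchronization actions become cheap appends, while the replay cost is paid only by accesses that fail the short-circuit checks.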
New Implementation Features • Static analysis to rule out “definitely race-free” accesses to avoid runtime checks • Using a variant of Chord [Naik et al., PLDI ’06] • More short-circuit checks • Example: Direct ownership transfer from one thread to another • Synchronization events by other threads in the middle are irrelevant
Experimental Evaluation • Goldilocks compared with other Java dynamic race detectors: • Racetrack [Yu et al., SOSP ’05] • TRaDe [Christiaens et al., ICCSS ’01] • IBM race detection tool [Choi et al., PLDI ’02] • Benchmarks • Microbenchmarks: Interesting, artificial programs • Larger programs for performance comparison • Raja, SciMark, Grande
Experiments • Goldilocks performance competitive in all examples • Significantly better in some • With careful implementation, precision and handling of transactions can be implemented efficiently