Concurrent programming for dummies (and smart people too)
Tim Harris & Keir Fraser
Example: hashtable
• Where should the locking be done?
[Figure labels: Hashtable object; Array of buckets; Chains of key,value pairs (example keys 17, 13, 11, 7, 5, 3, 2)]
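To make the question concrete, here is a minimal sketch of one possible answer: a lock per bucket rather than one lock on the hashtable object. The class and method names are hypothetical, the bucket count is fixed, and resizing is left out.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration: a fixed-size chained hashtable of int keys and values.
// Each chain is its own lock, so operations on different buckets can run in
// parallel; a single lock on the table object would serialize all of them.
final class StripedTable {
    private static final class Entry {
        final int key;
        int value;
        Entry(int key, int value) { this.key = key; this.value = value; }
    }

    private final List<List<Entry>> buckets = new ArrayList<>();

    StripedTable(int nBuckets) {
        for (int i = 0; i < nBuckets; i++) buckets.add(new ArrayList<>());
    }

    private List<Entry> chainFor(int key) {
        return buckets.get(Math.floorMod(key, buckets.size()));
    }

    void put(int key, int value) {
        List<Entry> chain = chainFor(key);
        synchronized (chain) {                 // per-bucket lock
            for (Entry e : chain) {
                if (e.key == key) { e.value = value; return; }
            }
            chain.add(new Entry(key, value));
        }
    }

    // Returns the value stored for key, or null if the key is absent.
    Integer get(int key) {
        List<Entry> chain = chainFor(key);
        synchronized (chain) {
            for (Entry e : chain) {
                if (e.key == key) return e.value;
            }
            return null;
        }
    }
}
```

Per-bucket locking only protects single-key operations. An operation that touches two keys, such as a swap, would have to take both bucket locks in a consistent order to stay atomic and avoid deadlock.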
Example: single-cell buffer
void put(int val) {
  if (this.full) wait();
  this.full = true;
  this.val = val;
  notify();
}
int get() {
  int result;
  if (!this.full) wait();
  result = this.val;
  this.full = false;
  notify();
  return result;
}
• Methods should be marked as synchronized
• ‘wait()’ can wake up spuriously so must be in a loop
• ‘notifyAll()’ should be used in place of ‘notify()’ for liveness
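Applying the three fixes listed on the slide gives something like the following sketch. The class name `Buffer` and the `throws InterruptedException` clauses are additions needed to make it compile on its own.

```java
// Single-cell buffer with the slide's three fixes applied.
final class Buffer {
    private boolean full = false;
    private int val;

    // Fix 1: synchronized, so wait()/notifyAll() are called holding the monitor.
    synchronized void put(int val) throws InterruptedException {
        while (this.full) wait();      // Fix 2: re-check the condition after every wake-up
        this.full = true;
        this.val = val;
        notifyAll();                   // Fix 3: wake all waiters, producers and consumers alike
    }

    synchronized int get() throws InterruptedException {
        while (!this.full) wait();
        int result = this.val;
        this.full = false;
        notifyAll();
        return result;
    }
}
```

With `notify()`, a producer finishing `put` could wake another blocked producer instead of a consumer, and every thread could end up waiting. `notifyAll()` avoids that at the cost of extra wake-ups.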
Conditional critical regions in Java
void put(int val) {
  atomic (!this.full) {
    this.full = true;
    this.val = val;
  }
}
int get() {
  int result;
  atomic (this.full) {
    this.full = false;
    return this.val;
  }
}
• Basic syntax: ‘atomic (cond) { statements; }’
• Execute the statements exactly once…
• …starting in a state where the condition is true
• The statements can access fields & local variables, invoke methods, instantiate objects etc.
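The ‘atomic’ keyword is a language extension, so the code above does not compile as standard Java. The sketch below emulates the same blocking semantics in plain Java with one global monitor. It is not the STM implementation described in the talk and has none of its concurrency; the helper `Ccr.atomic` and the class `CcrBuffer` are hypothetical names.

```java
import java.util.function.BooleanSupplier;
import java.util.function.IntSupplier;

// Hypothetical helper: emulates 'atomic (cond) { body }' with a single global lock.
final class Ccr {
    private static final Object GLOBAL = new Object();

    static int atomic(BooleanSupplier cond, IntSupplier body) {
        synchronized (GLOBAL) {
            boolean interrupted = false;
            // Block until the condition holds; it is re-evaluated after every wake-up.
            while (!cond.getAsBoolean()) {
                try {
                    GLOBAL.wait();
                } catch (InterruptedException e) {
                    interrupted = true;    // remember the interrupt and keep waiting
                }
            }
            try {
                return body.getAsInt();    // runs exactly once, with cond true on entry
            } finally {
                GLOBAL.notifyAll();        // other conditions may now have become true
                if (interrupted) Thread.currentThread().interrupt();
            }
        }
    }
}

final class CcrBuffer {
    private boolean full;
    private int val;

    void put(int v) {
        Ccr.atomic(() -> !full, () -> { full = true; val = v; return 0; });
    }

    int get() {
        return Ccr.atomic(() -> full, () -> { full = false; return val; });
    }
}
```

Compared with the wait/notify version, the condition and the statements are stated once, and the caller never handles monitors or wake-ups directly.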
Implementation overview
[Figure labels, in order: Source code; Bytecode + extended attributes; Software transactional memory operations; Machine code instructions]
Implementation overview (ii)
• Native STM interface:
  • Transaction management
    • void STMStartTransaction(void)
    • boolean STMCommitTransaction(void)
    • void STMAbortTransaction(void)
  • Blocking
    • void STMWait(void)
  • Data access
    • word_t STMReadValue(addr_t a)
    • void STMWriteValue(addr_t a, word_t w)
• Exposed as static methods
• Called from interpreter / JIT’d code
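The slide lists the operations but not how an atomic block maps onto them. The sketch below shows one plausible translation of the buffer's ‘atomic’ blocks into a retry loop. Everything here is an assumption for illustration: the Java interface `Stm`, its `long` addresses and words, the made-up addresses `FULL` and `VAL`, and the reading of STMWait as "abandon this attempt and block until something the transaction read may have changed". `ToyStm` is a single-threaded stand-in so the loop can be run; it is not the real STM.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical Java rendering of the slide's operations (signatures are assumptions).
interface Stm {
    void startTransaction();
    boolean commitTransaction();
    void abortTransaction();
    void waitForUpdate();                  // stands in for STMWait
    long readValue(long addr);
    void writeValue(long addr, long w);
}

// Toy single-threaded stand-in: writes are buffered and applied on commit,
// which always succeeds because nothing can conflict.
final class ToyStm implements Stm {
    private final Map<Long, Long> heap = new HashMap<>();
    private final Map<Long, Long> writes = new HashMap<>();

    public void startTransaction() { writes.clear(); }
    public boolean commitTransaction() { heap.putAll(writes); writes.clear(); return true; }
    public void abortTransaction() { writes.clear(); }
    public void waitForUpdate() {
        writes.clear();
        // With a single thread nobody can ever make the condition true.
        throw new IllegalStateException("would block forever in the toy STM");
    }
    public long readValue(long addr) {
        Long w = writes.get(addr);         // a transaction sees its own writes first
        return w != null ? w : heap.getOrDefault(addr, 0L);
    }
    public void writeValue(long addr, long w) { writes.put(addr, w); }
}

final class StmBuffer {
    private static final long FULL = 0, VAL = 1;   // made-up addresses
    private final Stm stm;

    StmBuffer(Stm stm) { this.stm = stm; }

    // atomic (!this.full) { this.full = true; this.val = val; }
    void put(long val) {
        while (true) {
            stm.startTransaction();
            if (stm.readValue(FULL) != 0) { stm.waitForUpdate(); continue; }
            stm.writeValue(FULL, 1);
            stm.writeValue(VAL, val);
            if (stm.commitTransaction()) return;   // retry if the commit fails
        }
    }

    // atomic (this.full) { this.full = false; return this.val; }
    long get() {
        while (true) {
            stm.startTransaction();
            if (stm.readValue(FULL) == 0) { stm.waitForUpdate(); continue; }
            stm.writeValue(FULL, 0);
            long result = stm.readValue(VAL);
            if (stm.commitTransaction()) return result;
        }
    }
}
```

The loop makes the "exactly once, starting in a state where the condition is true" rule visible: the condition and the statements run inside one transaction, and the effects only become visible if that transaction commits.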
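Ownership records (orecs)
[Figure: heap structure, showing data storage alongside ownership records]
Data storage: a1 = 100 (orec: version 42); a5 = 200 (orec: version 17)
Transaction descriptor: Status: ACTIVE
Proposed updates:
  a1: (100,42) -> (777,43)
  a5: (200,17) -> (888,18)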
Non-contended updates
• Acquire exclusive access to each ownership record needed
• Check that they hold the correct versions
• Set status to committed/aborted
• Make updates to the heap (if needed)
• Release ownership records, updating the versions
[Figure: the CAS operations for descriptor t1, with proposed updates a1: (100,42) -> (777,43) and a5: (200,17) -> (888,18)]
CAS: 42 → t1 and CAS: 17 → t1 (orecs of a1 and a5)
CAS: active → committed (status of t1)
CAS: t1 → 43 and CAS: t1 → 18 (orecs of a1 and a5)
Heap afterwards: a1 = 777 (version 43); a5 = 888 (version 18)
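The five steps above can be sketched in Java with the standard atomic classes. This is a simplified illustration, not the authors' implementation: it assumes one orec per heap slot, distinct addresses within a descriptor, and no helping of other transactions. All class names are hypothetical.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicLongArray;
import java.util.concurrent.atomic.AtomicReference;
import java.util.concurrent.atomic.AtomicReferenceArray;

enum Status { ACTIVE, COMMITTED, ABORTED }

// One proposed update at heap slot 'addr': (oldVal, oldVer) -> (newVal, newVer).
final class Update {
    final int addr, oldVer, newVer;
    final long oldVal, newVal;
    Update(int addr, long oldVal, int oldVer, long newVal, int newVer) {
        this.addr = addr; this.oldVal = oldVal; this.oldVer = oldVer;
        this.newVal = newVal; this.newVer = newVer;
    }
}

final class Descriptor {
    final AtomicReference<Status> status = new AtomicReference<>(Status.ACTIVE);
    final List<Update> updates;
    Descriptor(List<Update> updates) { this.updates = updates; }
}

final class OrecHeap {
    final AtomicLongArray data;
    // Each orec holds either an Integer version or the Descriptor that owns it.
    final AtomicReferenceArray<Object> orecs;

    OrecHeap(int size) {
        data = new AtomicLongArray(size);
        orecs = new AtomicReferenceArray<>(size);
        for (int i = 0; i < size; i++) orecs.set(i, Integer.valueOf(0));
    }

    void init(int addr, long val, int ver) {
        data.set(addr, val);
        orecs.set(addr, Integer.valueOf(ver));
    }

    boolean commit(Descriptor tx) {
        // Steps 1 and 2: acquire each orec by CAS from the expected version to tx.
        int acquired = 0;
        for (Update u : tx.updates) {
            Object cur = orecs.get(u.addr);
            boolean versionOk = cur instanceof Integer && (Integer) cur == u.oldVer;
            if (!versionOk || !orecs.compareAndSet(u.addr, cur, tx)) break;
            acquired++;
        }
        // Step 3: decide the outcome with a CAS on the status field.
        boolean ok = acquired == tx.updates.size();
        tx.status.compareAndSet(Status.ACTIVE, ok ? Status.COMMITTED : Status.ABORTED);
        // Step 4: write the new values to the heap, only if committed.
        if (ok) {
            for (Update u : tx.updates) data.set(u.addr, u.newVal);
        }
        // Step 5: release the orecs that were acquired, installing the new
        // version on commit or restoring the old one on abort.
        for (int i = 0; i < acquired; i++) {
            Update u = tx.updates.get(i);
            orecs.compareAndSet(u.addr, tx, Integer.valueOf(ok ? u.newVer : u.oldVer));
        }
        return ok;
    }
}
```

Running the slide's example through this sketch (a1 at slot 1, a5 at slot 5) leaves slot 1 holding 777 at version 43 and slot 5 holding 888 at version 18. A descriptor built against a stale version aborts and leaves the heap unchanged.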
Contended updates
• Simple option:
  • Spin waiting for the owner to make its updates and release
• Obstruction-free option:
  • Make updates on the owner’s behalf and then release ownership
  • Intricate: the first thread may make its updates at a later stage. Introduces an ‘active updaters’ count into each orec (details in the paper)
• Hacky option:
  • Suspend the current owning thread
  • Make their updates
  • Revoke their ownership
  • Change their PC to be outside the commit operation
  • Resume the thread
Compound swaps
[Figure: μs per operation against #CPUs (1 thread per CPU) for atomic {…}, util.concurrent and java.util; data labels on the chart: 27, 37, 17]
Compound swaps (ii)
[Figure: μs per operation against #CPUs (1 thread per CPU) for atomic {…}, util.concurrent and java.util]
Memory management
• Two problems: management of transaction descriptors & management of shared data structures reachable from them
• So-called ‘A-B-A’ problems occur in most CAS-based systems
• We’ve looked at a number of schemes:
  • Safe memory re-use (Michael)
  • Repeat offender problem (Herlihy et al.)
  • Reference counting
  • Epoch-based schemes
• In many cases we’re really allocating fresh pointers rather than needing ‘more’ memory
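For readers unfamiliar with the ‘A-B-A’ problem, the sketch below replays the interleaving in a single thread: a plain CAS cannot tell that a location went from A to B and back to A, while a CAS on a (reference, stamp) pair can. It uses the standard `AtomicReference` and `AtomicStampedReference` classes; the class `AbaDemo` and the interleaving are made up for illustration and are not one of the schemes listed on the slide.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.concurrent.atomic.AtomicStampedReference;

final class AbaDemo {
    // Returns true if a plain CAS succeeds despite an intervening A -> B -> A.
    static boolean plainCasFooled() {
        String a = "A", b = "B";
        AtomicReference<String> ref = new AtomicReference<>(a);
        String seen = ref.get();           // "thread 1" reads A, then is delayed
        ref.set(b);                        // "thread 2" changes A -> B
        ref.set(a);                        // "thread 2" changes B -> A
        return ref.compareAndSet(seen, "C");   // the changes go unnoticed
    }

    // Same interleaving, but every write also bumps a stamp.
    static boolean stampedCasFooled() {
        String a = "A", b = "B";
        AtomicStampedReference<String> ref = new AtomicStampedReference<>(a, 0);
        int[] stamp = new int[1];
        String seen = ref.get(stamp);      // reads A together with stamp 0
        ref.set(b, stamp[0] + 1);
        ref.set(a, stamp[0] + 2);
        // The reference matches but the stamp is now 2, not 0, so the CAS fails.
        return ref.compareAndSet(seen, "C", stamp[0], stamp[0] + 3);
    }
}
```

In a garbage-collected language the reference case is less dangerous than in C, because an object cannot be recycled while a thread still holds a pointer to it. That matches the slide's remark that the real need is often for fresh pointers rather than more memory.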
Memory management (II)
• Our more recent STM introduces a Hold/Release abstraction:
  • A Hold operation acquires a revocable lock on a specified location
  • A Release operation relinquishes such a lock
  • A lock is revoked by a competing hold or by another thread writing to a held location
  • Revocation is exposed by displacing the previous owner to a specified PC
• This lets us ensure only one thread at a time is working on a given transaction descriptor, so memory management is much simplified
• Software implementation using mprotect or /proc
Future directions
• Evaluation beyond synthetic benchmarks
• Try the C STM interface yourself: download under a BSD-style license from http://www.cl.cam.ac.uk/netos/lock-free
• Reflective exposure of a transactional API
  • Create, enter, leave transactions
  • Possibly enables better I/O handling
• Opportunities for new hardware instructions