190 likes | 366 Views
Transactional Memory: How to Perform Load Adaption in a Simple And Distributed Manner Johannes Schneider David Hasenfratz Roger Wattenhofer. Without easy and efficient parallel programming methods…. “computer science will become washing machine science.“.
E N D
Johannes Schneider Transactional Memory: How to Perform Load Adaption in a Simple And Distributed Manner Johannes Schneider David Hasenfratz Roger Wattenhofer
Johannes Schneider Without easy and efficient parallel programming methods… “computer science will become washing machine science.“
Johannes Schneider How to handle access to shared data? • Locks, Monitors… • Coarse grained vs. fine grained locking easy but slow program demanding, time consuming but fast programs • Problems • difficult • error prone • Composability • … Thread 1 Thread 2 lock all data modify/use data unlock all data lock A lock B modify/use A,B lock C modify/use A,B,C unlock A modify/use B,C unlock B,C lock B lock A modify/use A,B unlock A,B Only 1 thread can execute Deadlock!
Johannes Schneider Transactional memory(TM) - a possible solution Begin transaction modify/use data End transaction • Simple for the programmer • Composable • Idea from database community • Many TM systems (internally) still use locks • But the TM system (not the programmer) takes care of • Performance • Correctness (no deadlocks...) Method A.x() Begin Transaction B.y() … End Transaction Method B.y() Begin transaction … End transaction
Johannes Schneider Transactional memory systems • If transactions modify different data, everything is ok • the same data, conflicts arise that must be resolved • Transactions might get delayed or aborted • Job of a contention manager • A transaction keeps track of all modified values • It restores all values, if it is aborted • A transaction successfully finishes with a commit
Johannes Schneider Conflicts – A contention manager decides • Abort or delay a transaction, i.e. adapt load • Distributed • Each thread has its own manager • Example • Initially: A=1, B=1 Manager 1 Manager 2 Manager 2 Manager 1 Trans.1 Trans. 2 Trans. 1 Trans. 2 T1 T1 T1 B:=2 … A:=3 … … A:=2 … B:=2 … A:=3 … … A:=2 … conflict conflict Abort (undo all changes, i.e. set A:=1) and restart (after a while) Abort (set B:=1) and restart OR wait and retry Delay to adapt load!
Johannes Schneider Prior work • Contention Managers[PODC03,PODC05,ISAAC09…] • System load was not (explicitly) considered • Load adaption (based on contention) • Estimate contention intensity: CI [SPAA08] • If abort: CI = a CI + (1-a) with parameter a [0,1] • If commit: CI = a CI • If CI > parameter b then resort to central scheduler • Keep a transaction queue per core [PODC08] • Central dispatcher assigns transactions to a core, i.e. its queue • Each core iteratively executes transactions from queue • If transaction A on core 1 is aborted due to B on core 2 then A is appended to the queue of core 2 • Central scheduler will become a bottleneck D C Core 2 Core 1 A B B aborts A A D Core 2 Core 1 C B
Johannes Schneider This paper • Theoretical analysis • Decentralized (simple) approaches to load adaption • based on contention
Johannes Schneider Conflict graph A conflicted with C D conflicted with B Strategies A • Ignore: Do not learn from conflicts • ImmediateRestart • Stay real: Remember faced conflicts • SerializeFacedConflicts • Do not schedule prior conflicting transactions concurrently • Be cautious: Assume additional conflicts • SerializeAll • All transactions in a subgraph are assumed to conflict B C D B A C D B A A B D C C D
Johannes Schneider Load Adaption Strategies • AbortBackoff • If aborted wait for a random time [0,2#aborts] • Priority = number of aborts #aborts • Who wins a conflict? • 2 strategies • Estimate the work done • Unrelated to work done
Johannes Schneider Theory Part - Model • n transactions (and threads) • Start concurrently on n cores • Transaction • sequence of operations • operation takes 1 time unit • duration (number of operations) tTis fixed • 2 types of operations • Write = modify (shared) resource and lock it until commit • Compute/abort/commit • Ignore overhead of load adaption • Remembering transactions, scheduling… Core 1 Core 2 Core n … A Z B A
Johannes Schneider Moderate parallelism • Shared counter • Conflicts directly after transaction start • Linked List • Conflicts at arbitrary time • Expected time span until all transactions committed • Speed-up log n (at best) Transaction run time #transactions
Johannes Schneider Substantial parallelism T1 • Worst case • Conflict graph is d-ary tree of logarithmic height • Exponential gap in worst case • SerializeAll and others T2 T3 … T5 T4
Johannes Schneider Practical investigation • Remembering conflicts causes too much overhead • Good for analysis but not for implementation • Quickadapter • Serializes transactions • Each core has a “waiting” flag • If aborted, set flag and wait until flag unset • If commit, unset some flag • AbortBackOff • (Also considered some variants)
Johannes Schneider Practical investigation • Evaluation on 16 core machine • DSTM2 system • Visible readers • Six benchmarks • Little parallelism • Shared counter, Sorted List (accessed objects not released), Listcounter • Considerable parallelism • Red Black Tree, LFUCache, RandomAccessArray • Compare new load adaption policies to existing contention managers
Johannes Schneider Discussion • Hard to keep maximum throughput, also in [SPAA08, PODC08] • Even without conflicts • Improvement for 1 benchmark worsens another • On average better than schemes without load adaption
Johannes Schneider Conclusion • Simple and distributed load adaption strategies • Theory • (For now) constants and parameters matter a lot • Practice • Hard to keep load at peak for all usage patterns
Johannes Schneider Thanks for your attention! Questions? ??? \vspace{10pt}
Johannes Schneider Analysis AbortBackoff for counter • Recall: If aborted wait for a random time [0,2#aborts] • Assume #aborts ~ log (ntT) + x (for some x) • Define: a(x) := fraction of active nodes • a(0) = 1 (after time ~2log (ntT) = ntTa constant fraction still active) • Chance conflict for interval [0,2#aborts] Interval [0, 2log(ntT)+x ] ~ a(x) ntT/ 2log (ntT) +x = a(x) /2x • a(x+1) = a(x)/2x = 1/2∑i=0..x i~ 1/2x2 • a(√log n) = 1/2(√log n)2 = 1/n • ∑i=0.. log (ntT) +√log n length interval = ∑i=0.. .. log (ntT) +√log n 2i = ntT 2√log n+1 T3 T2 T1 a(x)ntT= 3/n ntT= 3tT