CAR-STM: Scheduling-based Collision Avoidance and Reduction for Software Transactional Memory. Shlomi Dolev, Danny Hendler and Adi Suissa. PODC 2008.
CAR-STM: rationale • “transaction ignorant” thread scheduling problematic • TM scheduler handles transactional threads • This permits: • Serializing contention management • Proactive collision avoidance
“Conventional” STM system high-level structure • Application threads are controlled by the OS scheduler • The TM system performs contention detection • On a conflict, the contention manager arbitrates: the transaction either proceeds or must abort/retry/wait
CAR-STM's distinctive features • Serializing contention management: serialize the execution of colliding transactions • Proactive collision avoidance: proactively assign a transaction thread to the core with the “most conflicting” transactions, based on application-provided information
Relying on (current) OS scheduling is problematic! OS scheduling of transaction threads: • Introduces pseudo-parallelism • Hurts TM performance stability/predictability • Does not allow proactive collision avoidance and serializing CM
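CAR-STM high-level architecture • A transaction thread hands its transaction, with optional T-Info, to the Dispatcher • The Dispatcher consults the Collision Avoider • Each core (#1 … #k) has its own transaction queue (TQ), served by a dedicated TQ thread • A serializing contention manager handles collisions between transactions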
TQ-Entry structure • Each transaction queue entry holds: the wrapper method, the transaction data, the T-Info, the transaction thread, and a lock and condition variable
Transaction dispatching process • 1. The transaction thread calls the Dispatcher with an optional T-Info pointer argument • 2. The Dispatcher calls the Collision Avoider • 3. The Collision Avoider calls the app-specific conflict probability method • 4. The transaction is enqueued in the most-conflicting queue; the transaction thread is put to sleep and the TQ thread is notified
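The queue-selection step can be sketched as follows. This is a minimal illustration, not the CAR-STM implementation: `TInfo`, `Entry`, `overlap_probability` and `pick_queue` are hypothetical names, the region-overlap estimate stands in for whatever conflict probability method the application supplies, and the shortest-queue fallback for transactions without T-Info is an assumption of this sketch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TInfo:
    """Hypothetical application-provided hint: the regions a transaction touches."""
    regions: frozenset


@dataclass
class Entry:
    name: str
    t_info: Optional[TInfo] = None


def overlap_probability(a: TInfo, b: TInfo) -> float:
    """Illustrative conflict estimate: Jaccard overlap of the two region sets."""
    union = a.regions | b.regions
    return len(a.regions & b.regions) / len(union) if union else 0.0


def pick_queue(queues, entry, conflict_prob=overlap_probability) -> int:
    """Return the index of the queue whose pending entries conflict most with `entry`."""
    if entry.t_info is None:
        # No hint available: fall back to the shortest queue (assumption of this sketch).
        return min(range(len(queues)), key=lambda i: len(queues[i]))

    def score(i: int) -> float:
        # Sum the estimated conflict probability against every hinted entry in queue i.
        return sum(conflict_prob(entry.t_info, e.t_info)
                   for e in queues[i] if e.t_info is not None)

    return max(range(len(queues)), key=score)
```

Placing a transaction behind the ones it is most likely to collide with means they run one after another on the same core instead of aborting each other on different cores.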
Transaction execution (transaction queue #i on core #i) • 1. The TQ thread dequeues an entry • 2. The TQ thread executes the transaction • 3. The TQ thread wakes up the transaction thread
Dispatcher / TQ-thread synchronization • 1. When the TQ is emptied, the TQ thread goes to sleep • 2. When the Dispatcher adds a transaction, it wakes up the TQ thread
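The sleep/wake-up protocol between a transaction thread, the dispatcher and a TQ thread can be sketched with a condition variable. This is an illustrative model only: `TQEntry` and `TransactionQueue` are hypothetical names, a `threading.Event` stands in for the per-entry lock and condition variable, and pinning the TQ thread to a core is not modelled.

```python
import threading
from collections import deque


class TQEntry:
    """Queue entry: the transaction body plus what is needed to wake its thread."""

    def __init__(self, body):
        self.body = body               # wrapper method around the transaction
        self.done = threading.Event()  # stands in for the entry's lock + condition var
        self.result = None
        self.error = None


class TransactionQueue:
    def __init__(self):
        self._entries = deque()
        self._cond = threading.Condition()
        self._closed = False
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def dispatch(self, body):
        """Called by a transaction thread: enqueue, wake the TQ thread, then sleep."""
        entry = TQEntry(body)
        with self._cond:
            self._entries.append(entry)
            self._cond.notify()        # dispatcher wakes up the TQ thread
        entry.done.wait()              # transaction thread sleeps until executed
        if entry.error is not None:
            raise entry.error
        return entry.result

    def close(self):
        """Stop the TQ thread once the queue has drained."""
        with self._cond:
            self._closed = True
            self._cond.notify()
        self._worker.join()

    def _run(self):
        while True:
            with self._cond:
                while not self._entries and not self._closed:
                    self._cond.wait()  # queue emptied: TQ thread goes to sleep
                if not self._entries:
                    return             # closed and drained
                entry = self._entries.popleft()
            try:
                entry.result = entry.body()  # TQ thread executes the transaction
            except Exception as exc:
                entry.error = exc
            finally:
                entry.done.set()       # wake up the transaction thread
```

Because a single TQ thread runs every transaction in its queue, transactions dispatched to the same queue never execute concurrently with each other.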
Serializing Contention Managers • When two transactions collide, fail the newer transaction and move it to the TQ of the older one • Fast elimination of live-lock scenarios • Two SCMs implemented • Basic (BSCM): move the failed transaction to the end of the other transaction's TQ • Permanent (PSCM): make the failed transaction a subordinate transaction of the other transaction
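The basic policy can be sketched as below. This is a simplified model under stated assumptions: `Txn` and `bscm_resolve` are hypothetical names, age is represented by an integer timestamp (smaller means older), and a running transaction is assumed to stay in its queue until it commits.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Txn:
    name: str
    timestamp: int  # smaller means older
    queue_id: int   # index of the TQ currently holding the transaction


def bscm_resolve(queues, a, b):
    """Fail the newer of two colliding transactions and requeue it behind the older."""
    winner, loser = (a, b) if a.timestamp < b.timestamp else (b, a)
    if loser in queues[loser.queue_id]:
        queues[loser.queue_id].remove(loser)
    queues[winner.queue_id].append(loser)  # end of the winner's TQ
    loser.queue_id = winner.queue_id
    return winner, loser
```

Once the loser sits in the winner's queue, the same TQ thread executes both, so the two cannot collide again.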
PSCM • 1. Transactions Ta and Tb collide; Tb is older
PSCM • The losing transaction and its subordinates are made subordinates of the winning transaction
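The subordinate relation can be sketched as a list attached to each transaction. This is an illustrative model, not the RSTM-based implementation: `PTxn`, `pscm_resolve` and `execution_order` are hypothetical names, age is an integer timestamp (smaller means older), and removing the loser from its original TQ is not modelled.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PTxn:
    name: str
    timestamp: int  # smaller means older
    subordinates: List["PTxn"] = field(default_factory=list)


def pscm_resolve(a: PTxn, b: PTxn) -> PTxn:
    """Make the newer transaction, and its subordinates, subordinates of the older."""
    winner, loser = (a, b) if a.timestamp < b.timestamp else (b, a)
    winner.subordinates.append(loser)
    winner.subordinates.extend(loser.subordinates)
    loser.subordinates = []
    return winner


def execution_order(head: PTxn) -> List[str]:
    """A transaction's subordinates run after it, on the same TQ thread."""
    return [head.name] + [s.name for s in head.subordinates]
```

For example, if Tc is already a subordinate of Ta and Ta then loses to the older Tb, the resulting order is Tb, Ta, Tc.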
Experimental evaluation • Incorporated CAR-STM within RSTM • Tested on an 8-way 4 x XEON-7110M server • Serializing CM tests: workloads generated by STMBench7 [Guerraoui, Kapalka, Vitek, '07] • Proactive collision avoidance tested on a synthetic app
STMBench7 • A benchmark for STM implementations • Generates realistic workloads representative of complex, object-oriented applications • Workloads composed of 45 operation types on a shared data structure • Operation categories • Long / short traversals • Short operations • Structure modification operations
Execution time: R/W dominated workloads • Speed-up of between 1.7 and 36 • Reduction of standard deviation by a factor of up to 40
Quiescence time: a measure of live-lock • Speed-up of between 11 and 118
Throughput: write dominated workloads • Throughput increase of up to 15.7
Experimental evaluation: proactive collision avoidance • RegionedArray (RA) synthetic app (read, write, delete) • Each thread runs for 20 seconds, repeatedly: • Randomly select region • Randomly select transaction length • Randomly select operation • Transaction repeatedly applies operation to randomly-selected region item
Transactional memory Dagstuhl, June 08
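One iteration of such a workload can be sketched as follows. The names `RATransaction`, `random_transaction` and `apply_transaction`, the parameters `n_regions` and `max_length`, and the way each operation touches an item are all assumptions made for illustration; the slides give no such details, and the 20-second driver loop and the STM itself are omitted.

```python
import random
from dataclasses import dataclass

OPERATIONS = ("read", "write", "delete")


@dataclass
class RATransaction:
    region: int     # index of the randomly selected region
    length: int     # number of times the operation is applied
    operation: str  # one of OPERATIONS


def random_transaction(rng, n_regions=8, max_length=16):
    """Randomly select a region, a transaction length and an operation."""
    return RATransaction(
        region=rng.randrange(n_regions),
        length=rng.randint(1, max_length),
        operation=rng.choice(OPERATIONS),
    )


def apply_transaction(tx, regions, rng):
    """Apply tx.operation tx.length times to randomly selected items of the region."""
    region = regions[tx.region]
    for _ in range(tx.length):
        i = rng.randrange(len(region))
        if tx.operation == "read":
            _ = region[i]
        elif tx.operation == "write":
            region[i] = rng.random()
        else:  # "delete": modelled here as clearing the slot
            region[i] = None
```

In this model two transactions can only collide when they select the same region, which is the kind of information an application could pass to the collision avoider as T-Info.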
Experimental results
Most relevant prior art • [Yoo, Lee, 2008]: Adaptive transaction scheduling for TM systems • [Bai, Shen, Zhang, Scherer, Ding, Scott]: A key-based adaptive TM executor
Conclusions • Transaction-ignorant scheduling is problematic • Serializing contention management eliminates live-lock STM behavior • The contribution of proactive collision avoidance is application-dependent • Some future work directions • Robust scheduling • Transaction-aware OS scheduling • Better handling of page faults, local data access, …