Adaptive Software Transactional Memory by Virendra J. Marathe William N. Scherer III Michael L. Scott University of Rochester
Talk Outline • Software Transactional Memory (STM) • Design space of STMs • Explored four design dimensions • Adaptive STM • Adapts behavior in two dimensions • Based on underlying workload • Experimental Results • Conclusion
Transactional Memory • Multi-core architectures are the future • Concurrent programming will become common • Lock-based programming is hard • Transactional Memory (TM) simplifies concurrent programming significantly
Software Transactional Memory (STM) • Transaction: A sequence of instructions, executed atomically • In STM, transactions run in software • Requires hardware support for simple atomic instructions (e.g. compare-and-swap) • Examples: DSTM (Herlihy et al.) and OSTM (Fraser and Harris)
STM Mechanics • Transactional Memory Objects, TMObjects (courtesy DSTM), as wrappers of shared objects • Transactions – • ACTIVE, ABORTED, or COMMITTED state • Open TMObjects (read or write mode) • Acquire TMObjects to be changed • Atomically commit updates • Abort contenders • Read Sharing: No TMObject acquisition • Optimization: Release opened TMObjects early
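The transaction states listed above can be sketched as a single status word that changes only through compare-and-swap. This is a minimal illustration, not code from DSTM or ASTM; `TxStatus`, `TxDescriptor`, and their methods are hypothetical names. It shows why a commit and a contender's abort cannot both succeed: both are CAS operations out of ACTIVE.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical names; a sketch of the status word only.
enum TxStatus { ACTIVE, ABORTED, COMMITTED }

final class TxDescriptor {
    private final AtomicReference<TxStatus> status =
            new AtomicReference<TxStatus>(TxStatus.ACTIVE);

    TxStatus status() { return status.get(); }

    // Atomic commit: succeeds only if the transaction is still ACTIVE.
    boolean commit() {
        return status.compareAndSet(TxStatus.ACTIVE, TxStatus.COMMITTED);
    }

    // Abort, called by the transaction itself or by a contender.
    boolean abort() {
        return status.compareAndSet(TxStatus.ACTIVE, TxStatus.ABORTED);
    }
}

final class TxStatusDemo {
    public static void main(String[] args) {
        TxDescriptor winner = new TxDescriptor();
        TxDescriptor loser = new TxDescriptor();
        loser.abort();                        // a contender aborts it first
        System.out.println(winner.commit());  // true
        System.out.println(loser.commit());   // false: already ABORTED
    }
}
```

Because every update made by a transaction becomes visible through this one status change, a single successful CAS commits all of them at once.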
STM Design Space - I • Several design dimensions of recently proposed STMs • Acquire Semantics – when to acquire? • Eager Acquire – acquire TMObjects at open time • Early detection of conflicts • Lazy Acquire – acquire TMObjects at commit time • Reduce window of contention • Incurs extra bookkeeping and transaction validation cost
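The difference between the two acquire semantics is only a matter of when ownership is claimed. The toy sketch below assumes a simplified ownership model (an owner field claimed by CAS and cleared at commit) and omits contention management, data versions, and the read validation that lazy acquire needs. `OwnedCell` and `SketchTx` are hypothetical names, not part of any STM discussed here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical stand-in for a TMObject: it only tracks its current owner.
final class OwnedCell {
    final AtomicReference<Object> owner = new AtomicReference<Object>(null);
}

final class SketchTx {
    private final boolean eager;
    private final List<OwnedCell> pending = new ArrayList<OwnedCell>(); // lazy write set
    private final List<OwnedCell> owned = new ArrayList<OwnedCell>();

    SketchTx(boolean eager) { this.eager = eager; }

    private boolean acquire(OwnedCell c) {
        if (c.owner.compareAndSet(null, this)) { owned.add(c); return true; }
        return c.owner.get() == this;   // already ours, otherwise a conflict
    }

    // Eager: acquire now, so conflicts surface at open time.
    // Lazy: only remember the object; acquisition waits until commit.
    boolean openForWrite(OwnedCell c) {
        if (eager) return acquire(c);
        pending.add(c);
        return true;
    }

    // Lazy transactions acquire their whole write set here.
    boolean commit() {
        boolean ok = true;
        for (OwnedCell c : pending) {
            if (!acquire(c)) { ok = false; break; }
        }
        for (OwnedCell c : owned) c.owner.compareAndSet(this, null);
        owned.clear();
        pending.clear();
        return ok;
    }
}

final class AcquireDemo {
    public static void main(String[] args) {
        OwnedCell cell = new OwnedCell();
        SketchTx a = new SketchTx(true);
        SketchTx b = new SketchTx(true);
        SketchTx c = new SketchTx(false);
        System.out.println(a.openForWrite(cell)); // true: acquired at open
        System.out.println(b.openForWrite(cell)); // false: conflict seen early
        System.out.println(c.openForWrite(cell)); // true: nothing acquired yet
        System.out.println(c.commit());           // false: conflict seen at commit
    }
}
```

In this sketch the lazy transaction holds the object only for the duration of `commit()`, which is the reduced window of contention; the price in a real STM is the extra bookkeeping of the write set and validation.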
STM Design Space - II • Object Referencing Style: Direct and Indirect [Figure: Direct Referencing, where the TMObject points to the Data; Indirect Referencing, where the TMObject points to an Indirection Object (used to acquire objects), which points to the Old Data and New Data]
STM Design Space - III • Metadata Structure: What does an acquired TMObject look like? [Figure: Per TMObject Metadata, where TMObject1 and TMObject2 each point to their own Indirection object, which references the Transaction and the New & Old Data of that TMObject; Per Transaction Metadata, where TMObject1 and TMObject2 point to the Transaction, which holds the New & Old Data of both TMObjects]
STM Design Space - III • Metadata Structure: What does an acquired TMObject look like? • Per Transaction Metadata • TMObject lookup is expensive • Need to release acquired objects • Release requires N extra CASes [Figure: TMObject1 and TMObject2 point to the Transaction, which holds the New & Old Data of both TMObjects]
STM Design Space - III • Metadata Structure: What does an acquired TMObject look like? • Per TMObject Metadata • Need not release acquired objects • Extra indirection overhead if acquired objects are not released [Figure: TMObject1 and TMObject2 each point to their own Indirection object, which references the Transaction and the New & Old Data of that TMObject]
STM Design Space - IV • Progress Conditions • Obstruction freedom: Transactions make progress in isolation • Admits livelocks, needs contention management • Lock freedom: At least one transaction makes progress • Requires additional overhead, e.g. ordering, helping
STM Design Space: The Big Picture • White areas have nothing in particular to recommend them • Grey areas seem like distinctly bad ideas • ASTM adapts across entire quadrant [Figure: grid of the design space with axes obstruction-free / lock-free, indirect / direct referencing, per-transaction / per-object metadata, and eager / lazy acquire; DSTM, ASTM, and OSTM are placed on the grid]
Adaptive STM Design Choices • Obstruction freedom • Simple and efficient • Per object metadata • Reduced-cost object cleanup • Adaptive object referencing style • Direct ↔ Indirect • Adaptive acquire semantics • Eager ↔ Lazy
Adaptation I: Object Acquisition • Direct access for reader transactions [Figure: the TMObject points directly to the Data; an ACTIVE Writer Transaction is shown alongside]
Adaptation I: Object Acquisition [Figure: the ACTIVE Writer Transaction copies the Data into a Data Replica and prepares a DSTM-style Indirection Object with Transaction, Old Version, and New Version fields]
Adaptation I: Object Acquisition • Direct to Indirect Object Referencing Style Transition [Figure: a CAS on the TMObject installs the Indirection Object (Transaction, Old Version, New Version), which references the Data and the Data Replica; the Writer Transaction is shown as ACTIVE and then COMMITTED]
Adaptation I: Reading an Acquired Object • Indirection in accessing the data object [Figure: an ACTIVE Reader Transaction follows the TMObject to the Indirection Object (Transaction, Old Version, New Version), whose owner Transaction is ACTIVE or COMMITTED, and from there to the Old Data or New Data]
Adaptation I: Reading an Acquired Object • Indirection in accessing the data object [Figure: an ACTIVE Reader Transaction follows the TMObject to the Indirection Object (Transaction, Old Version, New Version), whose owner Transaction is ACTIVE or COMMITTED, and from there to the Old Data or New Data; a CAS is shown on the TMObject]
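The two transitions in Adaptation I can be sketched with one atomic reference that holds either the data itself (direct) or an indirection object (indirect). This is a simplified reading of the slides, not the authors' code. All class and method names are hypothetical, the writer passes in its modified copy instead of cloning, and a conflict with an ACTIVE owner simply fails where a real system would consult a contention manager. The reader's CAS back to a direct reference is an assumption about what the CAS in the last figure does.

```java
import java.util.concurrent.atomic.AtomicReference;

enum OwnerStatus { ACTIVE, ABORTED, COMMITTED }

final class OwnerTx {
    final AtomicReference<OwnerStatus> status =
            new AtomicReference<OwnerStatus>(OwnerStatus.ACTIVE);
}

// Indirection object: owner transaction plus old and new versions.
final class Locator {
    final OwnerTx owner;
    final Object oldData;
    final Object newData;
    Locator(OwnerTx owner, Object oldData, Object newData) {
        this.owner = owner; this.oldData = oldData; this.newData = newData;
    }
}

final class AdaptiveTMObject {
    // Holds either the data (direct) or a Locator (indirect).
    private final AtomicReference<Object> ref;

    AdaptiveTMObject(Object data) { ref = new AtomicReference<Object>(data); }

    boolean isDirect() { return !(ref.get() instanceof Locator); }

    // Writer: switch from direct to indirect with a single CAS.
    boolean acquire(OwnerTx writer, Object newVersion) {
        Object seen = ref.get();
        Object oldVersion = seen;
        if (seen instanceof Locator) {
            Locator loc = (Locator) seen;
            OwnerStatus s = loc.owner.status.get();
            if (s == OwnerStatus.ACTIVE) return false;  // conflict, not handled here
            oldVersion = (s == OwnerStatus.COMMITTED) ? loc.newData : loc.oldData;
        }
        return ref.compareAndSet(seen, new Locator(writer, oldVersion, newVersion));
    }

    // Reader: resolve the valid version; if the owner has finished,
    // try to restore a direct reference (a failed CAS is harmless).
    Object read() {
        Object seen = ref.get();
        if (!(seen instanceof Locator)) return seen;
        Locator loc = (Locator) seen;
        OwnerStatus s = loc.owner.status.get();   // read the status once
        Object current = (s == OwnerStatus.COMMITTED) ? loc.newData : loc.oldData;
        if (s != OwnerStatus.ACTIVE) ref.compareAndSet(seen, current);
        return current;
    }
}

final class AdaptiveRefDemo {
    public static void main(String[] args) {
        AdaptiveTMObject obj = new AdaptiveTMObject("v1");
        OwnerTx writer = new OwnerTx();
        obj.acquire(writer, "v2");
        System.out.println(obj.read() + " direct=" + obj.isDirect()); // v1 direct=false
        writer.status.compareAndSet(OwnerStatus.ACTIVE, OwnerStatus.COMMITTED);
        System.out.println(obj.read() + " direct=" + obj.isDirect()); // v2 direct=true
    }
}
```

Once the reference is direct again, later readers skip the indirection object entirely, which is the saving the read-dominated results attribute to ASTM.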
Adaptation II: Acquire Semantics • Eager acquire usually wins over lazy acquire • Exception: transactions that intersperse many early releases between writes • Window of contention larger with eager acquire than with lazy acquire semantics
Adaptation II: Adaptive Acquire • An acquire adaptation heuristic: • Defaults to eager acquire • Maintains a history of transactional accesses • If there is a large number of early releases between writes, use lazy acquire the next time
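One way to read the heuristic above is as a small per-thread history object that is updated on each write open and each early release. The sketch below is an assumption-laden illustration: the talk does not say how "large number" is measured, so the `threshold` parameter, the choice to track the maximum count between two consecutive writes, and the `AcquireHistory` class itself are all hypothetical.

```java
final class AcquireHistory {
    enum Mode { EAGER, LAZY }

    private final int threshold;          // hypothetical tuning knob, not from the talk
    private Mode nextMode = Mode.EAGER;   // eager acquire is the default
    private boolean sawWrite = false;
    private int releasesSinceWrite = 0;
    private int maxReleasesBetweenWrites = 0;

    AcquireHistory(int threshold) { this.threshold = threshold; }

    Mode modeForNextTransaction() { return nextMode; }

    // A write open closes the current run of early releases.
    void onOpenForWrite() {
        if (sawWrite) {
            maxReleasesBetweenWrites =
                    Math.max(maxReleasesBetweenWrites, releasesSinceWrite);
        }
        sawWrite = true;
        releasesSinceWrite = 0;
    }

    // Only releases that follow a write can fall "between writes".
    void onEarlyRelease() {
        if (sawWrite) releasesSinceWrite++;
    }

    // Decide the mode for the next transaction and reset the history.
    void onTransactionEnd() {
        nextMode = (maxReleasesBetweenWrites >= threshold) ? Mode.LAZY : Mode.EAGER;
        sawWrite = false;
        releasesSinceWrite = 0;
        maxReleasesBetweenWrites = 0;
    }
}

final class AcquireHistoryDemo {
    public static void main(String[] args) {
        AcquireHistory h = new AcquireHistory(3);
        h.onOpenForWrite();
        for (int i = 0; i < 3; i++) h.onEarlyRelease();
        h.onOpenForWrite();
        h.onTransactionEnd();
        System.out.println(h.modeForNextTransaction()); // LAZY

        h.onOpenForWrite();
        h.onOpenForWrite();
        h.onTransactionEnd();
        System.out.println(h.modeForNextTransaction()); // EAGER
    }
}
```

Resetting the history after every transaction makes the policy react to the most recent transaction only; a real implementation might smooth the decision over several transactions.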
Experimental Setup • Implementation in Java 5 • 16-processor SunFire 6800, a cache coherent multiprocessor with 1.2 GHz UltraSPARC III processors • DSTM code borrowed from collaborators at Sun Labs • Local implementations of all other STMs • Several microbenchmarks to evaluate performance on different workloads
Write Dominated Workloads • IntSet – sorted set of integers • Transaction inserts or deletes node • Each object is opened in “write” mode • Eager Acquire wins by a big margin • Too much bookkeeping and validation overhead with Lazy Acquire [Plot labels: Eager STMs, Lazy STMs]
Read Dominated Workloads • RBTree – a concurrent red-black tree • Transaction inserts or deletes node • Only updated nodes opened in write mode • DSTM performs worst due to its extra level of indirection • ASTM readers eliminate this indirection [Plot labels: Other STMs, DSTM]
Lazy Acquire and Early Release • RandomGraph – a random graph • Graph implemented as a 2-D linked list • Transaction inserts or deletes node • Unaffected nodes are early released • Lazy Acquire wins • ASTM catches up with OSTM and lazyASTM [Plot labels: Lazy STMs, Eager STMs]
Conclusion • Current STM designs are workload sensitive • Adaptivity makes performance robust across different workloads • Study of four STM design dimensions • Adaptivity on two dimensions • Acquire Semantics • Object Referencing Style • More dimensions of adaptivity need to be explored; e.g. visible vs. invisible reads
Write Dominated Workloads - II • LFUCache – Least Frequently Used Page Replacement • A priority queue heap that simulates LFU page replacement • Pages are picked randomly from a Zipf distribution • Eager Acquire wins by a big margin • Too much bookkeeping overhead with Lazy Acquire
Read Dominated Workloads - I • IntSetRelease – sorted set of integers, with early release • Transaction inserts and deletes nodes • Each object is opened in “read” mode and released early when traversing to the next node