Hardware-based Transactional Memory Supporting Large Transactions
Anvesh Komuravelli, Abe Othman, Kanat Tangwongsan
Concurrent Programs: handle with care
Thread 1:
    obj.x = 7;
    find_primes();
    // intrusion test
    if (obj.x != 7) fireMissiles();
Thread 2:
    do_stuff();
    obj.x = 42;
Lock-based approaches: guard the critical section with lock_acquire(critical_zone); and lock_release(critical_zone);
• Deadlock
• Starvation
• Complex programs
Transactional Memory
Thread 1:
    obj.x = 7;
    find_primes();
    // intrusion test
    if (obj.x != 7) fireMissiles();
Thread 2:
    do_stuff();
    obj.x = 42;
Transactional approach: enclose the critical section between x_begin(); and x_finish();
• Atomicity in the face of concurrency
• Consistency across the whole system
• Isolation from other transactions
• Programmer: enclose instructions in a transaction
• System: execute transactions concurrently and, on a conflict, do something intelligent (e.g., abort, restart)
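The abort-and-restart behaviour on this slide can be sketched in software. This is a minimal sequential simulation, not the hardware mechanism the talk covers: the names x_begin and x_finish come from the slide, but their signatures, the shared_obj and xact structs, the version counter, and the helpers x_write_x, x_read_x and plain_store_x are all assumptions made for illustration. It is not thread-safe as written.

```c
#include <stdbool.h>

/* Hypothetical shared object; the version counter is bumped on every commit. */
typedef struct { int x; unsigned version; } shared_obj;

/* Hypothetical per-transaction state: a snapshot version and one buffered write. */
typedef struct { unsigned start_version; int pending_x; bool has_write; } xact;

/* Start a transaction by remembering the version we observed. */
void x_begin(xact *t, const shared_obj *o) {
    t->start_version = o->version;
    t->has_write = false;
}

/* Buffer a write to obj.x; nothing is visible to others until commit. */
void x_write_x(xact *t, int v) {
    t->pending_x = v;
    t->has_write = true;
}

/* Read obj.x, seeing our own buffered write if there is one. */
int x_read_x(const xact *t, const shared_obj *o) {
    return t->has_write ? t->pending_x : o->x;
}

/* Try to commit. Returns false (abort) if anyone committed since x_begin;
 * the caller is then expected to restart the transaction. */
bool x_finish(xact *t, shared_obj *o) {
    if (o->version != t->start_version) return false;
    if (t->has_write) { o->x = t->pending_x; o->version++; }
    return true;
}

/* A store by another thread, modelled here as an immediate commit. */
void plain_store_x(shared_obj *o, int v) {
    o->x = v;
    o->version++;
}
```

In the slide's scenario, Thread 2's obj.x = 42 landing between x_begin and x_finish makes the first commit attempt fail, so Thread 1 restarts instead of observing a half-updated state.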
Different strokes for different folks: Challenges & Opportunities
• Common case: 98% of transactions fit in L1 => handle them in hardware
    • Fast
    • Easy conflict detection
    • Easy commit and abort
• What to do with the remaining 2%?
• Goal: hide platform/resource limitations from programmers
VTM – Virtual Transactional Memory
• On overflow, use the process’s virtual memory
• Tracking at cache-line granularity
• Per-process state (tag and store virtual addresses)
• Flatten nested transactions
• Implemented in specialized hardware (dedicated cache, search logic, …)
• Drawbacks?
    • Modifications to hardware. Costly?
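To make "tracking at cache-line granularity" concrete, here is a small sketch of an overflow table keyed by cache-line number. It illustrates the idea only and is not VTM's actual data structure: the 64-byte line size, the fixed capacity, the linear search, and every name (overflow_table, overflow_record, overflow_conflicts) are assumptions.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define LINE_SHIFT 6    /* assumed 64-byte cache lines */
#define TABLE_CAP  128  /* arbitrary capacity for this sketch */

/* One overflowed line: which line, which transaction, and whether it was written. */
typedef struct { uint64_t line; int owner; bool written; } overflow_entry;
typedef struct { overflow_entry e[TABLE_CAP]; size_t n; } overflow_table;

static uint64_t line_of(uint64_t vaddr) { return vaddr >> LINE_SHIFT; }

/* Record a transactional line evicted from the cache.
 * Returns false if the (hypothetical) table is full. */
bool overflow_record(overflow_table *t, int owner, uint64_t vaddr, bool written) {
    uint64_t line = line_of(vaddr);
    for (size_t i = 0; i < t->n; i++) {
        if (t->e[i].line == line && t->e[i].owner == owner) {
            t->e[i].written = t->e[i].written || written;
            return true;
        }
    }
    if (t->n == TABLE_CAP) return false;
    t->e[t->n++] = (overflow_entry){ line, owner, written };
    return true;
}

/* An access conflicts if another transaction holds the same line
 * and at least one of the two accesses is a write. */
bool overflow_conflicts(const overflow_table *t, int requester,
                        uint64_t vaddr, bool is_write) {
    uint64_t line = line_of(vaddr);
    for (size_t i = 0; i < t->n; i++) {
        if (t->e[i].line == line && t->e[i].owner != requester &&
            (is_write || t->e[i].written))
            return true;
    }
    return false;
}
```

Two addresses in the same 64-byte line (for example 0x1000 and 0x1008) map to the same entry, while 0x1040 falls in the next line and is tracked separately.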
XTM – eXtended Transactional Memory
• “Complete TM Virtualization without complex hardware”
• Page table per transaction
• Allows arbitrary nesting – no flattening
• The only hardware support: raise an exception on overflow
• Drawbacks?
    • Page granularity on overflows
    • Potentially higher memory usage than VTM
    • Software commit is costlier than VTM’s hardware commit and can stall other transactions of the process
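The page-granularity drawback can be shown with a little arithmetic. The sketch below compares tracking at an assumed 64-byte line size with an assumed 4 KiB page size; the function names same_granule and footprint_bytes are hypothetical and are not taken from XTM.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum { LINE_SHIFT = 6, PAGE_SHIFT = 12 };  /* assumed 64 B lines, 4 KiB pages */

/* Two addresses collide if they fall in the same tracking unit. */
bool same_granule(uint64_t a, uint64_t b, unsigned shift) {
    return (a >> shift) == (b >> shift);
}

/* Bytes that must be tracked (or copied) for a set of touched addresses:
 * the number of distinct granules times the granule size. */
size_t footprint_bytes(const uint64_t *addrs, size_t n, unsigned shift) {
    size_t distinct = 0;
    for (size_t i = 0; i < n; i++) {
        bool seen = false;
        for (size_t j = 0; j < i; j++) {
            if (same_granule(addrs[i], addrs[j], shift)) { seen = true; break; }
        }
        if (!seen) distinct++;
    }
    return distinct << shift;
}
```

Under these assumptions, writes to 0x1000 and 0x1800 by two different transactions are a false conflict at page granularity but independent at line granularity, and touching 0x1000, 0x1008 and 0x3000 costs 128 bytes of tracked state per line versus 8192 bytes per page.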
An observation
• Small transactions get things done in hardware
• Large transactions overflow the buffers, and the TM switches to virtual mode
• What about transactions whose size varies over time?
• What if everything fits in the buffers again?
• Can we switch back to hardware mode?
Towards improving virtualization (OneTM)
• Permissions-only cache: significantly reduces the chance of overflowing the buffers
• At the cost of a little extra hardware
• Large transactions, already assumed to be infrequent, become even rarer
• Large transactions are serialized and handled one at a time
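"Serialized and handled one at a time" can be modelled as a single system-wide overflow slot. The sketch below is an assumption about how that policy might look, not the actual hardware design; system_state, try_enter_overflow and leave_overflow are hypothetical names, and a real implementation would need an atomic update rather than the plain stores used here.

```c
#include <stdbool.h>

#define NO_OWNER (-1)

/* Hypothetical global state: the id of the one transaction allowed to overflow. */
typedef struct { int overflow_owner; } system_state;

/* A transaction that has outgrown its buffers asks for the overflow slot.
 * Returns false if another transaction already holds it, in which case
 * the caller has to stall (or abort) until the slot is free. */
bool try_enter_overflow(system_state *s, int txn) {
    if (s->overflow_owner == NO_OWNER || s->overflow_owner == txn) {
        s->overflow_owner = txn;
        return true;
    }
    return false;
}

/* Called when the overflowed transaction commits or aborts. */
void leave_overflow(system_state *s, int txn) {
    if (s->overflow_owner == txn) s->overflow_owner = NO_OWNER;
}
```

While one long-running transaction holds the slot, every other transaction that overflows is stuck waiting, which is where a starvation concern would come from.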
Do we always have only a few large transactions?
• For now: yes
• In the future: maybe not
    • I/O and blocking system calls might wish to be atomic
• How do the earlier discussed approaches fare?
    • VTM – complex hardware
    • XTM – complications with OS and page granularity
    • OneTM – can lead to starvation!
TokenTM
• Uses tokens to monitor memory blocks
• To read a block, a transaction acquires one of its tokens
• To write a block, a transaction must acquire all of its tokens
• Rigorous bookkeeping: blocks are tracked in caches, memory and disk
• Handles large transactions gracefully
• Except for conflicts, transaction speed is unaffected by large transactions in other threads
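The token rule above (one token to read, every token to write) can be sketched as a small counter per memory block. This is a toy model under stated assumptions: the token count of 4, the per-transaction array, and the names block_tokens, acquire_read, acquire_write and release_all are hypothetical and do not reflect TokenTM's real metadata layout.

```c
#include <stdbool.h>

#define TOKENS_PER_BLOCK 4  /* assumed token count, chosen for the sketch */
#define MAX_TXN          4  /* assumed maximum number of transactions */

/* Tokens of one memory block currently held by each transaction. */
typedef struct { int held[MAX_TXN]; } block_tokens;

static int free_tokens(const block_tokens *b) {
    int used = 0;
    for (int i = 0; i < MAX_TXN; i++) used += b->held[i];
    return TOKENS_PER_BLOCK - used;
}

/* Reading needs one token; fails only if none are free (a writer holds all). */
bool acquire_read(block_tokens *b, int txn) {
    if (b->held[txn] > 0) return true;
    if (free_tokens(b) == 0) return false;
    b->held[txn] = 1;
    return true;
}

/* Writing needs every token, so it fails if any other transaction holds one. */
bool acquire_write(block_tokens *b, int txn) {
    int needed = TOKENS_PER_BLOCK - b->held[txn];
    if (free_tokens(b) < needed) return false;
    b->held[txn] = TOKENS_PER_BLOCK;
    return true;
}

/* On commit or abort the transaction hands back all its tokens. */
void release_all(block_tokens *b, int txn) {
    b->held[txn] = 0;
}
```

Multiple readers can hold tokens on the same block at once, and a write request fails exactly when some other transaction still holds a token, which is the conflict case.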
TokenTM Downsides
• Small transactions suffer(?)
• L1-cache-sized transactions can work at hardware speed, BUT:
    • Need flash-clear and flash-OR circuits in the L1 cache
    • Requires a very involved ad hoc representation
    • …or taking a 3% overhead hit
• Optimizes the rare large case to the detriment of the frequent small case?
Conclusion
• Sun Research’s Transactional Memory Spotlight: “More recent proposals for ‘unbounded’ HTM aim to overcome these disadvantages, but Sun Labs researchers came to the conclusion that the proposals were sufficiently complex and risky that they were unlikely to be adopted in mainstream commercial processor designs in the near future.”