570 likes | 709 Views
Virendra J. Marathe , William N. Scherer III, and Michael L. Scott Department of Computer Science University of Rochester. Design Tradeoffs in Modern Software Transactional Memory Systems. Presented by: Armand R. Burks Fall 2008.
E N D
Virendra J. Marathe, William N. Scherer III, and Michael L. Scott Department of Computer Science University of Rochester Design Tradeoffs in Modern Software Transactional Memory Systems Presented by: Armand R. Burks Fall 2008
Classic lock-based implementations of concurrent objects suffer from several important drawbacks: • Deadlocks • Priority inversion • Convoying • Lack of fault tolerance Background info.
Increased interest in nonblocking synchronization algorithms • Failure of a thread can never prevent the system from making forward progress Background info. Cont’d.
A concurrent object is a data object shared by multiple threads of control within a concurrent system. Definition...
Section 1 – Introduction Section 2 – DSTM Section 3 – FSTM Section 4 – Comparative Evaluation Section 5 – Related Work Section 6 - Conclusions Outline
Section 1: Introduction
STM – an approach to construct nonblocking objects • Simplifies task of implementing concurrent objects • Software Transactional Memory (STM) is a generic non-blocking synchronization construct that enables automatic conversion of correct sequential objects into correct concurrent objects. Intro: Software Transactional Memory
Transaction • A finite sequence of instructions (satisfying the linearizability and atomicity properties) that is used to access and modify concurrent objects Definition...
This paper focuses on discussing two approaches to STM • Compares and evaluates strengths and weaknesses of the approaches Purpose of Paper
Section 2: Dynamic Software Transactional Memory
Definition... • DSTM is a low-level application programming interface (API) for synchronizing shared data without using locks. (Herlihy) • Uses early release ( a transaction can drop a previously opened object) • Invisible reads • Example follows... Dynamic Software Transactional Memory (DSTM)
DSTM Cont’d. Committed Transaction Start Transaction X New Object Shared Object-New Version Copy Old Object TMObject Old Locator Locator Atomic Compare & Swap (CAS) Shared Object-Old Version CAS FAILURE Other transaction has acquired object New Active Transaction Transaction New Object Old Object Shared Object-New Version New Locator Opening a TMObject (in write mode) recently modified by a committed transaction
Most recent previous transaction may still be ACTIVE • Wait/retry, abort, or force competitor to abort • Ask contention manager to make decision DSTM Contention
Full acquire operation • Unnecessary contention between transactions • How do we avoid this contentention? • Each transaction maintains private (read-list) of objects opened in read-only mode • Must recheck validity before committing DSTM- What about read-only?
What about stale data (from a not yet committed transaction)? • May lead to unwanted behavior ( addressing errors, infinite loops, division by zero, etc.) • Transaction would have to revalidate all open read-only objects • Current version does this automatically when opening DSTM Read-only contd...
Section 3: Fraser’s Software Transactional Memory (FSTM)
Named after Keir Fraser (Cambridge) • Unlike DSTM, FSTM is lock-free • Guarantees forward progress for system (within a bounded number of steps, some thread is guaranteed to complete a transaction) • How? • Recursive helping FSTM
When transaction detects conflict with another, it uses the conflicting transaction’s descriptor to make updates, then aborts/restarts itself. • Example follows: FSTM – Recursive helping
FSTM- Recursive helping cont’d... A A B B A Concurrent Object Abort & Restart A COMMITTED COMMITTED
Basic Transactional Memory structure New data Next handle Object ref Old data Status UNDECIDED Read-only Read-write Object Handles Transaction Descriptor Concurrent Object Shadow Copy Object Header Unlike DSTM, multiple transactions can open the same object in write mode because each has its own shadow copy.
UNDECIDED • ABORTED • COMMITTED • READ-CHECKING FSTM- Transaction states
UNDECIDED • Transaction opens object headers while in this state • Open is not visible to other transactions • COMMITTED • If open by another transaction, will be detected • If conflict is detected, the transaction uses recursive helping FSTM- Transaction states cont’d...
Transaction acquires all objects it opened in write mode in some global total order (virtual address) • Uses atomic Compare and Swaps • Each CAS swings object header’s pointer to transaction descriptor • If pointer already points to another transaction’s descriptor, conflict is detected • Example follows Commit phase
CAS Conflict discovery Object ref Old data Status UNDECIDED Read-only Read-write Object Handles Transaction Descriptor Concurrent Object Shadow Copy CAS Object Header Recursively help competitor
Only proceeds if competitor precedes in in global total order (thread/transaction id) • If not, competitor will be aborted • Also, competitor must be in READ-CHECKING state (for validating) Recursive helping
Validates objects in read-only list • Verify object header still points to version of object as when handle was created • If not, abort • If pointing to another transaction, recursively help • After successful validation, switch to COMMITTED • Done automatically by transaction • Release object by swinging handle to new version REAd-checking state
Section 4: Comparative Evaluation
Both DSTM and FSTM have significant overhead • Simpler than previous approaches • This section does the following • Highlight design tradeoffs between the two • Highlight impact on performance of various concurrent data structures Comparative evaluation
16-processor SunFire 6800 • Cache-coherent multiprocessor • 1.2 GHz Processors • Environment – Sun’s Java 1.5 beta 1 HotSpot JVM • Augmented with JSR166 update from Doug Lea Experimental environment
Simple Benchmarks • A stack • 3 variants of a list-based set • More Complex Benchmark • Red-black tree Testing benchmarks
In list and Red-black tree benchmarks • Repeatedly/Randomly insert/delete integers 0-255 • Keeping this range small increases probability of contention • Measured total throughput over 10 seconds • Vary number of worker threads between 1-48 • Six test runs Testing benchmarks cont’d...
DSTM • Transaction acquires exclusive access when it first opens it for write access • Makes transaction visible to potential competitors early in its lifetime (eager acquire) • FSTM • Transaction acquires exclusive access only in commit phase • Makes transaction visible to potential competitors later in its lifetime (lazy acquire) Object acquire semantics
Eager acquire • Enables earlier detection & resolution of conflicts • Lazy acquire • May cause transactions to waste computational resources on doomed transactions before detecting a conflict • Tends to reduce amount of transactions identified as competitors • If application semantics allow both transactions to commit, lazy acquire may result in higher concurrency Object acquire semantics
Object acquire semantics – performance results Red-black Tree Performance Results
FSTM outperforms due to lazy acquire semantics Object acquire semantics – performance results
Object acquire semantics – performance results Number of Contention Instances
FSTM outperformed again • Difference remained more or less constant Object acquire semantics – performance results
DSTM has extra level of indirection (Fig 1. & 2) • TMObject points to Locator which points to object data • FSTM object header points directly to object data • Extra indirection may result in slower reads/writes • Slower transactions (particularly if most are read-only) Bookkeeping and indirection
Indirection eliminates overhead of inserting object handles into descriptor chains • Transactions with large number of writes may be faster in DSTM • Lazy acquire requires extra bookkeeping Bookkeeping and indirection
Testing Benchmarks • Stack, • IntSet, • IntSetRelease Bookkeeping and indirection Performance
Bookkeeping and indirection Performance Results Stack Performance Results
Stack illustrates very high contention (top-of-stack pointer) • DSTM outperformance • Due to FSTM extra bookkeeping in write mode Bookkeeping and indirection Performance Results
IntSet maintains sorted list • Every insert/delete opens (write mode) all objects from beginning of list • Successful transactions are serialized Bookkeeping and indirection intset
Bookkeeping and indirection Performance Results IntSet Performance Results
Higher throughput for DSTM • FSTM suffers extra bookkeeping overhead, sorting overhead, and extra CASes Bookkeeping and indirection Performance Results
IntSetRelease – variant of IntSet • Only the object to be modified is opened in write mode • All others opened temporarily in read mode • Then either released (moving on to next node in list) • Or upgraded to write mode Bookkeeping and indirection intsetrelease
Bookkeeping and indirection Performance Results IntSetRelease Performance Results
FSTM more than a factor of 2 better • DSTM’s indirection is to blame Bookkeeping and indirection Performance Results
Invisible reads & lazy acquire (FSTM) may allow transaction to enter inconsistent state during execution • This may lead to memory access violations, infinite loops, arithmetic faults, etc. • DSTM- performs incremental validation at open time • FSTM- proposes mechanism based on exception handling (catch problems when they arise rather than prevent them) Transaction validation
On a memory access violation, exception handler validates the transaction that caused the exception • Responsibility of detecting other inconsistencies is left to the programmer Transaction validation - fstm