This paper explores the concept of speculative locking as a way to improve performance in distributed systems by removing blocking lock acquisitions. It discusses the benefits and challenges of speculative locking and provides a case study in the JavaSplit distributed runtime system.
Optimistic Concurrency for Clusters via Speculative Locking Michael Factor (IBM IL) Assaf Schuster (Technion) Konstantin Shagin (IBM IL) Tal Zamir (Technion)
Motivation • Locks are commonly used in distributed systems • Protect access to shared data • Coarse-grained/fine-grained locking • Blocking lock acquisition operations are expensive • Require cooperation with remote machines • Obtain ownership, memory consistency operations, checkpointing • On the application’s critical path • Reduce concurrency • Hence, removal of blocking lock acquisitions has potential to boost performance
Speculative locking • Speculative Locking (SL) suggests that the application continue executing without waiting for a lock acquisition to complete • Such execution may cause data consistency violations • A thread may access invalid data • The system detects consistency violations and performs a rollback when one occurs • Thus, speculative locking • Removes the lock acquisition from the critical path • Allows concurrent execution of critical sections protected by the same lock • Speculative locking is especially advantageous in a number of cases • Little data contention • Threads access different data • Threads rarely perform write operations • Little lock contention • Conservative thread-safe libraries using coarse-grained locks
Blocking lock acquisition vs. speculative • [Figure: two timelines for threads A and B, labeled "blocking" and "speculative". Events shown: acq(L) or spec. acq(L), write(Y), request(L), rel(L), ownership(L), read(X). The blocking timeline marks a "blocked" interval; the speculative timeline marks "executing in speculative mode", followed by a conflict check]
Speculative locking for a general-purpose distributed runtime • We propose employing SL in a general-purpose distributed runtime system • Since rollback capability is required, SL is best suited for fault-tolerant systems • We implement and evaluate SL in JavaSplit, a fault-tolerant distributed runtime system for Java • The protocol consists of three main components • Acquire operation logic • Data conflict detection • Checkpoint management • Although we demonstrate distributed speculative locking in a specific system, the approach is generic enough to be applied to any other distributed runtime with rollback capabilities
JavaSplit overview • Executes standard multithreaded Java programs • Each application thread runs in a separate JVM • The shared objects are managed by a distributed shared memory protocol • The memory model is Lazy Release Consistency • The protocol is object-based • Can tolerate multiple concurrent node failures • Thread checkpoints and shared objects are replicated • Employs bytecode instrumentation to distribute threads, preserve memory consistency and enable checkpointing. Hence it is: • Transparent: the programmer is not aware of the system • Portable: executes on a collection of JVMs
Blocking Lock Acquisition in JavaSplit • The requester sends a lock acquisition request to the current lock owner and waits for a reply • When the owner releases the lock, it takes a checkpoint of its state and transfers the lock • A checkpoint is required to support JavaSplit’s fault-tolerance scheme • Along with the lock, write notices are transferred, invalidating objects whose modifications should be observed by the acquirer (to maintain LRC)
Speculative Lock Acquisition in JavaSplit • The requester sends a lock acquisition request to the current lock owner and continues execution • Until lock ownership is received, the requester is in speculative mode • While in speculative mode, object read accesses are logged (using Java bytecode instrumentation) • When lock ownership is received, write notices are examined to detect a data conflict • Upon data conflict detection, the thread rolls back • A thread can speculatively acquire a number of locks
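The read logging and conflict check described on this slide can be sketched roughly as follows. This is a minimal single-lock illustration, not JavaSplit's actual code: the class `SpeculativeSession`, its method names, and the use of `long` object IDs are all assumptions made for the example.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch; names and types are illustrative, not JavaSplit's API.
class SpeculativeSession {
    // IDs of shared objects read since the speculative acquire was issued.
    private final Set<Long> readLog = new HashSet<>();
    private boolean speculating = false;

    // Called after the acquire request is sent and execution continues.
    void begin() {
        readLog.clear();
        speculating = true;
    }

    // Instrumented read accesses would call this; reads are logged only while speculating.
    void onRead(long objectId) {
        if (speculating) {
            readLog.add(objectId);
        }
    }

    // Called when lock ownership arrives together with its write notices.
    // Returns true if the speculation is valid, false if the thread must roll back.
    boolean onOwnershipReceived(Set<Long> writeNotices) {
        speculating = false;
        for (long id : writeNotices) {
            if (readLog.contains(id)) {
                return false; // an object read speculatively was invalidated: data conflict
            }
        }
        return true;
    }

    boolean isSpeculating() {
        return speculating;
    }
}
```

A thread that speculatively acquires several locks would need one such check per pending lock; the sketch covers only the single-lock case.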
Managing consistent checkpoints (theory) • Requirement #1: A thread must be able to roll back to a state preceding any potentially conflicting access • Since a conflicting access can occur only in speculative mode, it is sufficient to guarantee that a valid checkpoint is taken before speculating • In a fault-tolerant system, each thread has at least one valid checkpoint preceding speculation • It remains to ensure this checkpoint is not invalidated by checkpoints taken while in speculative mode • Requirement #2: A speculating thread must be prevented from affecting other threads, or such dependencies must be monitored • If a speculating thread affects another thread, the latter must be rolled back along with the former in case of a data conflict
Managing consistent checkpoints (practice) • In JavaSplit, a thread can roll back only to its most recent checkpoint • In addition, a thread has to checkpoint (only) before transferring lock ownership to another thread • Consequently, to ensure a speculating thread can roll back to a state preceding speculation, we prevent threads from transferring lock ownership while in speculative mode • This satisfies requirement #1 • The transfer of lock ownership is postponed until the thread leaves speculative mode • Coincidentally, this also satisfies requirement #2, because in JavaSplit a thread may affect other threads only when transferring lock ownership • Thus, a speculating thread cannot affect other threads
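The rule of postponing lock transfers while speculating might look roughly like the sketch below. It is a hedged illustration only: `LockTransferGate`, its methods, and the string log standing in for real checkpoint and transfer actions are hypothetical, not part of JavaSplit.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Hypothetical sketch; the log of strings stands in for real checkpoint/transfer actions.
class LockTransferGate {
    private boolean speculating = false;
    private final Queue<String> pending = new ArrayDeque<>();
    private final List<String> log = new ArrayList<>();

    void enterSpeculativeMode() {
        speculating = true;
    }

    // Another thread requests a lock that this thread is ready to hand over.
    void onTransferRequest(String lockId) {
        if (speculating) {
            pending.add(lockId); // postpone: a checkpoint taken now would overwrite the pre-speculation one
        } else {
            transfer(lockId);
        }
    }

    // Speculation ended: serve every postponed transfer in arrival order.
    void leaveSpeculativeMode() {
        speculating = false;
        while (!pending.isEmpty()) {
            transfer(pending.remove());
        }
    }

    // A checkpoint always precedes the ownership transfer.
    private void transfer(String lockId) {
        log.add("checkpoint");
        log.add("transfer " + lockId);
    }

    List<String> log() {
        return log;
    }
}
```

Because no checkpoint is taken between entering and leaving speculative mode, the most recent checkpoint always precedes the speculation, which is the property requirement #1 asks for.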
Speculative Deadlocks • Two threads (A and B in the figure) concurrently acquire locks from each other • Both threads enter speculative mode and therefore cannot transfer lock ownership • This is not an application deadlock • A number of threads can be involved in such a dependency loop • Deadlock detection and recovery are required
Speculative Deadlocks (2) • Deadlock Avoidance • Limiting the number of speculations per session • Returning lock ownership to passive home nodes on release • Deadlock Detection • Timeout-based approach • Simple but inaccurate • Message-based approach • Similar to Chandy-Misra-Haas edge-chasing algorithm • Can be used to automatically detect a more accurate timeout value • Deadlock Recovery • Thread rolls back to its latest checkpoint • Before continuing execution, all pending lock requests are served • Heuristic: next few lock acquisitions are non-speculative
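The message-based detection idea can be illustrated with a simplified, single-process model of edge chasing. This is not the protocol used in JavaSplit and not a full Chandy-Misra-Haas implementation: the `EdgeChasing` class, the `waitsFor` map, and the assumption that each thread waits on at most one other thread are all simplifications made for the example.

```java
import java.util.Map;

// Hypothetical sketch: a probe follows wait-for edges and reports whether it returns to its initiator.
class EdgeChasing {
    // waitsFor.get(t) is the thread that t waits on for lock ownership (absent if t is not waiting).
    static boolean probeReturns(Map<String, String> waitsFor, String initiator) {
        String current = waitsFor.get(initiator);
        int hops = 0;
        // The hop bound stops the probe if it gets trapped in a cycle that excludes the initiator.
        while (current != null && hops <= waitsFor.size()) {
            if (current.equals(initiator)) {
                return true; // the probe came back: the initiator is part of a dependency loop
            }
            current = waitsFor.get(current);
            hops++;
        }
        return false;
    }
}
```

In a real distributed setting each hop would be a message forwarded by the waiting thread's node, and a thread holding several speculative acquisitions would forward the probe along every outgoing edge.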
Speculation Preclusion • The protocol allows a lock to be acquired either speculatively or in the traditional blocking way • Threads can decide at run time whether to apply speculation in each particular instance • This enables run-time heuristic logic that • Detects acquire operations that are likely to result in a rollback, and • Prevents speculative acquisitions • For a time period • For a number of subsequent acquire operations • The speculation preclusion algorithm should be as lightweight as possible because it is invoked on each lock acquisition • Static analysis can also be employed to detect cases in which rollback is likely to occur
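One lightweight preclusion heuristic of the "next few acquisitions" kind could be sketched as below. The class `SpeculationPolicy` and the `penalty` parameter are hypothetical, and treating any rollback as a signal that further rollbacks are likely is an assumption of this example rather than the heuristic evaluated in the talk.

```java
// Hypothetical sketch: after a rollback, force the next `penalty` acquires to be blocking.
class SpeculationPolicy {
    private final int penalty;          // assumed tunable: blocking acquires forced after each rollback
    private int remainingBlocking = 0;

    SpeculationPolicy(int penalty) {
        this.penalty = penalty;
    }

    // Called on every lock acquisition, so it does constant work and allocates nothing.
    boolean shouldSpeculate() {
        if (remainingBlocking > 0) {
            remainingBlocking--;
            return false;
        }
        return true;
    }

    // Called when a data conflict or speculative deadlock forced a rollback.
    void onRollback() {
        remainingBlocking = penalty;
    }
}
```

A per-lock variant would keep one counter per lock, at the cost of a lookup on each acquisition.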
Transactional Memory • Transactional memory is another form of optimistic concurrency control akin to speculative locking • The basic principle is the same: • Optimistically execute a critical code segment • Determine whether there have been data conflicts • Roll back in case validation fails • However, the programming paradigm and the implementation details differ significantly
Distributed Transactional Memory • Non-scalable or somewhat inefficient implementations (drawn from the classic hardware/software transactional memory protocols): • Broadcast based • Centralized • Require global cooperation • Less optimistic than distributed speculative locking • Try to ensure validity of one transaction before executing another, hence fail to overlap communication with computation • Can be classified into blocking-open and blocking-commit schemes • Blocking-open: schemes that often induce a remote blocking open operation when accessing an object for the first time within a transaction • Blocking-commit: schemes that require a blocking remote request prior to committing • Question: Is SL a suitable alternative to TM in a distributed setting? • It is surely more optimistic, but speculative deadlocks and other factors may erode this advantage • We only scratch the surface of this question
Performance Evaluation • Test bed: cluster of 15 commodity workstations • Workstation parameters • Windows XP SP2 • Intel Pentium Xeon dual-core 1.6 GHz processor • 2 GB RAM • 1 Gbit Ethernet • Each station executes two JavaSplit threads • Standard Sun JRE 1.6
Traveling Salesman Problem • Single lock application • Lock protects the globally known minimal path • Non-speculative system has scalability issues due to lock acquisition operations • Speculative system does not need to wait for lock acquisitions, but has a slight overhead due to speculation failures (data conflicts) • With a single thread, the read access monitoring overhead results in a slowdown
SPECjbb* (low acquisition frequency) • Business server middleware application • Workers process incoming client requests, each requiring exclusive access to specific stock objects • Fine-grained locking • Job processing time is non-negligible (low lock acquisition frequency) • Non-SL system does not have a scalability issue • SL removes the lock acquisition overhead • No speculative deadlocks • No data conflicts
SPECjbb* (high acquisition frequency) • Job processing time minimized to simulate extremely high lock acquisition frequency • In the non-SL system, the lock acquisition overhead is high • SL has to set a quota on the number of speculations per speculative session, to avoid speculative deadlocks • SL reduces the lock acquisition overhead, but does not eliminate it here
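A per-session speculation quota might be enforced with a simple counter, as in the sketch below. The class `SpeculationQuota` is hypothetical, and the example assumes that a "session" is the period a thread spends continuously in speculative mode.

```java
// Hypothetical sketch: cap the number of speculative acquires within one speculative session.
class SpeculationQuota {
    private final int quota;
    private int used = 0;

    SpeculationQuota(int quota) {
        this.quota = quota;
    }

    // Returns true if another speculative acquire is allowed; otherwise the caller acquires by blocking.
    boolean trySpeculate() {
        if (used < quota) {
            used++;
            return true;
        }
        return false;
    }

    // Assumed to be called when the thread leaves speculative mode.
    void endSession() {
        used = 0;
    }
}
```

With a quota of zero every acquire falls back to blocking, which gives the non-speculative behavior.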
HashtableDB • Naive benchmark demonstrating an optimal scenario for SL • A single lock protects a hash table object • Coarse-grained locking • Processing is performed while holding the lock • Non-speculative system does not scale: only one thread can perform computation at any given moment • SL system scales almost linearly
Speculative Deadlocks • In the presence of speculative deadlocks, setting a quota on the number of speculations per session is required • The probability of speculative deadlocks is a complex function of multiple factors, and is also affected by this quota • The higher the quota, the more time can be spent in speculative mode, increasing deadlock probability
Speculative Deadlocks (2) • However, a higher speculation quota allows the application to continue execution (without blocking) for longer periods of time • Thus, per execution, an optimal value of speculation quota exists
Questions? The end