Explore forms of concurrency and synchronization, a peer-to-peer client-server application, increasing liveness with concurrency patterns, maintaining safety with synchronization patterns, and their benefits and drawbacks, with a focus on pattern languages.
E81 CSE 532S: Advanced Multi-Paradigm Software Development A Concurrency and Synchronization Pattern Language Chris Gill and Venkita Subramonian Department of Computer Science and Engineering Washington University, St. Louis cdgill@cse.wustl.edu
Main Themes of this Talk • Review of concurrency and synchronization • Review of patterns and pattern languages • Design scenario case study • A peer-to-peer client-server application • Increasing liveness with concurrency patterns • Maintaining safety with synchronization patterns
Concurrency • Forms of concurrency • Logical (single processor) parallelism • Physical (multi-processor) parallelism • Goal: Increase Liveness • Progress is made: deadlock is avoided • Also, avoid long blocking intervals if possible • Progress is also optimized where possible • Independent activities logically parallel • Ideally, physically parallel if you can afford additional resources • Full utilization of resources: something is always running • E.g., both I/O bound and compute bound threads in a process • The most eligible task is always running • E.g., the one with the highest priority among those enabled
Concurrency, Continued • Benefits • Performance • Still make progress if one thread blocks (e.g., for I/O) • Preemption • Higher-priority threads preempt lower-priority ones • Drawbacks • Object state corruption due to race conditions • Resource contention • Need isolation of interdependent operations • When concurrency is used, the synchronization patterns provide this isolation • At a cost of reducing concurrency somewhat • And at a greater risk of deadlock
Synchronization • Forms of synchronization • Thread (lightweight) vs. process (heavy) locks • Why is process locking more expensive? • Binary (mutex) vs. counted (semaphore) locks • What is being locked • Scopes: scoped locking • Objects: thread-safe interface • Role-based: readers-writer locks • “It depends”: strategized locking • Goal: Safety • Threads do not corrupt objects or resources • More generally, bad inter-leavings are avoided • Atomic: runs to completion without being preempted • Granularity at which operations are atomic matters
Synchronization, Continued • Benefits • Safety • Prevent threads from interfering in bad ways • Control over (and reproducibility of) concurrency • Restrict the set of possible inter-leavings of threads • Drawbacks • Reduces concurrency (extreme form: deadlock) • Adds overhead, increases complexity of code • Need to apply locking carefully, with a purpose • Only lock where necessary • Based on design forces of the specific application • Allow locks to be changed, or even removed
Concurrency & Synchronization • Concurrency hazard: race conditions • Two or more threads access an object/resource • The interleaving of their statements matters • Some inter-leavings have bad consequences • Synchronization hazard: deadlock • One or more threads access an object/resource • Access to the resource is serialized • Chain of accesses leads to mutual blocking • Clearly, there is a design tension • Must be resolved for each distinct application • Applying patterns helps if guided by design forces
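To make the race-condition hazard concrete, here is a minimal C++11 sketch (the counter and function names are illustrative, not part of the course design): two threads increment a shared counter, and a mutex held via a scoped guard serializes the read-modify-write so no update is lost.

```cpp
#include <iostream>
#include <mutex>
#include <thread>

int counter = 0;          // shared state: unguarded ++counter would be a data race
std::mutex counter_mutex; // serializes access to counter

void increment_many() {
    for (int i = 0; i < 100000; ++i) {
        std::lock_guard<std::mutex> guard(counter_mutex); // scoped locking
        ++counter; // without the guard, the two threads' read-modify-write steps
                   // could interleave and updates would be lost
    }
}

int main() {
    std::thread t1(increment_many), t2(increment_many);
    t1.join();
    t2.join();
    std::cout << counter << '\n'; // always 200000 with the mutex; often less without it
}
```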
Review: What is a Pattern Language? • A narrative that composes patterns • Not just a catalog or listing of the patterns • Reconciles design forces between patterns • Provides an outline for design steps • A generator for a complete design • Patterns may leave consequences • Other patterns can resolve them • Generative designs resolve all forces • Internal tensions don’t “pull design apart”
Towards a Pattern Language • Key insights • Each pattern solves a particular set of issues • When applied, it may leave others open • Depends on the scale of the context • E.g., ACT does not itself address synchronization • Although it may help in combination with, say, the TSS pattern • It may also raise new issues • For example, synchronization raises deadlock issues • None of this matters without a context • Patterns are only useful when applied to a design • A design stems from the context (problem+forces) of something we’re trying to build • So, let’s imagine we’re building something
Motivating Scenario • Peer-to-peer client-server applications • Each process sends and receives requests • Communicates with many other endsystems at once • Distributed data and analysis components • Spread more or less evenly throughout network • Each has a particular function or kind of information • Endsystems can look up locations of specific components • Directory, naming, and trading services are available • Heterogeneous requests • Long-running independent file retrieval requests • “Send me the news data feed from 04/21/03” • “Send me the stock market ticker for 04/21/03” • Short-duration interdependent analysis requests • “Compute the change (+ or -) for each stock sector” • “Then find the largest change in a stock sector” • “Then find 5 news events with highest correlation to that change”
Motivating Scenario, Continued • Given • A high-quality LAN-scale analysis server … • designed to manage tens of components … • and interact with tens of other similar servers … • all connected via one company’s private network • Assume • Written in C++, leverages C++11/STL features extensively • Design uses Wrapper Façade, other basic patterns • Reactive implementation (initially single-threaded) • We’ll discuss other alternatives in the next part of the course • E.g., proactive approaches using interceptor to mark/notify progress • Goal • Scale this to a business-to-business VPN environment … • … with hundreds instead of tens of nodes locally • … each with hundreds instead of tens of collaborators
Review of Concurrency Patterns • Active Object • Methods hand off request to internal worker thread • Monitor Object • Threads allowed to access object state one-at-a-time • Half-Sync/Half-Async • Process asynchronously arriving work in synchronous thread • Leader/Followers • HS/HA optimization: leader delegates until its work arrives • Thread Specific Storage • Separate storage for each thread, to avoid contention • These complement the synchronization patterns
Review of Synchronization Patterns • Scoped Locking • Ensures a lock is acquired/released in a scope • Strategized Locking • Customize locks for safety, liveness, optimization • Thread-Safe Interface • Reduce internal locking overhead • Avoid self-deadlock • These complement the concurrency patterns
How Do We Get Started? • Simply apply patterns in order listed? • Sounds ad hoc, no reason to think this would work • A better idea is to let the context guide us • Compare current design forces to a pattern context • Let’s start with the most obvious issue • Long-running tasks block other tasks • Long-running tasks need to be made concurrent • Leads to selection of our first pattern • Concurrent handlers for long-running requests • I.e., apply the Active Object pattern
Active Object Context • Pattern’s use in this design • Worker thread in handler • Input thread deposits requests for it • Queue decouples request/execution between threads • LRRH::handle_event • An enqueue method for the AO that puts a command object into the queue • Allows AOs to be registered with an asynchronously triggered dispatcher, e.g., an input (or reactor) thread
[Figure: Long Running Request Handlers (LRRH): input thread enqueues requests; worker thread dequeues and executes them]
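As a rough illustration of how this might look in C++11, the following sketch uses assumed names (LongRunningRequestHandler, Command); it is not the course's actual implementation. handle_event() only deposits a command object in the queue; the internal worker thread dequeues and executes it, decoupling invocation from execution.

```cpp
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>

// Illustrative Active Object sketch: decouples method invocation (enqueue)
// from method execution (internal worker thread).
class LongRunningRequestHandler {
public:
    using Command = std::function<void()>;

    LongRunningRequestHandler() : done_(false), worker_([this] { run(); }) {}

    ~LongRunningRequestHandler() {
        { std::lock_guard<std::mutex> g(m_); done_ = true; }
        cv_.notify_one();
        worker_.join();
    }

    // Called from the input (reactor) thread: just deposits the request.
    void handle_event(Command cmd) {
        { std::lock_guard<std::mutex> g(m_); queue_.push(std::move(cmd)); }
        cv_.notify_one();
    }

private:
    // Worker thread: dequeues and executes requests one at a time.
    void run() {
        for (;;) {
            Command cmd;
            {
                std::unique_lock<std::mutex> lk(m_);
                cv_.wait(lk, [this] { return done_ || !queue_.empty(); });
                if (done_ && queue_.empty()) return;
                cmd = std::move(queue_.front());
                queue_.pop();
            }
            cmd();  // execute the command outside the lock
        }
    }

    std::mutex m_;
    std::condition_variable cv_;
    std::queue<Command> queue_;
    bool done_;
    std::thread worker_;
};
```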
Active Object Consequences • Clear benefits for LRRH • Method invocation & execution are decoupled • Helps prevent deadlock • Reduces blocking • Drawbacks • Queuing overhead • Context-switch overhead • Implementation complexity • Costs well amortized so far • But only for LRRH • So, we can’t apply as is to short-running requests • Next design refinement • Apply Half-Sync/Half-Async
[Figure: Long Running Request Handlers (LRRH): input thread enqueues requests; worker thread dequeues and executes them]
Half-Sync/Half-Async Context • Still want long-running activities • Less overhead when related data are processed synchronously • Also want short-lived enqueue and dequeue operations • Requests arrive asynchronously • Keep reactor and handlers responsive • How to get both? • Make queueing layer smarter • Chain requests by their dependencies • E.g., HS/HA (via pipes & filters) • How this is done is crucial • Blocking input (reactor) thread is bad • So, worker thread chains them before it works on a chain!
[Figure: Mixed Duration Request Handlers (MDRH): input thread enqueues requests; worker thread chains related requests, then works on a chain]
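One way the smarter queueing layer could chain requests by their dependencies is sketched below; the Request type, chain_id key, and the three-request completeness rule are all hypothetical. The input thread only appends to a small incoming buffer, while the worker thread does the chaining off the critical path.

```cpp
#include <deque>
#include <map>
#include <mutex>
#include <string>
#include <vector>

// Hypothetical request type: 'chain_id' links dependent short-running requests.
struct Request {
    std::string chain_id;
    std::string payload;
};

class ChainingQueue {
public:
    // Input (reactor) thread: cheap enqueue of raw requests, no chaining here.
    void enqueue(Request r) {
        std::lock_guard<std::mutex> g(m_);
        incoming_.push_back(std::move(r));
    }

    // Worker thread: drains incoming requests, groups them by dependency key,
    // then returns one complete chain to work on (empty if none is ready yet).
    std::vector<Request> next_chain() {
        std::deque<Request> batch;
        { std::lock_guard<std::mutex> g(m_); batch.swap(incoming_); }
        for (auto& r : batch) {
            std::string key = r.chain_id;          // copy key before moving the request
            chains_[key].push_back(std::move(r));  // chaining happens off the input thread
        }
        // Illustrative completeness rule: a chain of 3 requests is "complete".
        for (auto it = chains_.begin(); it != chains_.end(); ++it) {
            if (it->second.size() >= 3) {
                std::vector<Request> chain = std::move(it->second);
                chains_.erase(it);
                return chain;
            }
        }
        return {};
    }

private:
    std::mutex m_;
    std::deque<Request> incoming_;                        // filled by input thread
    std::map<std::string, std::vector<Request>> chains_;  // worker-private chains
};
```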
HS/HA Consequences, part I • Problem: race condition • Reactor thread enqueues while active object thread dequeues • BTW, this has been an issue • Since we applied Active Object • But we pushed first on concurrency • Could have taken the other branch in our design process first, instead • Emphasizes iterative/rapid design! • Need a mutex • Serialize access to the queue data structure itself • But then need to avoid deadlock • What if an exception is thrown? (E.g., a call to new fails) • Solution: apply scoped locking
[Figure: Mixed Duration Request Handlers (MDRH): input thread enqueues requests; worker thread chains related requests]
Scoped Locking Context • Define/identify scopes around critical sections • I.e., the active object enqueue and dequeue methods • Provide a local guard object on the stack as the first element within the scope of a critical section • Guard ensures a thread automatically acquires the lock • Guard constructor locks upon entering a critical section • Guard ensures thread automatically releases the lock • Guard destructor unlocks upon exiting the local scope • Guarded locks will thus always be released even if exceptions are thrown from within the critical section
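A hand-rolled guard equivalent in spirit to std::lock_guard might look like this (a minimal sketch; the actual guard class used in the course code may differ):

```cpp
#include <mutex>
#include <queue>

// Minimal scoped-locking guard: constructor acquires, destructor releases,
// so the lock is released even if an exception propagates out of the scope.
template <typename Lock>
class Guard {
public:
    explicit Guard(Lock& lock) : lock_(lock) { lock_.lock(); }
    ~Guard() { lock_.unlock(); }
    Guard(const Guard&) = delete;
    Guard& operator=(const Guard&) = delete;
private:
    Lock& lock_;
};

// Usage in the active object's enqueue method (illustrative names):
std::mutex queue_mutex;
std::queue<int> request_queue;

void enqueue(int request) {
    Guard<std::mutex> guard(queue_mutex);  // first object in the critical section's scope
    request_queue.push(request);           // even if push throws, ~Guard still unlocks
}
```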
Scoped Locking Consequences • Benefit: increased robustness with exceptions • Liability: deadlock potential when used recursively • Self deadlock could occur if the lock is not recursive • E.g., if worker thread generates requests for itself • Pattern language solution • Apply another pattern! • I.e., strategized locking
Strategized Locking Context & Consequences • Parameterizes synchronization mechanisms • That protect critical sections from concurrent access • Different handlers may need different kinds of synchronization • Null locks (i.e., if we’d ever add passive handlers) • Recursive and non-recursive forms of mutex locks • Readers/writer locks • Decouple handlers and locks • Can change one without changing the other • Enhancements, bug fixes should be straightforward • Flexibility vs. complexity, time vs. space
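A minimal sketch of strategized locking using a template policy parameter (NullLock and RequestHandler are illustrative names): the same critical-section code works with a no-op lock, a std::mutex, or a std::recursive_mutex.

```cpp
#include <mutex>
#include <queue>

// No-op lock strategy for single-threaded (passive) configurations.
struct NullLock {
    void lock() {}
    void unlock() {}
};

// Handler parameterized by its locking strategy; the critical-section code
// is identical whichever lock type is chosen.
template <typename LockType>
class RequestHandler {
public:
    void enqueue(int request) {
        std::lock_guard<LockType> guard(lock_);
        queue_.push(request);
    }
private:
    LockType lock_;
    std::queue<int> queue_;
};

// Different instantiations pick different synchronization strategies:
RequestHandler<NullLock> passive_handler;                // no locking overhead
RequestHandler<std::mutex> concurrent_handler;           // standard mutual exclusion
RequestHandler<std::recursive_mutex> reentrant_handler;  // avoids self-deadlock
```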
HS/HA Consequences, part II • Pop back up to HS/HA • Now that we’ve addressed queue locking issues • A new concurrency issue • Chains that are ready may wait • While thread works on others • Could increase # of MDRHs • Also increase context switches • A better idea • Notice the chaining of related requests? • Looks like “sorting the mail” • I.e., apply Leader/Followers
[Figure: Mixed Duration Request Handlers (MDRH): input thread enqueues requests; worker thread chains related requests]
Leader/Followers Context • The leader, when elected, picks an incomplete chain • Appends requests from reactor to growing chains • When a chain is complete • If it’s for the leader • Leader takes request chain • A new leader is elected • Old leader becomes follower • Old leader works on the chain • If it’s for someone else • Leader hands off to others • Continues to build chains
[Figure: Mixed Duration Request Handlers (MDRH): input thread enqueues requests; leader thread hands off complete chains to follower threads]
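A compressed sketch of the leader election mechanics, assuming a shared queue of completed chains and condition variables for promotion (it omits the handle set and chain-ownership details): exactly one thread is leader at a time, and it promotes a follower to leader before processing the chain it claimed.

```cpp
#include <condition_variable>
#include <deque>
#include <mutex>
#include <vector>

class LeaderFollowersPool {
public:
    // 'Chain' stands in for a complete request chain; ints used for illustration.
    using Chain = std::vector<int>;

    // Each pool thread runs this loop.
    void thread_loop() {
        for (;;) {
            Chain chain;
            {
                std::unique_lock<std::mutex> lk(m_);
                // Wait until no thread is leader, then become the leader.
                leader_cv_.wait(lk, [this] { return !has_leader_; });
                has_leader_ = true;
                // As leader, wait for a complete chain to arrive.
                work_cv_.wait(lk, [this] { return !ready_chains_.empty(); });
                chain = std::move(ready_chains_.front());
                ready_chains_.pop_front();
                // Promote a follower to leader before doing the work.
                has_leader_ = false;
            }
            leader_cv_.notify_one();
            process(chain);  // former leader processes its chain as a follower
        }
    }

    // Called when a complete request chain has been assembled.
    void submit_chain(Chain chain) {
        { std::lock_guard<std::mutex> g(m_); ready_chains_.push_back(std::move(chain)); }
        work_cv_.notify_one();
    }

private:
    void process(const Chain&) { /* run the chain's requests in order */ }

    std::mutex m_;
    std::condition_variable leader_cv_, work_cv_;
    std::deque<Chain> ready_chains_;
    bool has_leader_ = false;
};
```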
Leader/Followers Consequences • Key benefits • Greater cache affinity: thread locality of reference • Context switch only at some events • Liabilities • Some increase in implementation complexity • I.e., to elect leaders, identify whose chain is whose • We’ll just live with this for the purposes here • Some additional overhead • But we already separated reactor from handlers • So impact on lowest asynchronous level is reduced • We’ll go a couple steps further in reducing this • Overall, benefits of applying LF here outweigh costs
TSS Context & Consequences • Must lock thread hand-off • But access w/in a chain is single-threaded and thus ok • Can save results locally • Put request result in TSS • Retrieve as input for next request in that chain • Benefits • Avoids locking overhead • Increases concurrency • Liabilities • Slightly more complex to store and fetch results • Portability of TSS (may need to emulate in code)
[Figure: Mixed Duration Request Handlers (MDRH): input thread enqueues requests; leader thread hands off complete chains to follower threads]
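In C++11 and later, thread-specific storage can be sketched with the thread_local keyword (older compilers may need pthread keys or an emulation layer); the names below are illustrative. Each thread keeps its own intermediate result for the chain it is processing, so storing and fetching it needs no lock.

```cpp
#include <string>

// Each thread gets its own copy: no locking needed to store or fetch
// intermediate results while working through a request chain.
thread_local std::string last_result;

std::string run_request(const std::string& input) {
    // ... process the request (illustrative placeholder) ...
    return "result-for-" + input;
}

void process_chain_step(const std::string& request) {
    // Previous step's result is retrieved from this thread's own storage.
    std::string combined = last_result + "|" + request;
    last_result = run_request(combined);  // stored back without contention
}
```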
A Few Remaining Issues • Shared services across threads (naming, lookup) • Threads need safe access • Design trade-off • Between simplicity and potential concurrency improvement • Alternative patterns • Monitor Object • E.g., for new services • Thread-Safe Interface • E.g., for legacy services
[Figure: Mixed Duration Request Handlers (MDRH): input thread enqueues requests; leader thread hands off complete chains to follower threads]
Monitor Object Context & Consequences • Threads use common services • E.g., Naming, Directory • Need thread-safe service access • Single copy of each service • Inefficient to duplicate services • Threads access them sporadically • Could just make services active • More complex if even possible • Solution: Monitor Object • Synchronize access to the service • Consequences • If worried about performance, can yield on condition variables • Assume that’s not an issue here (reasonable for Naming, etc.)
[Figure: follower threads accessing a single common service]
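A minimal Monitor Object sketch for a shared naming service (class and method names are assumptions, not the course's code): every public method acquires the monitor's lock, so client threads are active in the object one at a time; a condition variable could be added if a method ever needed to wait.

```cpp
#include <map>
#include <mutex>
#include <string>

// Monitor Object sketch: all public methods synchronize on one internal
// lock, so client threads access the shared service one at a time.
class NamingService {
public:
    void bind(const std::string& name, const std::string& address) {
        std::lock_guard<std::mutex> guard(monitor_lock_);
        registry_[name] = address;
    }

    std::string resolve(const std::string& name) const {
        std::lock_guard<std::mutex> guard(monitor_lock_);
        auto it = registry_.find(name);
        return it != registry_.end() ? it->second : std::string();
    }

private:
    mutable std::mutex monitor_lock_;   // the monitor's single lock
    std::map<std::string, std::string> registry_;
};
```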
Thread-Safe Interface Context and Consequences • Since synchronous service performance is fine • Go for simplicity, low overhead • Avoids self-deadlock from intra-component method calls • Avoid complexity of condition-based thread yield • Minimizes locking overhead • Separates concerns • Interface methods deal with synchronization • Implementation methods deal with functionality • Only option with some single-threaded legacy services • Can’t modify, so wrap instead
[Figure: follower threads call a thread-safe service interface that wraps the service implementation]
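A sketch of the Thread-Safe Interface idiom around a directory-style service (names illustrative): the public interface methods acquire the lock and delegate to private implementation methods that do no locking, so internal calls between operations cannot self-deadlock on a non-recursive mutex.

```cpp
#include <map>
#include <mutex>
#include <string>

class DirectoryService {
public:
    // Interface method: handles synchronization, then delegates.
    void add_entry(const std::string& key, const std::string& value) {
        std::lock_guard<std::mutex> guard(lock_);
        add_entry_i(key, value);           // implementation method, no locking
    }

    // Interface method that reuses other operations internally: calling the
    // *_i methods avoids acquiring the lock twice (i.e., self-deadlock).
    void add_alias(const std::string& alias, const std::string& key) {
        std::lock_guard<std::mutex> guard(lock_);
        std::string value = lookup_i(key);
        add_entry_i(alias, value);
    }

private:
    // Implementation methods: assume the lock is already held; pure functionality.
    void add_entry_i(const std::string& key, const std::string& value) {
        table_[key] = value;
    }
    std::string lookup_i(const std::string& key) const {
        auto it = table_.find(key);
        return it != table_.end() ? it->second : std::string();
    }

    std::mutex lock_;
    std::map<std::string, std::string> table_;
};
```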
Pattern Language Review • We first applied Active Object • Made long-running handlers concurrent • Then we applied Half-Sync/Half-Async • For chains of short-running requests • Next were scoped & strategized locking • To make thread hand-off safe, efficient • Then we applied Leader/Followers • To reduce thread context switches
Pattern Language Review, Continued • We applied TSS next • Efficient access to separate information for each thread can help reduce contention • Finally we applied Monitor Object and/or Thread-Safe Interface • For synchronized access to common (possibly legacy) services
Concluding Remarks • Design mastery is often not based on novelty • Though occasionally it can help • It is mostly a matter of careful observation • of the current context as the design unfolds … • and how consequences of each decision shape later ones • Study patterns and pattern languages • Benefit from prior experience with patterns • Including published experience of other “design masters” • Practice working through design examples like this • Build intuition about how patterns fit together • Mastery is a journey not a destination • Learn from every design, yours and others