Concurrency, Threads, and Events Ken Birman (Based on a slide set prepared by Robbert van Renesse)
Summary Paper 1 • Using Threads in Interactive Systems: A Case Study (Hauser et al., 1993) • Analyzes two interactive computing systems • Classifies thread usage • Finds that programmers are still struggling • (pre-Java) • Limited scheduling support • Priority inversion
Summary Paper 2 • SEDA: An Architecture for Well-Conditioned, Scalable Internet Services (Welsh, 2001) • Analyzes threads vs event-based systems, finds problems with both • Suggests trade-off: stage-driven architecture • Evaluated for two applications • Easy to program and performs well
What is a thread? • A traditional “process” is an address space and a thread of control. • Now add multiple threads of control • Share address space • Individual program counters and stacks • Same as multiple processes sharing an address space.
Thread Switching • To switch from thread T1 to T2: • Thread T1 saves its registers (including pc) on its stack • Scheduler remembers T1’s stack pointer • Scheduler restores T2’s stack pointer • T2 restores its registers • T2 resumes • Two models: preemptive/non-preemptive
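The switching steps above can be imitated in the non-preemptive model with Python generators. This is only a sketch: the names `worker` and `run_round_robin` are hypothetical, and a suspended generator frame stands in for the saved registers and stack pointer that a real thread package would manage.

```python
from collections import deque

def worker(name, steps, log):
    """A cooperative 'thread': each yield hands control back to the scheduler."""
    for i in range(steps):
        log.append(f"{name}:{i}")
        yield  # state stays in the suspended generator frame

def run_round_robin(threads):
    """Non-preemptive scheduler: resume each thread in turn until all finish."""
    ready = deque(threads)
    while ready:
        current = ready.popleft()
        try:
            next(current)          # restore the saved state and resume
            ready.append(current)  # it yielded, so it is still runnable
        except StopIteration:
            pass                   # thread finished; drop it

log = []
run_round_robin([worker("T1", 2, log), worker("T2", 3, log)])
print(log)
```

A preemptive scheduler would differ in that the switch is forced by a timer rather than requested by the running thread.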
Thread Scheduler • Maintains the stack pointer of each thread • Decides what thread to run next • E.g., based on priority or resource usage • Decides when to pre-empt a running thread • E.g., based on a timer • May need to deal with multiple CPUs • But not usually • “fork” creates a new thread • Blocking or calling “yield” lets scheduler run
Synchronization Primitives • Semaphores • P(S): block if semaphore is “taken” • V(S): release semaphore • Monitors: • Only one thread active in a module at a time • Threads can block waiting for some condition using the WAIT primitive • Threads need to signal using NOTIFY or BROADCAST
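As a rough illustration of both primitives, the sketch below uses Python's `threading` module. `Semaphore.acquire` and `Semaphore.release` play the roles of P and V, and a class with one lock and two `Condition` objects approximates a monitor with WAIT and NOTIFY. The names `BoundedBuffer`, `run_buffer_demo`, and `count_with_semaphore` are made up for this example.

```python
import threading

def count_with_semaphore(workers=4, increments=1000):
    """Binary semaphore guarding a shared counter."""
    sem = threading.Semaphore(1)
    total = 0
    def work():
        nonlocal total
        for _ in range(increments):
            sem.acquire()   # P(S): block if the semaphore is taken
            total += 1
            sem.release()   # V(S): release the semaphore
    threads = [threading.Thread(target=work) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return total

class BoundedBuffer:
    """Monitor-style buffer: only one thread is active inside at a time."""
    def __init__(self, capacity):
        self._items = []
        self._capacity = capacity
        self._lock = threading.Lock()
        self._not_full = threading.Condition(self._lock)
        self._not_empty = threading.Condition(self._lock)

    def put(self, item):
        with self._lock:
            while len(self._items) >= self._capacity:
                self._not_full.wait()        # WAIT for free space
            self._items.append(item)
            self._not_empty.notify()         # NOTIFY one waiting consumer

    def get(self):
        with self._lock:
            while not self._items:
                self._not_empty.wait()       # WAIT for an item
            item = self._items.pop(0)
            self._not_full.notify()          # NOTIFY one waiting producer
            return item

def run_buffer_demo(n=5):
    """One producer and one consumer exchange n items through the monitor."""
    buf = BoundedBuffer(capacity=2)
    received = []
    producer = threading.Thread(target=lambda: [buf.put(i) for i in range(n)])
    consumer = threading.Thread(
        target=lambda: received.extend(buf.get() for _ in range(n)))
    producer.start()
    consumer.start()
    producer.join()
    consumer.join()
    return received
```

The `while` loops around `wait()` re-check the condition after waking, which is the usual discipline when NOTIFY is only a hint that the condition may now hold.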
Uses of threads • To exploit CPU parallelism • Run two CPUs at once in the same program • To exploit I/O parallelism • Run I/O while computing, or do multiple I/O • Listen to the “window” while also running code, e.g. allow commands during an interactive game • For program structuring • E.g., timers • To avoid deadlock in RPC-based applications
Hauser’s categorization • Defer Work: asynchronous activity • Print, e-mail, create new window, etc. • Pumps: pipeline components • Wait on input queue; send to output queue • E.g., slack process: add latency for buffering • Sleepers & one-shots • Periodic activity & timers
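A pump in this sense might look like the following sketch, which uses Python's `queue.Queue` for the input and output queues. The function names, the `_STOP` sentinel, and the two transformations are assumptions made for the example, not part of Hauser's paper.

```python
import queue
import threading

_STOP = object()  # hypothetical sentinel that shuts the pipeline down

def pump(inbox, outbox, transform):
    """Wait on the input queue, transform each item, send it to the output queue."""
    while True:
        item = inbox.get()
        if item is _STOP:
            outbox.put(_STOP)   # pass the shutdown signal downstream
            return
        outbox.put(transform(item))

def run_pipeline(values):
    """Two pumps chained together: add one, then double."""
    q1, q2, q3 = queue.Queue(), queue.Queue(), queue.Queue()
    stages = [
        threading.Thread(target=pump, args=(q1, q2, lambda x: x + 1)),
        threading.Thread(target=pump, args=(q2, q3, lambda x: x * 2)),
    ]
    for t in stages:
        t.start()
    for v in values:
        q1.put(v)
    q1.put(_STOP)
    results = []
    while True:
        item = q3.get()
        if item is _STOP:
            break
        results.append(item)
    for t in stages:
        t.join()
    return results

print(run_pipeline([1, 2, 3]))
```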
Categorization, cont’d • Deadlock Avoiders • Avoid deadlock through ordered acquisition of locks • When needing more locks, roll-back and re-acquire • Task Rejuvenation: recovery • Start new thread when old one dies, say because of uncaught exception
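Ordered lock acquisition can be sketched as follows. The `Account` class, its `rank` field, and `transfer` are hypothetical; the only point is that every thread takes the two locks in the same global order, so the cycle "X waits for Y, Y waits for X" cannot form. The sketch assumes the two accounts are distinct.

```python
import threading

class Account:
    _next_rank = 0

    def __init__(self, balance):
        self.balance = balance
        self.lock = threading.Lock()
        self.rank = Account._next_rank   # assumed global lock order
        Account._next_rank += 1

def transfer(src, dst, amount):
    """Move money between two distinct accounts, locking in rank order."""
    first, second = sorted((src, dst), key=lambda a: a.rank)
    with first.lock, second.lock:
        src.balance -= amount
        dst.balance += amount

def run_opposing_transfers(rounds=1000):
    """Two threads transfer in opposite directions without deadlocking."""
    a, b = Account(100), Account(100)
    t1 = threading.Thread(
        target=lambda: [transfer(a, b, 1) for _ in range(rounds)])
    t2 = threading.Thread(
        target=lambda: [transfer(b, a, 2) for _ in range(rounds)])
    t1.start()
    t2.start()
    t1.join()
    t2.join()
    return a.balance, b.balance
```

If each thread instead locked `src` first and `dst` second, the two threads above could each hold one lock and wait forever for the other.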
Categorization, cont’d • Serializers: event loop • for (;;) { get_next_event(); handle_event(); } • Concurrency Exploiters • Use multiple CPUs • Encapsulated Forks • Hidden threads used in library packages • E.g., menu-button queue
Common Problems • Priority Inversion • High priority thread waits for low priority thread • Solution: temporarily push priority up (rejected??) • Deadlock • X waits for Y, Y waits for X • Incorrect Synchronization • Forgetting to release a lock • Failed “fork” • Tuning • E.g. timer values in different environment
Problems he neglects • Implicit need for ordering of events • E.g. thread A is supposed to run before thread B does, but something delays A • Non-reentrant code • Languages lack “monitor” features and users are perhaps surprisingly weak at detecting and protecting concurrently accessed data
Criticism of Hauser • He assumes superb programmers and seems to believe that “most” programmers won’t use threads (his example systems are really platforms, not applications) • Systems old but/and not representative • Pre-Java and C# • And now there are some tools that can help discover problems
What is an Event? • An object queued for some module • Operations: • create_event_queue(handler) → EQ • enqueue_event(EQ, event-object) • Invokes, eventually, handler(event-object) • Handler is not allowed to block • Blocking could cause entire system to block • But page faults, garbage collection, …
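The two operations can be sketched in a few lines of Python. The `EventScheduler` class and its round-robin `run_until_idle` loop are assumptions added so the example runs; the slides name only `create_event_queue` and `enqueue_event`.

```python
from collections import deque

class EventQueue:
    def __init__(self, handler):
        self.handler = handler
        self.pending = deque()

class EventScheduler:
    """Single-threaded scheduler: handlers run to completion, never preempted."""
    def __init__(self):
        self.queues = []

    def create_event_queue(self, handler):
        eq = EventQueue(handler)
        self.queues.append(eq)
        return eq

    def enqueue_event(self, eq, event):
        eq.pending.append(event)   # handler(event) is invoked later

    def run_until_idle(self):
        """Visit queues round-robin, one event per visit, until all are empty."""
        progressed = True
        while progressed:
            progressed = False
            for eq in self.queues:
                if eq.pending:
                    eq.handler(eq.pending.popleft())
                    progressed = True

sched = EventScheduler()
printed = []
printer = sched.create_event_queue(printed.append)
doubler = sched.create_event_queue(
    lambda e: sched.enqueue_event(printer, e * 2))
sched.enqueue_event(doubler, 1)
sched.enqueue_event(doubler, 2)
sched.run_until_idle()
print(printed)
```

Because a handler that blocks would stall `run_until_idle` for every queue, handlers in this style hand work onward by enqueueing another event instead of waiting.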
Example Event System (Also common in telecommunications industry, where it’s called “workflow programming”)
Event Scheduler • Decides which event queue to handle next. • Based on priority, CPU usage, etc. • Never pre-empts event handlers! • No need for stack / event handler • May need to deal with multiple CPUs
Synchronization? • Handlers cannot block, so no synchronization is needed • Handlers should not share memory • At least not in parallel • All communication through events
Uses of Events • CPU parallelism • Different handlers on different CPUs • I/O concurrency • Completion of I/O signaled by event • Other activities can happen in parallel • Program structuring • Not so great… • But can use multiple programming languages!
Hauser’s categorization ?! • Defer Work: asynchronous activity • Send event to printer, etc • Pumps: pipeline components • Natural use of events! • Sleepers & one-shots • Periodic events & timer events
Categorization, cont’d • Deadlock Avoiders • Ordered lock acquisition still works • Task Rejuvenation: recovery • Watchdog events?
Categorization, cont’d • Serializers: event loop • Natural use of events and handlers! • Concurrency Exploiters • Use multiple CPUs • Encapsulated Events • Hidden events used in library packages • E.g., menu-button queue
Common Problems • Priority inversion, deadlock, etc. much the same with events
Threads vs. Events • Event-based systems use fewer resources • Better performance (particularly scalability) • Event-based systems harder to program • Have to avoid blocking at all cost • Block-structured programming doesn’t work • How to do exception handling? • In both cases, tuning is difficult
Both? • In practice, many kinds of systems need to support both threads and events • Threaded programs in Unix are the common example of these, because window systems use events • The programmer uses cthreads or pthreads • Major problem: the UNIX kernel interface wasn’t designed with threads in mind!
Why does this cause problems? • Many system calls block the “process” • File read or write, for example • And many libraries aren’t reentrant • So when the user employs threads • The application may block unexpectedly • Limited work-around: add “kernel threads” • And the user might stumble into a reentrancy bug
Events as seen in Unix • Window systems use small messages… • But the “old” form of event is the signal • Kernel basically simulates an interrupt into the user’s address space • The “event handler” then runs… • But can it launch new threads? • Some system calls can return EINTR • Very limited options to “block” signals in critical sections
How do people work around this? • They try not to do blocking I/O • Use asynchronous system calls… or select… or some mixture of the two • Or try to turn the whole application into an event-driven one using pools of threads, in the SEDA model (more or less) • One dedicated thread per I/O “channel”, to turn signal-style events into events on the event queue for the processing stage
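The select-style workaround can be sketched with Python's standard `selectors` module, which wraps select and its relatives. The function name `read_all_ready` and the use of `socket.socketpair` as stand-in I/O channels are assumptions made so the example is self-contained.

```python
import selectors
import socket

def read_all_ready(messages):
    """Send one message per channel, then read each channel only when ready."""
    sel = selectors.DefaultSelector()
    writers = []
    for channel_id, msg in enumerate(messages):
        reader, writer = socket.socketpair()
        reader.setblocking(False)            # reads never block the process
        sel.register(reader, selectors.EVENT_READ, data=channel_id)
        writer.sendall(msg)
        writers.append(writer)

    received = {}
    while len(received) < len(messages):
        ready = sel.select(timeout=1.0)      # wait for any channel
        if not ready:
            break                            # nothing arrived in time; give up
        for key, _mask in ready:
            received[key.data] = key.fileobj.recv(4096)
            sel.unregister(key.fileobj)
            key.fileobj.close()

    for w in writers:
        w.close()
    sel.close()
    return received

print(read_all_ready([b"ping", b"pong"]))
```

A single thread services every channel here, so no read can stall the whole process the way a blocking system call would.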
This can be hard, but it works • Must write the whole program and have a way to review any libraries it uses! • One learns, the hard way, that pretty much nothing else works • Unix programs built by inexperienced developers are often riddled with concurrency bugs!
SEDA • Mixture of models of threads and (small message-style) events • Events, queues, and “pools of event handling threads”. • Pools can be dynamically adjusted as need arises. • Similar to JavaBeans and EventListeners?
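A stage in this style, an event queue served by a small pool of threads, might be sketched as below. This is not the SEDA implementation: the `Stage` class and `run_two_stage` are hypothetical, the pool size is fixed, and the dynamic adjustment mentioned above is left out.

```python
import queue
import threading

class Stage:
    """An event queue plus a fixed pool of handler threads."""
    def __init__(self, handler, workers, next_stage=None):
        self.handler = handler
        self.next_stage = next_stage
        self.events = queue.Queue()
        for _ in range(workers):
            threading.Thread(target=self._run, daemon=True).start()

    def enqueue(self, event):
        self.events.put(event)

    def _run(self):
        while True:
            event = self.events.get()
            try:
                result = self.handler(event)
                if self.next_stage is not None:
                    self.next_stage.enqueue(result)  # hand off to next stage
            finally:
                self.events.task_done()

    def drain(self):
        """Block until every event enqueued so far has been handled."""
        self.events.join()

def run_two_stage(requests):
    results = []
    results_lock = threading.Lock()

    def respond(text):
        with results_lock:
            results.append(text)

    respond_stage = Stage(respond, workers=1)
    parse_stage = Stage(str.upper, workers=3, next_stage=respond_stage)
    for r in requests:
        parse_stage.enqueue(r)
    parse_stage.drain()      # upstream first, so all handoffs have happened
    respond_stage.drain()
    return sorted(results)

print(run_two_stage(["get /a", "get /b"]))
```

Each stage can be given a different pool size, which is where a tuning policy could be attached.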
Authors: “Best of both worlds” • Ease of programming of threads • Or even better • Performance of events • Or even better
Threads Considered Harmful • Like goto, transfer to some entry in program • In any scope • Destroys structure of programs • Primitive Synchronization Primitives • Too low-level • Too coarse-grained • Too error-prone • Prone to over-specification
Example: create file • Create file • Read current directory (may be cached) • Update and write back directory • Write file
Thread Implementations • Serialize: op1; op2; op3; op4 • Simplest and most common • Use threads • Requires at least two semaphores! • Results in complicated program • Simplified threads • Create file and read directory in parallel • Barrier • Write file and write directory in parallel • Over-specification!
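The "simplified threads" variant can be sketched against an in-memory stand-in for the disk. Everything here is an assumption made for illustration: `disk` is a plain dictionary, and `run_parallel` and `create_file` are hypothetical helpers, not a real file system API.

```python
import threading

def run_parallel(*tasks):
    """Run the tasks in separate threads and wait for all of them."""
    threads = [threading.Thread(target=t) for t in tasks]
    for t in threads:
        t.start()
    for t in threads:
        t.join()              # joining every thread acts as the barrier

def create_file(disk, name, data):
    state = {}

    def create():
        state["inode"] = {"data": None}

    def read_dir():
        state["dir"] = dict(disk["directory"])

    # Phase 1: create file and read directory in parallel, then barrier.
    run_parallel(create, read_dir)

    def write_file():
        state["inode"]["data"] = data

    def write_dir():
        state["dir"][name] = state["inode"]
        disk["directory"] = state["dir"]

    # Phase 2: write file and write directory in parallel.
    run_parallel(write_file, write_dir)
    return disk

disk = {"directory": {}}
create_file(disk, "notes.txt", b"hello")
print(disk["directory"]["notes.txt"]["data"])
```

The over-specification is visible in the barrier: `write_file` depends only on `create`, yet it is also forced to wait for `read_dir`.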
Event Implementation • Create a dummy handler that awaits file creation and directory read events and then sends an event to update the directory. • Not great…
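Such a dummy handler might look like the sketch below. The `JoinHandler` class and the event names `file_created` and `dir_read` are hypothetical; the handler only records what has arrived and fires once both events have been seen.

```python
class JoinHandler:
    """Collects the expected events, then triggers the follow-up exactly once."""
    def __init__(self, expected, on_complete):
        self.expected = set(expected)
        self.seen = {}
        self.done = False
        self.on_complete = on_complete

    def __call__(self, event):
        kind, payload = event
        self.seen[kind] = payload
        if not self.done and self.expected.issubset(self.seen):
            self.done = True
            # In a full system this would enqueue an "update directory" event.
            self.on_complete(dict(self.seen))

updates = []
handler = JoinHandler(["file_created", "dir_read"], updates.append)
handler(("file_created", 7))
handler(("dir_read", {"old.txt": 3}))
print(updates)
```

The bookkeeping that a thread would keep implicitly on its stack has to be stored by hand in `seen`, which is part of why this style is "not great".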
GOP: Discussion • Specifies dependencies at a high level • No semaphores, condition variables, etc. • No explicit threads nor events • Can easily be supported by many languages • C, Java, etc. • Top-down specification • Compare with make, Prolog, theorem provers • Exception handling easily supported
Conclusion • Threads still problematic • As a code structuring mechanism • High resource usage • Events also problematic • Hard to code, but efficient • SEDA and GOP address shortcomings • But neither can be said to have taken hold
Issues not discussed • Kernel vs. User-level implementation • Shared memory and protection tradeoffs • Problems seen in demanding applications that may launch enormous numbers of threads