Why Events Are A Bad Idea (for high-concurrency servers) • Rob von Behren, Jeremy Condit and Eric Brewer • Computer Science Division, University of California at Berkeley • {jrvb, jcondit, brewer}@cs.berkeley.edu • http://capriccio.cs.berkeley.edu/ • Final Version, Proceedings of HotOS IX, Lihue, Kauai, Hawaii, May 2003 • Presented by: Ryan Ledbetter • CS533 - Concepts of Operating Systems, Portland State University, Professor Jonathan Walpole • January 26, 2009
Events are better, according to other literature • Inexpensive synchronization (cooperative multitasking) • Lower overhead for state (short stacks) • Better scheduling and locality (application-level information) • More flexible control flow (not just call/return)
Threads are better, according to von Behren et al. • Threads are more natural • Historically threads had some drawbacks, but not now • With small improvements to compilers and runtime systems, the older drawbacks are eliminated • It is easier to apply compiler optimizations to a threaded program
Threads Vs. Events • Old debate • Lauer and Needham discussed it in 1978 • Process-based and message-passing systems are duals • Conclusion: • Both systems, if implemented correctly, should yield similar performance • The one to use is whichever is the better fit for the task at hand
What they missed • Most event systems use cooperative scheduling for synchronization • Most event systems use shared memory/global data structures, which Lauer and Needham said was not typical • SEDA is the only event system that matches Lauer and Needham
“Problems” with Threads • Performance: Many attempts to use threads for high concurrency have not performed well • Control Flow: Threads have restrictive control flow • Synchronization: Thread synchronization mechanisms are too heavyweight • State Management: Thread stacks are an ineffective way to manage live state • Scheduling: The virtual processor model provided by threads forces the runtime system to be too generic and prevents it from making optimal scheduling decisions.
Performance • Historically accurate, due to poor implementations: • did not account for both high concurrency and blocking operations • Overhead of O(n) operations (where n is the number of threads) • High context switch overhead • None of the above is a property of the threaded model
Performance, the Fix • Modifications to the GNU Pth user-level threading package • Removed most of the O(n) operations • Repeated SEDA’s threaded server benchmark • Result: threaded server scales nicely to 100,000 threads (matching the event-based server)
Control Flow • Threads push the programmer to think too linearly • This may cause the use of more efficient control flow patterns to be overlooked
Control Flow, reply • Flash, Ninja, SEDA, and TinyOS use one of three control flow patterns: • Call/Return • Parallel Calls • Pipelines • These can be naturally expressed using threads • Complex patterns are rare
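As a rough illustration of the claim that these three patterns map naturally onto threads, here is a minimal Python sketch. It is not taken from the paper or from any of the systems named on the slide; `fetch`, `call_return`, `parallel_calls` and `pipeline` are hypothetical names, and `fetch` merely stands in for a blocking operation.

```python
import queue
import threading
from concurrent.futures import ThreadPoolExecutor


def fetch(name):
    # Hypothetical stand-in for a blocking operation (disk read, remote call).
    return f"data:{name}"


def call_return(name):
    # Call/return: the thread calls, waits, and continues with the result.
    return fetch(name).upper()


def parallel_calls(names):
    # Parallel calls: fan out to worker threads, then wait for all results.
    with ThreadPoolExecutor(max_workers=4) as pool:
        return list(pool.map(fetch, names))


def pipeline(items):
    # Pipeline: one thread per stage, with queues connecting the stages.
    done = object()  # sentinel that tells a stage to shut down
    q1, q2 = queue.Queue(), queue.Queue()
    results = []

    def parse():
        while (item := q1.get()) is not done:
            q2.put(item.strip())
        q2.put(done)

    def render():
        while (item := q2.get()) is not done:
            results.append(f"<{item}>")

    stages = [threading.Thread(target=parse), threading.Thread(target=render)]
    for stage in stages:
        stage.start()
    for item in items:
        q1.put(item)
    q1.put(done)
    for stage in stages:
        stage.join()
    return results
```

In each case the control flow reads top to bottom inside one function, which is the property the slide attributes to threads.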
Control Flow, reply cont. • Event-based systems often obfuscate the control flow of the system • Programmer must keep track of call and return state (which may be in different parts of the code) • Programmers usually have to use “stack ripping” to save state • Race conditions can arise and can be hard to find.
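To make "stack ripping" concrete, the sketch below contrasts an event-style handler with its threaded equivalent. It is only an illustration under assumptions: `EventLoop`, `read_async`, `FILES` and the handler names are hypothetical and do not come from the paper or from any real event framework.

```python
from collections import deque

FILES = {"index.html": "<h1>hi</h1>"}  # hypothetical in-memory "disk"


class EventLoop:
    """Toy event loop: queued completions are delivered to callbacks."""

    def __init__(self):
        self.pending = deque()

    def read_async(self, path, callback, state):
        # Queue the completion; the callback later receives the saved state.
        self.pending.append((callback, state, FILES.get(path)))

    def run(self):
        while self.pending:
            callback, state, data = self.pending.popleft()
            callback(state, data)


def handle_request(loop, path, responses):
    # First half of one logical function. Every local needed after the
    # read must be "ripped" off the stack into a heap-allocated object.
    state = {"path": path, "responses": responses}
    loop.read_async(path, on_read_done, state)
    # Returning here unwinds the stack, so the call/return pairing is lost.


def on_read_done(state, data):
    # Second half, which may live far away from the first in the source.
    status = 200 if data is not None else 404
    state["responses"].append((state["path"], status, data))


def handle_request_threaded(path):
    # Threaded equivalent: one function, and locals stay on the stack.
    data = FILES.get(path)  # would block in a real server
    status = 200 if data is not None else 404
    return (path, status, data)
```

The event version needs two functions and an explicit state object to do what the threaded version does in three lines, and a debugger stopped in `on_read_done` sees no trace of `handle_request` on the call stack.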
Control Flow, Threads are better • Threads allow for more natural encapsulation of the state • Calls and returns are grouped • Debugging with current tools is easier as the call stack contains the live state
Synchronization • Events can get “free” synchronization because of cooperative multitasking • No need for the runtime system to worry about mutexes, wait queues, etc.
Synchronization, reply • The benefit comes from cooperative multitasking, not from events themselves • Threads can have the same “free” synchronization • NOTE: cooperative multitasking only works with uniprocessors
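The point that cooperatively scheduled threads get the same "free" synchronization can be sketched with Python generators standing in for user-level threads. This is an illustrative toy, not the paper's runtime: `worker`, `run_cooperatively` and `count_with_workers` are hypothetical names, and everything runs on a single OS thread, which matches the uniprocessor caveat above.

```python
from collections import deque


def worker(shared, increments):
    for _ in range(increments):
        # Unlocked read-modify-write. It is safe only because no other
        # task can run between these two statements: there is no yield here.
        current = shared["value"]
        shared["value"] = current + 1
        yield  # the only point where control can pass to another task


def run_cooperatively(tasks):
    # Round-robin scheduler: resume each task until its next yield.
    ready = deque(tasks)
    while ready:
        task = ready.popleft()
        try:
            next(task)
            ready.append(task)
        except StopIteration:
            pass  # task finished, so drop it


def count_with_workers(n_workers, increments):
    shared = {"value": 0}
    run_cooperatively([worker(shared, increments) for _ in range(n_workers)])
    return shared["value"]
```

With preemptive threads on multiple processors the same unlocked update could lose increments, which is why the benefit belongs to the scheduling discipline rather than to events or threads as such.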
State Management • Threads historically face the decision of wasting virtual address space or risking a stack overflow • Events usually have short stacks that fully unwind after each event • Event systems usually minimize state before a blocking point (as state is managed by the programmers)
State Management, reply • Dynamic stack growth would solve the problem of over-allocating or overflowing the stack • Threads provide automatic state management via the call stack, and the compiler can reduce live state at blocking calls
State Management, Exceptions and State Lifetime • State cleanup is easier with threads (stack allocated) • Event-based state is usually heap allocated, and it may be difficult to know when to free the memory: • Memory leaks • Accesses to deallocated memory • Garbage collection (as in Java) is often inappropriate for high-performance systems
Scheduling • Events have more control over scheduling because it can be done at the application level • Applications can choose the best schedule (shortest time, priority, etc.) • Can group runs of the same type of event together.
Scheduling, reply • Lauer and Needham’s duality says we should be able to do the same scheduling with threads
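One way to picture application-level scheduling applied to threads is a cooperative scheduler whose policy is supplied by the application. The sketch below uses a shortest-remaining-work-first policy purely as an example; `request` and `run_shortest_first` are hypothetical names, generators stand in for user-level threads, and none of this is taken from the paper's implementation.

```python
import heapq
import itertools


def request(name, steps, log):
    # A "thread": straight-line code that yields at each blocking point.
    for _ in range(steps):
        yield
    log.append(name)  # record completion order


def run_shortest_first(jobs):
    # jobs is a list of (name, steps) pairs. The application-chosen policy
    # is to always resume the thread with the fewest remaining steps.
    log = []
    tie = itertools.count()  # tie-breaker so generators are never compared
    ready = [(steps, next(tie), request(name, steps, log))
             for name, steps in jobs]
    heapq.heapify(ready)
    while ready:
        remaining, _, thread = heapq.heappop(ready)
        try:
            next(thread)
            heapq.heappush(ready, (remaining - 1, next(tie), thread))
        except StopIteration:
            pass  # thread finished
    return log
```

For example, `run_shortest_first([("long", 3), ("short", 1), ("mid", 2)])` returns `["short", "mid", "long"]`. Swapping the heap key for a priority or an event type would give the other policies mentioned on the previous slide.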
Why Not Fix Events • Create tools/languages that force: • Call/reply matching • Live state management • Shared state management • Would basically be the same as threads? • Some tools/techniques have been created, and their syntax is very similar to threads • Improved events == threads
Compiler Support for Threads • With minor modification the compiler can improve safety, programmer productivity and performance • Mainly compilers can address three key areas: • Dynamic Stack Growth • Live State Management • Synchronization
Compiler, Dynamic Stack Growth • A compiler could determine an upper bound on the stack space a function call will need • Thus it could determine when growth may be needed • Recursion and function pointers are obstacles, but can be dealt with through further analysis
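The idea of a compile-time stack bound can be sketched as a walk over a call graph. This is a simplified illustration, not the analysis proposed in the paper: `max_stack_bytes`, the call-graph dictionary and the per-function frame sizes are all assumed inputs that a compiler would have to supply, and function pointers are assumed to have been resolved into ordinary call-graph edges beforehand.

```python
def max_stack_bytes(call_graph, frame_size, entry):
    """Worst-case stack bytes starting at `entry`.

    Returns None when a reachable cycle (recursion) means there is
    no static bound.
    """
    memo = {}
    on_path = set()  # functions on the current call chain

    def visit(fn):
        if fn in on_path:
            return None  # recursion: depth cannot be bounded statically
        if fn in memo:
            return memo[fn]
        on_path.add(fn)
        deepest = 0
        for callee in call_graph.get(fn, ()):
            depth = visit(callee)
            if depth is None:
                on_path.discard(fn)
                return None
            deepest = max(deepest, depth)
        on_path.discard(fn)
        memo[fn] = frame_size[fn] + deepest
        return memo[fn]

    return visit(entry)
```

A `None` result marks the places where the runtime would need a growth check instead of a fixed allocation, which corresponds to the recursion obstacle on the slide.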
Live State Management • The issue is that state is not minimized before a subroutine call is made • Compilers could look ahead in the code and see which temporary variables can safely be popped off, or whether the entire frame could be popped off • Warn users if a large amount of state is held across a blocking call
Synchronization • Compilers could better analyze the code for race conditions and raise a warning if one is likely • Support atomic sections: • nesC, a language for networked sensors, supports atomic sections • An atomic section cannot yield or block
The Test • Compared two models with two servers: • Event-based model • Haboob (SEDA) • Threaded model • Knot, tested in two versions • Knot-C favored current connections • Knot-A favored accepting new connections • Both used the standard poll system call
Stats • Knot-C • 700 Mbits/s maximum bandwidth • 1024 clients at peak • Haboob • 500 Mbits/s maximum bandwidth • 512 clients at peak
Conclusion • Event-based Haboob must context switch between event handlers • 30,000 context switches per second at peak (6 times that of Knot) • Small modules == lots of queuing • Temporary object creation, thus garbage collection • Order of events determined at runtime • Reducing compiler optimizations