User-Level Interprocess Communication for Shared Memory Multiprocessors • Bershad, B. N., Anderson, T. E., Lazowska, E. D., and Levy, H. M. • Presented by Chris Eigner
Review of LRPC • RPC concept can be used within a single machine as IPC • Caller and callee are on the same machine, leaving room for optimizations • Run client thread in context of server, avoid scheduler • Argument stacks allocated in shared memory, avoid message copying • Domain caching to reduce context-switch overhead
Problems with RPC/LRPC • Kernel mediates every cross-address-space call (70% of total overhead) • Poorly performing cross-address-space communication • Kernel-level communication + user-level thread management • Opportunity for more SMP optimizations
SMP Optimizations • No need to switch processor to another address space • Remove kernel from equation! • Address spaces share memory directly • Processor reallocation can be avoided • Preserves valuable cache/TLB contexts • Cost can be amortized over independent calls • Inexpensive user-level thread management, orders of magnitude cheaper than kernel-level thread management
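The amortization point can be made concrete with a small calculation. This is only a sketch: the function name and the cost values are hypothetical, in arbitrary units, and are not measurements from the paper. It assumes one processor reallocation is shared evenly by a batch of independent calls.

```python
def amortized_cost_per_call(reallocation_cost, per_call_cost, calls_in_batch):
    """Average cost of one call when a single processor reallocation
    is shared by `calls_in_batch` independent calls (hypothetical model)."""
    if calls_in_batch < 1:
        raise ValueError("calls_in_batch must be at least 1")
    return per_call_cost + reallocation_cost / calls_in_batch


# Arbitrary illustrative units, not figures from the paper.
single = amortized_cost_per_call(300, 100, 1)
batched = amortized_cost_per_call(300, 100, 10)
```

Under this model the fixed reallocation cost shrinks per call as the batch grows, which is the intuition behind the bullet above.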
URPC Responsibilities • URPC design isolates three components of IPC • Thread management • Data transfer • Processor reallocation
Thread Management • Context switch • Switching processor to another thread in same address space • Processor reallocation • Reallocating processor to a thread in a different address space • via Processor.Donate
Data Transfer • Bi-directional shared memory queue • Test-and-set locks (non-spinning) on each end • Client/server model • send, receive, start, stop
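A minimal sketch of the data-transfer idea above, not the URPC implementation: each direction of the channel is a queue guarded by a lock that is tried once and never spun on. The names `Channel`, `Link`, `try_send`, and `try_receive` are hypothetical, and `threading.Lock` with `blocking=False` stands in for a test-and-set lock on shared memory.

```python
import threading
from collections import deque


class Channel:
    """One direction of a message queue (illustrative stand-in only)."""

    def __init__(self):
        self._lock = threading.Lock()  # stand-in for a test-and-set lock
        self._messages = deque()

    def try_send(self, message):
        # Non-spinning: if the lock is held, return immediately so the
        # caller can do other work instead of busy-waiting.
        if not self._lock.acquire(blocking=False):
            return False
        try:
            self._messages.append(message)
            return True
        finally:
            self._lock.release()

    def try_receive(self):
        # Returns (True, message) on success, or (False, None) if the
        # lock is held or the queue is empty.
        if not self._lock.acquire(blocking=False):
            return (False, None)
        try:
            if self._messages:
                return (True, self._messages.popleft())
            return (False, None)
        finally:
            self._lock.release()


class Link:
    """Bi-directional channel: one queue per direction."""

    def __init__(self):
        self.to_server = Channel()
        self.to_client = Channel()
```

A client would call `link.to_server.try_send(request)` and later poll `link.to_client.try_receive()`; a failed attempt is a cue to run another thread rather than to retry in a loop.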
Processor Reallocation • URPC makes certain assumptions to reduce processor reallocation • Client has other threads to run or incoming messages • Server has or will have a processor to service message • Allows inexpensive context switch during blocking phase of cross-address call • Enables parallel execution of URPC while avoiding processor reallocation
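The optimistic policy above can be sketched as a small decision function. This is an assumption-laden illustration: `on_blocking_call` and its return labels are hypothetical names, and the real system makes this choice inside the user-level thread scheduler, with `Processor.Donate` as the reallocation call.

```python
from collections import deque


def on_blocking_call(ready_threads, server_has_processor):
    """Decide what a client processor does when a thread blocks on a
    cross-address-space call (hypothetical sketch)."""
    if ready_threads:
        # Cheap path: switch to another thread in the same address space.
        return ("context_switch", ready_threads.popleft())
    if server_has_processor:
        # The server can service the message on its own processor,
        # so no reallocation is needed.
        return ("wait_for_reply", None)
    # Last resort: hand this processor to the server's address space.
    return ("donate_processor", None)
```

Only the last branch corresponds to the expensive processor reallocation; the first two are the cases the optimistic assumptions are meant to make common.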
Performance • Firefly workstation • Four C-VAX processors • 32 MB RAM!!! • Taos OS • Provides kernel-level threads • FastThreads • User-level thread library • URPC • Channel management • Message primitives
Performance worse than LRPC
Deficiencies • Optimistic assumptions won’t always hold • Single-threaded applications • High-latency I/O • Processor reallocation occurs after two optimization checks (approx. 100 μs) • Is there an idle processor? • Is there an underpowered address space to which it can be reallocated? • Voluntary return of processors can’t be guaranteed • Two processors for single computation, only one active at a time
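The two checks that precede a reallocation can be sketched as follows. The function name, the dictionary fields, and the definition of "underpowered" used here (pending messages but no processor assigned) are assumptions made for illustration, not the paper's data structures.

```python
def pick_reallocation_target(processor_is_idle, address_spaces):
    """Return the name of an address space to receive the processor,
    or None if no reallocation should happen (hypothetical sketch)."""
    # Check 1: is there an idle processor?
    if not processor_is_idle:
        return None
    # Check 2: is there an underpowered address space?
    # Assumed definition: it has pending messages and no processor.
    for space in address_spaces:
        if space["pending_messages"] > 0 and space["processors"] == 0:
            return space["name"]
    return None
```

For example, with `spaces = [{"name": "client", "pending_messages": 0, "processors": 1}, {"name": "server", "pending_messages": 3, "processors": 0}]`, an idle processor would be directed to `"server"`.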
Summary • SMP allows new freedoms in RPC design • No need to switch processor to another address space • Preserves valuable cache/TLB contexts • 1-2 orders of magnitude improvement • But, not ideal for all application types • Single-threaded applications • High-latency I/O