150 likes | 163 Views
This review discusses the concept of User-Level Interprocess Communication (IPC) for shared memory multiprocessors, focusing on the benefits and challenges of using RPC and LRPC within a single machine. It also explores the opportunities for SMP optimizations and introduces the URPC design for isolating the components of IPC. The performance, limitations, and considerations for SMP-based RPC design are discussed.
E N D
User-Level Interprocess Communication for Shared Memory MultiprocessorsBershad, B. N., Anderson, T. E., Lazowska, E.D., and Levy, H. M. Presented by Chris Eigner
Review of LRPC • RPC concept can be used within a single machine as IPC • Caller/callee in RPC are on same machine…room for optimizations • Run client thread in context of server, avoid scheduler • Argument stacks allocated in shared memory, avoid message copying • Domain caching to reduce context-switch overhead
Problems with RPC/LRPC • Kernel mediates every cross-address space call - 70% of total overhead • Poor performing cross-address space communication • Kernel-level communication + user-level thread management • Opportunity for more SMP optimizations
SMP Optimizations • No need to switch processor to another address space • Remove kernel from equation! • Address spaces share memory directly • Processor reallocation can be avoided • Preserves valuable cache/TLB contexts • Cost can be amortized over independent calls • Inexpensive thread management; orders of magnitude less than kernel-level.
URPC Responsibilities • URPC design isolates three components of IPC • Thread management • Data transfer • Processor reallocation
Thread Management • Context switch • Switching processor to another thread in same address space • Processor reallocation • Reallocating processor to a thread in a different address space • via Processor.Donate
Data Transfer • Bi-directional shared memory queue • Test-and-set locks (non-spinning) on each end • Client/server model • send, receive, start, stop
Processor Reallocation • URPC makes certain assumptions to reduce processor reallocation • Client has other threads to run or incoming messages • Server has or will have a processor to service message • Allows inexpensive context switch during blocking phase of cross-address call • Enables parallel execution of URPC while avoiding processor reallocation
Performance • Firefly workstation • Four C-VAX processors • 32Mb RAM!!! • Taos OS • Provided kernel level threads • FastThreads • User-level thread library • URPC • Channel management • Message primitives
Performance worse than LRCP
Deficiencies • Optimistic assumptions won’t always hold • Single-threaded applications • High-latency I/O • Processor reallocation occurs after two optimization checks (approx. 100 μs) • Is there an idle processor? • Is there an underpowered address space to which it can be reallocated? • Voluntary return of processors can’t be guaranteed • Two processors for single computation, only one active at a time
Summary SMP allows new freedoms in RPC design • No need to switch processor to another address space • Preserves valuable cache/TLB contexts • 1-2 orders of magnitude improvement • But, not ideal for all application types • Single-threaded applications • High-latency I/O