Efficient User-Space Protocol Implementations with QoS Guarantees Using Real-Time Upcalls. R. Gopalakrishnan and G. M. Parulkar. IEEE/ACM Transactions on Networking, August 1998.
Abstract • Two requirements for protocol implementations to support QoS guarantees within the endsystems: • Efficient processor scheduling for application and protocol processing. • Efficient mechanisms for data movement. • For efficient processing: • Real-Time Upcall (RTU) mechanism in the OS. • Real-time concurrency. • Minimal overhead for concurrency control and context switching compared to thread-based approaches.
Abstract (continued) • For efficient data movement: • Direct movement of data between the application and the network adapter; • Batching the network I/O operations to reduce context switches; • Header-data splitting at the receiver to keep bulk data page aligned. • Experiments: RTU-based TCP and UDP
I. Introduction • There is a growing need for operating systems to support multimedia processing. • The transfer of continuous media (CM) data over the network and its processing at the endsystem must preserve the periodic (real-time) nature of the data. • QoS guarantees: • for data transfer: network (ATM + RSVP) • for processing: OS resource management
I. Introduction • The implementations in this paper: • 1) Real-Time Upcall (RTU) • Concurrent protocol processing in user-space • Processing guarantees • 2) User-Kernel Shared-Memory Facility • Efficient data movement • Efficient network input-output
A. Summary of Main Ideas and Results • 1) The RTU Mechanism • The RTU handler function gets a guaranteed share of the CPU over periodic intervals in time. • RTUs schedule both the protocol and the application code. • The main motivation for the RTU mechanism: protocol processing is usually iterative and operates on fixed-size protocol data units (PDUs). • RTU handlers track their CPU usage in terms of the number of PDUs processed rather than CPU time. • RTUs are scheduled cooperatively rather than in the preemptive style used for real-time threads.
A. Summary of Main Ideas and Results • 1) The RTU Mechanism (continued) • An RTU handler executes periodically based on its priority; each execution consists of a sequence of atomic iterations with a scheduling opportunity at each iteration boundary. • Unlike threads, the RTUs in an address space share a single stack. • No need to save state on each context switch. • Delaying preemption until an iteration boundary simplifies real-time concurrency control.
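The iterative handler structure described above can be sketched as follows. This is a minimal illustration, not the paper's interface: the names `rtu_handler`, `pdus_per_period`, and `process_pdu` are hypothetical, and the periodic scheduler is simulated by calling the handler three times in a row.

```python
from collections import deque


def rtu_handler(pending, pdus_per_period, process_pdu):
    """Run one periodic invocation and return the number of PDUs processed.

    Hypothetical sketch: the budget is expressed in PDUs, not CPU time.
    """
    done = 0
    while pending and done < pdus_per_period:
        process_pdu(pending.popleft())  # one atomic iteration
        done += 1                       # usage is counted per PDU
        # Iteration boundary: the point where the handler could yield.
    return done


# Simulate three periods with a budget of 4 PDUs per period.
queue = deque(range(10))
processed = []
per_period = [rtu_handler(queue, 4, processed.append) for _ in range(3)]
```

Each call stands for one period; work that does not fit in the PDU budget simply stays queued for the next invocation.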
A. Summary of Main Ideas and Results • 2) Efficient Data Movement • Based on a user-kernel shared-memory mechanism. • Support for: • Direct movement of data between application and adapter buffers to avoid data copying. • Batching of network I/O operations to reduce context switches. • Lock-free receive and transmit queues to avoid synchronization overheads. • Remapping of received data rather than copying it.
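One common way to build a lock-free queue between exactly one producer and one consumer is a ring buffer in which each index has a single writer. The sketch below illustrates that general technique only; it is an assumption that the paper's queues resemble it, the class name `SpscRing` is hypothetical, and a real shared-memory version would also need the appropriate memory-ordering guarantees.

```python
class SpscRing:
    """Single-producer, single-consumer ring buffer (illustrative sketch)."""

    def __init__(self, capacity):
        # One slot is left unused so that "full" and "empty" differ.
        self.slots = [None] * (capacity + 1)
        self.head = 0  # advanced only by the consumer
        self.tail = 0  # advanced only by the producer

    def put(self, item):
        """Producer side: return False instead of blocking when full."""
        nxt = (self.tail + 1) % len(self.slots)
        if nxt == self.head:
            return False
        self.slots[self.tail] = item
        self.tail = nxt  # publish the slot after it has been filled
        return True

    def get(self):
        """Consumer side: return None when the queue is empty."""
        if self.head == self.tail:
            return None
        item = self.slots[self.head]
        self.head = (self.head + 1) % len(self.slots)
        return item


ring = SpscRing(2)
results = [ring.put("a"), ring.put("b"), ring.put("c")]
first = ring.get()
```

Because the producer never writes `head` and the consumer never writes `tail`, neither side needs a lock to update its own index.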
A. Summary of Main Ideas and Results • 3) Important Experimental Results • Motivation: combine efficient data movement and scheduling mechanisms to provide guaranteed throughput and delay performance for protocols such as TCP/IP. • Testbed: 133 MHz Pentium, NetBSD OS, 155 Mb/s ATM LAN, TCP/IP and UDP/IP.
TCP/IP       Max. Throughput   Round-Trip Time
RTU-Based    120 Mb/s          2.3 ms
Non-RTU      80 Mb/s           4.3 ms
II. The QoS Framework • Four components: • 1) QoS Specification • Four application classes (for simplicity) • Isochronous media • Bulk data • Low-bandwidth message streams • High-bandwidth message streams • Quantitative parameters • e.g., video stream: frame rate, average frame size
II. The QoS Framework • 2) QoS Mapping (QoS requirements → resource requirements)
II. The QoS Framework • 3) QoS Enforcement • QoS mapping: resource allocation during the setup phase • Implementing QoS enforcement: resource scheduling (e.g., CPU) during the data transfer phase
II. The QoS Framework • 4) Protocol Implementation Model • Protocol code • to reduce the overhead in data movement and context switching
A. CPU Requirements for Protocol Processing • This paper mainly deals with QoS enforcement and protocol implementation issues. • Periodic Processing Model (PRM) • Adequacy of the Periodic Model • Easily implemented, • Amendable
III. Implementing the Periodic Processing Model • Almost no OS closely integrates protocol processing with real-time scheduling. • UNIX: time-sharing • Real-time thread based (RT-Mach, Solaris): • Overhead of multi-threading. • Cost of real-time concurrency control. • Increased context switching due to strict preemption. • Solaris provides only a fixed number of real-time priorities.
A. The RTU Approach • The organization of the RTU facility: • RTU scheduling sits on top of normal scheduling. • RTUs take precedence over normal processes. • Only the process dispatcher is modified.
B. The RMDP Scheduling Policy • The scheduler informs the running task (RTU handler) of the arrival of a higher-priority RTU by writing into a shared-memory variable. • Since RTU handlers are written in an iterative fashion, the running RTU handler checks the shared variable after each iteration and yields the CPU if required. • Thus preemption can be delayed for the duration of one iteration.
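The delayed-preemption check described above can be sketched as a handler loop that polls a flag after every iteration. This is a simplified simulation under stated assumptions: `SharedFlag` and `run_handler` are hypothetical names, the shared-memory variable is modelled as a plain object attribute, and the scheduler's write is simulated from inside the per-PDU callback.

```python
class SharedFlag:
    """Stand-in for the shared-memory variable written by the scheduler."""

    def __init__(self):
        self.yield_requested = False


def run_handler(pdus, flag, process_pdu):
    """Process PDUs until finished or asked to yield; return the remainder."""
    for i, pdu in enumerate(pdus):
        process_pdu(pdu)              # atomic iteration, never interrupted
        if flag.yield_requested:      # checked only at the iteration boundary
            return pdus[i + 1:]
    return []


flag = SharedFlag()
done = []


def process(pdu):
    done.append(pdu)
    if pdu == "b":
        # Simulated: a higher-priority RTU arrives while "b" is in progress.
        flag.yield_requested = True


remaining = run_handler(["a", "b", "c", "d"], flag, process)
```

The iteration in progress always completes before the handler yields, so preemption is delayed by at most one iteration.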
C. Reactive RTUs • Low-delay streams are assigned a period of zero, which gives them the highest priority. • Note: the priority of an RTU is inversely proportional to its period.
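The period-to-priority rule can be illustrated by ordering RTUs by period, shortest first, so that a period of zero always sorts to the front. The function name `dispatch_order` and the example stream names and periods are hypothetical and chosen only for illustration.

```python
def dispatch_order(rtus):
    """Return RTU names from highest to lowest priority.

    rtus maps a name to its period in milliseconds; a shorter period
    means a higher priority, and a period of 0 marks a reactive RTU.
    """
    return sorted(rtus, key=lambda name: rtus[name])


# Hypothetical streams: periods are illustrative, not from the paper.
order = dispatch_order({"video": 33, "audio": 20, "control": 0})
```

With these example values, the reactive `control` RTU is dispatched ahead of both periodic streams.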
D. Other Applications of RTUs • MOD (Multimedia-on-Demand) • MOD server • MOD client • MPEG time (not frames) • Real-Time CORBA • Kernel RTU • The RTU handlers are kernel functions as opposed to being in user processes.
IV. Problem Statement and Solution Outline • KRP model: Kernel-Resident Protocol • Data has to be moved between the application buffers and kernel buffers, and between the kernel buffers and the network adapter. • The resulting overheads of data movement and context switching increase the cost of protocol processing and make it inefficient. • ALP model: Application-Level Protocol
A. Protocol Processing Overheads • 1) System Calls for Network I/O • 2) Data Movement • 3) Asynchronous Event Processing • packet arrivals and timer expirations (interrupts) • 4) Scheduling Concurrent Protocol Activities
B. Solution Outline • User-Kernel Shared-Memory
V. User-Kernel Shared Memory Facility • CAR = Communication Area
V. User-Kernel Shared Memory Facility • 1) Tradeoffs with Wired Pages • Wired pages: to move data directly between adapter and user space. • 2) Lock-Free Buffer Operations • 3) Data Movement Without Copying and System Calls
VI. Protocol Implementation Using RTUs: TCP/IP Example • 1) TCP Input Processing • 2) TCP Output Processing • 3) TCP Timer Processing • 4) Comparison to Kernel TCP Implementation
C. Coexistence of Multiple Stream Types • BG = Bandwidth Guaranteed
IX. Conclusions • Protocol implementations with QoS guarantees in user space using Real-Time Upcalls • RTU + RMDP scheduling • The first of its kind to combine high efficiency with end-to-end guarantees.