Adaptive CPU Allocation for Software based Router Systems. Puneet Zaroo. Software based routers. Implement packet forwarding/processing in software. E.g a PC with multiple NICs. Provide value added services like encryption, network address translation esp. at the network edge. Issues
Adaptive CPU Allocation for Software based Router Systems. Puneet Zaroo
Software based routers • Implement packet forwarding/processing in software. • E.g., a PC with multiple NICs. • Provide value-added services such as encryption and network address translation, especially at the network edge. • Issues • Software architecture. • Per-flow threads / per-packet threads • Division of input, forwarding and output functions • CPU scheduling. • How to determine CPU shares • How to enforce CPU shares.
Objective • Leverage the advantages of a component based software router system. • Flexibility in designing routers • Reusability of software components • Dynamic addition of element modules • Overlay a QoS provisioning mechanism on top of the component based system. • Develop an adaptive QoS system • Adaptive to varying input rate and per-packet processing costs.
Some Software Router Systems • Router Plugins : ETH Zurich, Washington University in St. Louis • Per-flow code modules or plugins. • Implemented in the NetBSD kernel. • Click Modular router : MIT • Routers made of elements composed into a flow graph. • ANTS • Programmable and customizable networks. • Customizable applications acting on packets / packets carrying code as well as data. • X-kernel : University of Arizona • Object oriented interface to protocols. • Can be used on end systems as well as routers. • Scout : University of Arizona, Princeton University • Communication oriented OS based on x-kernel. • Path based abstraction. • Advanced CPU scheduling.
OS support for CPU scheduling • Scout • Proportional scheduling. • CPU balance (extension of work on livelock) • Resource Containers : Rice University • Decoupling of protection domain and resource domain. • Proper accounting of resources to processes. • Resources include threads as well as kernel data structures and memory, bound to containers. • E.g., a web server serving multiple connections. • Processor Capacity Reserves : CMU • Provides support for both time-sharing and real-time systems. • The OS enforces the reservations (CPU share, time period). • Applications are free to change their reservations subject to admission control. • Nemesis : Cambridge • OS does low level resource multiplexing. • Avoiding QoS cross-talk • Support for I/O in user level libraries.
Click • Composable flow-graphs from router elements • Packets travel along graph edges • Element based processing (push/pull). • Element based scheduling. • Multithreaded SMP Click • Issues in flow level QoS on top of an element based architecture • Flow level accounting and scheduling. • CPU balance between input, output and processing. • CPU conservation of idle elements.
CROSS/Linux – Resource reservation with containers • Containers • Group of related elements • Elements doing per flow processing. • Container – CPU resource reservation unit. • Why use containers and not flows ? • Types of Containers • Input • Output • Forwarding • Best Effort • QoS - Packet rate reservations
CROSS/Linux - CPU scheduling • Three level scheduler • Linux schedules CROSS • Linux process scheduler • CROSS schedules Containers • Proportional (Dynamic stride scheduling) • Containers schedule Elements • Simple Round Robin scheduling
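The middle level of this scheduler can be illustrated with a minimal sketch of classic stride scheduling, written in Python for brevity. It is not the CROSS/Linux code: the function name, the `STRIDE1` constant and the container names are hypothetical, and the "dynamic" part (changing tickets while running) is left out. Each container gets a stride inversely proportional to its tickets, and the container with the smallest pass value runs next.

```python
import heapq

STRIDE1 = 1 << 20  # assumed large constant used to derive integer strides


def stride_schedule(tickets, quanta):
    """Return the order in which containers run for `quanta` time slices.

    tickets: dict mapping a (hypothetical) container name to a positive
    integer ticket count. More tickets means a smaller stride and therefore
    a proportionally larger share of the slices.
    """
    # Heap entries are (pass value, name); the lowest pass value runs next.
    heap = [(STRIDE1 // t, name) for name, t in sorted(tickets.items())]
    heapq.heapify(heap)
    order = []
    for _ in range(quanta):
        pass_value, name = heapq.heappop(heap)
        order.append(name)
        # Advance the pass value by this container's stride.
        heapq.heappush(heap, (pass_value + STRIDE1 // tickets[name], name))
    return order


if __name__ == "__main__":
    order = stride_schedule({"input": 1, "qos": 2, "output": 1}, 400)
    print({name: order.count(name) for name in ("input", "qos", "output")})
```

With tickets in the ratio 1:2:1, the container holding two tickets receives about half of the slices. Inside each container, elements would then be visited in simple round-robin order.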
CROSS/Linux – Architectural Enhancements to Click • CPU conservation through sleep/wakeup • Elements tested for scheduling eligibility • Containers tested for scheduling eligibility • Notifier Queues - wake up elements (make eligible for scheduling) • Delayed wakeup • Network interface Input Element • Switching between polling and interrupt • Based on a threshold packet input rate to reduce programmed I/O overhead • Topology discovery • Discovering input/output queues for a container
CROSS/Linux – Enhancements to Click • Virtual interface queues – especially for interface statistics gathering • Linux /proc interface • One directory for each container • Directory provides information about • Container tickets • CPU cycles consumed • Packet rate/drop rate • Elements • Input/Output queues • Set container tickets
CROSS/Linux – Share adaptation • Why? • Inability to do a priori CPU share calculation • Variations in packet input rate • Variations in per-packet processing cost • How? • Scheduler for each container keeps track of • Packet input rate. • Packet drop rate. • CPU cycles used. • Recomputes container shares to eliminate packet drops.
CROSS/Linux – Share adaptation • Statistics maintained by Queues • Packet rates • Packet drop rates • Queues used to connect containers • Packet pass/drop rates at Queues indicate the difference between the required and the actual CPU shares for the container
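One way to picture the statistics that the queues maintain is a bounded queue that counts passed and dropped packets and is sampled once per adaptation period. The sketch below is only an illustration in Python; `StatQueue` and its methods are hypothetical names, not Click or CROSS/Linux APIs.

```python
from collections import deque


class StatQueue:
    """Bounded FIFO between two containers that records pass/drop counts."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.packets = deque()
        self.passed = 0   # packets accepted since the last sample
        self.dropped = 0  # packets dropped since the last sample

    def push(self, packet):
        """Enqueue a packet; drop it (and count the drop) if the queue is full."""
        if len(self.packets) >= self.capacity:
            self.dropped += 1
            return False
        self.packets.append(packet)
        self.passed += 1
        return True

    def pull(self):
        """Dequeue the oldest packet, or return None if the queue is empty."""
        return self.packets.popleft() if self.packets else None

    def sample(self, interval_s=1.0):
        """Return (pass rate, drop rate) in packets/s and reset the counters."""
        rates = (self.passed / interval_s, self.dropped / interval_s)
        self.passed = self.dropped = 0
        return rates
```

A non-zero drop rate at the queue feeding a container suggests that the downstream container drains it too slowly, that is, its actual CPU share is below what it needs.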
Share adaptation Algorithm • Invoked every 1 second • Notation used • T – Ticket share • C – Current CPU share • p – Input packet rate • d – Packet drop rate • m – Maximum input rate • General idea • Increase the ticket share of a container so that packet drops are eliminated at all containers
Input Container share adaptation (Issues) • Pass as many packets as possible up to a maximum. • How to arrive at this maximum? • Forwarding more than the maximum adversely affects the effective router throughput. • Reduce share on observing over allocation.
Input Container – Share adaptation (Algorithm)
if p > m                  /* Input rate too high: reduce share */
    T = C * (m/p)
else if d > 0             /* Increase share to remove packet drops */
    drate = min(d + p, m)
    T = C * (drate/p)
else if (T - C) >= delta  /* Over allocation: reduce share */
    T = T - eps
QoS container – Share adaptation (Issues) • Always forward up to the reserved rate. • Target a forwarding rate range. • Reduce share in case of over allocation
QoS container – Share adaptation (Algorithm)
if p ∈ [R - Dt, R + Dt]   /* No change */
    return
if p > R + Dt             /* Reduce share */
    T = C * (R/p)
else if d > 0             /* Increase share */
    drate = min(p + d, R)
    T = C * (drate/p)
else if (T - C) >= delta  /* Reduce share */
    T = T - eps
Output Container – Share adaptation (Issues) • Try to forward all that is received • Throttling, if any, has already happened upstream • Reduce share in case of over allocation
Output Container – Share adaptation (Algorithm)
if d > 0                  /* Increase share */
    T = C * (1 + d/p)
else if (T - C) >= delta  /* Reduce share */
    T = T - eps
Best Effort Container – Share adaptation • No action taken • System makes no guarantees
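The four per-container rules above can be collected into one runnable sketch. This is a reading of the slide pseudocode, not the CROSS/Linux source: the function name and the `kind` strings are hypothetical, shares are treated as plain fractions, and the guard for `p <= 0` is an added assumption to avoid dividing by zero.

```python
def adapt_share(kind, T, C, p, d, delta, eps, m=None, R=None, Dt=0.0):
    """Return the new ticket share T for one container after one round.

    kind  : "input", "qos", "output" or "best_effort" (hypothetical labels)
    T, C  : ticket share and measured CPU share
    p, d  : input packet rate and packet drop rate
    m     : maximum input rate (input container only)
    R, Dt : reserved rate and tolerance (QoS container only)
    """
    if kind == "best_effort" or p <= 0:
        return T  # no action; the p <= 0 guard is an assumption of this sketch
    if kind == "qos" and R - Dt <= p <= R + Dt:
        return T  # forwarding rate is inside the target range
    # Rate cap: m for input, R for QoS, none for output.
    limit = m if kind == "input" else R if kind == "qos" else None
    if limit is not None and p > limit:
        return C * (limit / p)          # rate too high: reduce share
    if d > 0:
        drate = p + d if limit is None else min(p + d, limit)
        return C * (drate / p)          # scale share up to absorb the drops
    if T - C >= delta:
        return T - eps                  # over allocation: back off slowly
    return T
```

For the output container the increase step reduces to `C * (1 + d/p)`, matching the slide. For example, `adapt_share("output", T=0.2, C=0.2, p=40_000, d=10_000, delta=0.1, eps=0.05)` scales the share by 50,000/40,000.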
Discussion • Packet rate based reservation • Reservations based on packet rates are more intuitive • CPU shares may vary for the same packet rates • C (Actual share) - How is it calculated? • Input container • Only include CPU cycles used in packet processing as opposed to idle polling. • Other containers • Easy to calculate since no idle polling. • m (Maximum forwarding rate) • Either a constant determined at router initialization • Or re-evaluated at each iteration
Evaluation • Using a simulator • Calculates the forwarding rate and drop rate based on the CPU shares. • Mimics the actions of the adaptive algorithm • Eases loading the “router” and testing of diverse workloads • Using a real implementation • CROSS/Linux on an 866 MHz Pentium III CPU.
Adaptive vs. Non Adaptive (Experimental setup) • Input (2 µs), Output (2 µs), Best Effort Container (6 µs). • Router – 1 MHz CPU => max forwarding = 100,000 packets/s • Static ticket assignment = 1:1:1 • Input varied from 0 to 110,000 packets/s in increments of 10,000 packets/s every 10 s.
Adaptive vs. Non Adaptive (Maximum loss free forwarding rate)
Variable packet processing time (Experimental Setup) • Input (2 µs), Best Effort/QoS (6 µs), Output Container (2 µs) • Observe different convergence behavior for QoS / Best Effort • Router – 1 MHz CPU => max forwarding rate initially = 100,000 packets/s • Constant input = 50,000 packets/s • Per packet processing cost increased by 2 µs every 10 s. • Max. forwarding rate = 50,000 packets/s at t = 50 s.
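The forwarding-rate figures in this setup follow from dividing the CPU budget by the total per-packet cost. The short sketch below reproduces that arithmetic under two assumptions that the slide does not state explicitly: on the simulated 1 MHz CPU one cycle equals 1 µs, and the first 2 µs increase takes effect at t = 10 s. The function names are hypothetical.

```python
CPU_CYCLES_PER_S = 1_000_000   # 1 MHz simulated CPU, assumed 1 cycle = 1 µs
BASE_COST = 2 + 6 + 2          # input + Best Effort/QoS + output, in µs


def cost_at(t_s, step_s=10, step_cost=2):
    """Total per-packet cost (µs) at time t_s, growing by step_cost every step_s."""
    return BASE_COST + step_cost * (t_s // step_s)


def max_forwarding_rate(cost_us):
    """Loss-free forwarding limit in packets/s for a given per-packet cost."""
    return CPU_CYCLES_PER_S / cost_us


if __name__ == "__main__":
    for t in range(0, 60, 10):
        print(t, cost_at(t), round(max_forwarding_rate(cost_at(t))))
```

At t = 0 the cost is 10 µs, giving 100,000 packets/s. At t = 50 s it has doubled to 20 µs, giving 50,000 packets/s, which equals the constant input rate.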
Adaptation in m • Hard to determine m at router initialization • May vary with variations in per packet processing costs. • m = max_i (TOTAL_CPU_CPS / cpu_cpp(c_i)), where c_i ∈ C • TOTAL_CPU_CPS - Total CPU cycles per second available to the router • cpu_cpp(c_i) - cycles/packet being used by the flow serviced by container c_i • cpu_cpp(c_i) = cpu_cpi() + cpu_cycles(c_i)/num_packets(c_i) + cpu_cpo() • C - The set of containers servicing active flows
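The estimate of m can be sketched directly from these definitions. In the Python sketch below, `cpi` and `cpo` stand for the values returned by cpu_cpi() and cpu_cpo(), which are read here as the per-packet cycle costs of the input and output containers (an interpretation, since the slide does not define them). Skipping containers that saw no packets and returning `None` when none are active are also assumptions.

```python
def cpu_cpp(cpi, container_cycles, num_packets, cpo):
    """Cycles per packet for the flow served by one container."""
    return cpi + container_cycles / num_packets + cpo


def adaptive_m(total_cpu_cps, cpi, cpo, containers):
    """Estimate m as the best achievable rate over all active containers.

    containers: iterable of (cpu_cycles, num_packets) pairs measured over
    the last period; containers with no packets are treated as inactive.
    """
    rates = [total_cpu_cps / cpu_cpp(cpi, cycles, n, cpo)
             for cycles, n in containers if n > 0]
    return max(rates) if rates else None
```

For instance, with 1,000,000 cycles/s, an input cost of 8 cycles, an output cost of 1 cycle, and two containers measured at 1 and 3 cycles per packet, the cheaper flow determines m: 1,000,000 / 10 = 100,000 packets/s.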
Fixed vs. adaptive m (Experimental setup) • Input (8 µs), Best Effort/QoS (1 µs), Output Container (1 µs) • Router – 1 MHz CPU => max forwarding rate, initially = 100,000 packets/s • Constant input = 50,000 packets/s • Per packet processing cost increased by 2 µs every 5 s • Max forwarding rate = 50,000 packets/s at t = 30 s.
Fixed vs. Adaptive m (Best Effort, QoS, Theoretical maximum)
Advanced Adaptation in m • Previous algorithm gives too much weight to the least expensive flow. • Fine if all packets are destined for that flow. • The packet rate to different flows can be variable. • m = TOTAL_CPU_CPS / weighted_cpu_cpp • weighted_cpu_cpp = Σ_i (cpu_cpp(c_i) * rate(c_i)) / Σ_i rate(c_i), where c_i ∈ C
Adaptive m vs. advanced adaptive m (Experimental Setup) • Input container (5 µs), Output Container (5 µs) • Router (1 MHz CPU) • 2 flows • QoS container (50,000 p/s, 30 µs) => max forwarding rate achievable = 25,000 packets/s • Best Effort container (3 µs) => max forwarding rate achievable = 77,000 packets/s • Input rate to best effort container = 500 packets/s • Input rate to QoS container varied from 15,000 packets/s to 50,000 packets/s in increments of 5,000 packets/s every 5 s.
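Plugging this setup into the two estimates of m shows why the rate-weighted form matters. The sketch below is illustrative Python with hypothetical function names. It assumes 1 cycle = 1 µs on the 1 MHz CPU, takes each flow's cost as input + container + output, and uses the first step of the experiment (15,000 packets/s to the QoS container).

```python
def basic_m(total_cpu_cps, flows):
    """m driven by the cheapest flow: max over flows of total / cpu_cpp."""
    return max(total_cpu_cps / cpp for cpp, _rate in flows)


def weighted_m(total_cpu_cps, flows):
    """m from the rate-weighted average cycles per packet.

    flows: list of (cpu_cpp, rate) pairs; assumes at least one non-zero rate.
    """
    total_rate = sum(rate for _cpp, rate in flows)
    weighted_cpp = sum(cpp * rate for cpp, rate in flows) / total_rate
    return total_cpu_cps / weighted_cpp


CPU = 1_000_000
flows = [
    (5 + 30 + 5, 15_000),  # QoS flow: 40 µs per packet
    (5 + 3 + 5, 500),      # Best Effort flow: 13 µs per packet
]

if __name__ == "__main__":
    print(round(basic_m(CPU, flows)), round(weighted_m(CPU, flows)))
```

The basic estimate follows the cheap Best Effort flow and lands near 77,000 packets/s, although almost all traffic goes to the expensive QoS flow. The weighted estimate stays close to the 25,000 packets/s that the QoS flow can actually sustain.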
Adaptive m vs. advanced adaptive m (Forwarding rate vs. time)
Evaluation on a Router • CROSS/Linux software router platform • Pentium III 866 MHz PC. • 3 network interface cards.
QoS Forwarding (Experimental setup) • 866 MHz PIII router • Input Container (4.5 µs), Best Effort Container (3 µs), QoS container (32,000 packets/s), Output Container (4.9 µs) • 3 different per-packet processing costs for the QoS container • 3, 9.7 and 15.2 µs • Input to QoS => 32,000 packets/s • Input to Best Effort => 27,000 packets/s
Effective Forwarding rate (Experimental setup) • Input (4.5 µs), best effort (8.3 µs) and output (4.9 µs) • Maximum forwarding rate = 57,000 p/s • 3 different scenarios • No Adaptation • CPU share Adaptation and m = 65,000 packets/s • CPU share Adaptation and m = 110,000 packets/s
Future Work • Conjoint CPU – Buffer Allocation • Insufficient CPU share => always packet drops • Once sufficient CPU shares, more buffering => more efficiency • More buffering => higher packet delays and packets getting dropped at line cards. • Share adaptation between Linux/CROSS • Can use the SFQ scheduler already implemented
Conclusion • Provide a QoS provisioning layer on top of a component based system. • Adaptive in response to variable packet input and processing costs.