Routers with a Single Stage of Buffering
ACM SIGCOMM, Friday, Aug 23rd 2002
Sundar Iyer, Rui Zhang, Nick McKeown
High Performance Networking Group, Stanford University, http://yuba.stanford.edu
Contents
• Background & motivation
• An abstraction of a router with the pigeon-hole principle
• Analysis of a FIFO work conserving router
• Analysis of a PIFO work conserving router
• Comparison and conclusion
How do we design routers?
• We pick a capacity target & architecture.
• Then, and only then, do we simulate.
• We can't guarantee that the router is work conserving.
• We hope customers will put up with unpredictable performance.
[Figure: delay versus load, with load running from 0 to 100]
How can we re-design routers?
• We require that a router be work conserving.
• We take the departure time of a packet as a requirement.
• We pick a capacity target & architecture.
• Then, we see how we can build a router.
Motivation (A different way of thinking about routers)
• This talk is about routers which are designed from the ground up.
• We start with the departure times of the packets and take them as a requirement.
• Then, we figure out the amount of resources that we need to ensure that each packet departs at its departure time.
A look at single buffered routers
A router with 3 ports with memories which run at the line rate
[Figure: packets A1, B1 and C1 to C4 at ports of rate R. Legend: arriving packets, packets in buffers, empty buffers, final arriving packet. Annotations: destined to output C; FIFO departure time = 4.]
An abstract model
• There are P pigeon holes, each of which can contain an infinite number of pigeons.
• Assume that time is slotted, and in any one time slot
  • at most N pigeons can arrive and at most N can depart.
  • at most one pigeon can enter or leave a specific pigeon hole.
• When a pigeon arrives, we know the exact time slot at which it will depart.
• For any router:
  • What is the minimum P, such that all N pigeons can be immediately placed in a pigeon hole when they arrive, and can depart at the right time?
Solving the abstract model
• When a pigeon arrives in a time slot, the pigeon hole chosen for it must not be one used by
  • the other pigeons that arrive in that time slot (no more than N - 1).
  • the pigeons that depart in that time slot (no more than N).
  • the other pigeons that depart in the same time slot as this pigeon, perhaps in the future (no more than N - 1).
• Together these constraints rule out at most (N - 1) + N + (N - 1) = 3N - 2 pigeon holes.
• By the pigeon-hole principle, 3N - 1 pigeon holes are sufficient for each pigeon to arrive and leave the pigeon holes without conflict.
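The counting argument on this slide can be checked with a small simulation. The sketch below is illustrative only and is not taken from the talk: the function name `simulate`, the random arrival pattern, the `horizon` parameter and the first-fit choice of pigeon hole are all assumptions made for the example.

```python
import random
from collections import defaultdict


def simulate(n, num_holes, slots=2000, horizon=20, seed=0):
    """Return True if every arriving pigeon found a conflict-free hole.

    n         -- at most n arrivals and n departures per time slot
    num_holes -- number of pigeon holes P (hypothetical parameter name)
    """
    rng = random.Random(seed)
    # reads[d] = holes that already have a departure scheduled in slot d
    reads = defaultdict(set)
    for t in range(slots):
        # Holes read in this slot may not be written in this slot.
        busy_now = set(reads.pop(t, set()))
        for _ in range(rng.randint(0, n)):
            # Choose a departure slot that has fewer than n departures so far.
            d = t + rng.randint(1, horizon)
            while len(reads[d]) >= n:
                d += 1
            # Avoid holes accessed in slot t and holes read in slot d.
            free = [h for h in range(num_holes)
                    if h not in busy_now and h not in reads[d]]
            if not free:
                return False
            hole = free[0]        # first fit; any free hole would do
            busy_now.add(hole)    # written in slot t
            reads[d].add(hole)    # will be read in slot d
    return True


print(simulate(3, 3 * 3 - 1))  # True: 3N - 1 = 8 holes always suffice
```

With `num_holes = 3 * n - 1` the list `free` can never be empty, because at most n + (n - 1) + (n - 1) = 3n - 2 holes are excluded for any arrival.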
A take away ..
If you want to build a work conserving router …
• Evaluate the uniquely assigned departure time for every packet.
• Define the constraints on every packet.
• Apply the pigeon-hole principle.
• Find the minimum resources required so that every packet departs at the right time.
E.g. 1: Parallel Shared Memory Router
A router with 3 ports and k = 7 memories running at rate R
• k = 7 memories at rate R is not sufficient ..
• .. but 8 memories suffice
[Figure: ports A, B and C, each at rate R, sharing the memories]
How many memories for FIFO? (Parallel Shared Memory)
A router with 3 ports and k = 8 memories running at rate R
• By the pigeon-hole principle, k ≥ 3N - 1 memories at rate R suffice.
[Figure: ports A, B and C, each at rate R, sharing the memories]
E.g. 2: DSM Router (Distributed Shared Memory)
• The central memories have been moved to distributed line cards.
[Figure: three line cards, each with a buffer, connected by a switch fabric]
Why is the DSM router interesting?
• The bandwidth in & out of each memory is at the line rate.
• The memory is shared across inputs and outputs.
• Memory and line cards can be added incrementally.
How many memories for FIFO? (Distributed Shared Memory)
A router with 3 ports and k = 9 memories running at rate R
• By the pigeon-hole principle, k ≥ 3N - 1 memories at rate R suffice.
[Figure: three line cards connected by a switch fabric]
Previous Work (which uses the counting/pigeon-hole principle)
• Combined Input Output Queued (CIOQ) Router
  • Prabhakar and McKeown, 1997
  • Chuang, Goel, McKeown & Prabhakar, 1998
  • Krishna, Patel, Charny, Simcoe, 1998
  • Charny, 1998
• Parallel Packet Switch (PPS)
  • Iyer, Awadallah & McKeown, 2000
  • Iyer & McKeown, 2001
What is the Problem with non-FIFO? …1
• First Problem:
  • The departure time of a cell is not fixed in policies such as strict priority, weighted fair queueing etc.
  • The counting technique depends on being able to predict the departure time and schedule it.
• We will use a model called PIFO, which captures a number of these scheduling policies, for analysis.
A Push in First Out (PIFO) Queue
• Arriving packets are “pushed-in” to an arbitrary location in the departure queue.
• Packets depart from the head of line.
• The relative ordering between packets in the queue does not change.
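The three PIFO properties can be written down as a very small data structure. This is a minimal sketch for illustration, not code from the talk: the class name `PIFOQueue` and its method names are hypothetical, and how a scheduler picks the push-in position (for example from a priority or a WFQ finish time) is left to the caller.

```python
class PIFOQueue:
    """Push-In First-Out queue: insert anywhere, depart from the head only."""

    def __init__(self):
        self._cells = []  # index 0 is the head of line

    def push_in(self, cell, position):
        """Insert `cell` at `position`; cells already queued keep their order."""
        if not 0 <= position <= len(self._cells):
            raise IndexError("position is outside the queue")
        self._cells.insert(position, cell)

    def pop_head(self):
        """Remove and return the cell at the head of line."""
        return self._cells.pop(0)

    def __len__(self):
        return len(self._cells)


q = PIFOQueue()
q.push_in("A1", 0)
q.push_in("C1", 1)   # queue is now A1, C1
q.push_in("B1", 1)   # pushed in ahead of C1: A1, B1, C1
print([q.pop_head() for _ in range(len(q))])  # ['A1', 'B1', 'C1']
```

A FIFO queue is the special case in which every cell is pushed in at `position = len(q)`.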
Enabling QoS with PIFO queues
[Figure: WFQ with 3 per-flow queues, destined to same output, weights 3:2:1. The queues hold A1 A2 A3, B1 B2 B3 and C1 C2.]
[Figure: a single PIFO queue for an output, weights 3:2:1, shown as C2 B3 C1 B2 B1 A3 A2 A1.]
How many memories for PIFO? (Parallel Shared Memory)
A router with 3 ports and k = 9 memories running at rate R
• k = 9 memories at rate R is not sufficient ..
• .. but 10 memories suffice
[Figure: ports A, B and C, each at rate R; labels 1, 2, 3, 4, 2.4 and 2.5]
How many memories are needed? (Distributed Shared Memory)
• A cell which arrives at time t, destined to output port j, must not be written to memories which
  • Are used to write the other N - 1 arriving cells at t.
  • Are used to read the N departing cells at t.
  • Will be used to read the N - 1 cells of output j that depart just before this cell.
  • Will be used to read the N - 1 cells of output j that depart just after this cell.
• There are four constraint sets.
• By the pigeon-hole principle, 4N - 2 memories at rate R, or a memory bandwidth of 4NR, is sufficient.
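The count on this slide can be reproduced by adding up the four constraint sets. The snippet below is only a worked check of that arithmetic; the function names are hypothetical and chosen for the example.

```python
def pifo_constraint_sizes(n):
    """Largest possible size of each of the four constraint sets."""
    return [
        n - 1,  # memories written by the other arriving cells at t
        n,      # memories read by the departing cells at t
        n - 1,  # memories read for the cells of output j just before this cell
        n - 1,  # memories read for the cells of output j just after this cell
    ]


def pifo_memories(n):
    """One memory more than the number that can be ruled out."""
    return sum(pifo_constraint_sizes(n)) + 1


for n in (3, 16, 32):
    k = pifo_memories(n)
    print(f"N={n}: {k} memories at rate R, bandwidth {k}R <= {4 * n}R")
```

For N = 3 this gives 10 memories, the figure quoted for the 3-port PIFO example, and (4N - 2)R is always below the 4NR bound stated on the slide.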
Comparison of FIFO & PIFO (Distributed Shared Memory)
              FIFO                       PIFO
              BW of mem.  BW of fabric   BW of mem.  BW of fabric   Arbiter
DSM Router    3NR         4NR            4NR         5NR            Hard
              3NR         6NR            4NR         8NR            Easy
              4NR         4NR            6NR         6NR            Easy
Comparison with the CIOQ Router
                   DSM    CIOQ
Total memory BW    4NR    6NR
Total fabric BW    5NR    2NR
• The DSM needs less memory bandwidth than the CIOQ, and memory bandwidth is the main bottleneck in routers.
• The DSM arbiter is simpler than the CIOQ arbiter.
Conclusion
• There was a class of routers which were not analyzed earlier
  • Distributed/Parallel shared memory
  • Existing techniques were not applicable
• We came up with a counting technique based on the pigeon-hole principle
  • Single buffered routers
  • Combined Input-Output Queued router