EE384x: Packet Switch Architectures I. a) Delay Guarantees with Parallel Shared Memory b) Summary of Deterministic Analysis. Nick McKeown Professor of Electrical Engineering and Computer Science, Stanford University nickm@stanford.edu http://www.stanford.edu/~nickm. Delay Guarantees.
Delay Guarantees
• Problem: How can we design a parallel output-queued router from slower parallel memories and provide delay guarantees?
• This is difficult because:
  • The counting technique depends on being able to predict the departure time and schedule it (before, we assumed that the output queue is FCFS).
  • In policies such as strict priority, weighted fair queueing, etc., we don’t know a cell’s departure time when it arrives.
Delay Guarantees
[Figure: one output with many logical FIFO queues (1 to m) fed by constrained traffic. Weighted fair queueing sorts packets by finishing time.]
[Figure: one output with a single Push In First Out (PIFO) queue. Constrained traffic is pushed in.]
PIFO models:
• Weighted Fair Queueing
• Weighted Round Robin
• Strict priority
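The push-in first-out discipline named on this slide can be illustrated with a short sketch. This is not code from the course: the class `PIFOQueue`, its method names, and the use of a numeric `rank` (for example a weighted fair queueing finishing time or a strict-priority class) are assumptions made for illustration.

```python
import bisect


class PIFOQueue:
    """Hypothetical push-in first-out queue.

    A cell may be pushed in at any position, chosen by its rank,
    but cells only ever leave from the head of the queue.
    """

    def __init__(self):
        self._ranks = []  # ranks, kept in non-decreasing order
        self._cells = []  # cells, in the same order as their ranks

    def push_in(self, rank, cell):
        # bisect_right places a new cell after existing cells of equal
        # rank, so ties are served in arrival order.
        i = bisect.bisect_right(self._ranks, rank)
        self._ranks.insert(i, rank)
        self._cells.insert(i, cell)

    def pop_head(self):
        # Departures are always from the head.
        self._ranks.pop(0)
        return self._cells.pop(0)

    def __len__(self):
        return len(self._cells)


def drain(queue):
    """Remove every cell and return them in departure order."""
    order = []
    while len(queue) > 0:
        order.append(queue.pop_head())
    return order


q = PIFOQueue()
for finishing_time in (1, 2, 3):
    q.push_in(finishing_time, f"c{finishing_time}")
# Later arrivals are pushed in between cells that are already queued.
q.push_in(2.5, "c2.5")
q.push_in(1.5, "c1.5")
departure_order = drain(q)
```

Under this sketch a FIFO queue is the special case where the rank is the arrival order, which is why a cell's departure time is known on arrival for FIFO but can still change for a PIFO when later cells are pushed in ahead of it.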
Theorem
A parallel output-queued router can give delay guarantees (within a bounded error) with 4N - 2 memories that can perform at most one memory operation per time slot.
Intuition for Theorem
• FIFO: window of memories of size N-1 that can’t be used.
• PIFO: 2 windows of memories of size N-1 that can’t be used (the N-1 packets before the cell and the N-1 packets after the cell at time of insertion).
[Figure: cells 1 to 9 in departure order, with departure times DT = 1, 2, 3. Cells 2.5 and 1.5 are pushed in between queued cells.]
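Proof
At time = t, a packet (cell C, with departure time DT = t+T) cannot use the memories:
• Used to write the N-1 arriving cells at t.
• Used to read the N departing cells at t.
• Used to read the N-1 cells that depart before it.
• Used to read the N-1 cells that depart after it.
[Figure: cells with DT = t depart at time t. Cell C has DT = t+T, with one window of cells labelled "Before C" and one labelled "After C".]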
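The four classes of memories listed in the proof can be tallied to see where the 4N - 2 in the theorem comes from. The following is a sketch of that count, assuming the worst case in which the four sets of memories do not overlap:

```latex
\[
\underbrace{(N-1)}_{\text{writes at } t}
+ \underbrace{N}_{\text{reads at } t}
+ \underbrace{(N-1)}_{\text{reads before } C}
+ \underbrace{(N-1)}_{\text{reads after } C}
= 4N - 3
\]
```

At most 4N - 3 memories are therefore unavailable to cell C, so with 4N - 2 memories the pigeon-hole principle leaves at least one memory it can be written to.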
With a PIFO per output
[Figure: cells a1 to a4, b1 to b4 and c1 to c4, with departure times DT = 1 to 4. Cell a2’ is pushed in. The cells are shown in departure order and as placed in memory.]
Relative order of (a3, b3) reversed after being placed in memory. Therefore, departure is not in PIFO order. By how much can the order differ?
Permute departure order
[Figure: cells a1 to ak, b1 to bk, c1 to ck, up to N1 to Nk, with departure times DT = 1 to k. Cell a2’ is pushed in.]
Cells are correctly resequenced by each output. Therefore, maximum delay is k-1 time slots.
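Summary - Routers with delay guarantees

Router | Fabric | # Mem. | Mem. BW | Total Memory BW | Switch BW | Switch Algorithm
-------|--------|--------|---------|-----------------|-----------|-----------------
Input Queued | Crossbar | N | 2R | 2NR | NR | -
 | | Nk | 2NR/k | 2NR | 2NR | -
Output-Queued | Bus | N | (N+1)R | N(N+1)R | NR | None
Shared Mem. | Bus | 1 | 2NR | 2NR | 2NR | None
CIOQ (Cisco) | Crossbar | 2N | 3R | 6NR | 2NR | Marriage
CIOQ (Cisco) | Crossbar | 2N | 3R | 6NR | 3NR | Time Reserve
PSM | Bus | k | 4NR/k | 4NR | 4NR | C. Sets
DSM (Juniper) | Xbar | N | 4R | 4NR | 5NR | Edge Color
DSM (Juniper) | Xbar | N | 4R | 4NR | 8NR | C. Sets
DSM (Juniper) | Xbar | N | 6R | 6NR | 6NR | C. Sets
PPS - OQ | Clos | Nk | 3R(N+1)/k | 3N(N+1)R | 6NR | C. Sets
PPS - Shared Memory | Clos | Nk | 6NR/k | 6NR | 6NR | C. Sets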
Summary of OQ Switches
• Output queued switches are ideal
  • Work-conserving.
  • Maximize throughput.
  • Minimize expected delay (for fixed length packets).
  • Permit delay guarantees for constrained traffic.
• Output queued switches don’t scale well
  • Require N memory writes per time slot.
  • Memory bandwidth (dictated by the random-access time of a memory) is a bottleneck.
  • Parallelism is not straightforward.
Summary of OQ Switches (2)
• Parallelizing packet switches has problems
  • Resource conflicts.
  • Packet mis-sequencing.
• Methods to analyze parallel OQ switches
  • Constraint Sets (based on pigeon-hole principle)
    • Parallel packet switches
    • Parallel shared memory
    • Distributed shared memory
    • Extension to PIFO
  • Parallel packet buffers
    • Hybrid SRAM-DRAM FIFO queues.
    • With and without lookahead buffer.
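The constraint-set method can be sketched as choosing any memory that lies outside every set of memories a cell conflicts with. The code below is an illustrative assumption rather than the algorithm as presented in the course: `pick_memory` and `worst_case_sets` are hypothetical names, and the set sizes follow the PIFO proof (N-1 writes and N reads in the current time slot, plus N-1 reads before and N-1 reads after the cell's departure).

```python
def pick_memory(num_memories, *constraint_sets):
    """Return the lowest-numbered memory outside every constraint set, or None."""
    busy = set().union(*constraint_sets)
    for memory in range(num_memories):
        if memory not in busy:
            return memory
    return None


def worst_case_sets(n_ports):
    """Build four disjoint constraint sets of sizes N-1, N, N-1 and N-1."""
    sizes = [n_ports - 1, n_ports, n_ports - 1, n_ports - 1]
    sets, start = [], 0
    for size in sizes:
        sets.append(set(range(start, start + size)))
        start += size
    return sets


N = 4  # hypothetical number of ports
constraints = worst_case_sets(N)
# With 4N - 2 memories, one memory is always left over (pigeon-hole principle).
chosen = pick_memory(4 * N - 2, *constraints)
# With only 4N - 3 memories, the worst case leaves nothing free.
blocked = pick_memory(4 * N - 3, *constraints)
```

The disjoint sets represent the worst case. When the sets overlap, fewer memories are excluded and the choice is easier.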