Packet-Mode Emulation of Output-Queued Switches David Hay, CS, Technion Joint work with Hagit Attiya (CS, Technion), Isaac Keslassy (EE, Technion)
Trend towards Packet-Mode
• Cell-mode scheduling is getting too hard
  • Fragmentation and reassembly must work very fast, at the external rate
  • Extra header for each cell → loss of bandwidth
  • For optical switches, such fragmentation and reassembly are prohibitive
• Cell-mode schedulers are packet-oblivious
  • Degradation of the overall performance
Packet-Mode Scheduling [Marsan et al., 2002][Ganjali et al., 2003][Turner, 2006]
• No need for fragmentation and reassembly
• Must ensure contiguous packet delivery over the fabric
  • While input i delivers a packet to output j, neither input i nor output j can handle other packets.
Can packet-mode schedulers provide performance guarantees similar to those of cell-mode schedulers?
Output Queuing Emulation
• OQ switches are considered optimal with respect to queuing delay and throughput
  • But too hard to implement in practice…
• Emulation: same input traffic ⇒ same output traffic
• How hard is it for a cell-mode / packet-mode CIOQ switch to emulate an OQ switch?
Cell-Mode Emulation is Possible
• Easy with speedup S=N
  • N scheduling decisions every time-slot:
    • In the 1st decision, forward the cell of input 1
    • In the 2nd decision, forward the cell of input 2
    • …
    • In the Nth decision, forward the cell of input N
• Possible with speedup S=2: CCF algorithm
• Lower bound: S ≥ 2-1/N is required [Chuang et al., 1999]
What is the speedup required for packet-mode emulation?
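The S=N case can be sketched in a few lines. This is only an illustration under simplifying assumptions: at most one cell arrives per input per time-slot, and the function name `forward_with_speedup_n` and the `arrivals` encoding are hypothetical, not taken from the talk.

```python
def forward_with_speedup_n(arrivals, n):
    """One time-slot of a CIOQ switch with speedup S = N (illustrative only).

    arrivals[i] is the output port of the cell that arrived at input i in
    this time-slot, or None if input i received nothing (assumed encoding).
    Returns the output queues, each listing source inputs in transfer order.
    """
    output_queues = [[] for _ in range(n)]
    # N scheduling decisions per time-slot: decision k serves input k only,
    # so no two transfers ever compete for the same input or output.
    for decision in range(n):
        dst = arrivals[decision]
        if dst is not None:
            output_queues[dst].append(decision)
    return output_queues


queues = forward_with_speedup_n([1, 1, None], n=3)
print(queues)
```

Every arriving cell reaches its output queue within the time-slot in which it arrived, which is why this trivial schedule leaves nothing waiting at the inputs.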
Packet-Mode Emulation is Impossible
• Regardless of speedup
• Even with speedup S=N
Emulation w/ Relative Queuing Delay
• The CIOQ switch is allowed a bounded lag behind the shadow OQ switch
• Exact same behavior as the optimal OQ switch, but with some extra delay
  • Called relative queuing delay (RQD)
Can we provide packet-mode OQ emulation with bounded RQD and small speedup?
Our Results: Speedup-RQD tradeoff
[Figure: speedup (vertical axis, ticks at 2, 4, and 2Lmax) versus RQD (horizontal axis); Lmax = maximum packet size. Annotations:]
• Generalization of cell-mode scheduling with S=2: taking each packet of size ≤ Lmax as one huge cell
• First algorithm: S ≈ 4 with RQD = O(N·Lmax)
• Lower bound on RQD (even with infinite speedup)
• Lower bound on the speedup (from cell-mode scheduling)
Intuition for Emulation Algorithms
[Figure: three switch models: Packet-Mode CIOQ, Cell-Mode CIOQ w/ S=2, Packet-Mode OQ]
Underlying CCF Algorithm
• Observation: a Packet-Mode OQ switch is a Cell-Mode OQ switch with a different queuing discipline (called PIFO)
• Cell-Mode CIOQ w/ CCF (and speedup S=2) emulates any PIFO cell-mode OQ switch [Chuang et al., 1999]
• But CCF does not maintain contiguous packet forwarding over the fabric!
[Figure: Packet-Mode CIOQ, Cell-Mode CIOQ w/ S=2, PIFO Cell-Mode OQ = Packet-Mode OQ]
Intuition for Emulation Algorithms
• Two sub-steps:
  • Framing
  • Contiguous Decomposition
[Figure: three switch models: Packet-Mode CIOQ, Cell-Mode CIOQ w/ S=2, Packet-Mode OQ]
Frame-Based Schedulers
• Work in a pipelined, frame-based manner
• Within each frame:
  • Build a demand matrix for this frame
  • Schedule the demand matrix of the previous frame
[Figure: frames along a time axis]
Building the Demand Matrix
• In each frame of size T, CCF forwards at most 2T cells from each input and to each output.
[Figure: demand matrix; entry (i, j) is the number of cells CCF sent from input i to output j in the last frame (the highlighted entry is input 1 to output 1); every row sum and every column sum is ≤ 2T]
Problem: A packet may span several frames.
Building the Demand Matrix
• Count only packets whose last cell is forwarded by CCF in the frame
• Each row/column sum in the matrix is bounded by 2T+N(Lmax-1)
  • For each input-output pair, only the cells of one additional packet can be added.
• Translates into an RQD of 2T+Lmax-2.
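The counting rule above can be sketched as follows. The record type `CellEvent`, its fields, and the function `build_demand_matrix` are hypothetical names chosen for this illustration; the sketch assumes that a trace of CCF's cell transfers is available and that each record carries the length of the packet the cell belongs to.

```python
from typing import NamedTuple


class CellEvent(NamedTuple):
    """One cell forwarded by the (shadow) CCF schedule; assumed trace format."""
    slot: int          # time-slot in which CCF forwards the cell
    src: int           # input port
    dst: int           # output port
    packet_len: int    # length, in cells, of the packet this cell belongs to
    is_last: bool      # True if this is the last cell of its packet


def build_demand_matrix(n, events, frame_start, frame_len):
    """Demand matrix of one frame, in cells.

    A packet is charged to the frame in which CCF forwards its LAST cell,
    and it is charged with all of its cells, including those CCF already
    forwarded in earlier frames.
    """
    demand = [[0] * n for _ in range(n)]
    frame_end = frame_start + frame_len
    for ev in events:
        if ev.is_last and frame_start <= ev.slot < frame_end:
            demand[ev.src][ev.dst] += ev.packet_len
    return demand


# A 1-cell packet (0 -> 0) and a 3-cell packet (0 -> 1) that straddles
# the boundary between frame [0, 4) and frame [4, 8).
events = [
    CellEvent(1, 0, 0, 1, True),
    CellEvent(2, 0, 1, 3, False),
    CellEvent(3, 0, 1, 3, False),
    CellEvent(4, 0, 1, 3, True),
]
frame0 = build_demand_matrix(2, events, frame_start=0, frame_len=4)
frame1 = build_demand_matrix(2, events, frame_start=4, frame_len=4)
print(frame0, frame1)
```

The straddling packet contributes nothing to the first frame and all three of its cells to the second, which is where the extra Lmax-1 cells per input-output pair in the row/column bound come from.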
Decomposing the Demand Matrix
• Challenge: decompose the matrix into permutations while maintaining contiguous packet delivery.
• Each permutation dictates a scheduling decision.
• Speedup = (number of permutations) / (frame length)
• First try: an optimal Birkhoff-von Neumann decomposition results in 2T+N(Lmax-1) permutations.
Contiguous Greedy Decomposition
• To maintain contiguous packet delivery:
  • If (i,j) was matched in iteration t-1 and there are more (i,j) cells to schedule, keep the match for iteration t.
  • Find a greedy matching for the rest of the matrix.
• Speedup: S = 4+(2N(Lmax-1)-1)/T
• RQD: 2T+Lmax-2
[Figure: the demand matrix at iterations t-1 and t; the highlighted entry shows the cells left from input 1 to output 1]
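A minimal sketch of the two rules above, assuming the demand matrix is a list of lists of cell counts. The function name and the fixed scan order used for the greedy matching are illustrative choices, not details given in the talk.

```python
def contiguous_greedy_decomposition(demand):
    """Decompose a demand matrix (cell counts) into a list of matchings.

    Each matching is a dict {input: output}. A pair matched in the previous
    iteration stays matched while it still has cells left; all other ports
    are matched greedily, scanning inputs and outputs in index order.
    """
    n = len(demand)
    left = [row[:] for row in demand]      # cells still to be scheduled
    previous = {}
    schedule = []
    while any(cells > 0 for row in left for cells in row):
        # Rule 1: keep every pair from iteration t-1 that has cells left.
        match = {i: j for i, j in previous.items() if left[i][j] > 0}
        used_outputs = set(match.values())
        # Rule 2: greedy matching on the remaining inputs and outputs.
        for i in range(n):
            if i in match:
                continue
            for j in range(n):
                if j not in used_outputs and left[i][j] > 0:
                    match[i] = j
                    used_outputs.add(j)
                    break
        for i, j in match.items():
            left[i][j] -= 1                # one cell per matched pair
        schedule.append(match)
        previous = match
    return schedule


schedule = contiguous_greedy_decomposition([[2, 1], [0, 2]])
for t, match in enumerate(schedule):
    print(t, match)
```

Because a pair is held until its whole entry is drained, and the entry counts only whole packets, the cells of each packet cross the fabric in consecutive iterations. The number of matchings produced, divided by the frame length T, is the speedup this decomposition needs.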
Our Results: Speedup-RQD tradeoff
[Figure: speedup (vertical axis, ticks at 2, 4, and 2Lmax) versus RQD (horizontal axis). Marked curve: S = 4+(2N(Lmax-1)-1)/T, RQD = 2T+Lmax-2. Next…]
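To see how the frame length T trades speedup against RQD, the two expressions on this slide can be evaluated directly. The helper name `greedy_tradeoff` and the sample values of N, Lmax, and T are arbitrary choices for illustration.

```python
from fractions import Fraction


def greedy_tradeoff(n, lmax, t):
    """Evaluate S = 4 + (2N(Lmax-1)-1)/T and RQD = 2T + Lmax - 2."""
    speedup = 4 + Fraction(2 * n * (lmax - 1) - 1, t)
    rqd = 2 * t + lmax - 2
    return speedup, rqd


# Sample values only: N = 4 ports, Lmax = 3 cells.
for t in (15, 150, 1500):
    speedup, rqd = greedy_tradeoff(n=4, lmax=3, t=t)
    print(f"T={t}: S={float(speedup):.3f}, RQD={rqd}")
```

Longer frames push the speedup toward 4 while the RQD grows linearly in T.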
Packet-Mode Emulation w/ S ≈ 2
• Separate demand matrix for every possible packet size
• Concatenate packets of the same size into mega-packets of size k=LCM(1,…,Lmax)
• Leftover matrix for each size m
• Two sub-steps:
  • Framing
  • Contiguous Decomposition
[Figure: Packet-Mode CIOQ, Cell-Mode CIOQ w/ S=2, Packet-Mode OQ]
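One way to read the mega-packet step, for a single input-output pair: since k is a multiple of every packet size m, exactly k/m packets of size m fill one mega-packet, and whatever does not fill a whole mega-packet goes to the leftover matrix of size m. The function names and the dictionary encoding below are assumptions made for this sketch.

```python
from functools import reduce
from math import gcd


def lcm_up_to(lmax):
    """k = LCM(1, ..., Lmax)."""
    return reduce(lambda a, b: a * b // gcd(a, b), range(1, lmax + 1), 1)


def split_into_mega_packets(counts_by_size, lmax):
    """Split one input-output pair's packets into mega-packets and leftovers.

    counts_by_size[m] = number of size-m packets for this pair (assumed
    encoding). Returns (k, number of mega-packets, leftover count per size).
    """
    k = lcm_up_to(lmax)
    mega_packets = 0
    leftover = {}
    for m, count in counts_by_size.items():
        per_mega = k // m                  # size-m packets per mega-packet
        mega_packets += count // per_mega
        leftover[m] = count % per_mega
    return k, mega_packets, leftover


k, mega, leftover = split_into_mega_packets({1: 30, 3: 5, 4: 3}, lmax=4)
print(k, mega, leftover)
```

All mega-packets have the same size k, so the mega-packet matrix can be handled like a matrix of uniform "cells"; only the leftover matrices still mix partial amounts per size.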
Packet-Mode Emulation w/ S ≈ 2
• Optimally decompose (w/ Birkhoff-von Neumann):
  • the mega-packets matrix
  • then the leftover matrices
• Two sub-steps:
  • Framing
  • Contiguous Decomposition
[Figure: Packet-Mode CIOQ, Cell-Mode CIOQ w/ S=2, Packet-Mode OQ]
Wrap-up
• Packet-mode scheduling can be done with the same speedup as cell-mode scheduling
  • At the price of a bounded RQD
• Future work: lower bounds?