Handout # 10: Packet Switching and Randomization; Load Balanced Router

CSC 2203 – Packet Switch and Network Architectures
Professor Yashar Ganjali
Department of Computer Science, University of Toronto
yganjali@cs.toronto.edu
http://www.cs.toronto.edu/~yganjali

Announcements
• Intermediate report
  • Due: Fri. Nov. 9th
  • Don’t leave it to the last minute!
  • Feel free to talk to me about your project
• Today
  • Packet switching and randomization
  • Guest lecture: OpenTCP
  • Student presentation
University of Toronto – Fall 2012

The Story
• Output-Queued Switches
  • 100% throughput, work conserving
  • Need speedup N → not practical
• Input-Queued Switches
  • Scheduling very difficult
  • Uniform traffic: cyclic, random, …
  • Non-uniform, but known: Birkhoff-von Neumann (BvN)
  • Unknown:
    • Maximum weight matching (MWM): not fast enough
    • Maximal matching
• Miscellaneous Architectures and Techniques

Today
• Packet Switching and Randomization
  • Introduction to randomization
  • Randomized load balancing
  • Randomized switch scheduling
• Load Balanced Router
  • Basic idea
  • Packet mis-sequencing

Motivation
• Networking problems suffer from the “curse of dimensionality”
  • Algorithmic solutions do not scale well.
• Typical causes
  • Size: large number of users
  • Time: very high speeds of operation
• A good deterministic algorithm exists, but …
  • it requires too large a data structure;
  • it needs state information, and “state” is too big; or
  • it “starts from scratch” in each iteration.

Some Specifics
• Randomized algorithms are a powerful way of approximating.
• It is often possible to randomize deterministic algorithms.
• This simplifies the implementation while retaining a (surprisingly) high level of performance.

Main Idea
• Randomization can simplify the decision-making process.
  • Considering the complete state can be extremely difficult.
  • Instead, we base decisions upon a small, randomly chosen sample of the state.

Example
• Problem: Find the largest element of a set S of size 1 billion.
  • For example, the age of the oldest person in China.
• Deterministic algorithm: Search all elements linearly.
  • Complexity: We need to consider 1 billion cases.
  • Performance: Linear search will find the absolute largest element.

Example – Cont’d
• Randomized solution: Find the largest of 10 randomly chosen samples.
  • Complexity: We need to consider only 10 samples.
  • Note: We ignore the cost of choosing the 10 random samples here.
• Performance:
  • If R is the element found by the randomized algorithm, we can make statements like P(R is at least the 100 millionth largest element) = 1 − (0.9)^10 ≈ 0.65, since each independent sample misses the top 10% with probability 0.9.
  • The probability rises quickly with the sample size: 30 samples give 1 − (0.9)^30 ≈ 0.96.
  • We can say that the performance of the randomized algorithm is very good with high probability.
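The sampling argument on this slide can be checked with a short sketch. It assumes samples are drawn independently and uniformly with replacement; the function names and the scaled-down set size are illustrative, not from the slides.

```python
import random

def prob_in_top_fraction(k, fraction):
    """P(at least one of k independent uniform samples is in the top `fraction`)."""
    return 1.0 - (1.0 - fraction) ** k

def sampled_max(values, k, rng):
    """Largest of k elements drawn uniformly at random, with replacement."""
    return max(rng.choice(values) for _ in range(k))

rng = random.Random(0)
values = list(range(1_000_000))      # scaled-down stand-in for the 1-billion set
threshold = 900_000                  # elements >= threshold form the top 10%
trials = 2_000
hits = sum(sampled_max(values, 10, rng) >= threshold for _ in range(trials))

print("analytic :", prob_in_top_fraction(10, 0.1))
print("empirical:", hits / trials)
```

The empirical frequency should be close to the analytic value; raising `k` shows how quickly the success probability approaches 1.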

Randomizing Iterative Schemes
• Often, we want to perform some operation iteratively.
• Example. Find the heaviest matching in a switch.
  • Repeated in each time slot.
• However, in each time slot
  • at most one packet can arrive at each input; and
  • at most one packet can depart from each output.
• The size of the queues, or the “state” of the switch, doesn’t change by much between successive time slots.
  • Thus, a matching that was heavy at time t will quite likely continue to be heavy at time t+1.
• Knowing a heavy matching at time t should help in determining a heavy matching at time t+1.
  • No need to start from scratch in each time slot.

Summarizing the Philosophy…
• Randomized algorithms can help simplify the implementation by reducing the amount of work in each iteration.
• If the state of the system does not change by much between iterations, we can reduce the work even further by carrying information between iterations.
• The big pay-off: Even though it is an approximation, the performance of a randomized scheme can be surprisingly good.

Load Balancing Problem – Static Case [Azar et al. 1994]
• Problem. Drop N balls into N bins such that the maximum load is minimized.
• Ideal policy. Drop each ball into the least loaded bin.
  • Needs too much “state” information.
• Random policy. Load a bin chosen uniformly at random.
  • Maximum load is about log N / log log N with high probability.
• Clever random policy. Load the least loaded of d ≥ 2 randomly selected bins.
  • Maximum load is (log log N) / log d + O(1) with high probability.
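The gap between one random choice and d ≥ 2 choices is easy to see in simulation. The following is a minimal sketch, assuming the d candidate bins are drawn independently with replacement and ties go to the first candidate; `max_load` is an illustrative name.

```python
import random

def max_load(n, d, rng):
    """Throw n balls into n bins; each ball goes to the least loaded of d random bins."""
    load = [0] * n
    for _ in range(n):
        candidates = [rng.randrange(n) for _ in range(d)]
        best = min(candidates, key=load.__getitem__)
        load[best] += 1
    return max(load)

rng = random.Random(1)
n = 100_000
for d in (1, 2, 4):
    print(f"d = {d}: maximum load = {max_load(n, d, rng)}")
```

Going from d = 1 to d = 2 typically gives the large drop in maximum load; further increases in d help only a little, which matches the 1/log d dependence.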

Load Balancing – Dynamic Case [Mitzenmacher 1996, Vvedenskaya 1996]
• The supermarket model. Jobs arrive at a bank of N rate-1 exponential server queues according to a Poisson process of rate Nλ (λ < 1).
• Problem. Assign jobs to servers to minimize delays.
• Ideal policy. Join the shortest queue.
• Random policy. Join a randomly chosen queue.
  • Gives N independent M/M/1 queues.
  • Queue length distribution: P(Q ≥ i) = λ^i
• Clever random policy. Join the shortest of d ≥ 2 randomly selected queues.
  • For N reasonably large, P(Q ≥ i) = λ^((d^i − 1)/(d − 1))
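To get a feel for how much faster the queue-length tail decays with d choices, the sketch below evaluates the two tail expressions for the supermarket model: λ^i for a single random choice and, in the large-N limit, λ^((d^i − 1)/(d − 1)) for d choices. The function names are illustrative.

```python
def tail_random(lam, i):
    """P(Q >= i) for an M/M/1 queue with load lam (random policy)."""
    return lam ** i

def tail_d_choices(lam, d, i):
    """Large-N limit of P(Q >= i) when joining the shortest of d >= 2 random queues."""
    exponent = (d ** i - 1) // (d - 1)   # equals 1 + d + ... + d^(i-1), an integer
    return lam ** exponent

lam = 0.9
for i in range(1, 7):
    print(i, tail_random(lam, i), tail_d_choices(lam, 2, i))
```

The first tail decays geometrically in i, while the second decays doubly exponentially, so long queues become extremely rare even at high load.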

A Simple (and Elegant) Analysis

Continuing…

Carrying Information Between Iterations
• The load of each bin does not change by much between iterations.
  • That is, a lightly loaded bin is likely to continue to be lightly loaded.
• It might help to remember the identity of the least loaded bin in the current iteration for use in the next iteration.

Load Balancing with Memory
• The (d, 1) system
  • d random choices
  • 1 bin stored in memory
• Example. Find the oldest of d people each year, and keep track of last year’s oldest person.
• Question
  • How well does the (d, 1) system perform?

An Illustration
• The bin stored in memory
  • is likely to be very lightly loaded
  • so we might expect better load balancing

Theorem [Shah and Prabhakar 2001]
• The maximum load achieved in the (d, 1) system is less than (log log N) / log(2d − 1) + O(1) with high probability.
• This is as if we had a (2d − 1, 0) system.
  • The bin in memory is at least as good as (d − 1) samples.
• Again, we see the minimum of minimums effect.
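A small simulation makes the effect of memory visible. This is only a sketch under stated assumptions: each ball sees d fresh random bins plus the remembered bin, goes to the least loaded of them, and the least loaded candidate after placement is remembered for the next ball. The exact update rule and the name `max_load` are assumptions, not taken from the slides.

```python
import random

def max_load(n, d, use_memory, rng):
    """Maximum load after n balls into n bins with d random choices, optionally plus 1 memory bin."""
    load = [0] * n
    memory = rng.randrange(n) if use_memory else None
    for _ in range(n):
        candidates = [rng.randrange(n) for _ in range(d)]
        if use_memory:
            candidates.append(memory)
        best = min(candidates, key=load.__getitem__)
        load[best] += 1
        if use_memory:
            # Remember the least loaded candidate (after placement) for the next ball.
            memory = min(candidates, key=load.__getitem__)
    return max(load)

rng = random.Random(7)
n = 50_000
print("(2, 1) system:", max_load(n, 2, True, rng))
print("(2, 0) system:", max_load(n, 2, False, rng))
print("(3, 0) system:", max_load(n, 3, False, rng))
```

Comparing the (2, 1) run against the memoryless (2, 0) and (3, 0) runs gives an empirical feel for how many extra samples the single remembered bin is worth.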

Switch Scheduling and Graph Matching
• As we have seen, switch scheduling is essentially finding matchings in weighted bipartite graphs.
[Figure: a bipartite graph between inputs and outputs with weighted edges.]

Scheduling Algorithm
• Ideal policy: Maximum weight matching
  • Weights: queue size, age of packets, etc.
  • Very complex for high-speed networks
• In practice, approximate maximum weight matchings are the best hope.
• We will discover good, randomized, approximate matchings in an evolutionary fashion.
• Story told pictorially using simulations.

Simulation Scenario
• Switch size: 32 × 32
• Input traffic (shown for a 4 × 4 switch)
  • Bernoulli IID inputs
  • Diagonal load matrix: [matrix figure omitted]
  • Normalized load = x + y < 1
  • x = 2y
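The load matrix itself did not survive in this transcript. The sketch below builds one common form of diagonal traffic, assuming that input i sends load x to output i and load y to output (i + 1) mod N, with x = 2y and x + y equal to the normalized load. That placement of x and y is an assumption; only the constraints x + y < 1 and x = 2y come from the slide.

```python
def diagonal_load_matrix(n, load):
    """Assumed diagonal pattern: x on the diagonal, y on the next output (cyclically)."""
    y = load / 3.0          # x = 2y and x + y = load  =>  y = load / 3
    x = 2.0 * y
    m = [[0.0] * n for _ in range(n)]
    for i in range(n):
        m[i][i] = x
        m[i][(i + 1) % n] = y
    return m

for row in diagonal_load_matrix(4, 0.9):
    print(row)
```

Every row and column sums to the normalized load, so the matrix is admissible whenever the load is below 1, yet it is far from uniform.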

Obvious Randomized Schemes
• Choose a matching at random and use it as the schedule
  • Doesn’t give 100% throughput
• Choose 2 matchings at random and use the heavier one as the schedule
• Choose N matchings at random and use the heaviest one as the schedule
• None of these can give 100% throughput!

Queue Length vs. Load

Bounds on Maximum Throughput

Iterative Randomized Scheme [Tassiulas 1998]
• Say M is the matching used at time t.
• Let R be a new matching chosen uniformly at random (u.a.r.).
• At time t+1, use the heavier of M and R.
• This gives 100% throughput!
  • Note that the boost in throughput is due to memory.
• But delays are very large.
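One step of this iterative scheme can be sketched as follows. The sketch assumes a matching is stored as a permutation (`matching[i]` is the output connected to input i) and that the weight of a matching is the sum of the corresponding queue lengths; `weight`, `next_schedule`, and the example queue matrix are illustrative.

```python
import random

def weight(matching, q):
    """Weight of a matching: sum of queue lengths q[i][matching[i]]."""
    return sum(q[i][j] for i, j in enumerate(matching))

def next_schedule(current, q, rng):
    """Keep the heavier of the current matching and a fresh uniformly random matching."""
    candidate = list(range(len(current)))
    rng.shuffle(candidate)               # uniformly random permutation
    return candidate if weight(candidate, q) > weight(current, q) else current

rng = random.Random(3)
q = [[4, 0, 1], [2, 5, 0], [0, 1, 3]]    # hypothetical queue lengths
schedule = [1, 2, 0]
for t in range(5):
    schedule = next_schedule(schedule, q, rng)
    print(t, schedule, weight(schedule, q))
```

Because the previous matching is always one of the two candidates, the weight of the schedule never decreases while the queues stay fixed; in a real switch the queue lengths change every slot, so the comparison is redone with fresh weights.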

Observations for Improvement
• Most of the weight of a matching is carried in a small number of edges.
• Hence, remember edges, not matchings.

Finer Observations
• Let M be the schedule used at time t.
• Choose a “good” random matching R.
• M’ = Merge(M, R)
  • M’ includes the best edges from M and R.
• Use M’ as the schedule at time t+1.
• This procedure yields the algorithm called LAURA.

Merging Procedure
[Figure: matchings M and R are merged into M’. Weight differences shown in the figure: 3−1+2−2 = 2 and 2−1+2−4 = −1.]
• W(M) = 12, W(R) = 10, W(M’) = 13
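One plausible reading of the merge step, sketched below, is the following: the union of two perfect matchings splits into alternating cycles, and within each cycle we keep the edges of whichever matching is heavier there. This is an illustrative sketch, not necessarily the exact procedure used by LAURA; matchings are assumed to be permutations and `q` is a hypothetical weight matrix.

```python
def merge(m, r, q):
    """Merge perfect matchings m and r (permutations), cycle by cycle, under weights q."""
    n = len(m)
    r_inv = [0] * n
    for i, j in enumerate(r):
        r_inv[j] = i                      # input that r connects to output j
    merged = [None] * n
    for start in range(n):
        if merged[start] is not None:
            continue
        # Walk one alternating cycle: input -> m-edge -> output -> r-edge -> input.
        cycle, i = [], start
        while True:
            cycle.append(i)
            i = r_inv[m[i]]
            if i == start:
                break
        w_m = sum(q[i][m[i]] for i in cycle)
        w_r = sum(q[i][r[i]] for i in cycle)
        best = m if w_m >= w_r else r     # keep the heavier side of this cycle
        for i in cycle:
            merged[i] = best[i]
    return merged

q = [[5, 1, 0, 0],
     [1, 5, 0, 0],
     [0, 0, 1, 4],
     [0, 0, 4, 1]]
M = [0, 1, 2, 3]                          # weight 12
R = [1, 0, 3, 2]                          # weight 10
print(merge(M, R, q))
```

In this example the merged matching takes M's edges on the first cycle and R's on the second, so its weight exceeds that of both inputs. Within a cycle both matchings use the same set of outputs, which is why the result is always a valid matching.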

References
• L. Tassiulas, “Linear complexity algorithms for maximum throughput in radio networks and input-queued switches,” Proc. INFOCOM, 1998.
• D. Shah, P. Giaccone and B. Prabhakar, “An efficient randomized algorithm for input-queued switch scheduling,” Proc. of Hot Interconnects, 2001.
• R. Motwani and P. Raghavan, Randomized Algorithms, Cambridge University Press, 1995.
• Y. Azar et al., “Balanced Allocations,” Proc. of ACM STOC, 1994.
• M. Mitzenmacher, “The power of two choices in randomized load balancing,” PhD Thesis, UC Berkeley, 1996.
• N. D. Vvedenskaya, R. Dobrushin and F. Karpelevich, “Queueing system with selection of the shortest of two queues: An asymptotic approach,” Problems of Information Transmission, 1996.
• B. Vöcking, “How Asymmetry Helps Load Balancing,” Proc. of 40th IEEE FOCS, 1999.
• S. Ethier and T. Kurtz, Markov Processes: Characterization and Convergence, John Wiley and Sons, 1986.

The Arbitration Problem
• A packet switch fabric is reconfigured for every packet transfer.
• For example, at 160 Gb/s, a new IP packet can arrive every 2 ns.
• The configuration is picked to maximize throughput and not waste capacity.
• Known algorithms are probably too slow.
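The 2 ns figure follows from simple arithmetic if we assume a minimum-size IP packet of 40 bytes; that packet size is an assumption used for illustration, and `packet_time_ns` is a hypothetical helper.

```python
def packet_time_ns(packet_bytes, line_rate_gbps):
    """Time to receive one packet, in nanoseconds (bits divided by Gb/s gives ns)."""
    return packet_bytes * 8 / line_rate_gbps

print(packet_time_ns(40, 160))    # 40-byte packet at 160 Gb/s
```

So a scheduler that picks a new configuration per packet has on the order of 2 ns per decision at this line rate.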

Approach
• We know that an IQ switch with uniform Bernoulli i.i.d. arrivals gives 100% throughput for most simple scheduling algorithms.
  • For instance, repeatedly cycle through a fixed sequence of N different permutations.
  • Each input-output pair (i, j) is connected exactly once in the sequence.
• Can we make non-uniform, bursty traffic uniform “enough” for the above to hold?
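A fixed sequence with this property is easy to construct. The sketch below uses cyclic shifts, assuming that in slot t input i is connected to output (i + t) mod N; this is one choice of sequence among many, and the function names are illustrative.

```python
def cyclic_schedule(n):
    """N permutations; in slot t, input i is connected to output (i + t) mod n."""
    return [[(i + t) % n for i in range(n)] for t in range(n)]

def connection_counts(schedule):
    """counts[i][j] = number of slots in which input i is connected to output j."""
    n = len(schedule)
    counts = [[0] * n for _ in range(n)]
    for perm in schedule:
        for i, j in enumerate(perm):
            counts[i][j] += 1
    return counts

for row in connection_counts(cyclic_schedule(4)):
    print(row)
```

Every entry of the count matrix is 1, so each input-output pair gets a 1/N share of the line rate with no scheduling computation at all.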

100% Throughput in a Mesh Fabric
[Figure: inputs and outputs at line rate R joined by a full mesh of links, each of rate R.]
• Switch capacity = N²R
• Router capacity = NR

If Traffic Is Uniform
[Figure: the same mesh with every link running at rate R/N.]

Real Traffic is Not Uniform
[Figure: the mesh with some links labeled R and others R/N.]

Load-Balanced Switch
[Figure: two meshes in series, a load-balancing stage followed by a forwarding stage; inputs and outputs run at rate R and every mesh link at rate R/N.]
• 100% throughput for weakly mixing traffic (Valiant, C.-S. Chang)

Load-Balanced Switch
[Figure: packets 1, 2, 3 arriving at one input of the two-stage mesh.]

Load-Balanced Switch
[Figure: packets 1, 2, 3 spread across different intermediate ports.]

Intuition: 100% Throughput
[Figure: the two-stage mesh with link rates R/N.]
• Arrivals to second mesh: [equation omitted]
• Capacity of second mesh: [equation omitted]
• Second mesh: arrival rate < service rate
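The intuition can be checked numerically. The sketch assumes the first mesh spreads every flow evenly, so each middle port receives 1/N of the traffic from input i to output j; the link from a middle port to output j then carries (1/N) Σ_i λ_ij, while its capacity is R/N. Rates are expressed as fractions of R, and the function name and example matrix are illustrative.

```python
def second_mesh_rates(lam):
    """Arrival rate on each (middle port k -> output j) link, as a fraction of R.

    lam[i][j] is the rate from input i to output j, as a fraction of line rate R.
    """
    n = len(lam)
    column_sums = [sum(lam[i][j] for i in range(n)) for j in range(n)]
    return [[column_sums[j] / n for j in range(n)] for _ in range(n)]

# Highly non-uniform but admissible traffic: each input talks to a single output.
lam = [[0.9, 0.0, 0.0],
       [0.0, 0.9, 0.0],
       [0.0, 0.0, 0.9]]
n = len(lam)
rates = second_mesh_rates(lam)
capacity = 1.0 / n                       # each second-mesh link runs at R/N
print(rates)
print(all(r < capacity for row in rates for r in row))
```

Whenever no output is oversubscribed (every column of `lam` sums to less than 1), each second-mesh link sees an arrival rate below R/N, regardless of how uneven the original traffic matrix is.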

Another Way of Thinking About It
[Figure: external inputs 1…N → load-balancing cyclic shift → internal inputs 1…N → switching cyclic shift → external outputs 1…N.]
• First stage load-balances incoming packets
• Second stage is a cyclic shift
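To see why a plain cyclic shift is enough to load-balance, the sketch below counts how many packets from one input land on each internal port. It assumes that in slot t the first stage connects external input i to internal input (i + t) mod N; the indexing convention and the name `spread` are assumptions for illustration.

```python
from collections import Counter

def spread(arrival_slots, input_port, n):
    """Packets per internal port when slot t connects input i to internal port (i + t) mod n."""
    return Counter((input_port + t) % n for t in arrival_slots)

# A burst of 8 back-to-back packets at input 1 of a 4-port switch,
# all possibly headed to the same output.
print(spread(range(8), 1, 4))
```

The burst is divided evenly across the internal ports without looking at packet destinations, which is what makes the traffic seen by the second stage close to uniform.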

Load-Balanced Switch
[Figure: the two cyclic-shift stages (external inputs, internal inputs, external outputs) with packets 1 and 2 in transit.]

Packet Reordering
[Figure: the two-stage mesh with packets 1 and 2 shown at an output.]

Bounding Delay Difference Between Middle Ports
[Figure: the two-stage mesh with packets 1 and 2 shown at an output.]
