Join-Idle-Queue: A Novel Load Balancing Algorithm for Dynamically Scalable Web Services

Yi Lu, Qiaomin Xie, Gabriel Kliot, Alan Geller, James R. Larus, Albert Greenberg • IFIP Performance 2011 Best Paper • Presented by Amir Nahir

Agenda • Queuing terminology and background: • M/M/N, Processor Sharing • Motivation • The algorithm • Some analysis • Results • Caveat: where has time gone?

M/M/N • Single shared queue • Jobs wait in the queue • Whenever a server completes one job, it gets the next from the queue

M/M/N • Pros: • Each job goes to the next server to become available • Cons: • Centralized (single point of failure, bottleneck) • Hidden overhead – the time it takes to access the queue to get the job

Service Disciplines • FIFO (a.k.a FCFS) • Processor sharing

FIFO vs. Processor Sharing • Analysis is very similar • But results are not quite the same • E.g., assume three jobs arrive at the system at time 0 • FIFO: avg. time in system = 2 • PS: avg. time in system = 3 • PS is currently seen as the “realistic” model
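The two averages above can be reproduced with a short calculation. This is only a sketch under assumptions the slide does not state: a single server, and three jobs that each need one unit of service. The function names are made up for illustration.

```python
def fifo_mean_time(sizes):
    """Mean time in system when jobs (all present at time 0) run one at a time, in order."""
    clock = 0.0
    total = 0.0
    for size in sizes:
        clock += size          # this job finishes once all earlier jobs and itself are served
        total += clock
    return total / len(sizes)


def ps_mean_time(sizes):
    """Mean time in system when all jobs (present at time 0) share one server equally."""
    ordered = sorted(sizes)    # under PS, shorter jobs always finish first
    n = len(ordered)
    clock = 0.0
    total = 0.0
    served = 0.0               # service each still-active job has received so far
    for k, size in enumerate(ordered):
        active = n - k         # jobs still sharing the server
        clock += (size - served) * active
        served = size
        total += clock
    return total / n


print(fifo_mean_time([1, 1, 1]))  # 2.0
print(ps_mean_time([1, 1, 1]))    # 3.0
```

With equal job lengths FIFO finishes jobs at times 1, 2 and 3, while under PS all three finish together at time 3, which is where the gap between the two averages comes from.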

Web-Services: The User’s Experience • No one really tries to model the whole process as a single problem • Common components (unchanged by the research) are often neglected • (Figure labels: Server, Scheduler)

The Main Motivation for the Paper • Reduce delays on the job’s critical path • (Figure labels: Server, Scheduler)

The Join-Idle-Queue Algorithm: System Structure • Two-layer system: dispatchers (front-ends) and processors (back-ends, servers) • The ratio between servers and dispatchers is denoted by r • No assumptions regarding processor discipline (can support PS, FIFO) • Each dispatcher has an I-queue • The I-queue holds servers (not jobs)

The Join-Idle-Queue Algorithm: Dispatcher Behavior • Upon receiving a job from user: • If there are servers in the I-queue, dequeue first server and send job to it • Otherwise – send job to random server • This deteriorates system performance • This is termed primary load balancing
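The dispatcher rule above can be sketched roughly as follows. The class and method names (`Dispatcher`, `register_idle`, `dispatch`) are hypothetical and not taken from the paper; servers are represented only by integer ids, and actually sending the job is left out.

```python
import random
from collections import deque


class Dispatcher:
    """Toy dispatcher: holds an I-queue of idle server ids (servers, not jobs)."""

    def __init__(self, server_ids, rng=None):
        self.server_ids = list(server_ids)   # every server this dispatcher may use
        self.i_queue = deque()               # ids of servers that reported themselves idle
        self.rng = rng or random.Random()

    def register_idle(self, server_id):
        """Called when an idle server joins this dispatcher's I-queue."""
        self.i_queue.append(server_id)

    def dispatch(self):
        """Return the id of the server that should receive the next job."""
        if self.i_queue:
            # Primary load balancing: take the first idle server from the I-queue.
            return self.i_queue.popleft()
        # Empty I-queue: fall back to a uniformly random server.
        return self.rng.choice(self.server_ids)


d = Dispatcher(server_ids=[0, 1, 2, 3], rng=random.Random(0))
d.register_idle(2)
first = d.dispatch()    # server 2, taken from the I-queue
second = d.dispatch()   # I-queue is now empty, so a random server is picked
```

Note that the dispatcher never queries server load on the job's path; it only looks at its own local I-queue.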

The Join-Idle-Queue Algorithm: Server Behavior • Upon completing all jobs: • Choose a dispatcher • Two techniques are considered: Random and SQ(d) • Register in its I-queue • This is termed secondary load balancing
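The two secondary load balancing choices can be sketched as below. This assumes SQ(d) means "sample d dispatchers and join the shortest of their I-queues"; sampling without replacement and breaking ties by sample order are choices made here for illustration, and the function names are hypothetical.

```python
import random


def choose_dispatcher_random(i_queue_lengths, rng):
    """Random: pick any dispatcher uniformly, ignoring I-queue lengths."""
    return rng.randrange(len(i_queue_lengths))


def choose_dispatcher_sqd(i_queue_lengths, d, rng):
    """SQ(d): sample d distinct dispatchers, return the one with the shortest I-queue."""
    candidates = rng.sample(range(len(i_queue_lengths)), d)
    return min(candidates, key=lambda j: i_queue_lengths[j])


rng = random.Random(1)
lengths = [3, 0, 2, 5]                               # current I-queue length per dispatcher
target = choose_dispatcher_sqd(lengths, d=2, rng=rng)
lengths[target] += 1                                 # the idle server registers there
```

Random needs no information from the dispatchers, while SQ(d) costs d length probes per idle report; either way the cost is paid when the server goes idle rather than when a job arrives.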

The Join-Idle-Queue Algorithm at Work (figure)

The Join-Idle-Queue Algorithm: Corner Case 1 • Server 2 is busy processing a job while being registered as “idle” in one of the I-queues

The Join-Idle-Queue Algorithm: Corner Case 2 • Server 2 is reported as “idle” in more than one dispatcher

JIQ Analysis: Some Notations • r – the ratio of servers to dispatchers • When is the algorithm expected to perform better, large r or small r?

JIQ Analysis: Some Notations • pi – the probability that a server holds exactly i jobs • p0 – the probability that a server is idle • λi – the arrival rate of jobs to a server which holds exactly i jobs • λ0 – the arrival rate of jobs to idle processors • ρ = λ/μ (common notation in queuing)

Load Balancing Assertions • No matter how you balance the load: • p0 = 1 – λ • (Figure labels: arrival rate λn, dispatchers, n servers)

Load Balancing: So Where Does the Wisdom Go? • It’s not about: increasing the probability that a server is idle • It’s about increasing the arrival rate to idle (and lightly loaded) servers • And from there,

Theorem 1: Proportion of Occupied I-Queues • There’s a strong connection between idle servers and occupied I-queues • Jobs arrive at the system at rate λn • The average number of idle servers is (1-λ)n • These idle servers are spread evenly among the m dispatchers, so the average number of idle servers per I-queue is (1-λ)n/m = (1-λ)r

Theorem 1: Proportion of Occupied I-Queues • On the other hand, the authors show that “server arrivals” to the I-queue behave like a Poisson process (when n→∞) • Servers arrive at I-queues at rate ρ • There are ρm occupied I-queues (on average), where ρ here denotes the proportion of occupied I-queues • And so the average I-queue length, under random secondary load balancing, is: ρ/(1-ρ) = r(1-λ)

Corollary 2: The Arrival Rate at Idle Servers (1) • Job arrival rate at the specific dispatcher is λn/m • A job has probability ρ to find an occupied I-queue • Average I-queue length is r(1-λ)

Corollary 2: The Arrival Rate at Idle Servers (2) • Job arrival rate at servers is λ • A job has probability (1-ρ) to find an empty I-queue • Overall arrival rate at idle servers: λ0 = (r+1) λ (1-ρ)

Corollary 2: The Arrival Rate at Non-Idle Servers • Job arrival rate at servers is λ • A job has probability (1-ρ) to find an empty I-queue • Arrival rate at busy servers is λ (1-ρ) • The arrival rate at idle servers is (r+1) times higher than the arrival rate at non-idle servers
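The (r+1) factor can be checked numerically. The sketch below assumes random secondary load balancing, a service rate normalized to 1 (so the load is λ < 1), and the relation ρ/(1-ρ) = r(1-λ) between the proportion ρ of occupied I-queues and the average I-queue length. The function names are hypothetical.

```python
def occupied_fraction(r, lam):
    """Solve rho / (1 - rho) = r * (1 - lam) for rho, the share of occupied I-queues."""
    x = r * (1.0 - lam)
    return x / (1.0 + x)


def arrival_rates(r, lam):
    """Return (rate at an idle server, rate at a busy server) per unit time."""
    rho = occupied_fraction(r, lam)
    # Jobs that find an empty I-queue are sent to a random server, idle or busy.
    random_part = lam * (1.0 - rho)
    # Jobs that find an occupied I-queue (a fraction rho of rate lam per server)
    # are shared among the idle servers, which are a fraction (1 - lam) of all servers.
    i_queue_part = lam * rho / (1.0 - lam)
    return i_queue_part + random_part, random_part


idle_rate, busy_rate = arrival_rates(r=10, lam=0.5)
print(idle_rate / busy_rate)  # about 11, i.e. r + 1
```

Substituting ρ/(1-λ) = r(1-ρ) into the I-queue term gives λr(1-ρ), so the idle-server rate is λ(1-ρ)(r+1), matching the ratio stated above.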

Proportion of Empty I-queues

Results (Exponential Job Length)

Job Length Distributions

Sensitivity to Variance (PS)

Effect of r on Performance

Caveat: Scheduling Still Takes Time… • While the decision for the secondary load balancing is being made, the server is not registered at any I-queue • During this time, performance is expected to degrade • (Figure labels: Scheduler, Server)
