Explore load reallocation in geographically distributed datacenters to enhance elasticity and reduce latency for high-performance services. Toy examples, formal models, distributed algorithms, and experimental results are discussed in a large-scale topology framework. Discover insights on load balancing and achieving optimal resource utilization in datacenter operations.
Geographically Distributed Datacenters with Load Reallocation
Indra Widjaja, Sem Borst, Iraj Saniee (Bell Labs)
DIMACS Workshop on Cloud Computing, December 8-9, 2011
Datacenter Alternatives
[Figure: two panels, "Geographically Centralized" and "Geographically Distributed", each drawn over potential DC sites 1-5; the legend marks servers and potential DC sites.]
Challenge
• Centralized datacenters cannot uniformly offer low-latency services to all end-users.
• Distributed datacenters may not achieve elasticity.
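Toy Example of Distributed DC with Reallocation
[Figure: two panels over sites 1-5, "Without reallocation" and "With reallocation"; site 1 is annotated with $\lambda_1$ and $m_1$, and the second panel additionally shows $q_{1,1}$ and $q_{1,3}$.]
• $\lambda_i$ = job arrival rate at site $i$, $m_i$ = processing capacity at site $i$
• $q_{i,j}$ = fraction of load reallocated from site $i$ to site $j$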
Formal Model of Load (Re)Allocation in Geographically Distributed Datacenter
Let $\lambda_i^k$ be the arrival rate of type-$k$ jobs at site $i$, $b_k$ the service time of a type-$k$ job per server, and $t_{i,j}$ the round-trip delay between sites $i$ and $j$. The optimization problem to solve is annotated as follows:
• Objective: weighted average delay
• Decision variables: fraction of load at $i$ sent to $j$
• Subject to constraints, where the formulation uses:
  • the normalized exogenous arrival rate at $i$
  • the total exogenous arrival rate at all sites
  • the total arrival rate at site $j$
  • the utilization at site $j$ with $K_j$ servers
  • the average processing delay with a multiple-server approximation
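The slide's equations are not available here, so the following is only a minimal sketch of how a weighted average delay could be evaluated for a single job type. Two things are assumptions and not taken from the slide: the objective form (each site weighted by its share of the total arrival rate, and each reallocated fraction paying the round-trip delay plus the processing delay at the destination), and the processing delay itself, for which a simple M/M/1 expression stands in for the slide's multiple-server approximation. The names `weighted_delay` and `mm1_delay` are hypothetical.

```python
from typing import Callable, List

def weighted_delay(lam: List[float], Q: List[List[float]], t: List[List[float]],
                   proc_delay: Callable[[int, float], float]) -> float:
    """Assumed objective: sum_i (lam_i / total) * sum_j q_ij * (t_ij + T_j)."""
    n = len(lam)
    total = sum(lam)  # total exogenous arrival rate at all sites
    # Total arrival rate at each site j after reallocation.
    load = [sum(lam[i] * Q[i][j] for i in range(n)) for j in range(n)]
    delay = 0.0
    for i in range(n):
        weight = lam[i] / total  # normalized exogenous arrival rate at i
        for j in range(n):
            if Q[i][j] > 0:
                delay += weight * Q[i][j] * (t[i][j] + proc_delay(j, load[j]))
    return delay

def mm1_delay(capacity: List[float]) -> Callable[[int, float], float]:
    """Stand-in processing delay (ASSUMPTION): M/M/1 mean delay 1 / (capacity - load)."""
    def delay(j: int, load: float) -> float:
        if load >= capacity[j]:
            return float("inf")  # site is overloaded
        return 1.0 / (capacity[j] - load)
    return delay

# Two sites, no reallocation (Q is the identity), illustrative numbers only.
lam = [1.0, 1.0]
t = [[0.0, 0.2], [0.2, 0.0]]
no_realloc = [[1.0, 0.0], [0.0, 1.0]]
print(weighted_delay(lam, no_realloc, t, mm1_delay([4.0, 4.0])))
```

Passing the processing-delay model as a function keeps the sketch usable with whatever multiple-server approximation is actually intended.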
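Toy Example of Distributed DC with Reallocation
$$
Q = \begin{pmatrix}
0.907 & 0 & 0.093 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0.093 & 0 & 0.907
\end{pmatrix}
$$
[Figure: five-site topology with arrival rates $\lambda_1 = 2$, $\lambda_2 = 1.5$, $\lambda_3 = 1$, $\lambda_4 = 1.5$, $\lambda_5 = 2$.]
Weighted Delay = 0.7842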
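As a quick check on the toy example, the sketch below applies the reallocation matrix Q to the arrival rates from the slide and reports the resulting arrival rate at each site. The helper name `site_loads` is hypothetical. The sketch does not attempt to reproduce the weighted delay of 0.7842, since that depends on delay parameters not stated in the text.

```python
# Arrival rates and reallocation matrix from the toy example (sites 1..5).
lam = [2.0, 1.5, 1.0, 1.5, 2.0]
Q = [
    [0.907, 0.0, 0.093, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.093, 0.0, 0.907],
]

def site_loads(lam, Q):
    """Arrival rate at each site j after reallocation: sum_i lam_i * q_ij."""
    n = len(lam)
    return [sum(lam[i] * Q[i][j] for i in range(n)) for j in range(n)]

# Each row of Q is the split of one site's load, so it must sum to 1.
row_sums = [sum(row) for row in Q]
loads = site_loads(lam, Q)
print([round(x, 3) for x in loads])  # [1.814, 1.5, 1.372, 1.5, 1.814]
```

Sites 1 and 5 each shed 9.3% of their load to site 3, which has the lowest arrival rate, while the total arrival rate of 8 is unchanged.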
Large-Scale Topology
32-node, 44-link network used in the experiment:
[Figure: network map with nodes SEA, SAI, BUF, MIL, ALB, DET, CLE, BOS, CHI, SPR, NYC, PIT, PHI, DEN, SAL, BAL, KAN, CIN, SFO, WAS, LAS, NAS, RAL, PHO, ATL, LOS, ELP, NOR, JAC, HOU, TAM, MIA; links carry numeric labels.]
• Each link is associated with delay $t_{i,j}$.
• The centralized datacenter is located in CHI.
Comparison of Delays
• Nearly-uniform job arrival rates: $\lambda_i = 1.1\lambda$ if $i$ is odd, $\lambda_i = 0.9\lambda$ if $i$ is even
• Non-uniform job arrival rates: $\lambda_i = 1.5\lambda$ if $i$ is odd, $\lambda_i = 0.5\lambda$ if $i$ is even
• $m_i = 1$ for all $i$
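The two arrival patterns can be generated as in the short sketch below. The function name `alternating_rates` is hypothetical, sites are assumed to be numbered from 1, and the base rate 0.5 is an illustrative value only; the site count of 32 matches the topology used in the experiment.

```python
def alternating_rates(n, base, odd_factor, even_factor):
    """Arrival rate per site: base * odd_factor for odd i, base * even_factor for even i."""
    return [base * (odd_factor if i % 2 == 1 else even_factor)
            for i in range(1, n + 1)]

# Illustrative base rate (ASSUMPTION); 32 sites as in the experiment topology.
nearly_uniform = alternating_rates(32, 0.5, 1.1, 0.9)
non_uniform = alternating_rates(32, 0.5, 1.5, 0.5)
```

With an even number of sites, both patterns keep the mean arrival rate equal to the base rate, so they differ only in how unevenly the load is spread.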
Comparison of Elasticities
In each trial:
• Moderate load variation: $\lambda_i \sim \mathrm{Uniform}(0.25, 1)$ for each $i$
• High load variation: $\lambda_i \sim \mathrm{Uniform}(0, 1.5)$ for each $i$
• Then rescale $\lambda_i$ such that the system-wide utilization is fixed (to 0.5).
• $m_i = 1$ for each $i$
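One trial of this sampling procedure might look like the sketch below. It assumes that system-wide utilization means the total arrival rate divided by the total capacity, with unit service time; that reading, and the name `sample_loads`, are assumptions and not stated on the slide.

```python
import random

def sample_loads(n, low, high, target_util, capacity=1.0, rng=None):
    """Draw per-site loads from Uniform(low, high), then rescale to a fixed utilization.

    ASSUMPTION: utilization = sum(loads) / (n * capacity).
    """
    rng = rng or random.Random()
    raw = [rng.uniform(low, high) for _ in range(n)]
    scale = target_util * n * capacity / sum(raw)
    return [x * scale for x in raw]

# One trial for each variation level, 32 sites, utilization fixed to 0.5.
moderate = sample_loads(32, 0.25, 1.0, 0.5, rng=random.Random(1))
high = sample_loads(32, 0.0, 1.5, 0.5, rng=random.Random(1))
```

Rescaling keeps the aggregate load identical across trials, so differences between trials come only from how the load is distributed over sites.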
Multiple Job Types
• Type-independent: jobs are reallocated from $i$ to $j$ with fraction $q_{i,j}$ regardless of their types.
• Type-dependent: type-$k$ jobs are reallocated from $i$ to $j$ with fraction $q^k_{i,j}$.
Example with 2 job types:
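The difference between the two schemes can be seen in the shape of the reallocation data: one matrix shared by all types versus one matrix per type. The sketch below is illustrative only; the function names and the numbers are hypothetical, and "offered work" is taken to be arrival rate times per-server service time, which is an assumption.

```python
def site_work(lam, b, Qk):
    """Offered work at each site j: sum over sites i and types k of lam[i][k] * b[k] * Qk[k][i][j].

    lam[i][k]: arrival rate of type-k jobs at site i
    b[k]:      service time of a type-k job per server
    Qk[k]:     reallocation matrix for type-k jobs
    """
    n, num_types = len(lam), len(b)
    return [sum(lam[i][k] * b[k] * Qk[k][i][j]
                for i in range(n) for k in range(num_types))
            for j in range(n)]

def type_independent(Q, num_types):
    """Type-independent reallocation: every job type uses the same matrix Q."""
    return [Q for _ in range(num_types)]

# Two sites, two job types (illustrative numbers).
lam = [[1.0, 2.0], [0.5, 0.5]]
b = [1.0, 0.5]
Q = [[0.5, 0.5], [0.0, 1.0]]
shared = site_work(lam, b, type_independent(Q, 2))
# Type-dependent: only type-2 jobs are moved away from site 1.
per_type = site_work(lam, b, [[[1.0, 0.0], [0.0, 1.0]], Q])
```

The type-dependent form has more degrees of freedom, since each job type can be routed according to its own service time.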
Distributed Algorithms for Load Reallocation
• Basic idea: each site $i$ computes the impact on the global objective function as it sends an additional small fraction of jobs to each site $j$, i.e., the derivative $a_{i,j}$.
• Min-rule: site $i$ determines the site $j_{\min}(i)$ such that $a_{i,j_{\min}(i)}$ is the minimum derivative. It then reallocates loads from other sites to site $j_{\min}(i)$.
• Max-rule: site $i$ determines the site $j_{\max}(i)$ such that $a_{i,j_{\max}(i)}$ is the maximum derivative. It then reallocates loads from site $j_{\max}(i)$ to other sites.
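The two selection rules can be sketched as follows, given the derivatives that site i has computed for its candidate sites. The function names are hypothetical. Restricting the max-rule to sites that currently receive a positive fraction of site i's load follows the max-rule flowchart in this deck.

```python
def pick_jmin(a_i, neighbors):
    """Min-rule target: the site with the smallest derivative among site i's neighbors."""
    return min(neighbors, key=lambda j: a_i[j])

def pick_jmax(a_i, q_i, neighbors):
    """Max-rule source: the site with the largest derivative among those with q_i[j] > 0."""
    candidates = [j for j in neighbors if q_i[j] > 0]
    return max(candidates, key=lambda j: a_i[j])

# Illustrative derivatives and current fractions for one site (values are made up).
a_i = [1.0, 3.0, 2.0]
q_i = [0.5, 0.0, 0.5]
neighbors = [0, 1, 2]
print(pick_jmin(a_i, neighbors), pick_jmax(a_i, q_i, neighbors))  # 0 2
```

In this example site 1 has the largest derivative but carries none of site i's load, so the max-rule picks site 2 instead.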
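Distributed Algorithm with “min-rule”
• At site $i$: compute $g_{i,j} = a_{i,j} - a_{i,j_{\min}(i)}$ for all $j \in N_i$, compute $g_i = \sum_{j \in N_i,\, j \neq j_{\min}(i)} g_{i,j}$, and $d = \min\{k,\ (1 - r_{j_{\min}(i)})\, K_{j_{\min}(i)} / (\lambda_i b g_i)\}$, where $j_{\min}(i) = \arg\min_{j \in N_i} a_{i,j}$.
• At site $i$: evaluate $h_{i,j} = \min\{q_{i,j},\ d\, g_{i,j}\}$ for all $j \neq j_{\min}(i)$, $j \in N_i$, and $h_{i,j_{\min}(i)} = -\sum_{j \neq j_{\min}(i),\, j \in N_i} h_{i,j}$.
• At site $i$: update $q_{i,j} = q_{i,j} - h_{i,j}$ for all $j \in N_i$, and $q_{i,j} = 0$ for $j \notin N_i$.
• Collect new measurement and go to next site (e.g., $i = i + 1 \bmod N$).
• Converged? If no, return to the first step. If yes, detect changes in delay and utilization.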
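One iteration of the min-rule at a single site can be sketched as below. This is a reading of the flowchart, not a reference implementation: the function name and argument layout are hypothetical, the derivatives `a_i` and utilizations `r` are assumed to be supplied by measurement, and the guard for a zero denominator is an addition for robustness.

```python
def min_rule_step(q_i, a_i, neighbors, lam_i, b, r, K, k):
    """One min-rule update of site i's reallocation fractions.

    q_i[j]: current fraction of site i's load sent to site j
    a_i[j]: derivative of the objective w.r.t. q_i[j]
    neighbors: the set N_i of sites that site i may send load to
    lam_i, b: arrival rate at site i and service time per job
    r[j], K[j]: utilization and number of servers at site j
    k: constant that caps the step d
    """
    jm = min(neighbors, key=lambda j: a_i[j])  # j_min(i)
    g = {j: a_i[j] - a_i[jm] for j in neighbors}
    g_sum = sum(g[j] for j in neighbors if j != jm)
    new_q = [0.0] * len(q_i)  # q stays 0 for sites outside N_i
    for j in neighbors:
        new_q[j] = q_i[j]
    if g_sum <= 0 or lam_i <= 0:
        return new_q  # ADDED GUARD: all derivatives equal, or no load to move
    # Step is capped by k and by the spare capacity at j_min.
    d = min(k, (1 - r[jm]) * K[jm] / (lam_i * b * g_sum))
    # Amount taken from each other site, never more than it currently holds.
    h = {j: min(q_i[j], d * g[j]) for j in neighbors if j != jm}
    h[jm] = -sum(h.values())  # j_min receives everything that was taken
    for j in neighbors:
        new_q[j] = q_i[j] - h[j]
    return new_q

# Illustrative values only.
print(min_rule_step([0.5, 0.5, 0.0], [1.0, 3.0, 2.0], [0, 1, 2],
                    lam_i=1.0, b=1.0, r=[0.5, 0.5, 0.5], K=[10, 10, 10], k=0.1))
```

Because the amounts in `h` sum to zero and each is at most the current fraction, the updated fractions stay non-negative and still sum to 1.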
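Distributed Algorithm with “max-rule”
• At site $i$: compute $g_{i,j} = \max\{a_{i,j_{\max}(i)} - a_{i,j},\ 0\}$ for all $j \in N_i$, and compute $n_{ij} = (1 - r_j)\, K_j / (\lambda_i b)$ for all $j \neq j_{\max}(i)$, $j \in N_i$, where $j_{\max}(i) = \arg\max_{j:\, q_{i,j} > 0} a_{i,j}$.
• At site $i$: compute $d = \min\{k,\ q_{i,j_{\max}(i)} / \sum_{j \in N_i} g_{i,j}\}$, evaluate $h_{i,j} = \min\{n_{ij},\ d\, g_{i,j}\}$ for all $j \neq j_{\max}(i)$, $j \in N_i$, and $h_{i,j_{\max}(i)} = -\sum_{j \neq j_{\max}(i),\, j \in N_i} h_{i,j}$.
• At site $i$: update $q_{i,j} = q_{i,j} + h_{i,j}$ for all $j \in N_i$, and $q_{i,j} = 0$ for $j \notin N_i$.
• Collect new measurement and go to next site (e.g., $i = i + 1 \bmod N$).
• Converged? If no, return to the first step. If yes, detect changes in delay and utilization.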
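One iteration of the max-rule at a single site can be sketched in the same way. As before, this is a reading of the flowchart under stated assumptions: the function name and argument layout are hypothetical, the derivatives and utilizations are assumed to come from measurement, and the early return for a zero denominator is an added guard.

```python
def max_rule_step(q_i, a_i, neighbors, lam_i, b, r, K, k):
    """One max-rule update of site i's reallocation fractions.

    Arguments mirror the flowchart: q_i and a_i are site i's fractions and
    derivatives, r[j] and K[j] are the utilization and server count at site j,
    and k caps the step d.
    """
    # j_max(i): largest derivative among sites that currently receive load.
    candidates = [j for j in neighbors if q_i[j] > 0]
    jm = max(candidates, key=lambda j: a_i[j])
    g = {j: max(a_i[jm] - a_i[j], 0.0) for j in neighbors}
    g_sum = sum(g.values())
    new_q = [0.0] * len(q_i)  # q stays 0 for sites outside N_i
    for j in neighbors:
        new_q[j] = q_i[j]
    if g_sum <= 0 or lam_i <= 0:
        return new_q  # ADDED GUARD: nothing to gain, or no load to move
    # Spare capacity at each receiving site, expressed as a fraction of site i's load.
    n = {j: (1 - r[j]) * K[j] / (lam_i * b) for j in neighbors if j != jm}
    # Step is capped so that j_max never gives away more than it holds.
    d = min(k, q_i[jm] / g_sum)
    h = {j: min(n[j], d * g[j]) for j in neighbors if j != jm}
    h[jm] = -sum(h.values())  # j_max loses everything the others gain
    for j in neighbors:
        new_q[j] = q_i[j] + h[j]
    return new_q

# Illustrative values only.
print(max_rule_step([0.5, 0.5, 0.0], [1.0, 3.0, 2.0], [0, 1, 2],
                    lam_i=1.0, b=1.0, r=[0.5, 0.5, 0.5], K=[10, 10, 10], k=0.1))
```

Unlike the min-rule, which pulls load toward the single best site, this step pushes load away from the single worst site and spreads it over all sites with a smaller derivative.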
Conclusions and Further Work
• Load reallocation provides a key instrument for achieving elasticity and reducing latency simultaneously.
• Only processing-intensive applications have been considered so far; other applications will be considered in further work.