
Geographically Distributed Datacenters with Load Reallocation

Explore load reallocation in geographically distributed datacenters to enhance elasticity and reduce latency for high-performance services. Toy examples, formal models, distributed algorithms, and experimental results are discussed in a large-scale topology framework. Discover insights on load balancing and achieving optimal resource utilization in datacenter operations.


Presentation Transcript


  1. Geographically Distributed Datacenters with Load Reallocation. Indra Widjaja, Sem Borst, Iraj Saniee, Bell Labs. DIMACS Workshop on Cloud Computing, December 8-9, 2011.

  2. Datacenter Alternatives. Geographically centralized versus geographically distributed. [Figure: two maps of five sites (1-5), one per alternative; legend: servers, potential DC site.]

  3. Challenge. Centralized datacenters cannot uniformly offer low-latency services to all end-users. Distributed datacenters may not achieve elasticity.

  4. Toy Example of Distributed DC with Reallocation. [Figure: five-site network (sites 1-5) shown without and with reallocation; with reallocation, the load λ_1 arriving at site 1 is split into fractions q_{1,1} and q_{1,3}.]
     • λ_i = job arrival rate at site i, m_i = processing capacity at site i
     • q_{i,j} = fraction of load reallocated from site i to site j

  5. Formal Model of Load (Re)Allocation in Geographically Distributed Datacenter. Let λ_i^k be the arrival rate of type-k jobs at site i, β_k the service time of a type-k job per server, and t_{i,j} the round-trip delay between sites i and j. The optimization problem to solve has the weighted average delay as its objective and q_{i,j}, the fraction of load at i sent to j, as its variables. It is stated in terms of:
     • the normalized exogenous arrival rate at i
     • the total exogenous arrival rate at all sites
     • the total arrival rate at site j
     • the utilization at site j with K_j servers
     • the average processing delay, using a multiple-server approximation
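To make the quantities listed on this slide concrete, here is a minimal Python sketch for a single job type. The slide's actual formulas did not survive in the transcript, so the following are assumptions: the objective is taken to be the arrival-weighted sum of round-trip delay plus processing delay, utilization is taken as (total arrival rate at j) × β / K_j, and `mm1_style_delay` is a simple stand-in, not the multiple-server approximation used by the authors. All function and parameter names are hypothetical.

```python
from typing import Callable, Sequence


def mm1_style_delay(rho: float, servers: int, beta: float) -> float:
    # ASSUMPTION: placeholder delay model, NOT the slide's multiple-server
    # approximation. `servers` is accepted so a better model can be swapped in.
    if rho >= 1.0:
        return float("inf")
    return beta / (1.0 - rho)


def weighted_delay(
    lam: Sequence[float],            # exogenous arrival rate at each site i
    q: Sequence[Sequence[float]],    # q[i][j]: fraction of load at i sent to j
    t: Sequence[Sequence[float]],    # t[i][j]: round-trip delay between i and j
    servers: Sequence[int],          # K_j: number of servers at site j
    beta: float,                     # service time of a job per server
    proc_delay: Callable[[float, int, float], float] = mm1_style_delay,
) -> float:
    n = len(lam)
    total = sum(lam)                                  # total exogenous arrival rate
    load = [sum(lam[i] * q[i][j] for i in range(n))   # total arrival rate at site j
            for j in range(n)]
    rho = [load[j] * beta / servers[j] for j in range(n)]       # utilization at j
    delay = [proc_delay(rho[j], servers[j], beta) for j in range(n)]
    # Weight each (i, j) pair by the normalized arrival rate at i times q[i][j].
    # Pairs with q[i][j] == 0 are skipped so an overloaded, unused site
    # does not contribute inf * 0.
    return sum(
        (lam[i] / total) * q[i][j] * (t[i][j] + delay[j])
        for i in range(n)
        for j in range(n)
        if q[i][j] > 0
    )
```

Passing a different `proc_delay` callable is the intended way to plug in a multiple-server formula once its exact form is known.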

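  6. Toy Example of Distributed DC with Reallocation. Arrival rates: λ_1 = 2, λ_2 = 1.5, λ_3 = 1, λ_4 = 1.5, λ_5 = 2. Reallocation matrix (row i, column j holds q_{i,j}):
     Q = [ 0.907  0      0.093  0      0     ]
         [ 0      1      0      0      0     ]
         [ 0      0      1      0      0     ]
         [ 0      0      0      1      0     ]
         [ 0      0      0.093  0      0.907 ]
     Weighted Delay = 0.7842. [Figure: five-site network annotated with the arrival rates and unit link labels.]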
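The effect of this matrix on per-site load can be checked with a short Python sketch. It uses only the arrival rates and Q from the slide; the helper name `site_loads` is hypothetical, and the computation assumes the load arriving at site j is the sum over i of λ_i q_{i,j}.

```python
# Arrival rates and reallocation matrix as given on the slide.
lam = [2.0, 1.5, 1.0, 1.5, 2.0]
Q = [
    [0.907, 0.0, 0.093, 0.0, 0.0],
    [0.0,   1.0, 0.0,   0.0, 0.0],
    [0.0,   0.0, 1.0,   0.0, 0.0],
    [0.0,   0.0, 0.0,   1.0, 0.0],
    [0.0,   0.0, 0.093, 0.0, 0.907],
]


def site_loads(lam, q):
    """Total arrival rate at each site j after reallocation."""
    n = len(lam)
    return [sum(lam[i] * q[i][j] for i in range(n)) for j in range(n)]


# Each row of Q is a split of one site's load, so it must sum to 1.
for row in Q:
    assert abs(sum(row) - 1.0) < 1e-9

loads = site_loads(lam, Q)
print([round(x, 3) for x in loads])
```

Sites 1 and 5 each send 9.3% of their load to site 3, so the loads become roughly 1.814, 1.5, 1.372, 1.5, and 1.814, while the total stays at 8.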

  7. Large-Scale Topology. 32-node, 44-link network used in the experiment. [Figure: network map with per-link delay labels; nodes: SEA, SAI, BUF, MIL, ALB, DET, CLE, BOS, CHI, SPR, NYC, PIT, PHI, DEN, SAL, BAL, KAN, CIN, SFO, WAS, LAS, NAS, RAL, PHO, ATL, LOS, ELP, NOR, JAC, HOU, TAM, MIA.]
     • Each link is associated with delay t_{i,j}.
     • The centralized datacenter is located in CHI.

  8. Comparison of Delays. m_i = 1 for all i.
     • Nearly-uniform job arrival rates: λ_i = 1.1λ if i is odd, 0.9λ if i is even.
     • Non-uniform job arrival rates: λ_i = 1.5λ if i is odd, 0.5λ if i is even.

  9. Comparison of Elasticities. Two cases: moderate load variation and high load variation. In each trial, for each i:
     • λ_i = Uniform(0.25, 1) for moderate load variation
     • λ_i = Uniform(0, 1.5) for high load variation
     Then rescale λ_i such that system-wide utilization is fixed (to 0.5). m_i = 1 for each i.
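The trial setup can be sketched in Python as follows. This is an illustration under stated assumptions: system-wide utilization is taken to be the sum of λ_i divided by the sum of m_i, and the function name, its parameters, and the use of the standard `random` module are all choices made here, not details from the slides.

```python
import random


def sample_rates(n, low, high, target_util=0.5, capacity=1.0, rng=None):
    """Draw n arrival rates from Uniform(low, high), then rescale them.

    ASSUMPTION: system-wide utilization = sum(rates) / (n * capacity),
    with capacity playing the role of m_i = 1 on the slide.
    """
    rng = rng or random.Random()
    raw = [rng.uniform(low, high) for _ in range(n)]
    total = sum(raw)
    if total == 0:
        raise ValueError("all sampled rates are zero; cannot rescale")
    scale = target_util * n * capacity / total
    return [r * scale for r in raw]


# One trial of each kind on a 32-site network.
moderate = sample_rates(32, 0.25, 1.0, rng=random.Random(0))
high = sample_rates(32, 0.0, 1.5, rng=random.Random(0))
```

Rescaling after sampling keeps the total load identical across trials, so differences between trials come only from how unevenly the load is spread over sites.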

  10. Multiple Job Types.
     • Type-independent: jobs are reallocated from i to j with fraction q_{i,j} regardless of their types.
     • Type-dependent: type-k jobs are reallocated from i to j with fraction q^k_{i,j}.
     Example with 2 job types:

  11. Distributed Algorithms for Load Reallocation. Basic idea: each site i computes the impact on the global objective function as it sends an additional small fraction of jobs to each site j, i.e., the derivative a_{i,j} of the objective with respect to q_{i,j}.
     • Min-rule: site i determines site jmin(i) such that a_{i,jmin(i)} is the minimum derivative. It then reallocates loads from other sites to site jmin(i).
     • Max-rule: site i determines site jmax(i) such that a_{i,jmax(i)} is the maximum derivative. It then reallocates loads from site jmax(i) to other sites.

  12. Distributed Algorithm with "min-rule". At site i:
     Step 1. Compute g_{i,j} = a_{i,j} - a_{i,jmin(i)} for all j ∈ N_i, compute g_i = ∑_{j ∈ N_i, j ≠ jmin(i)} g_{i,j}, and δ = min{κ, (1 - ρ_{jmin(i)}) K_{jmin(i)} / (λ_i β g_i)}, where jmin(i) = argmin_{j ∈ N_i} a_{i,j}.
     Step 2. Evaluate h_{i,j} = min{q_{i,j}, δ g_{i,j}} for all j ≠ jmin(i), j ∈ N_i, and h_{i,jmin(i)} = - ∑_{j ≠ jmin(i), j ∈ N_i} h_{i,j}.
     Step 3. Update q_{i,j} = q_{i,j} - h_{i,j} for all j ∈ N_i; q_{i,j} = 0 for j ∉ N_i.
     Step 4. Collect a new measurement and go to the next site (e.g., i = i+1 mod N).
     Converged? If no, repeat from step 1. If yes, detect changes in delay and utilization.
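One min-rule update at a single site can be sketched in Python as below. The step formulas follow the slide; the function name, the dictionary-based data layout, the assumption that N_i includes site i itself, and the early return when all derivatives are equal (to avoid dividing by zero) are additions made for this illustration.

```python
def min_rule_step(q_i, a_i, neighbors, lam_i, beta, rho, servers, kappa):
    """Return the updated row {j: q_ij} for site i after one min-rule step.

    q_i, a_i : dicts keyed by site j in N_i (fractions and derivatives)
    rho, servers : dicts of utilization rho_j and server count K_j
    kappa : upper bound on the step size
    """
    jmin = min(neighbors, key=lambda j: a_i[j])
    g = {j: a_i[j] - a_i[jmin] for j in neighbors}
    g_sum = sum(g[j] for j in neighbors if j != jmin)
    if g_sum == 0:
        # ASSUMPTION (not on the slide): nothing to move when every
        # derivative equals the minimum.
        return dict(q_i)

    # The step size is capped by kappa and by spare capacity at jmin.
    delta = min(kappa, (1 - rho[jmin]) * servers[jmin] / (lam_i * beta * g_sum))

    # Never take more from j than is currently sent there.
    h = {j: min(q_i[j], delta * g[j]) for j in neighbors if j != jmin}
    h[jmin] = -sum(h.values())

    # q_ij decreases for j != jmin and increases by the same total for jmin.
    return {j: q_i[j] - h[j] for j in neighbors}
```

For example, with q_i = {0: 0.5, 1: 0.3, 2: 0.2}, a_i = {0: 3.0, 1: 1.0, 2: 2.0}, κ = 0.1, and ample spare capacity, site 1 has the smallest derivative and gains 0.3 of the load, taken from sites 0 and 2 in proportion to their derivative gaps.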

  13. Distributed Algorithm with "max-rule". At site i:
     Step 1. Compute g_{i,j} = max{a_{i,jmax(i)} - a_{i,j}, 0} for all j ∈ N_i, and compute n_{ij} = (1 - ρ_j) K_j / (λ_i β) for all j ≠ jmax(i), j ∈ N_i, where jmax(i) = argmax_{j: q_{i,j} > 0} a_{i,j}.
     Step 2. Compute δ = min{κ, q_{i,jmax(i)} / ∑_{j ∈ N_i} g_{i,j}}. Evaluate h_{i,j} = min{n_{ij}, δ g_{i,j}} for all j ≠ jmax(i), j ∈ N_i, and h_{i,jmax(i)} = - ∑_{j ≠ jmax(i), j ∈ N_i} h_{i,j}.
     Step 3. Update q_{i,j} = q_{i,j} + h_{i,j} for all j ∈ N_i; q_{i,j} = 0 for j ∉ N_i.
     Step 4. Collect a new measurement and go to the next site (e.g., i = i+1 mod N).
     Converged? If no, repeat from step 1. If yes, detect changes in delay and utilization.
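The max-rule update mirrors the min-rule one and can be sketched the same way. The formulas follow the slide; the function name, the dictionary layout, the assumption that N_i includes site i itself, and the early return when no derivative gap exists are illustrative additions.

```python
def max_rule_step(q_i, a_i, neighbors, lam_i, beta, rho, servers, kappa):
    """Return the updated row {j: q_ij} for site i after one max-rule step.

    q_i, a_i : dicts keyed by site j in N_i (fractions and derivatives)
    rho, servers : dicts of utilization rho_j and server count K_j
    kappa : upper bound on the step size
    """
    # jmax is chosen only among sites that currently receive load from i.
    active = [j for j in neighbors if q_i[j] > 0]
    jmax = max(active, key=lambda j: a_i[j])

    g = {j: max(a_i[jmax] - a_i[j], 0.0) for j in neighbors}
    g_sum = sum(g.values())
    if g_sum == 0:
        # ASSUMPTION (not on the slide): no better destination exists.
        return dict(q_i)

    # n_ij caps the extra fraction by the spare capacity at site j.
    n = {j: (1 - rho[j]) * servers[j] / (lam_i * beta)
         for j in neighbors if j != jmax}

    # delta * g_sum <= q_i[jmax], so q_i[jmax] cannot go negative.
    delta = min(kappa, q_i[jmax] / g_sum)

    h = {j: min(n[j], delta * g[j]) for j in neighbors if j != jmax}
    h[jmax] = -sum(h.values())

    # q_ij increases for j != jmax and decreases by the same total for jmax.
    return {j: q_i[j] + h[j] for j in neighbors}
```

With the same inputs as in the min-rule illustration (q_i = {0: 0.5, 1: 0.3, 2: 0.2}, a_i = {0: 3.0, 1: 1.0, 2: 2.0}, κ = 0.1), site 0 has the largest derivative and sheds 0.3 of the load to sites 1 and 2.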

  14. Scenario 1: Load Increases by 50% at One Site

  15. Scenario 2: Load Increases by 100% at One Site

  16. Scenario 3: Load Increases by 200% at One Site

  17. Scenario 4: Two Back-to-Back Overloaded Sites

  18. Scenario 5: Noisy versus Perfect Measurements

  19. Conclusions and Further Work. Load reallocation provides a key instrument for achieving elasticity and reducing latency simultaneously. Only processing-intensive applications have been considered so far; other applications will be considered in further work.
