1 / 18

The Load Distance Balancing Problem

The Load Distance Balancing Problem. Eddie Bortnikov (Yahoo!) Samir Khuller (Maryland) Yishay Mansour (Google) Seffi Naor (Technion). The Load-Distance Balancing Problem. Given n clients and k servers s 1 , s 2 ,…s k we need to assign each client to a server.

johana
Download Presentation

The Load Distance Balancing Problem

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. The Load Distance Balancing Problem Eddie Bortnikov (Yahoo!) Samir Khuller (Maryland) Yishay Mansour (Google) Seffi Naor (Technion)

  2. The Load-Distance Balancing Problem • Given n clients and k servers s1, s2,…sk we need to assign each client to a server. • Cost for client i assigned to server sj is as follows: Cost(i) = Distance(i,sj) + Delay(j,Lj) Delay(j,Lj) is a FUNCTION of the number of clients Lj assigned to sj. OBJECTIVE: Min Max Cost(i)

  3. An Example s2 s1 s3 C E D A B • Each server has its own delay function (can be arbitrary, we just assume its non-decreasing). • Note that C is closer to s1, but prefers to attach to s2 since Dist(C,s2)+Delay(s2,2)<Dist(C,s1)+Delay(s1,3) • Objective is to Minimize Max Cost for any client

  4. Related Work • Lots of research on locating facilities. Here the facilities are all given; we just have to compute assignment of clients to facilities. • Notion of capacities has been used for various covering problems such as Vertex Cover, K Centers, Facility Location etc.

  5. Main Results • The problem is NP-hard. • We develop a polynomial time 2 approx. • We show that the bound 2 cannot be improved to 2-ε unless NP=P. • With triangle inequality in the distance function the hardness reduces to 5/3-ε. • When all clients and servers are on a line its solvable in polynomial time. • For Min Sum Cost(i), we can solve in polynomial time using Min-Weight Matching.

  6. NP-hardness by Exact Set Cover • Given N elements and a collection S of K sets (each set has size m). Does there exist a subset S’ of S, such that each element belongs to exactly one set in S’? In other words, we need to pick exactly N/m subsets from S, to cover each element once.

  7. Example of Exact Cover • Here we have N=16 elements and 9 sets (m=4). • The FOUR blue sets form an EXACT COVER, and we discard the FIVE orange sets.

  8. Reduction from Exact Cover • Each element is a client. In addition we create a collection of M(K-N/m) dummy clients. • Subset Sj in S corresponds to server sj. • Dist(dummy,server)=d1 • Dist(i,sj) = d2 if i ε Sj, o.w. ∞ • d2 >> d1 d1 Dummy clients sj d2 i Clients (elements)

  9. Reduction from Exact Cover • Delay functions for servers are basically a step function. • Delay(j,Lj) = Δ-d2, when load is at most m. • Delay(j,Lj)=Δ-d1, when load exceeds m, but is at most M. Δ-d1 Δ-d2 m M

  10. Reduction from Exact Cover • Suppose there is a solution to exact cover, then there is a solution to the LDB problem with delay at most Δ. • For each chosen subset Sj, the corresp. server sj gets m clients each at distance d2.Since the delay is Δ-d2, the total cost is at most Δ. • For the remaining subsets, those dummies are all assigned to the remaining servers (K-N/m), each gets M dummies. • The proof in the other direction requires some work!

  11. Proof (cont.) • Each server can support at most M dummy clients if the total cost does not exceed Δ, and no more than m real clients. • Suppose a server supports both real and dummy clients; then the total number of servers with real clients is k’ > N/m. • These serve at most (mk’-N) dummy clients, while the rest can serve only M(k-k’) dummy clients. • Adding the two shows (some algebra needed) that we can only assign < M(k-N/m) dummy clients if M>m.

  12. Hardness Results follow…. • We can set d1=ε and d2=Δ-ε. A solution to EXACT COVER exists if and only if a solution with cost Δ exists for LDB. • If there is no solution to EXACT COVER, then every solution to LDB has cost 2(Δ-ε). • However, here we do violate triangle inequality in the distance function.

  13. Hardness results with triangle inequality • We need to set d1=⅓ Δ, and d2= Δ-ε. • With these parameters, the distance between a (real) client i and a server sj such that i is not in Sj, is at least 5/3Δ- ε. • A solution to EXACT COVER exists if and only if a solution with cost Δ exists for LDB. • If there is no solution to EXACT COVER, then every solution to LDB has cost 5/3Δ-ε.

  14. Approximation Algorithm with factor 2 • Suppose a solution exists with maximum cost Δ. • For each server sj, we can compute an upper bound on the number of clients that can be served with a delay of at most Δ (say L*j). • For each client i we can compute the subset of servers that are within distance Δ (S*i). • Now its just a flow problem to check if an assignment exists where each client i is assigned to a server from S*i and each sj has load at most L*j. • Minimizing Δ gives a trivial 2 approximation.

  15. All servers and clients on a line • Use dynamic programming!

  16. Minimizing the Sum of Costs • We reduce this to min cost matching in a bipartite graph. • Let G=(X,Y,E) where nodes in X correspond to n clients and there are nk nodes in Y. We have n nodes corresp. to each server. • We ask for a min cost matching to find a solution.

  17. Capacitated K Centers • The related problem of choosing K facilities has been considered (each client should be assigned to a closeby facility and the load on the facility should not be too high): [Khuller & Sussman] (K,L,5R) or ((2/c)K, cL, 2R) related to K-centers clustering.

  18. Conclusions • Can we improve the 2 approximation when triangle inequality holds? • Can we improve the 2 approximation when Delay functions satisfy specific properties? What is a natural delay function? • Are there other special cases that can be solved in polynomial time?

More Related