Approximating Min-Max k-clustering
Asaf Levin, The Hebrew University, Israel
Problem motivation
• We need to partition a set E of clients among k (identical) servers.
• Each server pays the service cost of the set of clients allocated to it.
• We would like to find an allocation of clients to servers that minimizes the maximum payment of any server. This is the notion of fairness that we consider.
Research methodology
• Approximation algorithms:
  • Efficient algorithms that are not provably optimal.
  • A ρ-approximation algorithm is a polynomial-time algorithm that always produces a feasible solution of cost at most ρ times the cost of an optimal solution.
  • ρ is called the approximation ratio.
Generalized cost variant
• We assume that the servers' cost function $c : 2^E \to \mathbb{R}_+$ is given by an oracle.
• Such an oracle can correspond to solving an NP-hard problem or performing a simulation study of some engineering problem.
• Each oracle call takes constant time.
• We assume that c satisfies:
  • The cost of a singleton is zero.
  • c is monotone non-decreasing.
  • If S and S' intersect, then $c(S) + c(S') \ge c(S \cup S')$.
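The three assumptions on c can be checked by brute force on a small ground set. The sketch below is only an illustration: the function names and the example oracle `diameter` (the spread of a set of integers on a line) are hypothetical and do not come from the slides.

```python
from itertools import combinations

def all_subsets(ground):
    """Yield every non-empty subset of `ground` as a frozenset."""
    items = list(ground)
    for r in range(1, len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)

def satisfies_assumptions(cost, ground):
    """Brute-force check of the three assumptions (exponential; tiny inputs only)."""
    subsets = list(all_subsets(ground))
    # 1. The cost of a singleton is zero.
    if any(cost(s) != 0 for s in subsets if len(s) == 1):
        return False
    for s in subsets:
        for t in subsets:
            # 2. Monotone: s contained in t implies cost(s) <= cost(t).
            if s <= t and cost(s) > cost(t):
                return False
            # 3. Intersecting sets: cost(s) + cost(t) >= cost(s | t).
            if s & t and cost(s) + cost(t) < cost(s | t):
                return False
    return True

def diameter(s):
    """Hypothetical example oracle: spread of a set of integers."""
    return max(s) - min(s)

print(satisfies_assumptions(diameter, {0, 1, 3, 7}))
```

An oracle such as `lambda s: (len(s) - 1) ** 2` fails the third property, since two overlapping pairs cost 1 each while their union costs 4.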
An auxiliary problem: k-Clusters
• Input: a complete undirected graph G=(V,A) with metric edge costs l.
• Goal: partition the vertex set into k subsets $V_1, V_2, \ldots, V_k$ so as to minimize $\max_{i=1,\ldots,k} \max_{u,v \in V_i} l(u,v)$.
• k-Clusters is a special case of our generalized cost variant, with $c(S) = \max_{u,v \in S} l(u,v)$.
• k-Clusters has 2-approximation algorithms (Gonzalez; Hochbaum & Shmoys).
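One well-known way to obtain a 2-approximation for this objective is farthest-first traversal, in the spirit of Gonzalez's algorithm. The slides only cite the result, so the code below is a minimal sketch under that assumption; the function names and the example data are hypothetical.

```python
def farthest_first_clusters(points, dist, k):
    """Pick k heads by farthest-first traversal, then assign each point to its nearest head."""
    points = list(points)
    heads = [points[0]]
    # Distance from each point to the closest head chosen so far.
    closest = {p: dist(p, heads[0]) for p in points}
    while len(heads) < min(k, len(points)):
        nxt = max(points, key=lambda p: closest[p])  # farthest from all current heads
        heads.append(nxt)
        for p in points:
            closest[p] = min(closest[p], dist(p, nxt))
    clusters = [[] for _ in heads]
    for p in points:
        best = min(range(len(heads)), key=lambda i: dist(p, heads[i]))
        clusters[best].append(p)
    return clusters

def max_diameter(clusters, dist):
    """Objective value: the largest pairwise distance inside any cluster."""
    return max((dist(u, v) for c in clusters for u in c for v in c), default=0)

pts = [0, 1, 2, 10, 11, 12]
line_metric = lambda a, b: abs(a - b)
parts = farthest_first_clusters(pts, line_metric, 2)
print(parts, max_diameter(parts, line_metric))
```

The input `dist` is assumed to be a metric; the factor-2 guarantee relies on the triangle inequality.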
Our results on the generalized cost
• A 2-approximation algorithm if k=2.
• A ((k-1)α+1)-approximation for larger values of k, where α=2 is the approximation ratio for the k-Clusters problem; this gives a ratio of 2k-1.
• A lower bound of k for all values of k≥2.
Approximation algorithm
• Compute a cost function c': $c'(S) = \max_{i,j \in S} c(\{i,j\})$.
• Using an α-approximation algorithm for the k-Clusters problem (α=1 for k=2, and α=2 for larger values of k), find a partition of E into k subsets whose cost with respect to c' is (approximately) minimized.
• Theorem: This is a ((k-1)α+1)-approximation.
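For k=2 the slides state α=1 but do not say how the exact partition is found. One standard way, assumed here, is to try each pairwise cost as a threshold t and test whether the pairs costing more than t can be separated by a 2-colouring. The sketch below builds c' from a hypothetical oracle and applies that idea; all names are illustrative.

```python
from itertools import combinations

def pair_cost(cost):
    """Derived pairwise cost c'({i, j}) = c({i, j}) built from the oracle `cost`."""
    return lambda i, j: 0 if i == j else cost(frozenset((i, j)))

def two_color(elements, conflict):
    """Return a 2-colouring that separates every conflicting pair, or None."""
    color = {}
    for start in elements:
        if start in color:
            continue
        color[start] = 0
        stack = [start]
        while stack:
            u = stack.pop()
            for v in elements:
                if v != u and conflict(u, v):
                    if v not in color:
                        color[v] = 1 - color[u]
                        stack.append(v)
                    elif color[v] == color[u]:
                        return None  # odd cycle: threshold too small
    return color

def best_two_clusters(elements, cost):
    """Smallest threshold t (w.r.t. c') admitting a partition into at most 2 parts."""
    elements = list(elements)
    cp = pair_cost(cost)
    thresholds = sorted({0} | {cp(i, j) for i, j in combinations(elements, 2)})
    for t in thresholds:
        color = two_color(elements, lambda u, v: cp(u, v) > t)
        if color is not None:
            return t, [[e for e in elements if color[e] == b] for b in (0, 1)]

# Hypothetical oracle: spread of a set of integers.
oracle = lambda s: max(s) - min(s)
print(best_two_clusters([0, 1, 2, 10, 11, 12], oracle))
```

The largest threshold leaves no conflicting pairs, so the loop always returns. For k≥3 a 2-approximate k-Clusters routine would be run on c' instead.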
Proof (sketch)
• c' is a metric → there is a 2-approximation for k-Clusters.
• Define the following graph G':
  • The vertex set is the set of clusters of OPT.
  • Two clusters I, J of OPT are adjacent if there is a cluster C in APX that intersects both I and J.
• For every edge (I,J) in G', we can replace I and J by their union while increasing the cost of the cluster by at most c'(APX).
• We apply this procedure to all edges in each connected component of G' and obtain the result (APX is a refinement of the resulting partition).
Lower bound
• The ground set has $pk^2$ elements, and OPT is a partition into k equal-size sets.
• [Formula defining c(S) not recovered], where n(S) is the number of optimal clusters that intersect S.
Lower bound (cont.)
• c satisfies the properties and OPT costs 1.
• If there is a (k-1)-approximation, then it needs to find a set S with ≥ pk elements and n(S) ≤ k-1.
• Assume the partition of OPT is chosen uniformly at random.
• The probability that a given set S intersects at most k-1 clusters of OPT is $\le k(1-1/k)^p$.
• Unsuccessful trials do not help to find such a set.
• Hence the expected number of steps is $\ge 1/(k(1-1/k)^p)$. This is exponential in n → contradiction.
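To get a feel for how quickly this bound grows, the short sketch below evaluates it numerically. It assumes the probability bound reads k(1-1/k)^p, with p as an exponent, which is how the flattened slide text is most naturally interpreted; the function names are hypothetical.

```python
def hit_probability_bound(k, p):
    """Upper bound on the chance that one queried set meets at most k-1 optimal clusters."""
    return k * (1 - 1 / k) ** p

def expected_trials_lower_bound(k, p):
    """Reciprocal of the bound above: a lower bound on the expected number of trials."""
    return 1 / hit_probability_bound(k, p)

for p in (10, 20, 40):
    print(p, expected_trials_lower_bound(2, p))
```

For fixed k the value grows geometrically in p, which matches the claim that the expected number of steps is exponential.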
Special cases of the generalized cost variant
• Motivation:
  • Sometimes the structure of the cost function c is known.
  • For these cases there are better approximation algorithms.
• Here we present special cases resulting in vehicle routing problems (joint work with Esther Arkin and Refael Hassin).
Min-max rural postmen cover
• Input: a complete graph G=(V,E), a length function l on the edges, a subset E' of the edges, and an integer k.
• Output: k paths that cover all the edges of E'.
• Goal: minimize the maximum length of a path in the solution.
• Our result: a 7-approximation algorithm (independent of k).
Example (figure omitted; its legend distinguishes edges in E' from edges in E\E')
Algorithm
• Guess the optimal cost $\lambda^*$.
• Let $C_1, C_2, \ldots$ be the connected components of the graph obtained from G after deleting all edges of length $> \lambda^*$.
• For all i, let $P_i$ be the (approximate) solution to the rural postman problem on $C_i$ (one postman), and let $K_i = \lceil l(P_i) / (6\lambda^*) \rceil$.
• If $\sum_i K_i > k$, then report that $\lambda^*$ is too small.
• Otherwise, cut each path $P_i$ into $K_i$ paths (each of length at most $7\lambda^*$). Note that the last edge of each resulting path increases its length by at most $\lambda^*$.
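The cutting step can be sketched as a greedy pass over the edges of a path. The code assumes that every edge has length at most λ* and that a piece is closed as soon as it reaches length 6λ*; the factor 6 is inferred from the 7λ* bound and the proof, not stated explicitly in the slides. The name `cut_path` and the example data are hypothetical.

```python
def cut_path(edge_lengths, lam, factor=6):
    """Split a path (given as its edge lengths in order) into consecutive pieces.

    A piece is closed once its length reaches factor * lam. If every edge is
    at most lam, the closing edge overshoots by at most lam, so each piece
    has length at most (factor + 1) * lam.
    """
    pieces, current, total = [], [], 0
    for length in edge_lengths:
        current.append(length)
        total += length
        if total >= factor * lam:
            pieces.append(current)
            current, total = [], 0
    if current:  # leftover tail shorter than factor * lam
        pieces.append(current)
    return pieces

path = [1] * 20  # 20 unit edges, lam = 1
pieces = cut_path(path, lam=1)
print(len(pieces), [sum(p) for p in pieces])
```

Every closed piece has length at least 6λ*, so the number of pieces never exceeds the total length divided by 6λ*, rounded up.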
Proof (sketch)
• If $\lambda^*$ is not too small, then each path of OPT covers edges from at most one connected component.
• Consider $C_i$ and assume that OPT has $K^*_i < K_i$ paths that cover the edges of E' from $C_i$.
• Then, using another $K^*_i - 1$ edges, each of length $\le \lambda^*$, we get a connected subgraph of total length $\le (2K^*_i - 1)\lambda^*$.
Proof sketch (cont.)
• Double the edges of this graph to get an Eulerian graph of length $< 4K^*_i \lambda^*$.
• The approximation ratio of the rural postman problem is 3/2, and hence $l(P_i) < 6K^*_i \lambda^*$.
• Therefore $K_i \le K^*_i$, contradicting the assumption $K^*_i < K_i$.
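The chain of bounds in this proof can be written out step by step. This is a sketch that assumes $K_i = \lceil l(P_i)/(6\lambda^*) \rceil$, a definition inferred from the $7\lambda^*$ bound rather than stated explicitly in the slides, and that each of the $K^*_i$ optimal paths has length at most $\lambda^*$.

```latex
\begin{align*}
l(\text{connected subgraph}) &\le K^*_i\lambda^* + (K^*_i - 1)\lambda^* = (2K^*_i - 1)\lambda^*, \\
l(\text{Eulerian graph}) &\le 2(2K^*_i - 1)\lambda^* < 4K^*_i\lambda^*, \\
l(P_i) &\le \tfrac{3}{2}\cdot 2(2K^*_i - 1)\lambda^* < 6K^*_i\lambda^*, \\
K_i = \left\lceil \frac{l(P_i)}{6\lambda^*} \right\rceil &\le K^*_i .
\end{align*}
```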