Heuristics for the Hidden Clique Problem
Robert Krauthgamer (IBM Almaden)
Joint work with Uri Feige (Weizmann)
Max-Clique
• Given a graph on $n$ vertices, find a clique (complete subgraph) of maximum size.
• Equivalently, find a maximum stable set (induced empty graph) in the complement graph.
• Not approximable within ratio $n^{1-\epsilon}$ for every fixed $\epsilon > 0$, unless NP has randomized polynomial-time algorithms [Hastad’96].
Hidden/planted clique model
• Suggested by [Jerrum’92], [Kucera’95]:
  1. Generate a random graph $G_{n,1/2}$.
     • Every two vertices are connected by an edge with probability ½.
  2. Plant a clique of size $k$ ($k \gg 2\log n$).
     • Placed on $k$ randomly chosen vertices.
     • With high probability (WHP) it is the unique maximum clique.
• Goal: efficiently find the maximum clique, WHP over the inputs.
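The two generation steps above can be sketched in a few lines of Python. This is only an illustration of the model, not code from the talk; the function name `planted_clique_graph` and the adjacency-set representation are assumptions made here.

```python
import itertools
import random


def planted_clique_graph(n, k, seed=None):
    """Sample G(n, 1/2) and plant a clique on k random vertices.

    Returns (adj, clique), where adj maps each vertex to its neighbor set.
    """
    rng = random.Random(seed)
    adj = {v: set() for v in range(n)}
    # Step 1: join every pair independently with probability 1/2.
    for u, v in itertools.combinations(range(n), 2):
        if rng.random() < 0.5:
            adj[u].add(v)
            adj[v].add(u)
    # Step 2: pick k vertices at random and connect all pairs among them.
    clique = set(rng.sample(range(n), k))
    for u, v in itertools.combinations(sorted(clique), 2):
        adj[u].add(v)
        adj[v].add(u)
    return adj, clique
```

For example, `planted_clique_graph(400, 40, seed=1)` returns a 400-vertex instance together with the hidden vertex set, which is convenient for checking whether a heuristic recovered the right answer.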
Why is this problem interesting?
• A heuristic is an algorithm that need not always output an optimal solution (no worst-case guarantee).
• Ways to evaluate heuristics:
  • Experimental performance: benchmarks
  • Performance guarantees: families of inputs on which a heuristic performs well
• This talk: random and semi-random inputs.
  • Similar in spirit to smoothed analysis [Spielman-Teng]
  • Average-case hardness [Levin]
Known results ($G_{n,1/2}$ + $k$-clique)
• Motivation: with no planted clique, the problem is finding a maximum clique in a random graph $G_{n,1/2}$ [Karp’72]
Semi-random (sandwich) model
1. Generate two graphs $G_L \subseteq G_H$.
   • Both contain the same $k$-clique.
   • (Inclusion is with respect to edges.)
2. An adversary chooses an arbitrary graph $G^*$ sandwiched in between: $G_L \subseteq G^* \subseteq G_H$.
   • $G^*$ can have less structure (e.g., be highly irregular).
• In our case: $G_H = G_{n,1/2}$ + planted clique, and $G_L$ = empty graph + $k$-clique.
• Sandwich model suggested by [Feige-Kilian’98], motivated by the semi-random models of [Santha-Vazirani’95, Blum-Spencer’95].
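A minimal sketch of how an adversary produces $G^*$ in this setting: it may delete any edge of $G_H$ that is not a clique edge, and nothing else. The function name `sandwich_graph` and the `drop_edge` callback are hypothetical names introduced here for illustration.

```python
import itertools


def sandwich_graph(high_edges, clique, drop_edge):
    """Return the edge set of G* with G_L <= G* <= G_H.

    high_edges: set of frozenset({u, v}) edges of G_H.
    clique: vertex set of the planted clique (its edges form G_L).
    drop_edge: the adversary's rule; called on each non-clique edge and
               returns True if the adversary deletes that edge.
    """
    low_edges = {frozenset(p) for p in itertools.combinations(sorted(clique), 2)}
    assert low_edges <= high_edges, "G_H must contain the planted clique"
    kept = set(low_edges)  # clique edges can never be removed
    for edge in high_edges - low_edges:
        if not drop_edge(edge):
            kept.add(edge)
    return kept
```

An adversary that isolates a chosen vertex, or removes all edges between two vertex sets, fits this interface, which is how $G^*$ can end up highly irregular.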
Heuristic for sandwich model
Plan:
• Start with an algorithm for $G_H$.
• Then show that the same algorithm works for $G^*$.
  • Unlike the eigenvalue technique of [AKS].
• Additional trick (from [AKS]):
  • May assume $k \ge c\,n^{1/2}$ for a large constant $c$.
  • For a small fixed $c > 0$, “guess” $O(\log 1/c)$ vertices of the clique (by brute force) and work with the subgraph induced on their common neighbors.
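The arithmetic behind the guessing trick can be sketched as follows. This is a back-of-the-envelope calculation, assuming the $t$ guessed vertices really lie in the clique and each guess keeps about half of the non-clique vertices; the symbols $t$, $C$, $n_t$ and $k_t$ are introduced here and do not appear in the slides.

```latex
n_t \approx \frac{n}{2^{t}}, \qquad k_t = k - t, \qquad
\frac{k_t}{\sqrt{n_t}} \approx \frac{(k - t)\,2^{t/2}}{\sqrt{n}} \approx c\,2^{t/2}
\quad (t \ll k).
```

So raising the ratio from a small $c$ to a target constant $C$ takes about $t = 2\log_2(C/c) = O(\log 1/c)$ guesses, and trying all choices of $t$ vertices costs $n^{O(t)}$, which is polynomial for fixed $c$.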
The Lovasz theta function $\vartheta(G)$
• A relaxation for the stable set problem
  • And thus also for max-clique (abusing notation).
  • (Stable set = induced subgraph with no edges)
• Computable in polynomial time
  • Up to a small additive error
  • By semidefinite programming
• Worst-case integrality ratio is $n^{1-o(1)}$ [Feige’95]
• On a random graph, $\vartheta(G_{n,1/2}) \simeq n^{1/2}$ [Juhasz’82]
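For concreteness, one standard semidefinite formulation of the theta function (in its stable-set form) is shown below. The slides do not say which formulation the authors work with, so this is a reference point rather than their definition; $X$ ranges over symmetric $n \times n$ matrices.

```latex
\vartheta(G) \;=\; \max \Big\{ \sum_{i,j} X_{ij} \;:\;
  \operatorname{tr}(X) = 1,\;\;
  X_{ij} = 0 \ \text{for all } ij \in E,\;\;
  X \succeq 0 \Big\}
```

For a stable set $S$, the matrix $X = \frac{1}{|S|}\mathbf{1}_S \mathbf{1}_S^{\top}$ is feasible with objective value $|S|$, which is why the program is a relaxation.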
Using this relaxation for hidden clique
• Lemma. WHP $\vartheta(G_H) = k$.
• Proof idea:
  • As a relaxation, clearly $\vartheta(G_H) \ge k$.
  • Use a characterization of $\vartheta(G)$ as a minimization problem (SDP duality): $\vartheta(G) = \min_M \{\lambda_1(M) : \text{constraints on } M\}$, where $\lambda_1(M)$ is the largest eigenvalue of $M$.
  • Make a careful choice of the matrix $M$ to conclude that $\vartheta(G_H) \le k$.
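The slide leaves the constraints on $M$ unspecified. For orientation, Lovasz's classical eigenvalue characterization (stated for the stable-set form of the theta function) reads as below; whether the talk uses exactly this form is an assumption.

```latex
\vartheta(G) \;=\; \min_{M} \big\{ \lambda_1(M) \;:\;
  M = M^{\top},\;\;
  M_{ij} = 1 \ \text{whenever } i = j \ \text{or } ij \notin E \big\}
```

In the clique reading used in the talk, the same statement is applied to the complement graph, so the entries fixed to 1 are the diagonal and the edges. Restricted to a $k$-clique, any such $M$ contains a $k \times k$ all-ones block, which forces $\lambda_1(M) \ge k$; the upper bound requires exhibiting a specific $M$ with $\lambda_1(M) \le k$.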
Locating the clique
• Let $v$ be a vertex of $G_H$.
• $G_H - v$ is also a random graph + planted clique.
• Hence, WHP $\vartheta(G_H - v) < \vartheta(G_H)$ iff $v$ belongs to the planted clique (the value is $k-1$ if $v$ is in the clique and $k$ otherwise).
• WHP this holds for all vertices simultaneously.
The hidden clique heuristic
• The algorithm:
  • Compute $S = \{ v \in V : \vartheta(G_H - v) < \vartheta(G_H) \}$.
  • Check that $S$ forms a clique.
  • (Actually, one SDP computation suffices.)
• Main Theorem. For $k = c\,n^{1/2}$, WHP:
  • The algorithm finds the planted clique,
  • and gives a certificate that it is a maximum clique.
• A similar approach was previously used for min-bisection by [Boppana’87].
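The vertex-removal test can be illustrated with a toy sketch. To keep it dependency-free, the code below substitutes a brute-force clique number for the theta function; that swap is an assumption made purely for illustration (it is exponential time and is not the SDP computation of the talk). The names `clique_number` and `locate_clique` are hypothetical.

```python
import itertools


def clique_number(adj, vertices):
    """Brute-force clique number of the subgraph induced on `vertices`.

    Exponential time; a stand-in for theta on toy inputs only.
    """
    vs = sorted(vertices)
    for size in range(len(vs), 0, -1):
        for cand in itertools.combinations(vs, size):
            if all(v in adj[u] for u, v in itertools.combinations(cand, 2)):
                return size
    return 0


def locate_clique(adj, oracle=clique_number):
    """Return S = {v : oracle(G - v) < oracle(G)} if S is a clique, else None."""
    vertices = set(adj)
    full = oracle(adj, vertices)
    # A vertex is kept exactly when deleting it lowers the oracle value.
    s = {v for v in vertices if oracle(adj, vertices - {v}) < full}
    is_clique = all(v in adj[u] for u, v in itertools.combinations(sorted(s), 2))
    return s if is_clique else None
```

With an SDP-based `oracle` plugged in, the strict comparison would need a numerical tolerance, since the theta function is computed only up to a small additive error.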
Sandwich model
• The theta function is monotone with respect to addition/removal of edges.
• Recall that $G_L \subseteq G^* \subseteq G_H$. Hence, $k \le \vartheta(G_L) \le \vartheta(G^*) \le \vartheta(G_H) \le k$, where the last inequality holds WHP.
• Hence, WHP $\vartheta(G^*) = k$.
• Whenever the proof works for $G_H$ it also works for $G^*$, i.e., an adversary cannot affect the algorithm's performance.
The [Lovasz-Schrijver’91] relaxations
• A method for generating stronger relaxations
  • Works for any 0-1 integer programming relaxation $P$
• Two main flavors: polyhedral $N(P)$ and semidefinite $N_+(P)$
• A lift-and-project method:
  • Add $n^2$ variables $y_{ij}$ (relaxing the quadratic constraints $x_i^2 = x_i$)
  • Then project onto the $n$ original variables
• Can be applied iteratively
  • $N^n(P)$ is a tight relaxation (convex hull of integral solutions)
• A weak optimization oracle for $P$ implies weak optimization for $N(P)$ and $N_+(P)$.
  • The argument can be iterated any fixed number of times.
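The lift-and-project step can be written out as follows. This is the usual textbook statement of the Lovasz-Schrijver operators, given here as a reminder; the homogenized cone $K$, the index $0$ for the homogenizing coordinate, and the unit vectors $e_i$ are notation introduced for this sketch.

```latex
K = \{ \lambda\,(1, x) : x \in P,\ \lambda \ge 0 \} \subseteq \mathbb{R}^{n+1},
\\[4pt]
M(K) = \big\{ Y \in \mathbb{R}^{(n+1)\times(n+1)} :\;
  Y = Y^{\top},\;\; Y e_0 = \operatorname{diag}(Y),\;\;
  Y e_i \in K,\;\; Y (e_0 - e_i) \in K \ \ (1 \le i \le n) \big\},
\\[4pt]
M_+(K) = \{ Y \in M(K) : Y \succeq 0 \},
\qquad
N(K) = \{ Y e_0 : Y \in M(K) \},
\qquad
N_+(K) = \{ Y e_0 : Y \in M_+(K) \}.
```

The condition $Y e_0 = \operatorname{diag}(Y)$ is the relaxed form of $x_i^2 = x_i$, and taking $Y e_0$ is the projection back to the original variables.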
Application to stable set
• Start with the naive relaxation for $G = (V, E)$:
  FR(G): $x_i + x_j \le 1$ for all $ij \in E$, and $x_i \ge 0$ for all $i \in V$
• Lemma [LS’91]: $N_+(FR)$ is at least as strong as the theta function.
• $N_+$-rank = least $k$ such that $N_+^k(FR)$ is tight.
• Lemma [LS’91]: $N_+$-rank $\le \alpha(G)$, the size of a maximum stable set.
• What is the probable value of $N_+^k(FR)$?
• Does it lead to improved hidden clique heuristics?
The probable value of the [LS] relaxations
• Main Theorem. For a random graph $G_{n,1/2}$, WHP:
  • The value of $N_+^k(FR)$ is roughly $(n/2^k)^{1/2}$.
  • And thus the $N_+$-rank is $\Theta(\log n)$.
• The relaxations do not offer a heuristic for planted clique size $k = o(n^{1/2})$.
  • They do not seem to distinguish $G_H$ from $G_{n,1/2}$.
  • Not better than “guessing” $O(1)$ clique vertices.
The upper bound
• The [LS’91] proof that $N_+$-rank $\le \alpha(G)$ shows:
  • Each application of $N_+$ is at least as strong as “guessing” one vertex in the stable set.
• After $k$ iterations we are left with a random graph $G'$ on $n' \simeq n/2^k$ vertices.
• Hence, $N_+(FR)$ on $G'$ is at least as strong as the theta function, and its value is at most $O((n')^{1/2})$.
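The halving behind $n' \simeq n/2^k$ can be checked with a small simulation. This is an illustrative sketch, not part of the talk: it samples the edges of $G_{n,1/2}$ lazily, so each surviving vertex is a non-neighbor of the newly guessed vertex with probability 1/2. The function name `surviving_sizes` is hypothetical.

```python
import random


def surviving_sizes(n, rounds, seed=0):
    """Candidate-set sizes after each guessed stable-set vertex in G(n, 1/2).

    Edges from the guessed vertex are sampled only when needed, which is
    valid because those edges are independent of everything seen so far.
    """
    rng = random.Random(seed)
    alive = list(range(n))
    sizes = [len(alive)]
    for _ in range(rounds):
        if not alive:
            break
        alive.pop()  # the guessed vertex leaves the candidate set
        # Keep only non-neighbors of the guess (probability 1/2 each).
        alive = [v for v in alive if rng.random() < 0.5]
        sizes.append(len(alive))
    return sizes
```

Calling `surviving_sizes(4096, 4)` should give sizes close to 4096, 2048, 1024, 512, 256, up to random fluctuation.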
The lower bound
• Show iteratively that a “uniform” solution ($x_i = 1/(2^k n)^{1/2}$ for all $i$) is feasible for $N_+^k(FR)$.
• Follows by definition, given that:
  • All degrees in $G'$ are about their expectation $n'/2$
  • All eigenvalues of $G'$ are $> -n^{1/2}$ (as expected)
• The base case $N_+(FR)$ is similar to the theta function.
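As a quick check that the uniform solution gives the claimed value, its objective (the sum of all coordinates) works out as follows.

```latex
\sum_{i=1}^{n} x_i \;=\; \frac{n}{(2^{k} n)^{1/2}} \;=\; \Big(\frac{n}{2^{k}}\Big)^{1/2}
```

This matches the upper bound $O((n')^{1/2})$ with $n' \simeq n/2^k$ up to a constant factor. The value drops to $O(1)$ only once $2^k$ is on the order of $n$, that is, after about $\log_2 n$ rounds, which is consistent with a rank of order $\log n$.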
Related Work
• [Stephen-Tuncel’99]: $N_+$-rank $= \alpha(G)$ for the line graph of the complete graph $K_t$
  • (a stable set of the line graph is a matching in $K_t$)
• [Cook-Dash’01], [Goemans-Tuncel’01]: There exists a polytope $P$ whose $N_+$-rank is $n$.
  • (not a stable set relaxation)
• Question: When does $N_+^k$ perform better than “guessing” $k$ variables?
  • The first iteration is exceptional (it introduces semidefiniteness)
Open problems
• Better heuristics?
  • Or evidence that it is impossible?
• Use in approximation algorithms?
  • For vertex cover, $N^k(FR)$ has integrality ratio $2 - o(1)$ [Arora-Bollobas-Lovasz’03]
  • What about other problems?