Graph Modeled Data Clustering: Fixed Parameter Algorithms for Clique Generation
J. Gramm, J. Guo, F. Hüffner and R. Niedermeier
Theory of Computing Systems (2005)
Student: Vishal Kapoor
Presentation Outline
• Problem Introduction
• Past Research
• Results of the paper
• CLUSTER EDITING
  • Kernelization
  • Search Tree
• CLUSTER DELETION
• Questions
Problem Statement
• Make at most k changes to the edge set of an input graph to obtain vertex-disjoint cliques.
• Each connected component of the resulting cluster graph is a clique.
• CLUSTER EDITING: both edge additions and deletions are allowed.
• CLUSTER DELETION: only edge deletions are allowed.
• Used in data clustering: vertices are adjacent iff their similarity exceeds a threshold.
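The target structure can be checked directly. The following is a minimal sketch, assuming the graph is given as a vertex list and a list of undirected edge pairs without self-loops; the function name `is_cluster_graph` is hypothetical and not taken from the paper.

```python
def is_cluster_graph(vertices, edges):
    """Return True if every connected component of the graph is a clique."""
    adj = {v: set() for v in vertices}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    seen = set()
    for start in vertices:
        if start in seen:
            continue
        # Collect the connected component of `start` with a depth-first search.
        component, stack = {start}, [start]
        while stack:
            x = stack.pop()
            for y in adj[x]:
                if y not in component:
                    component.add(y)
                    stack.append(y)
        seen |= component
        # In a clique, each vertex is adjacent to all other component members.
        if any(adj[x] != component - {x} for x in component):
            return False
    return True
```

A triangle plus an isolated vertex passes the check, while a path on three vertices does not.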
Past Research
• [2000] The study of both problems was started by Shamir et al., who proved that they are NP-complete and APX-hard.
• [1996] Cai studied the problem of edge additions, edge deletions and vertex deletions for certain graphs and showed that it is FPT.
• [2001] Natanzon et al. gave a general constant-factor approximation for deletion and editing problems on bounded-degree graphs, for graph properties of a certain kind.
• [2002] Khot and Raman investigated the complexity of vertex deletion problems for finding subgraphs with hereditary properties.
Results of this paper
• CLUSTER EDITING: $O(2.27^k + |V|^3)$
• CLUSTER DELETION: $O(1.77^k + |V|^3)$
• Using certain reduction rules, the resulting problem kernel has size $O(k^3)$.
• The kernel has at most $2k^2 + k$ vertices and $2k^3 + k^2$ edges.
CLUSTER EDITING
[Figure: two vertices u and v with a common neighbor and a non-common neighbor]
Reduction Rules
• Rule 1: for every pair of vertices u and v:
  • If u and v have more than k common neighbors, then {u,v} is set to ADDED and added to E if not already there
  • If u and v have more than k non-common neighbors, then {u,v} is set to DELETED and deleted from E if present
  • If u and v have both more than k common neighbors and more than k non-common neighbors, then the instance has no solution
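Rule 1 can be sketched as a single pass over all vertex pairs. This is a simplified illustration under stated assumptions: neighbor counts are taken on the input graph only, the budget k is not updated when an edge is actually changed, and the function name `apply_rule1` and the label strings are hypothetical. A non-common neighbor is taken to be a third vertex adjacent to exactly one of u and v.

```python
from itertools import combinations

def apply_rule1(vertices, edges, k):
    """Label vertex pairs per Rule 1; return (edges, labels), or None if unsolvable."""
    edge_set = {frozenset(e) for e in edges}
    adj = {v: set() for v in vertices}
    for e in edge_set:
        a, b = tuple(e)
        adj[a].add(b)
        adj[b].add(a)
    labels = {}
    new_edges = set(edge_set)
    for u, v in combinations(vertices, 2):
        common = adj[u] & adj[v]
        # Vertices adjacent to exactly one of u, v (excluding u and v themselves).
        non_common = (adj[u] ^ adj[v]) - {u, v}
        many_common = len(common) > k
        many_non_common = len(non_common) > k
        if many_common and many_non_common:
            return None  # the pair can be neither joined nor separated within k edits
        pair = frozenset((u, v))
        if many_common:
            labels[pair] = "ADDED"
            new_edges.add(pair)
        elif many_non_common:
            labels[pair] = "DELETED"
            new_edges.discard(pair)
    return new_edges, labels
```

For a 5-clique with one missing edge and k = 2, the missing pair has three common neighbors, so the sketch labels it ADDED and inserts the edge.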
Reduction Rules
• Rule 2: for every three vertices u, v and w:
  • If {u,v} = ADDED and {u,w} = ADDED, then {v,w} should be set to ADDED and added to E if not already there
  • If {u,v} = ADDED and {u,w} = DELETED, then {v,w} should be set to DELETED and deleted from E if present
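Rule 2 amounts to propagating labels over triples until nothing changes. The sketch below assumes labels are stored in a dictionary keyed by `frozenset` pairs; the function name `apply_rule2` is hypothetical, and returning `None` when two implied labels contradict each other is an assumption of this sketch, not something stated on the slide.

```python
from itertools import permutations

def apply_rule2(vertices, edges, labels):
    """Propagate ADDED/DELETED labels over vertex triples until a fixed point.

    Returns (edges, labels), or None if two labels contradict each other.
    """
    edges = {frozenset(e) for e in edges}
    labels = dict(labels)
    changed = True
    while changed:
        changed = False
        for u, v, w in permutations(vertices, 3):
            uv, uw, vw = frozenset((u, v)), frozenset((u, w)), frozenset((v, w))
            if labels.get(uv) != "ADDED":
                continue
            implied = labels.get(uw)
            if implied is None:
                continue
            # ADDED + ADDED implies ADDED; ADDED + DELETED implies DELETED.
            if labels.get(vw) is None:
                labels[vw] = implied
                if implied == "ADDED":
                    edges.add(vw)
                else:
                    edges.discard(vw)
                changed = True
            elif labels[vw] != implied:
                return None  # assumption: treat contradictory labels as "no solution"
    return edges, labels
```

The triple loop is cubic per pass, which is in line with the cubic cost of the reduction step, although this naive version may need several passes.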
Running Time
• What is checked?
  • Every pair of vertices
  • For each pair, every vertex that is a neighbor of either of them (to count common and non-common neighbors)
• Takes time $O(|V|^3)$
Kernel Size
• The kernel contains at most $(2k+1) \cdot k$ vertices and at most $\binom{2k+1}{2} \cdot k$ edges.
• Proof skipped
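Expanding the two bounds makes the quadratic vertex count and the cubic edge count explicit:

```latex
\[
(2k+1)\cdot k = 2k^2 + k,
\qquad
\binom{2k+1}{2}\cdot k = \frac{(2k+1)\cdot 2k}{2}\cdot k = (2k+1)\,k^2 = 2k^3 + k^2 .
\]
```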
Branch and Search Algorithm
• Identify a bad triple (three vertices) in the kernel and repair it by adding or deleting an edge among its vertices, so that the graph is transformed into disjoint cliques
• Overall, at most k edge additions/deletions are allowed
• Two branching strategies:
  • Basic: $O(3^k)$
  • Advanced: $O(2.27^k)$
Basic Branching
[Figure: three vertices u, v, w]
• Lemma: A graph consists of disjoint cliques iff there are no three vertices u, v, w such that {u,v} and {u,w} are edges but {v,w} is not an edge
• i.e. any three vertices must induce no edge, a single edge, or a triangle, never exactly two edges
• Thus, if a graph is not a union of disjoint cliques, a bad triple can be found and repaired
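The lemma translates into a direct search for a bad triple. A minimal sketch, assuming the graph is an adjacency dictionary mapping each vertex to its neighbor set and that vertices are sortable; the name `find_bad_triple` is hypothetical.

```python
from itertools import combinations

def find_bad_triple(adj):
    """Return (u, v, w) with edges {u,v}, {u,w} present and {v,w} absent, else None."""
    for u in adj:
        # Any two neighbors of u that are not adjacent form a bad triple with u.
        for v, w in combinations(sorted(adj[u]), 2):
            if w not in adj[v]:
                return u, v, w
    return None
```

By the lemma, the graph is a union of disjoint cliques exactly when this function returns `None`.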
Basic Branch Algorithm
• If G is a union of disjoint cliques, return SUCCESS
• If k <= 0, return FAIL
• Otherwise, find three vertices u, v, w such that the edges {u,v} and {u,w} exist and {v,w} does not, and branch into three instances G' as follows:
  • E' = E - {u,v}, k' = k-1, and set {u,v} = DELETED
  • E' = E - {u,w}, k' = k-1, and set {u,w} = DELETED, {v,w} = DELETED, {u,v} = ADDED
  • E' = E + {v,w}, k' = k-1, and set {u,v}, {u,w} and {v,w} = ADDED
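The three-way branching can be sketched as a short recursive procedure. This is a simplified illustration, not the paper's implementation: it omits the ADDED/DELETED labels and the kernelization, so it may explore redundant branches, but it still follows the SUCCESS / FAIL / branch structure above. The function names and the `('del' | 'add', pair)` output format are assumptions.

```python
from itertools import combinations

def find_bad_triple(vertices, edges):
    """Return (u, v, w) with {u,v}, {u,w} in edges and {v,w} not, else None."""
    for u in vertices:
        nbrs = [x for x in vertices if frozenset((u, x)) in edges]
        for v, w in combinations(nbrs, 2):
            if frozenset((v, w)) not in edges:
                return u, v, w
    return None

def cluster_editing(vertices, edges, k):
    """Return a list of at most k edits ('del' or 'add', pair), or None if impossible."""
    edges = frozenset(frozenset(e) for e in edges)
    triple = find_bad_triple(vertices, edges)
    if triple is None:
        return []        # SUCCESS: already a union of disjoint cliques
    if k <= 0:
        return None      # FAIL: a bad triple remains but the budget is used up
    u, v, w = triple
    uv, uw, vw = frozenset((u, v)), frozenset((u, w)), frozenset((v, w))
    branches = (
        ("del", uv, edges - {uv}),   # branch 1: delete {u,v}
        ("del", uw, edges - {uw}),   # branch 2: delete {u,w}
        ("add", vw, edges | {vw}),   # branch 3: add {v,w}
    )
    for op, pair, new_edges in branches:
        rest = cluster_editing(vertices, new_edges, k - 1)
        if rest is not None:
            return [(op, tuple(sorted(pair)))] + rest
    return None
```

Each call spawns at most three recursive calls with budget k-1, which gives the search tree of size at most 3^k.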
Branching Rules
[Figure: the three branches on a bad triple u, v, w: BR1 deletes {u,v}, BR2 deletes {u,w}, BR3 adds {v,w}]
Running time
The algorithm solves CLUSTER EDITING in time $O(3^k \cdot k^2 + |V|^3)$
• $O(|V|^3)$ is the time required to find all bad triples
• $O(3^k)$ is the size of the search tree
• The kernel (modified input G') has $|V| = O(k^2)$ vertices, so a newly added or deleted edge can create or remove at most $O(k^2)$ bad triples. [The list of bad triples can then be updated in $O(k^2)$ time, looking only at the vertices affected by that edge.]
NOTE
• The time can be improved to $O(3^k + |V|^3)$ by repeating the kernelization at every search tree node whenever possible, which works for a problem kernel of polynomial size
• Similarly, CLUSTER DELETION can be solved in time $O(2^k + |V|^3)$
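Advanced Branch Algorithm
[Figure: the three cases C1, C2, C3 for a bad triple u, v, w]
• Bad triples are still considered, but they are classified further into three cases:
  • C1: v and w have no common neighbor other than u
  • C2: v and w have a common neighbor x ≠ u, and {u,x} is an edge
  • C3: v and w have a common neighbor x ≠ u, and {u,x} is not an edge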
Branching for each case
[Figure: case C1, a bad triple u, v, w with further neighbors u1, u2, v1, v2, w1, w2]
• For C1: BR3 cannot give a better solution than both BR1 and BR2, so it can be omitted
• If N(v) >= N(w), then the total number of edges changed to make one clique >= the total number of edges changed to make two cliques
C1
[Figure: case C1, a bad triple u, v, w with further neighbors u1, u2, v1, v2, w1, w2]
• Edges added to make 1 clique:
  • {v,w} added = +1
  • {v,N(w)} added - {u,v} existing = N(v) - 1
  • {w,N(v)} added - {u,w} existing = N(w) - 1
  • joining all N(w) and N(v) = $\binom{N(w)+N(v)}{2}$
  • joining each N(v) and N(w) with u = N(v) + N(w)
  • Total = $2\,[N(v) + N(w)] + \binom{N(w)+N(v)}{2} - 1$ (A)
• Edges changed to make 2 cliques:
  • N(w) deleted = N(w)
  • {v,N(w)} added - {u,v} existing = N(v) - 1
  • joining all N(w) and N(v) = $\binom{N(w)+N(v)}{2}$
  • joining each N(v) and N(w) with u = N(v) + N(w)
  • Total = $N(v) + 3\,N(w) + \binom{N(w)+N(v)}{2} - 1$ (B)
• Conclusion: since N(v) >= N(w), (A) >= (B).
[Figure: the two remaining branches BR1 and BR2 on the triple u, v, w]
• Thus only BR1 and BR2 need to be used:
  • The resulting graphs are G \ {u,v} or G \ {u,w}, and the branching vector is (1,1)
  • The final recurrence relation is $T(k) = 2 \cdot T(k-1)$, with characteristic root 2
  • So the final search tree size for C1 is $2^k$
For C2: • Branching Vector = (1,2,3,2,3)
For C3: • Branching Vector = (1,2,3,2,3)
Overall Running Time
• Solve $T(k) = T(k-1) + 2\,[T(k-2) + T(k-3)]$
• So the worst-case search tree size is $O(2.27^k)$
• Thus CLUSTER EDITING can be solved in $O(2.27^k + |V|^3)$
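The base 2.27 can be checked numerically. For a branching vector $(d_1, \dots, d_r)$, the search tree size is $O(x^k)$ where $x > 1$ solves $\sum_i x^{-d_i} = 1$. The sketch below finds that root by bisection; the function name `branching_number` and the tolerance are assumptions chosen for illustration.

```python
def branching_number(vector, tol=1e-9):
    """Return the root x > 1 of sum(x**-d for d in vector) == 1, found by bisection."""
    def excess(x):
        return sum(x ** -d for d in vector) - 1.0

    lo, hi = 1.0, 2.0
    # Grow the upper bound until the sum drops to 1 or below.
    while excess(hi) > 0:
        hi *= 2
    # excess() is strictly decreasing for x > 1, so bisection applies.
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if excess(mid) > 0:
            lo = mid
        else:
            hi = mid
    return hi
```

Calling `branching_number([1, 2, 3, 2, 3])` gives a value of roughly 2.27, and `branching_number([1, 1, 1])` gives roughly 3, matching the basic three-way branching.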
Cases for CLUSTER DELETION:
• Branching vector = (2,3,2,3) and running time = $O(1.77^k + |V|^3)$
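The base 1.77 follows from the branching vector in the same way as for CLUSTER EDITING. Written out as a recurrence and its characteristic equation (a worked restatement of the vector above, with the root rounded to two decimals):

```latex
\[
T(k) = 2\,T(k-2) + 2\,T(k-3)
\quad\Longrightarrow\quad
x^3 = 2x + 2
\quad\Longrightarrow\quad
x \approx 1.77 .
\]
```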
Questions? Thanks.