Greedy Approximation Algorithms for finding Dense Components in a Graph

Paper by Moses Charikar • Presentation by Paul Horn

Overview • Differing definitions of density • The problem • Undirected Case • Linear Programming • Network Flows • Approximation • Directed Case • Linear Programming • Approximation

Defining Density • The natural definition of density relates the number of edges to the number of possible edges. In other words, given G(V,E), the density is $|E| / \binom{|V|}{2}$.

Problems with Density • This simple definition of density does not make sense when looking for a densest subgraph: any two vertices connected by an edge already form a subgraph of density 1, and asking for the largest subgraph of density 1 is the maximum clique problem.

Redefining Density • Instead we define density through the average degree of a subgraph: $f(S) = |E(S)| / |S|$, which is half the average degree of the subgraph induced by S. • This definition of density is appropriate for sparse graphs. • This definition is, however, inappropriate for Erdős-Rényi random graphs.

Density of a Directed Graph • Introduced by Kannan and Vinay. • Given a digraph G(V,E), consider vertex subsets S, T ⊆ V and let E(S,T) be the set of directed edges from S to T. Then the density of the sets S and T is $d(S,T) = \frac{|E(S,T)|}{\sqrt{|S|\,|T|}}$. • The density of the graph G is $d(G) = \max_{S,T \subseteq V} d(S,T)$.
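As a small illustration of the directed definition, the sketch below evaluates the density of one pair of sets, assuming the Kannan-Vinay form d(S,T) = |E(S,T)| / sqrt(|S||T|). The function name `directed_density` and the edge-list representation are illustrative choices, not something taken from the slides.

```python
from math import sqrt

def directed_density(edges, S, T):
    """Return |E(S, T)| / sqrt(|S| * |T|) for non-empty vertex sets S and T.

    `edges` is an iterable of directed pairs (u, v); this representation is
    an assumption made for the example.
    """
    S, T = set(S), set(T)
    # Count the edges that start in S and end in T.
    crossing = sum(1 for u, v in edges if u in S and v in T)
    return crossing / sqrt(len(S) * len(T))

# Every vertex of {0, 1} points at every vertex of {2, 3}; (3, 0) does not count.
example_edges = [(0, 2), (0, 3), (1, 2), (1, 3), (3, 0)]
print(directed_density(example_edges, {0, 1}, {2, 3}))  # 4 / sqrt(2 * 2) = 2.0
```

Note that S and T are allowed to overlap; nothing in the function requires them to be disjoint.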

The problem • Known exact algorithms for finding a maximum density subgraph of a graph are cubic or slower. • For large graphs, such as the webgraph or even any sizable chunk of it, this is too slow.

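Linear programming • In the undirected case an exact solution can be found by solving the following LP: maximize $\sum_{ij \in E} x_{ij}$ subject to $x_{ij} \le y_i$ and $x_{ij} \le y_j$ for every edge $ij \in E$, $\sum_i y_i \le 1$, and $x_{ij}, y_i \ge 0$.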

Go with the flow? • A flow-based algorithm to find a maximum density subgraph exists. • "Finding a Maximum Density Subgraph", by A.V. Goldberg. • Creates a digraph from the undirected graph, and uses flows to partition the graph. • Requires O(log n) executions of a max flow algorithm.

Getting Greedy… • Since the density of a subgraph S is its average degree, nodes of lowest degree are least likely to be part of the densest subgraph. • Algorithm: repeatedly remove the vertex of lowest degree, and among all the subgraphs produced along the way return the one of maximum density. • Runs in O(|V| + |E|) time. • Theorem: the algorithm is a 2-approximation of the maximum of f(S).
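The peeling procedure above can be sketched as follows. This is a minimal illustration rather than the paper's implementation: it assumes density is measured as f(S) = |E(S)|/|S|, takes the graph as a list of undirected edges, and uses a binary heap with lazy deletion, so it runs in O(m log n) rather than linear time. The name `greedy_densest_subgraph` is hypothetical, and vertices are assumed to be mutually comparable (for example, integers) so that heap ties can be broken.

```python
import heapq
from collections import defaultdict

def greedy_densest_subgraph(edges):
    """Peel minimum-degree vertices; return (best vertex set, its density)."""
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-loops in this sketch
            adj[u].add(v)
            adj[v].add(u)
    if not adj:
        return set(), 0.0

    degree = {v: len(nbrs) for v, nbrs in adj.items()}
    heap = [(d, v) for v, d in degree.items()]
    heapq.heapify(heap)

    alive = set(adj)
    m = sum(degree.values()) // 2        # edges among the alive vertices
    removed = []                         # vertices in order of removal
    best_density, best_removed = m / len(alive), 0

    while heap:
        d, v = heapq.heappop(heap)
        if v not in alive or d != degree[v]:
            continue                     # stale heap entry, skip it
        alive.remove(v)
        removed.append(v)
        m -= degree[v]                   # degree[v] counts alive neighbours only
        for w in adj[v]:
            if w in alive:
                degree[w] -= 1
                heapq.heappush(heap, (degree[w], w))
        if alive and m / len(alive) > best_density:
            best_density, best_removed = m / len(alive), len(removed)

    # The best subgraph is everything except the first `best_removed` removals.
    return set(adj) - set(removed[:best_removed]), best_density

# A 4-clique on {0, 1, 2, 3} with a short tail 3-4-5 attached.
clique_with_tail = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3), (3, 4), (4, 5)]
print(greedy_densest_subgraph(clique_with_tail))  # ({0, 1, 2, 3}, 1.5)
```

Replacing the heap with an array of degree buckets is the usual way to bring this down to linear time; the heap is used here only to keep the sketch short.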

Directing our Insight • Finding the maximum d(S,T) is harder, as we need to take the maximum over all pairs of subsets S and T. • For the exact solution, we can generalize our LP to use |S|/|T| = c as a parameter, giving a new linear program LP(c). • The exact maximum can then be found by solving $O(n^2)$ linear programs, one for each possible value of c.

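LP(c) • Maximize $\sum_{(i,j) \in E} x_{ij}$ subject to $x_{ij} \le s_i$ and $x_{ij} \le t_j$ for every edge $(i,j) \in E$, $\sum_i s_i \le \sqrt{c}$, $\sum_j t_j \le 1/\sqrt{c}$, and $x_{ij}, s_i, t_j \ge 0$. • A solution to this linear program corresponds to the densest sets S, T such that |S|/|T| = c for the given value of c. Therefore $d(G) = \max_c \mathrm{OPT}(LP(c))$.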

Approximate this. • Idea: maintain two sets, S and T. At each iteration remove either the vertex of lowest ‘degree’ in S or the vertex of lowest ‘degree’ in T, based on a certain rule. • We define the degree of a vertex x in S to be |E({x}, T)| and the degree of a vertex y in T to be |E(S,{y})|. • Our rule is based on the same idea of c = |S|/|T| that we found in the linear program, so each pass finds an S and T that maximize the density for that particular c.
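A single pass of this idea, for one fixed value of c, might look like the sketch below. The slides do not spell out the removal rule, so the comparison used here (remove from S when sqrt(c) times the minimum out-degree in S is at most the minimum in-degree in T divided by sqrt(c)) is an assumption modelled on the paper's description, as is starting with S = T = V. The function name `greedy_directed` is hypothetical, and the code recomputes degrees every round for clarity, so it does not achieve the per-pass running time quoted in the talk.

```python
from math import sqrt

def greedy_directed(edges, c):
    """One greedy pass for a fixed ratio c; returns (density, S, T)."""
    edges = set(edges)
    vertices = {x for e in edges for x in e}
    S, T = set(vertices), set(vertices)   # assumption: both sets start as V
    best = (0.0, set(), set())

    while S and T:
        live = [(u, v) for u, v in edges if u in S and v in T]
        density = len(live) / sqrt(len(S) * len(T))
        if density > best[0]:
            best = (density, set(S), set(T))

        # Degree of x in S is |E({x}, T)|; degree of y in T is |E(S, {y})|.
        out_deg = {u: 0 for u in S}
        in_deg = {v: 0 for v in T}
        for u, v in live:
            out_deg[u] += 1
            in_deg[v] += 1

        # Sorting first makes tie-breaking deterministic (smallest vertex wins).
        i_min = min(sorted(S), key=out_deg.get)
        j_min = min(sorted(T), key=in_deg.get)

        # Assumed rule: compare the two minimum degrees, rescaled by c.
        if sqrt(c) * out_deg[i_min] <= in_deg[j_min] / sqrt(c):
            S.remove(i_min)
        else:
            T.remove(j_min)

    return best

# {0, 1} points at all of {2, 3}; one extra edge 4 -> 0 acts as noise.
hub_edges = [(0, 2), (0, 3), (1, 2), (1, 3), (4, 0)]
density, S, T = greedy_directed(hub_edges, c=1.0)
print(density, S, T)  # 2.0 {0, 1} {2, 3}
```

To approximate d(G) rather than the best pair for one ratio, the pass would be repeated over the candidate values of c and the densest result kept.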

Analyzing our Approximation • When run over all values of c, this algorithm gives us a 2-approximation of d(G). • There are, however, roughly $n^2$ possible values of c. • Each iteration can run in O(m+n) time. • Therefore running through all possible values becomes prohibitive. • A $(2+\epsilon)$-approximation is possible in $O(\log n / \epsilon)$ iterations of the algorithm.

Generalizations and notes • While there is a flow-based algorithm for finding a maximum density subgraph of an undirected graph, none is known for a digraph. • Both cases can be generalized to weighted graphs, but the algorithm then no longer runs in linear time. • Using Fibonacci heaps it can run in O(m + n log n) time (in the directed case, for a single value of c).

Wrapping Up • Finding dense subgraphs is important in areas such as clustering. • The Kannan and Vinay definition of density is motivated by the idea of hubs and authorities. • With large graphs (such as any sizable chunk of the webgraph), solving the $n^2$ LPs to find the exact densest subgraph is unrealistic.

Wrapping Up: The Sequel • Therefore, the paper • Provides LP solutions to both the directed and undirected cases • Provides a linear-time approximation algorithm for undirected graphs • Generalizes the algorithm to directed graphs, finding sets S and T given |S|/|T| = c • Observes that this is a 2-approximation when run over all values of c, and that a $(2+\epsilon)$-approximation is possible in $O(\log n / \epsilon)$ iterations.

Future Work • A flow-based algorithm for the directed case. • The definition of density which we used does not require S and T to be disjoint. How does this requirement affect the algorithm and its complexity? • An n-approximation of d(G) can provide an O(n)-approximation of d’(G)
