Extrapolation Methods for Accelerating PageRank Computations Doğu Gül Boğaziçi University 1/12/2003
Introduction • A fast method is proposed for computing PageRank, a hyperlink-based estimate of the “importance” of Web pages. • The Web link graph is represented by a “Markov matrix”. • The PageRank algorithm uses the “Power Method” to compute the principal eigenvector of this Markov matrix. • Empirically, extrapolation methods are shown to speed up PageRank computation by 25-300%.
Definitions • A link from a page u to a page v can be viewed as evidence that v is an “important” page. • The importance that a link from page u confers on page v is proportional to the importance of u and inversely proportional to the number of pages u points to. • The PageRank of a page i is defined as the probability that, at some particular time step, a random surfer is at page i.
Adopting Markov Matrix • The problem can be defined as a random walk on the directed Web graph G. • Assume there exists an edge from u to v. • Deg(u) is the outdegree of page u in G. • Consider a random surfer visiting page u at time k. In the next time step, the surfer chooses a node v from among u’s out-neighbors uniformly at random. • The transition matrix P describes the transition from page i to page j: Pij = 1 / deg(i) if there is an edge from i to j, and Pij = 0 otherwise.
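The definition of P can be sketched in a few lines of Python. This is only an illustration on a toy graph: the helper name `transition_matrix`, the `out_links` dictionary format, and the use of a dense NumPy array are assumptions made here, not part of the original slides.

```python
import numpy as np

def transition_matrix(out_links, n):
    """Build P with P[i, j] = 1/deg(i) for each edge i -> j, else 0.

    `out_links` maps a page index to the pages it links to
    (a hypothetical input format chosen for this sketch).
    """
    P = np.zeros((n, n))
    for i, targets in out_links.items():
        targets = set(targets)              # ignore duplicate links
        for j in targets:
            P[i, j] = 1.0 / len(targets)    # uniform choice among out-neighbors
    return P

# Toy graph: 0 -> {1, 2}, 1 -> {2}, page 2 has no out-links.
P = transition_matrix({0: [1, 2], 1: [2], 2: []}, n=3)
```

Note that the row for page 2 is all zeros, which is the case the next slide deals with.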
Conversion to a Valid Transition Matrix • For P to be a valid transition matrix, it must have no rows consisting of all zeros. • A new transition matrix P’ is introduced that has no all-zero rows. • Let d be the n-dimensional column vector identifying the nodes with outdegree 0: d(i) = 1 if deg(i) = 0, and d(i) = 0 otherwise.
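Conversion to a Valid Transition Matrix (cont.) • Then P’ is constructed as follows: P’ = P + D, where D = d v^T and v is a probability vector over the n pages (uniform, with entries 1/n, in the standard setting). • P’’ is constructed as follows: P’’ = c P’ + (1 − c) E, where E = [1]n×1 v^T and c is the probability that the surfer follows a link; with probability 1 − c the surfer jumps to a page chosen according to v.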
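A minimal sketch of the two-step conversion, assuming the standard PageRank construction P’ = P + d v^T and P’’ = c P’ + (1 − c) [1] v^T with a uniform v. The function name `damped_transition_matrix` and the dense representation are choices made for this illustration; a Web-scale implementation would never form these matrices explicitly.

```python
import numpy as np

def damped_transition_matrix(P, c, v=None):
    """Return P'' = c (P + d v^T) + (1 - c) 1 v^T (assumed standard construction)."""
    n = P.shape[0]
    v = np.full(n, 1.0 / n) if v is None else np.asarray(v, dtype=float)
    d = (P.sum(axis=1) == 0).astype(float)   # d[i] = 1 for pages with outdegree 0
    P1 = P + np.outer(d, v)                  # P': all-zero rows replaced by v
    E = np.outer(np.ones(n), v)              # E: every row equals v
    return c * P1 + (1.0 - c) * E

# Toy matrix whose last row is all zeros (a page without out-links).
P = np.array([[0.0, 0.5, 0.5],
              [0.0, 0.0, 1.0],
              [0.0, 0.0, 0.0]])
P2 = damped_transition_matrix(P, c=0.90)
```

Every row of `P2` sums to 1 and every entry is positive, so the random walk it describes has a unique stationary distribution.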
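Power Method • The matrix A = (P’’)^T is used in the formulation of the “Power Method”: x(k) = A x(k-1). • Assuming A has eigenvectors u1, ..., um with eigenvalues 1 = λ1 > |λ2| ≥ ... ≥ |λm|, x(0) can be written as follows: x(0) = u1 + α2u2 + ..... + αmum • Repeated multiplication by A then gives x(k) = u1 + α2 λ2^k u2 + ..... + αm λm^k um, so x(k) converges to the principal eigenvector u1.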
Power Method Algorithm • The power method algorithm:
PowerMethod() {
    x(0) = v
    k = 1
    repeat
        x(k) = A x(k-1)
        a = ||x(k) – x(k-1)||_1
        k = k + 1
    until a < ε
}
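The pseudocode above translates almost directly into Python. The sketch below assumes a small dense column-stochastic matrix A and measures the change between iterates in the L1 norm; the function name, the default tolerance, and the iteration cap are assumptions made for illustration.

```python
import numpy as np

def power_method(A, v, eps=1e-8, max_iter=1000):
    """Iterate x(k) = A x(k-1) until the L1 change drops below eps.

    Returns the final iterate and the number of iterations used.
    """
    x = np.asarray(v, dtype=float)
    for k in range(1, max_iter + 1):
        x_next = A @ x
        delta = np.abs(x_next - x).sum()   # L1 norm of the change
        x = x_next
        if delta < eps:
            break
    return x, k

# Toy 2-page chain; columns sum to 1, stationary vector is (2/3, 1/3).
A = np.array([[0.9, 0.2],
              [0.1, 0.8]])
x, iterations = power_method(A, v=[0.5, 0.5])
```

The number of iterations is governed by the second eigenvalue of A (0.7 in this toy example), which is why convergence slows down as c approaches 1.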
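Aitken Extrapolation • Assume x(k-2) can be expressed as a linear combination of the first two eigenvectors only. • x(k-2) = u1 + α2u2 • x(k-1) = A x(k-2) • x(k) = A x(k-1) • Under this assumption the three successive iterates can be solved for u1 in closed form.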
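Under the two-eigenvector assumption, u1 can be recovered component by component with the classical Aitken formula, u1(i) = x(k-2)(i) − (x(k-1)(i) − x(k-2)(i))² / (x(k)(i) − 2 x(k-1)(i) + x(k-2)(i)). The sketch below is one way to write that step; the function name and the handling of zero denominators (falling back to x(k)) are assumptions made for this illustration.

```python
import numpy as np

def aitken_extrapolate(x_km2, x_km1, x_k, tiny=1e-15):
    """Componentwise Aitken estimate of u1 from three successive iterates."""
    x_km2 = np.asarray(x_km2, dtype=float)
    x_km1 = np.asarray(x_km1, dtype=float)
    x_k = np.asarray(x_k, dtype=float)
    g = (x_km1 - x_km2) ** 2
    h = x_k - 2.0 * x_km1 + x_km2
    ok = np.abs(h) > tiny          # skip components with a vanishing denominator
    out = x_k.copy()               # fall back to the latest iterate there
    out[ok] = x_km2[ok] - g[ok] / h[ok]
    return out

# With only two eigenvectors the assumption holds exactly.
A = np.array([[0.9, 0.2],
              [0.1, 0.8]])
x0 = np.array([1.0, 0.0])
x1 = A @ x0
x2 = A @ x1
u1 = aitken_extrapolate(x0, x1, x2)
```

On a real Web matrix the assumption is only approximate, so the result is used as an improved starting point for further power iterations rather than as the final answer.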
Aitken Extrapolation Results • Comparison of convergence rate of unaccelerated Power Method and Aitken Extrapolation for c = 0.99. • Extrapolation was applied at the 10th iteration.
Quadratic Extrapolation • It is assumed that the Markov matrix A has only three eigenvectors, so that x(k-3) can be expressed as a linear combination of these three eigenvectors. • x(k-3) = u1 + α2u2 + α3u3 • x(k-2) = A x(k-3) • x(k-1) = A x(k-2) • x(k) = A x(k-1)
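One way to turn the three-eigenvector assumption into an estimate of u1 is to fit the coefficients of the characteristic polynomial by least squares on the differences y(j) = x(j) − x(k-3), and then combine the last three iterates. The sketch below follows that idea; the function name, the use of `numpy.linalg.lstsq`, and the final renormalization to unit sum are assumptions made for this illustration.

```python
import numpy as np

def quadratic_extrapolate(x_km3, x_km2, x_km1, x_k):
    """Estimate u1 from four successive power-method iterates."""
    y_km2 = x_km2 - x_km3
    y_km1 = x_km1 - x_km3
    y_k = x_k - x_km3
    # Solve g1*y(k-2) + g2*y(k-1) = -y(k) in the least-squares sense (g3 = 1).
    Y = np.column_stack([y_km2, y_km1])
    g1, g2 = np.linalg.lstsq(Y, -y_k, rcond=None)[0]
    g3 = 1.0
    # Coefficients of the quotient polynomial after dividing out (lambda - 1).
    b0 = g1 + g2 + g3
    b1 = g2 + g3
    b2 = g3
    x_star = b0 * x_km2 + b1 * x_km1 + b2 * x_k
    return x_star / x_star.sum()   # renormalize (a choice made for this sketch)

# A 3x3 column-stochastic matrix has exactly three eigenvectors,
# so the assumption holds exactly on this toy example.
A = np.array([[0.5, 0.2, 0.1],
              [0.3, 0.6, 0.3],
              [0.2, 0.2, 0.6]])
x0 = np.array([1.0, 0.0, 0.0])
x1 = A @ x0
x2 = A @ x1
x3 = A @ x2
u1 = quadratic_extrapolate(x0, x1, x2, x3)
```

The extra cost is a least-squares solve with an n-by-2 matrix, which is small compared with a multiplication by A.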
Quadratic Extrapolation Results • Comparison of convergence rates for Power Method and Quadratic Extrapolation on LARGEWEB for c = 0.90.
Quadratic Extrapolation Results • Comparison of times taken by Power Method and Quadratic Extrapolation on LARGEWEB for c = {0.90, 0.95, 0.99} • The residual tolerance is set to 0.001 for c = {0.90, 0.95} and 0.01 for c = 0.99.
Comparison of Convergence Rates for Three Methods • Comparison of convergence rates for Power Method, Aitken Extrapolation and Quadratic Extrapolation for c = 0.99.
Conclusion • Although PageRank is an offline computation, it has become increasingly desirable to speed it up. • The extrapolation step need only be applied periodically, not at every iteration. • Quadratic and Aitken extrapolation are simple techniques that require little additional infrastructure to integrate into the standard Power Method.
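To illustrate how little infrastructure is needed, the sketch below wraps a plain power iteration and calls an extrapolation routine every `period` iterations. The driver name, the `window` and `period` parameters, the compact Aitken helper, and the renormalization are all assumptions made for this illustration, not the authors' implementation.

```python
import numpy as np

def accelerated_power_method(A, v, extrapolate, window, period=10,
                             eps=1e-8, max_iter=1000):
    """Power method that applies `extrapolate` to the last `window` iterates
    every `period` iterations. Returns the final iterate and iteration count."""
    x = np.asarray(v, dtype=float)
    history = [x]
    for k in range(1, max_iter + 1):
        x_next = A @ x
        delta = np.abs(x_next - x).sum()          # L1 residual
        history = (history + [x_next])[-window:]  # keep only recent iterates
        if delta < eps:
            return x_next, k
        if k % period == 0 and len(history) == window:
            x_next = extrapolate(*history)
            history = [x_next]                    # restart the window
        x = x_next
    return x, max_iter

def aitken(x_km2, x_km1, x_k):
    """Compact componentwise Aitken step (falls back to x_k where undefined)."""
    h = x_k - 2.0 * x_km1 + x_km2
    ok = np.abs(h) > 1e-15
    out = x_k.copy()
    out[ok] = x_km2[ok] - (x_km1[ok] - x_km2[ok]) ** 2 / h[ok]
    return out / out.sum()                        # renormalize (sketch choice)

A = np.array([[0.9, 0.2],
              [0.1, 0.8]])
x, iterations = accelerated_power_method(A, [0.5, 0.5], aitken,
                                         window=3, period=3)
```

Passing the extrapolation routine as an argument keeps the power-method loop unchanged, so a quadratic variant can be swapped in by supplying a four-argument function and `window=4`.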