Using Adaptive Methods for Updating/Downdating PageRank
Gene H. Golub, Stanford University SCCM
Joint work with Sep Kamvar and Taher Haveliwala
Motivation • Problem: compute PageRank after the Web has changed slightly. • Motivation: "Freshness". • Note: since the Web is growing, PageRank computations don't get faster as computers do.
Outline • Definition of PageRank • Computation of PageRank • Convergence Properties • Outline of Our Approach • Empirical Results
Link Counts • [Diagram comparing two pages: one linked by 2 unimportant pages, the other linked by 2 important pages. Nodes include Gene's Home Page, Martin's Home Page, Donald Rumsfeld, George W. Bush, Iain Duff's Home Page, and Yahoo!]
Definition of PageRank • The importance of a page is given by the importance of the pages that link to it: importance of page i = Σ over pages j that link to i of (importance of page j) / (number of outlinks from page j), i.e. xᵢ = Σⱼ→ᵢ xⱼ / Nⱼ.
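A minimal sketch of this recurrence on a made-up 3-page graph (the pages and links are invented for illustration, and there is no damping yet; that refinement comes later in the talk):

```python
# Hypothetical 3-page web graph: page -> pages it links to.
links = {
    "A": ["B", "C"],
    "B": ["C"],
    "C": ["A"],
}
rank = {p: 1.0 / len(links) for p in links}     # start uniform

for _ in range(50):                              # iterate the recurrence
    new_rank = {p: 0.0 for p in links}
    for j, outlinks in links.items():
        share = rank[j] / len(outlinks)          # x_j / N_j
        for i in outlinks:
            new_rank[i] += share                 # x_i = sum over j -> i of x_j / N_j
    rank = new_rank

print(rank)   # converges to the stationary importance scores
```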
Definition of PageRank • [Diagram: example graph over the pages Gene, Martin, SCCM, Yahoo!, and Duff, with edge weights 1/2, 1/2, 1, 1 and node ranks 0.05, 0.25, 0.1, 0.1, 0.1.]
PageRank Diagram • Initialize all nodes to rank 1/3. [Diagram: three nodes, each labeled 0.333.]
PageRank Diagram • Propagate ranks across links (multiplying by link weights). [Diagram values: 0.167, 0.333, 0.333, 0.167.]
PageRank Diagram • [Three more propagation steps; the ranks keep shifting across links on each iteration. Diagram values: 0.5/0.333/0.167, then 0.167/0.5/0.167/0.167, then 0.333/0.5/0.167.]
PageRank Diagram • After a while, the ranks settle to the stationary values (0.4, 0.4, 0.2).
Matrix Notation • The propagation step is a matrix-vector product: the rank vector is multiplied by Pᵀ, the transposed link matrix, where Pᵢⱼ = 1/deg(i) if page i links to page j. [Slide shows a worked 5-page example whose rank vector (.1, .3, .2, .3, .1) is reproduced by Pᵀ.]
Matrix Notation • Find x that satisfies: x = Pᵀx.
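As an illustration, a sketch that builds Pᵀ for the hypothetical 3-page graph from earlier and recovers the stationary x with a dense eigensolver (fine at this size; the real computation is a sparse power iteration):

```python
import numpy as np

# Row-stochastic link matrix P for the toy graph: P[i, j] = 1/deg(i)
# if page i links to page j.
P = np.array([
    [0.0, 0.5, 0.5],   # A links to B and C
    [0.0, 0.0, 1.0],   # B links to C
    [1.0, 0.0, 0.0],   # C links to A
])

# PageRank is the eigenvector of P^T for eigenvalue 1, normalized to sum to 1.
vals, vecs = np.linalg.eig(P.T)
x = np.real(vecs[:, np.argmax(np.real(vals))])
x = x / x.sum()

print(x)                          # the stationary rank vector
print(np.allclose(P.T @ x, x))    # True: x satisfies x = P^T x
```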
Eigenvalue Distribution • The matrix Pᵀ has several eigenvalues on the unit circle. This will make power-method-like algorithms less effective.
Rank-1 Correction • PageRank doesn't actually use Pᵀ. Instead, it uses A = cPᵀ + (1-c)Eᵀ. • E is a rank-1 matrix, and typically c = 0.85. • This ensures a unique solution and fast convergence. • For matrix A, λ₂ = c.¹ • ¹ From "The Second Eigenvalue of the Google Matrix" (http://dbpubs.stanford.edu/pub/2003-20)
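A common way to apply A without materializing the rank-1 term, assuming E = evᵀ with teleport distribution v (a standard choice, not spelled out on the slide): since eᵀx = 1 for a probability vector x, Eᵀx = v eᵀx = v, and the correction costs only a vector addition. A minimal sketch:

```python
import numpy as np

def google_matvec(PT, x, v, c=0.85):
    """Compute A x = c P^T x + (1-c) v for a probability vector x.

    Assumes E = e v^T, so E^T x = v whenever sum(x) = 1. Names are
    illustrative, not from the talk.
    """
    return c * (PT @ x) + (1 - c) * v
```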
Outline • Definition of PageRank • Computation of PageRank • Convergence Properties • Outline of Our Approach • Empirical Results
Power Method • Initialize: x⁽⁰⁾ = v (e.g., the uniform vector). • Repeat until convergence: x⁽ᵏ⁺¹⁾ = Ax⁽ᵏ⁾.
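A minimal sketch of this loop, using the damped matvec from the previous sketch; the function name, tolerance, and L1 stopping test are illustrative choices, not the talk's exact criteria:

```python
import numpy as np

def pagerank_power(PT, v, c=0.85, tol=1e-8, max_iter=1000):
    """Power method sketch: x^(k+1) = A x^(k) with A = c P^T + (1-c) e v^T."""
    x = v.copy()                               # initialize x^(0) = v
    for _ in range(max_iter):
        x_new = c * (PT @ x) + (1 - c) * v     # one iteration of A
        if np.abs(x_new - x).sum() < tol:      # L1 change as stopping test
            return x_new
        x = x_new
    return x

# Usage on the toy graph: pagerank_power(P.T, np.full(3, 1/3))
```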
Power Method • Express x⁽⁰⁾ in terms of the eigenvectors of A: x⁽⁰⁾ = u₁ + a₂u₂ + a₃u₃ + a₄u₄ + a₅u₅.
Power Method • x⁽¹⁾ = Ax⁽⁰⁾ = u₁ + a₂λ₂u₂ + a₃λ₃u₃ + a₄λ₄u₄ + a₅λ₅u₅.
Power Method • x⁽²⁾ = Ax⁽¹⁾ = u₁ + a₂λ₂²u₂ + a₃λ₃²u₃ + a₄λ₄²u₄ + a₅λ₅²u₅.
Power Method • x⁽ᵏ⁾ = u₁ + a₂λ₂ᵏu₂ + a₃λ₃ᵏu₃ + a₄λ₄ᵏu₄ + a₅λ₅ᵏu₅.
Power Method • As k → ∞, the coefficients of u₂, …, u₅ go to 0, so x⁽ᵏ⁾ → u₁.
Why does it work? • Imagine our n×n matrix A has n linearly independent eigenvectors uᵢ. • Then you can write any n-dimensional vector as a linear combination of the eigenvectors of A: x⁽⁰⁾ = u₁ + a₂u₂ + a₃u₃ + … + aₙuₙ.
Why does it work? • From the last slide: x⁽⁰⁾ = u₁ + a₂u₂ + … + aₙuₙ. • To get the first iterate, multiply x⁽⁰⁾ by A: x⁽¹⁾ = Ax⁽⁰⁾ = u₁ + a₂λ₂u₂ + … + aₙλₙuₙ. • The first eigenvalue is λ₁ = 1, and the remaining |λᵢ| are all less than 1. • Therefore, the u₂, …, uₙ terms shrink with every iteration.
Power Method • Recap of the iterates: x⁽⁰⁾ = u₁ + a₂u₂ + a₃u₃ + … → x⁽¹⁾ = u₁ + a₂λ₂u₂ + a₃λ₃u₃ + … → x⁽²⁾ = u₁ + a₂λ₂²u₂ + a₃λ₃²u₃ + ….
Outline • Definition of PageRank • Computation of PageRank • Convergence Properties • Outline of Our Approach • Empirical Results
Convergence • Since x⁽ᵏ⁾ = u₁ + a₂λ₂ᵏu₂ + a₃λ₃ᵏu₃ + …, the smaller |λ₂|, the faster the convergence of the Power Method.
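A tiny numeric illustration of this rate (ours, not from the talk), using a diagonal matrix whose eigenvalues we control directly:

```python
import numpy as np

# The power-method error decays like |lambda_2|^k, so a smaller
# lambda_2 means faster convergence.
for lam2 in (0.99, 0.85, 0.50):
    A = np.diag([1.0, lam2, 0.3])        # eigenvalues 1 > lam2 > 0.3
    x = np.array([1.0, 1.0, 1.0])        # a2 = a3 = 1 in the eigenbasis
    for _ in range(50):
        x = A @ x
    err = np.abs(x - np.array([1.0, 0.0, 0.0])).max()
    print(f"lambda2 = {lam2}: error after 50 iterations ~ {err:.1e}")
```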
Quadratic Extrapolation (Joint work with Kamvar and Haveliwala) • Estimate the components of the current iterate in the directions of the second and third eigenvectors, and eliminate them.
Facts that work in our favor • For traditional problems: A is smaller, often dense; λ₂ is often close to λ₁, making the power method slow. • In our problem: A is huge and sparse; more importantly, λ₂ is small.¹ • ¹ "The Second Eigenvalue of the Google Matrix" (dbpubs.stanford.edu/pub/2003-20)
How do we do this? • Assume x⁽ᵏ⁾ can be written as a linear combination of the first three eigenvectors (u₁, u₂, u₃) of A. • Compute an approximation to the {u₂, u₃} components, and subtract it from x⁽ᵏ⁾ to get x⁽ᵏ⁾′.
Sequence Extrapolation • A classical and important field in numerical analysis: techniques for accelerating the convergence of slowly convergent infinite series and integrals.
Example: Aitken Δ²-Process • Suppose Aₙ = A + aλⁿ + rₙ, where rₙ = bμⁿ + o(min{1, |μ|ⁿ}), with a, b, λ, μ all nonzero and |λ| > |μ|. • It can be shown that Sₙ = (AₙAₙ₊₂ − Aₙ₊₁²)/(Aₙ − 2Aₙ₊₁ + Aₙ₊₂) satisfies |Sₙ − A| / |Aₙ − A| = O((|μ|/|λ|)ⁿ) = o(1) as n → ∞.
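A direct transcription of the Δ² formula, applied to the classic slowly convergent Leibniz series for π as a worked example (the example series is ours, not the talk's):

```python
# Aitken's Delta^2 process applied to a scalar sequence A_n -> A:
# S_n = (A_n * A_{n+2} - A_{n+1}^2) / (A_n - 2*A_{n+1} + A_{n+2}).
def aitken(seq):
    return [
        (seq[n] * seq[n + 2] - seq[n + 1] ** 2)
        / (seq[n] - 2 * seq[n + 1] + seq[n + 2])
        for n in range(len(seq) - 2)
    ]

# Partial sums of 4 * (1 - 1/3 + 1/5 - ...) -> pi, which converge
# slowly and alternate around the limit.
partial, total, sign = [], 0.0, 1.0
for k in range(10):
    total += sign * 4.0 / (2 * k + 1)
    partial.append(total)
    sign = -sign

print(partial[-1])           # ~3.04: still far from pi
print(aitken(partial)[-1])   # much closer to pi
```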
In other words… Assuming a certain pattern for the series is helpful in accelerating convergence. We can apply this component-wise in order to get a better estimate of the eigenvector.
Another approach • Assume x⁽ᵏ⁾ can be represented by the first three eigenvectors of A: x⁽ᵏ⁾ = u₁ + a₂λ₂ᵏu₂ + a₃λ₃ᵏu₃.
Linear Combination • We take some linear combination of these 3 iterates: y = β₁x⁽ᵏ⁾ + β₂x⁽ᵏ⁺¹⁾ + β₃x⁽ᵏ⁺²⁾.
Rearranging Terms • We can rearrange the terms to get: y = (β₁ + β₂ + β₃)u₁ + a₂(β₁ + β₂λ₂ + β₃λ₂²)λ₂ᵏu₂ + a₃(β₁ + β₂λ₃ + β₃λ₃²)λ₃ᵏu₃. • Goal: find β₁, β₂, β₃ so that the coefficients of u₂ and u₃ are 0, and the coefficient of u₁ is 1.
Results • Quadratic Extrapolation speeds up convergence. • Extrapolation was used only 5 times.
Estimating the coefficients • Procedure 1: set β₁ = 1 and solve the resulting least-squares problem. • Procedure 2: use the SVD to compute the coefficients of the characteristic polynomial.
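A sketch of this estimation step, reconstructed from the published Quadratic Extrapolation algorithm (Kamvar, Haveliwala, Manning, and Golub, 2003) rather than from the slide itself: one coefficient is fixed to 1 and a small least-squares problem is solved for the others, as in Procedure 1. Variable names and the final renormalization are illustrative choices, not the authors' code.

```python
import numpy as np

def quadratic_extrapolation(x3, x2, x1, x0):
    """Given four successive power iterates x3 = x^(k-3), ..., x0 = x^(k),
    estimate and remove the u2, u3 components."""
    Y = np.column_stack([x2 - x3, x1 - x3])       # differences y^(k-2), y^(k-1)
    yk = x0 - x3                                   # y^(k)
    g1, g2 = np.linalg.lstsq(Y, -yk, rcond=None)[0]
    g3 = 1.0                                       # fixed coefficient
    b0, b1, b2 = g1 + g2 + g3, g2 + g3, g3
    x_star = b0 * x2 + b1 * x1 + b2 * x0           # u2, u3 components cancelled
    return x_star / np.abs(x_star).sum()           # renormalize in L1
```

In practice the extrapolation is applied only occasionally (the slides report 5 times), with ordinary power iterations in between to clean up the approximation error.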
Results • Extrapolation dramatically speeds up convergence for high values of c (c = 0.99).
Take-home message • Quadratic Extrapolation estimates the components of the current iterate in the directions of the second and third eigenvectors, and subtracts them off. • It achieves significant speedup, and its ideas are useful for further speedup algorithms.
Summary of this part • We make an assumption about the form of the current iterate. • We solve for the dominant eigenvector as a linear combination of the next three iterates. • We use a few iterations of the Power Method to "clean up" the result.
Outline • Definition of PageRank • Computation of PageRank • Convergence Properties • Outline of Our Approach • Empirical Results
Basic Idea • When the PageRank of a page has converged, stop recomputing it.
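A sketch of that bookkeeping, reusing the damped matvec from earlier; a real implementation saves work by skipping the sparse-matrix rows of converged pages, which this simplified version does not do. The x0 parameter (an assumption of ours) lets an update run start from a previous PageRank vector:

```python
import numpy as np

def adaptive_pagerank(PT, v, x0=None, c=0.85, tol=1e-10, max_iter=1000):
    """Adaptive power method sketch: freeze entries that have converged."""
    x = v.copy() if x0 is None else x0.copy()
    converged = np.zeros(len(v), dtype=bool)
    for _ in range(max_iter):
        x_new = c * (PT @ x) + (1 - c) * v       # full matvec, for simplicity
        x_new[converged] = x[converged]          # don't recompute converged pages
        converged |= np.abs(x_new - x) < tol     # mark newly converged entries
        if converged.all():
            return x_new
        x = x_new
    return x
```

For the update scenario on the next slide, the previous PageRank vector would be passed as x0, so the entries for old pages converge, and are frozen, almost immediately.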
Updates • Use the previous PageRank vector as the start vector. • The speedup is not that great. • Why? The old pages converge quickly, but the new pages still take a long time to converge. • But if you use Adaptive PageRank, you save the computation on the old pages.
Outline • Definition of PageRank • Computation of PageRank • Convergence Properties • Outline of Our Approach • Empirical Results