PageRank. Google's search listings always seemed to deliver the “good stuff” up front. Part of the magic behind it is the PageRank algorithm. The PageRank™ algorithm was developed by Google’s founders, Larry Page and Sergey Brin, when they were graduate students at Stanford University.
PageRank (Basic Idea)
Importance score: a rating of how important a web page is.
• Suppose the web of interest contains n pages, each indexed by an integer k.
• $x_k$ denotes the importance score of page k on the web.
• $x_m > x_j$ indicates that page m is more important than page j.
Example: a web of four pages, numbered 1 to 4.
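PageRank (Basic Idea)
Simple approach: score each page by the number of links pointing to it.
• This approach ignores the fact that a link from an important page should count for more than a link from an unimportant one.
• Do pages with the same number of links have the same importance? Page 1 has a link from page 3, and page 3 has the maximum score.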
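The link-counting idea can be sketched in a few lines of Python. The slide's diagram is not reproduced here, so the `links` dictionary below is an assumed four-page web, chosen only to be consistent with the remarks on the slide (page 3 links to page 1, and page 3 collects the most links). The function name `backlink_counts` is likewise illustrative.

```python
# Assumed four-page web (hypothetical; the slide's diagram is not available).
# links[k] lists the pages that page k links to.
links = {1: [2, 3, 4], 2: [3, 4], 3: [1], 4: [1, 3]}


def backlink_counts(links):
    """Score each page by the number of pages that link to it."""
    counts = {page: 0 for page in links}
    for source, targets in links.items():
        for target in targets:
            counts[target] += 1
    return counts


scores = backlink_counts(links)
print(scores)  # {1: 2, 2: 1, 3: 3, 4: 2}
```

In this assumed web, pages 1 and 4 tie with two links each, even though one of page 1's links comes from the top-scoring page 3. That is the weakness the slide points out.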
PageRank (Basic Idea)
Let’s compute the score of page j as the sum of the scores of all pages linking to page j.
Just as in an election:
• a link to page k becomes a vote for page k’s importance
• we don’t want a single individual to gain influence merely by casting multiple votes
One remedy is to split each page's vote evenly among its outgoing links, so a page passes on its score divided by its number of outgoing links.
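As a rough illustration of the voting idea, the sketch below builds a link matrix in which each page divides one vote equally among the pages it links to, so no page gains influence by linking many times. The four-page `links` dictionary is an assumption (the slide's diagram is not reproduced), and `link_matrix` is a hypothetical helper name. Exact fractions are used so the column sums can be checked without rounding.

```python
from fractions import Fraction

# Assumed four-page web (hypothetical): links[k] lists the pages k links to.
links = {1: [2, 3, 4], 2: [3, 4], 3: [1], 4: [1, 3]}


def link_matrix(links):
    """Return A where A[i][j] = 1/n_j if page j links to page i, else 0.

    n_j is the number of outgoing links of page j. Assumes every page has
    at least one outgoing link.
    """
    pages = sorted(links)
    index = {page: i for i, page in enumerate(pages)}
    n = len(pages)
    A = [[Fraction(0)] * n for _ in range(n)]
    for source, targets in links.items():
        weight = Fraction(1, len(targets))  # one vote, split evenly
        for target in targets:
            A[index[target]][index[source]] = weight
    return A


A = link_matrix(links)
column_sums = [sum(A[i][j] for i in range(len(A))) for j in range(len(A))]
print(column_sums == [1, 1, 1, 1])  # True: each page casts exactly one vote
```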
PageRank (Basic Idea)
Problem, in linear algebra language: find an eigenvector x of a matrix A associated with the eigenvalue 1, that is, a nonzero vector x with $Ax = x$.
Note that, by this measure, page 3 is not the most important page.
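For a web this small, the eigenvector can be found exactly. The sketch below is one possible approach, not necessarily the one used in the slides: it solves $(A - I)x = 0$ by Gauss-Jordan elimination, with the last equation replaced by the normalization that the scores sum to 1. The matrix `A` is an assumed four-page example in which each column holds a page's vote split evenly among its outgoing links, and `eigenvector_for_one` is a hypothetical helper that assumes the solution is unique.

```python
from fractions import Fraction

# Link matrix of an assumed four-page web (hypothetical, not from the slides):
# page 1 -> 2, 3, 4; page 2 -> 3, 4; page 3 -> 1; page 4 -> 1, 3.
half, third = Fraction(1, 2), Fraction(1, 3)
A = [
    [0,     0,    1, half],
    [third, 0,    0, 0],
    [third, half, 0, half],
    [third, half, 0, 0],
]


def eigenvector_for_one(A):
    """Solve A x = x with sum(x) = 1, assuming the solution is unique."""
    n = len(A)
    # Augmented rows of (A - I) x = 0.
    M = [[Fraction(A[i][j]) - (1 if i == j else 0) for j in range(n)]
         + [Fraction(0)] for i in range(n)]
    # Replace the last (redundant) equation with x_1 + ... + x_n = 1.
    M[-1] = [Fraction(1)] * (n + 1)
    for col in range(n):
        pivot_row = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot_row] = M[pivot_row], M[col]
        pivot = M[col][col]
        M[col] = [v / pivot for v in M[col]]
        for r in range(n):
            if r != col:
                factor = M[r][col]
                M[r] = [a - factor * b for a, b in zip(M[r], M[col])]
    return [row[n] for row in M]


x = eigenvector_for_one(A)
print([str(v) for v in x])  # ['12/31', '4/31', '9/31', '6/31']
```

In this assumed web, page 1 ends up with the highest score (12/31) and page 3 comes second (9/31), even though page 3 receives the most links.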
PageRank (Basic Idea)
In numerical linear algebra language: find an efficient computational algorithm to compute eigenvectors.
Difficulties & Features
• Google's PageRank is an eigenvector of a matrix of order 2.7 billion (May 2002). (A Google blog post in 2008 claimed …)
• It is recomputed about once a month and does not involve any of the actual content of web pages or of any individual query.
• There can be more than one linearly independent eigenvector for the same eigenvalue, so the ranking need not be unique.
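A tiny example may help show how more than one linearly independent eigenvector can arise. The web below is an assumption made purely for illustration: pages 1 and 2 link only to each other, and so do pages 3 and 4, so the web splits into two disconnected pieces. Each piece then supplies its own eigenvector for eigenvalue 1.

```python
from fractions import Fraction

# Assumed disconnected web (hypothetical): 1 <-> 2 and 3 <-> 4.
# Column j holds page j's single vote.
A = [
    [0, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 1, 0],
]


def matvec(A, x):
    """Multiply matrix A (list of rows) by vector x."""
    return [sum(a * v for a, v in zip(row, x)) for row in A]


half = Fraction(1, 2)
x = [half, half, 0, 0]  # all importance on the first pair of pages
y = [0, 0, half, half]  # all importance on the second pair of pages

print(matvec(A, x) == x, matvec(A, y) == y)  # True True
```

Any mixture such as `[3/8, 3/8, 1/8, 1/8]` also satisfies $Ax = x$, so in this situation the eigenvector equation alone does not determine a unique ranking.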
PageRank (Basic Idea)
In numerical linear algebra language: find an efficient computational algorithm to compute eigenvectors.
Difficulties & Features
• The matrix A is sparse (tons of zeros).
• The entries of A all lie between zero and one, and its column sums are all equal to one (A is the transition matrix of a Markov chain).
• One way to compute the eigenvector x is to start with a good approximate solution, such as the PageRanks from the previous month, and simply repeat the assignment statement $x = Ax$ until successive vectors agree to within a tolerance (in numerical analysis terms: a continuation method).
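Repeating the assignment $x = Ax$ is commonly known as the power method. The sketch below is a minimal version under stated assumptions: the four-page `links` dictionary is hypothetical, every page is assumed to have at least one outgoing link, and the function name, tolerance, and iteration cap are illustrative choices. Only the existing links are stored and visited, which is how the sparsity of A is exploited.

```python
# Assumed four-page web (hypothetical): links[k] lists the pages k links to.
links = {1: [2, 3, 4], 2: [3, 4], 3: [1], 4: [1, 3]}


def power_iteration(links, start, tol=1e-12, max_iter=1000):
    """Repeat x = A x, touching only the nonzero entries of A (the links).

    Returns the final scores and the number of iterations performed.
    """
    x = dict(start)
    for iteration in range(1, max_iter + 1):
        new_x = {page: 0.0 for page in links}
        for source, targets in links.items():
            share = x[source] / len(targets)  # vote split among out-links
            for target in targets:
                new_x[target] += share
        change = sum(abs(new_x[page] - x[page]) for page in links)
        x = new_x
        if change < tol:
            break
    return x, iteration


uniform = {page: 1 / len(links) for page in links}
ranks, cold_steps = power_iteration(links, uniform)

# Warm start: reuse the previous result as the initial guess.
ranks_again, warm_steps = power_iteration(links, ranks)
print(warm_steps < cold_steps)  # True
```

Starting from the previous result, the iteration stops almost immediately, which is the benefit of reusing last month's PageRanks as the initial guess.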