The $25 Billion Eigenvector
How does Google do PageRank?
The Imaginary Web Surfer:
• Starts at any page,
• Randomly goes to a page linked from the current page,
• Randomly goes to any web page from a dangling page (one with no outgoing links),
• … except sometimes (e.g. 15% of the time), goes to a purely random page.
A tiny web: who should get the highest rank?
[Figure: link graph of ten pages labeled A through J]
The associated stochastic matrix:
0.0150 0.0150 0.0150 0.0150 0.0150 0.0150 0.4400 0.0150 0.0150 0.2983
0.4400 0.0150 0.0150 0.0150 0.0150 0.0150 0.0150 0.0150 0.0150 0.0150
0.0150 0.2983 0.0150 0.0150 0.0150 0.0150 0.0150 0.0150 0.0150 0.0150
0.0150 0.2983 0.8650 0.0150 0.0150 0.0150 0.0150 0.0150 0.0150 0.0150
0.4400 0.0150 0.0150 0.8650 0.0150 0.8650 0.0150 0.0150 0.0150 0.0150
0.0150 0.2983 0.0150 0.0150 0.8650 0.0150 0.0150 0.0150 0.0150 0.0150
0.0150 0.0150 0.0150 0.0150 0.0150 0.0150 0.0150 0.8650 0.0150 0.0150
0.0150 0.0150 0.0150 0.0150 0.0150 0.0150 0.0150 0.0150 0.8650 0.2983
0.0150 0.0150 0.0150 0.0150 0.0150 0.0150 0.0150 0.0150 0.0150 0.2983
0.0150 0.0150 0.0150 0.0150 0.0150 0.0150 0.4400 0.0150 0.0150 0.0150
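The entries above are consistent with the surfer rules: every entry gets 0.15/n = 0.015 from the random jump, and column j gets an extra 0.85 split evenly over the pages that page j links to (0.85/2 + 0.015 = 0.44, 0.85/3 + 0.015 ≈ 0.2983, 0.85 + 0.015 = 0.865). A minimal sketch of that construction follows. The `links` table is an assumption read off the pattern of large entries in each column, with pages numbered 1 to 10, and `google_matrix` is a hypothetical helper name, not something from the slides.

```python
import numpy as np

# Outgoing links per page (1-based), inferred from the large entries in
# each column of the 10 x 10 matrix. This table is an assumption.
links = {
    1: [2, 5],
    2: [3, 4, 6],
    3: [4],
    4: [5],
    5: [6],
    6: [5],
    7: [1, 10],
    8: [7],
    9: [8],
    10: [1, 8, 9],
}

def google_matrix(links, n, jump=0.15):
    """Build the column-stochastic random-surfer matrix (hypothetical helper)."""
    A = np.full((n, n), jump / n)           # random jump reaches every page
    for page in range(1, n + 1):
        targets = links.get(page, [])
        if targets:
            for t in targets:               # follow one outgoing link at random
                A[t - 1, page - 1] += (1.0 - jump) / len(targets)
        else:
            A[:, page - 1] = 1.0 / n        # dangling page: go anywhere
    return A

A = google_matrix(links, 10)
print(np.round(A, 4))
print("column sums:", A.sum(axis=0))
```

Each column sums to 1, which is what makes the matrix stochastic; column j holds the probabilities of where the surfer goes next from page j.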
And the winners are…
'http://www.loc.gov/standards/iso639-2'
'http://www.sil.org/iso639-3'
'http://www.loc.gov/standards/iso639-5'
'http://purl.org/dc/elements/1.1'
'http://purl.org/dc/terms'
'http://purl.org/dc'
'http://creativecommons.org/licenses/by/3.0'
'http://i.creativecommons.org/l/by/3.0/88x31.png'
'http://www.nlb.gov.sg'
'http://purl.org/dcpapers'
'http://www.nl.go.kr'
'http://purl.org/dcregistry'
'http://www.kc.tsukuba.ac.jp/index_en.html'
How much storage to hold this array?
• Current estimate of indexed WWW: $4.7 \cdot 10^{10}$ web pages
• If placed into an array this would have $2.21 \cdot 10^{21}$ elements
• If each element is stored in 4 bytes, this would be $8.8 \cdot 10^{21}$ bytes
• Current estimate of world’s data storage capacity is $3.0 \cdot 10^{18}$ bytes (about 0.03% of the necessary space)
http://www.smartplanet.com/blog/thinking-tech/what-is-the-worlds-data-storage-capacity/6256
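A quick check of the storage arithmetic, as a sketch that uses only the estimates quoted on this slide. The variable names are illustrative.

```python
n_pages = 4.7e10           # indexed web pages (slide's estimate)
bytes_per_entry = 4        # storage per matrix element (slide's assumption)
world_capacity = 3.0e18    # world's data storage in bytes (slide's estimate)

entries = n_pages ** 2                   # a dense n x n array
dense_bytes = entries * bytes_per_entry  # bytes needed to store it
fraction = world_capacity / dense_bytes  # share of the need that exists

print(f"entries:           {entries:.3g}")
print(f"bytes needed:      {dense_bytes:.3g}")
print(f"capacity / needed: {fraction:.2%}")
```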
How much time to do one power step?
• Current estimate of indexed WWW: $4.7 \cdot 10^{10}$ web pages
• If placed into an array this would have $2.21 \cdot 10^{21}$ elements
• Fastest current machine does $33.86 \cdot 10^{15}$ operations per second
• One step of $x_{k+1} = A x_k$ takes 3.68 days
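How is $x_{k+1} = A x_k$ performed?
[Figure: link graph of ten pages labeled A through J]
connection = [2 5 3 4 6 4 5 6 5 1 10 7 8 1 8 9]
end = [2 5 6 7 8 9 11 12 13 16]
Here connection lists, page by page, the targets of each page's outgoing links, and end(j) is the position in connection of the last link of page j.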
How is $x_{k+1} = A x_k$ performed?
• $x_{k+1} = (0.15/n)\, e$ (where $e$ is all 1’s)
• start = 1
• for j = 1, …, n
    a) col_tot = end(j) - start + 1
    b) for i = start, …, end(j)
         ii = connection(i)
         x_{k+1}(ii) = x_{k+1}(ii) + (0.85/col_tot) * x_k(j)
    c) start = end(j) + 1
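The loop above can be sketched in Python as follows. This is an illustration, not the presenter's code. It assumes the tiny web's arrays with sixteen entries in `connection` (so that they agree with `end`), that `x` sums to 1, and that each page spreads 0.85 of its own rank over its link count. The `power_step` name and the handling of dangling pages (spreading their rank evenly, following the surfer's third rule) are assumptions added here.

```python
import numpy as np

# Link arrays for the tiny web, kept 1-based as on the slide.
connection = [2, 5, 3, 4, 6, 4, 5, 6, 5, 1, 10, 7, 8, 1, 8, 9]
end = [2, 5, 6, 7, 8, 9, 11, 12, 13, 16]

def power_step(x, connection, end, jump=0.15):
    """One step x_next = A x without forming A (hypothetical helper).

    Assumes x sums to 1, so the random jump contributes jump / n everywhere.
    """
    n = len(end)
    x_next = np.full(n, jump / n)
    start = 1
    for j in range(1, n + 1):
        col_tot = end[j - 1] - start + 1        # number of links on page j
        if col_tot == 0:
            # Assumption: a dangling page sends its rank to all pages evenly.
            x_next += (1.0 - jump) * x[j - 1] / n
        else:
            share = (1.0 - jump) * x[j - 1] / col_tot
            for i in range(start, end[j - 1] + 1):
                x_next[connection[i - 1] - 1] += share
        start = end[j - 1] + 1
    return x_next

# Repeat the step from a uniform start until the vector settles.
x = np.full(len(end), 1.0 / len(end))
for _ in range(200):
    x = power_step(x, connection, end)

print(np.round(x, 4))
print("highest-ranked page number:", int(np.argmax(x)) + 1)
```

The work per step is proportional to the number of links plus the number of pages, rather than n², which is the point of storing only `connection` and `end`.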