Hypersearching the Web
Soumen Chakrabarti, Byron Dom, S. Ravi Kumar, Prabhakar Raghavan, Sridhar Rajagopalan, Andrew Tomkins
Jacob Kalakal Joseph
CS 572 (Spring 2011) | Class Presentation | June 21, 2011
Outline
• Characteristics of the WWW
• Motivation for building search engines
• Traditional SEs and the challenges
• Improvements and the associated problems
• CLEVER
• Power of hyperlinks
• Hubs and Authorities
• Algorithm
• Evaluate CLEVER
• Future scope
• Answer questions and class discussion
CS572-Joseph
WWW ~ Universe
Motivation for search engines
Initial Attempts: Ranking functions based on simple heuristics
Challenges: Synonymy
Challenges: Polysemy
Challenges: Spamming
• Cheap airtickets Cheap airtickets Cheap airtickets Cheap airtickets Cheap airtickets
• White font on white background
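To see why keyword repetition defeats a simple ranking heuristic, here is a minimal sketch. The scoring function `term_count_score` and both pages are made up for illustration; the slides do not say which heuristics early search engines actually used.

```python
def term_count_score(query, page_text):
    """Hypothetical heuristic: count how often the query terms occur on the page."""
    words = page_text.lower().split()
    return sum(words.count(term) for term in query.lower().split())


# Made-up pages, for illustration only.
pages = {
    "honest-travel-agency": "We sell cheap airtickets to most European cities",
    "spam-page": "Cheap airtickets " * 5,  # the same phrase repeated, as on the slide
}

query = "cheap airtickets"
ranked = sorted(
    pages,
    key=lambda name: term_count_score(query, pages[name]),
    reverse=True,
)
print(ranked)  # the spam page ranks first under this heuristic
```

Because the score depends only on text that the page author controls, repeating the phrase (possibly hidden as white text on a white background) is enough to move a page up.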
Improvements
• Semantic networks: help with synonymy but worsen polysemy
• Human selectors: impractical
Hyperlinks - What a CLEVER idea!
Hubs & Authorities
How it works
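The hub and authority computation can be sketched roughly as follows. This is a minimal version of the basic iteration from Kleinberg's paper (see References), not the full CLEVER system: a page's authority score is the sum of the hub scores of the pages linking to it, and its hub score is the sum of the authority scores of the pages it links to. The function names and the toy link graph are assumptions made for illustration.

```python
import math


def normalize(scores):
    """Scale scores so that their squares sum to 1."""
    norm = math.sqrt(sum(v * v for v in scores.values())) or 1.0
    return {page: v / norm for page, v in scores.items()}


def hits(links, iterations=5):
    """links maps each page to the list of pages it points to."""
    pages = set(links) | {dst for targets in links.values() for dst in targets}
    hub = {page: 1.0 for page in pages}
    auth = {page: 1.0 for page in pages}
    for _ in range(iterations):
        # Authority update: add up the hub scores of the pages linking to each page.
        new_auth = {page: 0.0 for page in pages}
        for src, targets in links.items():
            for dst in targets:
                new_auth[dst] += hub[src]
        # Hub update: add up the new authority scores of the pages each page links to.
        new_hub = {
            page: sum(new_auth[dst] for dst in links.get(page, []))
            for page in pages
        }
        auth = normalize(new_auth)
        hub = normalize(new_hub)
    return hub, auth


# Made-up link graph: three hub-like pages pointing at three candidate authorities.
toy_links = {
    "hub1": ["a", "b"],
    "hub2": ["a", "b", "c"],
    "hub3": ["a"],
}

hub, auth = hits(toy_links)
print(sorted(auth, key=auth.get, reverse=True)[:3])
print(sorted(hub, key=hub.get, reverse=True)[:3])
```

In this toy graph, page "a" gets the highest authority score because every hub points to it, and "hub2" gets the highest hub score because it points to all three authorities.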
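Clever vs. Google
• Google is faster: its rankings are computed independently of any query
• Clever also looks backward, at the pages that point to a page, not only forward along links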
Pros
• Rapid convergence (5 iterations for a root set of 3000 pages)
• Independent of the initial hub (H) and authority (A) scores
• Provides information about pages even before they are actually crawled
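The independence from the initial scores can be checked on a small example. The sketch below runs the basic hub/authority iteration twice on a made-up graph, once from uniform and once from random positive starting hub scores, and compares the results. The helper `run_hits` and the graph are hypothetical, and the check assumes positive starting scores.

```python
import math
import random


def run_hits(links, init_hub, iterations=30):
    """Basic hub/authority iteration starting from the given hub scores."""
    pages = list(init_hub)
    hub = dict(init_hub)
    auth = {}
    for _ in range(iterations):
        # Authorities are recomputed from the hubs first, so only the
        # initial hub scores influence the run.
        auth = {page: 0.0 for page in pages}
        for src, targets in links.items():
            for dst in targets:
                auth[dst] += hub[src]
        hub = {page: sum(auth[dst] for dst in links.get(page, [])) for page in pages}
        # Normalize both score vectors to unit length.
        for scores in (auth, hub):
            norm = math.sqrt(sum(v * v for v in scores.values())) or 1.0
            for page in scores:
                scores[page] /= norm
    return hub, auth


# Made-up link graph, for illustration only.
toy_links = {"hub1": ["a", "b"], "hub2": ["a", "b", "c"], "hub3": ["a"]}
pages = ["hub1", "hub2", "hub3", "a", "b", "c"]

random.seed(0)
uniform_start = {page: 1.0 for page in pages}
random_start = {page: random.uniform(0.1, 1.0) for page in pages}

hub_a, auth_a = run_hits(toy_links, uniform_start)
hub_b, auth_b = run_hits(toy_links, random_start)

max_diff = max(abs(auth_a[page] - auth_b[page]) for page in pages)
print(max_diff)  # difference between the two runs' authority scores
```

Both runs settle on the same normalized authority scores, which is what makes the choice of starting values unimportant in practice.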
Segregation of web into clusters
Cons
• The underlying assumption, “Web links confer authority”, could be incorrect! Links are also created for:
  • Navigation
  • Advertisement
  • Disapproval
Cons
• Ignores the anchor text
• It is not necessary for every page to be either a hub or an authority
• Universally popular websites like Wikipedia will be an authority on almost everything
• May return a general result for a narrow topic search
What’s next?
References
• S. Chakrabarti, B. Dom, D. Gibson, J. Kleinberg, S.R. Kumar, P. Raghavan, S. Rajagopalan, A. Tomkins. Hypersearching the Web. Scientific American, June 1999.
• CLEVER project (http://www.almaden.ibm.com/projects/clever.shtml)
• J. Kleinberg. Authoritative sources in a hyperlinked environment. Proc. 9th ACM-SIAM Symposium on Discrete Algorithms, 1998.
• S. Brin, L. Page. The anatomy of a large-scale hypertextual Web search engine. Computer Networks and ISDN Systems, Vol. 30, No. 1-7, pp. 107-117, 1998.
• WordNet Project (http://wordnet.princeton.edu/)
Group Discussion