110 likes | 355 Views
Part I: Web Structure Mining Chapter 2: Hyperlink Based Ranking. Social Network Analysis PageRank Authorities and Hubs Link Based Similarity Search Enhanced Techniques for Page Ranking. Social Networks. Directed graph with weights assigned to its edges
E N D
Part I: Web Structure MiningChapter 2: Hyperlink Based Ranking • Social Network Analysis • PageRank • Authorities and Hubs • Link Based Similarity Search • Enhanced Techniques for Page Ranking Zdravko Markov and Daniel T. Larose, Data Mining the Web: Uncovering Patterns in Web Content, Structure, and Usage, Wiley, 2007. Slides for Chapter 1: Information Retrieval an Web Search
Social Networks • Directed graph with weights assigned to its edges • Nodes represent documents and the edges – citations from one document to other documents. • Prestige can be associated with the number of input edges to a node (in-degree). • Prestige has a recursive nature. depends on the authority (or again, the prestige) of citations Zdravko Markov and Daniel T. Larose, Data Mining the Web: Uncovering Patterns in Web Content, Structure, and Usage, Wiley, 2007. Slides for Chapter 1: Information Retrieval an Web Search
Social Networks • adjacency matrix • if document cites document • otherwise • prestige score Zdravko Markov and Daniel T. Larose, Data Mining the Web: Uncovering Patterns in Web Content, Structure, and Usage, Wiley, 2007. Slides for Chapter 1: Information Retrieval an Web Search
Social Networks • Computing prestige • Eigen decomposition • EigenvectorP • Eigenvalue Zdravko Markov and Daniel T. Larose, Data Mining the Web: Uncovering Patterns in Web Content, Structure, and Usage, Wiley, 2007. Slides for Chapter 1: Information Retrieval an Web Search
Social Networks Zdravko Markov and Daniel T. Larose, Data Mining the Web: Uncovering Patterns in Web Content, Structure, and Usage, Wiley, 2007. Slides for Chapter 1: Information Retrieval an Web Search
Social Networks Power Iteration • Loop: • While Zdravko Markov and Daniel T. Larose, Data Mining the Web: Uncovering Patterns in Web Content, Structure, and Usage, Wiley, 2007. Slides for Chapter 1: Information Retrieval an Web Search
PageRank • “Random web surfer”keeps clicking on hyperlinks at random with uniform probability • Implements random walk on the web graph • Page u links to web pages • Probability of visiting page v will be • Amount of prestige that page v receives from page u is of the prestige of u Zdravko Markov and Daniel T. Larose, Data Mining the Web: Uncovering Patterns in Web Content, Structure, and Usage, Wiley, 2007. Slides for Chapter 1: Information Retrieval an Web Search
PageRank Propagation of page rank Zdravko Markov and Daniel T. Larose, Data Mining the Web: Uncovering Patterns in Web Content, Structure, and Usage, Wiley, 2007. Slides for Chapter 1: Information Retrieval an Web Search
PageRank Calculation of page rank Norm Norm Integers Zdravko Markov and Daniel T. Larose, Data Mining the Web: Uncovering Patterns in Web Content, Structure, and Usage, Wiley, 2007. Slides for Chapter 1: Information Retrieval an Web Search