
HITS (Hypertext-Induced Topic Search)

A brief introduction to the HITS algorithm

faisalriaz


Presentation Transcript


  1. Hypertext-Induced Topic Search

  2. Hypertext-Induced Topic Search • Algorithm developed by Kleinberg in 1998. • Attempts to computationally determine hubs and authorities on a particular topic through analysis of a relevant subgraph of the web. • Hubs point to lots of authorities. • Authorities are pointed to by lots of hubs. • This is a mutually reinforcing relationship.

  3. Key Terms • Hubs • Authorities • In-degree • Out-degree

  4. Hubs & Authorities • Good hub: a page that points to many good authorities. • Good authority: a page pointed to by many good hubs. • Out-degree of p: the number of nodes p links to. • In-degree of p: the number of nodes that link to p. • Given a keyword query, assign a hub value and an authority value to each page. • Pages with high authority values are returned as the results of the query.

  5. Computing Hubs and Authorities • Computes hubs and authorities for a particular topic specified by a normal query. • First determines a set of relevant pages for the query, called the base set S. • Then analyzes the link structure of the web subgraph defined by S to find the authority and hub pages in this set.

  6. Constructing a Base Subgraph • For a specific query Q, let the set of documents returned by a standard search engine be called the root set R. • Initialize the base set S to R. • Add to S all pages pointed to by any page in R. • Add to S all pages that point to any page in R.
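
The expansion from root set R to base set S can be sketched in a few lines of Python. This is only an illustration: the function name `build_base_set`, the dictionaries `out_links` and `in_links`, and the toy pages "a" to "f" are assumptions made for the example, not part of the presentation.

```python
def build_base_set(root_set, out_links, in_links):
    """Expand a root set R into a base set S using link neighbors."""
    base = set(root_set)                        # Initialize S to R
    for page in root_set:
        base.update(out_links.get(page, ()))    # pages pointed to by a page in R
        base.update(in_links.get(page, ()))     # pages that point to a page in R
    return base


# Hypothetical toy link data: page -> pages it links to
out_links = {"a": ["b", "c"], "b": ["c"], "d": ["a"], "e": ["f"]}

# Derive the reverse map: page -> pages that link to it
in_links = {}
for src, dsts in out_links.items():
    for dst in dsts:
        in_links.setdefault(dst, []).append(src)

base_set = build_base_set({"a"}, out_links, in_links)
print(sorted(base_set))  # ['a', 'b', 'c', 'd']
```

Pages "e" and "f" stay out of S because they neither link to nor are linked from the root page "a".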

  7. Adjacency Matrix • Graphs are represented by an adjacency matrix. • An adjacency matrix is a square matrix used to represent a finite graph. • Each element of the matrix indicates whether the corresponding pair of vertices is adjacent in the graph. • Adjacent pairs are represented with "1" and non-adjacent pairs with "0".

  8. Graph from Adjacency Matrix • [Figure: an adjacency matrix and the corresponding graph on nodes N1, N2, N3, N4]
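
As an illustration of how a directed graph maps to an adjacency matrix, here is a minimal Python sketch. The original matrix is not preserved in this transcript, so the edge list below is an assumption: a hypothetical four-node graph chosen only because it is consistent with the degree-based rankings quoted on the next slide. Row i, column j holds 1 when node i links to node j.

```python
nodes = ["N1", "N2", "N3", "N4"]

# Hypothetical directed edges (source, target); not taken from the slides
edges = [
    ("N1", "N2"), ("N1", "N3"), ("N1", "N4"),
    ("N2", "N3"), ("N2", "N4"),
    ("N3", "N1"), ("N3", "N4"),
]

index = {name: i for i, name in enumerate(nodes)}

# Start from an all-zero square matrix, then mark each edge with 1
A = [[0] * len(nodes) for _ in nodes]
for src, dst in edges:
    A[index[src]][index[dst]] = 1

for name, row in zip(nodes, A):
    print(name, row)
# N1 [0, 1, 1, 1]
# N2 [0, 0, 1, 1]
# N3 [1, 0, 0, 1]
# N4 [0, 0, 0, 0]
```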

  9. Hub and Authority • Out-degree / in-degree method: rank hubs by out-degree and authorities by in-degree. • Hub rank: N1, {N2, N3} (tie), N4 • Authority rank: N4, N3, {N1, N2} (tie)
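
The degree-based ranking can be reproduced with a short Python sketch. The matrix `A` below is an assumed example (the slide's own matrix is not preserved in this transcript); it was chosen so that its degrees give the same rankings as listed above.

```python
nodes = ["N1", "N2", "N3", "N4"]

# Assumed adjacency matrix: A[i][j] == 1 means node i links to node j
A = [
    [0, 1, 1, 1],
    [0, 0, 1, 1],
    [1, 0, 0, 1],
    [0, 0, 0, 0],
]

# Out-degree is a row sum, in-degree is a column sum
out_degree = {n: sum(A[i]) for i, n in enumerate(nodes)}
in_degree = {n: sum(row[j] for row in A) for j, n in enumerate(nodes)}

# Sort by descending degree; Python's sort is stable, so ties keep node order
hub_order = sorted(nodes, key=lambda n: -out_degree[n])
auth_order = sorted(nodes, key=lambda n: -in_degree[n])

print(out_degree)  # {'N1': 3, 'N2': 2, 'N3': 2, 'N4': 0}
print(in_degree)   # {'N1': 1, 'N2': 1, 'N3': 2, 'N4': 3}
print(hub_order)   # ['N1', 'N2', 'N3', 'N4']
print(auth_order)  # ['N4', 'N3', 'N1', 'N2']
```

N2 and N3 share an out-degree of 2, and N1 and N2 share an in-degree of 1, which is where the two ties come from.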

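  10. Hub and Authority using Matrix • A = adjacency matrix • u = hub weight vector • v = authority weight vector • v = A^T * u, where A^T is the transpose of A • u = A * v • The initial ranks of the hubs are unknown, so equal ranks are assigned to all hubs. • u = [initial hub vector with equal entries; values not preserved in this transcript]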

  11. Hub and Authority using Matrix • v = A^T * u [numeric result not preserved in this transcript] • Note: here K = 1, where K is the number of iterations. • u = A * v [numeric result not preserved in this transcript]
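
A single iteration (K = 1) of the two updates can be sketched in plain Python. Both the matrix `A` and the all-ones starting vector are assumptions made for this example, since the slide's numeric values are not preserved; the order of the updates (authority first, then hub from the new authority vector) follows the slide.

```python
# Assumed adjacency matrix: A[i][j] == 1 means node i links to node j
A = [
    [0, 1, 1, 1],
    [0, 0, 1, 1],
    [1, 0, 0, 1],
    [0, 0, 0, 0],
]
n = len(A)

# Equal initial hub weights (the value 1 is an assumption)
u = [1] * n

# Authority update, v = A^T * u: sum the hub weights of pages linking to j
v = [sum(A[i][j] * u[i] for i in range(n)) for j in range(n)]

# Hub update, u = A * v: sum the authority weights of pages that i links to
u = [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]

print("v =", v)  # v = [1, 1, 2, 3]
print("u =", u)  # u = [6, 5, 4, 0]
```

With equal starting weights, the first authority vector is just the in-degree of each node. The hub vector already separates N2 from N3, which the out-degree method could not do.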

  12. Hub and Authority using Matrix • Matrix method • Hub rank: N1, N2, N3, N4 • Authority rank: N4, N3, {N1, N2} (tie)

  13. Hub and Authority using Matrix • Let us calculate the ranks for hubs and authorities for K = 2. • First normalize: divide each element of u and of v by the square root of the sum of squares of the elements of that vector (its Euclidean norm) as obtained in the previous iteration. • Then recompute v = A^T * u and u = A * v. [Numeric vectors not preserved in this transcript] • Note that there is no change in the hub and authority ranks.
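
The normalized iteration can be sketched as follows. This is a minimal illustration under stated assumptions, not the presentation's own computation: the matrix `A` is a hypothetical example, the starting weights are all 1, and the helper names `normalize` and `hits` are invented here. Normalizing after every update is equivalent to dividing by the norm of the previous iteration's vector before the next update.

```python
import math


def normalize(vec):
    """Divide each element by the square root of the sum of squares."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else list(vec)


def hits(A, iterations):
    """Run K iterations of the hub/authority updates with normalization."""
    n = len(A)
    u = [1.0] * n  # hub weights, equal at the start (assumed value)
    v = [1.0] * n  # authority weights
    for _ in range(iterations):
        # v = A^T * u, then normalize
        v = normalize([sum(A[i][j] * u[i] for i in range(n)) for j in range(n)])
        # u = A * v, then normalize
        u = normalize([sum(A[i][j] * v[j] for j in range(n)) for i in range(n)])
    return u, v


# Assumed adjacency matrix: A[i][j] == 1 means node i links to node j
A = [
    [0, 1, 1, 1],
    [0, 0, 1, 1],
    [1, 0, 0, 1],
    [0, 0, 0, 0],
]

u, v = hits(A, 2)
print("hub weights:      ", [round(x, 3) for x in u])
print("authority weights:", [round(x, 3) for x in v])
```

Normalization keeps the weights from growing without bound while leaving their relative order within each iteration untouched. On this assumed graph the top hub (N1) and top authority (N4) are the same for K = 1 and K = 2; exact scores and any ties depend on the graph used.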

  14. PageRank vs. Authorities • PageRank (Google): computed for all web pages stored in the database prior to the query; computes authorities only; trivial and fast to compute. • HITS (CLEVER): performed on the set of retrieved web pages for each query; computes authorities and hubs; easy to compute, but real-time execution is hard.
