SimRank : A Measure of Structural-Context Similarity

Advisor: Dr. Hsu. Graduate: Sheng-Hsuan Wang. Authors: Glen Jeh, Jennifer Widom. Outline: Motivation, Objective, Introduction, Basic Graph Model, SimRank, Random Surfer-Pairs Model, Future Work, Personal Opinion.

Presentation Transcript


  1. SimRank: A Measure of Structural-Context Similarity • Advisor: Dr. Hsu • Graduate: Sheng-Hsuan Wang • Authors: Glen Jeh, Jennifer Widom

  2. Outline • Motivation • Objective • Introduction • Basic Graph Model • SimRank • Random Surfer-Pairs Model • Future Work • Personal opinion

  3. Motivation • The problem of measuring “similarity” of objects arises in many applications.

  4. Objective • An approach that is applicable in any domain with object-to-object relationships. • Two objects are similar if they are related to similar objects.

  5. Introduction

  6. Basic Graph Model • We model objects and relationships as a directed graph G=(V,E). • For a node v in a graph, we denote by I(v) and O(v) the sets of in-neighbors and out-neighbors of v, respectively.
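
The graph model above can be sketched in a few lines of Python. This is only an illustration: the function name `neighbor_sets` and the toy university graph are assumptions made for the example, not part of the presentation.

```python
from collections import defaultdict

def neighbor_sets(edges):
    """Build I(v) and O(v) for a directed graph given as (source, target) pairs."""
    in_nbrs = defaultdict(set)   # I(v): nodes with an edge pointing to v
    out_nbrs = defaultdict(set)  # O(v): nodes that v points to
    for u, v in edges:
        out_nbrs[u].add(v)
        in_nbrs[v].add(u)
    return in_nbrs, out_nbrs

# Hypothetical toy graph, invented for illustration only.
edges = [
    ("Univ", "ProfA"),
    ("Univ", "ProfB"),
    ("ProfA", "StudentA"),
    ("ProfB", "StudentB"),
    ("StudentB", "Univ"),
]
in_nbrs, out_nbrs = neighbor_sets(edges)
print(sorted(out_nbrs["Univ"]))  # out-neighbors of Univ
print(sorted(in_nbrs["Univ"]))   # in-neighbors of Univ
```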

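  7. SimRank • Basic SimRank Equation • If a=b then s(a,b) is defined to be 1. Otherwise, $s(a,b) = \frac{C}{|I(a)||I(b)|} \sum_{i=1}^{|I(a)|} \sum_{j=1}^{|I(b)|} s(I_i(a), I_j(b))$ (1) • Where C is a constant between 0 and 1, and $I_i(a)$ denotes the i-th in-neighbor of a. • Set s(a,b)=0 when $I(a)=\emptyset$ or $I(b)=\emptyset$.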

  8. SimRank • Bipartite SimRank • Two types of objects. • Example : Shopping graph G.

  9. SimRank

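  10. SimRank • Let s(A,B) denote the similarity between persons A and B, for $A \neq B$: $s(A,B) = \frac{C_1}{|O(A)||O(B)|} \sum_{i=1}^{|O(A)|} \sum_{j=1}^{|O(B)|} s(O_i(A), O_j(B))$ (2) • Let s(c,d) denote the similarity between items c and d, for $c \neq d$: $s(c,d) = \frac{C_2}{|I(c)||I(d)|} \sum_{i=1}^{|I(c)|} \sum_{j=1}^{|I(d)|} s(I_i(c), I_j(d))$ (3)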
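
A minimal sketch of the bipartite case follows, assuming persons are scored through the items they buy (out-neighbors) and items through the persons who buy them (in-neighbors). The function name `bipartite_simrank`, the decay constants `C1` and `C2`, the fixed iteration count and the two-person shopping data are all assumptions made for illustration.

```python
def bipartite_simrank(purchases, C1=0.8, C2=0.8, K=50):
    """Iterate person-person and item-item similarity on a shopping graph.

    purchases maps each person to the set of items that person bought.
    """
    persons = sorted(purchases)
    items = sorted({i for bought in purchases.values() for i in bought})
    # In-neighbors of an item: the persons who bought it.
    buyers = {i: {p for p in persons if i in purchases[p]} for i in items}

    sp = {(A, B): 1.0 if A == B else 0.0 for A in persons for B in persons}
    si = {(c, d): 1.0 if c == d else 0.0 for c in items for d in items}

    for _ in range(K):
        sp_next, si_next = {}, {}
        for A in persons:
            for B in persons:
                OA, OB = purchases[A], purchases[B]
                if A == B:
                    sp_next[(A, B)] = 1.0
                elif not OA or not OB:
                    sp_next[(A, B)] = 0.0  # assumed: no purchases means score 0
                else:
                    total = sum(si[(i, j)] for i in OA for j in OB)
                    sp_next[(A, B)] = C1 * total / (len(OA) * len(OB))
        for c in items:
            for d in items:
                Ic, Id = buyers[c], buyers[d]
                if c == d:
                    si_next[(c, d)] = 1.0
                else:
                    total = sum(sp[(p, q)] for p in Ic for q in Id)
                    si_next[(c, d)] = C2 * total / (len(Ic) * len(Id))
        sp, si = sp_next, si_next
    return sp, si

# Hypothetical shopping data: both persons buy the same two items.
purchases = {"A": {"sugar", "eggs"}, "B": {"sugar", "eggs"}}
person_sim, item_sim = bipartite_simrank(purchases)
print(person_sim[("A", "B")], item_sim[("eggs", "sugar")])
```

For this symmetric toy input, both scores satisfy s = 0.4 + 0.4 s, so the iteration should approach 2/3.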

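  11. SimRank • Computing SimRank-Naive Method • $R_0(a,b)$ is a lower bound on the actual SimRank score s(a,b): $R_0(a,b) = 0$ (if $a \neq b$), $R_0(a,b) = 1$ (if $a = b$). • To compute $R_{k+1}(a,b)$ from $R_k(*,*)$: $R_{k+1}(a,b) = \frac{C}{|I(a)||I(b)|} \sum_{i=1}^{|I(a)|} \sum_{j=1}^{|I(b)|} R_k(I_i(a), I_j(b))$ (4) for $a \neq b$, and $R_{k+1}(a,b) = 1$ for $a = b$.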
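
The naive method can be sketched as follows, assuming the usual iteration: scores start at 1 for identical nodes and 0 otherwise, and each round averages the previous round's scores over all pairs of in-neighbors and scales the result by C. The function name `simrank_naive`, the default parameter values and the toy graph are assumptions made for illustration.

```python
def simrank_naive(nodes, in_nbrs, C=0.8, K=5):
    """Run K rounds of the naive SimRank iteration over all node pairs."""
    # Round 0: a node is fully similar to itself and not at all to others.
    R = {(a, b): 1.0 if a == b else 0.0 for a in nodes for b in nodes}
    for _ in range(K):
        R_next = {}
        for a in nodes:
            for b in nodes:
                if a == b:
                    R_next[(a, b)] = 1.0
                    continue
                Ia, Ib = in_nbrs.get(a, ()), in_nbrs.get(b, ())
                if not Ia or not Ib:
                    R_next[(a, b)] = 0.0  # no in-neighbors: score is 0
                    continue
                total = sum(R[(i, j)] for i in Ia for j in Ib)
                R_next[(a, b)] = C * total / (len(Ia) * len(Ib))
        R = R_next
    return R

# Hypothetical toy graph, given directly as in-neighbor sets I(v).
in_nbrs = {
    "Univ": {"StudentB"},
    "ProfA": {"Univ"},
    "ProfB": {"Univ"},
    "StudentA": {"ProfA"},
    "StudentB": {"ProfB"},
}
nodes = sorted(in_nbrs)
scores = simrank_naive(nodes, in_nbrs, C=0.8, K=5)
print(scores[("ProfA", "ProfB")], scores[("StudentA", "StudentB")])
```

In this toy graph the two professors share their only in-neighbor, so their score is C, and the two students inherit C times that value.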

  12. SimRank • The space required is simply $O(n^2)$ to store the results $R_k$, where n is the number of nodes. • The time required is $O(K n^2 d_2)$. • K: the number of iterations. • $d_2$: the average of |I(a)||I(b)| over all node pairs (a,b).

  13. SimRank • Computing SimRank-Pruning • Set the similarity between two nodes that are far apart to 0. • Consider node-pairs only for nodes that are near each other.

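  14. SimRank • If we consider only node-pairs within radius r of each other, and there are on average $d_r$ such neighbors for a node, then there will be $n d_r$ node-pairs. • The time and space complexities become $O(K n d_r d_2)$ and $O(n d_r)$ respectively.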
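
The pruning idea can be sketched as below. The slides do not say how "near" is measured, so this example assumes hop distance with edge direction ignored; that choice, the function names and the toy graph are all assumptions made for illustration. Pairs outside the radius are never stored and are read as 0.

```python
from collections import defaultdict, deque

def build_neighbors(edges):
    """Return in-neighbor and out-neighbor sets for (source, target) pairs."""
    in_nbrs, out_nbrs = defaultdict(set), defaultdict(set)
    for u, v in edges:
        out_nbrs[u].add(v)
        in_nbrs[v].add(u)
    return in_nbrs, out_nbrs

def nodes_within_radius(start, in_nbrs, out_nbrs, r):
    """Breadth-first search up to r hops, ignoring edge direction (assumed)."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        if dist[u] == r:
            continue
        for w in in_nbrs.get(u, set()) | out_nbrs.get(u, set()):
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return set(dist)

def simrank_pruned(nodes, edges, r=2, C=0.8, K=5):
    """Iterate SimRank only over node pairs within radius r of each other."""
    in_nbrs, out_nbrs = build_neighbors(edges)
    R = {}
    for a in nodes:
        for b in nodes_within_radius(a, in_nbrs, out_nbrs, r):
            R[(a, b)] = 1.0 if a == b else 0.0
    for _ in range(K):
        R_next = {}
        for (a, b) in R:
            Ia, Ib = in_nbrs.get(a, set()), in_nbrs.get(b, set())
            if a == b:
                R_next[(a, b)] = 1.0
            elif not Ia or not Ib:
                R_next[(a, b)] = 0.0
            else:
                # Pairs that were pruned away count as similarity 0.
                total = sum(R.get((i, j), 0.0) for i in Ia for j in Ib)
                R_next[(a, b)] = C * total / (len(Ia) * len(Ib))
        R = R_next
    return R

# Hypothetical toy graph, invented for illustration only.
edges = [("Univ", "ProfA"), ("Univ", "ProfB"), ("ProfA", "StudentA"),
         ("ProfB", "StudentB"), ("StudentB", "Univ")]
nodes = ["Univ", "ProfA", "ProfB", "StudentA", "StudentB"]
pruned = simrank_pruned(nodes, edges, r=2)
print(len(pruned), "pairs stored instead of", len(nodes) ** 2)
```

With r=2 the two students, who are three hops apart in this toy graph, are never compared, which shows the trade-off: fewer stored pairs, at the cost of forcing some nonzero scores to 0.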

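  15. Random Surfer-Pair Model • Expected Distance • Let H be any strongly connected graph. • Let u,v be any two nodes in H. • We define the expected distance d(u,v) from u to v as $d(u,v) = \sum_{t: u \rightsquigarrow v} P[t]\, l(t)$ (5), where the sum is over all tours t that start at u and end at v without touching v before the end, P[t] is the probability of traveling t, and l(t) is the length of t.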

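  16. Random Surfer-Pair Model • Expected Meeting Distance (EMD): $m(a,b) = \sum_{t: (a,b) \rightsquigarrow (x,x)} P[t]\, l(t)$ (6), where the sum is over all tours in which two surfers starting at a and b first meet at some node x.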

  17. Random Surfer-Pair Model • Expected-f Meeting Distance: $s'(a,b) = \sum_{t: (a,b) \rightsquigarrow (x,x)} P[t]\, c^{l(t)}$ (7) • To circumvent the “infinite EMD” problem. • To map all distances to a finite interval. • Exponential function $f(z) = c^z$, where $c \in (0,1)$ is a constant.

  18. Random Surfer-Pair Model • Equivalence to SimRank

  19. Random Surfer-Pair Model • Theorem. • The SimRank score, with parameter C, between two nodes is their expected-f meeting distance traveling back-edges, for $f(z) = C^z$.
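
The theorem suggests a simple simulation, sketched below under stated assumptions: two surfers start at a and b, each steps to a uniformly random in-neighbor (that is, travels a back-edge) at the same time, and a trial scores C raised to the number of steps taken when they first meet. Trials in which the surfers get stuck or do not meet within `max_steps` are scored 0, which is an approximation introduced here. The function name, parameters and toy graph are assumptions made for illustration.

```python
import random

def estimate_simrank_by_surfers(a, b, in_nbrs, C=0.8, trials=1000,
                                max_steps=50, seed=0):
    """Monte Carlo estimate of the expected-f meeting distance, f(z) = C**z."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        x, y, steps = a, b, 0
        while x != y and steps < max_steps:
            Ix, Iy = in_nbrs.get(x), in_nbrs.get(y)
            if not Ix or not Iy:
                break  # a surfer has no back-edge to follow: they never meet
            # Both surfers move one back-edge per step, independently.
            x = rng.choice(sorted(Ix))
            y = rng.choice(sorted(Iy))
            steps += 1
        if x == y:
            total += C ** steps
    return total / trials

# Hypothetical toy graph, given directly as in-neighbor sets I(v).
in_nbrs = {
    "Univ": {"StudentB"},
    "ProfA": {"Univ"},
    "ProfB": {"Univ"},
    "StudentA": {"ProfA"},
    "StudentB": {"ProfB"},
}
est_profs = estimate_simrank_by_surfers("ProfA", "ProfB", in_nbrs)
est_students = estimate_simrank_by_surfers("StudentA", "StudentB", in_nbrs)
est_far = estimate_simrank_by_surfers("Univ", "ProfA", in_nbrs)
print(est_profs, est_students, est_far)
```

Every node in this toy graph has exactly one in-neighbor, so the walks are deterministic here: the professors meet after one step and the students after two. On graphs with several in-neighbors per node the estimate would vary with the seed and the number of trials.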

  20. Future Work • Divide and conquer, then merge. • Divide a corpus into chunks… • Ternary (or higher-order) relationships.

  21. Personal Opinion • We believe that the intuition behind SimRank can be used in many domains that are based on object-to-object relationships.