1 / 11

R-MAT: A Recursive Model for Graph Mining

R-MAT: A Recursive Model for Graph Mining. Deepayan Chakrabarti Yiping Zhan Christos Faloutsos. Introduction. Protein Interactions [genomebiology.com]. Internet Map [lumeta.com]. Food Web [Martinez ’91]. Graphs are ubiquitous “Patterns”  regularities that occur in many graphs

mdube
Download Presentation

R-MAT: A Recursive Model for Graph Mining

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. R-MAT: A Recursive Model for Graph Mining Deepayan Chakrabarti Yiping Zhan Christos Faloutsos

  2. Introduction Protein Interactions [genomebiology.com] Internet Map [lumeta.com] Food Web [Martinez ’91] • Graphs are ubiquitous • “Patterns”  regularities that occur in many graphs • We want a realistic and efficient graph generator • which matches many patterns • and would be very useful for simulation studies.

  3. Effective Diameter Hop-plot Count vs Outdegree Count vs Indegree Graph Patterns Power Laws “Network values” vs Rank Count vs Stress Eigenvalue vs Rank

  4. Our Proposed Generator Choose quadrant b a b a=0.4 b=0.15 c d c=0.15 d=0.3 Initially Choose quadrant c and so on ….. Final cell chosen, “drop” an edge here.

  5. Our Proposed Generator Communities b RedHat a Communities within communities Linux guys b c d Mandrake Windows guys c d Cross-community links Shows a “community” effect

  6. Effective Diameter Experiments (Epinions directed graph) Count vs Indegree Count vs Outdegree Hop-plot Count vs Stress Eigenvalue vs Rank “Network value” ►R-MAT matches directed graphs

  7. Experiments (Clickstream bipartite graph) Count vs Indegree Count vs Outdegree Hop-plot Singular value vs Rank Left “Network value” Right “Network value” ►R-MAT matches bipartite graphs

  8. Experiments (Epinions undirected graph) Singular value vs Rank Hop-plot Count vs Indegree “Network value” Count vs Stress ►R-MAT matches undirected graphs

  9. Conclusions The R-MAT graph generator • matches the patterns mentioned before • along with DGX/lognormal degree distributions  can be shown theoretically • exhibits a “Community” effect • generates undirected, directed, bipartite and weighted graphs with ease • requires only 3 parameters (a,b,c), • and, is fast and scalable  O(E logN)

  10. The “DGX”/lognormal distribution • Deviations from power-laws have been observed [Pennock+ ’02] • These are well-modeledby the DGX distri-bution [Bi+’01] • Essentially fits aparabola insteadof a line to thelog-log plot. “Drifting” surfers Count “Devoted” surfer Degree Clickstream data

  11. 2n 2n Our Proposed Generator • R-MAT (Recursive MATrix) [SIAM DM’04] • Subdivide the adjacency matrix • and choose one quadrant with probability (a,b,c,d) • Recurse till we reach a 1*1 cell • where we place an edge • and repeat for all edges. a = 0.4 b = 0.15 c = 0.15 d = 0.3

More Related