
DeltaCon: A Principled Massive-Graph Similarity Function

Danai Koutra, Joshua T. Vogelstein, Christos Faloutsos. SDM, 2-5 May 2013, Austin, Texas, USA.


Presentation Transcript


  1. DeltaCon: A Principled Massive-Graph Similarity Function. Danai Koutra, Joshua T. Vogelstein, Christos Faloutsos. SDM, 2-5 May 2013, Austin, Texas, USA.

  2. Problem Definition: Graph Similarity • Given: (i) two graphs, GA and GB, with the same nodes and different edge sets; (ii) the node correspondence • Find: a similarity score s ∈ [0,1] Danai Koutra (CMU)

  3. Problem Definition: Graph Similarity • Given: (a) two graphs, GA and GB, with the same nodes and different edge sets; (b) the node correspondence • Find: a similarity score s ∈ [0,1], where s = 0: GA <> GB and s = 1: GA == GB

  4. Motivation (1) • 1) Classification: different brain wiring? • 2) Discontinuity detection across daily graphs: Day 1, Day 2, Day 3, Day 4, Day 5

  5. Motivation (2) • 3) Behavioral patterns: FB message graph vs. wall-to-wall network • 4) Intrusion detection

  6. Problem: Graph Similarity. Is there any obvious solution?

  7. One Solution: Edge Overlap (EO), the number of edges common to GA and GB (normalized or not)

  8. … but “barbell”… EO(B10, mB10) == EO(B10, mmB10) [Figure: barbell graph GA compared with GB, and GA compared with GB’]
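
The point of slides 7 and 8 can be checked with a few lines of Python. This is a minimal sketch, not the authors' code: the helper names, the Jaccard-style normalization, and the choice of a 10-node barbell with one bridge edge or one clique edge removed are assumptions made for illustration (the exact graphs behind mB10 and mmB10 are not given in the transcript).

```python
from itertools import combinations


def barbell_edges(k):
    """Two k-cliques on nodes 0..k-1 and k..2k-1, joined by one bridge edge."""
    left = set(combinations(range(k), 2))
    right = set(combinations(range(k, 2 * k), 2))
    bridge = (k - 1, k)
    return left | right | {bridge}, bridge


def edge_overlap(edges_a, edges_b):
    """Common edges divided by the size of the edge union (assumed normalization)."""
    union = edges_a | edges_b
    return len(edges_a & edges_b) / len(union) if union else 1.0


full, bridge = barbell_edges(5)        # 10-node barbell
no_bridge = full - {bridge}            # splits the graph into two components
no_clique_edge = full - {(0, 1)}       # graph stays connected

eo_bridge = edge_overlap(full, no_bridge)
eo_clique = edge_overlap(full, no_clique_edge)
print(eo_bridge, eo_clique)
```

Both comparisons remove exactly one edge, so edge overlap cannot tell the structurally important change from the minor one.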

  9. Contributions: Delta Connectivity • Theory: axioms; desired properties • Practice: DeltaCon algorithm; real-world applications; experiments on synthetic & real graphs

  10. Roadmap • Intuition • Axioms & Properties • Proposed Algorithm: DeltaCon • Applications • Experiments • Related Work • Conclusions

  11. Intuition (1) STEP 1: Compute the pairwise node influence matrices, SA & SB, for graphs GA and GB. [Matrix figures for SA and SB not preserved]

  12. Intuition (2) STEP 2: Find the similarity between SA & SB. [Matrix figures for SA and SB not preserved]

  13. Intuition (2) STEP 2: Find the similarity between SA & SB, e.g., sim(SA, SB) = 0.3. [Matrix figures for SA and SB not preserved]

  14. Roadmap • Intuition • Axioms & Properties • Proposed Algorithm: DeltaCon • Applications • Experiments • Related Work • Conclusions

  15. … many similarity functions can be defined… But … what properties should a good similarity function have?

  16. Axioms • A1. Identity property: sim(GA, GA) = 1 • A2. Symmetric property: sim(GA, GB) = sim(GB, GA) • A3. Zero property: sim(complete graph, empty graph) = 0

  17. Roadmap • Intuition • Axioms & Properties • Proposed Algorithm: DeltaCon • Applications • Experiments • Related Work • Conclusions

  18. Desired Properties (1) • Intuitiveness: P1. Edge Importance; P2. Weight Awareness; P3. Edge-“Submodularity”; P4. Focus Awareness • Scalability

  19. Desired Properties (2) • Intuitiveness: P1. Edge Importance; P2. Weight Awareness; P3. Edge-“Submodularity”; P4. Focus Awareness • Scalability • P1 (Edge Importance): Creation of disconnected components matters more than small connectivity changes.

  20. Desired Properties (3) • Intuitiveness: P1. Edge Importance; P2. Weight Awareness; P3. Edge-“Submodularity”; P4. Focus Awareness • Scalability • P2 (Weight Awareness): The bigger the edge weight, the more the edge change matters. [Figure: a removed edge (✗) of weight w=1 vs. a removed edge (✗) of weight w=5]

  21. Desired Properties (4) • Intuitiveness: P1. Edge Importance; P2. Weight Awareness; P3. Edge-“Submodularity”; P4. Focus Awareness • Scalability • P3 (“Diminishing Returns”): The sparser the graphs, the more important a “fixed” change is. [Figure: two GA/GB pairs with n=5 nodes]

  22. Desired Properties (5) • Intuitiveness: P1. Edge Importance; P2. Weight Awareness; P3. Edge-“Submodularity”; P4. Focus Awareness • Scalability • P4 (Focus Awareness): Targeted changes are more important than random changes of the same extent. [Figure: GA with random changes (GB) vs. targeted changes (GB’)]

  23. How do state-of-the-art methods fare? [Table: methods compared on the properties; surviving column labels: edge, weight, returns, focus; entries not preserved]

  24. Roadmap • Intuition • Axioms & Properties • Proposed Algorithm: DeltaCon • Experiments • Applications • Related Work • Conclusions

  25. Proposed algorithm: DeltaCon0 (BASE ALGO) • Find the pairwise node influence, SA & SB. [Matrix figures for SA and SB not preserved]

  26. STEP 1: How to compute node influence? • A1: PageRank • A2: Personalized Random Walk with Restart (RWR) • A3: Lazy RWR • A4: “Electrical network analogy” (resistances) • A5: Fast Belief Propagation (FaBP) • …

  27. STEP 1: Intuition of BP (BACKGROUND) • Iterative message-based method [Figure: message passing at Iteration 1 and Iteration 2; example node label: CS person]

  28. STEP 1: Fast BP (1) (BACKGROUND) • Similar to RWR [Equation figure not preserved: a linear system over 0/1 matrices and the degrees d1, d2, d3, solved for the unknown i-th row]

  29. STEP 1: Fast BP (1) (BACKGROUND) • Similar to RWR • Annotation: strength of influence between neighbors [Equation figure not preserved: a linear system over 0/1 matrices and the degrees d1, d2, d3, solved for the unknown i-th row]

  30. STEP 1: Fast BP (1) (BACKGROUND) • Similar to RWR • Annotations: strength of influence between neighbors; final influence from node i [Equation figure not preserved: a linear system over 0/1 matrices and the degrees d1, d2, d3, solved for the unknown i-th row]

  31. STEP 1: Fast BP (2) • OR, as a pairwise influence matrix (entries in reading order): [1 0.2 0.1; 0.3 1 0.2; 0 0.5 1] [Equation figure for the i-th row not preserved]
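
The equation on slides 28 to 31 did not survive extraction. For orientation, the published DeltaCon paper writes the FaBP-based influence computation as below. The symbols are taken from the paper rather than from this transcript: I is the identity matrix, D the diagonal degree matrix (the d1, d2, d3 above), A the adjacency matrix, ε a small positive constant, and e_i the i-th unit vector.

```latex
\left[\, I + \epsilon^{2} D - \epsilon A \,\right] \vec{s}_i = \vec{e}_i ,
\qquad
S = [\, s_{ij} \,] = \left[\, I + \epsilon^{2} D - \epsilon A \,\right]^{-1}
```

Solving the system for every i stacks the vectors s_i into the pairwise influence matrix S; for an undirected graph A is symmetric, so S is symmetric as well.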

  32. STEP 1: Why FaBP? (DETAILS) • Sound theoretical background (MLE on marginals) • Fast: linear in the number of edges • Attenuating neighboring influence

  33. STEP 1: Why FaBP? (INTUITION) • Sound theoretical background (MLE on marginals) • Fast: linear in the number of edges • Attenuating neighboring influence: for small ε (0 < ε < 1), influence decays with distance, 1-hop: ε > 2-hops: ε² > ...
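
The attenuation can be observed numerically. The sketch below is an illustration under assumptions, not the authors' implementation: it takes the linear system (I + ε²D − εA) s = e from the published paper, solves it by plain fixed-point iteration, and uses a hypothetical 5-node path graph with ε = 0.1. Each sweep touches every edge once, which is in the spirit of the "linear in the number of edges" claim.

```python
def influence_from(adj, source, eps, iters=200):
    """Solve (I + eps^2 D - eps A) s = e_source by fixed-point iteration.

    adj maps each node to the set of its neighbors (undirected, unweighted).
    The update s <- e + eps*A*s - eps^2*D*s converges for small eps.
    """
    e = {v: 1.0 if v == source else 0.0 for v in adj}
    s = dict(e)
    for _ in range(iters):
        s = {
            v: e[v] + eps * sum(s[u] for u in adj[v]) - eps**2 * len(adj[v]) * s[v]
            for v in adj
        }
    return s


# Hypothetical path graph 0 - 1 - 2 - 3 - 4
path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
s = influence_from(path, source=0, eps=0.1)
for node in sorted(s):
    print(node, s[node])
```

The influence of node 0 on nodes 1, 2, 3, 4 shrinks by roughly a factor of ε per hop, matching the ε > ε² > ... ordering on the slide.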

  34. Proposed algorithm: DeltaCon0 (BASE ALGO) • Find the pairwise influence (FaBP), SA & SB. • Find the distance d(SA, SB) (Matusita distance).

  35. Proposed algorithm: DeltaCon0 (BASE ALGO) • Apply FaBP to find the pairwise influence matrices, SA & SB. • Find the distance d(SA, SB) (Matusita distance). • Find the similarity from the distance. [Similarity formula not preserved]
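
The three steps of the base algorithm can be sketched end to end. Everything specific here is an assumption drawn from the published paper or chosen for illustration: the influence matrix S = (I + ε²D − εA)⁻¹, the root Euclidean (Matusita) distance d = sqrt(Σ (√sA,ij − √sB,ij)²), the similarity 1 / (1 + d), the value ε = 0.1, and the small barbell test graph. The dense Gauss-Jordan inverse keeps the sketch dependency-free and is only sensible for tiny graphs.

```python
from math import sqrt


def invert(matrix):
    """Gauss-Jordan inverse with partial pivoting (small dense matrices only)."""
    n = len(matrix)
    aug = [row[:] + [float(i == j) for j in range(n)] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0.0:
                f = aug[r][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def influence_matrix(n, edges, eps):
    """Step 1: S = (I + eps^2 D - eps A)^-1 for an undirected, unweighted graph."""
    m = [[float(i == j) for j in range(n)] for i in range(n)]
    degree = [0] * n
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
        m[u][v] -= eps
        m[v][u] -= eps
    for i in range(n):
        m[i][i] += eps**2 * degree[i]
    return invert(m)


def deltacon0(n, edges_a, edges_b, eps):
    """Steps 2 and 3: Matusita distance between SA and SB, then 1 / (1 + d)."""
    sa = influence_matrix(n, edges_a, eps)
    sb = influence_matrix(n, edges_b, eps)
    d = sqrt(sum((sqrt(max(x, 0.0)) - sqrt(max(y, 0.0))) ** 2  # clamp rounding noise
                 for ra, rb in zip(sa, sb) for x, y in zip(ra, rb)))
    return 1.0 / (1.0 + d)


def barbell(k):
    """Two k-cliques joined by the bridge edge (k-1, k)."""
    edges = [(i, j) for i in range(k) for j in range(i + 1, k)]
    edges += [(i, j) for i in range(k, 2 * k) for j in range(i + 1, 2 * k)]
    return edges + [(k - 1, k)], (k - 1, k)


full, bridge = barbell(4)
eps = 0.1  # assumed small constant, 0 < eps < 1
sim_bridge = deltacon0(8, full, [e for e in full if e != bridge], eps)
sim_clique = deltacon0(8, full, [e for e in full if e != (0, 1)], eps)
print(sim_bridge, sim_clique)
```

In this sketch, removing the bridge lowers the similarity more than removing an edge inside a clique, which is the edge-importance behavior that plain edge overlap misses.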

  36. … but O(n²) … faster?

  37. Proposed Algorithm: DeltaCon – step 1 (1) (FASTER ALGO) • 1a. Create g disjoint & covering node groups. [Figure: adjacency matrix A with its nodes split into groups 1, 2, 3, 4]

  38. Proposed Algorithm: DeltaCon – step 1 (2) (FASTER ALGO) • 1a. Create g disjoint & covering node groups. • 1b. For group i, find the node-group influence (FaBP). [Figure: node groups 1, 2, 3, 4]

  39. Proposed Algorithm: DeltaCon – step 1 (3) (INTUITION) • 1b. E.g., for group 1, find the node-group influence (FaBP). [Figure: SA with its columns marked by groups 1, 2, 3, 4 and combined row-wise into S’A]

  40. Proposed Algorithm: DeltaCon – step 1 (4) (FASTER ALGO) • 1a. Create g disjoint & covering node groups. • 1b. For group i, find the node-group influence (FaBP). • 1c. Create the node-group influence matrices, S’A & S’B. [Figure: S’A and S’B with one column per group 1, 2, 3, 4]

  41. Proposed Algorithm: DeltaCon (5) (FASTER ALGO) • 1a. Create g disjoint & covering node groups. • 1b. For group i, find the node-group influence (FaBP). • 1c. Create the node-group influence matrices, S’A & S’B. • Steps 2 & 3 as before. [Figure: S’A and S’B with one column per group 1, 2, 3, 4]
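
Steps 1a to 1c plus the unchanged steps 2 and 3 can be sketched as follows. The details are assumptions rather than the authors' code: groups are drawn by a seeded random shuffle, each group's influence solves (I + ε²D − εA) s = s0 (s0 being the 0/1 indicator of the group) by fixed-point iteration, and the distance and similarity are the Matusita distance and 1 / (1 + d) from the published paper. The function names, ε = 0.1, and the 8-node cycle are hypothetical.

```python
import random
from math import sqrt


def adjacency(n, edges):
    """Undirected adjacency sets for nodes 0..n-1."""
    adj = {v: set() for v in range(n)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def group_influence(adj, groups, eps, tol=1e-12, max_iter=10_000):
    """Steps 1b and 1c: return {node: [influence of group 0, group 1, ...]}."""
    rows = {v: [] for v in adj}
    for members in groups:
        s0 = {v: 1.0 if v in members else 0.0 for v in adj}
        s = dict(s0)
        for _ in range(max_iter):
            new = {
                v: s0[v] + eps * sum(s[u] for u in adj[v]) - eps**2 * len(adj[v]) * s[v]
                for v in adj
            }
            delta = max(abs(new[v] - s[v]) for v in adj)
            s = new
            if delta < tol:
                break
        for v in adj:
            rows[v].append(max(s[v], 0.0))  # clamp rounding noise before sqrt
    return rows


def deltacon(adj_a, adj_b, g, eps, seed=0):
    """Both graphs must share the same node set (the given node correspondence)."""
    nodes = sorted(adj_a)
    random.Random(seed).shuffle(nodes)
    groups = [set(nodes[k::g]) for k in range(g)]  # step 1a: disjoint and covering
    sa = group_influence(adj_a, groups, eps)
    sb = group_influence(adj_b, groups, eps)
    d = sqrt(sum((sqrt(x) - sqrt(y)) ** 2
                 for v in adj_a for x, y in zip(sa[v], sb[v])))
    return 1.0 / (1.0 + d)


cycle = [(i, (i + 1) % 8) for i in range(8)]
graph_a = adjacency(8, cycle)
graph_b = adjacency(8, cycle[1:])  # same nodes, edge (0, 1) removed
sim_ab = deltacon(graph_a, graph_b, g=3, eps=0.1)
print(sim_ab)
```

Only g linear systems are solved per graph instead of n, so the influence matrices shrink from n×n to n×g; with g = n each group holds a single node and the sketch reduces to the base algorithm.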

  42. Roadmap • Intuition • Axioms & Properties • Proposed Algorithm: DeltaCon • Applications (ENRON: anomaly detection; Brain Graphs: clustering) • Experiments • Conclusions

  43. Temporal Anomaly Detection in ENRON (1) • Nodes: employees • Edges: email exchange • DeltaCon similarities of consecutive timestamps: sim1 = sim(Day 1, Day 2), sim2 = sim(Day 2, Day 3), sim3 = sim(Day 3, Day 4), sim4 = sim(Day 4, Day 5)

  44. Temporal Anomaly Detection in ENRON (2) [Figure: DeltaCon similarity (y-axis) over consecutive days (x-axis); label: IMR]
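
Once the consecutive-day similarities are available, unusual days can be flagged with a control-chart rule. The sketch below reads the "IMR" label as an individuals and moving range (I-MR) chart; that reading, the use of the mean as the center line, the standard I-MR factor 2.66, and the similarity values are all assumptions for illustration, not data or settings from the talk.

```python
from statistics import mean


def low_similarity_anomalies(sims, factor=2.66):
    """Flag indices whose similarity falls below the I-MR lower control limit.

    sims[i] is the similarity between day i and day i + 1.
    """
    moving_ranges = [abs(b - a) for a, b in zip(sims, sims[1:])]
    lower_limit = mean(sims) - factor * mean(moving_ranges)
    flagged = [i for i, s in enumerate(sims) if s < lower_limit]
    return flagged, lower_limit


# Hypothetical similarity scores for ten consecutive day pairs
sims = [0.90, 0.91, 0.89, 0.92, 0.90, 0.55, 0.91, 0.90, 0.89, 0.91]
flagged, lower_limit = low_similarity_anomalies(sims)
print(flagged, lower_limit)
```

A flagged index i points at the transition from day i to day i + 1, so the change itself may belong to either of the two days.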

  45. Roadmap • Intuition • Axioms & Properties • Proposed Algorithm: DeltaCon • Applications (ENRON: anomaly detection; Brain Graphs: clustering) • Experiments • Related Work • Conclusions

  46. Brain Connectivity Graph Clustering (1) • 114 aligned connectomes (FMRI) • Nodes: 70 cortical regions • Edges: connections • Attributes: gender, IQ, age…

  47. Brain Connectivity Graph Clustering (2) • Pairwise DeltaCon similarities • Hierarchical clustering (Ward’s linkage) • t-test / ANOVA for given attributes

  48. Brain Connectivity Graph Clustering (3) • t-test / ANOVA for given attributes: p-value = 0.0057 [Figure: two clusters, labeled High CCI and Low CCI]

  49. Roadmap • Intuition • Axioms & Properties • Proposed Algorithm: DeltaCon • Applications • Experiments (Scalability) • Conclusions

  50. Scalability • Dataset: Kronecker graphs • DeltaCon is linear in the number of edges and groups: O(g×n + g×(m1+m2)), where n = # of nodes, m1 and m2 = # of edges in GA & GB, and g = # of groups. [Figure: runtime (min) vs. # of edges = max{m1, m2}; slope = 1]
