Non-Negative Residual Matrix Factorization w/ Application to Graph Anomaly Detection. Hanghang Tong and Ching-Yung Lin. April 28-30, 2011. Large Graphs are Everywhere! Q: How to find patterns (e.g., community, anomaly, etc.)? Terrorist Network [Krebs 2002]. Food Web [2007].
Non-Negative Residual Matrix Factorization w/ Application to Graph Anomaly Detection
Hanghang Tong and Ching-Yung Lin
SIAM-DM 2011, Mesa AZ, USA, April 28-30, 2011
Large Graphs are Everywhere!
• Examples: Terrorist Network [Krebs 2002], Food Web [2007], Internet Map [Koren 2009], Social Network [Newman 2005], Protein Network [Salthe 2004], Web Graph
• Q: How to find patterns?
• e.g., community, anomaly, etc.
Matrix Tool for Finding Graph Patterns
• A Typical Procedure: Graph → Adj. Matrix A → A = F x G + R
  • F, G: low-rank matrices (community)
  • R: residual matrix (anomalies)
• An Illustrative Example
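The procedure on this slide can be sketched in a few lines of plain Python. This is only an illustration: the toy graph, the rank-1 factors `F` and `G`, and the helper names `residual_matrix` and `top_residual_edges` are assumptions made for the example and do not come from the talk.

```python
def residual_matrix(A, F, G):
    """Return R = A - F x G for dense list-of-lists matrices."""
    rank = len(G)
    n_cols = len(A[0])
    return [
        [A[i][j] - sum(F[i][k] * G[k][j] for k in range(rank)) for j in range(n_cols)]
        for i in range(len(A))
    ]


def top_residual_edges(A, R, top=3):
    """Rank existing edges (A[i][j] > 0) by residual, largest first."""
    scored = [
        (R[i][j], i, j)
        for i in range(len(A))
        for j in range(len(A[0]))
        if A[i][j] > 0
    ]
    return sorted(scored, reverse=True)[:top]


# Toy graph (assumed): nodes 0-2 form a community; node 3 hangs off node 0.
A = [
    [0, 1, 1, 1],
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
]
# Hypothetical rank-1 factors that describe only the community {0, 1, 2}.
F = [[0.8], [0.8], [0.8], [0.0]]
G = [[0.8, 0.8, 0.8, 0.0]]

R = residual_matrix(A, F, G)
print(top_residual_edges(A, R, top=2))  # edges the factors explain worst
```

Edges that the low-rank part explains poorly keep a large residual, which is why the residual matrix is the natural place to look for anomalies.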
Improve Interpretation by Non-negativity
• A Typical Procedure: Graph → Adjacency Matrix A → A = F x G + R
• An Example: Interpretation by Non-negativity
  • Non-negative Matrix Factorization (for community detection): F >= 0; G >= 0
  • Non-negative Residual Matrix Factorization (for anomaly detection): R(i,j) >= 0 for A(i,j) > 0 [This Paper]
Anomaly Detection on Graphs
• Social Networks
  • ‘Popularity contest’
• Computer Networks
  • Spammer, Port Scanner, Vulnerable Machines, etc.
• Financial Transaction Networks
  • Fraudulent transactions (e.g., money-laundering ring), scammer
• Criminal Networks
  • New criminal trend
• Tele-communication Networks
  • Tele-marketer
Key Observation: Abnormal Behavior → Actual Activities
Optimization Formulation
• General Case
  • Weighted Frobenius Form with a Weight matrix (Common in Any Matrix Factorization)
  • Non-negative residual constraint (Unique in This Paper)
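The formula itself did not survive on this slide. One way to write a problem matching its labels, using the decomposition A = F x G + R and the constraint R(i,j) >= 0 for A(i,j) > 0 from the earlier slides, is sketched below. The weight matrix symbol W and the placement of W(i,j) outside the square are assumptions; with 0/1 weights, squaring the weight or not gives the same objective.

```latex
\begin{aligned}
\min_{F,\,G}\quad & \sum_{i,j} W(i,j)\,\bigl(A(i,j) - F(i,:)\,G(:,j)\bigr)^{2} \\
\text{s.t.}\quad & R(i,j) = A(i,j) - F(i,:)\,G(:,j) \;\ge\; 0
  \quad \text{for all } (i,j) \text{ with } A(i,j) > 0
\end{aligned}
```

The objective is the part shared with other matrix factorizations; the constraint is what makes the residual non-negative on existing edges.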
Optimization Formulation
• 0/1 Weight Matrix (Major Focus of the Paper)
  • 0/1 weight in the objective (Common in Any Matrix Factorization)
  • Non-negative residual constraint (Unique in This Paper)
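With 0/1 weights, only the existing edges contribute to the objective. A minimal sketch of how the objective and the non-negative residual constraint could be evaluated for given factors follows. It assumes the weight is 1 exactly where A(i,j) > 0; the edge-dictionary layout and the function names are illustrative choices, not the paper's.

```python
def reconstruct(F, G, i, j):
    """Entry (i, j) of F x G, with F stored as rows and G as rank x n_cols."""
    return sum(F[i][k] * G[k][j] for k in range(len(G)))


def nrmf_objective(edges, F, G):
    """Sum of squared residuals over existing edges only (0/1 weights)."""
    return sum((a - reconstruct(F, G, i, j)) ** 2 for (i, j), a in edges.items())


def satisfies_nonneg_residual(edges, F, G, tol=1e-12):
    """True if R(i,j) = A(i,j) - (F x G)(i,j) >= 0 on every existing edge."""
    return all(a - reconstruct(F, G, i, j) >= -tol for (i, j), a in edges.items())


# Assumed toy data: three edges of weight 1 and hypothetical rank-1 factors.
edges = {(0, 0): 1.0, (0, 1): 1.0, (1, 0): 1.0}
F = [[1.0], [0.5]]
G = [[1.0, 0.5]]

print(nrmf_objective(edges, F, G))
print(satisfies_nonneg_residual(edges, F, G))
```

Scaling F up (for instance F = [[1.5], [0.5]]) would push the reconstruction of edge (0, 0) above its weight, so the same check would then report a violation.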
Optimization Formulation with 0/1 Weight Matrix
• NrMF with 0/1 Weight Matrix
• Q: How to find ‘optimal’ F and G?
  • D1: Quality; C1: non-convexity of the optimization objective
  • D2: Scalability; C2: large size of the graph
Optimization Method: Batch Mode
• Basic Idea 1: Alternating
  • Not convex wrt F and G jointly
  • But convex if fixing either F or G
• Basic Idea 2: Separation
  • With F fixed, solve an argmin over G(:,j) s.t. the non-negative residual constraint, for each column j
  • Each subproblem is a Standard Quadratic Programming Prob.
• Overall Complexity: Polynomial
• Can we do better?
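The per-column subproblem on this slide lost its formula. Assuming 0/1 weights and a fixed F, a form consistent with the formulation slides would be the following (the exact notation is a reconstruction, not a quote from the paper):

```latex
\begin{aligned}
\min_{G(:,j)}\quad & \sum_{i:\,A(i,j)>0} \bigl(A(i,j) - F(i,:)\,G(:,j)\bigr)^{2} \\
\text{s.t.}\quad & F(i,:)\,G(:,j) \;\le\; A(i,j)
  \quad \text{for all } i \text{ with } A(i,j) > 0
\end{aligned}
```

This is a convex quadratic objective with linear inequality constraints in the r entries of G(:,j), which is why it counts as a standard quadratic program. The update for each row of F, with G fixed, has the same shape.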
Optimization Method: Incremental Mode
• Basic Idea 1: Recursive
• Basic Idea 2: Alternating
• Basic Idea 3: Separation
• Procedure:
  • Input: Adjacency Matrix A
  • Initialize: R = A
  • Do r times: Rank-1 Approximation of R, then Update Residual Matrix R
  • Output: Final Residual Matrix
• Each subproblem is a QP for a single variable w/ boundary constraints; can be solved in constant time
• Overall Complexity: Linear wrt # of edges
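The incremental mode can be sketched in plain Python as below. This is a minimal sketch under stated assumptions, not the authors' implementation: it assumes 0/1 weights, stores the graph as a dictionary of positive edge weights, starts each rank-1 step from an all-ones vector, and runs a fixed number of alternating sweeps. All function names and the default `sweeps=20` are hypothetical. The single-variable QP is solved by clipping the unconstrained minimiser to the interval allowed by the bound constraints.

```python
from collections import defaultdict


def solve_bounded_scalar(pairs):
    """Minimise sum((r - c*x)**2) over x, subject to c*x <= r for every (c, r).

    Assumes every r >= 0, so x = 0 is always feasible.
    """
    den = sum(c * c for c, _ in pairs)
    if den == 0.0:
        return 0.0
    q = sum(c * r for c, r in pairs) / den  # unconstrained minimiser
    up = min((r / c for c, r in pairs if c > 0), default=float("inf"))
    low = max((r / c for c, r in pairs if c < 0), default=float("-inf"))
    return max(low, min(q, up))  # clip q into [low, up]


def rank1_nonneg_residual(resid, n_rows, n_cols, sweeps=20):
    """Alternating rank-1 fit f*g with f[i]*g[j] <= resid[i, j] on every edge."""
    rows, cols = defaultdict(list), defaultdict(list)
    for i, j in resid:
        rows[i].append(j)
        cols[j].append(i)
    f = [1.0] * n_rows  # assumed initialisation
    g = [0.0] * n_cols
    for _ in range(sweeps):
        for j, members in cols.items():  # fix f, update each g[j] separately
            g[j] = solve_bounded_scalar([(f[i], resid[i, j]) for i in members])
        for i, members in rows.items():  # fix g, update each f[i] separately
            f[i] = solve_bounded_scalar([(g[j], resid[i, j]) for j in members])
    return f, g


def nrmf_incremental(edges, n_rows, n_cols, rank, sweeps=20):
    """Initialize R = A, then peel off `rank` rank-1 terms, updating R each time."""
    resid = dict(edges)
    factors = []
    for _ in range(rank):
        f, g = rank1_nonneg_residual(resid, n_rows, n_cols, sweeps)
        for i, j in resid:
            # max(...) only guards against tiny negative values from rounding.
            resid[i, j] = max(0.0, resid[i, j] - f[i] * g[j])
        factors.append((f, g))
    return factors, resid


# Assumed toy input: a 3 x 3 block of unit-weight edges with one heavy edge.
edges = {(i, j): 1.0 for i in range(3) for j in range(3)}
edges[(0, 0)] = 5.0
factors, resid = nrmf_incremental(edges, n_rows=3, n_cols=3, rank=1)
print(max(resid, key=resid.get))  # edge with the largest remaining residual
```

Each sweep touches every edge a constant number of times, so the cost of this sketch grows linearly with the number of edges for a fixed rank and sweep count. In the toy input the heavy edge cannot be absorbed by the rank-1 term without violating the constraint on its neighbours, so it keeps the largest residual.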
Experimental Evaluation
• Effectiveness: [Figure: Accuracy vs. Anomaly Type]
• Efficiency: [Figure: Wall-clock Time vs. # of edges]
Batch Method vs. Incremental Method
• [Figure: Log Wall-clock time (sec.) per Data Set, Batch Method vs. Incremental Method]
Conclusion
• Problem Formulation: Non-negative Residual Matrix Factorization
  • a new matrix factorization for interpretable graph anomaly detection
• Optimization Methods
  • Batch: straightforward, polynomial time complexity
  • Incremental: linear time complexity
• Future Work
  • Other interpretable properties (sparseness) for anomaly detection
  • Matrix Factorization w/ Total Non-negativity
Thank you!
htong@us.ibm.com (We are hiring at IBM Research!)