Robust Network Traffic Estimation via Sparsity and Low Rank
Morteza Mardani and Georgios Giannakis
ECE Department, University of Minnesota
Acknowledgments: MURI (AFOSR FA9550-10-1-0567) grant
Vancouver, Canada, May 31, 2013
Traffic monitoring
• Backbone of IP networks
• Traffic anomalies: changes in origin-destination (OD) flows
• Failures, transient congestion, DoS attacks, intrusions, flooding
• The vision: an atlas of anomalies and nominal traffic for network management
• The means: leverage sparsity and low rank
• Complexity control through parsimonious modeling
• Robustness to anomalies
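Model
• Graph G(N, L) with N nodes, L links, and F flows (F >> L)
• (Assumption) Single path per OD flow z_{f,t}
• Packet counts per link l and time slot t: nominal plus anomalous flow traffic, routed with entries ∈ {0,1} [equation not recovered]
• Matrix model across T time slots: Y is L×T and R is L×F (fat) [equation not recovered]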
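The equations on this slide did not survive extraction. A plausible reconstruction is sketched below. It assumes the usual routed-flow model; the symbols r_{l,f}, a_{f,t}, and the noise term v_{l,t} (with its matrix form V) are notation introduced here and not confirmed by the slide text.

```latex
% Assumed per-link, per-slot model: each link aggregates the flows routed over it
y_{l,t} = \sum_{f=1}^{F} r_{l,f}\,\bigl(z_{f,t} + a_{f,t}\bigr) + v_{l,t},
\qquad r_{l,f} \in \{0,1\}

% Assumed matrix form, stacking l = 1,\dots,L and t = 1,\dots,T
\underbrace{\mathbf{Y}}_{L \times T}
  = \underbrace{\mathbf{R}}_{L \times F}\,(\mathbf{Z} + \mathbf{A}) + \mathbf{V}
```

Under this reading R is fat because F >> L, so Y alone does not determine Z and A.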
Low rank of traffic matrix
• Z: traffic matrix has low rank, e.g., [Lakhina et al. '04]
Data: http://math.bu.edu/people/kolaczyk/datasets.html
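One way to check a low-rank claim numerically is to count how many singular values carry most of a matrix's energy. The sketch below uses synthetic data, not the dataset linked above; the helper name `effective_rank` and the 95% energy threshold are illustrative assumptions.

```python
import numpy as np

def effective_rank(M, energy=0.95):
    """Smallest number of singular values whose squares sum to
    at least `energy` of the squared Frobenius norm of M."""
    s = np.linalg.svd(M, compute_uv=False)
    cum = np.cumsum(s**2) / np.sum(s**2)
    # searchsorted returns the first index whose cumulative share reaches `energy`
    return int(np.searchsorted(cum, energy) + 1)

# Synthetic stand-in for a traffic matrix (F flows x T slots), rank r plus small noise
rng = np.random.default_rng(0)
F, T, r = 60, 200, 3
Z = rng.standard_normal((F, r)) @ rng.standard_normal((r, T))
Z_noisy = Z + 0.01 * rng.standard_normal((F, T))

print(effective_rank(Z_noisy), "dominant components out of", min(F, T))
```

The noisy matrix has full algebraic rank, yet only a few components dominate, which is the sense in which measured traffic is usually called low rank.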
Sparsity of anomaly matrix
• A: anomaly matrix is sparse across both time and flows
[Figure: anomaly matrix A; axes: time, flows]
Robust tomography
• Goal: find a map of nominal traffic Z and anomalies A
• Useful for network management tasks
• Challenge: impractical to directly measure z_{f,t}
• Huge number of OD pairs (≈ N^2)
• Potential anomalies
• Available data: link counts Y plus prior knowledge on Z
• Prior art (transportation networks and computer networks)
• Least-squares and Gaussian models [Cascetta '84], [Zhao et al. '06]
• Poisson models [Vardi '96] and entropy minimization [Zuylen '80]
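Problem statement
• Recovery from link counts (SNMP) [equation not recovered]
• Seriously ill-posed: FT + FT unknowns >> LT equations
• Nullspace of R includes low-rank matrices
• Partial NetFlow measurements [equation not recovered]
• Goal: given the link counts and the partial NetFlow measurements, find sparse A and low-rank Z
• Estimator (P2) [formula not recovered]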
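The formula for (P2) is not recoverable from the slide text. Estimators that seek a low-rank Z and a sparse A are commonly built on a nuclear-norm penalty for Z and an ℓ1 penalty for A; assuming that family (an assumption, not the slide's confirmed formulation), the two proximal operators such solvers typically rely on can be sketched as follows. The function names are illustrative.

```python
import numpy as np

def soft_threshold(X, tau):
    """Proximal operator of tau * ||X||_1: shrink every entry toward zero by tau."""
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)

def singular_value_threshold(X, tau):
    """Proximal operator of tau * ||X||_* (nuclear norm):
    shrink every singular value toward zero by tau."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    s_shrunk = np.maximum(s - tau, 0.0)
    # Scale the columns of U by the shrunken singular values, then recombine
    return (U * s_shrunk) @ Vt

# Small entries vanish under soft-thresholding, promoting sparsity
A = soft_threshold(np.array([[3.0, -0.5], [0.2, -2.0]]), 1.0)

# Small singular values vanish under SVT, promoting low rank
Z = singular_value_threshold(np.diag([3.0, 1.0]), 2.0)
```

Entrywise shrinkage drives small anomalies to exactly zero, while shrinkage of singular values lowers rank; alternating such steps with a data-fit update is one standard way to attack problems of this type.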
Recovery guarantees
• Noise-free model and estimator (P3) [formula not recovered]
Theorem: Given {Y, P_Π(U), R, Π}, if every column of A_0 has at most k nonzero entries and conditions I)-II) hold, then ∃ λ ∈ [λ_min, λ_max] for which (P3) exactly recovers {Z_0, A_0}.
Practical implications
• Accurate estimation possible if
• Nominal traffic sufficiently low dimensional
• Anomalies sporadic across time and flows
• OD node pairs distant and routing paths sufficiently spread out
• NetFlow samples sufficiently many distinct OD flows
Exact recovery validation
• Setup
• L = 105, F = 210, T = 420
• R ~ Bernoulli(1/2)
• Z_0 = PQ', with P, Q ~ N(0, 1/√FT)
• a_{ij} ∈ {-1, 0, 1} w.p. {ρ/2, 1-ρ, ρ/2}
• Π_{ij} ∈ {0, 1} w.p. {1-π, π}
[Figure: two panels, π = 0.05 and π = 0.1]
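The setup above can be generated along the following lines. The slide does not state the rank of Z_0 or the value of ρ, so `r` and `rho` are placeholders; reading 1/√FT as a variance equal to 1/√(F·T), and forming noise-free link counts as R(Z_0 + A_0), are likewise assumptions of this sketch.

```python
import numpy as np

def make_synthetic(L=105, F=210, T=420, r=5, rho=0.01, pi=0.1, seed=0):
    """Draw one random instance of the validation setup.
    r and rho are hypothetical values; the slide does not give them."""
    rng = np.random.default_rng(seed)

    # Routing matrix with i.i.d. Bernoulli(1/2) entries
    R = rng.binomial(1, 0.5, size=(L, F)).astype(float)

    # Low-rank traffic Z0 = P Q'; assumption: 1/sqrt(F*T) is the entry variance
    std = np.sqrt(1.0 / np.sqrt(F * T))
    P = std * rng.standard_normal((F, r))
    Q = std * rng.standard_normal((T, r))
    Z0 = P @ Q.T

    # Sparse anomalies in {-1, 0, 1} with probabilities {rho/2, 1-rho, rho/2}
    A0 = rng.choice([-1.0, 0.0, 1.0], size=(F, T),
                    p=[rho / 2, 1 - rho, rho / 2])

    # Sampling mask: entry (i, j) is observed with probability pi
    Pi = rng.random((F, T)) < pi

    # Assumed noise-free link counts
    Y = R @ (Z0 + A0)
    return R, Z0, A0, Pi, Y

R, Z0, A0, Pi, Y = make_synthetic()
print(Y.shape, np.linalg.matrix_rank(Z0), Pi.mean())
```

Sweeping `r` and `rho` over a grid and recording whether an estimator returns {Z_0, A_0} is the usual way to produce phase-transition plots like the two panels on this slide.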
Internet2 data
• Real network data
• Dec. 8-28, 2008
• N = 11, L = 41, F = 121, T = 504
• 10% of flow counts
• 45% gain for nominal traffic
• 18% gain for anomalous traffic
[Figure legend: estimated vs. real]
Data: http://www.cs.bu.edu/~crovella/links.html
Conclusions
• Spatiotemporal correlation of traffic and sporadic nature of anomalies
• Estimated map of nominal traffic and anomalies
• Exact recovery of unknown low-rank and sparse matrices
• Deterministic sufficient conditions
• Angle between certain subspaces
• Ongoing research
• Tradeoff between OD flow and link counts
• Finding simpler conditions for random ensembles
Thank you!
Ongoing research (Satisfiability)
• Random ensembles
• Uniform sparse A
• Random orthogonal model for Z
• Row orthonormal compression matrix R
• Uniformly random sampling for P_Π(.)
• How to find a fairly tight probabilistic bound for [quantity not recovered]
• Tradeoff between required OD flow count and link count
Identifiability issues
• Misidentification if [expression not recovered] is low rank and sparse
• Perturbation in the nullspace
• Nullspaces [definitions not recovered]
• Subspaces [definitions not recovered]: rank preserving, sparsity preserving
Incoherence measures
Lemma [Local identifiability]: uniqueness holds if and only if conditions C1) and C2) are met [symbols and formulas not recovered]
• Incoherence parameter
• Non-spiky singular values
• Intersection between nullspaces
[Figure: angle θ = cos^{-1}(μ) between subspaces S1 and S2]
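The relation θ = cos^{-1}(μ) can be illustrated with the smallest principal angle between two subspaces. The slide's exact definition of μ is not recoverable, so the sketch below substitutes the standard largest-cosine measure; the function name and the example bases are hypothetical.

```python
import numpy as np

def subspace_coherence(B1, B2):
    """Return (mu, theta) for the column spaces of B1 and B2, where mu is the
    cosine of the smallest principal angle and theta = arccos(mu).
    Both inputs are assumed to have full column rank."""
    Q1, _ = np.linalg.qr(B1)  # orthonormal basis of span(B1)
    Q2, _ = np.linalg.qr(B2)  # orthonormal basis of span(B2)
    # Singular values of Q1' Q2 are the cosines of the principal angles
    s = np.linalg.svd(Q1.T @ Q2, compute_uv=False)
    mu = float(np.clip(s.max(), 0.0, 1.0))
    return mu, float(np.arccos(mu))

e1 = np.array([[1.0], [0.0], [0.0]])
e2 = np.array([[0.0], [1.0], [0.0]])
diag = np.array([[1.0], [1.0], [0.0]])

mu_orth, theta_orth = subspace_coherence(e1, e2)    # orthogonal lines
mu_diag, theta_diag = subspace_coherence(e1, diag)  # lines 45 degrees apart
```

A value of μ near 0 (θ near 90 degrees) means the subspaces barely overlap, the favorable regime for separating components; μ near 1 means they nearly share a direction.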