Ongoing work on Multiresolution Analysis of Traffic Matrices • David Rincón, Matthew Roughan • EuroNFTraf 2009 workshop – Paris, 7-8 December 2009
Outline • Introduction • Seeking a sparse model for TMs • Multi-Resolution Analysis on graphs with Diffusion Wavelets • MRA of TMs: preliminary results • Conclusions and open issues
Traffic matrices • Basic input for network planning, dimensioning, traffic engineering, etc. • Direct inspection • NetFlow-like solutions • Difficult to obtain: experiment setting, router performance… • Indirect estimation: inference • SNMP link traffic volumes (5 minutes) • y = Ax (y: link loads, A: routing matrix, x: TM written as a vector) • Underconstrained problem: models are needed • Gravity model: traffic exchanged between two nodes is proportional to the total traffic entering/exiting the nodes • Prior + regularization
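The gravity model above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `gravity_model` and the per-node volumes are hypothetical, and the estimate is the plain product form x[i][j] = out[i] · in[j] / total, without the regularization step.

```python
def gravity_model(origin_totals, dest_totals):
    """Gravity estimate of a TM: x[i][j] = origin[i] * dest[j] / total."""
    total = sum(origin_totals)
    if total <= 0:
        raise ValueError("total traffic must be positive")
    return [[o * d / total for d in dest_totals] for o in origin_totals]

# Hypothetical per-node volumes (arbitrary units); both lists sum to 100.
origin_totals = [30.0, 50.0, 20.0]  # traffic entering the network at each node
dest_totals = [40.0, 40.0, 20.0]    # traffic exiting the network at each node

tm = gravity_model(origin_totals, dest_totals)
for row in tm:
    print(row)
```

By construction the estimate has rank 1, and its row and column sums reproduce the measured per-node totals whenever both lists add up to the same volume.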
Traffic matrices • Open problems • Good TM models are needed • Synthesis of TMs for planning / design of networks • Traffic prediction – anomaly detection • Traffic engineering algorithms • Traffic and topology are intertwined • Hierarchical scales in the global Internet also apply to traffic • How to reduce the dimensionality of the inference problem?
Context: topology • Spatial hierarchy [Figure: hierarchy from ASes (AS1, AS2, AS3) to network/AS, PoPs and access networks]
Our goal: can we find a general model for TMs? • Criterion: the TM model should be sparse • Sparsity: energy concentrates in few coefficients (M ≪ N²) • Tradeoff between predictive power and model fidelity • Easier to attach physical meaning • Could help with the underconstrained inference problem • Multiresolution analysis (MRA) • “Classical MRA”: wavelet transforms observe the data at different time / space resolutions • Wavelets (approximately) decorrelate input signals • Energy concentrates in few coefficients • Threshold the transform coefficients → sparse representation (denoising, compression) • Successfully applied to time series (1D) and images (2D)
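One way to quantify "energy concentrates in few coefficients" is to count how many of the largest-magnitude coefficients are needed to reach a given fraction of the total energy. The sketch below is illustrative only; the helper name `coeffs_needed` and the coefficient values are hypothetical, not data from the talk.

```python
def coeffs_needed(coeffs, fraction):
    """Smallest number M of largest-magnitude coefficients whose
    squared sum reaches `fraction` of the total energy."""
    energies = sorted((c * c for c in coeffs), reverse=True)
    target = fraction * sum(energies)
    running = 0.0
    for count, energy in enumerate(energies, start=1):
        running += energy
        if running >= target:
            return count
    return len(energies)

# Hypothetical transform coefficients: one dominant term, a few small ones.
coeffs = [10.0, -3.0, 1.0, 0.5, -0.5, 0.1]
for fraction in (0.80, 0.95, 0.99):
    print(fraction, coeffs_needed(coeffs, fraction))
```

Thresholding then amounts to keeping those M coefficients and setting the rest to zero.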
Multi-Resolution Analysis • Intuition: “to observe at different scales” • Approximations: coarse representations of the original data [Figure: decomposition of V0 into approximation spaces V1, V2, V3 and detail spaces W1, W2, W3]
Wavelet transform example • 2D wavelet decomposition of the image for j=2 levels • Vertical/horizontal high/low frequency subbands
MRA on graphs? • A TM is not an image • Image = uniform sampling of R² • The TM is defined on a graph (manifold) • Example: swiss roll • Available MRA techniques • Graph wavelets (Crovella & Kolaczyk, 2003) • Sampled 2D wavelets • Non-orthogonal, lack of a fast algorithm • Diffusion Wavelets (Maggioni & Coifman, 2006) • Orthogonal, meaningful MRA on graphs
Diffusion Wavelets (Coifman, Maggioni 2006) • Diffusion operator • A diffusion operator T “learns” the underlying geometry • T^k represents the probability of a transition in k time steps • Example (Coifman, Lafon 2006): • 3 clusters, 300 random Gaussian-distributed points, with
How to perform MRA on TMs? [Figure: operator T (10x10 matrix) and the normalized eigenspectrum of T, eigenvalues ordered from low to high frequency and split into subspaces W1, V1, W2, V2; panel labels: Cv2 5, CW2 3, CW1 2]
Diffusion Wavelets and our goals • One-dimensional functions of the vertices, F(v1), can be projected onto the multi-resolution spaces defined by the DW • Network topology can be studied by defining a random-walk-like diffusion operator and representing the coarsened versions of the graph • But traffic matrices are 2D functions of the origin and destination vertices, and can also be functions of time: TM(v1,v2,t)
2D Diffusion wavelets • Extension of DW to 2D functions defined on a graph • F(v1,v2) • Construction of separable 2D bases by “projecting twice” into both “directions” • Tensor product • Similar to 2D DWT • Orthonormal, invertible, energy-conserving transform [Figure: operator T and the 2D subspaces WWj, VWj, WVj, VVj for j = 1, 2, 3]
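The "projecting twice" construction can be sketched with any orthonormal basis on the vertices. Important assumption: the code below uses the eigenvectors of a small symmetric operator as a stand-in basis, not the actual diffusion wavelet basis (which is built by a multiscale compression of powers of T). The 4-node graph, the random matrix F and the function names are hypothetical. The point is only to show that the separable transform C = Φᵀ F Φ is invertible and energy conserving when Φ is orthonormal.

```python
import numpy as np

def transform_2d(F, basis):
    """Project a 2D function F(v1, v2) onto the basis along both directions."""
    return basis.T @ F @ basis

def inverse_2d(C, basis):
    """Recover F from its coefficients (valid for an orthonormal basis)."""
    return basis @ C @ basis.T

# Hypothetical undirected 4-node graph: triangle 0-1-2 plus edge 2-3.
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
deg = A.sum(axis=1)
T = A / np.sqrt(np.outer(deg, deg))   # symmetric operator, T_ij = A_ij / sqrt(d_i d_j)
_, basis = np.linalg.eigh(T)          # orthonormal columns (stand-in for a DW basis)

rng = np.random.default_rng(0)
F = rng.random((4, 4))                # stand-in for a traffic matrix

C = transform_2d(F, basis)
F_rec = inverse_2d(C, basis)
print(np.allclose(F_rec, F))                               # invertible
print(np.isclose(np.linalg.norm(C), np.linalg.norm(F)))    # energy conserved
```

Each coefficient C[a, b] is the inner product of F with the tensor product of basis vectors a and b, which is the same separable structure used by the 2D DWT.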
MRA of Traffic Matrices • More than 20000 TMs from operational networks • Abilene (2004), granularity 5 min • GÉANT (2005), granularity 15 min • Acknowledgments: Yin Zhang (UTexas), S. Uhlig (Delft) • Adjacency operator: • A: unweighted adjacency matrix • “Symmetrised” version of the random walk – same eigenvalues • Precision ε = 10⁻⁷
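The operator formula did not survive on the slide. A common form that matches the description (symmetric, same eigenvalues as the random walk) is T = D^(-1/2) A D^(-1/2), with D the diagonal degree matrix; the sketch below assumes that form and checks the eigenvalue claim on a hypothetical 4-node graph.

```python
import numpy as np

# Hypothetical undirected 4-node graph: triangle 0-1-2 plus edge 2-3.
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
deg = A.sum(axis=1)

P = A / deg[:, None]                      # random walk D^-1 A (row-stochastic)
d_inv_sqrt = 1.0 / np.sqrt(deg)
T = d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]   # assumed form D^-1/2 A D^-1/2

eig_T = np.sort(np.linalg.eigvalsh(T))    # T is symmetric, so eigvalsh applies
eig_P = np.sort(np.linalg.eigvals(P).real)

print(np.allclose(T, T.T))                # symmetric
print(np.allclose(eig_T, eig_P))          # same spectrum as the random walk
```

The two matrices are related by a similarity transform (T = D^(1/2) P D^(-1/2)), which is why the spectra coincide while only T is symmetric.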
2D Diffusion wavelets – Abilene example • Number of eigenvalues at each subspace • Wj = WVj + VWj + WWj [Figure: subspaces V0 to V8 and W1 to W8 annotated with their dimensions, e.g. V0: 12, V4: 6]
2D Diffusion wavelets – Abilene example [Figure: Abilene topology with nodes STTL, SNVA, DNVR, LOSA, KSCY, HSTN, IPLS, ATLA, CHIN, NYCM, WASH, ATLA-M5]
2D Diffusion wavelets – Abilene example [Figure: DW coefficients, Abilene, 14th July 2004 (24 hours); axes: time (5 min intervals) vs. coefficient index (high to low frequency)]
2D Diffusion wavelets – Abilene example • How concentrated is the energy of the TM? • Wavelet coefficients for the Abilene TM • 12 x 12 = 144 coefficients [Figure: coefficients ordered from high to low frequency]
Coefficient rank – Abilene, March 2004 [Figure: rank signature; axes: time (5 min intervals) vs. coefficient index]
Other operators • Gravity operator • G: normalized gravity model (rank 1) from fan-out and fan-in probabilities • Needs symmetrisation (undirected graph) • Actual operator: Max-eig-normalized T (non-stochastic)
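A minimal sketch of such a gravity operator follows. Assumptions, since the slide does not give the details: symmetrisation is done as (G + Gᵀ)/2, the normalisation divides by the largest absolute eigenvalue, and the fan-out / fan-in probabilities are made up for illustration.

```python
import numpy as np

def gravity_operator(fan_out, fan_in):
    """Rank-1 gravity matrix, symmetrised and scaled so that the
    largest absolute eigenvalue is 1 (the result is not stochastic)."""
    G = np.outer(fan_out, fan_in)          # rank-1 gravity model
    S = 0.5 * (G + G.T)                    # assumed symmetrisation
    lam_max = np.max(np.abs(np.linalg.eigvalsh(S)))
    return S / lam_max

# Hypothetical fan-out and fan-in probabilities for 3 nodes (each sums to 1).
fan_out = np.array([0.5, 0.3, 0.2])
fan_in = np.array([0.2, 0.3, 0.5])

T = gravity_operator(fan_out, fan_in)
print(np.allclose(T, T.T))
print(np.max(np.abs(np.linalg.eigvalsh(T))))
```

Note that with this choice of symmetrisation the operator generally has rank 2 rather than 1, unless the fan-out and fan-in vectors are proportional.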
Gravity operator [Figure: coefficients (high to low frequency) for the gravity operator and the topology operator]
Gravity operator • 1.4% of the coefficients: 80% of the energy • 4.9% of the coefficients: 90% of the energy • 11% of the coefficients: 95% of the energy
Conclusions and open issues • Representation of TMs in the DW domain • TMs in the DW domain seem to be sparse (compressible) • Consistency over time • Ongoing work • Develop a sparse model for TMs • Exploit DW’s dimensionality reduction in the inference problem • Explore weighted / routing-related diffusion operators • Introduce time correlations in the diffusion operator • Diffusion wavelet packets – best basis algorithms for compression • DW analysis of network topologies
Thank you! Questions?
MRA of TMs: Why? • Applications of MRA in Signal Processing • Denoising • Keep the low-frequency components, discard the high-frequency details • Compression • Keep the best coefficients for highest perceptual quality • Potential applications for TMs • “Denoising” • “Compression” – express a TM with few coefficients • Lower-dimension model of the TM, easier to predict/analyze • Could this help with the inference problem?
Flow/traffic operators • Actual operators: Max-eig-normalized T (non-stochastic) • Normalized -style [Figure: flow operator and traffic operator on an example graph; edge labels: 2 (or 2/3) and 1 (or 1/3); 1 Mb/s (0.4) and 1.5 Mb/s (0.6)]
[Figure: GÉANT topology, 23 nodes (2005)]
The tools: Graph wavelets • Graph wavelets for spatial traffic analysis (Crovella & Kolaczyk 03) • Exploit spatial correlation of traffic data • Sampled 2D wavelets
The tools: Graph wavelets • Graph wavelets for spatial traffic analysis (Crovella & Kolaczyk 03) • Link analysis • Definition of scale j: j-hop neighbours
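The notion of scale as j-hop neighbourhoods can be illustrated with a breadth-first search. This is a sketch, not the construction from the cited paper: it simply returns the nodes at exactly j hops from a source, and the 5-node topology and function names are hypothetical.

```python
from collections import deque

def hop_distances(adjacency, source):
    """Breadth-first search: hop count from `source` to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist

def j_hop_neighbours(adjacency, source, j):
    """Nodes located exactly j hops away from `source`."""
    distances = hop_distances(adjacency, source)
    return sorted(n for n, d in distances.items() if d == j)

# Hypothetical 5-node topology given as adjacency lists.
adjacency = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2, 4], 4: [3]}
for j in (1, 2, 3):
    print(j, j_hop_neighbours(adjacency, 0, j))
```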
The tools: Graph wavelets • Graph wavelets for spatial traffic analysis (Crovella & Kolaczyk 03) • Anomaly detection in Abilene [Figure: traffic and its graph wavelet analysis at scales j=1, j=3, j=5]
Multi-Resolution Analysis • Scaling functions: averaging, low-frequency functions • Wavelet functions: differencing, high-frequency functions
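The averaging / differencing roles can be seen in one level of the Haar transform, the simplest classical wavelet. Haar is chosen here purely for illustration (the slides do not name a specific wavelet), and the signal values are hypothetical.

```python
import math

def haar_step(signal):
    """One level of the orthonormal Haar transform.
    Returns (approximation, detail): scaled pairwise sums and differences."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    s = 1.0 / math.sqrt(2.0)
    evens, odds = signal[0::2], signal[1::2]
    approx = [(a + b) * s for a, b in zip(evens, odds)]   # scaling functions: averaging
    detail = [(a - b) * s for a, b in zip(evens, odds)]   # wavelet functions: differencing
    return approx, detail

def haar_inverse(approx, detail):
    """Rebuild the signal from one level of Haar coefficients."""
    s = 1.0 / math.sqrt(2.0)
    signal = []
    for a, d in zip(approx, detail):
        signal.extend([(a + d) * s, (a - d) * s])
    return signal

signal = [4.0, 6.0, 10.0, 12.0, 8.0, 8.0, 2.0, 0.0]   # hypothetical samples
approx, detail = haar_step(signal)
print(approx)
print(detail)
print(haar_inverse(approx, detail))
```

Applying `haar_step` again to the approximation gives the next coarser scale, which is the nesting of spaces that MRA relies on.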
Multi-Resolution Analysis (2D) • Separable bases: horizontal x vertical • Example: 2D scaling function
Diffusion wavelets • Eigenvalues of the diffusion operator • Every operator can be defined in terms of its eigenspectrum • Eigenvalues λ_i, eigenvectors v_i • 0 ≤ |λ_i| ≤ 1 • Eigenvalues of T^k: λ_i^k • The number of “important” eigenvalues / eigenvectors decreases with k • Those under a certain precision are related to high-frequency detail • Those over it are related to low-frequency approximations
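The decay of the "important" part of the spectrum can be checked numerically. The sketch below is only an illustration: it assumes a symmetric operator of the form D^(-1/2) A D^(-1/2) on a hypothetical 4-node graph, takes dyadic powers k, and counts the eigenvalues whose k-th power stays above a precision of 1e-7. The function name `count_significant` is hypothetical.

```python
import numpy as np

def count_significant(eigenvalues, k, precision):
    """Number of eigenvalues of T^k (i.e. lambda_i ** k) above the precision."""
    return int(np.sum(np.abs(eigenvalues) ** k > precision))

# Hypothetical undirected 4-node graph: triangle 0-1-2 plus edge 2-3.
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
deg = A.sum(axis=1)
T = A / np.sqrt(np.outer(deg, deg))       # symmetric, eigenvalues in [-1, 1]

eigenvalues = np.linalg.eigvalsh(T)
powers = [1, 2, 4, 8, 16, 32, 64]
counts = [count_significant(eigenvalues, k, 1e-7) for k in powers]
for k, c in zip(powers, counts):
    print(k, c)
```

The eigenvalue 1 never decays, so the count settles at the dimension of the coarsest approximation space; the eigenvalues that drop below the precision at each step are the ones assigned to detail.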