
Multiscale Models for Network Traffic

This paper explores the multiscale nature of network traffic and demonstrates the use of wavelet models for traffic analysis and network inference applications.

Presentation Transcript


  1. Multiscale Models for Network Traffic • Vinay Ribeiro, Rolf Riedi, Matt Crouse, Rich Baraniuk • Dept. of Electrical Engineering, Rice University (Houston, Texas)

  2. Outline • Multiscale nature of network traffic • Wavelets • Wavelet models for traffic • Network inference applications

  3. Time Scales (discrete time) • [Figure: time units 2^n, …, 2, 1]

  4. Multiscale Nature of Network Traffic • Network traffic (local area networks, wide area networks, video traffic, etc.): variance decays slowly with aggregation • i.i.d. data: variance decays faster with aggregation • [Figure: Internet bytes-per-time-unit trace (LBL’93) and an i.i.d. lognormal time series at time units of 6 ms, 60 ms, and 600 ms]
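The aggregation behavior described on this slide is easy to reproduce numerically. A minimal sketch in Python, using a synthetic i.i.d. lognormal series as a stand-in since the LBL’93 trace is not included here:

```python
import numpy as np

def variance_time(x, max_level=8):
    """Variance of the aggregated series for dyadic block sizes m = 1, 2, 4, ..."""
    out = []
    for level in range(max_level + 1):
        m = 2 ** level
        n = (len(x) // m) * m
        block_means = x[:n].reshape(-1, m).mean(axis=1)   # non-overlapping block averages
        out.append((m, block_means.var()))
    return out

rng = np.random.default_rng(0)
iid = rng.lognormal(mean=0.0, sigma=1.0, size=2**16)      # i.i.d. lognormal time series
for m, v in variance_time(iid):
    print(f"time unit {m:4d}   variance {v:.5f}")          # decays roughly like 1/m for i.i.d. data
```

For long-range dependent traffic the same variances decay much more slowly, which is the difference the slide's figure shows.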

  5. Fractional Gaussian Noise (fGn) • Stationary Gaussian process • Covariance determined by the Hurst parameter H (0 < H < 1) • Long-range dependence (LRD) if 1/2 < H < 1 • Second-order self-similarity • [Figure: variance-time plot]
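For reference, the standard fGn autocovariance and the variance-time law it implies (standard results from the fGn literature, supplied here for completeness) are

$$ r(k) = \frac{\sigma^2}{2}\left(|k+1|^{2H} - 2|k|^{2H} + |k-1|^{2H}\right), \qquad \operatorname{var}\!\left(X^{(m)}\right) = \sigma^2 m^{2H-2}, $$

so on a log-log variance-time plot fGn appears as a straight line of slope 2H − 2.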

  6. fGn is a 1/f-Process • Power spectral density decays in a 1/f fashion • Low frequency components → long-term correlations • [Figure: power vs. frequency]

  7. Towards Generalizations of fGn • Variance decay of traffic is not always a straight line like fGn • Goal: develop LRD models • Generalize fGn • Parsimonious (few parameters) • Fast synthesis for simulations • [Figure: variance-time plot of Auckland Univ. traffic, variance vs. time scale]

  8. Wavelets • Consider only orthonormal wavelet bases in L2(R) • Prototype functions: the approximation (scaling) function and the wavelet function • Basis formed by scaled and shifted versions of the prototype functions • Approximation coefficients Vj,k and wavelet coefficients Wj,k

  9. The Haar Wavelet Basis

  10. Computing the Haar Transform • Wavelet Transform: fine to coarse (bottom to top) • Inverse Wavelet Transform: coarse to fine (top to bottom)
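A minimal sketch of the fine-to-coarse Haar transform and its coarse-to-fine inverse described on this slide (orthonormal normalization assumed; the function names are illustrative):

```python
import numpy as np

def haar_analysis(v):
    """Fine to coarse: peel off wavelet coefficients W and aggregate scaling coefficients V."""
    v = np.asarray(v, dtype=float)
    details = []                                   # wavelet coefficients, finest scale first
    while len(v) > 1:
        even, odd = v[0::2], v[1::2]
        details.append((even - odd) / np.sqrt(2))  # W at this scale
        v = (even + odd) / np.sqrt(2)              # V one scale coarser
    return v, details                              # v holds the single coarsest coefficient

def haar_synthesis(root, details):
    """Coarse to fine: rebuild the signal from the root and the wavelet coefficients."""
    v = np.asarray(root, dtype=float)
    for w in reversed(details):                    # coarsest detail first
        fine = np.empty(2 * len(v))
        fine[0::2] = (v + w) / np.sqrt(2)
        fine[1::2] = (v - w) / np.sqrt(2)
        v = fine
    return v

x = np.array([4.0, 2.0, 5.0, 7.0, 1.0, 1.0, 3.0, 5.0])
root, details = haar_analysis(x)
assert np.allclose(haar_synthesis(root, details), x)   # perfect reconstruction
```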

  11. Wavelets and Filtering • Wavelet coefficients at any scale j are the output of a bandpass filter • Coarse scales → low frequency band • Fine scales → high frequency band • Widths of the bandpass filters increase exponentially • [Figure: bandpass filters laid out along the frequency axis]

  12. Wavelets “Decorrelate” 1/f Processes • Analysis of 1/f data • sample means converge faster in wavelet domain • estimate H in wavelet domain • Synthesis of 1/f data • Exploit weak correlation in wavelet domain • Generate independent wavelet coefficients with appropriate variance • Invert wavelet transform • [Figure: power vs. frequency in the time domain (1/f spectrum, strong correlation) and in the wavelet domain (not 1/f, weak correlation)]
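One way to make the “estimate H in the wavelet domain” bullet concrete is the usual log-variance fit across scales. A minimal sketch (a plain least-squares fit over Haar detail variances, shown as an illustration rather than a full wavelet estimator):

```python
import numpy as np

def haar_detail_log_variances(x, levels):
    """log2 of the Haar wavelet-coefficient variance at octaves j = 1..levels (j = 1 is finest)."""
    v = np.asarray(x, dtype=float)
    logvars = []
    for _ in range(levels):
        even, odd = v[0::2], v[1::2]
        w = (even - odd) / np.sqrt(2)          # detail coefficients at this octave
        logvars.append(np.log2(np.mean(w ** 2)))
        v = (even + odd) / np.sqrt(2)          # move one octave coarser
    return np.array(logvars)

def estimate_hurst(x, levels=10):
    """For LRD data, log2 var(W_j) grows roughly linearly in j with slope 2H - 1."""
    j = np.arange(1, levels + 1)
    slope = np.polyfit(j, haar_detail_log_variances(x, levels), 1)[0]
    return (slope + 1) / 2

rng = np.random.default_rng(1)
print(estimate_hurst(rng.normal(size=2**16)))  # white noise: the estimate should be near 0.5
```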

  13. Haar Wavelet “Additive” 1/f Model • Choose Wj,k i.i.d. within scale j • Set var(Wj,k) to obtain required decay of var(Vj,k) • Fast O(N) synthesis • log2(N) parameters • Asymptotically Gaussian
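A minimal sketch of the coarse-to-fine additive synthesis this slide summarizes. The particular variance schedule below (shrinking the innovation variance by 2^(2H−1) per finer scale, for a target Hurst parameter H) is one common choice, not a detail taken from the slide:

```python
import numpy as np

def additive_haar_synthesis(n_scales, H=0.8, root_mean=100.0, root_std=10.0, seed=0):
    """Generate 2**n_scales samples by adding i.i.d. Gaussian innovations scale by scale."""
    rng = np.random.default_rng(seed)
    v = np.array([rng.normal(root_mean, root_std)])   # root scaling coefficient V_{0,0}
    w_std = root_std
    for _ in range(n_scales):
        w_std /= 2 ** ((2 * H - 1) / 2)               # var(W_{j,k}) shrinks toward fine scales (H > 1/2)
        w = rng.normal(0.0, w_std, size=len(v))       # i.i.d. innovations within the scale
        fine = np.empty(2 * len(v))
        fine[0::2] = (v + w) / np.sqrt(2)             # inverse Haar step, coarse to fine
        fine[1::2] = (v - w) / np.sqrt(2)
        v = fine
    return v

trace = additive_haar_synthesis(n_scales=14)
print(trace.min())   # Gaussian by construction, so negative values are possible (cf. next slide)
```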

  14. Sample Realization • Realization is Gaussian and can take negative values • Network traffic may be non-Gaussian and is always positive

  15. Multiplicative Cascade Model • Replace additive innovations by multiplicative innovations • Aj,k ∈ [0,1], example: β-distribution • Choose var(Aj,k) to get appropriate decay of var(Vj,k) • Fast O(N) synthesis • log2(N) parameters • Positive data • Asymptotically lognormal at fine time scales
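A minimal sketch of a binomial (conservative) cascade in the spirit of this slide: the traffic in each interval is split between its two halves by a multiplier Aj,k in [0,1] drawn from a symmetric beta distribution. The beta parameter below is illustrative, and this split formulation is a simplification rather than the exact MWM parameterization:

```python
import numpy as np

def multiplicative_cascade(n_scales, total_bytes=1e6, beta_p=5.0, seed=0):
    """Split total traffic down a binary tree: children of V get A*V and (1-A)*V, A ~ Beta(p, p)."""
    rng = np.random.default_rng(seed)
    v = np.array([total_bytes])                    # root: total traffic in [0, T]
    for _ in range(n_scales):
        a = rng.beta(beta_p, beta_p, size=len(v))  # multipliers in [0, 1], symmetric about 1/2
        fine = np.empty(2 * len(v))
        fine[0::2] = a * v                         # left half-interval
        fine[1::2] = (1.0 - a) * v                 # right half-interval
        v = fine
    return v                                       # positive by construction

trace = multiplicative_cascade(n_scales=14)
print(trace.min() > 0, np.isclose(trace.sum(), 1e6))   # always positive; total is conserved
```

Letting the beta parameter vary with scale is what controls how var(Vj,k) decays across scales.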

  16. Sample Realization • Data is positive • Same var(Vj,k) as additive model

  17. Additive vs. Multiplicative Models • Multiplicative model marginals closer to real data than additive model • [Figure: marginals of Internet data (Auckland Univ.), additive model, and multiplicative model at time units of 6 ms, 12 ms, and 24 ms]

  18. Queuing Experiment • Additive and multiplicative models have the same var(Vj,k) • Multiplicative model outperforms additive model • High-order moments can influence queuing (open loop) • [Figure: queue size (kilobytes) for real traffic, additive model, and multiplicative model]
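The open-loop queuing comparison on this slide can be approximated with a simple discrete-time single-server FIFO queue (Lindley recursion). A minimal sketch, with a synthetic arrival series standing in for the real and model traces:

```python
import numpy as np

def queue_backlog(arrivals, service_per_slot):
    """Discrete-time FIFO queue: q[t+1] = max(0, q[t] + arrivals[t] - service_per_slot)."""
    q, backlog = 0.0, []
    for a in arrivals:
        q = max(0.0, q + a - service_per_slot)
        backlog.append(q)
    return np.array(backlog)

rng = np.random.default_rng(2)
arrivals = rng.lognormal(mean=3.0, sigma=1.0, size=2**14)        # stand-in traffic (bytes per slot)
backlog = queue_backlog(arrivals, service_per_slot=1.2 * arrivals.mean())
for b in [10, 100, 1000, 10000]:                                 # tail of the queue-length distribution,
    print(b, np.mean(backlog > b))                               # the quantity compared across models
```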

  19. Shortcomings of Multiscale Models • Open-loop • Do not capture closed-loop nature of network protocols and user behavior • Physical intuition • Do cascades model the “redistribution” of traffic (multiplexing at queues, TCP)? • Stationarity: first-order stationary but not second-order stationary • Time-averaged correlation structure is close to fGn • Queuing of additive model close to stationary Gaussian data (simulations and theory)

  20. Selected References • Self-similar traffic and networks (up to 1996): W. Willinger, M. Taqqu, A. Erramilli, “A bibliographical guide to self-similar traffic and performance modeling for modern high-speed networks”, Stochastic Networks: Theory and Applications, vol. 4, Oxford Univ. Press, 1996. • Wavelets: S. Burrus and R. Gopinath, “Introduction to Wavelets and Wavelet Transforms”, Prentice Hall, 1998. I. Daubechies, “Ten Lectures on Wavelets”, SIAM, 1992. • Additive model: S. Ma and C. Ji, “Modeling heterogeneous network traffic in wavelet domain”, IEEE/ACM Trans. Networking, vol. 9, no. 5, Oct. 2001. • Multiplicative model: R. Riedi, M. Crouse, V. Ribeiro, R. Baraniuk, “A multifractal wavelet model with application to network traffic”, IEEE Trans. Info. Theory, vol. 45, no. 3, April 1999. A. Feldmann, A. C. Gilbert, W. Willinger, “Data networks as cascades: investigating the multifractal nature of Internet WAN traffic”, ACM SIGCOMM, pp. 42-55, 1998. P. Mannersalo and I. Norros, “Multifractal analysis of real ATM traffic: A first look”, Technical report COST257TD(97)19, VTT Information Technology, 1997.

  21. Network Inference Applications

  22. Why Network Inference? • Different parts of Internet owned by different organizations • Information sharing difficult • Commercial interests/trade secrets • Privacy • Sheer volume of “network state” • [Figure: Internet map; each dot is one Internet Service Provider]

  23. Edge-based Probing • Inject probe packets into network • Infer internal properties from packet delay/loss • Current tools infer • Topology • Link bandwidths • End-to-end available bandwidth • Congestion locations

  24. Cross-Traffic Inference • Simple network path – single queue • Spread of packet pair gives cross-traffic over a small time interval
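A minimal sketch of the packet-pair relation behind this slide, under the usual simplifying assumptions: a single FIFO bottleneck of known capacity, and both probes of the pair served within one busy period. The variable names and numbers are illustrative, not taken from the slide:

```python
def cross_traffic_bytes(t_out1, t_out2, capacity_bps, probe2_bytes):
    """Cross-traffic that slipped in between the two probes of a packet pair.

    If the bottleneck stays busy from probe 1's departure to probe 2's departure, the output
    spacing equals (cross_traffic_bytes + probe2_bytes) / capacity_bytes_per_sec.
    """
    spread_out = t_out2 - t_out1                       # observed output spacing (seconds)
    est = capacity_bps / 8.0 * spread_out - probe2_bytes
    return max(est, 0.0)                               # clamp; estimate is invalid if the queue emptied

# example: 10 Mb/s bottleneck, 1500-byte probes observed to leave 4 ms apart
print(cross_traffic_bytes(0.010, 0.014, 10e6, 1500))   # 3500 bytes of cross-traffic
```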

  25. Inferring cross-traffic over a large time interval [0,T] • Probing uncertainty principle • Dense sampling: accurate inference, but perturbs the cross-traffic • Sparse sampling: less accurate inference, less influence on the cross-traffic

  26. Problem Statement Given N probe pairs, how must we space them over the time interval [0,T] to optimally estimate the total cross-traffic in [0,T]? • Answer depends on • cross-traffic • optimality criterion

  27. Multiscale Cross-Traffic Model • Choose N leaf nodes to give the best linear estimate (in terms of mean squared error) of the root node • Take a guess! • Bunch probes together • Exponentially space probe pairs • Uniformly space probes over the interval • Your favorite solution • [Figure: multiscale tree with the root at the top and leaves at the bottom]

  28. Sensor Networks Application • Each sensor samples local value of process (pollution, temperature etc.) • Sensors cost money! • Find best placement for N sensors to measure global average • [Figure: grid of possible sensor locations; the target is the global average]

  29. Independent Innovations Trees • Each node is a linear combination of its parent and an independent random innovation • Optimal solution obtained by a water-filling procedure
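A minimal sketch of an independent innovations tree and of the linear MMSE of the root given a chosen leaf set. The AR-style parent-to-child weight rho and the innovation variance are illustrative placeholders, not parameters taken from the talk:

```python
import numpy as np

def tree_covariances(depth, rho=0.9, root_var=1.0, innov_var=0.2):
    """Covariances for a binary tree in which x_child = rho * x_parent + independent innovation."""
    n = 2 ** depth
    def var_at(d):                                     # variance of a node d levels below the root
        return root_var * rho ** (2 * d) + innov_var * sum(rho ** (2 * m) for m in range(d))
    cov_leaves = np.empty((n, n))
    for i in range(n):
        for k in range(n):
            j = (i ^ k).bit_length()                   # levels up to the lowest common ancestor
            cov_leaves[i, k] = var_at(depth) if i == k else var_at(depth - j) * rho ** (2 * j)
    cov_root_leaf = np.full(n, root_var * rho ** depth)
    return cov_leaves, cov_root_leaf, root_var

def root_lmmse(chosen, cov_leaves, cov_root_leaf, root_var):
    """Error of the best linear estimate of the root from the chosen leaf nodes."""
    S = cov_leaves[np.ix_(chosen, chosen)]
    c = cov_root_leaf[chosen]
    return root_var - c @ np.linalg.solve(S, c)

cov_leaves, c, v0 = tree_covariances(depth=4)          # 16 leaves
print(root_lmmse([0, 4, 8, 12], cov_leaves, c, v0))    # uniformly spread leaves
print(root_lmmse([0, 1, 2, 3], cov_leaves, c, v0))     # bunched leaves
```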

  30. Water-Filling • X: arbitrary set of leaf nodes; |X|: size of X • Split X between the two subtrees: some leaves on the left, the rest on the right • Criterion: linear min. mean sq. error of estimating the root using X • Repeat at the next lower scale with N replaced by l*N (left) and (N − l*N) (right) • Result: if innovations are identically distributed within each scale, then uniformly distribute leaves, l*N = ⌊N/2⌋ • [Figure: water-filling of fL(l) and fR(l), l = 0, …, 4]
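A minimal sketch of the split-and-recurse bookkeeping behind the stated result: when the innovations are identically distributed within each scale, every split simply takes l*N = ⌊N/2⌋, which spreads the leaves uniformly. The general water-filling step, which searches over the split, is not reproduced here:

```python
def allocate_leaves(n_probes, lo, hi):
    """Recursively split the probe budget between the left and right halves of the leaf range [lo, hi)."""
    if n_probes == 0 or hi <= lo:
        return []
    if hi - lo == 1:
        return [lo]
    left = n_probes // 2                       # l*_N = floor(N/2) when innovations match within a scale
    mid = (lo + hi) // 2
    return allocate_leaves(left, lo, mid) + allocate_leaves(n_probes - left, mid, hi)

print(allocate_leaves(4, 0, 16))   # a budget of 4 probes over 16 leaf slots lands roughly uniformly
```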

  31. Covariance Trees • Distance: two leaf nodes have distance j if their lowest common ancestor is at scale j • Covariance tree: covariance between leaf nodes with distance j is cj (a function of distance only); covariance between the root and any leaf node is a constant • Positively correlated tree: cj > cj+1 • Negatively correlated tree: cj < cj+1

  32. Covariance Tree Result • Result: for a positively correlated tree, choosing leaf nodes uniformly in the tree is optimal. However, for negatively correlated trees this same uniform choice is the worst case! • Optimality proof: simply construct an independent innovations tree with a similar correlation structure • Worst-case proof: the uniform choice maximizes the sum of the elements of ΣX; using eigenanalysis, show that this implies the uniform choice minimizes the sum of the elements of ΣX^-1
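To see the contrast concretely, one can evaluate the root's linear MMSE for a uniform versus a bunched leaf choice directly from the covariance-tree quantities. The numbers used below for cj, the root-leaf covariance, and the root variance are illustrative, not values from the talk:

```python
import numpy as np

def leaf_distance(i, k):
    """Scale of the lowest common ancestor of leaves i and k in a balanced binary tree."""
    return (i ^ k).bit_length()

def root_error(chosen, c, leaf_var, root_leaf_cov, root_var):
    """LMMSE of the root given the chosen leaves of a covariance tree with covariances c[j]."""
    S = np.array([[leaf_var if i == k else c[leaf_distance(i, k)] for k in chosen] for i in chosen])
    r = np.full(len(chosen), root_leaf_cov)        # cov(root, leaf) is the same for every leaf
    return root_var - r @ np.linalg.solve(S, r)

c = {1: 0.9, 2: 0.45, 3: 0.3, 4: 0.225}            # positively correlated: c_j decreasing in j
print(root_error([0, 4, 8, 12], c, 1.0, 0.3, 1.0))   # uniform choice of 4 out of 16 leaves
print(root_error([0, 1, 2, 3],  c, 1.0, 0.3, 1.0))   # bunched choice
```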

  33. Future Directions • Sampling • More general tree structures • Non-linear estimates • Non-tree stochastic processes • Traffic estimation • More complex networks • Sensor networks • Jointly optimize with other constraints such as transmission power

  34. References • Estimation on multiscale trees: A. Willsky, “Multiresolution Markov models for signal and image processing”, Proc. of the IEEE, vol. 90, no. 8, August 2002. • Optimal sampling on trees: V. Ribeiro, R. Riedi, and R. Baraniuk, “Optimal sampling strategies for multiscale models and their application to computer networks”, preprint.
