Multiscale Models for Network Traffic

Vinay Ribeiro, Rolf Riedi, Matt Crouse, Rich Baraniuk • Dept. of Electrical Engineering, Rice University (Houston, Texas)

Outline • Multiscale nature of network traffic • Wavelets • Wavelet models for traffic • Network inference applications

Time Scales (discrete time) • Time unit: 2^n, ..., 2, 1

Multiscale Nature of Network Traffic • Network traffic (local area networks, wide area networks, video traffic, etc.): variance decays slowly with aggregation • i.i.d. data: variance decays faster with aggregation • [Figure: Internet bytes/time trace (LBL’93) and an i.i.d. lognormal time series, each shown at time units of 6 ms, 60 ms and 600 ms]

Fractional Gaussian Noise (fGn) • Stationary Gaussian process with Hurst parameter 0 < H < 1 • Covariance: r(k) = (σ²/2) (|k+1|^{2H} − 2|k|^{2H} + |k−1|^{2H}) • Long-range dependence (LRD) if ½ < H < 1 • Second-order self-similarity • [Figure: variance-time plot]
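The variance-time plot mentioned on this slide can be sketched in a few lines. The block below is a minimal illustration, not the authors' code: the function names are hypothetical, and the estimator relies on the standard fGn fact that the variance of block means over m samples scales as m^(2H−2), so a log-log slope s gives H ≈ 1 + s/2.

```python
import numpy as np

def fgn_autocovariance(k, hurst, sigma2=1.0):
    """Autocovariance of fGn at lag k (standard closed form)."""
    k = np.abs(np.asarray(k, dtype=float))
    return 0.5 * sigma2 * (
        (k + 1) ** (2 * hurst) - 2 * k ** (2 * hurst) + np.abs(k - 1) ** (2 * hurst)
    )

def variance_time(x, max_level):
    """Variance of block means for dyadic block sizes 1, 2, ..., 2**max_level."""
    x = np.asarray(x, dtype=float)
    block_sizes, variances = [], []
    for level in range(max_level + 1):
        m = 2 ** level
        n_blocks = x.size // m
        means = x[: n_blocks * m].reshape(n_blocks, m).mean(axis=1)
        block_sizes.append(m)
        variances.append(means.var())
    return np.array(block_sizes), np.array(variances)

def hurst_from_variance_time(block_sizes, variances):
    """Fit a line to the log-log variance-time plot and map its slope to H."""
    slope, _ = np.polyfit(np.log2(block_sizes), np.log2(variances), 1)
    return 1.0 + slope / 2.0
```

For i.i.d. data the fitted slope should be close to −1 (H close to 0.5); a slope closer to 0 indicates the slow variance decay associated with LRD.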

fGn is a 1/f-Process • Power spectral density decays in a 1/f fashion • Low-frequency components ↔ long-term correlations • [Figure: power vs. frequency]

Towards Generalizations of fGn • Variance decay of traffic is not always a straight line as it is for fGn • Goal: develop LRD models • Generalize fGn • Parsimonious (few parameters) • Fast synthesis for simulations • [Figure: variance-time plot of Auckland Univ. traffic; horizontal axis: time scale]

Wavelets • Consider only orthonormal wavelet bases in L2(R) • Prototype functions: an approximation (scaling) function and a wavelet function • Basis formed by scaled and shifted versions of the prototype functions • Approximation coefficients V_{j,k} and wavelet coefficients W_{j,k}

The Haar Wavelet Basis

Computing the Haar Transform • Wavelet Transform: fine to coarse (bottom to top) • Inverse Wavelet Transform: coarse to fine (top to bottom)
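The fine-to-coarse and coarse-to-fine recursions can be sketched as follows. This is a minimal illustration assuming the orthonormal Haar normalization (factor 1/√2) and an input length that is a power of two; the function names are hypothetical.

```python
import numpy as np

def haar_forward(x):
    """Fine to coarse: return the root approximation and details, coarse scale first."""
    approx = np.asarray(x, dtype=float)
    if approx.size == 0 or approx.size & (approx.size - 1):
        raise ValueError("length must be a power of two")
    details = []
    while approx.size > 1:
        even, odd = approx[0::2], approx[1::2]
        details.append((even - odd) / np.sqrt(2))  # wavelet coefficients
        approx = (even + odd) / np.sqrt(2)         # approximation coefficients
    return approx[0], details[::-1]

def haar_inverse(root, details):
    """Coarse to fine: rebuild the signal from the root and the details."""
    approx = np.array([float(root)])
    for w in details:  # coarse scale first
        finer = np.empty(2 * approx.size)
        finer[0::2] = (approx + w) / np.sqrt(2)
        finer[1::2] = (approx - w) / np.sqrt(2)
        approx = finer
    return approx
```

Both directions touch each coefficient once per scale, so the total cost is O(N) for a signal of length N.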

Wavelets and Filtering • Wavelet coefficients at any scale j are the output of a bandpass filter • Coarse scales ↔ low-frequency band • Fine scales ↔ high-frequency band • Width of the bandpass filters increases exponentially from coarse to fine scales

Wavelets “Decorrelate” 1/f Processes • Time domain: 1/f spectrum, strong correlation • Wavelet domain: spectrum not 1/f, weak correlation • Analysis of 1/f data: sample means converge faster in the wavelet domain; estimate H in the wavelet domain • Synthesis of 1/f data: exploit the weak correlation in the wavelet domain; generate independent wavelet coefficients with appropriate variance; invert the wavelet transform
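Estimating H in the wavelet domain can be sketched as below. This is an illustrative estimator, not the authors' procedure: it assumes orthonormal Haar wavelets, for which the detail-coefficient variance of fGn at block size m scales as m^(2H−1), so a log-log slope s gives H ≈ (s + 1)/2. The function names and the `min_coeffs` cutoff are hypothetical choices.

```python
import numpy as np

def haar_detail_variances(x, min_coeffs=32):
    """Mean squared Haar detail coefficient per scale, fine to coarse."""
    approx = np.asarray(x, dtype=float)
    block_sizes, variances = [], []
    m = 1
    while approx.size // 2 >= min_coeffs:
        approx = approx[: 2 * (approx.size // 2)]  # drop a trailing odd sample
        even, odd = approx[0::2], approx[1::2]
        detail = (even - odd) / np.sqrt(2)
        approx = (even + odd) / np.sqrt(2)
        m *= 2
        block_sizes.append(m)
        variances.append(np.mean(detail ** 2))
    return np.array(block_sizes), np.array(variances)

def hurst_from_wavelets(x, min_coeffs=32):
    """Regress log2 variance on log2 block size and map the slope to H."""
    m, v = haar_detail_variances(x, min_coeffs)
    slope, _ = np.polyfit(np.log2(m), np.log2(v), 1)
    return (slope + 1.0) / 2.0
```

For i.i.d. data the detail variance is flat across scales, so the estimate should land near H = 0.5.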

Haar Wavelet “Additive” 1/f Model • Choose W_{j,k} i.i.d. within scale j • Set var(W_{j,k}) to obtain the required decay of var(V_{j,k}) • Fast O(N) synthesis • log2(N) parameters • Asymptotically Gaussian
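A coarse-to-fine synthesis along these lines might look like the sketch below. It assumes Gaussian innovations and the orthonormal Haar recursion V_{j+1,2k} = (V_{j,k} + W_{j,k})/√2, V_{j+1,2k+1} = (V_{j,k} − W_{j,k})/√2; the function name and parameters are hypothetical, and the per-scale standard deviations are left to the caller.

```python
import numpy as np

def synthesize_additive(wavelet_std, root_mean, root_std, rng):
    """Generate 2**len(wavelet_std) samples from independent Haar coefficients.

    wavelet_std[j] is the standard deviation of W_{j,k} at scale j,
    listed from the coarsest scale to the finest (one parameter per scale).
    """
    approx = np.array([rng.normal(root_mean, root_std)])  # root approximation coefficient
    for sigma_j in wavelet_std:
        w = rng.normal(0.0, sigma_j, size=approx.size)    # i.i.d. within the scale
        finer = np.empty(2 * approx.size)
        finer[0::2] = (approx + w) / np.sqrt(2)
        finer[1::2] = (approx - w) / np.sqrt(2)
        approx = finer
    return approx
```

Because every output sample is a sum of independent Gaussian terms, nothing in this construction prevents negative values.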

Sample Realization • Realization is Gaussian and can take negative values • Network traffic may be non-Gaussian and is always positive

Multiplicative Cascade Model • Replace additive innovations by multiplicative innovations • A_{j,k} ∈ [0,1], for example beta-distributed • Choose var(A_{j,k}) to get the appropriate decay of var(V_{j,k}) • Fast O(N) synthesis • log2(N) parameters • Positive data • Asymptotically lognormal at fine time scales
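One way to sketch such a cascade is shown below. The splitting rule is an assumption made for illustration: each parent value is divided between its two children in proportions A and 1 − A, with A drawn from a symmetric Beta(p, p) distribution, so the total is conserved at every scale. The function name and the per-scale shape parameters are hypothetical.

```python
import numpy as np

def synthesize_multiplicative(beta_shape, root_value, rng):
    """Generate 2**len(beta_shape) positive samples by a binary cascade.

    beta_shape[j] is the shape p of the symmetric Beta(p, p) multiplier at
    scale j, coarsest first; a larger p gives a smaller var(A_{j,k}).
    """
    values = np.array([float(root_value)])
    for p in beta_shape:
        a = rng.beta(p, p, size=values.size)  # multipliers in [0, 1]
        finer = np.empty(2 * values.size)
        finer[0::2] = a * values              # left child share
        finer[1::2] = (1.0 - a) * values      # right child share
        values = finer
    return values
```

With a positive root value every sample stays non-negative, in contrast to the additive construction.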

Sample Realization • Data is positive • Same var(Vj,k) as additive model

Additive vs. Multiplicative Models • Multiplicative model marginals are closer to real data than those of the additive model • [Figure: Internet data (Auckland Univ.), additive model and multiplicative model, each shown at time units of 6 ms, 12 ms and 24 ms]

Queuing Experiment (open loop) • Additive and multiplicative models have the same var(V_{j,k}) • Multiplicative model outperforms additive model • High-order moments can influence queuing • [Figure: queuing behavior of real traffic, multiplicative model and additive model; axis in kilobytes]

Shortcomings of Multiscale Models • Open-loop • Do not capture closed-loop nature of network protocols and user behavior • Physical intuition • Cascades model “redistribution” of traffic (multiplexing at queues, TCP)? • Stationarity: first order stationary but not second-order stationary • Time averaged correlation structure is close to fGn • Queuing of additive model close to stationary Gaussian data (simulations and theory)

Selected References
• Self-similar traffic and networks (up to 1996)
  • W. Willinger, M. Taqqu, A. Erramilli, “A bibliographical guide to self-similar traffic and performance modeling for modern high-speed networks”, Stochastic Networks: Theory and Applications, vol. 4, Oxford Univ. Press, 1996.
• Wavelets
  • S. Burrus and R. Gopinath, “Introduction to Wavelets and Wavelet Transforms”, Prentice Hall, 1998.
  • I. Daubechies, “Ten lectures on wavelets”, SIAM, New York, 1992.
• Additive model
  • S. Ma and C. Ji, “Modeling heterogeneous network traffic in wavelet domain”, IEEE Trans. Networking, vol. 9, no. 5, Oct 2001.
• Multiplicative model
  • R. Riedi, M. Crouse, V. Ribeiro, R. Baraniuk, “A multifractal wavelet model with application to network traffic”, IEEE Trans. Info. Theory, vol. 45, no. 3, April 1999.
  • A. Feldmann, A. C. Gilbert, W. Willinger, “Data networks as cascades: investigating the multifractal nature of Internet WAN traffic”, ACM SIGCOMM, pp. 42-55, 1998.
  • P. Mannersalo and I. Norros, “Multifractal analysis of real ATM traffic: A first look”, Technical report, VTT Information Technology, 1997, COST257TD(97)19.

Network Inference Applications

Why Network Inference? • Different parts of the Internet are owned by different organizations • Information sharing is difficult • Commercial interests/trade secrets • Privacy • Sheer volume of “network state” • [Figure: map in which each dot is one Internet Service Provider]

Edge-based Probing • Inject probe packets into network • Infer internal properties from packet delay/loss • Current tools infer • Topology • Link bandwidths • End-to-end available bandwidth • Congestion locations

Cross-Traffic Inference • Simple network path – single queue • Spread of packet pair gives cross-traffic over small time interval 
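As a rough illustration of how packet-pair spread maps to cross-traffic, the sketch below uses a common textbook relation rather than anything stated on the slide. It assumes a single FIFO queue of known capacity that stays busy between the departures of the two probes, so the bytes served in the output gap are the second probe plus the cross-traffic that arrived in between. The function name and parameters are hypothetical.

```python
def cross_traffic_bytes(output_gap_s, probe_bytes, capacity_bytes_per_s):
    """Estimate cross-traffic bytes that arrived between two probe packets.

    Assumption: the queue never empties between the two probes, so it serves
    capacity * gap bytes in total, of which probe_bytes belong to probe 2.
    """
    served = capacity_bytes_per_s * output_gap_s
    return max(0.0, served - probe_bytes)
```

For example, on a 1.25e6 bytes/s link, a 2 ms output gap with 1000-byte probes corresponds to about 1500 bytes of cross-traffic under these assumptions.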

Inferring cross-traffic over large time interval [0,T] • Probing uncertainty principle • Dense sampling: accurate inference, affect cross-traffic • Sparse sampling: less accurate inference, less influence on cross-traffic

Problem Statement • Given N probe pairs, how must we space them over the time interval [0,T] to optimally estimate the total cross-traffic in [0,T]? • The answer depends on the cross-traffic and on the optimality criterion

Multiscale Cross-Traffic Model • Choose N leaf nodes to give the best linear estimate (in terms of mean squared error) of the root node • Take a guess! • Bunch probes together • Exponentially space probe pairs • Uniformly space probes over the interval • Your favorite solution • [Figure: tree with the root at the top and the leaves at the bottom]

Sensor Networks Application • Each sensor samples the local value of a process (pollution, temperature, etc.) • Sensors cost money! • Find the best placement for N sensors to measure the global average • [Figure labels: global average; possible sensor location]

Independent Innovations Trees • Each node is a linear combination of parent and an independent random innovation • Optimal solution obtained by a water-filling procedure
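The effect of leaf placement can be checked numerically on a small tree. The sketch below makes simplifying assumptions that are not from the slides: a zero-mean binary tree in which each child equals its parent plus an independent innovation (unit coefficient on the parent) and the innovation variance depends only on the scale. The function names are hypothetical.

```python
import numpy as np

def lca_depth(a, b, depth):
    """Depth of the lowest common ancestor of leaves a and b (root has depth 0)."""
    d = depth
    while a != b:
        a //= 2
        b //= 2
        d -= 1
    return d

def root_lmmse(leaves, innovation_var, root_var=1.0):
    """Linear minimum mean squared error of estimating the root from the given leaves.

    innovation_var[j] is the innovation variance added between depth j and j + 1.
    Leaves are indexed 0 .. 2**depth - 1 from left to right.
    """
    depth = len(innovation_var)
    # Variance of a node at each depth: root variance plus accumulated innovations.
    node_var = root_var + np.concatenate(([0.0], np.cumsum(innovation_var)))
    # Covariance of two leaves equals the variance of their lowest common ancestor.
    S = np.array([[node_var[lca_depth(a, b, depth)] for b in leaves] for a in leaves])
    c = np.full(len(leaves), root_var)  # covariance between the root and each leaf
    return root_var - c @ np.linalg.solve(S, c)
```

On a depth-2 tree with unit variances, the spread-out pair `[0, 2]` gives an error of 0.5 and the bunched pair `[0, 1]` gives 0.6, so spreading the leaves across the two subtrees is better in this instance.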

Water-Filling • Let X be an arbitrary set of leaf nodes and consider its size, its leaves on the left and on the right, and the linear min. mean sq. error of estimating the root using X • [Figure: f_L(l) and f_R(l) for l = 0, 1, 2, 3, 4; N = 0, 1, 2, 3, 4] • Repeat at the next lower scale with N replaced by l*_N (left) and N − l*_N (right) • Result: if the innovations are identically distributed within each scale, then distribute the leaves uniformly, l*_N = ⌊N/2⌋

Covariance Trees • Distance: two leaf nodes have distance j if their lowest common ancestor is at scale j • Covariance tree: the covariance between leaf nodes at distance j is c_j (a function of distance only), and the covariance between the root and any leaf node is constant • Positively correlated tree: c_j > c_{j+1} • Negatively correlated tree: c_j < c_{j+1}

Covariance Tree Result • Result: for a positively correlated tree, choosing leaf nodes uniformly in the tree is optimal. However, for negatively correlated trees this same uniform choice is the worst case! • Optimality proof: simply construct an independent innovations tree with a similar correlation structure • Worst-case proof: the uniform choice maximizes the sum of the elements of S_X. Eigen-analysis then shows that the uniform choice minimizes the sum of the elements of S_X^{-1}

Future Directions • Sampling • More general tree structures • Non-linear estimates • Non-tree stochastic processes • Traffic estimation • More complex networks • Sensor networks • jointly optimize with other constraints like power transmission

References • Estimation on multiscale trees • A. Willsky, “Multiresolution Markov models for signal and image processing”, Proc. of the IEEE 90(8), August 2002. • Optimal sampling on trees • V. Ribeiro, R. Riedi, and R. Baraniuk, “Optimal sampling strategies for multiscale models and their application to computer networks”, preprint.
