This paper explores the self-similar nature of Ethernet traffic and its implications for performance evaluation. It provides an explanation and a systematic modeling approach for realistic data traffic patterns, challenging traditional Poisson assumptions. Self-similarity in traffic has important consequences for network analysis and modeling.
On the Self-Similar Nature of Ethernet Traffic. Will E. Leland, Walter Willinger and Daniel V. Wilson (Bellcore); Murad S. Taqqu (Boston University). Analysis and Prediction of the Dynamic Behavior of Applications, Hosts, and Networks
Overview • What is Self-Similarity? • Ethernet Traffic is Self-Similar • Sources of Self-Similarity • Implications of Self-Similarity
Background • Network traffic did not obey the Poisson assumptions used in queuing analysis • This paper, for the first time, provided an explanation and a systematic approach to modeling realistic data traffic patterns • It sparked research around the globe: later results show self-similarity in ATM traffic, compressed digital video streams, and Web traffic
Why is Self-Similarity Important? • In this paper, Ethernet traffic has been identified as being self-similar. • Models like Poisson are not able to capture the self-similarity property. • This leads to inaccurate performance evaluation
What is Self-Similarity? • Self-similarity describes the phenomenon where a certain property of an object is preserved with respect to scaling in space and/or time. • If an object is self-similar, its parts, when magnified, resemble the shape of the whole. • In the case of stochastic objects like time series, self-similarity holds in the distributional sense
Intuition of Self-Similarity • Something “feels the same” regardless of scale (such objects are also called fractals)
Self-Similarity in Traffic Measurement (Ⅰ)
The Famous Data • Leland and Wilson collected hundreds of millions of Ethernet packets without loss and with recorded time-stamps accurate to within 100 µs. • Data were collected from several Ethernet LANs at the Bellcore Morristown Research and Engineering Center at different times over the course of approximately 4 years.
Why is Self-Similarity Important? • Recently, network packet traffic has been identified as being self-similar. • Current network traffic modeling using Poisson distributions (etc.) does not take into account the self-similar nature of traffic. • This leads to inaccurate modeling which, when applied to a huge network like the Internet, can lead to huge financial losses.
Problems with Current Models • Current models predict that, as the number of sources (Ethernet users) increases, the aggregate traffic becomes smoother and smoother • Analysis of measured traffic shows instead that traffic tends to become less smooth and more bursty as the number of active sources increases
Consequences of Self-Similarity • Traffic has similar statistical properties at a range of timescales: ms, secs, mins, hrs, days • Merging of traffic (as in a statistical multiplexer) does not result in smoothing of traffic: bursty data streams, after aggregation, remain bursty aggregate streams (see the sketch below)
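As a rough illustration of this point (my sketch, not from the paper): the code below generates a short-range dependent count process (Poisson) and a long-range dependent one (fractional Gaussian noise with an assumed H = 0.85), aggregates both over increasing block sizes m, and compares coefficients of variation. The Poisson counts smooth out roughly like m^(-1/2), while the self-similar counts barely smooth at all.

```python
import numpy as np
from scipy.linalg import toeplitz

def fgn(n, H, rng):
    """Fractional Gaussian noise via Cholesky of its Toeplitz autocovariance."""
    k = np.arange(n, dtype=float)
    gamma = 0.5 * ((k + 1)**(2*H) - 2*k**(2*H) + np.abs(k - 1)**(2*H))
    return np.linalg.cholesky(toeplitz(gamma)) @ rng.standard_normal(n)

def aggregate(x, m):
    """X^(m): means over non-overlapping blocks of size m."""
    n = (len(x) // m) * m
    return x[:n].reshape(-1, m).mean(axis=1)

rng = np.random.default_rng(0)
n = 4096
poisson = rng.poisson(10.0, n).astype(float)    # short-range dependent counts
selfsim = 10.0 + 3.0 * fgn(n, H=0.85, rng=rng)  # LRD counts (H = 0.85 is assumed)

for m in (1, 16, 256):
    p, s = aggregate(poisson, m), aggregate(selfsim, m)
    print(f"m={m:4d}  CV(Poisson)={p.std()/p.mean():.3f}  "
          f"CV(self-similar)={s.std()/s.mean():.3f}")
```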
Problems with Current Models (cont'd) • Were traffic to follow a Poisson or Markovian arrival process, it would have a characteristic burst length which would tend to be smoothed by averaging over a long enough time scale • Rather, measurements of real traffic indicate that significant traffic variance (burstiness) is present on a wide range of time scales
Definitions and Properties • Long-range dependence: the autocorrelation function decays slowly (hyperbolically rather than exponentially) • Hurst parameter H: introduced by Harold Hurst (1965); H is a measure of “burstiness” and is also considered a measure of self-similarity; 0 < H < 1; H increases as traffic increases
Definitions and Properties (cont'd) • Across low, medium, and high traffic hours: as traffic increases, the Hurst parameter increases, i.e., traffic becomes more self-similar
Properties of Self-Similarity • X = (Xt : t = 0, 1, 2, …) is a covariance-stationary random process (i.e., Cov(Xt, Xt+k) does not depend on t for any k) • Let X(m) = (Xk(m) : k = 1, 2, 3, …) denote the new process obtained by averaging the original series X over non-overlapping blocks of size m • E.g., if X(1) = 4, 12, 34, 2, −6, 18, 21, 35, then X(2) = 8, 18, 6, 28 and X(4) = 13, 17 (see the sketch below) • X has mean μ, variance σ², and autocorrelation function r(k) ~ k^(−β) as k → ∞, with 0 < β < 1
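A minimal sketch (illustrative, not from the paper) of the block averaging that defines X(m), reproducing the numbers above:

```python
import numpy as np

def aggregate(x, m):
    """X^(m): means over non-overlapping blocks of size m."""
    x = np.asarray(x, dtype=float)
    n = (len(x) // m) * m          # drop any incomplete trailing block
    return x[:n].reshape(-1, m).mean(axis=1)

x1 = [4, 12, 34, 2, -6, 18, 21, 35]   # X^(1) from the slide
print(aggregate(x1, 2))               # [ 8. 18.  6. 28.]
print(aggregate(x1, 4))               # [13. 17.]
```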
Auto-correlation Definition • X is exactly second-order self-similar if the aggregated processes have the same autocorrelation structure as X, i.e., r(m)(k) = r(k) for all k ≥ 0 and all m = 1, 2, … • X is asymptotically second-order self-similar if r(m)(k) → r(k) as m → ∞ • Most striking feature of self-similarity: the correlation structure of the aggregated processes does not degenerate as m → ∞
Traditional Models • This is in contrast to traditional models, whose aggregated processes have correlation structures that degenerate as m → ∞, i.e., r(m)(k) → 0 as m → ∞ for each k = 1, 2, 3, … • Example: a Poisson-style (short-range dependent) model versus a self-similar one, compared below
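To make the contrast concrete, a hedged sketch (my illustration: an AR(1) process stands in for a “traditional” short-range dependent model, and fractional Gaussian noise with an assumed H = 0.8 for the self-similar one). The lag-1 autocorrelation of the aggregated AR(1) series collapses toward 0 as m grows, while that of the fGn series stays roughly fixed:

```python
import numpy as np
from scipy.linalg import toeplitz

def fgn(n, H, rng):
    """Exactly self-similar Gaussian increments (fractional Gaussian noise)."""
    k = np.arange(n, dtype=float)
    g = 0.5 * ((k + 1)**(2*H) - 2*k**(2*H) + np.abs(k - 1)**(2*H))
    return np.linalg.cholesky(toeplitz(g)) @ rng.standard_normal(n)

def ar1(n, rho, rng):
    """Short-range dependent AR(1) process with coefficient rho."""
    x = np.zeros(n)
    for t in range(1, n):
        x[t] = rho * x[t-1] + rng.standard_normal()
    return x

def lag1_acf(x):
    x = x - x.mean()
    return float((x[:-1] * x[1:]).mean() / x.var())

def aggregate(x, m):
    n = (len(x) // m) * m
    return x[:n].reshape(-1, m).mean(axis=1)

rng = np.random.default_rng(1)
lrd, srd = fgn(4096, 0.8, rng), ar1(4096, 0.7, rng)
for m in (1, 8, 32):
    print(f"m={m:2d}  r(1) fGn={lag1_acf(aggregate(lrd, m)):+.2f}  "
          f"AR(1)={lag1_acf(aggregate(srd, m)):+.2f}")
```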
Long-Range Dependence • Processes with long-range dependence are characterized by an autocorrelation function that decays hyperbolically as k increases • Important property: the autocorrelations are non-summable, i.e., Σk r(k) = ∞ (this is called non-summability of correlations) • The intuition behind long-range dependence: while high-lag correlations are all individually small, their cumulative effect is important • Gives rise to features drastically different from conventional short-range dependent processes
Intuition • Short-range dependent processes: exponential decay of autocorrelations, i.e., r(k) ~ ρ^k as k → ∞, with 0 < ρ < 1, so the summation Σk r(k) is finite • Non-summability is an important property of long-range dependence: it guarantees a non-degenerate correlation structure of the aggregated processes X(m) as m → ∞
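A quick numeric check (illustration only): partial sums of a hyperbolically decaying r(k) = k^(−β) keep growing without bound, while a geometrically decaying r(k) = ρ^k sums to a finite limit:

```python
import numpy as np

beta, rho = 0.4, 0.7
k = np.arange(1, 10**6 + 1, dtype=float)
for cut in (10**2, 10**4, 10**6):
    # hyperbolic decay: partial sums keep growing (the series diverges)
    print(f"sum of k^-{beta} up to {cut:>7}: {(k[:cut]**-beta).sum():10.1f}")
# geometric decay: converges to rho/(1-rho) = 2.333...
print(f"sum of {rho}^k for k <= 200:    {(rho**k[:200]).sum():10.3f}")
```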
The Measure of Self-Similarity • Hurst Parameter H , 0.5 < H < 1 • Three approaches to estimate H (Based on properties of self-similar processes) • Variance Analysis of aggregated processes • Analysis of Rescaled Range (R/S) statistic for different block sizes • A Whittle Estimator
Variance Analysis • The variance of the aggregated processes decays as Var(X(m)) ~ a·m^(−β) as m → ∞, with 0 < β < 1 • For short-range dependent processes (e.g., a Poisson process), Var(X(m)) ~ a·m^(−1) as m → ∞ • Plot Var(X(m)) against m on a log-log plot: a slope greater (shallower) than −1 is indicative of self-similarity
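A minimal variance-time estimator along these lines (my sketch, not the paper's code): regress log Var(X(m)) on log m; the slope is −β = 2H − 2, so H = 1 + slope/2. On i.i.d. noise it should return H ≈ 0.5:

```python
import numpy as np

def variance_time_H(x, block_sizes):
    """Estimate H from the slope of log Var(X^(m)) vs log m (slope = 2H - 2)."""
    x = np.asarray(x, dtype=float)
    variances = []
    for m in block_sizes:
        n = (len(x) // m) * m
        variances.append(x[:n].reshape(-1, m).mean(axis=1).var())
    slope = np.polyfit(np.log(block_sizes), np.log(variances), 1)[0]
    return 1.0 + slope / 2.0

rng = np.random.default_rng(2)
print(variance_time_H(rng.standard_normal(2**14), [1, 4, 16, 64, 256]))  # ~0.5
```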
The R/S Statistic. For a given set of observations X1, …, Xn with sample mean X̄(n) and sample standard deviation S(n), the rescaled adjusted range, or R/S statistic, is given by R(n)/S(n) = (1/S(n))·[max(0, W1, …, Wn) − min(0, W1, …, Wn)], where Wk = (X1 + X2 + … + Xk) − k·X̄(n), k = 1, …, n
Example • Xk = 14, 1, 3, 5, 10, 3; mean X̄ = 36/6 = 6 • W1 = 14 − (1·6) = 8 • W2 = 15 − (2·6) = 3 • W3 = 18 − (3·6) = 0 • W4 = 23 − (4·6) = −1 • W5 = 33 − (5·6) = 3 • W6 = 36 − (6·6) = 0 • R/S = (1/S)·[max(0, W1, …, W6) − min(0, W1, …, W6)] = (1/S)·[8 − (−1)] = 9/S
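The computation above as a small sketch (the helper name rs_statistic is mine):

```python
import numpy as np

def rs_statistic(x):
    """Rescaled adjusted range R/S for one block of observations."""
    x = np.asarray(x, dtype=float)
    w = np.cumsum(x) - np.arange(1, len(x) + 1) * x.mean()   # W_k
    r = max(0.0, w.max()) - min(0.0, w.min())                # adjusted range R
    return r / x.std()                                       # rescale by sample std dev S

x = [14, 1, 3, 5, 10, 3]
print(rs_statistic(x))   # R = 8 - (-1) = 9, S ~ 4.55, so R/S ~ 1.98
```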
The Hurst Effect • For self-similar data, the rescaled range (R/S statistic) grows according to c·n^H, where H is the Hurst parameter, H > 0.5 • For short-range processes, the R/S statistic grows as d·n^0.5 • History: the Nile river: in the 1940s-50s, Harold Edwin Hurst studied the 800-year record of flooding along the Nile (yearly minimum water level) and found long-range dependence
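A sketch of the classical R/S estimate of H (illustrative; hurst_rs is my name): average R/S over non-overlapping blocks of several sizes n and fit the slope of log(R/S) versus log n. On short-range data the slope should come out near 0.5 (small samples bias it slightly upward):

```python
import numpy as np

def rs_statistic(x):
    x = np.asarray(x, dtype=float)
    w = np.cumsum(x) - np.arange(1, len(x) + 1) * x.mean()
    return (max(0.0, w.max()) - min(0.0, w.min())) / x.std()

def hurst_rs(x, sizes):
    """Slope of log mean(R/S) versus log n estimates H."""
    mean_rs = [np.mean([rs_statistic(x[i:i + n])
                        for i in range(0, len(x) - n + 1, n)]) for n in sizes]
    return np.polyfit(np.log(sizes), np.log(mean_rs), 1)[0]

rng = np.random.default_rng(3)
print(hurst_rs(rng.standard_normal(8192), [16, 32, 64, 128, 256]))  # near 0.5
```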
Whittle Estimator • Provides a confidence interval for H • Property: any long-range dependent process approaches fractional Gaussian noise (FGN) when aggregated to a certain level • Test the aggregated observations to ensure that they have converged to the normal distribution
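For concreteness, a heavily simplified Whittle sketch for fractional Gaussian noise (my construction: it profiles out the scale and uses a truncated form of the fGn spectral density; the paper's actual estimator and its confidence intervals involve more care):

```python
import numpy as np
from scipy.optimize import minimize_scalar

def fgn_spectral_shape(lam, H, terms=50):
    """Unnormalized fGn spectral density: 2(1 - cos lam) * sum_j |lam + 2*pi*j|^(-2H-1)."""
    j = np.arange(-terms, terms + 1)
    return 2 * (1 - np.cos(lam)) * (np.abs(lam[:, None] + 2*np.pi*j)**(-2*H - 1)).sum(axis=1)

def whittle_H(x):
    """Minimize the profiled Whittle quasi-likelihood over H."""
    n = len(x)
    lam = 2 * np.pi * np.arange(1, n // 2) / n                            # Fourier frequencies
    I = np.abs(np.fft.fft(x - x.mean())[1:n // 2])**2 / (2 * np.pi * n)   # periodogram
    def objective(H):
        g = fgn_spectral_shape(lam, H)
        return np.log(np.mean(I / g)) + np.mean(np.log(g))
    return minimize_scalar(objective, bounds=(0.01, 0.99), method="bounded").x

rng = np.random.default_rng(4)
print(whittle_H(rng.standard_normal(4096)))   # white noise: estimate should sit near 0.5
```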
Self-Similarity • X is exactly second-order self-similar with Hurst parameter H = 1 − β/2 if, for all m, Var(X(m)) = σ²·m^(−β) and r(m)(k) = r(k) for k ≥ 0 • X is asymptotically second-order self-similar if r(m)(k) → r(k) as m → ∞
Plots Showing Self-Similarity (Ⅰ) [Figure: variance-time and R/S plots with reference lines for H = 1 and H = 0.5; the Ethernet data yield an estimate of H ≈ 0.8]
Plots Showing Self-Similarity (Ⅱ) [Figure: plots for high-traffic (5.0%-30.7% utilization), mid-traffic (3.4%-18.4%), and low-traffic (1.3%-10.4%) hours; higher traffic gives higher H]
H: A Function of Network Utilization • Observations run contrary to Poisson modeling: as network utilization grows, H grows • As we shall see shortly, H measures traffic burstiness • As the number of Ethernet users increases, the resulting aggregate traffic becomes burstier instead of smoother
Difference in Low-Traffic H Values • Pre-1990: host-to-host workgroup traffic • Post-1990: router-to-router traffic • Low-period router-to-router traffic consists mostly of machine-generated packets, which tend to form a smoother arrival stream than low-period host-to-host traffic
H: Measuring “Burstiness” • Intuitive explanation using the M/G/∞ model: as the service-time tail parameter α → 1, service times become more variable and bursts are easier to generate, increasing H (a simulation sketch follows) • Wrong ways to measure the “burstiness” of a self-similar process: peak-to-mean ratio; coefficient of variation (for interarrival times)
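A toy discrete-time M/G/∞ simulation along these lines (my sketch: Poisson session arrivals per slot, Pareto-type heavy-tailed session durations with tail index alpha, counting active sessions per slot). The known asymptotic relation H = (3 − α)/2 for 1 < α < 2 suggests α = 1.4 should give H near 0.8, versus H near 0.5 for a light tail; the variance-time estimate below is rough but shows the direction:

```python
import numpy as np

def mg_inf_counts(n_slots, rate, alpha, rng):
    """Active-session counts for a discrete-time M/G/inf queue with heavy-tailed holding times."""
    counts = np.zeros(n_slots)
    for t in range(n_slots):
        for _ in range(rng.poisson(rate)):            # new sessions in this slot
            d = int(np.ceil(rng.pareto(alpha) + 1))   # heavy-tailed duration (slots)
            counts[t:t + d] += 1                      # numpy clips the slice at the end
    return counts[n_slots // 10:]                     # drop warm-up transient

def variance_time_H(x, block_sizes):
    var = [x[:(len(x)//m)*m].reshape(-1, m).mean(axis=1).var() for m in block_sizes]
    return 1.0 + np.polyfit(np.log(block_sizes), np.log(var), 1)[0] / 2.0

rng = np.random.default_rng(5)
for alpha in (3.0, 1.4):
    x = mg_inf_counts(30000, 5.0, alpha, rng)
    print(f"alpha={alpha}: variance-time H ~ {variance_time_H(x, [1, 4, 16, 64, 256]):.2f}")
```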
Summary • Ethernet LAN traffic is statistically self-similar • H : the degree of self-similarity • H : a function of utilization • H : a measure of “burstiness” • Models like Poisson are not able to capture self-similarity
Discussion • How can self-similarity be explained? Heavy-tailed file sizes • How does it impact existing performance evaluation? Limited effectiveness of buffering; effectiveness of FEC • How can systems adapt to self-similarity? Prediction; adaptive FEC