Explore the concept of self-similarity in Ethernet traffic behavior, implications for network modeling, and statistical properties of self-similar processes. Learn about Hurst parameter, long-range dependence, and methods to estimate self-similarity.
On the Self-Similar Nature of Ethernet Traffic • Will E. Leland, Walter Willinger and Daniel V. Wilson (Bellcore) • Murad S. Taqqu (BU)
Overview • What is Self Similarity? • Ethernet Traffic is Self-Similar • Source of Self Similarity • Implications of Self Similarity
Intuition of Self-Similarity • Something “feels the same” regardless of scale
Stochastic Objects • In the case of stochastic objects like time series, self-similarity is used in the distributional sense
Why is Self-Similarity Important? • Recently, network packet traffic has been identified as being self-similar. • Current network traffic modeling using Poisson distributions (etc.) does not take into account the self-similar nature of traffic. • This leads to inaccurate modeling of network traffic.
Problems with Current Models • A Poisson process • When observed on a fine time scale will appear bursty • When aggregated on a coarse time scale will flatten (smooth) to white noise • A Self-Similar (fractal) process • When aggregated over wide range of time scales will maintain its bursty characteristic
Consequences of Self-Similarity • Traffic has similar statistical properties at a range of timescales: ms, secs, mins, hrs, days • Merging of traffic (as in a statistical multiplexer) does not result in smoothing of traffic • [Figure: aggregation of bursty data streams yields bursty aggregate streams]
Definitions and Properties • Long-range Dependence • autocorrelation decays slowly • Hurst Parameter • Developed by Harold Hurst (1965) • H is a measure of “burstiness” • also considered a measure of self-similarity • 0 < H < 1 • H increases as traffic increases
Definitions and Properties (Cont’d) • low, medium, and high traffic hours • as traffic increases, the Hurst parameter increases • i.e., traffic becomes more self-similar
Properties of Self Similarity • X = (X_t : t = 0, 1, 2, …) is a covariance stationary random process (i.e. Cov(X_t, X_{t+k}) does not depend on t for all k) • Let X^{(m)} = {X_k^{(m)}} denote the new process obtained by averaging the original series X in non-overlapping sub-blocks of size m. • Mean μ, variance σ² • Suppose that the autocorrelation function r(k) ~ k^{-β} as k → ∞, 0 < β < 1 • E.g. X^{(1)} = 4, 12, 34, 2, -6, 18, 21, 35. Then X^{(2)} = 8, 18, 6, 28 and X^{(4)} = 13, 17
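The block-averaging step that defines the aggregated process can be sketched in a few lines of Python. This is only an illustration: the helper name `aggregate` is an assumption, and dropping a trailing partial block is a choice made here, not something the slides specify.

```python
def aggregate(series, m):
    """Average `series` over non-overlapping blocks of size m (the X^(m) process)."""
    if m < 1:
        raise ValueError("block size m must be at least 1")
    n_blocks = len(series) // m  # assumption: a trailing partial block is dropped
    return [sum(series[i * m:(i + 1) * m]) / m for i in range(n_blocks)]


# The example series from the slide.
x = [4, 12, 34, 2, -6, 18, 21, 35]
print(aggregate(x, 2))  # [8.0, 18.0, 6.0, 28.0]
print(aggregate(x, 4))  # [13.0, 17.0]
```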
Auto-correlation Definition • X is exactly second-order self-similar if • The aggregated processes have the same autocorrelation structure as X, i.e. • r^{(m)}(k) = r(k), k ≥ 0, for all m = 1, 2, … • X is [asymptotically] second-order self-similar if the above holds when [r^{(m)}(k) → r(k), m → ∞] • Most striking feature of self-similarity: Correlation structures of the aggregated process do not degenerate as m → ∞
Traditional Models • This is in contrast to traditional models • Correlation structures of their aggregated processes degenerate as m → ∞, i.e. r^{(m)}(k) → 0 as m → ∞, for k = 1, 2, 3, ... • Example: • Poisson Distribution • Self-Similar Distribution
Long Range Dependence • Processes with Long Range Dependence are characterized by an autocorrelation function that decays hyperbolically as k increases, i.e. r(k) ~ k^{-β} with 0 < β < 1 • Important Property: Σ_k r(k) = ∞ • This is also called non-summability of correlation
Intuition • Short-range processes: • Exponential decay of autocorrelations, i.e.: • r(k) ~ p^k as k → ∞, 0 < p < 1 • Summation is finite • The intuition behind long-range dependence: • While high-lag correlations are all individually small, their cumulative effect is important • Gives rise to features drastically different from conventional short-range dependent processes
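To make the contrast concrete, the sketch below compares partial sums of an exponentially decaying autocorrelation with those of a hyperbolically decaying one. The values p = 0.9 and β = 0.5 are arbitrary assumptions chosen for illustration, and the function names are hypothetical.

```python
def partial_sum(r, max_lag):
    """Sum the autocorrelation r(k) over lags k = 1 .. max_lag."""
    return sum(r(k) for k in range(1, max_lag + 1))


p = 0.9      # assumed short-range decay rate, 0 < p < 1
beta = 0.5   # assumed long-range decay exponent, 0 < beta < 1


def short_range(k):
    return p ** k        # exponential decay


def long_range(k):
    return k ** -beta    # hyperbolic decay


for max_lag in (10, 100, 1000, 10000):
    print(max_lag,
          round(partial_sum(short_range, max_lag), 3),
          round(partial_sum(long_range, max_lag), 3))
```

The short-range column settles near p / (1 - p) = 9, while the long-range column keeps growing with the number of lags included.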
The Measure of Self-Similarity • Hurst Parameter H , 0.5 < H < 1 • Three approaches to estimate H (Based on properties of self-similar processes) • Variance Analysis of aggregated processes • Analysis of Rescaled Range (R/S) statistic for different block sizes • A Whittle Estimator
Variance Analysis • Variance of aggregated processes decays as: • Var(X^{(m)}) ~ a m^{-β} as m → ∞, 0 < β < 1 • For short range dependent processes (e.g. Poisson Process), • Var(X^{(m)}) ~ a m^{-1} as m → ∞ • Plot Var(X^{(m)}) against m on a log-log plot • Slope > -1 indicative of self-similarity
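A minimal sketch of the variance-time analysis follows, using only the standard library. It assumes the standard relation H = 1 - β/2, so a fitted log-log slope of -β gives H = 1 + slope/2. The helper names, the block sizes, and the use of independent Gaussian noise as test input are all assumptions for illustration; real traffic traces would replace `noise`.

```python
import math
import random


def aggregate(series, m):
    """Average `series` over non-overlapping blocks of size m."""
    n_blocks = len(series) // m
    return [sum(series[i * m:(i + 1) * m]) / m for i in range(n_blocks)]


def variance(values):
    """Population variance (divide by the number of values)."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def fit_slope(xs, ys):
    """Ordinary least-squares slope of ys against xs."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


def variance_time_estimate(series, block_sizes):
    """Return (slope, H) from the log-log plot of Var(X^(m)) against m."""
    log_m = [math.log10(m) for m in block_sizes]
    log_var = [math.log10(variance(aggregate(series, m))) for m in block_sizes]
    slope = fit_slope(log_m, log_var)
    return slope, 1 + slope / 2


# Independent noise is short-range dependent, so the slope should be near -1.
rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(20000)]
slope, h = variance_time_estimate(noise, [1, 2, 4, 8, 16, 32, 64, 128])
print("slope:", round(slope, 2), "H:", round(h, 2))
```

A slope clearly above -1 on the same plot would be the sign of self-similarity described above.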
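The R/S statistic • For a given set of observations X_1, …, X_n with sample mean X̄(n) and sample standard deviation S(n), the Rescaled Adjusted Range or R/S statistic is given by • R(n)/S(n) = [max(0, W_1, …, W_n) - min(0, W_1, …, W_n)] / S(n) • where W_k = (X_1 + X_2 + … + X_k) - k X̄(n), k = 1, 2, …, n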
Example • Xk = 14,1,3,5,10,3 • Mean = 36/6 = 6 • W1 =14-(1*6 )=8 • W2 =15-(2*6 )=3 • W3 =18-(3*6 )=0 • W4 =23-(4*6 )=-1 • W5 =33-(5*6 )=3 • W6 =36-(6*6 )=0 R/S = 1/S*[8-(-1)] = 9/S
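The worked example can be checked with a short Python sketch. The function name `rescaled_range` is hypothetical, and normalizing S by n rather than n - 1 is an assumption, since the slide leaves S unevaluated.

```python
import math


def rescaled_range(x):
    """Return (W, R, R/S) for the observations in x."""
    n = len(x)
    mean = sum(x) / n
    w = []
    running = 0.0
    for k, value in enumerate(x, start=1):
        running += value
        w.append(running - k * mean)      # W_k = (X_1 + ... + X_k) - k * mean
    r = max(0.0, *w) - min(0.0, *w)       # adjusted range
    # Assumption: S is the standard deviation with divisor n.
    s = math.sqrt(sum((v - mean) ** 2 for v in x) / n)
    return w, r, r / s


w, r, rs = rescaled_range([14, 1, 3, 5, 10, 3])
print(w)   # [8.0, 3.0, 0.0, -1.0, 3.0, 0.0]
print(r)   # 9.0
print(rs)  # about 1.98 under the divisor-n assumption
```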
The Hurst Effect • For self-similar data, the rescaled range or R/S statistic grows according to c n^H • H = Hurst Parameter, > 0.5 • For short-range processes, • R/S statistic ~ d n^{0.5} • History: The Nile river • In the 1940s-50s, Harold Edwin Hurst studies the 800-year record of flooding along the Nile river. • (yearly minimum water level) • Finds long-range dependence.
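The growth law R/S ~ c n^H suggests estimating H as the slope of log(R/S) against log(n). The sketch below is a simplified variant that averages R/S over non-overlapping blocks of each size before fitting; the function names, block sizes, and the Gaussian test input are assumptions for illustration, not the procedure used in the paper.

```python
import math
import random


def rs_statistic(x):
    """R/S statistic of one block of observations."""
    n = len(x)
    mean = sum(x) / n
    w, running = [], 0.0
    for k, value in enumerate(x, start=1):
        running += value
        w.append(running - k * mean)
    r = max(0.0, *w) - min(0.0, *w)
    s = math.sqrt(sum((v - mean) ** 2 for v in x) / n)  # divisor n (assumption)
    return r / s


def rs_hurst(series, block_sizes):
    """Fit log10(mean R/S) against log10(n); the slope estimates H."""
    log_n, log_rs = [], []
    for n in block_sizes:
        blocks = [series[i:i + n] for i in range(0, len(series) - n + 1, n)]
        mean_rs = sum(rs_statistic(b) for b in blocks) / len(blocks)
        log_n.append(math.log10(n))
        log_rs.append(math.log10(mean_rs))
    mx = sum(log_n) / len(log_n)
    my = sum(log_rs) / len(log_rs)
    sxy = sum((a - mx) * (b - my) for a, b in zip(log_n, log_rs))
    sxx = sum((a - mx) ** 2 for a in log_n)
    return sxy / sxx


# Independent noise has no long-range dependence, so H should be near 0.5.
rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(16384)]
h = rs_hurst(noise, [16, 32, 64, 128, 256, 512, 1024])
print("estimated H:", round(h, 2))
```

For independent noise the estimate lands in the neighborhood of 0.5, typically a little above it because R/S is biased for small blocks. Long-range dependent data would give a noticeably larger slope.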
Whittle Estimator • Provides a confidence interval • Property: Any long range dependent process approaches fractional Gaussian noise (FGN) when aggregated to a certain level • Test the aggregated observations to ensure that they have converged to the normal distribution
Recap • Self-similarity manifests itself in several equivalent fashions: • Non-degenerate autocorrelations • Slowly decaying variance • Long range dependence • Hurst effect
The Famous Data • Leland and Wilson collected hundreds of millions of Ethernet packets without loss and with recorded time-stamps accurate to within 100µs. • Data collected from several Ethernet LANs at the Bellcore Morristown Research and Engineering Center at different times over the course of approximately 4 years.
Plots Showing Self-Similarity (Ⅰ) • [Figure: reference lines for H = 1 and H = 0.5; estimated H ≈ 0.8]
Plots Showing Self-Similarity (Ⅱ) • High Traffic: 5.0%-30.7% • Mid Traffic: 3.4%-18.4% • Low Traffic: 1.3%-10.4% • Higher Traffic, Higher H
H : A Function of Network Utilization • Observation shows “contrary to Poisson” • Network utilization ↑ ⇒ H ↑ • As we shall see shortly, H measures traffic burstiness • As the number of Ethernet users increases, the resulting aggregate traffic becomes burstier instead of smoother
Difference in low traffic H values • Pre-1990: host-to-host workgroup traffic • Post-1990: router-to-router traffic • Low period router-to-router traffic consists mostly of machine-generated packets • These tend to form a smoother arrival stream than low period host-to-host traffic
Summary • Ethernet LAN traffic is statistically self-similar • H : the degree of self-similarity • H : a function of utilization • H : a measure of “burstiness” • Models like Poisson are not able to capture self-similarity
Discussions • How to explain self-similarity? • Heavy tailed file sizes • How would this impact existing performance? • Limited effectiveness of buffering • Effectiveness of FEC
Introduction • The superposition of many ON/OFF sources (also known as packet train models) whose ON-periods and OFF-periods exhibit the Noah Effect produces aggregate network traffic that features the Joseph Effect. • Noah Effect: high variability or infinite variance • Joseph Effect: self-similar or long-range dependent traffic
The Noah Effect • Noah Effect is the essential point of departure from traditional to self-similar traffic modeling • Results in highly variable ON-OFF periods: Train length and inter-train distances can be very large with non-negligible probabilities • Infinite Variance Syndrome: Many naturally occurring phenomena can be well described with infinite variance distributions • Heavy-tail distributions, parameter α
Existing Models • Traditional traffic models: finite variance ON/OFF source models • Superposition of such sources behaves like white noise, with only short range correlations
Idealized ON/OFF Model • Lengths of ON- and OFF-periods are iid positive random variables, U_k • Suppose that U has a hyperbolic tail distribution, • P(U > u) ~ c u^{-α} as u → ∞, 1 < α < 2 (1) • Property (1) is the infinite variance syndrome or the Noah Effect. • α < 2 implies E(U^2) = ∞ • α > 1 ensures that E(U) < ∞, and that S0 is not infinite
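One concrete distribution with such a hyperbolic tail is the Pareto distribution with P(U > u) = u^{-α} for u ≥ 1 (that is, c = 1). The sketch below draws period lengths from it by inverse-CDF sampling and compares the empirical tail with u^{-α}. The choice of Pareto, the value α = 1.5, and the function names are assumptions for illustration; the slides only require the tail behavior in (1).

```python
import random


def sample_hyperbolic_tail(alpha, rng):
    """Draw U with P(U > u) = u**(-alpha) for u >= 1 (Pareto with c = 1)."""
    v = rng.random()                      # uniform on [0, 1)
    return (1.0 - v) ** (-1.0 / alpha)    # inverse of the tail function


def tail_fraction(samples, u):
    """Empirical estimate of P(U > u)."""
    return sum(1 for s in samples if s > u) / len(samples)


alpha = 1.5  # assumed tail index, 1 < alpha < 2: finite mean, infinite variance
rng = random.Random(1)
samples = [sample_hyperbolic_tail(alpha, rng) for _ in range(100000)]

for u in (10.0, 100.0):
    print(u, tail_fraction(samples, u), u ** -alpha)
```

Because the variance is infinite for α < 2, the sample variance of `samples` does not settle down as more draws are added, which is the infinite variance syndrome in practice.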
Explaining Self-Similarity • Consider a set of processes which are either ON or OFF • The distributions of ON and OFF times are heavy tailed, with tail indices α1 and α2 • The aggregation of these processes leads to a self-similar process • H = (3 - min(α1, α2))/2 • So, how do we get heavy tailed ON or OFF times?
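The relation between the tail indices and H is easy to tabulate. In the sketch below the function name is hypothetical, and restricting the smaller tail index to the range 1 to 2 is an assumption carried over from the idealized ON/OFF model.

```python
def hurst_from_tail_indices(alpha_on, alpha_off):
    """H = (3 - min(alpha_on, alpha_off)) / 2 for heavy-tailed ON/OFF periods."""
    alpha = min(alpha_on, alpha_off)
    # Assumption: the heavier tail has finite mean and infinite variance.
    if not 1.0 < alpha < 2.0:
        raise ValueError("the smaller tail index must lie strictly between 1 and 2")
    return (3.0 - alpha) / 2.0


for alpha_on, alpha_off in [(1.2, 1.8), (1.5, 1.5), (1.9, 1.5)]:
    print(alpha_on, alpha_off, hurst_from_tail_indices(alpha_on, alpha_off))
```

The heavier of the two tails (the smaller index) determines H, and H moves toward 1 as that index approaches 1.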