On the Self-Similar Nature of Ethernet Traffic . Presented by: Feng Yan. Will E. Leland, Walter Willinger and Daniel V. Wilson BELLCORE Murad S. Taqqu Boston University. CS634 Advanced Computer Networking Computer Science College of William and Mary. Overview.
On the Self-Similar Nature of Ethernet Traffic • Presented by: Feng Yan • Will E. Leland, Walter Willinger and Daniel V. Wilson, BELLCORE • Murad S. Taqqu, Boston University • CS634 Advanced Computer Networking • Computer Science • College of William and Mary
Overview • What is Self Similarity? • Ethernet Traffic is Self-Similar • Implications of Self Similarity • Conclusion • Discussion
Intuition of Self-Similarity • Something “feels the same” regardless of scale What is that???
Intuition of Self-Similarity • Something “feels the same” regardless of scale Self-similar in nature
Intuition of Self-Similarity • Something “feels the same” regardless of scale The Koch snowflake fractal
Intuition of Self-Similarity • Categories: • Exact self-similarity: Strongest Type • Approximate self-similarity: Loose Form • Statistical self-similarity: Weakest Type
Intuition of Self-Similarity • Statistical self-similarity: only numerical or statistical measures are preserved across scales • Approximate self-similarity: recognisably similar across scales but not exactly so, e.g. the Mandelbrot set
Stochastic Objects • For stochastic objects, e.g. time series, self-similarity is used in the distributional sense: viewed at different scales, the object's distribution remains unchanged
Why Is Self-Similarity Important? • Recently, network packet traffic has been identified as being self-similar. • Current network traffic modeling using Poisson distributions (etc.) does not take into account the self-similar nature of traffic. • This leads to inaccurate modeling of network traffic.
Problems with Current Models • A Poisson process • When observed on a fine time scale, it appears bursty • When aggregated on a coarse time scale, it flattens (smooths) to white noise • A Self-Similar (fractal) process • When aggregated over a wide range of time scales, it maintains its bursty characteristic
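The flattening of Poisson traffic under aggregation can be checked numerically. The sketch below is illustrative only: the arrival rate, the number of slots, and the helper names (`poisson_counts`, `aggregate`, `coeff_of_variation`) are assumptions, not taken from the presentation. It measures burstiness as the coefficient of variation of the block-averaged packet counts.

```python
import random
import statistics

def poisson_counts(rate, n_slots, rng):
    """Packet counts per unit-time slot for Poisson arrivals (hypothetical helper)."""
    counts = []
    for _ in range(n_slots):
        # Count how many exponential inter-arrival gaps fit inside one unit slot.
        t, k = rng.expovariate(rate), 0
        while t < 1.0:
            k += 1
            t += rng.expovariate(rate)
        counts.append(k)
    return counts

def aggregate(series, m):
    """Average the series over non-overlapping blocks of size m."""
    n_blocks = len(series) // m
    return [sum(series[i * m:(i + 1) * m]) / m for i in range(n_blocks)]

def coeff_of_variation(series):
    """Standard deviation relative to the mean, used here as a burstiness proxy."""
    return statistics.pstdev(series) / statistics.mean(series)

rng = random.Random(0)  # fixed seed so the run is repeatable
counts = poisson_counts(rate=5.0, n_slots=20000, rng=rng)
cv_by_scale = {m: coeff_of_variation(aggregate(counts, m)) for m in (1, 10, 100)}
for m, cv in cv_by_scale.items():
    print(f"block size {m:>3}: coefficient of variation = {cv:.3f}")
```

For Poisson counts the coefficient of variation should shrink roughly like the inverse square root of the block size, which is the smoothing the slide describes; a self-similar trace would shrink much more slowly.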
Self-Similarity by picture [Figure: Ethernet traffic, August ’89 trace; packets per time unit]
Consequences of Self-Similarity • Reality (self-similar): Bursty Data Streams → Aggregation → Bursty Aggregate Streams • Current Model: Bursty Data Streams → Aggregation → Smooth Pattern Streams • Consequence: Inaccuracy
Mathematical Definitions • Long-range Dependence • autocorrelation decays slowly • Hurst Parameter • Developed by Harold Hurst (1965) • H is a measure of “burstiness” • also considered a measure of self-similarity • 0 < H < 1 • H increases as traffic increases • i.e., traffic becomes more self-similar
Properties of Self Similarity • $X = (X_t : t = 0, 1, 2, \dots)$ is a covariance stationary random process (i.e. $\mathrm{Cov}(X_t, X_{t+k})$ does not depend on t for all k) • Let $X^{(m)} = \{X_k^{(m)}\}$ denote the new process obtained by averaging the original series X in non-overlapping sub-blocks of size m • Mean $\mu$, variance $\sigma^2$ • Suppose that the autocorrelation function satisfies $r(k) \sim k^{-\beta}$ as $k \to \infty$, $0 < \beta < 1$ • e.g. $X^{(1)}$ = 4, 12, 34, 2, -6, 18, 21, 35; then $X^{(2)}$ = 8, 18, 6, 28 and $X^{(4)}$ = 13, 17
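The block-averaging step that defines the aggregated process can be sketched in a few lines. The function name `aggregate` is an illustrative choice; the input series is the example from the slide.

```python
def aggregate(series, m):
    """Average the series over non-overlapping blocks of size m.

    Trailing samples that do not fill a whole block are dropped.
    """
    n_blocks = len(series) // m
    return [sum(series[i * m:(i + 1) * m]) / m for i in range(n_blocks)]

x1 = [4, 12, 34, 2, -6, 18, 21, 35]  # the slide's example series, X^(1)
x2 = aggregate(x1, 2)  # block size 2
x4 = aggregate(x1, 4)  # block size 4
print(x2)
print(x4)
```

Running it reproduces the aggregated series listed on the slide for block sizes 2 and 4.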
Definition by Auto-correlation • X is exactly second-order self-similar if the aggregated processes have the same autocorrelation structure as X, i.e. $r^{(m)}(k) = r(k)$, $k \ge 0$, for all m = 1, 2, … • X is asymptotically second-order self-similar if the above holds in the limit: $r^{(m)}(k) \to r(k)$ as $m \to \infty$ • Most striking feature of self-similarity: correlation structures of the aggregated processes do not degenerate as $m \to \infty$
Definition by Auto-correlation [Figure: autocorrelation function (ACF) plotted against lag]
Traditional Models • Correlation structures of their aggregated processes degenerate as $m \to \infty$, i.e. $r^{(m)}(k) \to 0$ as $m \to \infty$, for k = 1, 2, 3, ... • Short Range Dependence Processes: • Exponential decay of autocorrelations, i.e. $r(k) \sim p^k$ as $k \to \infty$, $0 < p < 1$ • Summation is finite: $\sum_k r(k) < \infty$
Long Range Dependence • Processes with Long Range Dependence are characterized by an autocorrelation function that decays hyperbolically as k increases: $r(k) \sim k^{-\beta}$, $0 < \beta < 1$ • Important Property: $\sum_k r(k) = \infty$ • This is also called non-summability of correlation
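The difference between summable and non-summable correlations shows up directly in partial sums. The sketch below compares an exponentially decaying autocorrelation with a hyperbolically decaying one; the values p = 0.9 and β = 0.5 are arbitrary illustrative choices, and the function names are hypothetical.

```python
def partial_sum_exponential(p, n):
    """Sum of p**k for k = 1..n (short-range dependence)."""
    return sum(p ** k for k in range(1, n + 1))

def partial_sum_hyperbolic(beta, n):
    """Sum of k**(-beta) for k = 1..n (long-range dependence, 0 < beta < 1)."""
    return sum(k ** -beta for k in range(1, n + 1))

for n in (100, 10_000):
    short = partial_sum_exponential(0.9, n)
    long_ = partial_sum_hyperbolic(0.5, n)
    print(f"n = {n:>6}: exponential sum = {short:.4f}, hyperbolic sum = {long_:.2f}")
```

The exponential sum settles at p / (1 - p), while the hyperbolic sum keeps growing with n, even though each individual high-lag term is small.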
Intuition • The intuition behind long-range dependence: • While high-lag correlations are all individually small, their cumulative effect is important • Gives rise to features drastically different from conventional short-range dependent processes
The Measure of Self-Similarity ! • Hurst Parameter H , 0.5 < H < 1 • Three approaches to estimate H (Based on properties of self-similar processes) • Variance Analysis of aggregated processes • Rescaled Range (R/S) Analysis for different block sizes: time domain analysis • Periodogram Analysis: frequency domain analysis (Whittle Estimator)
Variance Analysis • Variance of aggregated processes decays as: • $\mathrm{Var}(X^{(m)}) \sim a\,m^{-\beta}$ as $m \to \infty$, with $0 < \beta < 1$ • For short range dependent processes (e.g. Poisson Process): • $\mathrm{Var}(X^{(m)}) \sim a\,m^{-1}$ as $m \to \infty$ • Plot $\mathrm{Var}(X^{(m)})$ against m on a log-log plot • Slope > -1 indicative of self-similarity
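A minimal sketch of the variance-time method follows: aggregate at several block sizes, take the variance of each aggregated series, and fit a straight line on log-log axes. The block sizes, the white-noise test input, and the function names are assumptions made for illustration. The slope estimates -β, and the standard relation H = 1 - β/2 turns it into a Hurst estimate.

```python
import math
import random
import statistics

def aggregate(series, m):
    """Average the series over non-overlapping blocks of size m."""
    n_blocks = len(series) // m
    return [sum(series[i * m:(i + 1) * m]) / m for i in range(n_blocks)]

def variance_time_slope(series, block_sizes):
    """Least-squares slope of log10 Var(X^(m)) against log10 m."""
    xs = [math.log10(m) for m in block_sizes]
    ys = [math.log10(statistics.pvariance(aggregate(series, m))) for m in block_sizes]
    x_bar, y_bar = sum(xs) / len(xs), sum(ys) / len(ys)
    num = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    den = sum((x - x_bar) ** 2 for x in xs)
    return num / den

# Independent Gaussian noise is short-range dependent, so the slope should be near -1.
rng = random.Random(1)
white_noise = [rng.gauss(0.0, 1.0) for _ in range(20000)]
slope = variance_time_slope(white_noise, [1, 2, 4, 8, 16, 32, 64])
hurst = 1 + slope / 2  # H = 1 - beta/2 with beta = -slope
print(f"slope = {slope:.3f}, estimated H = {hurst:.3f}")
```

On a self-similar trace the fitted slope would sit clearly above -1, giving H above 0.5.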
Variance Plot Example [Figure: log-log variance-time plot with lines labeled Slope = -0.7 and Slope = -1]
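The R/S statistic • For a given set of observations $X_1, \dots, X_n$ with sample mean $\bar{X}(n)$ and sample standard deviation $S(n)$, the rescaled adjusted range or R/S statistic is given by • $R(n)/S(n) = \frac{1}{S(n)} \left[ \max(0, W_1, \dots, W_n) - \min(0, W_1, \dots, W_n) \right]$ • where $W_k = (X_1 + X_2 + \dots + X_k) - k\,\bar{X}(n)$, k = 1, 2, …, n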
Example • Xk = 14,1,3,5,10,3 • Mean = 36/6 = 6 • W1 =14-(1*6 )=8 • W2 =15-(2*6 )=3 • W3 =18-(3*6 )=0 • W4 =23-(4*6 )=-1 • W5 =33-(5*6 )=3 • W6 =36-(6*6 )=0 R/S = 1/S*[8-(-1)] = 9/S
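The worked example above can be reproduced with a short function. This is a sketch: the name `rescaled_range` is illustrative, and using the population standard deviation (dividing by n) for S is an assumption, since the slide leaves S unevaluated.

```python
import statistics

def rescaled_range(xs):
    """Return (R, S, R/S) for one block of observations.

    W_k is the k-th partial sum minus k times the sample mean;
    R is max(0, W_1..W_n) - min(0, W_1..W_n).
    S is taken as the population standard deviation (an assumption).
    """
    n = len(xs)
    mean = sum(xs) / n
    running, w = 0.0, []
    for k, x in enumerate(xs, start=1):
        running += x
        w.append(running - k * mean)
    r = max(0.0, *w) - min(0.0, *w)
    s = statistics.pstdev(xs)
    return r, s, r / s

r, s, rs = rescaled_range([14, 1, 3, 5, 10, 3])  # the slide's example data
print(f"R = {r}, S = {s:.4f}, R/S = {rs:.4f}")
```

The adjusted range R matches the slide's value of 9; the ratio then depends on which convention is chosen for S.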
The Hurst Effect • For self-similar data, the rescaled range or R/S statistic grows according to $c\,n^H$ • H = Hurst parameter, > 0.5 • For short-range processes, the R/S statistic grows as $d\,n^{0.5}$ • History: The Nile river • In the 1940s-50s, Harold Edwin Hurst studied the 800-year record of flooding along the Nile river (yearly minimum water level) • He found long-range dependence.
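The growth law for R/S suggests a simple estimator: compute R/S on blocks of several sizes n and fit the slope of log(R/S) against log(n). The sketch below is a simplified version that averages R/S per block size rather than plotting every block as a pox plot does; the block sizes, the white-noise input, and the function names are illustrative assumptions.

```python
import math
import random
import statistics

def rescaled_range(xs):
    """R/S for one block, with S as the population standard deviation (an assumption)."""
    n = len(xs)
    mean = sum(xs) / n
    running, lo, hi = 0.0, 0.0, 0.0
    for k, x in enumerate(xs, start=1):
        running += x
        w = running - k * mean  # adjusted partial sum W_k
        lo, hi = min(lo, w), max(hi, w)
    return (hi - lo) / statistics.pstdev(xs)

def hurst_from_rs(series, block_sizes):
    """Slope of log10(mean R/S) against log10(n) over the given block sizes."""
    xs, ys = [], []
    for n in block_sizes:
        blocks = [series[i:i + n] for i in range(0, len(series) - n + 1, n)]
        mean_rs = statistics.mean(rescaled_range(b) for b in blocks)
        xs.append(math.log10(n))
        ys.append(math.log10(mean_rs))
    x_bar, y_bar = sum(xs) / len(xs), sum(ys) / len(ys)
    num = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    den = sum((x - x_bar) ** 2 for x in xs)
    return num / den

rng = random.Random(2)
white_noise = [rng.gauss(0.0, 1.0) for _ in range(16384)]
h_estimate = hurst_from_rs(white_noise, [16, 32, 64, 128, 256, 512, 1024])
print(f"estimated H for white noise = {h_estimate:.3f}")
```

For independent noise the estimate should land near 0.5, typically a little above it because R/S is biased upward at small block sizes. A long-range dependent series would give a clearly larger slope.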
Pox plot example [Figure: pox plot of R/S with lines labeled Slope = 1.0, Slope = 0.79 and Slope = 0.5]
Whittle Estimator • Provides a confidence interval • Property: Any long range dependent process approaches fractional Gaussian noise (FGN) when aggregated to a certain level • Test the aggregated observations to ensure that they have converged to the normal distribution
Summary • Self-similarity manifests itself in several equivalent fashions: • Non-degenerate autocorrelations • Slowly decaying variance • Long range dependence • Hurst effect
The Famous Data • Leland and Wilson collected hundreds of millions of Ethernet packets without loss and with recorded time-stamps accurate to within 100 µs. • Data collected from several Ethernet LANs at the Bellcore Morristown Research and Engineering Center at different times over the course of approximately 4 years.
Plots Showing Self-Similarity (Ⅰ) [Figure: estimation plots with reference lines labeled H = 1 and H = 0.5; estimated H ≈ 0.8]
Plots Showing Self-Similarity (Ⅱ) [Figure: packet traces for High Traffic 5.0%-30.7%, Mid Traffic 3.4%-18.4% and Low Traffic 1.3%-10.4%] • Higher Traffic, Higher H
H : A Function of Network Utilization • Observation shows “contrary to Poisson” • Network Utilization ↑ ⇒ H ↑ • As the number of Ethernet users increases, the resulting aggregate traffic becomes burstier instead of smoother
Difference in low traffic H values • Pre-1990: host-to-host workgroup traffic • Post-1990: router-to-router traffic • Low-period router-to-router traffic consists mostly of machine-generated packets • These tend to form a smoother arrival stream than low-period host-to-host traffic
Summary • Ethernet LAN traffic is statistically self-similar • H : the degree of self-similarity • H : a function of utilization • H : a measure of “burstiness” • Models like Poisson are not able to capture self-similarity
Two Effects • The superposition of many ON/OFF sources (also known as packet train models) whose ON-periods and OFF-periods exhibit the Noah Effect produces aggregate network traffic that features the Joseph Effect. • Noah Effect: high variability or infinite variance • Joseph Effect: self-similar or long-range dependent traffic
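The superposition of ON/OFF sources can be sketched as a small simulation. Everything specific here is an assumption made for illustration: period lengths are drawn from a Pareto distribution with shape 1.5 (finite mean, infinite variance, so it exhibits the Noah Effect), time is slotted, and the names `on_off_source` and `aggregate_traffic` are hypothetical.

```python
import random

def on_off_source(n_slots, alpha, rng):
    """0/1 activity per slot, alternating ON and OFF periods.

    Period lengths are heavy-tailed: Pareto with shape alpha, truncated to whole slots.
    """
    activity = []
    on = rng.random() < 0.5  # random initial state
    while len(activity) < n_slots:
        length = int(rng.paretovariate(alpha))       # at least 1 slot
        length = min(length, n_slots - len(activity))  # never overshoot the trace
        activity.extend([1 if on else 0] * length)
        on = not on
    return activity

def aggregate_traffic(n_sources, n_slots, alpha, seed=0):
    """Number of simultaneously active sources in each slot."""
    rng = random.Random(seed)
    sources = [on_off_source(n_slots, alpha, rng) for _ in range(n_sources)]
    return [sum(column) for column in zip(*sources)]

traffic = aggregate_traffic(n_sources=50, n_slots=5000, alpha=1.5)
print(traffic[:20])
```

Feeding `traffic` into a variance-time or R/S estimate is one way to check whether the aggregate shows the Joseph Effect; with a light-tailed period distribution in place of the Pareto draw, the same superposition would be expected to look short-range dependent.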
Existing Models • Traditional traffic models: finite variance ON/OFF source models • Superposition of such sources behaves like white noise, with only short range correlations
Easy Modeling: Noah Effect • Questions related to self-similarity can be reduced to practical implications of Noah Effect • Queuing and Network performance • Network Congestion Controls • Protocol Analysis
Queuing Performance • The Queue Length distribution • Traditional (Markovian) traffic: decreases exponentially fast • Self-similar traffic: decreases much more slowly • Not accounting for the Joseph Effect can lead to overly optimistic performance predictions [Figure: effect of H (burstiness)]