Explore the self-similar nature of Ethernet traffic through rigorous statistical analysis based on high-quality data measurements. Learn about self-similar processes and implications for network engineering. Dive into autocorrelation functions and long-range dependence.
On the Self-Similar Nature of Ethernet Traffic Will E. Leland, Murad S. Taqqu, et al. IEEE/ACM Transactions on Networking, Vol. 2, No. 1, Feb. 1994 Presented by Shaun Chang
Outline • Introduction • Traffic Measurements • Self-Similar Stochastic Processes • Ethernet Traffic Is Self-Similar • Engineering for Self-Similar Network Traffic • Discussion
Introduction • Leland and Wilson collected hundreds of millions of Ethernet packets without loss • with recorded timestamps accurate to within 100 μs • between August 1989 and February 1992 • on several Ethernet LANs at the Bellcore Morristown Research and Engineering Center.
Introduction (cont’d) • The main objective of this paper is • to establish the self-similarity characteristic • of the very high quality, high time-resolution Ethernet LAN traffic measurements presented in [14] • in a statistically rigorous manner.
“Self-similar” • “Self-similar” processes were brought to the attention of statisticians by Mandelbrot and his co-workers, mainly through applications in hydrology and geophysics ([21]-[23]).
The Traffic Monitor • Wilson built the monitoring system to collect the data. • For each packet, the monitor records • a timestamp accurate to within 100 μs (20 μs in the later monitor version), • the packet length, • the status of the Ethernet interface, • and the first 60 bytes of data (header information).
The Network Environment in Bellcore • A research or software development environment. • Workstations are the primary machines. • Four sets of traffic measurements, each representing between 20 and 40 consecutive hours of Ethernet traffic. (August 1989, October 1989, January 1990, February 1992.)
Workgroup Network Traffic Data • Host-to-host traffic
Workgroup and External Traffic • Router-to-router traffic
Self-Similarity in Traffic Measurement (I) • [Figure: traffic measurements]
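Definitions of Self-Similarity • $X = \{X_t : t = 0, 1, 2, \ldots\}$ is a covariance stationary process (i.e., $\mathrm{Cov}(X_t, X_{t+\tau})$ does not depend on $t$ for all $\tau$) • Mean $\mu$, variance $\sigma^2$, autocorrelation function $r(k)$ • Suppose that $r(k) \sim k^{-\beta}$, $0 < \beta < 1$, as $k \to \infty$ • $X^{(m)} = \{X_k^{(m)}\}$ is the aggregated process whose elements are averages of $X$ over non-overlapping blocks of size $m$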
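The aggregated process defined above can be sketched in a few lines of Python. This is a minimal illustration, not code from the paper: the function name `aggregate` and the choice to drop a trailing partial block are assumptions made here.

```python
def aggregate(x, m):
    """Return X^(m): the means of consecutive non-overlapping blocks of size m.

    Assumption: a trailing block with fewer than m samples is dropped.
    """
    n_blocks = len(x) // m
    return [sum(x[i * m:(i + 1) * m]) / m for i in range(n_blocks)]


series = [1, 3, 2, 6, 4, 8, 5]
print(aggregate(series, 2))  # [2.0, 4.0, 6.0]; the final sample 5 is dropped
```

With m = 1 the aggregated process is the original series, so the definitions that follow can be read as statements about how the series behaves as m grows.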
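Autocorrelation function • The simple linear correlation between a time series and its own past: the sequence of values $x_t$ is correlated with the values $x_{t+\tau}$ that occur $\tau$ time units later, where the time shift $\tau$ is called the lag. The autocorrelation function is this autocorrelation expressed as a function of the lag.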
Definitions of Self-Similarity • X is [exactly] second-order self-similar with Hurst parameter $H = 1 - \beta/2$ if, for all $m = 1, 2, 3, \ldots$, the aggregated process $X^{(m)}$ satisfies • $\mathrm{Var}(X^{(m)}) = \sigma^2 m^{-\beta}$, and • $r^{(m)}(k) = r(k)$, $k \ge 0$ • X is [asymptotically] second-order self-similar with Hurst parameter $H = 1 - \beta/2$ if, for all $k$ large enough, • $r^{(m)}(k) \to r(k)$ as $m \to \infty$
Properties of Self-Similarity • $\mathrm{Var}(X^{(m)})$ ($= \sigma^2 m^{-\beta}$) decreases more slowly than $m^{-1}$ • $r(k)$ decreases hyperbolically (not exponentially), so that $\sum_k r(k) = \infty$ (long-range dependence) • The spectral density [discrete-time Fourier transform of $r(k)$] satisfies $f(\lambda) \sim c\,\lambda^{-(1-\beta)}$ as $\lambda \to 0$, i.e., $f(\cdot)$ obeys a power law near the origin.
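A short worked example may help make these properties concrete. The value $\beta = 0.4$ and the block size $m = 100$ below are assumed purely for illustration and are not taken from the paper.

```latex
\begin{align*}
\sum_{k=1}^{K} k^{-\beta}
  &\ge \int_{1}^{K+1} x^{-\beta}\,dx
   = \frac{(K+1)^{1-\beta} - 1}{1-\beta} \to \infty
   \quad (K \to \infty,\ 0 < \beta < 1), \\
H &= 1 - \beta/2 = 1 - 0.4/2 = 0.8, \\
\mathrm{Var}\bigl(X^{(100)}\bigr)
  &= \sigma^2 \cdot 100^{-0.4} \approx 0.158\,\sigma^2
   \quad \text{versus} \quad
   \sigma^2 \cdot 100^{-1} = 0.01\,\sigma^2 \ \text{for uncorrelated data.}
\end{align*}
```

The first line shows why hyperbolic decay with $0 < \beta < 1$ makes the autocorrelations non-summable; the last line shows how much more slowly the variance of the block means shrinks than it would for uncorrelated data.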
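Discrete-Time Fourier Transform • In continuous-time systems the input signals are mostly analog, but in many cases the input to a system is discrete, for example experimental data measured only at fixed time intervals, or image data quantized only at fixed spatial intervals. In such cases the original Fourier transform must be suitably modified. Moving from the continuous-time Fourier transform to the discrete-time Fourier transform (DTFT), the definition is: $X(e^{j\omega}) = \sum_{n=-\infty}^{\infty} x[n]\, e^{-j\omega n}$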
Slowly Decaying Variance • The variance of the sample decreases more slowly than the reciprocal of the sample size • For most processes, the variance of a sample diminishes quite rapidly as the sample size is increased, and stabilizes soon • For self-similar processes, the variance decreases very slowly, even when the sample size grows quite large
Variance-Time Plot • [Figure: log-log plot of variance against block size m] • Slope = -1 for most processes • Slope flatter than -1 for a self-similar process
Long Range Dependence • Autocorrelation is a statistical measure of the relationship, if any, between a random variable and itself, at different time lags • Positive correlation: big observation usually followed by another big, or small by small • Negative correlation: big observation usually followed by small, or small by big • No correlation: observations unrelated
Long Range Dependence • Autocorrelation coefficient can range between +1 (very high positive correlation) and -1 (very high negative correlation) • Zero means no correlation • Autocorrelation function shows the value of the autocorrelation coefficient for different time lags k
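The autocorrelation coefficient at a given lag can be estimated from data as sketched below. The function name `autocorrelation` and the normalization (lag-k covariance sum divided by the lag-0 sum, a common biased estimator) are choices made for this illustration; the series is assumed to be non-constant so the denominator is non-zero.

```python
def autocorrelation(x, k):
    """Sample autocorrelation coefficient of the series x at lag k >= 0."""
    n = len(x)
    mean = sum(x) / n
    # Lag-0 sum of squared deviations (assumed non-zero).
    denom = sum((v - mean) ** 2 for v in x)
    # Sum of products of deviations that are k steps apart.
    numer = sum((x[t] - mean) * (x[t + k] - mean) for t in range(n - k))
    return numer / denom


alternating = [1, -1] * 50  # big followed by small: negative correlation
print(autocorrelation(alternating, 0))  # 1.0
print(autocorrelation(alternating, 1))  # -0.99
```

Evaluating the function over a range of lags k gives the sample autocorrelation function described above.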
Long Range Dependence • For most processes (e.g., Poisson, or compound Poisson), the autocorrelation function drops to zero very quickly (usually immediately, or exponentially fast) • For self-similar processes, the autocorrelation function drops very slowly (i.e., hyperbolically) toward zero, but may never reach zero
Non-Degenerate Autocorrelations • For self-similar processes, the autocorrelation function for the aggregated process is indistinguishable from that of the original process • If autocorrelation coefficients match for all lags k, then called exactly self-similar • If autocorrelation coefficients match only for large lags k, then called asymptotically self-similar
Autocorrelation Function • [Figure: autocorrelation coefficient (from -1 to +1) against lag k (from 0 to 100), comparing a typical long-range dependent process with a typical short-range dependent process]
The Hurst Effect • For almost all naturally occurring time series, the rescaled adjusted range statistic (also called the R/S statistic) for sample size n obeys the relationship $E[R(n)/S(n)] \sim c\,n^{H}$ as $n \to \infty$, where: • $R(n) = \max(0, W_1, \ldots, W_n) - \min(0, W_1, \ldots, W_n)$ • $S^2(n)$ is the sample variance and $\bar{X}(n)$ is the sample mean • $W_k = \sum_{i=1}^{k} X_i - k\,\bar{X}(n)$ for $k = 1, 2, \ldots, n$
The Hurst Effect • For models with only short range dependence, H is almost always 0.5 • For self-similar processes, 0.5 < H < 1.0 • This discrepancy is called the Hurst Effect, and H is called the Hurst parameter • Single parameter to characterize self-similar processes
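The R/S statistic defined on the preceding slide can be computed directly from its definition. The sketch below is illustrative only: the function name `rescaled_range` is hypothetical, and the sample variance is assumed to use the divisor n.

```python
import math


def rescaled_range(x):
    """Return R(n)/S(n) for the series x, following the definition of W_k."""
    n = len(x)
    mean = sum(x) / n
    # W_k = (X_1 + ... + X_k) - k * mean, accumulated as a running sum.
    w = []
    running = 0.0
    for v in x:
        running += v - mean
        w.append(running)
    r = max(0.0, *w) - min(0.0, *w)
    # Assumption: sample standard deviation with divisor n (must be non-zero).
    s = math.sqrt(sum((v - mean) ** 2 for v in x) / n)
    return r / s


print(rescaled_range([1, 2, 3, 4]))
```

For [1, 2, 3, 4] the partial sums are W = (-1.5, -2, -1.5, 0), so R = 2 and S = sqrt(1.25). To estimate H, one would compute this statistic for many block lengths n and fit the slope of log(R/S) against log(n).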
Formal Modeling of Self-Similarity • Fractional Gaussian noise (FGN) [22] • Gaussian process with mean $\mu$, variance $\sigma^2$, and • autocorrelation function $r(k) = \tfrac{1}{2}\left(|k+1|^{2H} - 2|k|^{2H} + |k-1|^{2H}\right)$, $k > 0$ • Exactly second-order self-similar with $0.5 < H < 1$ • Fractional ARIMA(p,d,q) [3] • Asymptotically second-order self-similar with $H = d + 0.5$, where $0 < d < 0.5$
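The FGN autocorrelation function is easy to evaluate numerically. The sketch below uses the standard form r(k) = ½(|k+1|^2H − 2|k|^2H + |k−1|^2H); the function name `fgn_autocorrelation` is chosen here for illustration.

```python
def fgn_autocorrelation(k, hurst):
    """Autocorrelation of fractional Gaussian noise at lag k for a given H."""
    k = abs(k)
    e = 2 * hurst
    return 0.5 * ((k + 1) ** e - 2 * k ** e + abs(k - 1) ** e)


for lag in (1, 10, 100):
    # H = 0.5 gives uncorrelated noise; H = 0.8 decays slowly with the lag.
    print(lag, fgn_autocorrelation(lag, 0.5), fgn_autocorrelation(lag, 0.8))
```

With H = 0.5 the function is zero at every non-zero lag, which matches the short-range dependent case. For 0.5 < H < 1 the values stay positive and decay roughly like k^(2H-2), the hyperbolic decay associated with long-range dependence.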
A Construction of a Self-Similar Process [19], [28] • Aggregating many simple renewal reward processes whose inter-renewal times have infinite variance. • A sequence of i.i.d. integer-valued random variables $U_0, U_1, U_2, U_3, \ldots$ (inter-renewal times) with a heavy tail, i.e., with the property • $P(U > u) \sim u^{-\alpha} h(u)$ as $u \to \infty$, $1 < \alpha < 2$, where $h(u)$ is slowly varying at infinity • Renewal process definition: a counting process $N(t)$ whose inter-renewal times $\{U_1, U_2, \ldots\}$ are i.i.d. random variables • Fractional Brownian motion [21], [22]
Inference for Self-Similar Processes • Time-domain analysis based on the R/S statistic • Plot $\log(R(n)/S(n))$ versus $\log(n)$ • Variance analysis based on the aggregated process $X^{(m)}$ • Recall that $\mathrm{Var}(X^{(m)}) = \sigma^2 m^{-\beta}$; plot $\log \mathrm{Var}(X^{(m)})$ against $\log m$
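The variance-based method can be sketched as a least-squares fit on the log-log variance-time points. Since the slope of the fitted line estimates -β and H = 1 - β/2, the estimate is H = 1 + slope/2. All function names, the block sizes, and the synthetic white-noise input are assumptions made for this illustration; they are not the paper's data or code.

```python
import math
import random


def aggregate(x, m):
    """Means of consecutive non-overlapping blocks of size m."""
    n_blocks = len(x) // m
    return [sum(x[i * m:(i + 1) * m]) / m for i in range(n_blocks)]


def variance(x):
    mean = sum(x) / len(x)
    return sum((v - mean) ** 2 for v in x) / len(x)


def hurst_variance_time(x, block_sizes):
    """Estimate H from the slope of log Var(X^(m)) against log m."""
    log_m = [math.log(m) for m in block_sizes]
    log_var = [math.log(variance(aggregate(x, m))) for m in block_sizes]
    mean_x = sum(log_m) / len(log_m)
    mean_y = sum(log_var) / len(log_var)
    slope = (sum((a - mean_x) * (b - mean_y) for a, b in zip(log_m, log_var))
             / sum((a - mean_x) ** 2 for a in log_m))
    return 1 + slope / 2  # slope estimates -beta, and H = 1 - beta/2


# Independent Gaussian noise has no long-range dependence, so the
# estimate should come out close to 0.5.
rng = random.Random(42)
noise = [rng.gauss(0.0, 1.0) for _ in range(20000)]
h_noise = hurst_variance_time(noise, [1, 2, 4, 8, 16, 32, 64, 128])
print(round(h_noise, 2))
```

A series with long-range dependence would produce a slope flatter than -1 and therefore an estimate above 0.5.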
Inference for Self-Similar Processes • Frequency-domain analysis (periodogram-based) • Estimate the power spectral density $f(\lambda)$ using the discrete-time Fourier transform • Recall that $f(\lambda) \sim c\,\lambda^{-(1-\beta)}$ as $\lambda \to 0$; plot $\log f(\lambda)$ against $\log \lambda$ • Provides confidence intervals when Whittle’s MLE approach is combined with the aggregation method
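A rough sketch of the graphical periodogram approach follows. It computes the periodogram with a direct O(n²) DFT and fits a line to the lowest frequencies on a log-log scale; because the slope estimates -(1 - β) = 1 - 2H, the estimate is H = (1 - slope)/2. This is not Whittle's estimator, and the function names, the 1/(2πn) normalization, and the 10% low-frequency cutoff are assumptions made here.

```python
import cmath
import math


def periodogram(x):
    """Return (frequency, power) pairs at the Fourier frequencies 2*pi*j/n."""
    n = len(x)
    mean = sum(x) / n
    points = []
    for j in range(1, n // 2 + 1):
        lam = 2 * math.pi * j / n
        dft = sum((x[t] - mean) * cmath.exp(-1j * lam * t) for t in range(n))
        points.append((lam, abs(dft) ** 2 / (2 * math.pi * n)))
    return points


def hurst_from_periodogram(x, fraction=0.1):
    """Fit log power against log frequency over the lowest frequencies."""
    points = periodogram(x)
    count = max(2, int(len(points) * fraction))
    xs = [math.log(lam) for lam, _ in points[:count]]
    ys = [math.log(power) for _, power in points[:count]]
    mean_x = sum(xs) / count
    mean_y = sum(ys) / count
    slope = (sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
             / sum((a - mean_x) ** 2 for a in xs))
    return (1 - slope) / 2  # slope estimates 1 - 2H


# Sanity check of the periodogram: a pure cosine with 5 cycles in
# 64 samples concentrates its power at Fourier index j = 5.
wave = [math.cos(2 * math.pi * 5 * t / 64) for t in range(64)]
powers = [power for _, power in periodogram(wave)]
print(powers.index(max(powers)) + 1)  # 5
```

Individual periodogram ordinates are very noisy, so an estimate from this simple fit should be treated as a visual aid rather than a precise value.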
Graphical Methods for Checking the Self-Similarity Property (Aug89.MB) • [Figure: graphical estimation plots with reference lines for H = 0.5 and H = 1] • Estimated H ≈ 0.8
Plots Showing Self-Similarity (II) • High Traffic: 5.0%-30.7% • Mid Traffic: 3.4%-18.4% • Low Traffic: 1.3%-10.4% • Higher traffic, higher H
On the Nature of Traffic Generated by Individual Ethernet Hosts • A simple renewal reward process is an adequate traffic source model for an individual Ethernet user. • When aggregating the traffic of many such source models, the resulting superposition process is a fractional Brownian motion with self-similarity parameter $H = (3 - \alpha)/2$. • $P(U > u) \sim u^{-\alpha} h(u)$ as $u \to \infty$, $1 < \alpha < 2$
On Measuring “Burstiness” • Observation shows “contrary to Poisson” • H measures traffic burstiness • As number of Ethernet users increases, the resulting aggregate traffic becomes burstier instead of smoother
On Measuring “Burstiness” • As $\alpha \to 1$, • service time is more variable, • bursts are easier to generate, • H is higher! • $H = (3 - \alpha)/2$, and $\alpha$ characterizes the “thickness” of the tail of the inter-renewal time distribution. • Wrong ways to measure the “burstiness” of a self-similar process: • Peak-to-mean ratio • Coefficient of variation (for interarrival times)
On Generating Synthetic Traces of Self-Similar Traffic • Discrete-time M/G/∞ input model • Service time X given by a heavy-tailed distribution • Example: Pareto distribution, $P(X > k) \sim k^{-\alpha}$, $1 < \alpha < 2$ • $N = \{N_t, t = 1, 2, \ldots\}$ is self-similar with $H = (3 - \alpha)/2$, where $N_t$ denotes the number of customers in service at time t
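The M/G/∞ construction can be sketched as a small simulation. This is a minimal illustration under stated assumptions, not the generator used in the paper: arrivals per time slot are drawn from a Poisson distribution (sampled with Knuth's multiplication method), service times come from `random.paretovariate` rounded up to whole slots, and the system starts empty, so the first part of the trace is a warm-up transient. The function and parameter names are hypothetical.

```python
import math
import random


def poisson(rng, lam):
    """Draw a Poisson(lam) variate using Knuth's multiplication method."""
    limit = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


def mg_infinity_trace(length, arrival_rate, alpha, seed=0):
    """Return N_t, the number of customers in service in each time slot."""
    rng = random.Random(seed)
    counts = [0] * length
    for t in range(length):
        for _ in range(poisson(rng, arrival_rate)):
            # Heavy-tailed service time: P(X > k) = k**(-alpha) for k >= 1.
            duration = math.ceil(rng.paretovariate(alpha))
            # The customer occupies slots t .. t + duration - 1,
            # truncated at the end of the trace.
            for s in range(t, min(length, t + duration)):
                counts[s] += 1
    return counts


trace = mg_infinity_trace(length=500, arrival_rate=2.0, alpha=1.5, seed=1)
print(trace[:20])
```

With α = 1.5, the relation H = (3 - α)/2 from the slide gives H = 0.75 for the limiting process. A finite trace that starts from an empty system only approximates that behavior, so in practice one would discard an initial segment before estimating H.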
Summary • Ethernet LAN traffic is statistically self-similar • H : the degree of self-similarity • H : a function of utilization • H : a measure of “burstiness” • Models like Poisson are not able to capture self-similarity