Self-Similarity in World Wide Web Traffic: Evidence and Possible Causes. Mark E. Crovella and Azer Bestavros, Computer Science Dept., Boston University. Presented by Kalyan Boggavarapu, CSC 497, Lehigh University.
Self-Similarity
• Definition: a self-similar object is one whose appearance is unchanged regardless of the scale at which it is viewed.
• Heavy-tailed: a distribution whose tail follows a power law.
  • E.g.: the geographical distribution of people in the world.
• World Wide Web traffic can show self-similarity.
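To make the "power law" bullet concrete, here is a minimal sketch using a Pareto distribution, whose tail is P[X > x] = (k / x)^alpha. The Pareto choice, the values alpha = 1.2 and k = 1, and the helper names are assumptions made for illustration only; they are not taken from the slides.

```python
import random

def pareto_sample(alpha, k, rng):
    """Draw one Pareto value by inverse-transform sampling.

    The tail is P[X > x] = (k / x) ** alpha for x >= k.
    """
    u = 1.0 - rng.random()  # uniform on (0, 1], avoids division by zero
    return k * u ** (-1.0 / alpha)

def tail_fraction(samples, x):
    """Empirical fraction of samples strictly larger than x."""
    return sum(1 for s in samples if s > x) / len(samples)

# Illustrative parameters (assumed, not from the slides).
rng = random.Random(1)
alpha, k = 1.2, 1.0
samples = [pareto_sample(alpha, k, rng) for _ in range(100000)]

# Compare the empirical tail with the power-law prediction.
for x in (10.0, 100.0):
    print(x, tail_fraction(samples, x), (k / x) ** alpha)
```

On a log-log scale the empirical tail falls on a roughly straight line with slope -alpha, which is the visual signature of a heavy tail.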
Data Set
• Traces from NCSA Mosaic
• January and February 1995
• Logs: URL, session, user, and workstation ID
• Experiment environment: 37 SPARCstation 2 workstations
Parameters
• Degree of self-similarity: H
• Hurst parameter H, in the range (1/2, 1)
• H -> 1 is the maximum self-similarity
• In this paper we would see
Analysis in Two Stages
• Stage 1: what is the appropriate value of H?
• Stage 2: which parameter of the traffic accounts for this value of H?
Self-Similarity for Different Time Intervals
• Step 1: estimate H for short intervals (1 second and above)
  • Using: web traffic data for a single hour
  • Plots: variance-time plot, rescaled range (R/S) plot, periodogram plot
• Step 2: estimate H when scaling to large intervals
  • Whittle estimator
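The variance-time plot in Step 1 can be sketched as follows. The series is averaged over blocks of size m, the variance of each averaged series is plotted against m on log-log axes, and the fitted slope -beta gives H = 1 - beta / 2. This is a minimal sketch, not the presenter's or the paper's code: the function names, the block sizes, and the synthetic input (independent Gaussian noise standing in for per-second byte counts) are all assumptions.

```python
import math
import random

def aggregate(series, m):
    """Average the series over non-overlapping blocks of size m."""
    n_blocks = len(series) // m
    return [sum(series[i * m:(i + 1) * m]) / m for i in range(n_blocks)]

def variance(values):
    """Population variance of a list of numbers."""
    mu = sum(values) / len(values)
    return sum((v - mu) ** 2 for v in values) / len(values)

def fit_slope(xs, ys):
    """Ordinary least-squares slope of ys against xs."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den

def hurst_variance_time(series, block_sizes):
    """Estimate H from the slope of the variance-time plot.

    For a self-similar series Var(X^(m)) ~ m ** (-beta), so H = 1 - beta / 2.
    """
    log_m = [math.log10(m) for m in block_sizes]
    log_var = [math.log10(variance(aggregate(series, m))) for m in block_sizes]
    beta = -fit_slope(log_m, log_var)
    return 1.0 - beta / 2.0

# Hypothetical input: uncorrelated noise, which should give H near 0.5.
random.seed(0)
noise = [random.gauss(0.0, 1.0) for _ in range(20000)]
h_noise = hurst_variance_time(noise, [1, 2, 4, 8, 16, 32, 64])
print(round(h_noise, 2))
```

Uncorrelated input serves as a baseline: its variance drops as 1/m, so the estimate sits near 0.5. Self-similar traffic would show a shallower slope and an H closer to 1.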
Self-Similarity Characteristic Graphs
[Figure: three plots of the traffic series; in each plot H is read from the fitted line ("Slope => H", "This line => H", "Slope => H").]
Whittle Estimator
• Estimates: H together with a confidence interval
• Based on: a time series and an assumed model
  • FGN: Fractional Gaussian Noise model
• Now check: whether the estimated H stays consistent as the time series is aggregated
• Infer: WWW traffic at stub networks is self-similar when traffic demand is high.
Expected feature: aggregation => H
• Aggregation over a long range shows the stability of the hypothesis.
• The Whittle estimator confirms our earlier calculations of H.
• [Figure: H with its 95% confidence interval, from the fully busy hour to the least busy hour.]
• H decreases as the traffic becomes less busy.
Stage 2: Which parameter is useful to estimate the value of H?
Which parameter is responsible for self-similarity?
• File requests => file transfers => unique files distribution
• Alpha = 1.2, H in the range 0.7 to 0.8
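One relation often used to connect a heavy-tail index alpha to the Hurst parameter comes from the idealized ON/OFF source model (Willinger et al.): H = (3 - alpha) / 2 for 1 < alpha < 2. Applying it to the numbers on this slide is an assumption of this sketch, and the function name is hypothetical.

```python
def hurst_from_alpha(alpha):
    """Idealized ON/OFF-model relation: H = (3 - alpha) / 2 for 1 < alpha < 2."""
    if not 1.0 < alpha < 2.0:
        raise ValueError("relation is stated for 1 < alpha < 2")
    return (3.0 - alpha) / 2.0

# Alpha from the slide; the result is the idealized prediction,
# not a measured value.
predicted_h = hurst_from_alpha(1.2)
print(round(predicted_h, 2))
```

For alpha = 1.2 the idealized prediction is higher than the 0.7 to 0.8 range shown on the slide, so the relation is best read as a qualitative link (heavier tails mean larger H) rather than an exact fit to measured traffic.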
It's the Available Files
• Available files => heavy-tailed behavior of file transfers
• Conclusion: distribution of available files => (Web traffic self-similarity = heavy-tailed distribution of file transfers)
Sources:
• Mark Crovella and Azer Bestavros, "Self-Similarity in World Wide Web Traffic: Evidence and Possible Causes," Proceedings of SIGMETRICS '96: The ACM International Conference on Measurement and Modeling of Computer Systems, 1996.