
Network Traffic Modeling

Network Traffic Modeling. Punit Shah (pshah@cse.ogi.edu), CSE581 Internet Technologies, OGI, OHSU, March 6, 2002. Papers: Generating Representative Web Workloads for Network and Server Performance Evaluation. Paul Barford, Mark Crovella. Comp Sci Department, Boston University.


Presentation Transcript


  1. Network Traffic Modeling Punit Shah (pshah@cse.ogi.edu) CSE581 Internet Technologies OGI, OHSU 2002, March 6

  2. Papers
     • Generating Representative Web Workloads for Network and Server Performance Evaluation
        • Paul Barford, Mark Crovella. Comp Sci Department, Boston University.
     • Self-Similarity in World Wide Web Traffic: Evidence and Possible Causes
        • Mark Crovella, Azer Bestavros. Comp Sci Department, Boston University.
     • On the Self-Similar Nature of Ethernet Traffic
        • Will Leland et al. IEEE members. Funded by Boston University.
     CSE581, Winter 2002 | Punit Shah (pshah@cse.ogi.edu)

  3. Traffic modeling
     • Understand the nature of network traffic
     • Establish a traffic pattern
     • Characteristics and metrics vary by network stack layer

  4. Why model traffic?
     • Understand the behavior of servers, networks, etc. under workload conditions.
     • Capacity management
     • Infrastructure planning
     • Performance improvement
     • Design of software and services
     • Testing and validation
     • Developing simulators (workload generators), e.g. ns (CMU), SURGE, SPECweb96, and many commercially available simulators.

  5. Model Parameters
     • Application layer (HTTP)
        • server file size distribution
        • request size distribution (file size + protocol headers)
        • temporal locality (caching), etc.
     • Data link layer (Ethernet)
        • packets per second
        • mean time between two consecutive packets
        • bandwidth utilization
        • effect of the number of hosts, etc.

  6. Time Series Analysis Primer
     • Correlation
        • If two events show an identical (opposite) pattern under similar circumstances, they are called positively (negatively) correlated.
        • The degree of correlation lies in the range [-1, 1].
        • Correlation models.
     • Long-range dependence
        • The current event is positively correlated with events far in the future.
     • Heavy tail
        • Non-negligible probability in the tail of the distribution, e.g. a hyperbolic CDF plot. The simplest such distribution is the Pareto: $P[X > x] \sim x^{-\alpha}$, $0 < \alpha < 2$.
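
The heavy-tail bullet can be made concrete with a small sampling sketch. It assumes a Pareto distribution with tail $P[X > x] = (k/x)^{\alpha}$ for $x \ge k$; the helper names `pareto_sample` and `empirical_tail`, and the values `alpha = 1.2` and `k = 1.0`, are illustrative choices, not taken from the slides.

```python
import random

def pareto_sample(alpha, k, rng):
    """Draw one Pareto(alpha, k) value by inverse-transform sampling."""
    u = 1.0 - rng.random()            # u lies in (0, 1], so no division by zero
    return k / u ** (1.0 / alpha)     # solves u = (k / x) ** alpha for x

def empirical_tail(samples, x):
    """Fraction of samples strictly greater than x, an estimate of P[X > x]."""
    return sum(1 for s in samples if s > x) / len(samples)

rng = random.Random(1)                # fixed seed so the run is repeatable
alpha, k = 1.2, 1.0                   # hypothetical shape and scale
samples = [pareto_sample(alpha, k, rng) for _ in range(100_000)]

# Compare the empirical tail with the theoretical power law (k / x) ** alpha.
for x in (10.0, 100.0):
    print(x, empirical_tail(samples, x), (k / x) ** alpha)
```

Even far from the bulk of the distribution the tail probability stays non-negligible, which is the property the slide refers to.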

  7. Self-Similarity
     Term introduced by Mandelbrot in 1965.
     Let $X = (X_t : t = 0, 1, 2, \ldots)$ be a time series with mean $\mu$, variance $\sigma^2$, and autocorrelation function $r(k)$ such that $r(k) \sim k^{-\beta}$ as $k \to \infty$, $0 < \beta < 1$.
     For each $m = 1, 2, 3, \ldots$, $X^{(m)} = (X^{(m)}_k : k = 1, 2, 3, \ldots)$ is a new time series obtained by averaging the original series over non-overlapping blocks of size $m$, where $X^{(m)}_k = (X_{km-m+1} + \cdots + X_{km})/m$, and whose autocorrelation function is $r^{(m)}(k)$.
     If $r^{(m)}(k) = r(k)$ for all $m$ (or $r^{(m)}(k) \to r(k)$ as $m \to \infty$), then $X$ is called exactly (asymptotically) second-order self-similar with parameter $H = 1 - \beta/2$.
     Also, since $\sum_k r(k) = \infty$, self-similarity implies long-range dependence.

  8. Self-similar
     Example series: 81 17 6 99 25 21 45 4 20 18 56 7 21 82 11 8 65 34 9 20

  9. Self-similar
     Example series: 81 17 6 99 25 21 45 4 20 18 56 7 21 82 11 8 65 34 9 20
     Block sums $\sum_{i=1}^{m} X_i$ over non-overlapping blocks of $m = 5$: 228 108 177 136
     'Self-Similarity' == Burstiness
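
The aggregation step on this slide can be reproduced in a few lines. This is a minimal sketch: the function name `aggregate` and its `average` flag are illustrative, and the block size of 5 is inferred from the four sums shown on the slide.

```python
def aggregate(series, m, average=True):
    """Split series into non-overlapping blocks of size m and reduce each block.

    With average=True this yields block means; with average=False it yields
    the raw block sums. An incomplete trailing block is dropped.
    """
    blocks = [series[i:i + m] for i in range(0, len(series) - m + 1, m)]
    if average:
        return [sum(b) / m for b in blocks]
    return [sum(b) for b in blocks]

series = [81, 17, 6, 99, 25, 21, 45, 4, 20, 18,
          56, 7, 21, 82, 11, 8, 65, 34, 9, 20]

block_sums = aggregate(series, 5, average=False)
block_means = aggregate(series, 5)
print(block_sums)   # [228, 108, 177, 136]
print(block_means)
```

For a self-similar series the aggregated values remain bursty; for an uncorrelated series they would smooth out quickly as the block size grows.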

  10. Ethernet Traffic Data Collection
     • Data collected between Aug 1989 and Feb 1992 to account for various network topologies.
     • Main traffic at the time (1994): rlogin, e-mail, NFS, local radio station audio.
     • Hosts: 140 to 1200. ~27M packets.
     • Each data collection run covered low, medium, and busy hours.
     • Timestamps accurate to 20 µs.

  11. Packets/unit time (empirical)

  12. Packets/unit time (synthetic)

  13. Statistical tests for self-similarity
     • Variance-time plot
        • $\log \mathrm{Var}(X^{(m)})$ is plotted against $\log m$; a straight line with slope $-\beta > -1$ indicates self-similarity, with $H = 1 - \beta/2$.
     • R/S plot (rescaled adjusted range statistic)
        • The R/S statistic grows according to a power law with exponent $H$ as $n \to \infty$, i.e. like $n^H$.
     • Periodogram
        • Slope of the power spectrum of the series.
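
The variance-time test can be sketched as follows. This is only an illustration under stated assumptions: the function names, the block sizes, and the use of an ordinary least-squares fit on base-10 logarithms are choices made here, not details from the slides. The input is an uncorrelated series, for which the slope should be close to -1 and H close to 0.5; a self-similar trace would give a shallower slope and H above 0.5.

```python
import math
import random

def block_means(series, m):
    """Means of non-overlapping blocks of size m (trailing remainder dropped)."""
    return [sum(series[i:i + m]) / m for i in range(0, len(series) - m + 1, m)]

def variance(values):
    """Population variance of a list of numbers."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

def variance_time_hurst(series, block_sizes):
    """Estimate H from the slope of log Var(X^(m)) versus log m."""
    xs = [math.log10(m) for m in block_sizes]
    ys = [math.log10(variance(block_means(series, m))) for m in block_sizes]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Ordinary least-squares slope of ys against xs.
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    beta = -slope
    return 1.0 - beta / 2.0

rng = random.Random(0)
iid = [rng.random() for _ in range(20_000)]   # uncorrelated reference series
h_iid = variance_time_hurst(iid, [1, 2, 4, 8, 16, 32, 64, 128])
print(h_iid)
```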

  14. Ethernet Variance-Time Plot
     • As m increases, the variance decreases slowly.
     • If the traffic were not self-similar, the curve would cross the threshold line.

  15. Ethernet Traffic Analysis
     • Ethernet traffic is self-similar.
     • Contrary to common belief, the degree of self-similarity (burstiness) increases during busy times.
     • Well over 50% of the traffic is TCP packets, but the non-TCP packets have no apparent effect.

  16. Web Traffic Data Collection
     • Traces collected from real users accessing web documents (Nov 94 - May 95) using HTTP v0.9 and 1.0 (no parallel connections)
        • 4700 sessions
        • 591 users
        • 575,775 URL requests (46,830 unique per session)
        • 130,140 files transferred
     • Each file request is logged
        • URL
        • session, user, workstation ID
        • timestamp
        • size of document, file transfer time

  17. Trace Analysis
     Web traffic is self-similar

  18. Reasons for the self-similarity
     • Web transmission times
        • The distribution is highly variable.
        • Sizes of the available files are heavy-tailed.
        • Multimedia files are to blame (image, audio, video)
     • Quiet time
        • Active OFF and inactive OFF
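
One way to see how variable transmission times and quiet times combine into bursty load is to superpose a few ON/OFF sources. The sketch below assumes, purely for illustration, Pareto-distributed durations for both the ON (transfer) and OFF (quiet) periods; the function names, the shape parameters 1.2 and 1.5, and the source count are hypothetical and not taken from the papers.

```python
import random

def pareto_duration(alpha, rng):
    """Heavy-tailed duration in whole time slots (at least 1), Pareto with k = 1."""
    u = 1.0 - rng.random()                    # u lies in (0, 1]
    return max(1, round(1.0 / u ** (1.0 / alpha)))

def on_off_source(n_slots, alpha_on, alpha_off, rng):
    """Per-slot activity (1 = transmitting, 0 = quiet) of one ON/OFF source."""
    activity = []
    on = True
    while len(activity) < n_slots:
        duration = pareto_duration(alpha_on if on else alpha_off, rng)
        duration = min(duration, n_slots - len(activity))   # never overshoot
        activity.extend([1 if on else 0] * duration)
        on = not on
    return activity

rng = random.Random(7)
n_sources, n_slots = 20, 5_000
sources = [on_off_source(n_slots, 1.2, 1.5, rng) for _ in range(n_sources)]

# Aggregate load: number of sources active in each time slot.
load = [sum(slot) for slot in zip(*sources)]
print(max(load), sum(load) / len(load))
```

Feeding `load` into a variance-time estimate is a natural next step for checking whether the aggregate looks self-similar.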

  19. Quiet Times

  20. Quiet Time Distribution

  21. Generating Web Workload: SURGE
     • User Equivalent (UE)
        • Synthesized behavior should emulate the users
        • Multi-threaded program. HTTP v1.0. No parallel connections
     • Distribution models
        • File sizes
        • Request sizes
           • File size + protocol headers
           • Zero, if already cached
        • Popularity
           • Zipf's law: if files are ordered by decreasing popularity, then the number of references to a file is inversely proportional to its rank, $P \propto 1/r$.
           • Empirical data shows that popular web documents are extremely popular and the others receive only a few hits
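
Zipf's law translates directly into a request generator. The following is a minimal sketch, not SURGE's actual implementation: the function name `zipf_probabilities`, the catalog of 1000 files, and the request count are illustrative assumptions.

```python
import random

def zipf_probabilities(n_files):
    """Request probability for ranks 1..n_files, proportional to 1 / rank."""
    weights = [1.0 / rank for rank in range(1, n_files + 1)]
    total = sum(weights)
    return [w / total for w in weights]

n_files = 1000                                  # hypothetical catalog size
probs = zipf_probabilities(n_files)

rng = random.Random(42)
# Each request is the rank of the file being asked for (1 = most popular).
requests = rng.choices(range(1, n_files + 1), weights=probs, k=10_000)

top10_share = sum(1 for r in requests if r <= 10) / len(requests)
print(top10_share)
```

The share of requests that go to the ten most popular files illustrates the skew mentioned in the last bullet.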

  22. Model Parameters (contd.)
     • Embedded object count
        • Determines a quiet time, specifically 'active OFF'
     • Temporal locality (caching)
        • Probability that the same object will be requested again
        • Effect on network access
        • Stack distance
     • OFF times
        • Important parameter: self-similarity is lost if OFF times are ignored
     Matching problem: assign a popularity to each file, given the distribution of file sizes and the empirical request size (count?) distribution
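
Stack distance, listed above as the measure of temporal locality, can be computed from a request trace with an LRU stack. This sketch assumes one common convention (1-based depth from the top of the stack, `None` for a first reference); the function name and the sample trace are illustrative.

```python
def stack_distances(requests):
    """LRU stack distance of each request.

    The stack keeps the most recently requested item at index 0. A repeat
    request's distance is its 1-based depth in the stack before it is moved
    to the top; a first-time request gets None.
    """
    stack = []
    distances = []
    for item in requests:
        if item in stack:
            depth = stack.index(item) + 1
            stack.remove(item)
        else:
            depth = None
        stack.insert(0, item)          # the item becomes most recently used
        distances.append(depth)
    return distances

trace = ["a", "b", "a", "c", "b"]
print(stack_distances(trace))          # [None, None, 2, None, 3]
```

Small distances mean an object is re-requested soon after its last use, which is the locality that caching exploits.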

  23. SURGE Approach
     Use a different (well-known) model for each model parameter

  24. SURGE Validation
     • Compared with SPECweb96 (specbench.org)
     • Number of HTTP requests per second (h)
     • Number of threads (t); h/t requests per thread
     • Packets/sec as baseline
     • Tests for 70, 300, and 500 packets/sec

  25. Results
     • Roughly similar numbers of TCP packets and requests in a 30-minute run
     • Mean number of active TCP connections is 0.028 for SPECweb96 versus 13.9 for SURGE, with a very high variance of 3.92 (0.18), indicating self-similarity
     • Server CPU utilization and active TCP connections are considerably higher with SURGE than with SPECweb96

  26. Active TCP Connections
     (Figure: SPECweb96 vs. SURGE at 70, 300, and 500 PPS)

  27. CPU Utilization
     (Figure: SPECweb96 vs. SURGE at 70, 300, and 500 PPS)

  28. Self-Similarity
     (Figure: SPECweb96 vs. SURGE at 70, 300, and 500 PPS)

  29. Conclusion
     • Self-similarity (burstiness) is an integral part of network traffic behavior.
     • The degree of self-similarity increases with the load.
     • Server and network load under self-similar traffic is radically different from what non-self-similar models predict.
     • The nature of the congestion produced by self-similar traffic is drastically different from that produced by non-self-similar traffic.
