
CS 268: Lecture 14 Internet Measurements

CS 268: Lecture 14 Internet Measurements. Scott Shenker and Ion Stoica Computer Science Division Department of Electrical Engineering and Computer Sciences University of California, Berkeley Berkeley, CA 94720-1776. Project Presentations. Five slides Five minutes Five questions


Presentation Transcript


  1. CS 268: Lecture 14 Internet Measurements • Scott Shenker and Ion Stoica • Computer Science Division, Department of Electrical Engineering and Computer Sciences • University of California, Berkeley • Berkeley, CA 94720-1776

  2. Project Presentations • Five slides • Five minutes • Five questions • You’re outta there!

  3. 1st slide: title • 2nd slide: motivations and problem formulation • Why is the problem important? • What is challenging/hard about your problem? • 3rd slide: main idea of your solution • 4th slide: status • 5th slide: future plans and schedule

  4. Poll • How many people know: • Lamport clocks • Timestamp vectors • Byzantine agreement • Epidemic/gossip dissemination algorithms • Consistency • Bayou

  5. Internet Measurements • This field went from casual sideline to serious science with Vern Paxson’s thesis in 1997 • Now there is a massive literature on Internet measurements • Two dedicated conferences (PAM and IMC) in addition to standard networking conferences

  6. Today’s Lecture • Provide a quick overview of the kinds of techniques used and questions addressed • Tom Anderson (UW) will lead us in a design exercise

  7. Philosophical Question • Why do we take Internet measurements?

  8. Measurement Techniques • Passive: • Traces • Netflow data • Network telescopes • BGP looking glass sites • …… • Active: • Ping • Traceroutes • ……

  9. Measurement Infrastructure • NIMI • PlanetLab • Telescopes (CAIDA) • Traceroute servers • …..

  10. Varieties of Measurement Studies • Basic characterization from direct measurements • Raw statistics • Model fitting • Just because it is “direct” doesn’t mean it is easy! • Look at the care taken by Paxson in the two readings • Deep inference • Estimating something that can’t be directly measured

  11. Examples of Direct Studies • Packet sizes and protocol type • Flow sizes • Route stability • DNS usage • Stationarity • …….

  12. A Few Results • Most flows are small, but most bytes are in large flows • Most flows are slow, but most bytes are in fast flows • Rate and size are highly correlated (not due to slow start) • Losses are reasonably modeled as Poisson arrival of loss events followed by a geometric loss train • Internet routing does not yield good paths…. • In 2003: • Bytes: TCP 83%, UDP 16% • Packets: TCP 75%, UDP 22% • Flows: TCP 56%, UDP 33%

  13. Traffic Variances • A Bellcore team discovered that Ethernet traffic had strange behavior • No matter what time scale they averaged it over, the variances were still large! • How could that be?

  14. Self Similarity, LRD, and All That…. • Correlation: r(t) = <x(s+t) x(s)> - <x(s)>^2 • Short-range correlation: Sum r(t) finite • Long-range correlation (LRD): Sum r(t) infinite • Example of LRD: r(t) ~ t^(-0.5)

  15. Aggregates and Variance • Consider the average of M iid random variables: • Variance of the average ~ 1/M • Consider a process with short-range correlations: • On large-enough time scales, intervals are iid • Variance should decay as 1/M • Only with LRD can the variance decay more slowly • Often as a power law M^(-p) with p < 1
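
The 1/M decay for iid samples can be checked numerically. The sketch below is illustrative only and not from the lecture: it averages iid uniform samples over blocks of size M and reports the variance of the block means. The function names, the seed, and the choice of a uniform distribution are assumptions made for the example.

```python
import random
import statistics


def variance_of_block_means(samples, block_size):
    """Variance of the means of consecutive, non-overlapping blocks."""
    n_blocks = len(samples) // block_size
    means = [
        statistics.fmean(samples[i * block_size:(i + 1) * block_size])
        for i in range(n_blocks)
    ]
    return statistics.pvariance(means)


def demo(seed=0, n=200_000, block_sizes=(1, 10, 100)):
    """Return {M: variance of block means} for iid uniform(0, 1) samples."""
    rng = random.Random(seed)
    # iid uniform(0, 1) samples; each has variance 1/12.
    samples = [rng.random() for _ in range(n)]
    return {m: variance_of_block_means(samples, m) for m in block_sizes}


if __name__ == "__main__":
    for m, var in demo().items():
        print(f"M={m:>4}  variance of block means={var:.6f}")
```

For iid input, each tenfold increase in M should cut the variance by roughly a factor of ten. A long-range-dependent trace run through the same function would show a visibly slower decay.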

  16. LRD Comes from Many Sources? • Poisson arrivals with long-tailed sizes are one mechanism • Most files are small • Most bytes are in large files • Pareto distributions are very common!

  17. Zipf and Pareto • Zipf: F(rank) ~ rank^(-b) • F is frequency, size, etc. • 1 < b • Pareto: P(size) ~ size^(-a) • 1 < a < 2 • Log(size) has geometric distribution! • Pareto distributed interarrival times means logs are uncorrelated • a = 1 + 1/b and b = 1/(a-1) • The tail of the distribution can’t be ignored!
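
The relation b = 1/(a-1) can be illustrated with synthetic data. The sketch below is an assumption-laden example, not the lecture's method: it reads P(size) ~ size^(-a) as a density (so the tail P(size > x) falls as x^(-(a-1))), samples sizes by inverse transform, and fits the slope of log(size) against log(rank). The function names, the rank window used for the fit, and the seed are all hypothetical choices.

```python
import math
import random


def sample_pareto_sizes(n, a, x_min=1.0, seed=0):
    """Draw n sizes whose density falls off as size^(-a), with a > 1."""
    rng = random.Random(seed)
    tail = a - 1.0  # exponent of the tail P(size > x) ~ x^(-(a-1))
    # 1 - random() lies in (0, 1], so the power below never divides by zero.
    return [x_min * (1.0 - rng.random()) ** (-1.0 / tail) for _ in range(n)]


def rank_size_exponent(sizes, first_rank=10, last_rank=None):
    """Least-squares slope of log(size) vs log(rank); returns b = -slope."""
    ordered = sorted(sizes, reverse=True)
    if last_rank is None:
        last_rank = len(ordered) // 10
    ranks = range(first_rank, last_rank + 1)
    xs = [math.log(r) for r in ranks]
    ys = [math.log(ordered[r - 1]) for r in ranks]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return -cov / var


if __name__ == "__main__":
    a = 1.5
    b = rank_size_exponent(sample_pareto_sizes(100_000, a))
    print(f"fitted b = {b:.2f}, predicted 1/(a-1) = {1 / (a - 1):.2f}")
```

The very largest ranks are skipped in the fit because the top few order statistics are noisy. A straight-line fit on a log-log plot is a rough estimator, so the result should only be expected to land near the predicted value.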

  18. Examples of Zipf/Pareto • Income distribution • Asteroid sizes • Letter frequencies • File sizes • Telnet packet interarrivals • Web access frequencies (impact on caching) • AS connectivity

  19. Examples of Inference Tools • Bandwidth estimation • Critical path analysis • Network tomography • Flow rate causes • …..

  20. Bandwidth Estimation (I) • Send a pair of packets back-to-back • If no intermediate packets intervene, their spacing is a function of the bandwidth of the bottleneck link • Send a series of packet-pairs, measure the minimal delays • Doesn’t work well: often overestimates capacity! • Can do better with many packets back-to-back
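
The packet-pair arithmetic can be sketched as follows. This is a minimal illustration under stated assumptions, not a real measurement tool: it assumes both packets have the same size, that the receiver records their arrival spacing in seconds, and that "minimal delays" means taking the smallest observed spacing. The function names are hypothetical.

```python
def packet_pair_capacity(packet_size_bytes, spacing_s):
    """Bottleneck capacity (bits/s) implied by one packet pair.

    If the pair leaves the bottleneck back-to-back, the second packet
    trails the first by the time the bottleneck needs to transmit it,
    so capacity = packet size / spacing.
    """
    if spacing_s <= 0:
        raise ValueError("spacing must be positive")
    return packet_size_bytes * 8 / spacing_s


def estimate_from_pairs(packet_size_bytes, spacings_s):
    """Estimate capacity from several pairs using the smallest spacing.

    Assumption: larger spacings are treated as pairs that were split by
    cross traffic. Spacings compressed after the bottleneck also end up
    small, which is one way this approach can overestimate capacity.
    """
    return packet_pair_capacity(packet_size_bytes, min(spacings_s))


if __name__ == "__main__":
    spacings = [0.0015, 0.0012, 0.0020]  # seconds, made-up values
    print(f"{estimate_from_pairs(1500, spacings) / 1e6:.1f} Mbit/s")
```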

  21. Bandwidth Estimation (II) • Send variable-sized packets, k hops (using TTL) • Delay at link i: D_i = A_i + P/C_i • A_i is the size-independent delay • P is the packet size • C_i is the capacity of link i • Model delay of k hops: Sum (i = 1 to k) D_i
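
One way to use this delay model is sketched below. The inference step is an assumption that goes beyond the slide: since the delay to hop k is linear in P with slope Sum (i = 1 to k) 1/C_i, fitting a line per hop and differencing successive slopes gives 1/C_k. The function names and the synthetic path are hypothetical, and the sketch assumes the input delays have already been filtered (for example, by taking the minimum per packet size) so that queueing noise is negligible.

```python
def fit_slope(xs, ys):
    """Ordinary least-squares slope of ys against xs."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var


def link_capacities(sizes_bits, delays_by_hop):
    """Per-link capacities (bits/s) from delay-vs-size data for each hop.

    delays_by_hop[k][j] is the delay in seconds to hop k+1 for a packet
    of sizes_bits[j] bits. Noise that makes successive slopes
    non-increasing is not handled in this sketch.
    """
    capacities = []
    previous_slope = 0.0
    for delays in delays_by_hop:
        slope = fit_slope(sizes_bits, delays)  # Sum of 1/C_i up to this hop
        capacities.append(1.0 / (slope - previous_slope))
        previous_slope = slope
    return capacities


if __name__ == "__main__":
    # Synthetic two-hop path: 10 Mbit/s then 1 Mbit/s (made-up values).
    true_caps = [10e6, 1e6]
    fixed = [0.001, 0.005]  # A_i, size-independent delay per link
    sizes = [400 * 8, 800 * 8, 1200 * 8]
    delays = [
        [sum(fixed[i] + p / true_caps[i] for i in range(k + 1)) for p in sizes]
        for k in range(len(true_caps))
    ]
    print(link_capacities(sizes, delays))
```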

  22. Possible Causes of Flow Rates • Application: doesn’t generate data fast enough • Opportunity: never leaves slow-start • Receiver: limited by receive window • Sender: limited by sending buffer • Bandwidth: using full bottleneck bandwidth • Congestion: flow is responding to packet loss • Transport: none of the above…

  23. Differentiating Causes • Take trace of flow • Estimate RTT • Look at window epochs • Identify where TCP is in cycle
