
Characterizing and Predicting TCP Throughput on the Wide Area Network


Presentation Transcript


  1. Characterizing and Predicting TCP Throughput on the Wide Area Network Dong Lu, Yi Qiao, Peter Dinda, Fabian Bustamante Department of Computer Science Northwestern University http://plab.cs.northwestern.edu

  2. Overview • Algorithm for predicting TCP throughput as a function of flow size • Minimal active probing • Dynamic probe rate adjustment • Explaining the flow size / throughput correlation • Explaining why simple active probing fails • Large-scale empirical study

  3. Outline • Why TCP throughput prediction? • Particulars of study • Flow size / TCP throughput correlation • Issues with simple benchmarking • DualPats algorithm • Stability and dynamic rate adjustment

  4. Goal A library call BW = PredictTransfer(src,dst,numbytes); Expected Time = numbytes/BW; Ideally, we want a confidence interval: (BWLow,BWHigh) = PredictTransfer(src,dst,numbytes,p);
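The desired interface can be sketched as follows. This is a minimal illustration of the call signature the slide proposes, not a real library; the model coefficients inside are placeholder values standing in for whatever the predictor has learned for the path.

```python
# Hypothetical sketch of the library call from the slide.
# The coefficients a, b are placeholders; a real implementation would
# derive them from probe measurements of the src -> dst path.

def predict_transfer(src, dst, num_bytes):
    """Return predicted TCP throughput (bytes/sec) for a transfer of
    num_bytes from src to dst."""
    a, b = 1e-8, 0.05            # placeholder model: time = a*size + b
    transfer_time = a * num_bytes + b
    return num_bytes / transfer_time

bw = predict_transfer("hostA", "hostB", 10_000_000)
expected_time = 10_000_000 / bw   # seconds, as on the slide
```

A confidence-interval variant would take an additional probability parameter `p` and return a `(bw_low, bw_high)` pair instead of a point estimate.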

  5. Available Bandwidth • Maximum rate a path can offer a flow without slowing other flows • pathchar, cprobe, nettimer, delphi, IGI, pathchirp, pathload … • Available bandwidth can differ significantly from TCP throughput • Not real time, takes at least tens of seconds to run

  6. Simple TCP Benchmarking • Benchmark paths with a single small probe • BW = ProbeSize/Time • Widely used: Network Weather Service (NWS) and others (Remos benchmarking collector) • Not accurate for large transfers on the current high-speed Internet • Numerous papers show this and attempt to fix it

  7. Fixing Simple TCP Benchmarking • Logs [Sundharshan]: correlate real transfer measurements with benchmarking measurements • Recent transfers needed • Similar size transfers needed • Measurements at application chosen times • CDF-matching [Swany]: correlate CDF of real transfer measurements with CDF of benchmarking measurements • Recent transfers still needed • Measurements at application chosen times

  8. Analysis of TCP • Extensive research on TCP throughput modeling in the networking community • Really intended to build better TCPs • Difficult to use these models online because of hard-to-measure parameters • Future loss rate and RTT • Note: we measure goodput

  9. Our Measurement Study • PlanetLab and additional machines • Located all over the world • Measurements of throughput • Wide open socket buffers (1-3 MB) • Simple ttcp-like client/server • scp • GridFTP • Four separate sets of measurements

  10. Distribution Set • For analysis of TCP throughput stability and distributions • 60 randomly chosen paths among PlanetLab machines • 1.6 million transfers (client/server) • 100 KB, 200 KB, 400 KB, … 10 MB flows • 3000 consecutive transfers per path+flow size

  11. Correlation Set • For studying correlation between throughput and flow size, initial testing of algorithm • 60 randomly chosen paths among PlanetLab machines • 2.4 million transfers, 270 thousand runs, client/server • 100 KB, 200 KB, 400 KB, … 10 MB flows • Run = sweep flow size for path

  12. Verification Set • Test algorithm • 30 randomly chosen paths among PlanetLab machines and others • 4800 transfers, 300 runs, scp and GridFTP • 5 KB to 1 GB flows • Run = sweep flow size for path

  13. Online Evaluation Set • Test online algorithm • 50 randomly chosen paths among PlanetLab machines and others • 14000 transfers, scp and GridFTP • 40 MB or 160 MB file, randomly chosen size • 10 days

  14. Strong Correlation Between TCP Throughput and Flow Size (Correlation and Verification Sets)

  15. Why Does The Correlation Exist? • Slow start and user effects [Zhang] • Extant flows • Non-negligible startup overheads • Control messages in scp and GridFTP • Residual slow start effect • SACK results in slow convergence to equilibrium

  16. Why Simple Benchmarking Fails • Need more than one probe to capture the correlation • Probes are too small

  17. Our Approach Two consecutive probes, both larger than the noise region

  18. Our Approach • Two consecutive probes are integrated into a single probe • 400 KB, 800 KB in a single 800 KB probe • [Figure: timeline from 0 to T1 (probe one) to T2 (probe two)]

  19. Our Approach • Fit transfer time as a linear function of flow size • Solve for A and B from the two probe points • Predict throughput for some other transfer size
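The two-probe fit above can be sketched directly. This assumes the linear transfer-time model implied by the slide (transfer time = A × flow size + B); the probe sizes match the 400 KB / 800 KB defaults from slide 26, while the probe times are made-up numbers for illustration.

```python
# Two probe points (size, time) determine the line time(s) = A*s + B.

def fit_dualpats(s1, t1, s2, t2):
    a = (t2 - t1) / (s2 - s1)   # slope: seconds per byte
    b = t1 - a * s1             # intercept: fixed startup cost, seconds
    return a, b

def predict_throughput(a, b, size):
    # Throughput falls out of the fitted transfer time.
    return size / (a * size + b)

# 400 KB probe took 0.6 s, 800 KB probe took 1.0 s (illustrative times)
a, b = fit_dualpats(400_000, 0.6, 800_000, 1.0)
bw = predict_throughput(a, b, 10_000_000)   # predict for a 10 MB flow
```

Because B > 0, predicted throughput grows with flow size and approaches 1/A asymptotically, which is exactly the flow size / throughput correlation the earlier slides describe.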

  20. Model Fit is Excellent • Low and normally distributed relative errors at all flow sizes (Correlation Set)

  21. Stability • How long does the TCP throughput function remain stable? • How frequently should we probe the path? • What’s the distribution of throughput around the function (i.e., the error)?

  22. Throughput is Stable For Long Periods • [Figure: interval lengths for increasing max/min throughput ratios within an interval; Correlation Set]

  23. Throughput Is Normally Distributed In An Interval (Distribution Set)

  24. Online DualPats Algorithm • Fetch probe sequence for destination • Start probing process if no data exists • Project probe sequence ahead • 20 point moving average over values with current sampling interval • Apply model using projected data • Return result • confidence interval computed using normality assumptions
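The projection step above can be sketched as a simple smoothing of recent probe history. The 20-point window comes from the slide; the function name and the sample data are illustrative.

```python
# Project the probe sequence ahead with a moving average over the
# most recent values (window of 20 per the slide; fewer if the
# history is still short).

def project(history, window=20):
    recent = history[-window:]
    return sum(recent) / len(recent)

t1_history = [0.58, 0.61, 0.60, 0.59]   # illustrative probe-one times (s)
t1_projected = project(t1_history)
```

The projected probe times then feed the two-probe model fit, and the confidence interval is derived from the normality of throughput within a stable interval (slide 23).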

  25. Dynamic Sampling Rate • Adjust the sampling interval to match the path’s stable intervals • Limit rate (20 to 1200 seconds) • Additive increase / additive decrease based on the difference between the last two probes • < 5% => increase interval • > 15% => decrease interval
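The interval adjustment can be sketched as follows. The 20–1200 second bounds and the 5% / 15% thresholds come from the slide; the additive step size is an assumption for illustration.

```python
# Additive-increase / additive-decrease control of the probe interval.
MIN_INTERVAL, MAX_INTERVAL = 20, 1200   # seconds, from the slide
STEP = 20                               # additive step, assumed value

def adjust_interval(interval, prev_bw, cur_bw):
    change = abs(cur_bw - prev_bw) / prev_bw
    if change < 0.05:        # path looks stable: probe less often
        interval += STEP
    elif change > 0.15:      # path changed: probe more often
        interval -= STEP
    # between 5% and 15%: leave the interval alone
    return max(MIN_INTERVAL, min(MAX_INTERVAL, interval))
```

For example, two probes within 5% of each other stretch the interval by one step, while a >15% swing shrinks it, clamped to the 20–1200 s range.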

  26. Finding a Sufficiently Large Probe Size • Default values: 400 KB / 800 KB • Upper bound • Additive increase until prediction errors are less than a threshold and all have the same sign

  27. Evaluation • Slight conservative bias • >90% of predictions have <35% error • [Figure: CDFs of mean relative error and mean absolute relative error, relative error from −0.4 to 0.4; Online Evaluation Set]

  28. Conclusions • Algorithm for predicting TCP throughput as a function of flow size • Minimal active probing • Dynamic probe rate adjustment • Explaining the flow size / throughput correlation • Explaining why simple active probing fails • Large-scale empirical study

  29. For More Info • Prescience Lab • http://plab.cs.northwestern.edu • Aqua Lab • http://aqualab.cs.northwestern.edu • D. Lu, Y. Qiao, P. Dinda, and F. Bustamante, Modeling and Taming Parallel TCP on the Wide Area Network, IPDPS 2005. • Y. Qiao, J. Skicewicz, P. Dinda, An Empirical Study of the Multiscale Predictability of Network Traffic, HPDC 2004.
