Bandwidth estimation in computer networks: measurement techniques & applications

Bandwidth estimation in computer networks:measurement techniques &applications Constantine Dovrolis College of Computing Georgia Institute of Technology

The Internet as a “black box” • Several network properties are important for applications and transport protocols: • Delay, loss rate, capacity, congestion, load, etc • But routers and switches do not provide direct feedback to end-hosts (except ICMP, also of limited use) • Mostly due to scalability, policy, and simplicity reasons

probe packets Can we guess what is in the black box? • End-systems can infer network state through end-to-end (e2e) measurements • Without any feedback from routers • Objectives:accuracy, speed, non-intrusiveness

Bandwidth estimation in packet networks • Bandwidth estimation (bwest) area: • Inference of various throughput-related metrics with end-to-end network measurements • Early works: • Keshav’s packet pair method for congestion control (‘89) • Bolot’s capacity measurements (’93) • Carter-Crovella’s bprobe and cprobe tools (’96) • Jacobson’s pathchar - per-hop capacity estimation (‘97) • Melander’s TOPP method for avail-bw estimation (’00) • In last 3-5 years: • Several fundamentally new estimation techniques • Many research papers at prestigious conferences • More than a dozen of new measurement tools • Several applications of bwest methods • Significant commercial interest in bwest technology

Overview • This talk is an overview of the most important developments in bwest area over the last 10 years • A personal bias could not be avoided.. • Overview • Bandwidth-related metrics • Capacity estimation • Packet pairs and CapProbe technique • Available bandwidth estimation • Iterative probing: Pathload, Pathvar and PathChirp • Comparison of tools • Applications • SOcket Buffer Auto-Sizing (SOBAS)

Network bandwidth metrics Ravi S.Prasad, Marg Murray, K.C. Claffy, Constantine Dovrolis, “Bandwidth Estimation: Metrics, Measurement Techniques, and Tools,” IEEE Network, November/December 2003.

Capacity definition • Maximum possible end-to-end throughput at IP layer • In the absence of any cross traffic • Achievable with maximum-sized packets • If Ci is capacity of link i, end-to-end capacity C defined as: • Capacity determined by narrow link

Available bandwidth definition • Per-hop average avail-bw: • Ai = Ci (1-ui) • ui: average utilization • A.k.a. residual capacity • End-to-end avg avail-bw A: • Determined by tight link • ISPs measure per-hop avail-bw passively (router counters, MRTG graphs)

The avail-bw as a random process • Instantaneous utilization ui(t): either 0 or 1 • Link utilization in (t, t+t) • Averaging timescale: t • Available bandwidth in (t, t+t) • End-to-end available bandwidth in (t, t+t)

Available bandwidth distribution • Avail-bw has significant variability • Need to estimate second-order moments, or even better, the avail-bw distribution • Variability depends on averaging timescale t • Larger timescale, lower variance • Distribution is Gaussian-like, if t >100-200 msec and with sufficient flow multiplexing

E2E Capacity estimation (a’):Packet pair technique Constantine Dovrolis, Parmesh Ramanathan, David Moore, “Packet Dispersion Techniques and Capacity Estimation,” In IEEE/ACM Transactions on Networking, Dec 2004.

Packet pair dispersion • Packet Pair (P-P) technique • Originally, due to Jacobson & Keshav • Send two equal-sized packets back-to-back • Packet size: L • Packet trx time at link i: L/Ci • P-P dispersion: time interval between last bit of two packets • Without any cross traffic, the dispersion at receiver is determined by narrow link:

Cross traffic interference • Cross traffic packets can affect P-P dispersion • P-P expansion: capacity underestimation • P-P compression: capacity overestimation • Noise in P-P distribution depends on cross traffic load • Example: Internet path with 1Mbps capacity

Multimodal packet pair distribution • Typically, P-P distribution includes several local modes • One of these modes (not always the strongest) is located at L/C • Sub-Capacity Dispersion Range (SCDR) modes: • P-P expansion due to common cross traffic packet sizes (e.g., 40B, 1500B) • Post-Narrow Capacity Modes (PNCMs): • P-P compression at links that follow narrow link

E2E Capacity estimation (b’):CapProbe Rohit Kapoor, Ling-Jyh Chen, Li Lao, Mario Gerla, M. Y. Sanadidi, "CapProbe: A Simple and Accurate Capacity Estimation Technique," ACM SIGCOMM 2004

Compression of p-p dispersion • First packet queueing => compressed dispersion • Capacity overestimation

Expansion of p-p dispersion • Second packet queueing => expansion of dispersion • Capacity underestimation

CapProbe • Both expansion and compression are result queuing delays • Key insight: packet pair that sees zero queueing delay would yield exact estimate • measure RTT for each probing packet of packet pair • estimate capacity from pair with minimum RTT-sum Capacity

Measurements - Internet, Internet2 • CapProbe implemented using PING packets, sent in pairs

Avail-bw estimation (a’):Pathload Manish Jain, Constantine Dovrolis, “End-to-End Available Bandwidth: Measurement Methodology, Dynamics, and Relation with TCP Throughput,” IEEE/ACM Transactions on Networking, Aug 2003.

Probing methodology • Sender transmits periodic packet stream of rate R • K packets, packet size L, interarrival T = L/R • Receiver measures One-Way Delay (OWD) for each packet • D(k) = tarv(k) - tsnd(k) • OWD variations: Δ(k) = D(k+1) – D(k) • Independent of clock offset between sender/receiver • With stationary & fluid-modeled cross traffic: • If R > A, then Δ(k) > 0 for all k • Else, Δ(k) = 0 for all k

Self-loading periodic streams • Increasing OWDs means R>A • Almost constant OWDs means R<A

Example of OWD variations • 12-hop path from U-Delaware to U-Oregon • K=100 packets, A=74Mbps, T=100μsec • Rleft = 97Mbps, Rright=34Mbps • Ri

Iterative probing and Pathload Avail-bw time series from NLANR trace • Iterative probing in Pathload: • Send multiple probing streams (duration τ) at various rates • Each stream samples path at different time interval • Outcome of each stream is either Ri > A or Ri < A • Estimate upper and lower bound for avail-bw variation range

Avail-bw estimation (b’): percentile estimation Manish Jain, Constantine Dovrolis, “End-to-End Estimation of the Available Bandwidth Variation Range,” ACM SIGMETRICS 2005

Avail-bw distribution and percentiles • Avail-bw random process, in timescale t: At(t) • Assume stationarity • Marginal distribution of At: • Ft(R) = Prob [At ≤ R] • Ap :pth percentile of At, such that p = Ft(Ap) • Objective: Estimate variation range [AL, AH] for given averaging timescale t • ALand AH are pL and pH percentiles of At • Typically, pL =0.10 and pH =0.90

Percentile sampling • Question : Which percentile of At corresponds to rate R? • Given R and t, estimate Ft(R) • Assume that Ft(R) is inversible • Sender transmits periodic packet stream of rate R • Length of stream = measurement timescale t • Receiver classifies stream as • Type-G if At ≤ R: I(R)= 1 with probability Ft(R) • Type-L otherwise: I(R)= 0 with probability 1-Ft(R) • Collect N samples of I(R) by sending N streams of rate R • Number of type-G streams: I(R,N)= ΣiIi(R) • E[ I(R,N) ] = Ft(R) * N • Percentile rank of probing rate R: Ft(R) = E[I(R,N)] / N • Use I(R,N)/N as estimator of Ft(R)

Non-parametric estimation • Does not assume specific avail-bw distribution • Iterative algorithm • Stationarity requirement across iterations • N-th iteration: probing rate Rn • Use percentile sampling to estimate percentile rank of Rn • To estimate the upper percentile AH with pH = Ft(AH): • fn = I(Rn,N)/N • If fn is between pH±r, report AH = Rn • Otherwise, • If fn > pH +r, set Rn+1 < Rn • If fn < pH -r, set Rn+1 > Rn • Similarly, estimate the lower percentile AL

Validation example • Verification using real Internet traffic traces b=0.05 b=0.15 • Non-parametric estimator tracks variation range within 10-20% • Selection of b depends on traffic • Traffic spikes/dips may not be detected if b is too small • But larger b causes larger MSRE

Parametric estimation • Assume Gaussian avail-bw distribution • Justified assumption for large degree of traffic multiplexing • And/or for long averaging timescale (>200msec) • Gaussian distribution completely specified by • Mean m and standard deviation st • pth percentile of Gaussian distribution • Ap = m + st f-1(p) • Sender transmits N probing streams of rates R1 and R2 • Receiver determines percentiles ranks corresponding to R1 and R2 • m and st can be then estimated by solving • R1 = m + st f-1(p1) • R2 = m + st f-1(p2) • Variation range is then calculated from: • AH = m + st f-1(pH) • AL = m + st f-1(pL)

Validation example • Verification using real Internet traffic traces • Evaluated with both Gaussian and non-Gaussian traffic Gaussian traffic non-Gaussian traffic • Parametric algorithm is more accurate than non-parametric algorithm, when • traffic is good match to Gaussian model • in non-stationary conditions

Avail-bw estimation (c’):PathChirp Vinay Ribeiro, Rolf Riedi, Jiri Navratil, Rich Baraniuk, Les Cottrell, “PathChirp: Efficient Available Bandwidth Estimation,” PAM 2003

Chirp Packet Trains • Exponentially decrease packet spacing within packet train • Wide range of probing rates • Efficient: few packets

Chirps vs. CBR Trains • Multiple rates in each chirping train • Allows one estimate per-chirp • Potentially more efficient estimation

CBR Cross-Traffic Scenario • Point of onset of increase in queuing delay gives available bandwidth

Comparison with Pathload • 100Mbps links • pathChirp uses 10 times fewer bytes for comparable accuracy

Experimental comparison of avail-bw estimation tools Alok Shriram, Marg Murray, Young Hyun, Nevil Brownlee, Andre Broido, k claffy, ”Comparison of Public End-to-End Bandwidth Estimation Tools on High-Speed Links,” PAM 2005

Testbed topology

Accuracy comparison - SmartBits Direction 1, Measured AB Direction 2, Measured AB Actual AB

Accuracy comparison - TCPreplay Actual Available Bandwidth Measured Available Bandwidth

What’s wrong with Spruce? • Spruce was proposed as fast & accurate estimation method • Strauss et al., IMC’03: example of direct probing • With a single-link model: • Sender sends periodic stream with input rate Ri • Receiver measures output rate Ro • Avail-bw A given by: • Assumptions: • Tight link capacity Ct is known • Input rate Ri can be controlled by source • But what would happen in a multi-hop path? • Ri can be less than source rate (due to previous link with lower capacity) and, even worse, it can be unknown!

Measurement latency comparison • Abing: 1.3 to 1.4 s • Spruce: 10.9 to 11.2 s • Pathload: 7.2 to 22.3 s • Patchchirp: 5.4 s • Iperf: 10.0 to 10.2 s

Probing traffic comparison

Applications of bandwidth estimation

Applications of bandwidth estimation • Large TCP transfers and congestion control • Bandwidth-delay product estimation • Socket buffer sizing • Streaming multimedia • Adjust encoding rate based on avail-bw • Intelligent routing systems • Overlay networks and multihoming • Select best path based on capacity or avail-bw • Content Distribution Networks (CDNs) • Choose server based on least-loaded path • SLA and QoS verification • Monitor path load and allocated capacity • End-to-end admission control • Network “spectroscopy” • Several more..

Application-1:Improved TCP Throughput with BwEst Ravi S. Prasad, Manish Jain, Constantine Dovrolis, “Socket Buffer Auto-Sizing for High-Performance Data Transfers,” Journal of Grid Computing, 2003

TCP performs poorly in fat-long paths • Basic idea in TCP congestion control: • Increase congestion window until packet loss occurs • Then reduce window by a factor of two (multiplicative decrease) • Consequences • Increased loss-rate • Avg. throughput less than available bandwidth • Large delay-variation • Example: 1Gbps path, 100msec RTT, 1500B pkts • Single loss ≈ 4000 pkts window reduction • Recovery from single loss ≈ 400 sec • Need very small loss probability (ρ<2*10-8) to saturate path

Sender Receiver Socket Buffer Socket Layer TCP Layer Ws Wr TCP background • Socket-layer buffers • Send buffer: Bs • Receive buffer: Br • TCP windows • Send: Ws <= Bs • Congestion: Wc • Receiver (advertised): Wr<= Br • Ws = min {Bs, Wc, Wr} • Ideally, TCP send window Ws should be set to connection’s bandwidth-delay product • “Bandwidth”: connection’s fair share • “Delay”: connection’s RTT • Achieve efficiency, fairness, no congestion

Can we avoid congestive losses w/o changing TCP? • Ws = min {Wc, Wr} • But Wr<= S • S: rcv socket buffer size • We can limit S so that connection is limited by receive window: Ws = S • Non-congested path: • Throughput R(S) = S/T(S) • R(S) increases with S until it reaches MFT • MFT: Maximum Feasible Throughput (onset of congestion) • Objective: set S close to MFT, to avoid any losses and stabilize TCP send window to a safe value • Congested path: • Path with losses before start of target transfer • Limiting S can only reduce throughput

Socket buffer auto-sizing (SOBAS) • Apply SOBAS only in non-congested paths • Receiving application measures rcv-throughput Rrcv • When receive throughput becomes constant: • Connection has reached its maximum throughput Rmax • If the send window becomes any larger, losses will occur • Receiving application limits rcv socket buffer size S: • S = RTT * Rmax • With congestion unresponsive cross traffic: Rmax = A • With congestion responsive cross traffic: Rmax = MFT • SOBAS does not require any changes in TCP • But estimation of RTT and congestion-status requires “ping-like” probing

Bandwidth estimation in computer networks: measurement techniques & applications