350 likes | 363 Views
End-to-End Estimation of Available Bandwidth Variation Range. Constantine Dovrolis Joint work with Manish Jain & Ravi Prasad College of Computing Georgia Institute of Technology. Probing the Internet. Several network parameters are important for applications and transport protocols:
E N D
End-to-End Estimation of Available Bandwidth Variation Range Constantine Dovrolis Joint work with Manish Jain & Ravi Prasad College of Computing Georgia Institute of Technology
Probing the Internet • Several network parameters are important for applications and transport protocols: • Delay, loss rate, capacity, congestion, load, etc • Internet routers do not provide direct feedback to end-hosts • Due to scalability, simplicity & administrative issues • Except SNMP, ICMP • Alternatively: • Infer network state through end-to-end measurements
End-to-end bandwidth estimation • “Bandwidth” in data networks refers to throughput(bits/sec) • Capacity: maximum possible throughput w/o cross traffic • Available bandwidth (or residual capacity): capacity – cross traffic • Bandwidth estimation: measurement techniques & statistical analysis to infer bandwidth-related metrics of individual links and end-to-end network paths • Objectives: • Accuracy: application-specific but typically within 10-20% • Estimation latency: within a few seconds • Non-intrusiveness: cross traffic should not be affected • Scalability: important when monitoring many paths (not covered in this talk)
Why to measure bandwidth? • Large TCP transfers and congestion control • Bandwidth-delay product estimation • TCP socket buffer sizing • Streaming multimedia • Adjust encoding rate based on avail-bw • Intelligent routing systems • Overlay networks and p2p networks • Intelligent routing control & multihoming • Content Distribution Networks (CDNs) • Choose server based on least-loaded path • SLA verification & interdomain problem diagnosis • Monitor path load and allocated capacity • End-to-end admission control • Network spectroscopy • Several more..
Capacity • Maximum possible end-to-end throughput at IP layer • In the absence of any cross traffic • For maximum-sized packets • If Ci is capacity of link i, end-to-end capacity C defined as: • Capacity determined by narrow link
Average available bandwidth • Per-hop average avail-bw: • Ai = Ci (1-ui) • ui: average utilization • A.k.a. residual capacity • End-to-end avg avail-bw A: • Determined by tight link • ISPs measure average per-hop avail-bw passively • 5-min averaging intervals
Avail-bw variability • Avail-bw has significant variability • Variability depends on averaging timescale t • Larger timescale, lower variance • Variation range: • Range between, say, 10th to 90th percentiles • Example: • Path-1: variation range [10Mbps, 90Mbps] • Path-2: variation range [20Mbps, 20Mbps] • Which path would you prefer?
The avail-bw as a random process • Instantaneous utilization ui(t): either 0 or 1 • Link utilization in (t, t+t) • Averaging timescale: t • Available bandwidth in (t, t+t) • End-to-end available bandwidth in (t, t+t)
Problem statement • Avail-bw random process, measured in timescale t: At(t) • Assuming stationarity, marginal distribution of At: • Ft(R) = Prob [At ≤ R] • Ap :pth percentile of At, such that p = Ft(Ap) • Objective: Estimate variation range [AL, AH] for given averaging timescale t • ALand AH are pL and pH percentiles of At • Typically, pL =0.10 and pH =0.90
Probing a network path • Sender transmits periodic packet stream of rate R • K packets, packet size L, interarrival T = L/R • Receiver measures One-Way Delay (OWD) for each packet • D(k) = tarv(k) - tsnd(k) • OWD variations: Δ(k) = D(k+1) – D(k) • Independent of clock offset between sender/receiver • With stationary & fluid-modeled cross traffic: • If R > A, then Δ(k) > 0 for all k • Else, Δ(k) = 0 for all k
Self-loading periodic streams • Increasing OWDs means R>A • Non-increasing OWDs means R<A
Example of OWD variations • 12-hop path from U-Delaware to U-Oregon • K=100 packets, A=74Mbps, T=100μsec • Rleft = 97Mbps, Rright=34Mbps
Percentile sampling & estimation algorithms
Percentile sampling • Given R and t, estimate Ft(R) • Ft(R) is also referred to as the rank of rate R • Assume that Ft(R) is inversible • Sender transmits a periodic packet stream of rate R • Length of stream: measurement timescale t • Receiver classifies the stream, based on measured one-way delay trends, as: • Type-G if At ≤ R: • I(R)= 1 with probability Ft(R) • Type-L if At > R: • I(R)= 0 with probability 1-Ft(R)
Percentile sampling (cont’) • Send N packet streams, and classify each packet stream as • Type-G if At ≤ R: • I(R)= 1 with probability Ft(R) • Type-L if At > R: • I(R)= 0 with probability 1-Ft(R) • Number of type-G streams: • Unbiased estimator for the rank of rate R:
How many streams do we need? • Larger N longer estimation duration • Smaller N larger variance in estimator I(R,N)/N • Choose N so that: • I(R,N)/N within Ft(R)± r • r:maximum percentile error • P[N(p-r) < I(R,N) < N(p+r)] > 1-e • where p= Ft(R) and e small • I(R,N) ~ Binomial (N, p) assuming independent sampling • With N=40-50 streams, the maximum percentile error r for 10th-90th variation range is about 0.05
Non-parametric estimation • It does not assume specific avail-bw distribution • Iterative algorithm • Stationarity requirement across iterations • N-th iteration: probing rate Rn • Use percentile sampling to estimate percentile rank of Rn • To estimate the upper percentile AH with pH = Ft(AH): • fn = I(Rn,N)/N • If fn is between pH±r, report AH = Rn • Otherwise, • If fn > pH +r, set Rn+1 < Rn • If fn < pH -r, set Rn+1 > Rn • Similarly, estimate the lower percentile AL
Non-parametric algorithm • Parameter b • Upper bound on rate variation in successive iterations • Tradeoff between accuracy and responsiveness • Larger b: • Faster convergence • Larger oscillations
Validation example (non-parametric) • Testbed experiments using real Internet traffic traces b=0.05 b=0.15 • Non-parametric estimator tracks variation range within 10-20% • Optimal selection of b depends on traffic • Traffic spikes/dips may not be detected if b is too small • But larger b causes larger MSRE
Parametric estimation • Assume Gaussian avail-bw distribution • Justified assumption for large degree of traffic multiplexing • And/or for long averaging timescale (>200msec) • Gaussian distribution completely specified by • Mean m and standard deviation st • pth percentile of Gaussian distribution • Ap = m + st f-1(p) • Sender transmits N probing streams of rates R1 and R2 • Receiver determines percentiles ranks corresponding to R1 and R2 • m and st can be then estimated by solving • R1 = m + st f-1(p1) • R2 = m + st f-1(p2) • Variation range is then calculated from: • AH = m + st f-1(pH) • AL = m + st f-1(pL)
Parametric algorithm • Variation range estimate • Non-iterative algorithm • More appropriate under non-stationary conditions • Probing rates do not need to follow variation range • Less intrusive probing
Validation example (parametric) Gaussian traffic non-Gaussian traffic • Parametric algorithm is more accurate than non-parametric algorithm, when • traffic is good match to Gaussian model • in non-stationary conditions
Comparison of the two algorithms Non-parametric: t = 40msec Parametric: t = 250msec • Non-parametric algorithm • Stationarity assumption is more critical (iterative algorithm) • Can be used with any cross traffic distribution • Parametric algorithm • Provides variation range estimate at end of each round • Accurate when underlying traffic close to Gaussian
A sample measurement from the Internet • Path from Georgia Tech to University of Ioannina, Greece • Average avail-bw increases over this 2-hour period • Variation range decreases as average avail-bw increases
Objectives and methodology • Examine effect of following factors on avail-bw variability: • Load at tight link • Degree of multiplexing at tight link • Averaging time scale • Single-hop simulation topology with TCP traffic • Monitore load at tight link • Examine variation range width V • V = AtH - AtL • Compare V with Coefficient of Variation (CoV) • CoV : standard deviation (at time scalet) over average avail-bw • V : Absolute variability metric • CoV : Relative variability metric
Tight Link Utilization • Variation range width V shows non-monotonic behavior • V increases in low/medium load, due to increasing variance in input traffic (tight link rarely saturated) • V decreases in heavy load due to “clamping” by tight link capacity • CoV increases monotonically with load
Statistical Multiplexing • Conventional wisdom: • Keeping the load constant, higher degree of multiplexing makes the traffic smoother • Two models for increasing degree of multiplexing • Capacity Scaling • Increase capacity of tight link and proportionally increase number of flows • Average flow rate remains constant • Flow Scaling • Increase number of flows and proportionally decrease average flow rate • Capacity of tight link remains constant
Capacity Scaling • Variation range width V increases with capacity scaling • CoV decreases with capacity scaling • Conventional wisdom true for relative variability (CoV) but not for absolute variation range (V)
Flow Scaling • Variation range decreases in both absolute and relative terms
Measurement Timescale • Avail-bw variability decreases with averaging time scale • Rate of decrease depends on correlation structure of avail-bw process • Observed decrease rate consistent with scaling process in the 50-500ms (Hurst parameter=0.7)
Future work • Applications of bandwidth estimation: • Overlay routing and multihoming: path selection algorithms, avoidance of oscillations, provisioning • Interdomain performance problem diagnosis • TCP throughput prediction (see ACM Sigcomm’05) • Internet traffic analysis: • Use of bw-estimation to explain traffic burstiness in short time scales (see ACM Sigmetrics’05) • Examine validity of single-bottleneck assumption • Examine congestion responsiveness of Internet traffic • New estimation problems: • Detect maximum possible shared available bandwidth among set of network paths