Coping with (exploiting) heavy tails • Balaji Prabhakar • Departments of EE and CS, Stanford University • balaji@stanford.edu
Overview • SIFT: A simple algorithm for identifying large flows • reducing average flow delays • with smaller router buffers • with Konstantinos Psounis and Arpita Ghosh • Bandwidth at wireless proxy servers • TDM vs FDM • or, how many servers suffice • with Pablo Molinero-Fernandez and Konstantinos Psounis
SIFT: Motivation • Egress buffers on router line cards currently serve packets in FIFO order • The bandwidth sharing that results from this, combined with the actions of transport protocols like TCP, translates to some service order for flows that isn't well understood; that is, at the flow level do we have: • FIFO? PS? SRPT? None of the above? • [Figure: egress buffer]
SIFT: Motivation • But serving packets according to the SRPT (Shortest Remaining Processing Time) policy at the flow level • would minimize average delay • given the heavy-tailed nature of the Internet flow-size distribution, the reduction in delay can be huge
SRPT at the flow level • Next packet to depart under FIFO: green • Next packet to depart under SRPT: orange • [Figure: egress buffer with the green and orange packets marked]
But … • SRPT is unimplementable • the router needs to know the residual flow sizes of all enqueued flows, which is virtually impossible • Other pre-emptive schemes like SFF (shortest flow first) or LAS (least attained service) are likewise too complicated to implement • This has led researchers to consider tagging flows at the edge, where the number of distinct flows is much smaller • but this requires a different design of edge and core routers • more importantly, it needs extra space in IP packet headers to signal flow size • Is something simpler possible?
SIFT: A randomized algorithm • Flip a coin with bias p (= 0.01, say) for heads on each arriving packet, independently from packet to packet • A flow is “sampled” if one of its packets gets a head • [Figure: per-packet coin flips, e.g. T T T T T H H]
SIFT: A randomized algorithm • A flow of size X has roughly a 0.01X chance of being sampled • flows with fewer than 15 packets are sampled with prob. at most roughly 0.15 • by this linear estimate, flows with more than 100 packets are sampled with prob. roughly 1 • the precise probability is $1 - (1 - 0.01)^X$ • Most short flows will not be sampled, most long flows will be
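The gap between the rough estimate 0.01X and the precise probability can be checked numerically. The following is a minimal sketch; the function name `sample_probability` and the flow sizes printed are illustrative choices, not part of the slides.

```python
def sample_probability(flow_size: int, p: float = 0.01) -> float:
    """Probability that at least one of `flow_size` packets gets a head.

    Each packet is an independent coin flip with bias p, so the flow
    escapes sampling only if every flip is a tail.
    """
    return 1.0 - (1.0 - p) ** flow_size


# Compare the linear rule of thumb (capped at 1) with the exact value.
for size in (1, 15, 100, 500, 1000):
    linear = min(1.0, 0.01 * size)
    exact = sample_probability(size)
    print(f"X={size:5d}  linear={linear:.3f}  exact={exact:.3f}")
```

The linear estimate is tight for short flows but saturates early: at X = 100 the exact probability is about 0.63, and it only approaches 1 for flows of several hundred packets.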
The accuracy of classification • Ideally, we would like to sample like the blue curve • Sampling with prob p gives the red curve • there are false positives and false negatives • Can we get the green curve? • [Figure: Prob(sampled) vs. flow size; blue, red and green curves]
SIFT+ • Sample with a coin of bias q = 0.1 • say that a flow is “sampled” if it gets two heads! • this reduces the chance of making errors • but you have to count the number of heads • So, how can we use SIFT at a router?
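For the two-heads rule, the sampling probability follows from the binomial distribution: a flow is missed only if it collects zero heads or exactly one. A minimal sketch, with hypothetical function names, assuming independent flips as on the earlier slide:

```python
def sift_probability(flow_size: int, p: float = 0.01) -> float:
    """Single-head rule: P(at least one head in flow_size flips)."""
    return 1.0 - (1.0 - p) ** flow_size


def sift_plus_probability(flow_size: int, q: float = 0.1) -> float:
    """Two-heads rule: P(at least two heads in flow_size flips)."""
    if flow_size < 2:
        return 0.0  # a flow with fewer than two packets cannot get two heads
    no_heads = (1.0 - q) ** flow_size
    one_head = flow_size * q * (1.0 - q) ** (flow_size - 1)
    return 1.0 - no_heads - one_head


# Tabulate both rules so the shapes of the two curves can be compared.
for size in (2, 5, 10, 20, 50, 100):
    print(f"X={size:4d}  SIFT(p=0.01)={sift_probability(size):.3f}  "
          f"SIFT+(q=0.1)={sift_plus_probability(size):.3f}")
```

Requiring two heads makes the curve flatter near zero and steeper afterwards, which is the sense in which it sits closer to a step function. The values p = 0.01 and q = 0.1 are the ones quoted on the slides; they place the transition at different flow sizes, so the two columns are not a like-for-like error comparison.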
SIFT at a router • Sample incoming packets • Place any packet with a head (or the second such packet) in the low priority buffer • Place all further packets from this flow in the low priority buffer (to avoid mis-sequencing)
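The enqueue rule above can be sketched as a small classifier. This is an illustration only: the class name `SiftClassifier`, its parameters, and the strict-priority `dequeue` are assumptions, since the slides do not specify how the two buffers are served.

```python
import random
from collections import deque


class SiftClassifier:
    """Two logical queues plus a table of sampled flow IDs (hypothetical API)."""

    def __init__(self, p=0.01, heads_needed=1, seed=None):
        self.p = p                        # coin bias for heads
        self.heads_needed = heads_needed  # 1 for SIFT, 2 for SIFT+
        self.rng = random.Random(seed)
        self.heads = {}                   # flow_id -> heads seen so far
        self.sampled = set()              # flows confined to low priority
        self.high = deque()               # unsampled (presumed short) flows
        self.low = deque()                # sampled (presumed long) flows

    def enqueue(self, flow_id, packet):
        # Flip a coin only for flows that are not yet sampled.
        if flow_id not in self.sampled and self.rng.random() < self.p:
            count = self.heads.get(flow_id, 0) + 1
            self.heads[flow_id] = count
            if count >= self.heads_needed:
                self.sampled.add(flow_id)
                self.heads.pop(flow_id, None)
        # The sampling packet and all later packets of the flow go to low.
        if flow_id in self.sampled:
            self.low.append((flow_id, packet))
        else:
            self.high.append((flow_id, packet))

    def dequeue(self):
        # Assumed service rule: strict priority for the high queue.
        if self.high:
            return self.high.popleft()
        if self.low:
            return self.low.popleft()
        return None
```

Under the assumed strict-priority service, a flow's earlier packets in the high queue always leave before its later packets in the low queue, so the flow is not reordered. This sketch never expires entries from the flow table; a real router would need some eviction policy, which the slides do not describe.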
Simulation results • Simulations run in ns-2 • Topology: [Figure: simulation topology]
Implementation requirements • SIFT needs • two logical queues in one physical buffer • to sample arriving packets • a table for maintaining id of sampled flows • to check whether incoming packet belongs to sampled flow or not • all quite simple to implement
A big bonus • The buffer of the short flows has very low occupancy • so, can we simply reduce it drastically without sacrificing performance? • More precisely, suppose • we reduce the buffer size for the small flows, increase it for the large flows, keep the total the same as FIFO
SIFT incurs fewer drops • Buffer_Size(Short flows) = 10; Buffer_Size(Long flows) = 290 • Buffer_Size(Single FIFO Queue) = 300 • [Figure: packet drops, SIFT vs. FIFO]
Reducing total buffer size • Suppose we reduce the buffer size of the long flows as well • Questions: • will packet drops still be fewer? • will the delays still be as good?
Drops under SIFT with less total buffer • Buffer_Size(PRQ0) = 10; Buffer_Size(PRQ1) = 190 • Buffer_Size(One Queue) = 300 • [Figure: packet drops, SIFT vs. FIFO (OneQueue)]
Delay histogram for short flows • [Figure: delay histogram, SIFT vs. FIFO]
Delay histogram for long flows • [Figure: delay histogram, SIFT vs. FIFO]
Conclusions for SIFT • A randomized scheme; preliminary results show that • it has low implementation complexity • it reduces delays drastically (users are happy) • with 30-35% smaller buffers at egress line cards (router manufacturers are happy) • A lot more work is needed • at the moment we have a good understanding of how to sample, and extensive (and encouraging) simulation tests • we need to understand the effect of reduced buffers on end-to-end congestion control algorithms
How many servers do we need? • Motivation: Wireless and satellite • Problem: Single transmitter, multiple receivers • bandwidth available for transmission: W bits/sec • should files be transferred to one receiver at a time? (TDM) • or, should we divide the bandwidth into K channels of W/K bits/sec and transmit to K receivers at a time? (FDM) • For heavy-tailed jobs, K > 1 minimizes mean delay • Questions: • What is the right choice of K? • How does it depend on flow-size distributions?
A simulation: heavy-tailed (HT) file sizes • [Figure: average response time (s) vs. number of servers (K)]
The model • Use an M/Heavy-Tailed/K queueing system • service times X: bimodal to begin with (the approach generalizes) • $P(X=A) = \alpha = 1 - P(X=B)$, where $A < E(X) \ll B$ and $\alpha \approx 1$ • the arrival rate is $\lambda$ • Let $S_K$, $W_K$ and $D_K$ be the service time, waiting time and total delay in the K-server system • $E(S_K) = K\,E(X)$; $E(D_K) = K\,E(X) + E(W_K)$ • Main idea in determining $K^*$ • have enough servers to take care of long jobs so that short jobs aren’t waiting for long amounts of time • but no more because, otherwise, the service times become big
Approximately determining $W_K$ • Consider two states: servers blocked or not • Blocked: all K servers are busy serving long jobs • $E(W_K) = E[W_K \mid \text{blocked}]\,P_B + E[W_K \mid \text{unblocked}]\,(1 - P_B)$ • $P_B \approx P(\text{there are at least } K \text{ large arrivals in } KB \text{ time slots})$ • this is actually a lower bound, but accurate for large B • with Poisson arrivals, $P_B$ is easy to find • $E[W_K \mid \text{unblocked}] \approx 0$ • $E[W_K \mid \text{blocked}] \approx E(W_1)$, which can be determined from the P-K (Pollaczek-Khinchine) formula
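The approximation above can be evaluated numerically to see how the estimated delay varies with K. The sketch below is one reading of the slide, not the authors' code: it assumes long jobs arrive as a Poisson stream of rate λ(1 − α), that the window of "KB time slots" is the time K·B a long job occupies a server of speed 1/K, and that E(W_1) is the M/G/1 mean wait λE(X²)/(2(1 − ρ)). The job sizes A = 1 and B = 10000 are hypothetical; α = 0.9995 and ρ = 0.5 are the values quoted for the M/Bimodal/K plot.

```python
import math


def poisson_tail(mean, k):
    """P(N >= k) for N ~ Poisson(mean), by summing the first k terms of the pmf."""
    term = math.exp(-mean)  # P(N = 0)
    cdf = 0.0
    for j in range(k):
        cdf += term
        term *= mean / (j + 1)  # P(N = j + 1) from P(N = j)
    return max(0.0, 1.0 - cdf)


def approx_mean_delay(k, lam, a_size, b_size, alpha):
    """E(D_K) ~= K E(X) + P_B E(W_1), using the two-state approximation."""
    ex = alpha * a_size + (1 - alpha) * b_size
    ex2 = alpha * a_size ** 2 + (1 - alpha) * b_size ** 2
    rho = lam * ex
    w1 = lam * ex2 / (2 * (1 - rho))  # P-K mean waiting time, M/G/1
    # Assumed: large arrivals are Poisson with rate lam * (1 - alpha),
    # counted over a window of length k * b_size.
    p_blocked = poisson_tail(lam * (1 - alpha) * k * b_size, k)
    return k * ex + p_blocked * w1


# Hypothetical job sizes; alpha and rho follow the M/Bimodal/K slide.
A, B, ALPHA, RHO = 1.0, 10000.0, 0.9995, 0.5
lam = RHO / (ALPHA * A + (1 - ALPHA) * B)
delays = {k: approx_mean_delay(k, lam, A, B, ALPHA) for k in range(1, 51)}
best_k = min(delays, key=delays.get)
print("approximate K* =", best_k)
```

The two terms pull in opposite directions: K·E(X) grows linearly with K, while the blocking probability falls quickly, so the estimate has an interior minimum for these parameters. Because P_B is a lower bound, the sketch understates delay for small K.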
M/Bimodal/K • $\alpha = 0.9995$, $\rho = 0.50$ • [Figure: average response time (s) vs. number of servers (K)]
M/Pareto/K • $\gamma = 1.1$, $\rho = 0.50$ • [Figure: average response time (s) vs. number of servers (K)]
M/Pareto/K: higher moments • $\gamma = 1.1$, $\rho = 0.50$ • [Figure: standard deviation of response time (s) vs. number of servers (K)]