500 likes | 580 Views
Lecture 4: Congestion Control. Challenge: how do we efficiently share network resources among billions of hosts? Today: TCP Hosts adjust rate based on packet losses Next time: Alternative solutions Fair queueing, RED Vegas, packet pair ECN, rate control, credits.
E N D
Lecture 4: Congestion Control • Challenge: how do we efficiently share network resources among billions of hosts? • Today: TCP • Hosts adjust rate based on packet losses • Next time: Alternative solutions • Fair queueing, RED • Vegas, packet pair • ECN, rate control, credits
A brief Internet history... 1991 WWW/HTTP 1990 ARPANET dissolved 1972 TELNET RFC 318 1995 Multi-backbone Internet 1986 NNTP RFC 977 1992 MBONE 1969 ARPANET created 1984 DNS RFC 883 1973 FTP RFC 454 1977 MAIL RFC 733 1982 TCP & IP RFC 793 & 791 1970 1980 1985 1990 1995 1975
TCP: This is your life... 1984 Nagel’s algorithm to reduce overhead of small packets; predicts congestion collapse 1975 Three-way handshake Raymond Tomlinson In SIGCOMM 75 1987 Karn’s algorithm to better estimate round-trip time 1990 4.3BSD Reno fast retransmit delayed ACK’s 1983 BSD Unix 4.2 supports TCP/IP 1988 Van Jacobson’s algorithms congestion avoidance and congestion control (most implemented in 4.3BSD Tahoe) 1986 Congestion collapse observed 1974 TCP described by Vint Cerf and Bob Kahn In IEEE Trans Comm 1982 TCP & IP RFC 793 & 791 1980 1990 1985 1975
TCP: After 1990 1994 T/TCP (Braden) Transaction TCP 1996 SACK TCP (Floyd et al) Selective Acknowledgement 1999 TCP Rainier ??? 1996 FACK TCP (Mathis et al) extension to SACK 1996 Hoe Improving TCP startup 1993 TCP Vegas (Brakmo et al) real congestion avoidance 1994 ECN (Floyd) Explicit Congestion Notification 1994 1993 1996
TCP Flow Control • TCP is a sliding window protocol • For window size n, can send up to n bytes without receiving an acknowledgement • When the data is acknowledged then the window slides forward • Each packet advertises a window size • Indicates number of bytes the receiver is willing to get • Original TCP always send entire window
Visualizing the window offered window (advertised by receiver) usable window 1 2 3 4 5 6 7 8 9 10 11 12 sent and acknowledged can send ASAP can’t send until window moves sent, not ACKed Left side of window advances when data is acknowledged. Right side controlled by size of window advertisement
Observed TCP Problems • Too many small packets • silly window syndrome • Nagel’s algorithm • delayed acks • Too many retransmissions • dynamic retransmission timer • congestion collapseb
Silly Window Syndrome • Problem: (Clark, 1982) • If receiver advertises small increases in the receive window then the sender may waste time sending lots of small packets • Solution • Receiver must not advertise small window increases • Increase window by min(MSS,RecvBuffer/2)
Nagel’s algorithm • Small packet problem: • Don’t want to send a 41 byte packet for each keystroke • How long to wait for more data? • Solution: • Allow only one outstanding small segment that has not yet been acknowledged • Self-adaptive: faster that ACKs arrive, more tinigrams can be sent
Retransmission • How long a timeout to set? • Estimate round-trip time R • R = R + (1- )M • is a smoothing factor of 0.9 • M is measured round-trip time (ACK’s to data) • Timeout at R, where is a delay variance factor of 2.0
Karn’s algorithm • Problem: • How to estimate RTT when data is retransmitted? • Solution: • Don’t sample RTT when you retransmit data • Also added exponential backoff to retransmit interval
Crossbar aaaaa bbbbb ccccc ddddd Router Congestion • What if rate of packets arriving > rate of packets departing? • Routers drop any overflow packets adcba
The Problem • Original TCP sent full window of data • When links become loaded, queues fill up, and this can lead to: • Congestion collapse: when round-trip time exceeds retransmit interval this can create a stable condition in which every packet is being retransmitted many times • Synchronized behavior: network oscillates between loaded and unloaded
Jacobson Solution • Modify retransmission timer to adapt to variations in queueing delay • Infer network bandwidth from packet loss • drops => congestion => reduce rate • drops also caused by link noise! • no drops => no congestion => increase rate • Limit send rate based on network bandwidth as well as receiver space
Jacobson’s RTT estimator • Problem: • Can’t deal with wide variance in RTT • Solution: • Smooth estimator of variance as well Err = M - R R = R + gErr ( gain for avg = 1/8) D = D + h(|Err| - D) ( gain for dev = 1/4) • Timeout at R + 4D • When variance is small, R dominates
TCP Congestion Control • Adjust rate to match network bandwidth • Additive increase/multiplicative decrease • oscillate around bottleneck capacity • Slow start • quickly identify bottleneck capacity • Fast retransmit • Fast recovery
Tracking the Bottleneck Bandwidth • Throughput = window size/RTT • Multiplicative decrease • Timeout => dropped packet => cut window size in half • Additive increase • Ack arrives => no drop => increase window size by one packet/window
TCP “Sawtooth” • Oscillates around bottleneck bandwidth • adjusts to changes in competing traffic
Slow start • How do we find bottleneck bandwidth? • Start by sending a single packet • start slow to avoid overwhelming network • Multiplicative increase until get packet loss • quickly find bottleneck • Previous window < bottleneck rate • shift into linear increase/multiplicative decrease
Slow Start • Quickly find the bottleneck bandwidth
Slow Start Problems • Slow start usually overshoots bottleneck • will lose many packets in window • Bursty traffic source • will cause bursty losses for other flows • Short flows • Can spend entire time in slow start!
Avoiding burstiness: ack pacing bottleneck packets Sender Receiver acks Window size = round trip delay * bit rate
Problems with Ack Pacing • Ack compression • Variations in queueing delays on return path changes spacing between acks • Example: ack waits for single long packet • Worse with bursty cross-traffic • What happens after a timeout?
Ack Pacing After Timeout 1 • Packet loss disrupts ack pacing • slow start/additive increase cause packet loss! • After loss, use slow start to regain ack pacing • switch to linear increase at last successful rate • “congestion avoidance” 2 3 1 1 4 5 1 1 1 2 5
Putting It All Together • Timeouts dominate performance!
Fast Retransmit 1 • Can we detect packet loss without a timeout? • Duplicate acks imply either • packet reordering (route change) • packet loss • TCP Tahoe • resend if see three dup acks • eliminates timeout delay 2 3 1 1 4 5 1 1 1 2 5
Fast Retransmit Caveats • Requires in order packet delivery • Dynamically adjust number of dup acks needed for retransmit • Doesn’t work with small windows • ex: modems! • Doesn’t work if packets lost in burst • ex: at peak of slow start?
Fast Retransmit • Regaining ack pacing limits performance
Fast Recovery 1 • Use duplicate acks to maintain ack pacing • dup ack => packet left network • every other ack => send packet • Doesn’t work if lose packets in burst • timeout and slow start to reestablish ack pacing 2 3 1 1 4 5 1 1 1 2 3
Delayed ACKS • Problem: • In request/response programs, you send separate ACK and Data packets for each transaction • Solution: • Don’t ACK data immediately • Wait 200ms (must be less than 500ms) • Must ACK every other packet • Must not delay duplicate ACKs
Delayed Ack Impact • TCP congestion control triggered by acks • if receive half as many acks => window grows half as fast • Slow start with window = 1 • will trigger delayed ack timer
What if two TCP connections share link? • Reach equilibrium independent of initial bandwidth (assuming equal RTTs)
What if TCP and UDP share link? • Independent of initial rates, UDP will get priority! TCP will take what’s left.
What if two different TCP implementations share link? • If cut back more slowly after drops => will grab bigger share • If add more quickly after acks => will grab bigger share • Incentive to cause congestion collapse! • Many TCP “accelerators” • Easy to improve perf at expense of network • Solution: per-flow fair queueing at router
What if TCP connection is short? • Slow start dominates performance • What if network is unloaded? • Burstiness causes extra drops • Packet losses unreliable indicator • can lose connection setup packet • can get drop when connection near done • signal unrelated to sending rate • In limit, have to signal every connection • 50% loss rate as increase # of connections
Example: 10KB document10Mb/s Ethernet,70ms RTT, 536 MSS • Ethernet ~ 10 Mb/s • 64KB window, 70ms RTT ~ 7.5 Mb/s • can only use 10KB window ~ 1.2 Mb/s • 5% drop rate ~ 275 Kb/s (Matthis) • model timeouts ~ 228 Kb/s (Padhye) • slow start, no losses ~ 140 Kb/s • slow start, with 5% drop ~ 75 Kb/s
Short flow bandwidth Flow length=10Kbytes, RTT=70ms
Sharing congestion information • Intra-host sharing • Multiple web connections from a host • [Padmanabhan98, Touch97] • Inter-host sharing • For a large server farm or a large client population • How much potential is there?
Sharing Congestion Information Enterprise/Campus Network Border Router Congestion Gateway Subnet Internet
Matthis et al. • Problem: TCP has complex behavior • Can we model its performance? • Make lots of simplifying assumptions • one drop per window • long flow behavior (ignore slow start) • large receiver window • no timeouts • Result
What does this mean? • Bigger RTT => worse performance • For shared links, TCP penalizes WAN traffic • Queueing delay hurts performance • Higher drop rate => worse performance • correlated vs. non-correlated drops • does my behavior cause more drops for me, or just more drops for other flows?
Does adding buffers help? • ++ Increases ability of router to handle bursts => fewer drops • -- Allows greater overshoot in slow start => bursty traffic => more drops • -- Potentially increases RTT
Balakrishnan et al. • Observation: poor performance on wireless links (<= 10% of peak) • Problem: Interaction with TCP • Wireless links drop pkts due to noise, handoffs • TCP interprets all drops as congestion • =>TCP will reduce sending rate • Paper’s goal: Evaluate proposed solutions
Alternatives • Modify TCP implementation • better handling of multiple drops per window • even faster retransmit (one dup ack) • explicit bit for non-congestion packet loss • Retransmission at link layer • Can result in packet reordering • TCP-aware suppresses duplicate acks • Split connection
Modify TCP Implementation • E2E: standard TCP • NewReno: if multiple drops in window, first ack after fast recovery will be “partial” • Selective ACK: list recent missing pkts • IETF -- wait for 3 dup acks • SMART -- resend after one dup ack • ELN: explicit bit for non-congestion loss • ELN + SMART not simulated
Link Layer Solutions • Link layer: transparent retransmission • end host sees packets arrive out of order • TCP aware link layer • suppress duplicate acks • Link layer with SACK • better handling of multiple drops • Link layer with SACK and TCP aware • bingo!
Split TCP • Install proxy at wireless link • allows local retransmissions • But • proxy failure visible to end hosts • end to end xput limited to min across hops