TCP Congestion Control and Common AQM Schemes: Quick Revision
Shivkumar Kalyanaraman
Rensselaer Polytechnic Institute
shivkuma@ecse.rpi.edu
http://www.ecse.rpi.edu/Homepages/shivkuma
Based in part upon slides of Prof. Raj Jain (OSU), Srini Seshan (CMU), J. Kurose (U Mass), I. Stoica (UCB).
Overview
• TCP Congestion Control Model and Mechanisms
• TCP Versions: Tahoe, Reno, NewReno, SACK, Vegas, etc.
• AQM schemes: common goals, RED, …
TCP Congestion Control
• Maintains three variables:
  • cwnd – congestion window
  • rcv_win – receiver advertised window
  • ssthresh – threshold size (used to update cwnd)
    • Rough estimate of knee point…
• For sending use: win = min(rcv_win, cwnd)
Packet Conservation: Self-clocking
[Figure: self-clocking between Sender and Receiver; labels Pb, Pr, Ar, Ab, As]
• Implications of ack-clocking:
  • More batching of acks => bursty traffic
  • Less batching leads to a large fraction of Internet traffic being just acks (overhead)
TCP: Slow Start
• Whenever starting traffic on a new connection, or whenever increasing traffic after congestion was experienced:
  • Set cwnd = 1
  • Each time a segment is acknowledged, increment cwnd by one (cwnd++).
• Does Slow Start increment slowly? Not really. In fact, the increase of cwnd is exponential!
• The window reaches W after about log2(W) round trips, i.e. in time RTT * log2(W)
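The exponential growth can be checked with a few lines of Python. This is a minimal sketch, assuming one ACK per segment (no delayed ACKs) and no loss; the function name `slow_start_rounds` is made up for illustration.

```python
import math


def slow_start_rounds(target_window):
    """Return cwnd at the start of each round trip until it reaches target_window."""
    cwnd = 1
    history = [cwnd]
    while cwnd < target_window:
        acks_this_round = cwnd  # assumption: every segment sent is ACKed individually
        for _ in range(acks_this_round):
            cwnd += 1  # cwnd++ per acknowledged segment
        history.append(cwnd)
    return history


history = slow_start_rounds(64)
print(history)  # [1, 2, 4, 8, 16, 32, 64]
print(len(history) - 1, math.log2(64))  # 6 6.0
```

Each round sends cwnd segments and receives cwnd ACKs, so cwnd doubles per round trip, which is where the RTT * log2(W) figure comes from.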
Slow Start Example
[Figure: sender/receiver timeline. cwnd = 1: segment 1, ACK for segment 1. cwnd = 2: segments 2 and 3, ACK for segments 2 + 3. cwnd = 4: segments 4, 5, 6, 7, ACK for segments 4+5+6+7. cwnd = 8.]
• The congestion window size grows very rapidly
• TCP slows down the increase of cwnd when cwnd >= ssthresh
Slow Start Sequence Plot
[Figure: sequence number vs. time; the window doubles every round]
Congestion Avoidance
• Goal: maintain operating point at the left of the cliff
• How?
  • additive increase: starting from the rough estimate (ssthresh), slowly increase cwnd to probe for additional available bandwidth
  • multiplicative decrease: cut congestion window size aggressively if a loss is detected.
• If cwnd > ssthresh then each time a segment is acknowledged, increment cwnd by 1/cwnd, i.e. (cwnd += 1/cwnd).
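A short sketch shows why the per-ACK increment of 1/cwnd amounts to roughly one segment per round trip. It assumes one ACK per segment and a fractional cwnd; the function name is hypothetical.

```python
def congestion_avoidance_round(cwnd):
    """Apply cwnd += 1/cwnd once per ACK for one round trip and return the new cwnd."""
    acks = int(cwnd)  # assumption: one ACK for each full segment in the window
    for _ in range(acks):
        cwnd += 1.0 / cwnd
    return cwnd


cwnd = 8.0
for rtt in range(4):
    new_cwnd = congestion_avoidance_round(cwnd)
    print(f"RTT {rtt}: cwnd {cwnd:.2f} -> {new_cwnd:.2f}")
    cwnd = new_cwnd
```

The gain per round is slightly below 1 because cwnd grows while the ACKs of the round are still arriving.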
Additive Increase/Multiplicative Decrease (AIMD) Policy
[Figure: User 1’s Allocation x1 vs. User 2’s Allocation x2, with the Fairness Line, the Efficiency Line, and points x0, x1, x2]
• Assumption: decrease policy must (at minimum) reverse the load increase over-and-above efficiency line
• Implication: decrease factor should be conservatively set to account for any congestion detection lags etc.
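The convergence toward the fairness line can be illustrated with a two-user simulation. This is a sketch under simplifying assumptions that are not in the slides: both users see the same binary congestion signal at the same time, the increase step is 1 and the decrease factor is 0.5.

```python
def aimd(x1, x2, capacity, steps, incr=1.0, decr=0.5):
    """Run synchronized AIMD for two users sharing a link of the given capacity."""
    for _ in range(steps):
        if x1 + x2 > capacity:
            # Above the efficiency line: both users back off multiplicatively.
            x1 *= decr
            x2 *= decr
        else:
            # At or below the efficiency line: both users probe additively.
            x1 += incr
            x2 += incr
    return x1, x2


x1, x2 = aimd(1.0, 50.0, capacity=60.0, steps=200)
print(round(x1, 3), round(x2, 3), round(abs(x1 - x2), 3))
```

Additive increase leaves the gap x2 - x1 unchanged, while every multiplicative decrease halves it, so the allocations drift toward the fairness line.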
Congestion Avoidance Sequence Plot
[Figure: sequence number vs. time; the window grows by 1 every round]
Slow Start/Congestion Avoidance Example
• Assume that ssthresh = 8
[Figure: cwnd (in segments) vs. round-trip times, with ssthresh marked]
Putting Everything Together: TCP Pseudo-code
Initially:
    cwnd = 1;
    ssthresh = infinite;
New ack received:
    if (cwnd < ssthresh)
        /* Slow Start */
        cwnd = cwnd + 1;
    else
        /* Congestion Avoidance */
        cwnd = cwnd + 1/cwnd;
Timeout: (loss detection)
    /* Multiplicative decrease */
    ssthresh = win/2;
    cwnd = 1;
while (next < unack + win)
    transmit next packet;
where win = min(cwnd, flow_win);
[Figure: sequence number line marking unack, next, and win]
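The pseudo-code translates almost line for line into Python. The sketch below is one possible rendering, not a real TCP stack: the class name `WindowSender` and the helper `trace` are hypothetical, one ACK per segment is assumed, and the transmit loop is replaced by a per-round ACK count.

```python
import math
from dataclasses import dataclass


@dataclass
class WindowSender:
    """Hypothetical holder for the cwnd/ssthresh state of the pseudo-code."""
    cwnd: float = 1.0
    ssthresh: float = math.inf
    flow_win: float = math.inf  # receiver's advertised window

    @property
    def win(self):
        return min(self.cwnd, self.flow_win)

    def on_ack(self):
        if self.cwnd < self.ssthresh:
            self.cwnd += 1.0              # slow start
        else:
            self.cwnd += 1.0 / self.cwnd  # congestion avoidance

    def on_timeout(self):
        self.ssthresh = self.win / 2.0    # multiplicative decrease
        self.cwnd = 1.0


def trace(sender, rounds):
    """Record floor(cwnd) at the start of each round trip, one ACK per segment."""
    history = []
    for _ in range(rounds):
        history.append(int(sender.cwnd))
        for _ in range(int(sender.win)):
            sender.on_ack()
    return history


sender = WindowSender(ssthresh=8)
print(trace(sender, 4))  # [1, 2, 4, 8]
sender.on_timeout()
print(sender.cwnd, round(sender.ssthresh, 2))
```

With ssthresh = 8 the window doubles up to 8 and then grows by a little under one segment per round trip until the timeout resets it to 1.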
The big picture
[Figure: cwnd vs. time, showing Slow Start, Congestion Avoidance, and a Timeout]
Packet Loss Detection: Timeout Avoidance
• Basic approach: wait for the Retransmission Time Out (RTO)
• What’s the problem with this?
  • RTO is a performance killer
  • In the BSD TCP implementation, RTO is usually more than 1 second
    • the granularity of the RTT estimate is 500 ms
    • the retransmission timeout is at least two times the RTT
• Solution: Don’t wait for RTO to expire
  • Use fast retransmission/recovery for loss detection
  • Fall back to RTO only if these mechanisms fail.
• TCP Versions: Tahoe, Reno, NewReno, SACK
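Fast retransmit infers a loss from repeated cumulative ACKs instead of waiting for the timer. The slides do not state the trigger; the sketch below assumes the conventional threshold of three duplicate ACKs (as in RFC 5681), and the function name is made up for illustration.

```python
DUP_ACK_THRESHOLD = 3  # assumption: conventional value, not given in the slides


def fast_retransmit_points(acks):
    """Return the ACK numbers at which a sender would trigger fast retransmit."""
    last_ack = None
    dup_count = 0
    triggers = []
    for ack in acks:
        if ack == last_ack:
            dup_count += 1
            if dup_count == DUP_ACK_THRESHOLD:
                triggers.append(ack)  # retransmit the segment the receiver is waiting for
        else:
            last_ack = ack
            dup_count = 0
    return triggers


print(fast_retransmit_points([1, 2, 3, 3, 3, 3, 3, 7]))  # [3]
```

If too few duplicate ACKs arrive (for example with a very small window), the threshold is never reached and the sender falls back to the RTO.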
TCP Congestion Control Summary
• Sliding window limited by receiver window.
• Dynamic windows: slow start (exponential rise), congestion avoidance (additive rise), multiplicative decrease.
• Ack clocking
• Adaptive timeout: need mean RTT & deviation
• Timer backoff and Karn’s algorithm during retransmission
• Go-back-N or Selective retransmission
• Cumulative and Selective acknowledgements
• Timeout avoidance: Fast Retransmit
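The adaptive timeout, timer backoff, and Karn’s algorithm items can be sketched together. The constants below (gains of 1/8 and 1/4, a deviation multiplier of 4, doubling on timeout) follow the standard estimator described in RFC 6298 and are not taken from the slides; the class name and the `min_rto` and `max_rto` parameters are hypothetical.

```python
class RtoEstimator:
    ALPHA = 1 / 8  # gain for the smoothed RTT (RFC 6298 value, an assumption here)
    BETA = 1 / 4   # gain for the RTT deviation
    K = 4          # deviation multiplier

    def __init__(self, min_rto=1.0, max_rto=60.0):
        self.min_rto = min_rto
        self.max_rto = max_rto
        self.srtt = None
        self.rttvar = None
        self.rto = max(min_rto, 1.0)

    def sample(self, rtt, retransmitted=False):
        """Feed one RTT measurement; Karn's algorithm skips retransmitted segments."""
        if retransmitted:
            return self.rto
        if self.srtt is None:
            self.srtt = rtt
            self.rttvar = rtt / 2
        else:
            self.rttvar = (1 - self.BETA) * self.rttvar + self.BETA * abs(self.srtt - rtt)
            self.srtt = (1 - self.ALPHA) * self.srtt + self.ALPHA * rtt
        self.rto = min(self.max_rto, max(self.min_rto, self.srtt + self.K * self.rttvar))
        return self.rto

    def on_timeout(self):
        """Exponential timer backoff: double the RTO after each expiry."""
        self.rto = min(self.max_rto, self.rto * 2)
        return self.rto


est = RtoEstimator(min_rto=0.0)
print(round(est.sample(0.1), 3))  # 0.3
print(round(est.on_timeout(), 3))  # 0.6
```

Skipping samples from retransmitted segments avoids the ambiguity of not knowing which transmission an ACK belongs to.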
Queuing Disciplines
• Each router must implement some queuing discipline
• Queuing allocates bandwidth and buffer space:
  • Bandwidth: which packet to serve next (scheduling)
  • Buffer space: which packet to drop next (buffer management)
• Queuing also affects latency
[Figure: Traffic Sources, Traffic Classes (Class A, Class B, Class C), Drop, Scheduling, Buffer Management]
Typical Internet Queuing
• FIFO + drop-tail
  • Simplest choice
  • Used widely in the Internet
• FIFO (first-in-first-out)
  • Implies single class of traffic
• Drop-tail
  • Arriving packets get dropped when queue is full regardless of flow or importance
• Important distinction:
  • FIFO: scheduling discipline
  • Drop-tail: drop (buffer management) policy
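The distinction between the scheduling discipline and the drop policy shows up clearly in code. A minimal sketch, with a hypothetical class name and capacity counted in packets:

```python
from collections import deque


class DropTailFifo:
    def __init__(self, capacity):
        self.capacity = capacity  # buffer size in packets
        self.queue = deque()
        self.dropped = 0

    def enqueue(self, packet):
        """Drop-tail (buffer management): reject the arriving packet when the queue is full."""
        if len(self.queue) >= self.capacity:
            self.dropped += 1
            return False
        self.queue.append(packet)
        return True

    def dequeue(self):
        """FIFO (scheduling): always serve the oldest queued packet."""
        return self.queue.popleft() if self.queue else None


q = DropTailFifo(capacity=2)
print([q.enqueue(p) for p in ("a", "b", "c")])  # [True, True, False]
print(q.dequeue(), q.dropped)  # a 1
```

Neither method looks at which flow a packet belongs to, which is the lack of isolation discussed next.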
FIFO + Drop-tail Problems
• FIFO issues: In a FIFO discipline, the service seen by a flow is convoluted with the arrivals of packets from all other flows!
  • No isolation between flows: full burden on e2e control
  • No policing: send more packets, get more service
• Drop-tail issues:
  • Routers are forced to have large queues to maintain high utilizations
  • Larger buffers => larger steady state queues/delays
  • Synchronization: end hosts react to same events because packets tend to be lost in bursts
  • Lock-out: a side effect of burstiness and synchronization is that a few flows can monopolize queue space
Queue Management Ideas
• Synchronization, lock-out:
  • Random drop: drop a randomly chosen packet
  • Drop front: drop packet from head of queue
• High steady-state queuing vs burstiness:
  • Early drop: Drop packets before queue full
  • Do not drop packets “too early” because queue may reflect only burstiness and not true overload
• Misbehaving vs Fragile flows:
  • Drop packets proportional to queue occupancy of flow
  • Try to protect fragile flows from packet loss (eg: color them or classify them on the fly)
• Drop packets vs Mark packets:
  • Dropping packets interacts with reliability mechanisms
  • Mark packets: need to trust end-systems to respond!
Packet Drop Dimensions
• Aggregation: Single class, Per-connection state, Class-based queuing
• Drop position: Tail, Head, Random location
• Early drop vs. Overflow drop
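Random Early Detection (RED)
[Figure: queue with Min thresh and Max thresh marked against the Average Queue Length; plot of P(drop) vs. Avg queue length, with minth and maxth on the queue-length axis and maxP and 1.0 on the probability axis]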
Random Early Detection (RED)
• Maintain running average of queue length
  • Low pass filtering
• If avg Q < minth do nothing
  • Low queuing, send packets through
• If avg Q > maxth, drop packet
  • Protection from misbehaving sources
• Else mark (or drop) packet in a manner proportional to queue length & bias to protect against synchronization
  • Pb = maxp * (avg - minth) / (maxth - minth)
  • Further, bias Pb by history of unmarked packets
  • Pa = Pb / (1 - count * Pb)
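The marking rule can be sketched as a small class. This is an illustration of the formulas on the slide, not a full RED implementation: the class and method names are hypothetical, the averaging weight of 0.002 is an assumed default, idle-time handling is omitted, and `count` is simply the number of packets let through since the last mark.

```python
import random


class RedQueue:
    def __init__(self, min_th, max_th, max_p, weight=0.002, rng=None):
        self.min_th = min_th
        self.max_th = max_th
        self.max_p = max_p
        self.weight = weight          # low-pass filter gain (assumed default)
        self.rng = rng or random.Random()
        self.avg = 0.0                # running average of the queue length
        self.count = 0                # unmarked packets since the last mark

    def update_avg(self, queue_len):
        self.avg = (1 - self.weight) * self.avg + self.weight * queue_len

    def mark_probability(self):
        if self.avg < self.min_th:
            return 0.0
        if self.avg > self.max_th:
            return 1.0
        pb = self.max_p * (self.avg - self.min_th) / (self.max_th - self.min_th)
        denom = 1 - self.count * pb
        return 1.0 if denom <= 0 else min(1.0, pb / denom)

    def on_arrival(self, queue_len):
        """Return True if the arriving packet should be marked (or dropped)."""
        self.update_avg(queue_len)
        p = self.mark_probability()
        if self.rng.random() < p:
            self.count = 0
            return True
        self.count = self.count + 1 if p > 0 else 0
        return False


red = RedQueue(min_th=10, max_th=40, max_p=0.1, weight=1.0)
red.update_avg(25)
print(round(red.mark_probability(), 3))  # 0.05
```

Dividing by (1 - count * Pb) raises the probability the longer no packet has been marked, which spaces marks more evenly and helps break synchronization.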
RED Issues
• Issues:
  • Breaks synchronization well
  • Extremely sensitive to parameter settings
  • Wild queue oscillations upon load changes
  • Fails to prevent buffer overflow as #sources increases
  • Does not help fragile flows (eg: small window flows or retransmitted packets)
  • Does not adequately isolate cooperative flows from non-cooperative flows
• Isolation:
  • Fair queuing achieves isolation using per-flow state
  • RED penalty box: Monitor history for packet drops, identify flows that use disproportionate bandwidth
REM (Athuraliya & Low 2000)
• Main ideas
  • Decouple congestion & performance measure
  • “Price” adjusted to match rate and clear buffer
  • Marking probability exponential in “price”
[Figure: marking probability curves for REM and RED; labels: 1, Avg queue]
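The two main ideas can be written as a price update and a marking function. The sketch below follows the commonly published form of REM as recalled here, not anything spelled out in the slides: the price moves with a weighted sum of the backlog mismatch and the rate mismatch and is clipped at zero, and the marking probability is 1 - phi^(-price). All function names, parameter names, and default values are illustrative assumptions.

```python
def rem_update_price(price, backlog, input_rate, capacity,
                     target_backlog=0.0, gamma=0.05, alpha=0.4):
    """One price update: rises when the buffer or the input rate exceeds its target."""
    mismatch = alpha * (backlog - target_backlog) + (input_rate - capacity)
    return max(0.0, price + gamma * mismatch)


def rem_mark_probability(price, phi=1.15):
    """Marking probability that is exponential in the price (phi > 1)."""
    return 1.0 - phi ** (-price)


price = 0.0
for backlog, rate in [(10, 12), (12, 12), (8, 9), (2, 8)]:
    price = rem_update_price(price, backlog, rate, capacity=10)
    print(round(price, 3), round(rem_mark_probability(price), 3))
```

Because the price, not the queue length, drives marking, the queue can be kept small while the marking probability still reflects the level of congestion.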
Comparison of AQM Performance
• DropTail: queue = 94%
• REM: queue = 1.5 pkts, utilization = 92%; g = 0.05, a = 0.4, f = 1.15
• RED: min_th = 10 pkts, max_th = 40 pkts, max_p = 0.1
What is TCP Throughput?
[Figure: sawtooth of window vs. time between 2w/3 and 4w/3, with mean w = (4w/3 + 2w/3)/2; each cycle spans 2w/3 and has Area = 2w^2/3]
• Each cycle delivers 2w^2/3 packets
• Assume: each cycle delivers 1/p packets = 2w^2/3
  • Delivers 1/p packets followed by a drop
  • => Loss probability = p/(1+p) ~ p if p is small.
• Hence t
Law
• Equilibrium window size
• Equilibrium rate
• Empirically constant a ~ 1
• Verified extensively through simulations and on Internet
• References
  • T.J. Ott, J.H.B. Kemperman and M. Mathis (1996)
  • M. Mathis, J. Semke, J. Mahdavi, T. Ott (1997)
  • T.V. Lakshman and U. Madhow (1997)
  • J. Padhye, V. Firoiu, D. Towsley, J. Kurose (1998)
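The sawtooth argument from the throughput slide can be checked numerically. Setting 1/p = 2w^2/3 gives w = sqrt(3/(2p)), and dividing by the round-trip time gives a rate in packets per second. The sketch below only restates that algebra; the function names and the `mss` parameter are assumptions for illustration.

```python
import math


def equilibrium_window(p):
    """Mean window w (in packets) solving 1/p = 2 * w**2 / 3."""
    return math.sqrt(3.0 / (2.0 * p))


def equilibrium_rate(p, rtt, mss=1.0):
    """Mean rate: one window of `mss`-sized packets per round trip of `rtt` seconds."""
    return equilibrium_window(p) * mss / rtt


p = 0.01
w = equilibrium_window(p)
print(round(w, 2))                           # 12.25
print(round(2 * w ** 2 / 3, 6), 1 / p)       # 100.0 100.0
print(round(equilibrium_rate(p, rtt=0.1), 1))  # 122.5
```

The resulting constant sqrt(3/2) is about 1.22, which is in line with the slide's remark that the empirical constant is close to 1.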