Router-assisted Congestion Control: RED, ECN, and XCP. 2007.
Outline • Active Queue Mgmt. (AQM) and Random Early Detection (RED) • Explicit congestion notification (ECN) • eXplicit Control Protocol (XCP)
Router Support For Congestion Management • Traditional Internet • Congestion control mechanisms at end-systems, mainly implemented in TCP • Routers play little role • Traditional routers • FIFO • Tail drop • You need both end-host congestion control and router support for congestion control • End-host congestion control to adapt • Router congestion control to protect / isolate
What Would a Router Do? • Router mechanisms affecting congestion management • Congestion Signaling: • Drop, mark, • send explicit messages • Buffer management: • Which packets to drop? • When to signal congestion? • Scheduling • If multiple connections, which one’s packets to send at any given time? • Fair Queueing (FQ) does not eliminate congestion; it just manages it
Congestion Signaling • Drops (we’ve covered) • In-band marking • One bit (congested or not): ECN • Multiple bits (how congested / how much available): XCP • Out-of-band notification • IP Source Quench • Problem: It sends more packets when things are congested… • Not widely used.
When to mark packets? • Drop-tail / FIFO: • When the buffer is full • The de-facto mechanism today • Very easy to implement • Keeps average queue length high • About ½ full on average, which adds delay • Drawbacks of FIFO with Tail-drop • Buffer lock-out by misbehaving flows • Synchronizing effect for multiple TCP flows • Bursts of multiple consecutive packet drops • Bad for TCP fast recovery • Note: FIFO is a scheduling discipline, NOT a drop policy, but the two are often bundled
Active Queue Mgmt. w/RED • Explicitly tries to keep queue small • Low delay, but still high throughput under bursts • (This is “power”: throughput / delay) • Assumes that hosts respond to lost packets • Technique: • Randomization to avoid synchronization • (Recall that if many flows, don’t need as much buffer space!) • Drop before the queue is actually full • RED is “Random Early Detection” • Could mean marking, not dropping
RED • FIFO scheduling • Buffer management: • Discard probability is computed as a function of average queue length • [Figure: discard probability (0 to 1) vs. average queue length, with thresholds min_th and max_th]
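The mapping from average queue length to discard probability can be sketched as follows. This is a minimal illustration, not the slide author's code: `max_p` (the probability reached at `max_th`) and the averaging weight `w_q` are parameter names borrowed from the RED literature, and the default `w_q=0.002` is only an illustrative value.

```python
def update_avg_queue(avg_q, q_len, w_q=0.002):
    """Exponentially weighted moving average of the instantaneous queue length.

    w_q is an assumed smoothing weight; a small value lets short bursts pass
    without moving the average much.
    """
    return (1.0 - w_q) * avg_q + w_q * q_len


def red_drop_probability(avg_q, min_th, max_th, max_p):
    """Discard probability as a function of the AVERAGE queue length.

    Below min_th nothing is dropped, between the thresholds the probability
    ramps linearly up to max_p, and at or above max_th every packet is dropped.
    """
    if avg_q < min_th:
        return 0.0
    if avg_q >= max_th:
        return 1.0
    return max_p * (avg_q - min_th) / (max_th - min_th)
```

Because the decision uses the average rather than the instantaneous queue, a short burst can fill the buffer briefly without triggering drops.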
RED • Advantages • Absorbs bursts better • Avoids synchronization • Signals end systems earlier • Problems with RED • No protection: if a flow misbehaves it will hurt the other flows • Fair Queueing • Advantages: protection among flows • Misbehaving flows will not affect the performance of well-behaving flows • FIFO does not have such a property
Simulation Example • 1 UDP (10 Mbps) and 31 TCPs sharing a 10 Mbps link • Stateless solution: Random Early Detection • Stateful solution: Fair Queueing • Our Solution: Core-Stateless Fair Queueing • [Figure: flows UDP (#1), TCP (#2), ..., TCP (#32) sharing a 10 Mbps link]
RED parameter sensitivity • RED can be very sensitive to parameters • Tuning them is a bit of a black art! • One thing: “gentle” RED • Drop probability pb rises linearly from max_p to 1 as the average queue qa goes from maxthresh to 2*maxthresh (max_p <= pb <= 1 as maxthresh <= qa <= 2*maxthresh) • instead of a “cliff” effect. Makes RED more robust to choice of maxthresh, max_p • But note: Still must choose wq, minthresh… • RED is not very widely deployed, but testing against both RED and DropTail is very common in research, because it could be deployed.
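The “gentle” variant can be sketched as a piecewise-linear function. This is a hedged illustration only: the function name and argument names are chosen here for readability and mirror the slide's maxthresh, max_p, and qa.

```python
def gentle_red_drop_probability(avg_q, min_th, max_th, max_p):
    """Drop probability for "gentle" RED.

    Instead of jumping from max_p straight to 1 at max_th (the "cliff"),
    the probability keeps rising linearly from max_p to 1 as the average
    queue grows from max_th to 2 * max_th.
    """
    if avg_q < min_th:
        return 0.0
    if avg_q < max_th:
        return max_p * (avg_q - min_th) / (max_th - min_th)
    if avg_q < 2 * max_th:
        return max_p + (1.0 - max_p) * (avg_q - max_th) / max_th
    return 1.0
```

With the second ramp in place, a slightly too small maxthresh degrades gracefully instead of forcing every packet to be dropped.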
ECN: Explicit Congestion Notification • Marking done at the IP layer • Router sets bit for congestion instead of dropping a packet • Receiver should copy bit from packet to ack • If bit set, react the same way as if it had been dropped (but you don’t have to retransmit or risk losing ACK clocking) • Sender reduces cwnd when it receives ack with marking • Where does it help? • Delay-sensitive apps, particularly low-bw ones • Small window scenarios
ECN • Some complexity: • How to send in legacy IP packets (IP ToS field) • Determining ECN support: two bits (one “ECN works”, one “congestion or not”) • How to echo bits to sender (TCP header bit) • More complexity: Cheating! • Receiver can clear ECN bit • Solution: Multiple unmarked packet states • Sender uses multiple unmarked packet states • Router sets ECN mark, clearing original unmarked state • Receiver must either return ECN bit or guess nonce • More nonce bits mean less likelihood of undetected cheating (1 bit is sufficient)
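The two ECN bits and the mark-instead-of-drop rule can be sketched as below. The codepoint values follow RFC 3168; the function names are hypothetical, and halving cwnd on an echoed mark is an assumption standing in for "react the same way as if it had been dropped".

```python
# Two-bit ECN field in the IP header (codepoints per RFC 3168).
NOT_ECT = 0b00  # sender does not support ECN
ECT_1 = 0b01    # ECN-capable, unmarked state 1
ECT_0 = 0b10    # ECN-capable, unmarked state 0
CE = 0b11       # congestion experienced


def router_signal(ecn_field, congested):
    """Return (new_ecn_field, dropped) for one packet at a router.

    ECN-capable packets are marked instead of dropped; legacy packets
    are still dropped. Marking overwrites whichever unmarked state
    (ECT_0 or ECT_1) the sender chose, which is what erases the nonce.
    """
    if not congested:
        return ecn_field, False
    if ecn_field == NOT_ECT:
        return ecn_field, True
    return CE, False


def sender_on_ack(cwnd, echo_set):
    """Assumed reaction: halve cwnd on an echoed mark, with no retransmission."""
    return max(1.0, cwnd / 2.0) if echo_set else cwnd
```

The two ECT codepoints are the "multiple unmarked packet states": a receiver that hides a mark has to guess which one the sender originally used.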
TCP Problems • When TCP congestion control was originally designed in 1988: • Key applications: FTP, E-mail • Maximum link bandwidth: 10Mb/s • Users were mostly from academic and government organizations (i.e., well-behaved) • Almost all links were wired (i.e., negligible error rate) • Thus, current problems with TCP: • High bandwidth-delay product paths • Wireless (or any high error links) • Selfish users
High Delay High Bandwidth Challenges: AIMD • TCP lacks fast response • In AIMD, when spare bandwidth is available, TCP increases its rate only slowly • cwnd increases by 1 packet/RTT even if spare bandwidth is huge • Time to reach 100% utilization is proportional to available bandwidth • e.g., 2 flows share a 10Gb/s link; when one flow finishes, the available bandwidth is 5Gb/s • e.g., 5Gb/s available, 200ms RTT, 1460B payload: about 17,000s • If High Delay (RTT > 200 ms), need more time
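The 17,000 s figure can be checked with a few lines of arithmetic. This sketch assumes the window must grow by the full bandwidth-delay product at 1 packet per RTT and ignores header overhead; the function name is illustrative.

```python
def aimd_ramp_time(bandwidth_bps, rtt_s, payload_bytes):
    """Seconds needed to grow cwnd by one bandwidth-delay product at 1 pkt/RTT."""
    # Packets needed in flight to fill the available bandwidth.
    bdp_packets = bandwidth_bps * rtt_s / (8 * payload_bytes)
    # Additive increase adds one packet per round trip.
    return bdp_packets * rtt_s


ramp = aimd_ramp_time(5e9, 0.2, 1460)  # roughly 17,000 seconds
```

The result scales with the square of the RTT: a longer round trip both enlarges the window to fill and slows each increment.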
High Delay High Bandwidth Challenges: Slow Start • TCP lacks fast response • Short TCP flows (majority) cannot acquire the spare bandwidth faster than “slow start” • In slow start, window increases exponentially • e.g., 10Gb/s, 200ms RTT, 1460B payload, assume no loss • Time to fill pipe: 18 round trips = 3.6 s • Throughput: 382MB / 3.6s = 850Mb/s • 8.5% utilization: not very good • If High Delay (RTT > 200 ms), ? • Lose only one packet and the flow drops out of slow start into AIMD, growing by 1 pkt/RTT (even worse) and taking forever to grab the large bandwidth
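The slow-start numbers can be reproduced as follows. This is a sketch under stated assumptions: the window starts at 1 packet, doubles every RTT, and no packet is lost; the function name is illustrative.

```python
import math


def slow_start_fill(bandwidth_bps, rtt_s, payload_bytes):
    """Return (round_trips, seconds, throughput_bps, utilization) to fill the pipe."""
    bdp_packets = bandwidth_bps * rtt_s / (8 * payload_bytes)
    # Window doubles each RTT: 1, 2, 4, ... until it covers the BDP.
    round_trips = math.ceil(math.log2(bdp_packets))
    # Total packets sent over those rounds: 1 + 2 + ... + 2**(n-1).
    packets_sent = 2 ** round_trips - 1
    seconds = round_trips * rtt_s
    throughput_bps = packets_sent * payload_bytes * 8 / seconds
    return round_trips, seconds, throughput_bps, throughput_bps / bandwidth_bps


rounds, seconds, throughput, utilization = slow_start_fill(10e9, 0.2, 1460)
```

Even in the loss-free case the flow averages under a tenth of the link rate while ramping up, which is why short flows rarely see the spare bandwidth.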
TCP congestion control performs poorly as bandwidth or delay increases • Inefficient as bandwidth or delay increases [Low02] • [Figure: avg. TCP utilization vs. bottleneck bandwidth (Mb/s); 50 flows in both directions, buffer = BW x delay, RTT = 80 ms] • [Figure: avg. TCP utilization vs. round trip delay (sec); 50 flows in both directions, buffer = BW x delay, BW = 155 Mb/s]
Solution: Decouple Congestion Control from Fairness • Congestion Control: High Utilization; Small Queues; Few Drops • Fairness: Bandwidth Allocation Policy • Coupled because a single mechanism controls both • Example: In TCP, Additive-Increase Multiplicative-Decrease (AIMD) controls both • How does decoupling solve the problem? • To control congestion: use MIMD, which shows fast response • To control fairness: use AIMD, which converges to fairness
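A toy two-flow model illustrates why the two goals want different rules. It is only a sketch: the synchronized feedback, the capacity of 100, and the increase/decrease constants are all assumptions chosen for illustration.

```python
def simulate(x1, x2, capacity, increase, decrease, steps=200):
    """Two flows that receive the same binary feedback every step."""
    for _ in range(steps):
        if x1 + x2 <= capacity:
            x1, x2 = increase(x1), increase(x2)
        else:
            x1, x2 = decrease(x1), decrease(x2)
    return x1, x2


# AIMD: add 1 when there is room, halve on congestion.
aimd = simulate(1.0, 50.0, 100.0, lambda x: x + 1.0, lambda x: x * 0.5)
# MIMD: multiply by 1.1 when there is room, halve on congestion.
mimd = simulate(1.0, 50.0, 100.0, lambda x: x * 1.1, lambda x: x * 0.5)

aimd_gap = abs(aimd[0] - aimd[1])  # shrinks: each halving halves the gap
mimd_ratio = mimd[1] / mimd[0]     # stays at the initial 50:1 ratio
```

MIMD moves quickly toward the capacity but never changes the ratio between flows, while AIMD halves the gap on every decrease. That is the motivation for using MIMD on the aggregate and AIMD for the split.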
Why Don’t Current Approaches Use Expressive Feedback? • Efficiency Problem: • Efficient link utilization needs expressive feedback • In coupled systems, expressive feedback led to per-flow state (Unscalable!) • Solution: Use Decoupling • Decoupling looks at efficiency as a problem about aggregate traffic • Match aggregate traffic to link capacity and drain the queue • Benefits: No need for per-flow information
Fairness Control • Router computes a flow’s fair rate explicitly • To make a decision, router needs state of all flows • Unscalable • Shuffle bandwidth in aggregate to converge to fair rates • To make a decision, router needs state of this flow • Put a flow’s state in its packets • Scalable
XCP: An eXplicit Control Protocol • Congestion Controller • Fairness Controller
How does XCP Work? • Congestion Header (fields: Round Trip Time, Congestion Window, Feedback): • RTT and congestion window are filled in by the sender and never modified in transit. • Feedback is initialized by the sender. Routers along the path modify this field. • [Figure: packet carrying a congestion header with Feedback = + 0.1 packet]
How does XCP Work? • [Figure: congestion header (Round Trip Time, Congestion Window, Feedback) in transit; Feedback changes from + 0.1 packet to - 0.3 packet]
How does XCP Work? • Congestion Window = Congestion Window + Feedback • XCP extends ECN and CSFQ • Routers compute feedback without any per-flow state at the router (using the flow’s state, RTT and congestion window, carried in the Congestion Header)
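The life of a congestion header can be sketched as below. The class and function names are hypothetical. Two details are assumptions not stated on the slides: a router overwrites the feedback only when its own value is smaller (so the bottleneck's decision survives), and the sender never shrinks cwnd below one packet.

```python
from dataclasses import dataclass


@dataclass
class CongestionHeader:
    rtt: float       # filled in by the sender, never modified in transit
    cwnd: float      # filled in by the sender, never modified in transit
    feedback: float  # initialized by the sender, may be reduced by routers


def router_process(header, router_feedback):
    """Keep the most restrictive feedback seen along the path (assumed rule)."""
    header.feedback = min(header.feedback, router_feedback)
    return header


def sender_on_ack(cwnd, feedback):
    """Congestion Window = Congestion Window + Feedback, floored at 1 packet."""
    return max(cwnd + feedback, 1.0)


hdr = CongestionHeader(rtt=0.08, cwnd=10.0, feedback=0.1)
router_process(hdr, 0.5)    # uncongested router: feedback stays +0.1
router_process(hdr, -0.3)   # bottleneck router: feedback becomes -0.3
new_cwnd = sender_on_ack(hdr.cwnd, hdr.feedback)
```

The router functions take only the header and the router's own value as input, which is the sense in which no per-flow state is kept.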
How Does an XCP Router Compute the Feedback (decoupling)? • Congestion Controller • Goal: Matches input traffic to link capacity & drains the queue • Looks at aggregate traffic & queue • MIMD • Algorithm: • Aggregate traffic changes by $\phi$ • $\phi$ ~ Spare Bandwidth • $\phi$ ~ - Queue Size • So, $\phi = \alpha \, d_{avg} \, Spare - \beta \, Queue$ • Fairness Controller • Goal: Divides the aggregate feedback $\phi$ between flows to converge to fairness • Looks at a flow’s state (RTT and congestion window) in Congestion Header • AIMD • Algorithm: • If $\phi > 0$, divide $\phi$ equally between flows • If $\phi < 0$, divide $\phi$ between flows proportionally to their current rates
Details • Congestion Controller • Aggregate feedback: $\phi = \alpha \, d_{avg} \, Spare - \beta \, Queue$ • Theorem: System converges to optimal utilization (i.e., stable) for any link bandwidth, delay, number of sources if: $0 < \alpha < \frac{\pi}{4\sqrt{2}}$ and $\beta = \alpha^2 \sqrt{2}$ (Proof based on Nyquist Criterion) • No Parameter Tuning • Fairness Controller • Algorithm: If $\phi > 0$, divide $\phi$ equally between flows; if $\phi < 0$, divide $\phi$ between flows proportionally to their current rates • Need to estimate number of flows N: $N = \sum_{pkts\ in\ T} \frac{1}{T \times (Cwnd_{pkt} / RTT_{pkt})}$ • RTTpkt: Round Trip Time in header • Cwndpkt: Congestion Window in header • T: Counting Interval • No Per-Flow State at Router
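The two controllers can be sketched in a few functions. The constants 0.4 and 0.226 are the gain values reported in the XCP paper, not on the slides, and should be treated as assumptions here; all function names are hypothetical. `split_feedback` takes an explicit list of rates only to illustrate the allocation policy, whereas a real XCP router applies it per packet using the header fields.

```python
ALPHA = 0.4    # assumed gain on spare bandwidth
BETA = 0.226   # assumed gain on the persistent queue


def aggregate_feedback(d_avg, capacity, input_rate, queue):
    """Congestion controller: desired change in aggregate traffic per control interval."""
    spare = capacity - input_rate
    return ALPHA * d_avg * spare - BETA * queue


def estimate_flows(packets, interval):
    """Estimate the number of flows from (rtt_pkt, cwnd_pkt) pairs seen in `interval`.

    A flow sends interval * cwnd / rtt packets per interval, so each packet
    contributes the reciprocal of that count and every flow sums to 1.
    """
    return sum(rtt / (interval * cwnd) for rtt, cwnd in packets)


def split_feedback(phi, rates):
    """Fairness controller policy: equal shares if phi >= 0, rate-proportional if phi < 0."""
    if phi >= 0:
        return [phi / len(rates)] * len(rates)
    total = sum(rates)
    return [phi * r / total for r in rates]
```

Giving every flow the same increase but a rate-proportional decrease is the AIMD pattern, which is why the split converges to fair rates.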
Characteristics of Solution • Improved Congestion Control (in high bandwidth-delay & conventional environments): • Small queues • Almost no drops • Improved Fairness • Scalable (no per-flow state) • Flexible bandwidth allocation: max-min fairness, proportional fairness, differential bandwidth allocation,…
Simulations Show XCP is Better • Extensive Simulations • Compared with TCP over DropTail, RED, REM, AVQ, CSFQ • XCP: • Better utilization • Near-zero drops • Fairer • Efficient & robust to increase in bandwidth • Efficient & robust to increase in delay
Subset of Results • [Figure: senders S1, S2, …, Sn share a bottleneck link to receivers R1, R2, …, Rn] • Similar behavior over:
XCP Remains Efficient as Bandwidth or Delay Increases • [Figure: utilization as a function of bottleneck bandwidth (Mb/s)] • [Figure: utilization as a function of round trip delay (sec)]
XCP Shows Faster Response than TCP • [Figures: response over time when 40 flows are started and later stopped] • XCP shows fast response
XCP Deals Well with Short Web-Like Flows • [Figures: average utilization, average queue, and drops vs. arrivals of short flows/sec]
XCP is Fairer than TCP • [Figure: avg. throughput vs. flow ID, same RTT (all RTT = 40 ms)] • [Figure: avg. throughput vs. flow ID, different RTT (RTT is 40 ms to 330 ms)]
XCP Summary • XCP • Outperforms TCP • Efficient for any bandwidth • Efficient for any delay • Scalable (no per flow state) • Benefits of Decoupling • Use MIMD for congestion control which can grab/release large bandwidth quickly • Use AIMD for fairness which converges to fair bandwidth allocation
XCP benefits & issues • Requires “policers” at edge if you don’t trust hosts to report cwnd/rtt correctly • Much like CSFQ… • Doesn’t provide much benefit in today’s common case • But may be very significant for tomorrow’s. • High bw*rtt environments (10GigE coming to a desktop near you…) • Short flows, highly dynamic workloads • Cool insight: Decoupled fairness and congestion control • Pretty big architectural change • VCP?