Is TCP really TCP-Friendly? Does the Internet have a reasonable congestion control paradigm? Michael B. Greenwald, University of Pennsylvania
What is congestion? • Applications/clients present a larger aggregate load than intermediate nodes in the network can absorb. This can cause excessive delay and reduced throughput; if uncontrolled, it can cause congestion collapse.
What causes congestion? (Isn’t bandwidth cheap?) • Persistent congestion: solved by adequate provisioning • Intermittent high load: intermittent emergency (earthquake); extreme loads (expected: Mother’s Day; unexpected: Pathfinder pictures, Victoria’s Secret); periods of growth • Congestion: bursty traffic (statistical multiplexing); many sources converging on a single link; a low-capacity link becoming the bottleneck; a subset of multicast destinations • Congestion will still occur, even though bandwidth is getting cheaper
Why must congestion be controlled? • Congestion collapse • Links clogged with useless packets: packets that will be dropped anyway, are retransmissions, or are out of date • Long delay (for short, one-packet transactions over long distances. Rare?) • High variability in delay (jitter) • High drop rate? Not a problem in itself, since packets are only dropped if they can’t make it through the bottleneck anyway, but still a problem because dropped packets use up bandwidth on other links before being dropped, and who controls which packets get dropped? • Low utilization (inefficiency) • Fairness
Overview of Talk • The accepted framework/constraints for a solution • I demonstrate that it is certainly not Optimal, and maybe not even Good • Lots of Bad Ideas taken together can be a Good Idea. (yeah, right)
Rules of the Game: Properties • end-to-end • packets may take different paths • scale to gazillions of hosts • network is stateless (or soft-state) • no reservations • packets, not connections • fast-path of switches must not be compromised
Rules of the Game: Religion • windows (vs. rate) [conservation of packets, oscillations] • work conserving • Congestion notification cannot generate traffic (Oh! Those source-quenches).
Accepted Solutions • Slow Start, congestion avoidance (and variants) • RED (and variants) • ECN (must be compatible w/packet loss): shape still unclear
Slow-Start and Congestion Avoidance • Due to Van Jacobson • Exponential (multiplicative) decrease of window size on congestion, linear increase to probe for excess capacity • Exponential probe until the threshold is reached (the initial threshold, or 1/2 the previous maximum congestion window)
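The window dynamics on this slide can be sketched in a few lines. This is a minimal illustration, not Jacobson's implementation: it assumes one update per round trip, windows counted in packets, a fixed bottleneck `capacity`, and a variant that halves the window on congestion (the original scheme instead restarts from one packet and probes exponentially back up to the halved threshold). The names `next_cwnd` and `trace` are hypothetical.

```python
def next_cwnd(cwnd, ssthresh, congested):
    """Return (cwnd, ssthresh) after one round trip, in packets."""
    if congested:
        # Multiplicative decrease: half the current window becomes
        # both the new window and the new threshold (assumed variant).
        ssthresh = max(cwnd / 2.0, 1.0)
        return ssthresh, ssthresh
    if cwnd < ssthresh:
        # Slow start: double each round trip, capped at the threshold.
        return min(cwnd * 2.0, ssthresh), ssthresh
    # Congestion avoidance: probe linearly, one packet per round trip.
    return cwnd + 1.0, ssthresh


def trace(rtts, capacity, cwnd=1.0, ssthresh=16.0):
    """Window size at the start of each of `rtts` round trips."""
    history = []
    for _ in range(rtts):
        history.append(cwnd)
        # Assumption: congestion is signalled whenever the window
        # exceeds the bottleneck capacity.
        cwnd, ssthresh = next_cwnd(cwnd, ssthresh, cwnd > capacity)
    return history


if __name__ == "__main__":
    print(trace(12, capacity=20))
```

With a capacity of 20 packets the trace shows the exponential probe up to the initial threshold, the linear probe beyond it, and the halving once the window overshoots.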
RED: Random Early Drop/Detection • Avoid burst drop • Increase fairness • Penalty box • Approximate ideal (weighted fair queuing)
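A rough sketch of the early-drop decision follows. The slide gives no parameters, so the thresholds `min_th` and `max_th`, the ceiling `max_p`, and the averaging `weight` are illustrative assumptions, and the sketch omits the refinement that spaces drops out by counting packets since the last drop. It shows only the core idea: drop with a probability that rises with the average queue length, so drops are spread out instead of arriving in a burst when the queue overflows.

```python
import random


def update_average(avg, queue_len, weight=0.002):
    """Exponentially weighted moving average of the queue length."""
    return (1.0 - weight) * avg + weight * queue_len


def drop_probability(avg, min_th=5.0, max_th=15.0, max_p=0.1):
    """Drop probability as a function of the average queue length."""
    if avg < min_th:
        return 0.0          # short queue: never drop
    if avg >= max_th:
        return 1.0          # long queue: always drop
    # In between: rise linearly from 0 to max_p.
    return max_p * (avg - min_th) / (max_th - min_th)


def should_drop(avg, rng=random.random, **thresholds):
    """Randomized decision; `rng` is injectable to make tests repeatable."""
    return rng() < drop_probability(avg, **thresholds)
```

Because the decision is random per packet, a flow that sends more packets sees proportionally more drops, which is the source of the fairness claim.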
ECN: Explicit Congestion Notification • Why not introduce signals before you need to lose packets? Avoid packet loss • Why not explicitly specify load (and extra capacity)? Avoid searching • However, since ECN packets may get lost under congestion, packet loss must still be a congestion signal
Why is this not ideal? Accepted quibbles • Non-TCP • Packet loss due to errors (e.g. wireless) considered congestion signal • Bad RTE (round-trip estimate) can also cause false signals • “Mice” (congestion control only kicks in after 6 packets or so) • Fairness • QOS • Self-similarity of traffic • Buffer occupancy
Why is this not ideal? Fundamental flaws • If this is perfect, then why …. ? (assume previous problems solved, and can get nice slow-start curve) • Aggregated small flows do not exponentially decrease or linearly increase. • Extreme case: • 1,000,000 flows with window size of < 5 • Congestion notifies 10% of the flows, for a decrease of < 500,000 packets • Regardless, each of 1,000,000 flows increases cwnd by 1 each RTT.
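The arithmetic of the extreme case can be checked directly. The sketch below takes the slide's numbers at their upper bound (a window of 5 packets) and, as a conservative assumption not stated on the slide, lets only the flows that were not notified increase. The function name and the `backoff` parameter (fraction of the window a notified flow gives up) are hypothetical.

```python
def net_window_change(flows, window, notified_fraction, backoff=0.5):
    """Net change in aggregate window (packets) over one round trip.

    A positive result means the offered load still grows despite
    the congestion notifications.
    """
    notified = round(flows * notified_fraction)
    # Notified flows give up `backoff` of their window.
    decrease = notified * window * backoff
    # Every other flow adds one packet per round trip.
    increase = flows - notified
    return increase - decrease


if __name__ == "__main__":
    # Halving the window of the notified flows:
    print(net_window_change(1_000_000, 5, 0.10))
    # Notified flows dropping their whole window (the < 500,000 bound):
    print(net_window_change(1_000_000, 5, 0.10, backoff=1.0))
```

Even when the notified flows give up their entire window, the aggregate load still rises by hundreds of thousands of packets per round trip, so the aggregate neither decreases exponentially nor increases gently.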
Why is this not ideal? Fundamental flaws • If this is perfect, then why …. ? • Multi-hop paths: neighboring routers with large buffers do not implement slow start themselves
Conjecture: TCP is a local maximum with very steep slopes • Most new ideas, taken by themselves, make matters worse than Standard TCP. [Figure: bad drawing of a small mountain with steep cliffs, and a really large mountain in the distance]
Some locally bad ideas • Rate-based congestion control: unbounded input, oscillatory • Hop-by-hop feedback: head-of-line blocking; local, so can’t achieve global fairness • Aggregation: fractal nature of traffic • Explicit out-of-band congestion notification packets: add to load under congestion, waste bandwidth, and are unstable
But taken together … the shape of things to come? • Rate-control: • If the sender exceeds its bound without feedback, halt transmission • Meaningful across multiple hops and independent of RTT
SIRPENT/VIPER/VASSA/<no-name> • Endnodes: • (Periodically) Request to send at rate/QOS (function of price) • Response is allowed rate • Guardian ensures compliance (no impact on internal nodes) • Occasional pushback from guardian with new rate and quiet period (empty buffer). • Slow increase in rate (function of RTT to guardian).
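One way to picture the guardian is as a policer sitting between the endnode and the network. The sketch below is an assumption-heavy illustration, not the design from the talk: the class name, its methods, the packets-per-second unit, and the simple packet-spacing rule are all hypothetical, and the slow increase in rate is not modeled.

```python
class Guardian:
    """Enforces an allowed sending rate on behalf of the network."""

    def __init__(self, allowed_rate):
        self.allowed_rate = allowed_rate  # packets per second (assumed unit)
        self.quiet_until = 0.0            # end of the current quiet period
        self.next_send = 0.0              # earliest time for the next packet

    def pushback(self, now, new_rate, quiet_period):
        """Apply a pushback: new rate, plus silence to let buffers empty."""
        self.allowed_rate = new_rate
        self.quiet_until = now + quiet_period

    def admit(self, now):
        """Return True if a packet arriving at time `now` may enter."""
        if now < self.quiet_until or now < self.next_send:
            return False
        # Space admitted packets so the allowed rate is never exceeded.
        self.next_send = now + 1.0 / self.allowed_rate
        return True
```

Because all enforcement happens here, interior nodes only need to signal a new rate and a quiet period; they do not police individual sources.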
SIRPENT/VIPER/VASSA/<no-name> • Switches • Aggregate all packets into flows: • src - nexthop - nexthop+1 - QOS - congested? • Packet counters (hardware?) [Flow lookup no worse than routing, but N^2 size table] • Subscription packets periodically weight aggregated flows: sum of weights on output link = 1. (details: take into account rate-limited flows, excess capacity etc.) • Stamp subscription packet with min (current, local weight) (as a percentage of request). • If no congestion, just send FIFO • If queue buildup on output, input, or internal bus, then divide available capacity by subscription weight and pushback on neighbors --- iff count (time-weighted to rate) exceeds fair-share.
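The weighting and pushback steps on this slide might look roughly like the following. The slide leaves the details open, so this sketch makes several assumptions that should be read as guesses: weights are taken as proportional to requested rates, "dividing capacity by subscription weight" is read as giving each aggregated flow a share of capacity equal to its weight, and the function names and dictionary representation are hypothetical.

```python
def normalize_weights(requests):
    """Map {flow: requested_rate} to weights that sum to 1 on the link."""
    total = sum(requests.values())
    if total <= 0:
        return {flow: 0.0 for flow in requests}
    return {flow: rate / total for flow, rate in requests.items()}


def stamp(current, local_weight):
    """Stamp a subscription packet with the smallest weight seen so far."""
    return min(current, local_weight)


def flows_to_push_back(weights, capacity, measured_rates):
    """Under congestion, return {flow: fair_share} for flows over their share.

    Flows at or below their fair share are left alone.
    """
    fair = {flow: capacity * w for flow, w in weights.items()}
    return {
        flow: fair[flow]
        for flow, rate in measured_rates.items()
        if rate > fair[flow]
    }
```

For example, with requests of 30 and 10 on a link of capacity 100, the shares are 75 and 25; a flow measured at 40 against a share of 25 is pushed back, while a flow at 60 against a share of 75 is not.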
SIRPENT/VIPER/VASSA/<no-name> • Switches (continued) • Treat pushback exactly as if hardware limitation (and recurse) • On re-subscription guardian enforces congestion bit on source.
SIRPENT/VIPER/VASSA/<no-name> • Scales as local topology • Utilization: 60% on rate-controlled traffic, 90+% with filler (e.g. email). • Under no congestion, no overhead. • Only impact on fast-path is packet counting (needed for accounting, anyway?) • Works for multicast, UDP, etc.
For the faint of heart: Minor modifications to ECN • “ECN with indication of # of flows” --- try to make backoff exponential. • Don’t try to count flows: not scalable • Measure variation from classical sawtooth, and feedback rate. • Actually, key seems to be to limit increase. Backoff takes care of itself, eventually.
Not quite all the details/bugs worked out • Unstable when trying to allocate more than 60-70% of the capacity (but can fill with best effort) • To work best, return path for subscription needs to explicitly match forward path. • To work best, packets in a single flow need to take the same path.
Conclusions • Existing model is flawed (everyone knows that) • Flaws are fundamental (the Good Guys disagree) • Right approach involves major changes (reconsider some bad ideas) • Still a few bugs in the system… Not ready for prime time.