
The Internet’s architecture for managing congestion

The Internet’s architecture for managing congestion. Damon Wischik, UCL www.wischik.com/damon. Some Internet History. 1974: First draft of TCP/IP [ “A protocol for packet network interconnection” , Vint Cerf and Robert Kahn ] 1983: ARPANET switches on TCP/IP 1986: Congestion collapse


Presentation Transcript


  1. The Internet’s architecture for managing congestion. Damon Wischik, UCL, www.wischik.com/damon

  2. Some Internet History • 1974: First draft of TCP/IP [“A protocol for packet network interconnection”, Vint Cerf and Robert Kahn] • 1983: ARPANET switches on TCP/IP • 1986: Congestion collapse • 1988: Congestion control for TCP [“Congestion avoidance and control”, Van Jacobson]. Source: “A Brief History of the Internet”, the Internet Society

  3. End-to-end control Internet congestion is controlled by the end-systems. The network operates as a dumb pipe. [“End-to-end arguments in system design” by Saltzer, Reed, Clark, 1981] [Figure: the user sends a request to the server (TCP)]

  4. End-to-end control Internet congestion is controlled by the end-systems. The network operates as a dumb pipe. [“End-to-end arguments in system design” by Saltzer, Reed, Clark, 1981] [Figure: the server (TCP) sends data to the user]

  5. End-to-end control Internet congestion is controlled by the end-systems. The network operates as a dumb pipe. [“End-to-end arguments in system design” by Saltzer, Reed, Clark, 1981] The TCP algorithm, running on the server, decides how fast to send data. [Figure: the server (TCP) sends data to the user; the user returns acknowledgements]

  6. TCP
     // Fragment of the sender's ACK-handling code, as shown on the slide.
     if (seqno > _last_acked) {            // ACK covers new data
       if (!_in_fast_recovery) {           // normal operation: grow the window
         _last_acked = seqno;
         _dupacks = 0;
         inflate_window();
         send_packets(now);
         _last_sent_time = now;
         return;
       }
       if (seqno < _recover) {             // partial ACK during fast recovery
         uint32_t new_data = seqno - _last_acked;
         _last_acked = seqno;
         if (new_data < _cwnd) _cwnd -= new_data; else _cwnd = 0;
         _cwnd += _mss;
         retransmit_packet(now);
         send_packets(now);
         return;
       }
       // full ACK: leave fast recovery
       uint32_t flightsize = _highest_sent - seqno;
       _cwnd = min(_ssthresh, flightsize + _mss);
       _last_acked = seqno;
       _dupacks = 0;
       _in_fast_recovery = false;
       send_packets(now);
       return;
     }
     // duplicate ACK
     if (_in_fast_recovery) {              // each further duplicate inflates the window
       _cwnd += _mss;
       send_packets(now);
       return;
     }
     _dupacks++;
     if (_dupacks != 3) {                  // act only on the third duplicate ACK
       send_packets(now);
       return;
     }
     // third duplicate ACK: halve the threshold, retransmit, enter fast recovery
     _ssthresh = max(_cwnd / 2, (uint32_t)(2 * _mss));
     retransmit_packet(now);
     _cwnd = _ssthresh + 3 * _mss;
     _in_fast_recovery = true;
     _recover = _highest_sent;
     }
     [Figure: traffic rate (0-100 kB/sec) against time (0-8 sec)]

  7. How TCP shares capacity [Figure: individual flow bandwidths, the sum of flow bandwidths, and the available bandwidth, against time]

  8. Motivation: buffer size • Internet routers have buffers, to accommodate bursts in traffic. • How big do the buffers need to be? • 3 GByte? (Rule of thumb: what Cisco does today) • 300 MByte? [Appenzeller, Keslassy, McKeown, 2004] • 30 kByte? • Large buffers are unsustainable: • Data volumes double every 10 months • CPU speeds double every 18 months • Memory access speeds double every 10 years

  9. Motivation: TCP’s teleology [Kelly, Maulloo, Tan, 1998] • Consider several TCP flows sharing a single link • Let $x_r$ be the mean bandwidth of flow $r$ [pkts/sec]. Let $y$ be the total bandwidth of all flows [pkts/sec]. Let $C$ be the total available capacity [pkts/sec] • TCP and the network act so as to solve: maximise $\sum_r U(x_r) - P(y,C)$ over $x_r \ge 0$, where $y = \sum_r x_r$ [Figures: U(x) against x; P(y,C) against y, with C marked]
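The objective on this slide can be evaluated directly for any candidate allocation. The sketch below is illustrative only: `network_objective`, `example_utility` and `example_penalty` are hypothetical names, and the particular forms of U and P are assumptions chosen to match the qualitative shapes described on the slides (steep loss of utility at low bandwidth, no penalty until the link is overloaded). They are not the functions used in the talk.

```cpp
#include <cassert>
#include <vector>

// Evaluates sum_r U(x_r) - P(y, C), where y = sum_r x_r.
// U and P are supplied by the caller; the slide does not fix their form.
template <typename Utility, typename Penalty>
double network_objective(const std::vector<double>& rates, double capacity,
                         Utility U, Penalty P) {
    double total_rate = 0.0;     // y, the total bandwidth of all flows
    double total_utility = 0.0;  // sum of U(x_r)
    for (double x : rates) {
        assert(x >= 0.0);        // constraint from the slide: x_r >= 0
        total_rate += x;
        total_utility += U(x);
    }
    return total_utility - P(total_rate, capacity);
}

// ASSUMPTION: an illustrative utility, severe for small x and flat for large x.
inline double example_utility(double x) { return -1.0 / x; }

// ASSUMPTION: an illustrative penalty, zero unless the link is overloaded.
inline double example_penalty(double y, double capacity) {
    return y > capacity ? 10.0 * (y - capacity) : 0.0;
}
```

For example, `network_objective({1.0, 1.0}, 3.0, example_utility, example_penalty)` scores two flows of 1 pkt/sec on a link of capacity 3 pkts/sec. Raising both rates to 2 pkts/sec overloads the link, so the penalty term then dominates.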

  10. Bad teleology [Figure: U(x) against x, annotated: severe penalty for allocating too little bandwidth; little extra value attached to high-bandwidth flows]

  11. Bad teleology [Figure: U(x) against x, annotated: flows with large RTT are satisfied with little bandwidth; flows with small RTT want more bandwidth]

  12. Bad teleology [Figure: P(y,C) against y, with C marked, annotated: no penalty unless links are overloaded]

  13. TCP’s teleology • The network acts as if it’s trying to solve an optimization problem • Is this what we want the Internet to optimize? • Does it even succeed in performing the optimization? [Figures: U(x) against x; P(y,C) against y, with C marked]

  14. Synchronization • Desynchronized TCP flows: aggregate traffic is smooth; the network solves the optimization • Synchronized TCP flows: aggregate traffic is bursty; the network oscillates about the optimum [Figure: individual flow rates adding up to the aggregate traffic rate, against time]

  15. Synchronization • Desynchronized TCP flows: aggregate traffic is smooth; the network solves the optimization • Synchronized TCP flows: aggregate traffic is bursty; the network oscillates about the optimum [Figure: individual flow rates adding up to the aggregate traffic rate, against time]

  16. TCP traffic model • When there are many TCP flows, the aggregate traffic rate $x_t$ varies smoothly, according to a differential equation [Misra, Gong, Towsley, 2000] • The equation involves • $p_t$, the packet loss probability at time $t$, • RTT, the average round trip time [Figure: aggregate traffic rate against time, for desynchronized and synchronized flows]
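The slide does not write the differential equation out. As a rough sketch, a commonly quoted simplified form of the TCP fluid model for the mean rate of one flow is dx/dt = 1/RTT^2 - p x^2 / 2. This version assumes a constant loss probability p and ignores the one-RTT feedback delay that the full model includes. The function names, the Euler integration and the step size below are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cmath>

// Euler integration of the simplified (delay-free) fluid model
//   dx/dt = 1/RTT^2 - p * x^2 / 2
// x is the mean rate of one flow in pkts/sec; rtt is in seconds.
// ASSUMPTION: p is held constant and the feedback delay is ignored.
double simulate_tcp_rate(double x0, double rtt, double loss_prob,
                         double dt, int steps) {
    assert(rtt > 0.0 && dt > 0.0 && steps >= 0);
    double x = x0;
    for (int i = 0; i < steps; ++i) {
        const double additive_increase = 1.0 / (rtt * rtt);
        const double loss_decrease = loss_prob * x * x / 2.0;
        x += dt * (additive_increase - loss_decrease);
    }
    return x;
}

// Fixed point of the equation above: x = sqrt(2/p) / RTT.
double tcp_equilibrium_rate(double rtt, double loss_prob) {
    assert(rtt > 0.0 && loss_prob > 0.0);
    return std::sqrt(2.0 / loss_prob) / rtt;
}
```

Starting from a low rate, `simulate_tcp_rate(10.0, 0.1, 0.01, 0.001, 20000)` climbs towards `tcp_equilibrium_rate(0.1, 0.01)`. Oscillations of the kind discussed under synchronization only appear once the loss probability is allowed to depend on the delayed traffic rate, which this sketch leaves out.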

  17. TCP
     // Fragment of the sender's ACK-handling code, as shown on the slide.
     if (seqno > _last_acked) {            // ACK covers new data
       if (!_in_fast_recovery) {           // normal operation: grow the window
         _last_acked = seqno;
         _dupacks = 0;
         inflate_window();
         send_packets(now);
         _last_sent_time = now;
         return;
       }
       if (seqno < _recover) {             // partial ACK during fast recovery
         uint32_t new_data = seqno - _last_acked;
         _last_acked = seqno;
         if (new_data < _cwnd) _cwnd -= new_data; else _cwnd = 0;
         _cwnd += _mss;
         retransmit_packet(now);
         send_packets(now);
         return;
       }
       // full ACK: leave fast recovery
       uint32_t flightsize = _highest_sent - seqno;
       _cwnd = min(_ssthresh, flightsize + _mss);
       _last_acked = seqno;
       _dupacks = 0;
       _in_fast_recovery = false;
       send_packets(now);
       return;
     }
     // duplicate ACK
     if (_in_fast_recovery) {              // each further duplicate inflates the window
       _cwnd += _mss;
       send_packets(now);
       return;
     }
     _dupacks++;
     if (_dupacks != 3) {                  // act only on the third duplicate ACK
       send_packets(now);
       return;
     }
     // third duplicate ACK: halve the threshold, retransmit, enter fast recovery
     _ssthresh = max(_cwnd / 2, (uint32_t)(2 * _mss));
     retransmit_packet(now);
     _cwnd = _ssthresh + 3 * _mss;
     _in_fast_recovery = true;
     _recover = _highest_sent;
     }
     [Figure: traffic rate (0-100 kB/sec) against time (0-8 sec)]

  18. Queue model • How does packet loss probability $p_t$ depend on buffer size? • There are two families of answers, depending on queueing delay: • Small buffers (queueing delay ≪ RTT) • Large buffers (queueing delay ≈ RTT)

  19. Small buffers As the optical fibre’s line rate increases • queue size fluctuates more and more rapidly • queue size distribution does not change (it depends only on link utilization, not on line rate) [Figure: queue size (0-15 pkt) against time (0-5 sec), for queueing delays of 19 ms, 1.9 ms and 0.19 ms]

  20. Large buffers (queueing delay 200 ms) • When $x_t < C$ the queue size is small ($C$ = line rate) • No packet drops, so TCP increases $x_t$ [Figure: queue size (0-160 pkt) against time (0-10 sec)]

  21. Large buffers (queueing delay 200 ms) • When $x_t < C$ the queue size is small ($C$ = line rate) • No packet drops, so TCP increases $x_t$ • When $x_t > C$ the queue fills up and packets begin to get dropped [Figure: queue size (0-160 pkt) against time (0-10 sec)]

  22. Large buffers (queueing delay 200 ms) • When $x_t < C$ the queue size is small ($C$ = line rate) • No packet drops, so TCP increases $x_t$ • When $x_t > C$ the queue fills up and packets begin to get dropped • TCP flows may ‘overshoot’, leading to synchronization [Figure: queue size (0-160 pkt) against time (0-10 sec)]

  23. Large buffers (queueing delay 200 ms) • Drop probability depends on both traffic rate $x_t$ and queue size $q_t$ [Figure: queue size (0-160 pkt) against time (0-10 sec)]

  24. Analysis • Write down differential equations • for aggregate TCP traffic rate $x_t$ • for queue dynamics and loss probability $p_t$, taking account of buffer size • Calculate • average link utilization • average queue occupancy/delay • extent of synchronization and consequent loss of utilization, and jitter [Gaurav Raina, PhD thesis, 2005]

  25. Stability/instability analysis • For some values of C*RTT, the dynamical system is stable • For others it is unstable and there are oscillations (i.e. the flows are partially synchronized) • When it is unstable, we can calculate the amplitude of the oscillations [Figure: traffic rate $x_t/C$ against time]

  26. Instability plot [Figure labels: traffic intensity x/C; log10 of packet loss probability p; curves for the TCP throughput equation and the queue equation; extent of oscillations in x/C]

  27. Instability plot [Figure labels: traffic intensity x/C; log10 of packet loss probability p; panels for C*RTT = 4 pkts, C*RTT = 20 pkts and C*RTT = 100 pkts]

  28. Alternative buffer-sizing rules • Intermediate buffers: buffer = bandwidth*delay / sqrt(#flows), or • Large buffers: buffer = bandwidth*delay • Large buffers with AQM: buffer = bandwidth*delay*{¼,1,4} • Small buffers: buffer = {10,20,50} pkts • Small buffers, ScalableTCP: buffer = {50,1000} pkts [Vinnicombe 2002] [T. Kelly 2002]
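The first two rules on this slide are simple arithmetic, so a short sketch may help when comparing them. The function names, the choice of bytes as the unit, and the example link (8 Gbit/sec, 250 ms round trip time, 100 flows) are assumptions made for illustration; none of these figures come from the talk.

```cpp
#include <cassert>
#include <cmath>

// Large-buffer rule: buffer = bandwidth * delay.
// link_rate_bps is in bits/sec and rtt_sec in seconds; the result is in bytes.
double large_buffer_bytes(double link_rate_bps, double rtt_sec) {
    assert(link_rate_bps > 0.0 && rtt_sec > 0.0);
    return link_rate_bps * rtt_sec / 8.0;  // 8 bits per byte
}

// Intermediate-buffer rule: buffer = bandwidth * delay / sqrt(#flows).
double intermediate_buffer_bytes(double link_rate_bps, double rtt_sec,
                                 double num_flows) {
    assert(num_flows >= 1.0);
    return large_buffer_bytes(link_rate_bps, rtt_sec) / std::sqrt(num_flows);
}
```

With the assumed example link, `large_buffer_bytes(8e9, 0.25)` gives 250 MByte, while `intermediate_buffer_bytes(8e9, 0.25, 100.0)` gives 25 MByte: the square-root rule shrinks the buffer by a factor of ten for every hundredfold increase in the number of flows.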

  29. Conclusion • The network acts to solve an optimization problem. • We can choose which optimization problem, by choosing the right buffer size and by changing TCP’s code. • It may or may not attain the solution. • In order to make sure the network is stable, we need to choose the buffer size and TCP code carefully.

  30. Prescription • ScalableTCP in end-systems (need to persuade Microsoft, Linus) • Much smaller buffers in routers (need to persuade BT/AT&T). ScalableTCP gives more weight to high-bandwidth flows. And it’s been shown to be stable. With small buffers, the network likes to run with slightly lower utilization, hence lower delay. [Figures: U(x) against x; P(y,C) against y, with C marked]
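To make the contrast with standard TCP concrete, the sketch below puts the two window-update rules side by side, with the congestion window measured in packets. The slides do not give ScalableTCP's constants. The values used here (add 0.01 packets per ACK, cut the window by one eighth on loss) are the ones usually quoted for T. Kelly's proposal and should be read as an assumption, as should the function names.

```cpp
#include <cassert>
#include <cmath>

// Standard TCP congestion avoidance, window in packets:
// roughly one extra packet per round trip, halve on loss.
double standard_on_ack(double cwnd) {
    assert(cwnd > 0.0);
    return cwnd + 1.0 / cwnd;
}
double standard_on_loss(double cwnd) { return cwnd / 2.0; }

// ScalableTCP-style update.
// ASSUMPTION: the constants 0.01 and 0.125 are not taken from the slides.
double scalable_on_ack(double cwnd) { return cwnd + 0.01; }
double scalable_on_loss(double cwnd) { return cwnd * (1.0 - 0.125); }
```

With a window of 100 packets the two rules add the same amount per ACK, but after a loss the standard rule drops to 50 packets while the scalable rule drops only to 87.5. For larger windows the fixed per-ACK increase also outpaces 1/cwnd, which is one way to see why such a rule favours high-bandwidth flows.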
