The Internet’s architecture for managing congestion

Damon Wischik, UCL
www.wischik.com/damon

Some Internet History
• 1974: First draft of TCP/IP [“A protocol for packet network interconnection”, Vint Cerf and Robert Kahn]
• 1983: ARPANET switches on TCP/IP
• 1986: Congestion collapse
• 1988: Congestion control for TCP [“Congestion avoidance and control”, Van Jacobson]
[“A Brief History of the Internet”, the Internet Society]

End-to-end control
Internet congestion is controlled by the end-systems. The network operates as a dumb pipe. [“End-to-end arguments in system design” by Saltzer, Reed, Clark, 1981]
The TCP algorithm, running on the server, decides how fast to send data.
[Figure: the user sends a request to the server (TCP); the server sends data to the user, and acknowledgements flow back to the server]

TCP

    // Processing of an incoming ACK that carries sequence number seqno.
    if (seqno > _last_acked) {                  // the ACK covers new data
        if (!_in_fast_recovery) {               // normal operation
            _last_acked = seqno;
            _dupacks = 0;
            inflate_window();
            send_packets(now);
            _last_sent_time = now;
            return;
        }
        if (seqno < _recover) {                 // partial ACK during fast recovery
            uint32_t new_data = seqno - _last_acked;
            _last_acked = seqno;
            if (new_data < _cwnd) _cwnd -= new_data; else _cwnd = 0;
            _cwnd += _mss;
            retransmit_packet(now);
            send_packets(now);
            return;
        }
        // full ACK: leave fast recovery
        uint32_t flightsize = _highest_sent - seqno;
        _cwnd = min(_ssthresh, flightsize + _mss);
        _last_acked = seqno;
        _dupacks = 0;
        _in_fast_recovery = false;
        send_packets(now);
        return;
    }
    if (_in_fast_recovery) {                    // duplicate ACK during fast recovery
        _cwnd += _mss;
        send_packets(now);
        return;
    }
    _dupacks++;
    if (_dupacks != 3) {                        // not the third duplicate ACK
        send_packets(now);
        return;
    }
    // third duplicate ACK: halve the threshold, retransmit, enter fast recovery
    _ssthresh = max(_cwnd/2, (uint32_t)(2 * _mss));
    retransmit_packet(now);
    _cwnd = _ssthresh + 3 * _mss;
    _in_fast_recovery = true;
    _recover = _highest_sent;
  }                                             // closes the enclosing function, whose opening is not shown

[Figure: traffic rate (0-100 kB/sec) against time (0-8 sec)]

How TCP shares capacity
[Figure: individual flow bandwidths, the sum of flow bandwidths, and the available bandwidth, against time]

Motivation: buffer size
• Internet routers have buffers, to accommodate bursts in traffic.
• How big do the buffers need to be?
    • 3 GByte? Rule of thumb; what Cisco does today
    • 300 MByte? [Appenzeller, Keslassy, McKeown, 2004]
    • 30 kByte?
• Large buffers are unsustainable:
    • Data volumes double every 10 months
    • CPU speeds double every 18 months
    • Memory access speeds double every 10 years

Motivation: TCP’s teleology [Kelly, Maulloo, Tan, 1998]
• Consider several TCP flows sharing a single link
• Let $x_r$ be the mean bandwidth of flow $r$ [pkts/sec]
  Let $y$ be the total bandwidth of all flows [pkts/sec]
  Let $C$ be the total available capacity [pkts/sec]
• TCP and the network act so as to solve
  maximise $\sum_r U(x_r) - P(y, C)$ over $x_r \ge 0$, where $y = \sum_r x_r$
[Figures: $U(x)$ against $x$; $P(y,C)$ against $y$, with $C$ marked]
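A minimal numerical sketch of this optimization may help. The slide does not specify $U$ or $P$, so both are assumptions here: the utility $U_r(x) = -2/(\mathrm{RTT}_r^2\, x)$ (the utility implied by the square-root TCP throughput formula) and a penalty whose derivative is $P'(y) = \min(1, (y/C)^B)$ for a hypothetical buffer of $B$ packets. At the maximum each flow satisfies $U_r'(x_r) = P'(y)$, so the sketch bisects on that common "price" $p$. The function names and parameter values are illustrative only.

```python
import math

def tcp_rate(p, rtt):
    """Rate x at which U'(x) = p, for the assumed utility U(x) = -2 / (rtt**2 * x)."""
    return math.sqrt(2.0 / p) / rtt

def solve_allocation(rtts, capacity, buffer_pkts=20, iters=200):
    """Find the price p with p = min(1, (y/C)**B), where y is the total rate at price p."""
    lo, hi = 1e-12, 1.0
    for _ in range(iters):
        p = math.sqrt(lo * hi)  # geometric midpoint, since p spans many orders of magnitude
        y = sum(tcp_rate(p, rtt) for rtt in rtts)
        if min(1.0, (y / capacity) ** buffer_pkts) > p:
            lo = p  # marginal penalty exceeds the price: the price must rise
        else:
            hi = p
    p = math.sqrt(lo * hi)
    return p, [tcp_rate(p, rtt) for rtt in rtts]

if __name__ == "__main__":
    # Three flows with RTTs of 50, 100 and 200 ms sharing a 1000 pkt/sec link (assumed values).
    price, rates = solve_allocation([0.05, 0.1, 0.2], capacity=1000.0)
    print("price:", price)
    print("rates [pkts/sec]:", rates)
    print("utilization:", sum(rates) / 1000.0)
```

Under these assumptions each flow's rate is inversely proportional to its RTT, and the link settles below full utilization.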

Bad teleology
• Little extra value attached to high-bandwidth flows
• Severe penalty for allocating too little bandwidth
[Figure: $U(x)$ against $x$]

Bad teleology
• Flows with large RTT are satisfied with little bandwidth
• Flows with small RTT want more bandwidth
[Figure: $U(x)$ against $x$]

Bad teleology
• No penalty unless links are overloaded
[Figure: $P(y,C)$ against $y$, with $C$ marked]

TCP’s teleology
• The network acts as if it’s trying to solve an optimization problem
• Is this what we want the Internet to optimize?
• Does it even succeed in performing the optimization?
[Figures: $U(x)$ against $x$; $P(y,C)$ against $y$, with $C$ marked]

Synchronization
• Desynchronized TCP flows: aggregate traffic is smooth; the network solves the optimization
• Synchronized TCP flows: aggregate traffic is bursty; the network oscillates about the optimum
[Figure: for each case, individual flow rates summed to give the aggregate traffic rate, against time]
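The contrast can be illustrated with a toy calculation. This sketch assumes idealised sawtooth flows that all share one period and ramp from half their peak rate up to the peak; "desynchronized" is modelled by evenly spaced phases. None of these choices comes from the slide, and the function names are hypothetical.

```python
def sawtooth(t, period, phase, low=0.5, high=1.0):
    """Idealised TCP rate: ramps linearly from `low` to `high`, then drops back to `low`."""
    frac = ((t / period) + phase) % 1.0
    return low + (high - low) * frac

def aggregate_swing(phases, period=1.0, steps=1000):
    """Peak-to-trough swing of the summed rate over one period, relative to its mean."""
    totals = [
        sum(sawtooth(k * period / steps, period, ph) for ph in phases)
        for k in range(steps)
    ]
    mean = sum(totals) / len(totals)
    return (max(totals) - min(totals)) / mean

n_flows = 10
synchronized = [0.0] * n_flows                       # every flow halves at the same instant
desynchronized = [i / n_flows for i in range(n_flows)]  # halvings spread evenly over the period

if __name__ == "__main__":
    print("synchronized swing:  ", aggregate_swing(synchronized))
    print("desynchronized swing:", aggregate_swing(desynchronized))
```

With evenly spaced phases the sawtooth teeth largely cancel, so the aggregate swings far less than when all flows halve together.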

TCP traffic model
• When there are many TCP flows, the aggregate traffic rate $x_t$ varies smoothly, according to a differential equation [Misra, Gong, Towsley, 2000]
• The equation involves
    • $p_t$, the packet loss probability at time $t$,
    • RTT, the average round trip time
[Figure: aggregate traffic rate against time, for desynchronized and synchronized flows]
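The slide does not reproduce the equation. One form that appears in the literature building on this model, taken here as an assumption and written per flow (the aggregate is the number of flows times this), is dx/dt = 1/RTT^2 - p(t-RTT) * x(t-RTT) * x(t) / 2. The sketch below integrates it with a simple Euler scheme while holding the loss probability constant, which is a further simplification. With constant loss the rate should settle at the familiar square-root throughput formula, x = sqrt(2/p)/RTT.

```python
import math
from collections import deque

def tcp_throughput(p, rtt):
    """Fixed point of the assumed fluid equation: 1/rtt**2 = p * x**2 / 2."""
    return math.sqrt(2.0 / p) / rtt

def simulate_fluid_tcp(p, rtt, x0=1.0, dt=0.001, duration=60.0):
    """Euler integration of dx/dt = 1/rtt**2 - p * x(t - rtt) * x(t) / 2, with constant p."""
    delay_steps = max(1, round(rtt / dt))
    # history[0] holds x(t - rtt) and history[-1] holds x(t); old samples fall off the left.
    history = deque([x0] * (delay_steps + 1), maxlen=delay_steps + 1)
    for _ in range(int(duration / dt)):
        x_now, x_delayed = history[-1], history[0]
        dx = 1.0 / rtt**2 - p * x_delayed * x_now / 2.0
        history.append(max(x_now + dx * dt, 0.0))
    return history[-1]

if __name__ == "__main__":
    # Assumed values: 1% loss and a 100 ms round trip time.
    print("simulated rate [pkts/sec]:", simulate_fluid_tcp(p=0.01, rtt=0.1))
    print("formula rate   [pkts/sec]:", tcp_throughput(p=0.01, rtt=0.1))
```

In the full model the loss probability is not constant but is fed back from the queue, which is what makes oscillation possible.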

Queue model
• How does packet loss probability $p_t$ depend on buffer size?
• There are two families of answers, depending on queueing delay:
    • Small buffers (queueing delay ≪ RTT)
    • Large buffers (queueing delay ≈ RTT)

Small buffers
As the optical fibre’s line rate increases
• queue size fluctuates more and more rapidly
• queue size distribution does not change (it depends only on link utilization, not on line rate)
[Figure: three panels, with queueing delay 19 ms, 1.9 ms and 0.19 ms; queue size (0-15 pkt) against time (0-5 sec)]
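Both bullets can be checked on a toy queue. The sketch below assumes Poisson arrivals, exponential service times and an unbounded buffer (an M/M/1 queue), none of which the slide states. It runs the same random sequence at two line rates with the same utilization. The function name and parameter values are illustrative.

```python
import random

def simulate_queue(line_rate, utilization, n_events=200_000, seed=1):
    """Simulate an M/M/1 queue; return (time-averaged queue size, events per second)."""
    rng = random.Random(seed)
    arrival_rate = utilization * line_rate
    queue, clock, area = 0, 0.0, 0.0
    for _ in range(n_events):
        # Departures can only happen when the queue is non-empty.
        total_rate = arrival_rate + (line_rate if queue > 0 else 0.0)
        hold = rng.expovariate(total_rate)  # time until the next arrival or departure
        area += queue * hold
        clock += hold
        if queue == 0 or rng.random() < arrival_rate / total_rate:
            queue += 1  # arrival
        else:
            queue -= 1  # departure
    return area / clock, n_events / clock

if __name__ == "__main__":
    for rate in (100.0, 1000.0):  # pkts/sec, assumed values
        mean_queue, events_per_sec = simulate_queue(rate, utilization=0.8)
        print(f"line rate {rate}: mean queue {mean_queue}, events/sec {events_per_sec}")
```

Raising the line rate at fixed utilization leaves the sequence of arrivals and departures unchanged and only shortens the time between them, so the queue fluctuates faster while its size distribution stays the same.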

Large buffers (queueing delay 200 ms)
• When $x_t < C$ the queue size is small ($C$ = line rate)
• No packet drops, so TCP increases $x_t$
• When $x_t > C$ the queue fills up and packets begin to get dropped
• TCP flows may ‘overshoot’, leading to synchronization
[Figure: queue size (0-160 pkt) against time (0-10 sec)]

Large buffers (queueing delay 200 ms)
• Drop probability depends on both traffic rate $x_t$ and queue size $q_t$
[Figure: queue size (0-160 pkt) against time (0-10 sec)]

Analysis
• Write down differential equations
    • for aggregate TCP traffic rate $x_t$
    • for queue dynamics and loss probability $p_t$, taking account of buffer size
• Calculate
    • average link utilization
    • average queue occupancy/delay
    • extent of synchronization and consequent loss of utilization, and jitter
[Gaurav Raina, PhD thesis, 2005]

Stability/instability analysis
• For some values of C*RTT, the dynamical system is stable
• For others it is unstable and there are oscillations (i.e. the flows are partially synchronized)
• When it is unstable, we can calculate the amplitude of the oscillations
[Figure: traffic rate $x_t/C$ against time]

Instability plot
[Figure: log10 of packet loss probability $p$ against traffic intensity $x/C$, showing the TCP throughput equation, the queue equation, and the extent of oscillations in $x/C$]

Instability plot
[Figure: log10 of packet loss probability $p$ against traffic intensity $x/C$, for C*RTT = 4 pkts, C*RTT = 20 pkts and C*RTT = 100 pkts]

Alternative buffer-sizing rules
• Intermediate buffers: buffer = bandwidth*delay / sqrt(#flows)
  or Large buffers: buffer = bandwidth*delay
• Large buffers with AQM: buffer = bandwidth*delay*{¼,1,4}
• Small buffers: buffer = {10,20,50} pkts
• Small buffers, ScalableTCP: buffer = {50,1000} pkts [Vinnicombe 2002] [T. Kelly 2002]
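To give a feel for how far apart these rules are, the sketch below evaluates the first three for one link. The link parameters in the usage example are made-up values, "delay" is taken to be the round trip time, and the function name is hypothetical.

```python
import math

def buffer_sizes_pkts(bandwidth_pkts_per_sec, rtt_sec, n_flows, small_buffer_pkts=20):
    """Buffer size in packets under three of the rules listed above."""
    bdp = bandwidth_pkts_per_sec * rtt_sec  # bandwidth*delay product, in packets
    return {
        "large": bdp,
        "intermediate": bdp / math.sqrt(n_flows),
        "small": small_buffer_pkts,
    }

if __name__ == "__main__":
    # Assumed example: 100,000 pkts/sec link, 100 ms round trip time, 10,000 flows.
    for rule, size in buffer_sizes_pkts(100_000, 0.1, 10_000).items():
        print(f"{rule}: {size} pkts")
```

For a link carrying many flows, each rule here differs from the next by orders of magnitude, which is why the choice matters for router design.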

Conclusion
• The network acts to solve an optimization problem.
• We can choose which optimization problem, by choosing the right buffer size & by changing TCP’s code.
• It may or may not attain the solution.
• In order to make sure the network is stable, we need to choose the buffer size & TCP code carefully.

Prescription
• ScalableTCP in end-systems (need to persuade Microsoft, Linus)
• Much smaller buffers in routers (need to persuade BT/AT&T)
ScalableTCP gives more weight to high-bandwidth flows. And it’s been shown to be stable.
With small buffers, the network likes to run with slightly lower utilization, hence lower delay.
[Figures: $U(x)$ against $x$; $P(y,C)$ against $y$, with $C$ marked]
