
Implementing High Speed TCP (aka Sally Floyd’s)

Gareth Fairey & Yee-Ting Li, 12th September 2002 @ Brighton



  1. Implementing High Speed TCP (aka Sally Floyd’s) Gareth Fairey & Yee-Ting Li 12th September 2002 @ Brighton

  2. What is High Speed TCP?
     • Changes the way TCP behaves at high speed (i.e. large cwnd)
     • Standard TCP has two modes
       • Slow start (not very slow…)
       • Congestion Avoidance
     • Focuses on the Congestion Avoidance region, i.e. when TCP knows (thinks it knows…) how well the network behaves…
     • BUT only when we are at high speeds; otherwise do what Standard TCP does…
     • Readily deployable 1st step towards Equation Based Congestion Control

  3. What does it do?
     • Standard TCP uses two parameters
       • Increase parameter, a
       • Decrease parameter, b
       • i.e. AIMD(a, b)
     • Standard TCP uses
       • a = 1
       • b = 0.5
     • High Speed TCP introduces
       • a -> a(cwnd)
       • b -> b(cwnd)
       • i.e. the values of a and b depend on the current congestion window size
     • If a grows with cwnd, we get back up to our ‘optimal’ cwnd size for the network path more quickly
     • If b shrinks with cwnd, we don’t lose as much bandwidth to a small congestion window after a loss
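
The AIMD(a, b) rule above can be written as two small update functions. This is a minimal floating-point sketch for illustration only; the function names are hypothetical and the kernel itself works in integer segments, as the later slides show.

```c
/* Illustrative AIMD(a, b) window updates; names are hypothetical. */

/* Per-ACK update in Congestion Avoidance: cwnd grows by a/cwnd,
 * which adds up to roughly a segments per round trip. */
static double aimd_on_ack(double cwnd, double a)
{
    return cwnd + a / cwnd;
}

/* Update on a congestion event: cwnd is reduced by the fraction b. */
static double aimd_on_loss(double cwnd, double b)
{
    return cwnd * (1.0 - b);
}
```

With Standard TCP's a = 1 and b = 0.5, `aimd_on_loss(1000.0, 0.5)` gives 500.0. High Speed TCP would pass a(cwnd) and b(cwnd) in place of the constants.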

  4. What exactly does it do?
     • Based on the TCP response function
       • Relates loss and throughput
     • Uses the TCP response function to investigate certain parameters
       • High_Window, High_Loss: the largest cwnd needed for a target throughput x, and the loss rate required for that throughput
       • Low_Window, Low_Loss: the smallest cwnd at which we actually switch away from Standard TCP, and the loss rate required for that cwnd size
       • High_B: the smallest value of the decrease parameter b, used when we are at a large cwnd
     • Equations transform this information into a table of a(cwnd) and b(cwnd)
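
The slide does not reproduce the equations. For reference, the forms below follow Sally Floyd's HighSpeed TCP specification (published as RFC 3649), rewritten with the parameter names used on this slide. Reading High_B as the specification's High_Decrease, and High_Loss and Low_Loss as its High_P and Low_P, is an assumption; here w stands for cwnd and p for the packet loss rate.

```latex
\begin{align*}
  % Standard TCP response function (approximate)
  w &\approx \frac{1.2}{\sqrt{p}} \\[4pt]
  % High Speed TCP response function, for w > Low_Window
  S &= \frac{\log(\mathrm{High\_Window}) - \log(\mathrm{Low\_Window})}
            {\log(\mathrm{High\_Loss}) - \log(\mathrm{Low\_Loss})} \\[4pt]
  w &= \left(\frac{p}{\mathrm{Low\_Loss}}\right)^{S} \mathrm{Low\_Window} \\[4pt]
  % Decrease parameter, interpolated on a log scale from 0.5 down to High_B
  b(w) &= (\mathrm{High\_B} - 0.5)\,
          \frac{\log(w) - \log(\mathrm{Low\_Window})}
               {\log(\mathrm{High\_Window}) - \log(\mathrm{Low\_Window})} + 0.5 \\[4pt]
  % Increase parameter, with p(w) obtained by inverting the response function
  a(w) &= \frac{w^{2}\, p(w)\, 2\, b(w)}{2 - b(w)}
\end{align*}
```

At w = Low_Window these give b = 0.5 and a close to 1, so the behaviour joins up with Standard TCP at the switch-over point.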

  5. Implementation of High Speed TCP
     • It was decided to make this a compile-time option, so a corresponding option was added to the existing kernel configuration set-up.
     • Only a few changes to the kernel source turned out to be necessary:
       • New code for calculating the a and b values.
       • Changes to the existing code that alters the congestion window size (cwnd), during the Congestion Avoidance phase only.
     • The following slides show some details of our initial implementation, done against kernel 2.4.16.

  6. Changing cwnd size.
     • From our inspection of the source, it became apparent that this only happens in the file net/ipv4/tcp_input.c, where each case is handled in a specific function:
       • Increasing the cwnd following receipt of an ACK happens in the function tcp_cong_avoid (this is where a will be used).
       • Decreasing the cwnd happens in the function tcp_cwnd_down (this is where b will be used).
     • On the following slides, I will describe those changes.

  7. Calculating suitable a and b values.
     • To find suitable a and b values for a given cwnd at run-time, we use a look-up table, which we populate as follows:
       • We defined a structure, hstcp_entry (in the file include/net/tcp.h), to contain cwnd and the corresponding a_val and b_val. [Note: b is between 0 and 1, so it is scaled to be between 0 and 256 and that value is stored instead.]
       • For a selection of different congestion window sizes covering the expected range, we calculated the corresponding a_val and b_val.
       • We defined an array (in the file net/ipv4/tcp_input.c) to contain these hstcp_entry structures, ordered by cwnd.
     • Since the entries are stored in order of cwnd, we can use binary search to find a suitable hstcp_entry for a given cwnd. This is done in the function get_hstcp_entry, defined in the file net/ipv4/tcp_input.c.
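
The look-up described above might be sketched in user-space C as follows. This is not the authors' code. The field types, the pointer return type and the choice of "largest entry whose cwnd does not exceed the argument" are assumptions, and the five table rows are illustrative values in the style of the example table in the HighSpeed TCP specification, with b scaled by 256.

```c
#include <stddef.h>

/* One row of the look-up table: window size and the a, b to use from there on. */
struct hstcp_entry {
    unsigned int cwnd;
    unsigned int a_val;   /* increase parameter a */
    unsigned int b_val;   /* decrease parameter b, scaled by 256 */
};

/* Illustrative rows only (assumption), ordered by cwnd. */
static const struct hstcp_entry hstcp_table[] = {
    {  38, 1, 128 },
    { 118, 2, 112 },
    { 221, 3, 104 },
    { 347, 4,  98 },
    { 495, 5,  93 },
};

#define HSTCP_TABLE_LEN (sizeof(hstcp_table) / sizeof(hstcp_table[0]))

/* Binary search for the last entry with entry.cwnd <= cwnd.
 * Windows below the first entry fall back to the first row (a = 1, b = 0.5). */
static const struct hstcp_entry *get_hstcp_entry(unsigned int cwnd)
{
    size_t lo = 0;
    size_t hi = HSTCP_TABLE_LEN - 1;

    while (lo < hi) {
        size_t mid = lo + (hi - lo + 1) / 2;   /* upper middle, so lo always advances */

        if (hstcp_table[mid].cwnd <= cwnd)
            lo = mid;
        else
            hi = mid - 1;
    }
    return &hstcp_table[lo];
}
```

For example, `get_hstcp_entry(300)->a_val` is 3 with this table, and any cwnd below 38 yields the Standard TCP values.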

  8. Changes to tcp_cong_avoid
     • To achieve additive increase of cwnd during Congestion Avoidance, TCP needs to receive enough ACKs for a full congestion window before cwnd is incremented.
     • In the Linux kernel this is achieved by counting the ACKs since the last change of cwnd and only incrementing cwnd when this counter exceeds it.
     • In High Speed TCP, cwnd would be increased by a instead; alternatively, it could be incremented by one a times as often while ACKs are received.

  9. Changes to tcp_cong_avoid
     • Original loop code:

         static inline void tcp_cong_avoid(struct tcp_opt *tp)
         {
             if (tp->snd_cwnd <= tp->snd_ssthresh) {
                 /* In "safe" area, increase. */
                 if (tp->snd_cwnd < tp->snd_cwnd_clamp)
                     tp->snd_cwnd++;
             } else {
                 /* In dangerous area, increase slowly.
                  * In theory this is tp->snd_cwnd += 1 / tp->snd_cwnd
                  */
                 if (tp->snd_cwnd_cnt >= tp->snd_cwnd) {
                     if (tp->snd_cwnd < tp->snd_cwnd_clamp)
                         tp->snd_cwnd++;
                     tp->snd_cwnd_cnt=0;
                 } else
                     tp->snd_cwnd_cnt++;
             }
             tp->snd_cwnd_stamp = tcp_time_stamp;
         }

  10. Changes to tcp_cong_avoid
     • Changed loop code:

         static inline void tcp_cong_avoid(struct tcp_opt *tp)
         {
             if (tp->snd_cwnd <= tp->snd_ssthresh) {
                 /* In "safe" area, increase. */
                 if (tp->snd_cwnd < tp->snd_cwnd_clamp)
                     tp->snd_cwnd++;
             } else {
                 /* In dangerous area, increase slowly.
                  * In theory this is tp->snd_cwnd += 1 / tp->snd_cwnd
                  */
                 /* HSTCP: weight the ACK count by a(cwnd), so that cwnd
                  * is incremented a times as often.
                  */
                 if ((tp->snd_cwnd_cnt * get_hstcp_val(tp->snd_cwnd).a_val) >= tp->snd_cwnd) {
                     if (tp->snd_cwnd < tp->snd_cwnd_clamp)
                         tp->snd_cwnd++;
                     tp->snd_cwnd_cnt=0;
                 } else
                     tp->snd_cwnd_cnt++;
             }
             tp->snd_cwnd_stamp = tcp_time_stamp;
         }
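
To see what the changed condition does, the Congestion Avoidance branch can be modelled outside the kernel. The sketch below copies the branch logic from the slide into a hypothetical user-space struct and counts how many ACKs pass before cwnd grows by one. The struct, the helper names and passing a_val directly (instead of looking it up) are assumptions made for illustration, and a_val is assumed to be at least 1.

```c
/* User-space stand-in for the few tcp_opt fields the branch uses (hypothetical). */
struct cwnd_state {
    unsigned int snd_cwnd;
    unsigned int snd_cwnd_cnt;
    unsigned int snd_cwnd_clamp;
};

/* One ACK in Congestion Avoidance, following the changed branch on the slide. */
static void cong_avoid_step(struct cwnd_state *s, unsigned int a_val)
{
    if (s->snd_cwnd_cnt * a_val >= s->snd_cwnd) {
        if (s->snd_cwnd < s->snd_cwnd_clamp)
            s->snd_cwnd++;
        s->snd_cwnd_cnt = 0;
    } else {
        s->snd_cwnd_cnt++;
    }
}

/* Count the ACKs processed until cwnd first increases (requires a_val >= 1). */
static unsigned int acks_until_increment(unsigned int cwnd, unsigned int a_val)
{
    struct cwnd_state s = { cwnd, 0, ~0u };
    unsigned int acks = 0;

    while (s.snd_cwnd == cwnd) {
        cong_avoid_step(&s, a_val);
        acks++;
    }
    return acks;
}
```

Starting from a zero counter with cwnd = 100, this model takes 101 ACKs to increment cwnd when a_val is 1, and 26 ACKs when a_val is 4, so the window grows roughly a times as fast.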

  11. Changes to tcp_cwnd_down
     • Once a congestion event has occurred, the cwnd is reduced to adapt to the observed state of the network.
     • Traditionally, it is halved.
     • With High Speed TCP, it is proposed that the proportion of the decrease will depend on cwnd.

  12. Changes to tcp_cwnd_down
     • Original source:

         static void tcp_cwnd_down(struct tcp_opt *tp)
         {
             int decr = tp->snd_cwnd_cnt + 1;

             tp->snd_cwnd_cnt = decr&1;
             decr >>= 1;

             if (decr && tp->snd_cwnd > tp->snd_ssthresh/2)
                 tp->snd_cwnd -= decr;

             tp->snd_cwnd = min(tp->snd_cwnd, tcp_packets_in_flight(tp)+1);
             tp->snd_cwnd_stamp = tcp_time_stamp;
         }

  13. Changes to tcp_cwnd_down
     • Changed source:

         static void tcp_cwnd_down(struct tcp_opt *tp)
         {
             int decr = tp->snd_cwnd_cnt + 1;

             tp->snd_cwnd_cnt = decr&1;
             /* HSTCP: scale by b(cwnd), stored as b * 256, instead of halving. */
             decr = (int)((decr * get_hstcp_val(tp->snd_cwnd).b_val)>>8);

             if (decr && tp->snd_cwnd > tp->snd_ssthresh/2)
                 tp->snd_cwnd -= decr;

             tp->snd_cwnd = min(tp->snd_cwnd, tcp_packets_in_flight(tp)+1);
             tp->snd_cwnd_stamp = tcp_time_stamp;
         }
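
The `* b_val >> 8` step is fixed-point arithmetic on the 0 to 256 scale mentioned on slide 7. The helpers below isolate that arithmetic so it can be tried in user space. Both function names are hypothetical, and rounding b to the nearest integer when scaling is an assumption, since the slides do not say how the table values were rounded.

```c
/* Convert a decrease fraction b in [0, 1] to the 0..256 scale (rounded; assumption). */
static unsigned int b_to_fixed(double b)
{
    return (unsigned int)(b * 256.0 + 0.5);
}

/* Multiply decr by b using the scaled value; the shift truncates toward zero. */
static int scale_decr(int decr, unsigned int b_val)
{
    return (int)(((unsigned int)decr * b_val) >> 8);
}
```

With b_val = 128 (b = 0.5) this reproduces the original halving: `scale_decr(100, 128)` is 50. One property worth keeping in mind is the truncation for small inputs: `scale_decr(2, 112)` is 0, so for decr values of 1 or 2 any b_val below 128 yields no decrement at all.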

  14. Initial results
     • Implemented on a P3 450 MHz
     • NOT Gigabit
     • Patched with Web100 alpha 1.2
     • Conducted tests since… yesterday!
     • WAN tests from UCL to RAL, CERN, Daresbury, Amsterdam & Manchester
     • Early results… Basic Analysis

  15. Throughput Comparison

  16. Web 100 Cwnd Growth

  17. What next?
     • Need to develop an ‘advanced test program’ to fully explore the HSTCP performance space
     • GUY has a thorough Network Simulator analysis of stock HSTCP; need to compare results
     • Need to explore the parameter space with different values of Low_Loss, Low_Window; High_Window, High_Loss; High_B
     • Implement /proc hooks to enable easy configuration of HSTCP parameters
     • Investigate the performance cost of the lookup table on hosts
     • More results! Especially on GigE.
     • Expand tests to America, esp. SLAC (high delay)
     • Investigate fairness compared to other TCP implementations
