
TCP Round Trip Time Analysis in a University Network.

TCP Round Trip Time Analysis in a University Network. Justifying the pursuit of Active Queue Management (AQM) research. Author: Jonathan Thyer September 2004 - March 2005. Disclaimers. This work used to be a thesis… And then reality sunk in…



Presentation Transcript


  1. TCP Round Trip Time Analysis in a University Network. Justifying the pursuit of Active Queue Management (AQM) research. Author: Jonathan Thyer September 2004 - March 2005

  2. Disclaimers • This work used to be a thesis… • And then reality sank in… • Just one individual in a talented research community, hoping to make a small contribution. • Wonderful family and busy work life. • This really is a mixture of a thesis and a project presentation. • Now – have I lowered your expectations enough? 8-)

  3. How do you communicate across the Internet? • Access your local network • Use a well-defined communications protocol • E.g. HTTP (the web) • Email • Type in some Internet destination address and away you go! • lunatic@thyer.org • http://www.thyer.org/ • But communications are not as efficient as they could be. • Assertion: The Internet is under-performing!

  4. What is happening under the computer covers? • Your Internet destination name gets translated into a 32-bit number by a network service called the Domain Name System (DNS) • Your computer initiates communication with the destination Internet address. • Numerous Internet protocol routers, switches, hubs and physical media carry your communications from source to destination and back again.

  5. Communications Protocols • Open Systems Interconnection (OSI) model. • Communication protocols are defined in a layered application programming interface. • Why? Because it is easy to understand and to programmatically implement! • Layer 7: Applications (Web browser, HTTP etc) • Layer 6: Presentation layer (data conversions) • Layer 5: Session establishment (not communication orientated) • Layer 4: Transport protocol (often UDP/TCP) • Layer 3: Internet Protocol (logical addresses) • Layer 2: Data link layer - framing characteristics (often Ethernet) • Layer 1: Physical (radio frequency) characteristics

  6. Data link layer (layer 2) • Data can be sent between local area network devices at layer 2. • Data is broken down into smaller chunks called frames (often loosely called packets). • Different data link transmission protocols can be used. • Ethernet has become the common standard and uses 48-bit (6 byte) source and destination addresses. • Data link layer communications are confined to local area networks through either point to point or shared media links. • Typically fewer than 1000 devices in a local area network (often fewer than 255).

  7. Internet Protocol (layer 3) • Known as the logical layer (32-bit source/destination addresses) • A system called the “Domain Name System” (DNS) maps names to these numeric addresses. • E.g. www.uncg.edu = 152.13.2.96 • Data is also transported in packet form but can be routed between multiple local area networks. • A protocol called “Address Resolution Protocol” (ARP) translates IP (layer 3) addresses into layer 2 Ethernet addresses. • ARP is the glue between layer 3 and layer 2.

  8. Computer 1 → Computer 2 • C#1 sends ARP request – who has 192.168.1.2? • C#2 replies – that's me – and supplies its 48-bit address. • C#1 addresses data to C#2 using the supplied 48-bit address and sends it.

  9. Computer 1 → Computer 4 • C#1 knows that C#4 is not in the local network. • How? C#1 performs a logical AND of the destination IP address with its subnet mask and compares the result with its own network address. • C#1 sends ARP – who has 192.168.1.254? – the router replies with its 48-bit address. • C#1 sends data to the router; the router then looks in its route tables for the destination logical address. • The router sends ARP into the destination network – who has 192.168.99.1? C#4 replies – that's me!
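
The masking step on this slide can be sketched in a few lines of Python. The slide does not give C#1's own address or its mask, so 192.168.1.1 and 255.255.255.0 below are assumptions chosen to fit the addresses that do appear; `same_network` is a hypothetical helper name.

```python
import ipaddress


def same_network(src: str, dst: str, netmask: str) -> bool:
    """Return True when src and dst fall in the same network under netmask."""
    mask = int(ipaddress.IPv4Address(netmask))
    # Logical AND with the mask strips the host bits and leaves the network part.
    src_net = int(ipaddress.IPv4Address(src)) & mask
    dst_net = int(ipaddress.IPv4Address(dst)) & mask
    return src_net == dst_net


# Assumed addressing: C#1 is 192.168.1.1 with a 255.255.255.0 mask.
print(same_network("192.168.1.1", "192.168.1.2", "255.255.255.0"))   # local: ARP for C#2 directly
print(same_network("192.168.1.1", "192.168.99.1", "255.255.255.0"))  # remote: ARP for the router
```

When the two network parts differ, the host sends the frame to its default router (192.168.1.254 on the slide) instead of trying to reach the destination directly.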

  10. What is a router? • A router is a device that operates at the OSI logical layer 3. • It knows what to do with data arriving that has logical IP addresses for source and destination. • A router builds routing tables to represent “networks” that are either directly connected or available through a neighboring router. • A router is designed to find the shortest network path between a source network and destination network. • A router often has multiple different physical links connected to it. There are often multiple possible routes to any specific network.

  11. Transport (layer 4) • User Datagram Protocol (UDP) – a stateless and connectionless protocol. • UDP packets get sent directly from source to destination, and the protocol itself gives the source no way to know that the data arrived intact. • Transmission Control Protocol (TCP) – a stateful and connection-oriented protocol. • TCP data is sent in segments. • A positive acknowledgement must be received for each segment sent. • TCP is the majority carrier of traffic on the Internet. • Why? • It is reliable – guaranteed delivery of all data content. • Validated over time and widely implemented. • Specified in 1981 in RFC 793, edited by Jon Postel.

  12. TCP – Transmission and congestion control

  13. Internet – Old Days

  14. Internet – More recent times..

  15. Buffers / Queues everywhere

  16. Round Trip Time (RTT) • RTT is the time elapsed between when a TCP data segment is sent and when that segment's corresponding acknowledgement (ACK) is received. • RTT is an important measure of Internet performance. • RTT directly impacts TCP performance characteristics on end systems. • RTT is impacted by routers along the communication path.

  17. My Goals • Develop a tool to measure TCP RTT data between the UNCG campus and the Internet • Produce frequency plots of the RTT data collected • Why? Because I had to do something to prove what Shan Suthaharan was telling me! • Try and explain the results. • Build a small network to perform further research within. • The tool developed is called tcpflowstat

  18. Data Collection Setup • NCREN: North Carolina Education and Research Network • UNCG to NCREN link averages about 60 – 80 Mbit/s over time. • Common “port spanning” method used to “copy” all Internet data to collection host • Collection host uses a program called “tcpdpriv” to collect the data. • Collected 100,000,000 packet samples over several days.

  19. Ethical Concerns • “tcpdpriv” does a number of things to change the data while preserving traffic characteristics • Source and Destination addresses are replaced with incrementing 32-bit numbers starting from 0. • TCP port information is replaced with random numbers. • Data content section of packet is discarded. • Packet header is stored to a file in “PCAP” format. • PCAP is a public domain packet header capture format for UNIX systems.
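
The address replacement described above can be illustrated with a small sketch. This is not tcpdpriv's actual code; `AddressAnonymizer` is a hypothetical class that only shows the idea of mapping each distinct address to the next number in sequence, starting from 0.

```python
class AddressAnonymizer:
    """Map each distinct address to an incrementing integer, starting at 0."""

    def __init__(self):
        self._table = {}

    def anonymize(self, addr: str) -> int:
        # The first time an address is seen it gets the next unused number;
        # later packets with the same address reuse that number.
        if addr not in self._table:
            self._table[addr] = len(self._table)
        return self._table[addr]


anon = AddressAnonymizer()
for addr in ["152.13.2.96", "192.168.1.2", "152.13.2.96"]:
    print(addr, "->", anon.anonymize(addr))
```

Because the same real address always maps to the same number, packets can still be grouped into flows after the real addresses are gone.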

  20. Definition of a TCP Flow • A unique, reliable communication between a source and destination computer using the TCP protocol. • Think of dialing an office phone number, then using an extension number after that. • The phone number would be the destination IP address, then the extension becomes the TCP socket or port number. • A TCP flow is defined as the five-tuple of TCP protocol, source IP address, destination IP address, source TCP port, and destination TCP port. • There can be multiple TCP flows between a source and destination computer.

  21. RTT – How to calculate it! • From research literature, there are three basic calculation methods • Take the time difference between the TCP SYN packet and the resulting SYN/ACK for that SYN. • Use the change in window size during slow start – calculate the difference between data segment inter-arrival times… Uses a time threshold to determine a flight (burst) of packets. • Use a fluid dynamic view, treating traffic at the level of bits per unit time. The basis is that when TCP is in congestion avoidance mode, the window size increases by one MSS every RTT.

  22. RTT – with limited resources… • Related research shows that the SYN – SYN/ACK method is a reasonably good estimator of RTT. • Other methods depend on averaging several hundred values per TCP flow of communication. • I had only limited computing power available!

  23. Basic operation of the tcpflowstat program • Opens a packet capture file • For each packet header in the file • Find a TCP packet • If the packet is a SYN packet, allocate a tcpflow data structure node and use the IP and TCP port addressing as the key item. • If not, search the tcpflow data structure to see if this packet matches an existing flow • If the packet is a SYN-ACK, then calculate time difference and update RTT data value in flow data structure. • If the packet is a FIN or RST packet, then the flow is removed from the data structure and placed in a “completed flow” linked list.
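
The loop on this slide can be sketched as follows. This is an illustration under stated assumptions, not the tcpflowstat source: the `Packet` and `Flow` records and the `analyze` function are hypothetical, packets arrive as already-parsed records instead of being read from a capture file, and a Python dict stands in for the tcpflow data structure.

```python
from dataclasses import dataclass
from typing import Optional

# Standard TCP header flag bits.
FIN, SYN, RST, ACK = 0x01, 0x02, 0x04, 0x10


@dataclass
class Packet:
    ts_ms: float   # capture timestamp in milliseconds
    src: int       # (anonymized) source address
    dst: int       # (anonymized) destination address
    sport: int
    dport: int
    flags: int


@dataclass
class Flow:
    syn_ts_ms: float
    last_ts_ms: float
    rtt_ms: Optional[float] = None


def analyze(packets):
    """Return the list of completed flows, each with its RTT estimate."""
    active = {}      # flow key -> Flow
    completed = []   # flows closed by FIN or RST
    for p in packets:
        fwd = (p.src, p.dst, p.sport, p.dport)
        rev = (p.dst, p.src, p.dport, p.sport)
        is_syn = bool(p.flags & SYN)
        is_ack = bool(p.flags & ACK)

        if is_syn and not is_ack:
            # A bare SYN opens a new flow keyed by its addressing.
            active[fwd] = Flow(syn_ts_ms=p.ts_ms, last_ts_ms=p.ts_ms)
            continue

        # Otherwise match the packet to an existing flow in either direction.
        if fwd in active:
            key = fwd
        elif rev in active:
            key = rev
        else:
            continue
        flow = active[key]
        flow.last_ts_ms = p.ts_ms

        if is_syn and is_ack and key == rev and flow.rtt_ms is None:
            # The SYN-ACK travels in the reverse direction of the SYN.
            flow.rtt_ms = p.ts_ms - flow.syn_ts_ms

        if p.flags & (FIN | RST):
            completed.append(active.pop(key))
    return completed
```

A short usage example with made-up timestamps:

```python
trace = [
    Packet(0.0, 1, 2, 1000, 80, SYN),
    Packet(11.0, 2, 1, 80, 1000, SYN | ACK),
    Packet(758.0, 1, 2, 1000, 80, FIN | ACK),
]
flow = analyze(trace)[0]
print(flow.rtt_ms, flow.last_ts_ms - flow.syn_ts_ms)
```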

  24. Challenges to overcome • Each TCP flow detected (by seeing a SYN) forces the code to allocate memory for a tcpflow node. • Each TCP packet potentially results in a search of the tcpflow data structure • Data structures must be efficient.

  25. The tcpflow data structure • I chose to use a hash table to implement the data structure. • Hash table size was set to a large prime number not close to a power of 2. • In this case, the number was set to 47,189 • about halfway between 2^15 and 2^16 • This ensures fairly even distribution of hashing keys in the table.
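
A minimal sketch of such a table is shown below. Only the table size of 47,189 comes from the slide; the key mixing in `_index` is an assumption made for illustration, and Python lists stand in for the linked lists that a later slide says were used to resolve collisions.

```python
TABLE_SIZE = 47189  # prime table size quoted on the slide


class FlowTable:
    """Fixed-size hash table with chaining, keyed by (src, dst, sport, dport)."""

    def __init__(self, size: int = TABLE_SIZE):
        self.size = size
        self.buckets = [[] for _ in range(size)]

    def _index(self, key) -> int:
        src, dst, sport, dport = key
        # Hypothetical mixing step: fold the four fields together, then
        # reduce modulo the prime table size.
        h = ((src * 31 + dst) * 31 + sport) * 31 + dport
        return h % self.size

    def insert(self, key, value) -> None:
        bucket = self.buckets[self._index(key)]
        for i, (k, _) in enumerate(bucket):
            if k == key:
                bucket[i] = (key, value)  # overwrite an existing entry
                return
        bucket.append((key, value))       # otherwise chain onto the bucket

    def lookup(self, key):
        for k, v in self.buckets[self._index(key)]:
            if k == key:
                return v
        return None

    def remove(self, key):
        bucket = self.buckets[self._index(key)]
        for i, (k, v) in enumerate(bucket):
            if k == key:
                del bucket[i]
                return v
        return None
```

Lookup cost stays close to constant as long as the chains stay short, which is what an even spread of keys across the buckets is meant to achieve.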

  26. RTT – A word about time! • The UNIX PCAP library stores each packet's timestamp as seconds plus microseconds. • Time delay may be introduced due to processing the packet header on the data collection host. • The seconds portion of the time stamp was multiplied by 1000 and the microseconds portion divided by 1000, and the two were added to bring the measurements into a millisecond time unit.
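
As a hedged illustration of the unit conversion: libpcap packet headers carry a `struct timeval`, which holds whole seconds (`tv_sec`) and microseconds (`tv_usec`). The helper names below are hypothetical, and the sketch shows one way to reach milliseconds, not necessarily the arithmetic tcpflowstat used.

```python
def timeval_to_ms(tv_sec: int, tv_usec: int) -> float:
    """Convert a (seconds, microseconds) timestamp to milliseconds."""
    return tv_sec * 1000.0 + tv_usec / 1000.0


def elapsed_ms(start, end) -> float:
    """Milliseconds between two (tv_sec, tv_usec) pairs.

    The subtraction is done in integer microseconds first, so large
    absolute timestamps do not cost floating-point precision.
    """
    usec = (end[0] - start[0]) * 1_000_000 + (end[1] - start[1])
    return usec / 1000.0


print(timeval_to_ms(2, 500000))
print(elapsed_ms((10, 900000), (11, 100000)))
```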

  27. Final processing • When all packets have been read and processed, the final steps are as follows • Sort flow duration data, then calculate the min, max, mean, and median • Sort flow RTT data, then calculate the min, max, mean, and median • Calculate the flow RTT frequency and output an RTT frequency file for gnuplot to process. • Output RTT, duration, and overall statistics to the screen.
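
The final steps can be sketched as below. The function names and the 1 ms bin width are assumptions; the output is plain two-column text (RTT, count), which gnuplot can plot directly.

```python
import statistics
from collections import Counter


def summarize(values_ms):
    """Return min, max, mean and median of a non-empty list of values."""
    ordered = sorted(values_ms)
    return {
        "min": ordered[0],
        "max": ordered[-1],
        "mean": statistics.mean(ordered),
        "median": statistics.median(ordered),
    }


def rtt_frequency(rtts_ms, bin_ms=1):
    """Count how many flows fall into each RTT bin (assumed 1 ms wide)."""
    counts = Counter(int(r // bin_ms) * bin_ms for r in rtts_ms)
    return sorted(counts.items())


def format_frequency(freq) -> str:
    """Render (rtt, count) pairs as whitespace-separated columns."""
    return "".join(f"{rtt} {count}\n" for rtt, count in freq)


rtts = [0.4, 0.9, 11.2, 11.7, 20.0]
print(summarize(rtts))
print(format_frequency(rtt_frequency(rtts)), end="")
```

Writing the returned string to a file gives a data file that a gnuplot `plot` command can read as x and y columns.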

  28. Tcpflowstat - code performance • Run on a Pentium III with a 1 GHz CPU and 512 MB of RAM. • Processed 100 million packets of data in 30 minutes of elapsed time. • Approximately 7 – 10 Gbytes of data on disk to process. • Most time spent waiting for disk activity and in memory management routines. • UNIX malloc code is notoriously inefficient (linear), especially when using the “free” routines.

  29. Performance cont… • Hash table exhibited collisions linearly from the upper bound. • Collision resolution was implemented through simple linked lists. • “tcpdpriv” sequential numbering of addresses created sequential hash keys (not too bad actually) • UNIX modulus function could be optimized • Large amount of RAM usage due to thousands of parallel TCP flows being processed within any time span. • A multiple indirect hashing approach would be better – i.e. break the src/dest IP address down by octet. This is commonly implemented in routers.

  30. Initial Results

      TCP Flow Duration
      -------------------------
      MIN    = 2 ms
      MAX    = 4794030 ms
      MEAN   = 3593 ms
      MEDIAN = 758 ms
      -------------------------

      TCP Round Trip Time (RTT)     Overall Stats
      -------------------------     --------------------------
      MIN    = 0 ms                 TCP   = 93455966 packets
      MAX    = 63108 ms             UDP   = 6290127 packets
      MEAN   = 221 ms               ICMP  = 192462 packets
      MEDIAN = 11 ms
      -------------------------     --------------------------
      802373 TCP flows counted.     TOTAL = 100000000 packets
      -------------------------     --------------------------

  31. Observations on statistical breakdown • Over 90% of traffic is TCP. • A vast majority of flow durations are short (less than 1 second) • Likely due to web transactions which tend to be many and short. • Mean flow duration is higher than the median. • A fair number of measured flows have longer durations. • Related research confirms that longer duration flows dominate the Internet traffic.

  32. Initial RTT Frequency Plot

  33. Observations • High frequency of TCP flows exhibiting RTT of less than 5 ms. • Significant percentage of TCP flows with RTT of approx. 20ms. • Peaks and valleys across the distribution • Skepticism thus: • Four more samples were taken over the period of about 1 week.

  34. Further combined sample results

  35. Combined RTT Frequencies – Loch Ness Monster Plot!

  36. Congested Router Diagram

  37. Possible explanations • Assuming that a router becomes a congestion point, a burst of traffic will cause queue overflow (droptail) • Global TCP congestion control synchronization will occur during queue overflow. • All affected TCP flows will synchronously cut their window size in half (multiplicative decrease). • Flows deeper in queue will not experience packet drop but will experience delay. • Flow treatment is not equal.

  38. What do we want to see…

  39. How to achieve a desired result • Only a router along the path knows its own congestion conditions at a point in time. • At high congestion times, we must ensure that there is no congestion control synchronization. • Random packet drop or marking (ECN) is appropriate to force a selection of flows to reduce their window sizes. • ECN is defined in RFC-3168 (borrows two bits from a reserved part of the header) • Queue size must be optimized so that flow delay is minimized.

  40. Random Early Detection (RED) [Floyd/Jacobson] • Two queue thresholds used, min_th and max_th. • When ave size < min_th, no packets marked • When ave size >= min_th, <= max_th, mark packets with probability p where p is a function of ave queue size. • When ave size > max_th, mark all packets
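
The three cases on this slide can be sketched as a marking function. The slide says only that p is a function of the average queue size; the linear ramp up to `max_p` and the exponentially weighted moving average used here follow the commonly described form of RED, and the default values (`max_p=0.1`, `weight=0.002`) are assumptions. The per-packet count adjustment in the full algorithm is left out.

```python
import random


def red_mark_probability(avg, min_th, max_th, max_p=0.1):
    """Marking probability for a given average queue size."""
    if min_th >= max_th:
        raise ValueError("min_th must be below max_th")
    if avg < min_th:
        return 0.0   # below the lower threshold: mark nothing
    if avg > max_th:
        return 1.0   # above the upper threshold: mark everything
    # Between the thresholds: ramp linearly from 0 up to max_p.
    return max_p * (avg - min_th) / (max_th - min_th)


def update_average(avg, queue_len, weight=0.002):
    """Exponentially weighted moving average of the queue length."""
    return (1.0 - weight) * avg + weight * queue_len


def should_mark(avg, min_th, max_th, max_p=0.1, rng=random.random):
    """Randomly decide whether to mark (or drop) the arriving packet."""
    return rng() < red_mark_probability(avg, min_th, max_th, max_p)


avg = 0.0
for _ in range(1000):            # queue suddenly sits at 100 packets
    avg = update_average(avg, 100)
print(round(avg, 1), red_mark_probability(avg, min_th=5, max_th=15))
```

With a small weight the average follows a sudden burst only gradually, which is why the averaging step matters for how quickly marking begins.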

  41. RED is not always sufficient • Sudden congestion can keep queue depth above the maximum threshold. • RED can degenerate into the same behavior as a drop-tail configuration. • Weighted moving average algorithm reacts too slowly to sudden changes.

  42. Queue depth and router buffer sizes • 1994 paper (Villamizar and Song) set the standard router buffer size at B = RTT × C • This is a commonly used formula today! • Subsequent paper at SIGCOMM 2004 from researchers at Stanford suggests the more appropriate formula is B = (RTT × C) / √n • Where C is the link capacity and n = no. of flows. • The denominator of this equation represents a variable that must be dynamic. It is the “predictor of congestion” variable. Shan Suthaharan is currently seeking a patent for predictive algorithms that determine this variable.
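
To give a feel for the difference between the two sizing rules, here is a small worked calculation. It assumes the usual statements of the rules, B = RTT × C for the 1994 rule of thumb and B = RTT × C / √n for the SIGCOMM 2004 result; the 250 ms RTT and the 10,000 concurrent flows are illustrative assumptions, and the 80 Mbit/s capacity is taken from the upper end of the UNCG link average quoted earlier.

```python
import math


def buffer_rule_of_thumb(rtt_s: float, capacity_bps: float) -> float:
    """Buffer size in bytes under B = RTT * C."""
    return rtt_s * capacity_bps / 8


def buffer_small(rtt_s: float, capacity_bps: float, n_flows: int) -> float:
    """Buffer size in bytes under B = RTT * C / sqrt(n)."""
    return rtt_s * capacity_bps / (8 * math.sqrt(n_flows))


rtt = 0.25        # seconds (assumed)
capacity = 80e6   # bits per second
flows = 10_000    # concurrent flows (assumed)

print(buffer_rule_of_thumb(rtt, capacity))   # bytes
print(buffer_small(rtt, capacity, flows))    # bytes
```

Under these assumptions the second rule asks for one hundredth of the buffer, since √10,000 = 100.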

  43. Why not just reduce buffer size? • Reducing buffer size would likely reduce the incidence of delayed traffic flows. • Buffer overflow would still result in a TCP congestion control synchronization problem. • Still also have the problem of unfair treatment of flows – first come, first serve is not necessarily best. • Poor performance would still result.

  44. Conclusions • Maximum buffer size should be reduced as shown in SIGCOMM’04 paper → saves $ and eases hardware design concerns. • Active queue management (AQM) should be used. • A combination of reduced, preferably dynamic, maximum buffer size and AQM should reduce congestion control synchronization and increase fair treatment of different TCP flows. • Implementations should be simple to use; perhaps even be the default configuration. • New methods of active queue management must continue to be researched and developed.

  45. Furthering research efforts • Collect more data, 8 – 12 hour samples would be nice. • Build a test network rather than using a simulator. • Use sources of real traffic as testing environment. • Write a program to completely replay all traffic captured on a specific Internet link. (not easy)

  46. Our small research network build!

  47. FreeBSD is useful! • O/S has an in-built firewall for matching specific packet flows. • A kernel module called “DummyNet” exists for research use. • DummyNet can be configured to buffer traffic for extra time • Uses mbufs – BSD ring buffer to delay traffic • Danger of overflowing delay buffer • Full source code is freely available and well documented.

  48. Using the end hosts, generate multiple thousands of TCP data streams. • Simple server listener code that generates character data should be sufficient • Lower the link speed at the center of the network to force the routers to buffer traffic • Modify the ALTQ code to implement different active queue management algorithms. • Connect analyzer host and use tcpflowstat to analyze traffic characteristics.
