380 likes | 416 Views
This paper presents the UDT (UDP-based Data Transfer) protocol, which provides reliable, application-level, duplex data transfer over UDP with congestion control. The protocol supports efficient and fair bandwidth utilization, overcoming the limitations of TCP in high bandwidth-delay product (BDP) links. The paper discusses the UDT protocol framework, packet structure, congestion control, and implementation/simulation results.
E N D
UDT: UDP based Data Transfer Yunhong Gu & Robert Grossman Laboratory for Advanced Computing University of Illinois at Chicago
Outline • Background • UDT Protocol • UDT Congestion Control • Implementation/Simulation Results • Summary PFLDnet 2004
Background • Distributed data intensive applications over wide area optical networks: • Grid computing, access of bulk scientific data, data mining, high resolution video, etc. • Transport protocol support: • Efficient and fair bandwidth unitization • TCP does not work! PFLDnet 2004
Trans-Atlantic TCP Performance • Chicago -> Amsterdam, 1Gbps link capacity, 110ms RTT • TCP: 5Mbps @ default setting (64KB buffer) • TCP: 100Mbps @ 12MB buffer (=1Gbps*110ms) • Parallel TCP: 800Mbps @ 64 TCP concurrent flows, with each having 1MB buffer • Two concurrent TCP flows, 1 from Chicago to Amsterdam, 1 within Chicago local networks: • 2Mps vs. 940Mbps! PFLDnet 2004
Why TCP Fails • Discover/recover slow on high BDP links • Increase 1 byte per RTT • Drastic decrease in sending rate • Fairness bias on longer RTT links • More prone to link error in high BDP links B: throughout in packets per second, p: loss rate PFLDnet 2004
Requirements to the New Protocol • FAST • High utilization of the abundant bandwidth either with single or multiplexed connections • FAIR • Intra-protocol fairness, independent of RTT • FRIENDLY • TCP compatibility PFLDnet 2004
Use Scenarios • Small number of sources shares abundant bandwidth • Bulk data transfer • Most of the packets can be packed in maximum segment size (MSS) in a UDT session • MSS can be set up by applications and the optimal value is the path MTU PFLDnet 2004
What’s UDT? • UDT: UDP based Data Transfer • Reliable, application level, duplex, transport protocol, over UDP with congestion control • Implementation: Open source C++ library • Two orthogonal parts • The UDT protocol framework that can be implemented above UDP, with any suitable congestion control algorithms • The UDT congestion control algorithm, which can be implemented in any transport protocols such as TCP PFLDnet 2004
Packet Structure • Data Packet: • Header: 1bit flag + 31bit sequence number • Control Packet: • Header: 1bit flag + 3bit type + 12bit reserved + 16bit ACK seq. no. + (0 - 32n)bit control info • Type: ACK, ACK2, NAK, Handshake, Keep-alive, and Shutdown • Actual size of a UDT packet can be ascertained from UDP header PFLDnet 2004
Data Packet • Flag Bit: 0 • UDT uses 31-bit packet based sequence number, ranging from 0 and (231 - 1) • Sequence number may be wrapped if it exceeds the maximum available number PFLDnet 2004
Control Packet • Flag Bit: 1 • type: 3-bit • handshake (000), shutdown (101), keep-alive (001) • ACK (010), ACK2 (110), NAK (011) • UDT uses sub-sequencing: each ACK and related ACK2 are assigned a 16-bit unique ACK sequence number PFLDnet 2004
Acknowledgements • Selective acknowledgement (ACK) • Generated at every constant interval to send back largest continuously received sequence number of data packets. • The sender sends back an ACK2 to the receiver for each ACK (sub-sequencing). • Also carries RTT, packet arrival speed, and estimated link capacity. • Explicit negative acknowledgement (NAK) • Generated as soon as loss is detected. • Loss information may be resent if receiver has not received the retransmission after an increasing interval. • Loss information is compressed in NAK. PFLDnet 2004
Timing • Packet Scheduling Timer • Tuned by Rate Control • High precision in CPU clock cycles • Rate Control Timer: trigger rate control • RCTP = 0.01 seconds • ACK Timer: trigger acknowledgement • ATP = RCTP PFLDnet 2004
Timing (cont.) • NAK Timer: trigger negative acknowledgement • NTP = RTT • Retransmission Timer: trigger retransmission based on time-out and maintain connection status • RTP = (exp-count + 1) * RTT + ATP where exp-count is the number of continuous time-out PFLDnet 2004
Pkt. Scheduling Timer Sender Sender Sender DATA Recver Recver ACK ACK2 ACK Timer NAK NAK Timer Retransmission Timer Rate Control Timer UDT Architecture PFLDnet 2004
Congestion Control • Rate based congestion control (Rate Control) • RC tunes the packet sending period. • RC is triggered periodically at the sender side. • RC period is constant of 0.01 seconds. • Window based flow control (Flow Control) • FC limits the number of unacknowledged packets. • FC is triggered on each received ACK at the sender side. PFLDnet 2004
… … Rate Control • AIMD: Increase parameter is related to link capacity and current sending rate; Decrease factor is 1/9, but not decrease for all loss events. • Link capacity is probed by packet pair, which is sampled UDT data packets. • Every 16th data packet and it successor packet are sent back to back to form a packet pair. • The receiver uses a median filter on the interval between the arrival times of each packet pair to estimate link capacity. PFLDnet 2004
Rate Control (cont.) 1. If loss rate is greater than 1%, do not increase; 2. Number of packets to be increased in next RCTP time is: where B is estimated link capacity, C is current sending rate. Both are in packets or packets per second. MSS is the packet size in bytes. β = 1.5 * 10-6. 3. Recalculate packet sending period (STP). PFLDnet 2004
Rate Control (cont.) B = 10Gbps, MSS = 1500 bytes PFLDnet 2004
Rate Control (cont.) • Decrease sending rate by 1/9, (or equivalently, increase packet sending period by 1.125), only if • Received an NAK, whose last lost sequence number is greater than the largest sequence number when last decrease occurred; or • The number of loss events since last decrease has exceeded a threshold, which increases exponentially and is reset when condition 1 is satisfied. • No data will be sent out for the next RCTP time if a decrease occurs. • Help to clear congestion. PFLDnet 2004
Flow Control BDP • W = W*0.875 + AS*(RTT+ATP)*0.125 • AS is the packets arrival speed at receiver side. • The receiver records the packet arrival intervals. AS is calculated from the average of latest 16 intervals after a median filter. • It is carried back within ACK. PFLDnet 2004
Slow Start • Flow window starts at 2 and increases to the number of acknowledged packets, until the sender receives an NAK or reaches the maximum window size, when slow start ends. • Packet sending period is 0 during slow start phase and set to the packet arrival interval at the end of the phase. • Slow start only occurs at the beginning of a UDT session. PFLDnet 2004
Implementation: Performance PFLDnet 2004
Implementation: Intra-protocol Fairness PFLDnet 2004
Implementation: TCP Friendliness PFLDnet 2004
Implementation: TCP Friendliness (cont.) PFLDnet 2004
Implementation: File Transfer 1Gbps/15.9ms 1Gbps/110ms Canarie StarLight SARA Disk R: 800Mbps W: 550Mbps Disk R: 800Mbps W: 500Mbps Disk R: 1300Mbps W: 900Mbps PFLDnet 2004
Simulation: UDT Throughput at Different Bandwidth and RTT PFLDnet 2004
Simulation: Performance of Concurrent UDT Flows PFLDnet 2004
Simulation: Intra-protocol Fairness PFLDnet 2004
Simulation: RTT Independence PFLDnet 2004
Simulation: TCP Friendliness PFLDnet 2004
Simulation: Convergence/Stability PFLDnet 2004
100 100 100 10 50 Simulation: Complex Scenario Link capacity Mbps Node DropTail Flow and its ID PFLDnet 2004
B 200 A 200 x C Simulation: Multi-bottleneck PFLDnet 2004
Summary • UDT Protocol • Application level upon UDP • Selective acknowledgement / explicit negative acknowledgement • UDT Congestion Control • Rate Control • Bandwidth estimation for fast probing available bandwidth and fast recovery • AIMD for fairness • Constant rate control interval • Flow Control • Dynamic flow window according to packet receiving speed PFLDnet 2004
UDT Characters • Good use of available bandwidth • Application level - no changes in router and operating system • No manual tuning • Fair and Friendly: intra-protocol fairness, TCP friendliness, and RTT independence. • Open source PFLDnet 2004
Thank You! LAC: www.lac.uic.edu UDT: sourceforge.net/projects/dataspace Internet Draft: draft-gg-udt-01.txt