Transport Protocols over Circuits/VCs. Master of Engineering Presentation by Helali Bhuiyan Computer Engineering University of Virginia. Outline. Motivation and Problem Statement Related Work Background Types of Circuits/VCs TCP over Circuits/VCs Solutions Conclusions.
Transport Protocols over Circuits/VCs Master of Engineering Presentation by Helali Bhuiyan Computer Engineering University of Virginia
Outline • Motivation and Problem Statement • Related Work • Background • Types of Circuits/VCs • TCP over Circuits/VCs • Solutions • Conclusions
Motivation and Problem Statement • Motivation • High-bandwidth circuit-switched or virtual-circuit (VC) networks are being used to support eScience projects • Problem Statement • Design transport protocols for different types of circuits/VCs
Related Work • Several UDP-based transport protocols have been developed specifically for circuits • Reliable Blast UDP (RBUDP) • Rate-Adaptive Protocol for Information Delivery (RAPID) • RBUDP+ and RAPID+ • To keep the circuit fully utilized, these solutions try to match their sending rates with the reserved bandwidth • No congestion in circuits, hence no packet loss within the network • Multitasking at the receiving host may cause receive-buffer overflow • Adjust sending rate dynamically based on the feedback received from the receiver
Related Work • TCP (i.e., Reno) is not suitable for high-bandwidth connectionless paths • Packets can be lost due to congestion within the network • It takes a long time to recover from a packet-loss event • Several high-speed variants of TCP have been developed • Higher growth rate of the congestion window leads to lower recovery time • Examples: BIC TCP, FAST TCP • High-speed variants of TCP aim to solve the congestion problem on connectionless-network paths • Are they suitable for circuits?
Related Work: User-Space vs. Kernel-Space • UDP-based user-space implementations • Receive-buffer overflows can occur due to multitasking • Receiving host needs to send loss reports • A window-based kernel-level implementation, as in TCP, is a simpler solution • Receiving host sends receive-buffer size within each ACK packet, which reflects the exact state of the host’s loading (multitasking) condition
Outline • Motivation and Problem Statement • Related Work • Background • Types of Circuits/VCs • TCP over Circuits/VCs • Solutions • Conclusions
Types of Circuits/VCs • Different types of circuits/VCs are possible in the data network • Layer-1 circuit: a GbE (Gigabit Ethernet) port is mapped to an equivalent or lower-rate SONET circuit • Layer-2 circuit: VLAN on a GbE port is mapped to a single SONET circuit • Multiplexed Layer-2 circuit: multiple VLANs are mapped to the same SONET circuit [Figure: two switches, each with a GbE port and a SONET interface]
TCP over Circuits/VCs • As TCP was originally designed for connectionless networks, several features of TCP require special attention if we want to use TCP on circuits • Congestion control algorithm • Slow start • Congestion window is increased for each ACK received • Number of outstanding packets increases, if not constrained by TCP buffers • TCP send and receive buffers • TCP buffers smaller than the BDP (bandwidth-delay product) of the path will result in lower throughput • Congestion-window reset • Congestion window is reset if connection is idle for more than one retransmission-timeout • Receive-side autotuning • Size of the receive-side TCP buffer increases gradually • Congestion-window reduced (CWR) state • Overflowing IP-transmission queue causes TCP to enter CWR state
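The point about TCP buffers smaller than the BDP can be illustrated with a rough model: a window-based sender delivers at most one window per round trip, so its throughput is capped at window/RTT until the window reaches the BDP. The sketch below is only an approximation under that assumption. The 155 Mbps rate and 8 ms RTT are illustrative values, and the window sizes are hypothetical.

```python
def bdp_bytes(rate_bps, rtt_s):
    """Bandwidth-delay product in bytes: data needed in flight to fill the path."""
    return rate_bps * rtt_s / 8


def window_limited_throughput_bps(window_bytes, rtt_s, rate_bps):
    """At most one window per RTT, and never more than the circuit rate."""
    return min(window_bytes * 8 / rtt_s, rate_bps)


RATE_BPS = 155e6   # assumed circuit rate (illustrative)
RTT_S = 0.008      # assumed round-trip time (illustrative)

print(f"BDP is roughly {bdp_bytes(RATE_BPS, RTT_S) / 1000:.0f} kB")
for window in (64 * 1024, 128 * 1024, 256 * 1024):  # hypothetical buffer sizes
    mbps = window_limited_throughput_bps(window, RTT_S, RATE_BPS) / 1e6
    print(f"window {window // 1024:3d} KiB -> about {mbps:.1f} Mbps")
```

In this model the two smaller windows leave the circuit underused, while the largest one exceeds the BDP and is capped by the circuit rate.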
TCP over Circuits: Example • Bandwidth-delay product (BDP) is 100 packets • Time to emit a standard 1500-byte packet by a GbE port is 12 us • At OC3 rate, it takes 80 us to forward each packet • Assume that at T = 0 the congestion window (cwnd) is 100 packets and TCP is in the congestion-avoidance state [Figure: Sender (GbE) to Switch A, 155 Mbps SONET circuit from Switch A to Switch B, Switch B to Receiver (GbE); RTT = 8 ms]
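The per-packet times on this slide follow from dividing the packet size by the link rate. The sketch below reproduces that arithmetic. Using the nominal 155 Mbps figure gives a value slightly under the slide's rounded 80 us, and the helper name is hypothetical.

```python
def serialization_time_us(packet_bytes, rate_bps):
    """Time to put one packet on the wire, in microseconds."""
    return packet_bytes * 8 / rate_bps * 1e6


PACKET_BYTES = 1500

gbe_us = serialization_time_us(PACKET_BYTES, 1e9)     # GbE port
oc3_us = serialization_time_us(PACKET_BYTES, 155e6)   # OC3 circuit, nominal rate

print(f"GbE: {gbe_us:.1f} us per packet")
print(f"OC3: {oc3_us:.1f} us per packet (the slide rounds this to 80 us)")

# With the rounded 80 us per packet, the circuit forwards this many packets
# during one 8 ms round trip, which is the BDP quoted on the slide.
print(f"BDP: {round(8000 / 80)} packets")
```

The gap between the two per-packet times is what lets the sender emit a burst faster than the circuit can drain it.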
TCP over Circuits: Example [Animation frames: data packets and ACKs moving between Sender, Switch A buffer, and Receiver] • T = 0: cwnd = 100 • T = 80 us: cwnd = 100 • T = 160 us: cwnd = 100 • T = 4 ms: cwnd = 100 • T = 4 ms + 80 us: cwnd = 100 • T = 4 ms + 160 us: cwnd = 100 • T = 8 ms: cwnd = 100 • T = 8 ms + 80 us: cwnd = 100.01 • T = 16 ms: cwnd = 100.99 • T = 16 ms + 80 us: cwnd = 101 • T = 16 ms + 160 us: cwnd = 101.01 • T = 24 ms: cwnd = 102
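The walkthrough above can be approximated with a small model. It assumes cwnd grows by one packet per RTT once the first ACKs return at T = RTT, and that any outstanding packets beyond the BDP are waiting in the Switch A buffer. Both function names are hypothetical, and the model tracks whole packets only, so it ignores the fractional cwnd values shown in the frames.

```python
def cwnd_at(t_ms, cwnd0=100, rtt_ms=8):
    """Approximate cwnd (whole packets) at time t_ms in congestion avoidance.

    Assumption: no ACKs arrive before one RTT has elapsed; after that,
    cwnd grows by one packet per RTT.
    """
    return cwnd0 + max(0, t_ms // rtt_ms - 1)


def queued_packets(cwnd, bdp_packets=100):
    """Packets that do not fit in the pipe and so sit in the switch buffer."""
    return max(0, cwnd - bdp_packets)


for t in (0, 8, 16, 24, 32):
    w = cwnd_at(t)
    print(f"T = {t:2d} ms: cwnd = {w}, queued at Switch A = {queued_packets(w)}")
```

Under these assumptions the backlog grows by one packet per RTT and never drains, because the sender has no congestion signal until the buffer overflows.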
Experimental Results • Zelda1 is in Atlanta, GA, and Wuneng is in Raleigh, NC • GbE interfaces of the two hosts are connected to circuit-switched gateways (SN16000) • An OC3 (155 Mbps) Layer-2 circuit is set up between the two switches • No PAUSE frames • Bandwidth-delay product is 114 packets • Per-port buffer size at each of these switches is 1 MB (about 700 packets) • TCP send and receive buffer sizes in both hosts are set to 4 MB [Figure: Zelda1 (GbE) to SN16000, 155 Mbps SONET circuit between the two SN16000 switches, SN16000 to Wuneng (GbE); RTT = 8.85 ms]
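The figures on this slide can be checked with a few lines of arithmetic, and the same numbers hint at why loss is possible in this setup. The sketch below assumes 1500-byte packets and binary megabytes (2^20 bytes), which is what makes 1 MB come out near 700 packets. It also treats the whole 4 MB TCP buffer as usable window, which overstates what a real kernel allows.

```python
PACKET_BYTES = 1500   # assumed packet size
MIB = 2 ** 20         # assumption: "MB" on the slide means 2^20 bytes


def bdp_packets(rate_bps, rtt_s, packet_bytes=PACKET_BYTES):
    """Packets needed in flight to fill the circuit for one RTT."""
    return rate_bps * rtt_s / 8 / packet_bytes


def buffer_packets(buffer_bytes, packet_bytes=PACKET_BYTES):
    """How many full-size packets fit in a buffer of the given size."""
    return buffer_bytes / packet_bytes


bdp = bdp_packets(155e6, 8.85e-3)
switch_buf = buffer_packets(1 * MIB)
tcp_buf = buffer_packets(4 * MIB)
loss_free_limit = bdp + switch_buf   # pipe plus switch buffer

print(f"BDP: about {bdp:.0f} packets")
print(f"Switch buffer: about {switch_buf:.0f} packets")
print(f"Outstanding packets the path can hold: about {loss_free_limit:.0f}")
print(f"TCP buffer: about {tcp_buf:.0f} packets")
```

Because the TCP buffers allow far more outstanding packets than the pipe and the switch buffer can hold together, the congestion window is free to grow until the switch buffer overflows.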
Experimental Results • Congestion window growth and instantaneous throughput plot for Reno TCP • 1 GB transfer [Plot annotation: Loss]
Experimental Results • Congestion window growth and instantaneous throughput plot for BIC TCP • 1 GB transfer [Plot annotation: Loss]
Outline • Motivation and Problem Statement • Related Work • Background • Types of Circuits/VCs • TCP over Circuits/VCs • Solutions • Conclusions
Solutions: Tune TCP • Tune TCP buffers to avoid losses • Tune TCP buffers to limit the growth of the number of outstanding packets • User applications are not expected to do the tuning • Solution: use the application-tracing tool ptrace • Ptrace traps system calls made from a user application • Slow start, congestion-window reset, and receive-window autotuning are still unavoidable
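For reference, the tuning itself comes down to setting the socket buffer sizes before the connection is used. The sketch below shows those calls as an application would make them directly. It does not show the ptrace-based approach from the slide, and the function name and buffer value are hypothetical.

```python
import socket


def make_tuned_socket(buffer_bytes):
    """Create a TCP socket with fixed send and receive buffer sizes.

    On Linux the kernel typically doubles the requested value for
    bookkeeping and caps it at the system-wide maximums, so the value
    read back may differ from the one requested.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, buffer_bytes)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, buffer_bytes)
    return sock


# Hypothetical value: 114 packets of 1500 bytes, roughly one BDP.
sock = make_tuned_socket(114 * 1500)
print("SO_SNDBUF:", sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF))
print("SO_RCVBUF:", sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF))
sock.close()
```

Capping the buffers this way limits the number of outstanding packets, but as the slide notes, it leaves slow start and congestion-window reset in place.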
Solutions: Circuit TCP (CTCP) • Circuit TCP (CTCP) • CTCP is a modification of TCP, in which the congestion-control software is disabled • The sender maintains a constant congestion window size, matched with the bandwidth-delay product • The receiver also advertises a fixed receive window • Constant window size avoids slow-start, receive-side autotuning and congestion-window reset • CTCP uses TCP’s window-based flow control • Packets cannot be lost due to buffer overflow
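The effect of a constant window can be seen in a toy comparison. The model below assumes the path holds at most BDP plus switch-buffer packets, that a growing window gains a fixed number of packets per RTT, and that overflow happens as soon as the window exceeds what the path can hold. It is an illustration only, not the CTCP implementation, and it skips slow start. The 114-packet and 700-packet values are taken from the experimental setup.

```python
def rtts_until_overflow(window, growth_per_rtt, bdp_packets, buffer_packets,
                        max_rtts=100_000):
    """Number of RTTs before the window exceeds pipe plus buffer capacity.

    Returns None if no overflow happens within max_rtts.
    """
    limit = bdp_packets + buffer_packets
    for rtt in range(max_rtts):
        if window > limit:
            return rtt
        window += growth_per_rtt
    return None


BDP = 114      # packets, from the experimental setup
BUFFER = 700   # packets, from the experimental setup

# Growing window: one extra packet per RTT, starting at the BDP.
print("growing window:", rtts_until_overflow(BDP, 1, BDP, BUFFER))

# Constant window matched to the BDP: it never grows.
print("fixed window:  ", rtts_until_overflow(BDP, 0, BDP, BUFFER))
```

In this model the growing window eventually overruns the switch buffer, while the fixed window never places more than one BDP of packets on the path.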
CTCP Results • Congestion window growth and instantaneous throughput plot for CTCP
Conclusions • Selected transport protocols to match the characteristics of different types of circuit-switched/VC networks • TCP is a good base transport-protocol choice for circuit-switched/VC networks • Window-based flow control solution • Untuned TCP may lead to packet loss over some types of circuits • Unmodified user applications can tune TCP buffers to avoid loss with the help of a process-tracing tool • Circuit TCP (CTCP) is a better choice, where a fixed number of packets is kept outstanding at all times • Selecting the CTCP socket requires modification to the user application • Process-tracing tools can also be used here to select the CTCP socket
Thank You Questions?
CTCP Results • Throughput values over different burst sizes • Various time gaps between bursts (0s, 100ms, 1s, 2s) • Retransmission-timeout (RTO) is 209ms
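The choice of gaps makes more sense next to the reset rule described earlier in the deck, where standard TCP resets its congestion window once a connection has been idle for more than one RTO. The sketch below applies that rule to the listed gaps. It describes what standard TCP would be expected to do, not a measured result, and the function name is hypothetical.

```python
RTO_S = 0.209                     # retransmission timeout from the slide
GAPS_S = [0.0, 0.1, 1.0, 2.0]     # time gaps between bursts from the slide


def cwnd_reset_expected(idle_s, rto_s=RTO_S):
    """Standard TCP resets cwnd when the idle time exceeds one RTO."""
    return idle_s > rto_s


for gap in GAPS_S:
    verdict = "reset" if cwnd_reset_expected(gap) else "no reset"
    print(f"gap {gap:4.1f} s: {verdict} expected for standard TCP")
```

Two of the gaps fall below the RTO and two fall above it, so the set covers both sides of the reset threshold that CTCP's constant window is meant to avoid.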
Backup • Contributions • CTCP code • Documented CTCP v1.0 code • Developed API for CTCP v1.0 • Iperf • Modified Iperf code to use the CTCP API • Amanda (Advanced Maryland Network Disk Archiver) • Installed Amanda and wrote a user-friendly installation guide • Ptrace (Process Trace) • Developed software that uses ptrace to trap system calls and modify their behavior • CTCP experiments