CSS432 End-to-End Protocols
Textbook Ch5.1 – 5.2
Prof. Athirai Irissappane
http://courses.washington.edu/css432/athirai/
athirai@uw.edu
Outline
• Transport Layer Protocols
  • Communication between applications running in the end nodes, hence "end-to-end" protocols
  • How to convert a host-to-host packet delivery service into a process-to-process communication channel
• Simple Demultiplexer (UDP)
• Reliable Byte Stream (TCP)
End-to-End Protocols
• Common properties that a transport protocol can be expected to provide
  • Guarantees message delivery
  • Delivers messages in the same order they were sent
  • Delivers at most one copy of each message
  • Supports arbitrarily large messages
  • Supports synchronization between the sender and the receiver
  • Allows the receiver to apply flow control to the sender
  • Supports multiple application processes on each host
End-to-End Protocols
• Typical limitations of the network on which a transport protocol operates
  • Drop messages
  • Reorder messages
  • Deliver duplicate copies of a given message
  • Limit messages to some finite size
  • Deliver messages after an arbitrarily long delay
• Challenge for transport protocols
  • Develop algorithms that turn the above limitations into the high level of service required by the application
[Figure: messages M1 to M5 are dropped, duplicated, and reordered as they cross the network; the receiver requests a retransmission of M3 and M5]
Simple Demultiplexor (UDP)
• Extends the host-to-host delivery service of the underlying network into a process-to-process communication service
• Adds a level of demultiplexing, which allows multiple application processes on each host to share the network
• Unreliable and unordered datagram service
• No flow control: nothing prevents senders from overrunning the capacity of the receivers (messages are discarded if the receiving buffers are full)
Simple Demultiplexor (UDP)
• Each process is identified by a port id at the sender and at the receiver
• Endpoints identified by ports
  • Servers have well-known ports
  • See the /etc/services file on Unix
• Header format
  • Four 16-bit fields (SrcPort, DstPort, Length, Checksum), followed by Data
  • Optional checksum
    • Calculated over pseudoheader + UDP header + data
    • Pseudoheader: three fields from the IP header (protocol number, source IP address, destination IP address) plus the UDP length field
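The checksum rule on this slide can be sketched in C as below. This is a minimal illustration for IPv4 only, not code from the course; the names `sum16` and `udp_checksum` are made up for this example, and the caller is assumed to pass the whole datagram with its Checksum field set to 0.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Add a byte range to a running sum as big-endian 16-bit words.
   An odd trailing byte is padded on the right with a zero byte. */
static uint32_t sum16(uint32_t sum, const uint8_t *p, size_t len)
{
    while (len > 1) {
        sum += (uint32_t)((p[0] << 8) | p[1]);
        p += 2;
        len -= 2;
    }
    if (len == 1)
        sum += (uint32_t)(p[0] << 8);
    return sum;
}

/* Checksum over pseudoheader + UDP header + data (IPv4).
   src, dst: IP addresses as 4 bytes in network order.
   udp:      the UDP datagram (header + data), Checksum field set to 0.
   udp_len:  value of the UDP Length field (header + data, in bytes). */
uint16_t udp_checksum(const uint8_t src[4], const uint8_t dst[4],
                      const uint8_t *udp, uint16_t udp_len)
{
    uint8_t pseudo[12];
    for (int i = 0; i < 4; i++) {
        pseudo[i] = src[i];
        pseudo[4 + i] = dst[i];
    }
    pseudo[8] = 0;                        /* zero padding           */
    pseudo[9] = 17;                       /* protocol number of UDP */
    pseudo[10] = (uint8_t)(udp_len >> 8);
    pseudo[11] = (uint8_t)(udp_len & 0xFF);

    uint32_t sum = sum16(0, pseudo, sizeof pseudo);
    sum = sum16(sum, udp, udp_len);
    while (sum >> 16)                     /* fold carries back in   */
        sum = (sum & 0xFFFF) + (sum >> 16);

    uint16_t c = (uint16_t)~sum;
    return c == 0 ? 0xFFFF : c;           /* 0 is reserved for "no checksum" */
}
```

A receiver can run the same function over the datagram as received, checksum included; a result of 0xFFFF then means the datagram passed the check.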
UDP: Simple Demultiplexer
[Figure: one sender calls socket() and sendto(); two receiving processes each call socket(), bind() (ports 12345 and 13579), and recv(); UDP demultiplexes the arriving packets M1 to M5 by port]

Sender:
    int sd;
    sd = socket( AF_INET, SOCK_DGRAM, 0 );
    struct sockaddr_in server;
    struct hostent *hp;
    server.sin_family = AF_INET;
    hp = gethostbyname( serverName );   // serverName: placeholder, the slide leaves the host name out
    bcopy( hp->h_addr, &( server.sin_addr.s_addr ), hp->h_length );
    server.sin_port = htons( 12345 );   // destination port
    sendto( sd, buf, sizeof( buf ), 0, (struct sockaddr *)&server, sizeof( server ) );

Receiver:
    int sd;
    sd = socket( AF_INET, SOCK_DGRAM, 0 );
    struct sockaddr_in server;
    server.sin_family = AF_INET;
    server.sin_addr.s_addr = htonl( INADDR_ANY );   // accept datagrams on any local interface
    server.sin_port = htons( 12345 );
    bind( sd, (struct sockaddr *)&server, sizeof( server ) );
    recv( sd, buf, sizeof( buf ), 0 );
End-to-End Protocols
• Common end-to-end services
  • Guarantee message delivery
  • Deliver messages in FIFO order
  • Deliver at most one copy of each message
  • Support arbitrarily large messages
  • Support synchronization
  • Allow the receiver to flow control the sender
  • Support multiple application processes on each host
[Figure: application processes P1 to P4 exchange messages M1 to M5 across the network]
TCP Overview
• Connection oriented
• Guarantees that all data sent reaches the destination in the correct order
  • Uses ACK packets, retransmission, and timeouts
• Full duplex: bidirectional, send and receive at each end
• Flow control: keep the sender from overrunning the receiver
• Congestion control: keep the sender from overrunning the network
TCP (Reliable Byte Stream)
• Byte-oriented protocol: the sender writes bytes into a TCP connection and the receiver reads bytes out of the TCP connection
• Byte stream, but TCP does not transmit individual bytes
  • TCP on the source host buffers enough bytes from the sending process to fill a reasonably sized packet (segment) and sends it to the destination
  • TCP on the destination host empties the contents of the segment into a receive buffer, and the receiving process reads from this buffer at its leisure
[Figure: the sending application process writes bytes into TCP's send buffer; TCP transmits segments; the receiving TCP places them in its receive buffer, from which the receiving application process reads bytes]
Sockets (Code Example)
[Figure: the server calls socket(), bind(), listen(), accept(), read(); each client calls socket(), connect(), write() to send buf1 and buf2]

Server:
    int sd, newSd;
    sd = socket( AF_INET, SOCK_STREAM, 0 );
    sockaddr_in server;
    bzero( (char *)&server, sizeof( server ) );
    server.sin_family = AF_INET;
    server.sin_addr.s_addr = htonl( INADDR_ANY );
    server.sin_port = htons( 12345 );
    bind( sd, (sockaddr *)&server, sizeof( server ) );
    listen( sd, 5 );
    sockaddr_in client;
    socklen_t len = sizeof( client );
    while ( true ) {
        newSd = accept( sd, (sockaddr *)&client, &len );
        if ( fork( ) == 0 ) {            // child: serve this connection
            close( sd );
            read( newSd, buf1, sizeof( buf1 ) );
            read( newSd, buf2, sizeof( buf2 ) );
            close( newSd );
            exit( 0 );
        }
        close( newSd );                  // parent: go back to accept( )
    }

Client:
    int sd = socket( AF_INET, SOCK_STREAM, 0 );
    struct hostent *host = gethostbyname( arg[0] );   // arg[0]: the server's host name
    sockaddr_in server;
    bzero( (char *)&server, sizeof( server ) );
    server.sin_family = AF_INET;
    server.sin_addr.s_addr = inet_addr( inet_ntoa( *(struct in_addr*)*host->h_addr_list ) );
    server.sin_port = htons( 12345 );
    connect( sd, (sockaddr *)&server, sizeof( server ) );
    write( sd, buf1, sizeof( buf1 ) );
    write( sd, buf2, sizeof( buf2 ) );
Data Link Versus Transport
• The data link layer transfers data between two adjacent nodes over a single point-to-point physical link, whereas the transport layer provides communication between processes running on different hosts
  • TCP needs explicit connection establishment and termination
• A single physical point-to-point link has a fixed RTT (Round Trip Time); TCP connections have potentially different RTTs because they connect hosts anywhere on the Internet
  • e.g., a TCP connection between nodes in the same room versus one across the network
  • TCP needs an adaptive timeout mechanism for retransmissions
• On a point-to-point link, packets are received in FIFO order; in TCP they can be reordered as they cross the Internet (long delays in the network, retransmissions, etc.)
  • TCP needs to be prepared for the arrival of very old packets
  • Packets slightly out of order can be put back in order using the SeqNum of the sliding window protocol
  • How late can a packet be? TCP needs to set an MSL (Maximum Segment Lifetime)
Data Link Versus Transport
• Hosts connected to a point-to-point link are engineered to support that link, so the hosts at both ends have similar resources
  • If window size = bandwidth × RTT, both sender and receiver are likely to have a buffer of that size
  • For a TCP connection, the resources dedicated to the connection (buffer space, etc.) can vary, especially if one of the hosts supports multiple TCP connections
  • TCP needs to accommodate different node capacities (flow control)
• Potentially different network capacity
  • On a directly connected point-to-point link, the bandwidth is known and the transmitter cannot send faster than that bandwidth, so it cannot congest the network
  • In TCP, the links that will be traversed are not known beforehand, and multiple sources can send across the same link
  • TCP needs to be prepared for network congestion
Segment Format (TCP Header)
[Figure: the sender sends Data (SequenceNum) to the receiver; the receiver returns Acknowledgment + AdvertisedWindow]
• Each TCP connection is identified by a 4-tuple
  • (SrcPort, SrcIPAddr, DstPort, DstIPAddr)
• Sliding window + flow control
  • Acknowledgment, SequenceNum, AdvertisedWindow
• Flags
  • SYN, FIN, RESET, PUSH, URG, ACK
  • SYN (Synchronize): establishing a connection
  • FIN (Finish): terminating a connection
  • RESET: confused and terminating
  • PUSH: Section 5.2.7
  • URG: sending urgent data
  • ACK: the Acknowledgment field is valid
• SYN and FIN each consume one sequence number, just as a data byte does; a segment that carries only an ACK does not advance SequenceNum
Segment Format (TCP Header)
• The SrcPort and DstPort fields identify the source and destination ports, respectively.
• The Acknowledgment, SequenceNum, and AdvertisedWindow fields are all involved in TCP's sliding window algorithm.
• Because TCP is a byte-oriented protocol, each byte of data has a sequence number; the SequenceNum field contains the sequence number for the first byte of data carried in that segment.
• The Acknowledgment and AdvertisedWindow fields carry information about the flow of data going in the other direction.
Segment Format (TCP Header)
• The 6-bit Flags field is used to relay control information between TCP peers.
• The possible flags include SYN, FIN, RESET, PUSH, URG, and ACK.
• The SYN and FIN flags are used when establishing and terminating a TCP connection, respectively.
• The ACK flag is set any time the Acknowledgment field is valid, implying that the receiver should pay attention to it.
• The URG flag signifies that this segment contains urgent data. When this flag is set, the UrgPtr field indicates where the nonurgent data contained in this segment begins.
• The urgent data is contained at the front of the segment body, up to and including a value of UrgPtr bytes into the segment.
• The PUSH flag lets the sender tell TCP to flush (send) all the bytes it has collected to its peer, and to notify the receiving side that it did so.
• The RESET flag signifies that the receiver has become confused (it received a segment it did not expect to receive) and so wants to abort the connection.
• Finally, the Checksum field is used in exactly the same way as for UDP: it is computed over the TCP header, the TCP data, and the pseudoheader, which is made up of the source address, destination address, and length fields from the IP header.
TCP Connection Establishment and Termination
• Three-Way Handshake
  • Client (active participant)
    • Initiates a connection to the server by sending a SYN segment with seq = x
    • Sets a timer and retransmits the request when it expires
  • Server (passive participant)
    • Acknowledges the client's request with ack = x + 1
    • Initiates the reverse connection with its own starting sequence number, seq = y
    • Sets a timer and retransmits the request when it expires
  • Client
    • Acknowledges the server's request with ack = y + 1 (the next sequence number expected)
• x and y are chosen at random
  • Otherwise a segment from an earlier incarnation of the same connection could interfere with a later incarnation of the connection
[Figure: (1) client to server: SYN, SequenceNum = x; (2) server to client: SYN + ACK, SequenceNum = y, Acknowledgment = x + 1; (3) client to server: ACK, Acknowledgment = y + 1]
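The numbering in the three-way handshake can be traced with a small C sketch. It only builds the three segments and does no networking; the `segment` struct and the `make_*` helpers are hypothetical names for this illustration, and the flag values are the standard TCP header bit positions.

```c
#include <assert.h>
#include <stdint.h>

enum { FLAG_SYN = 0x02, FLAG_ACK = 0x10 };   /* TCP header flag bits */

struct segment {
    uint8_t  flags;
    uint32_t seq;   /* SequenceNum    */
    uint32_t ack;   /* Acknowledgment */
};

/* Step 1, client: SYN with its randomly chosen starting number x. */
struct segment make_syn(uint32_t x)
{
    struct segment s = { FLAG_SYN, x, 0 };
    return s;
}

/* Step 2, server: acknowledge x + 1 and announce its own start y. */
struct segment make_syn_ack(struct segment syn, uint32_t y)
{
    struct segment s = { FLAG_SYN | FLAG_ACK, y, syn.seq + 1 };
    return s;
}

/* Step 3, client: acknowledge y + 1. Its own SYN consumed x,
   so the client's next sequence number is x + 1. */
struct segment make_ack(struct segment syn_ack)
{
    struct segment s = { FLAG_ACK, syn_ack.ack, syn_ack.seq + 1 };
    return s;
}
```

Because the fields are unsigned 32-bit values, x + 1 wraps to 0 when x is 2^32 − 1, which matches how sequence numbers wrap around.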
State Transition Diagram
[Figure: TCP state-transition diagram with states CLOSED, LISTEN, SYN_RCVD, SYN_SENT, ESTABLISHED, FIN_WAIT_1, FIN_WAIT_2, CLOSING, TIME_WAIT, CLOSE_WAIT, LAST_ACK; arcs are labeled event/action]
• Open
  • Active open
    • Client
    • connect( )
  • Passive open
    • Server
    • listen( )
• Close
  • Active close
    • Client or server
    • The first close( )
    • Both sides can be active
  • Passive close
    • Client or server
    • close( ) in response to the first close( )
State Transition Diagram
• Shows the states involved in opening and closing a TCP connection
  • Everything in between is hidden (ESTABLISHED)
• Each box represents the state of one end of the TCP connection
• All connections start in the CLOSED state
• As the connection progresses, it moves from state to state
• Each arc is labeled event/action: when the event happens in a given state, TCP takes the action and moves to the next state
  • Two kinds of events: (1) a segment arrives from the peer, (2) a local operation on TCP
• One or both sides can close the connection. If only one side closes, it can no longer send data but can still receive it
State Transition Diagram
[Figure: TCP state-transition diagram]
• Opening a connection:
  • CLOSED to LISTEN: the server invokes a passive open on TCP and waits for a connection request
  • CLOSED to SYN_SENT: the client invokes an active open, moves to the SYN_SENT state, and a SYN segment is sent to the server
  • LISTEN to SYN_RCVD: the server receives the SYN segment from the client, moves to the SYN_RCVD state, and sends SYN+ACK to the client
  • SYN_SENT to ESTABLISHED: the client receives the SYN+ACK from the server, moves to the ESTABLISHED state, and sends an ACK back to the server
  • SYN_RCVD to ESTABLISHED: the server receives the ACK from the client and moves to ESTABLISHED
State Transition Diagram
[Figure: TCP state-transition diagram]
• Three ways to close
  • This side can close the connection first
  • The other side can close the connection first
  • Both sides close the connection at the same time
• Closing a connection (here the server closes first):
  • ESTABLISHED to FIN_WAIT_1: the server sends the termination request FIN and waits for an ACK from the client
  • ESTABLISHED to CLOSE_WAIT: the client receives the FIN, sends an ACK to the server, and waits for its own local close
  • CLOSE_WAIT to LAST_ACK: the client sends its own FIN to the server and waits for an ACK
  • FIN_WAIT_1 to FIN_WAIT_2: the server receives the ACK from the client and waits for the client's FIN
  • FIN_WAIT_2 to TIME_WAIT: the server receives the FIN from the client, sends an ACK, and waits long enough for the client to receive that ACK
  • LAST_ACK to CLOSED: the client receives the ACK from the server and moves to CLOSED
  • TIME_WAIT to CLOSED: the server waits 2 × MSL and moves to CLOSED
State Transition Diagram
[Figure: TCP state-transition diagram]
• Closing a connection (both sides close at the same time):
  • ESTABLISHED to FIN_WAIT_1: server and client both send termination requests (FIN)
  • FIN_WAIT_1 to CLOSING: each side receives the other's FIN, sends an ACK, and waits for the ACK of its own FIN
  • CLOSING to TIME_WAIT: each side receives the ACK for the FIN it sent and waits long enough for the peer to receive the ACK it sent
  • TIME_WAIT to CLOSED: each peer waits 2 × MSL and moves to CLOSED
  • FIN_WAIT_1 to TIME_WAIT: a side receives, in a single segment, both the peer's FIN and the ACK for the FIN it sent
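The arcs of the state-transition diagram can be written down as a transition function. The C sketch below records only the next state, with the action noted in comments; the enum and function names are invented for this example, and leaving the state unchanged on an event that has no arc is an assumption of the sketch, not something the slides specify.

```c
#include <assert.h>

enum tcp_state { CLOSED, LISTEN, SYN_RCVD, SYN_SENT, ESTABLISHED,
                 FIN_WAIT_1, FIN_WAIT_2, CLOSING, TIME_WAIT,
                 CLOSE_WAIT, LAST_ACK };

enum tcp_event { EV_ACTIVE_OPEN, EV_PASSIVE_OPEN, EV_SEND, EV_CLOSE,
                 EV_RCV_SYN, EV_RCV_SYN_ACK, EV_RCV_ACK, EV_RCV_FIN,
                 EV_RCV_ACK_FIN, EV_TIMEOUT_2MSL };

/* Return the state reached from s when event e occurs. */
enum tcp_state tcp_next(enum tcp_state s, enum tcp_event e)
{
    switch (s) {
    case CLOSED:
        if (e == EV_ACTIVE_OPEN)  return SYN_SENT;     /* send SYN     */
        if (e == EV_PASSIVE_OPEN) return LISTEN;
        break;
    case LISTEN:
        if (e == EV_RCV_SYN) return SYN_RCVD;          /* send SYN+ACK */
        if (e == EV_SEND)    return SYN_SENT;          /* send SYN     */
        if (e == EV_CLOSE)   return CLOSED;
        break;
    case SYN_SENT:
        if (e == EV_RCV_SYN)     return SYN_RCVD;      /* send SYN+ACK */
        if (e == EV_RCV_SYN_ACK) return ESTABLISHED;   /* send ACK     */
        if (e == EV_CLOSE)       return CLOSED;
        break;
    case SYN_RCVD:
        if (e == EV_RCV_ACK) return ESTABLISHED;
        if (e == EV_CLOSE)   return FIN_WAIT_1;        /* send FIN     */
        break;
    case ESTABLISHED:
        if (e == EV_CLOSE)   return FIN_WAIT_1;        /* send FIN     */
        if (e == EV_RCV_FIN) return CLOSE_WAIT;        /* send ACK     */
        break;
    case FIN_WAIT_1:
        if (e == EV_RCV_ACK)     return FIN_WAIT_2;
        if (e == EV_RCV_FIN)     return CLOSING;       /* send ACK     */
        if (e == EV_RCV_ACK_FIN) return TIME_WAIT;     /* send ACK     */
        break;
    case FIN_WAIT_2:
        if (e == EV_RCV_FIN) return TIME_WAIT;         /* send ACK     */
        break;
    case CLOSING:
        if (e == EV_RCV_ACK) return TIME_WAIT;
        break;
    case TIME_WAIT:
        if (e == EV_TIMEOUT_2MSL) return CLOSED;
        break;
    case CLOSE_WAIT:
        if (e == EV_CLOSE) return LAST_ACK;            /* send FIN     */
        break;
    case LAST_ACK:
        if (e == EV_RCV_ACK) return CLOSED;
        break;
    }
    return s;   /* assumption: an event with no arc leaves the state as is */
}
```

Feeding the events of an active close (close, ACK received, FIN received, 2 × MSL timeout) into `tcp_next` starting from ESTABLISHED walks through FIN_WAIT_1, FIN_WAIT_2, and TIME_WAIT back to CLOSED.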
State Transition Diagram
• Under what condition can the state move directly from FIN_WAIT_1 to TIME_WAIT?
• What is the purpose of the TIME_WAIT state?
  • TCP is given a chance to resend the final ACK
  • The client sends its FIN; the server receives the FIN
  • The ACK sent by the server can be delayed, so the client times out (up to 1 MSL)
  • The client resends its FIN, which can also be delayed (up to 1 MSL)
  • Without TIME_WAIT, a new TCP connection could receive the delayed FIN and be closed by it
Timing Chart
• Establishment
  • Server: listen( ), state LISTEN
  • Client: connect( ), state SYN_SENT; sends SYN, seq = x
  • Server: state SYN_RCVD; sends SYN, seq = y, ACK = x + 1
  • Client: state ESTABLISHED; sends ACK = y + 1
  • Server: state ESTABLISHED
• Data transfer
  • Client: write( ); sends seq = x + 1, ACK = y + 1
  • Server: read( ); sends ACK = x + 2
• Termination
  • Client: close( ), state FIN_WAIT_1; sends FIN, seq = x + 2, ACK = y + 1
  • Server: state CLOSE_WAIT; sends ACK = x + 3
  • Client: state FIN_WAIT_2
  • Server: close( ), state LAST_ACK; sends FIN, seq = y + 1
  • Client: state TIME_WAIT; sends ACK = y + 2
• Peek at such a flow with tcpdump in assignment 3.
Sliding Window Revisited
• TCP uses its own variant of the sliding window algorithm, which serves several purposes:
  • (1) it guarantees the reliable delivery of data,
  • (2) it ensures that data is delivered in order, and
  • (3) it enforces flow control between the sender and the receiver
Sliding Window (Reliable & Ordered Delivery)
[Figure: the sending TCP's buffer with pointers LastByteAcked, LastByteSent, LastByteWritten; the receiving TCP's buffer with pointers LastByteRead, NextByteExpected, LastByteRcvd]
• Sending side
  • LastByteAcked ≤ LastByteSent: the receiver cannot acknowledge a byte that has not been sent
  • LastByteSent ≤ LastByteWritten: TCP cannot send a byte that has not been written to the send buffer
  • Buffer the bytes between LastByteAcked and LastByteWritten
• Receiving side
  • LastByteRead < NextByteExpected: a byte cannot be read by the application until it has been received
  • NextByteExpected ≤ LastByteRcvd + 1: NextByteExpected points to the start of the first gap when data arrives out of order
  • Buffer the bytes between LastByteRead and LastByteRcvd
Sliding Window (Reliable & Ordered Delivery)
• Receiving side
  • LastByteRead: last byte that has been received and that the application has already read from the TCP buffer
  • NextByteExpected: the byte that has not been received yet and is expected next
  • LastByteRcvd: last byte that has been received and is held in the receiver's TCP buffer
  • LastByteRead < NextByteExpected: bytes that are still expected cannot be read, because they have not yet reached the receiver
  • If data is received in order, the next expected byte is the one right after the last byte received, so NextByteExpected = LastByteRcvd + 1
  • If data has arrived out of order, NextByteExpected points to the first gap in the data, so NextByteExpected ≤ LastByteRcvd + 1
  • Buffer the bytes between LastByteRead and LastByteRcvd
Flow Control
• Keep the sender from overrunning the receiver
• MaxSendBuffer and MaxRcvBuffer are the buffer sizes at the sender and the receiver
• The window is the amount of data that can be sent without waiting for an ACK
• The receiver advertises to the sender a window ≤ MaxRcvBuffer
• To avoid overflowing the receive buffer, TCP on the receiver side must keep LastByteRcvd − LastByteRead ≤ MaxRcvBuffer
• AdvertisedWindow of the receiver: the amount of free space in the receive buffer = MaxRcvBuffer − ((NextByteExpected − 1) − LastByteRead)
• If the rate of reading = the rate of receiving, AdvertisedWindow = MaxRcvBuffer
• If reading is slower, LastByteRcvd increases and AdvertisedWindow shrinks, eventually to 0
Flow Control
• The sender must adhere to the receiver's advertised window
  • LastByteSent − LastByteAcked ≤ AdvertisedWindow
• The sender computes an effective window that limits how much data it can send to the receiver
  • EffectiveWindow = AdvertisedWindow − (LastByteSent − LastByteAcked)
• The local application process must not overflow the send buffer
  • LastByteWritten − LastByteAcked ≤ MaxSendBuffer
  • If the sending process tries to write y bytes to TCP, but (LastByteWritten − LastByteAcked) + y > MaxSendBuffer, then TCP blocks the sending process and does not allow it to generate more data
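The window formulas above translate almost directly into code. The following C sketch is illustrative only: the struct and function names are made up, the pointers are plain byte counters, and sequence-number wraparound is ignored for simplicity.

```c
#include <assert.h>
#include <stdint.h>

struct receiver {
    uint32_t last_byte_read;
    uint32_t next_byte_expected;
    uint32_t max_rcv_buffer;
};

struct sender {
    uint32_t last_byte_acked;
    uint32_t last_byte_sent;
    uint32_t last_byte_written;
    uint32_t max_send_buffer;
};

/* AdvertisedWindow = MaxRcvBuffer - ((NextByteExpected - 1) - LastByteRead) */
uint32_t advertised_window(const struct receiver *r)
{
    uint32_t buffered = (r->next_byte_expected - 1) - r->last_byte_read;
    return r->max_rcv_buffer - buffered;
}

/* EffectiveWindow = AdvertisedWindow - (LastByteSent - LastByteAcked),
   clamped at 0 when more data is in flight than the window allows. */
uint32_t effective_window(const struct sender *s, uint32_t advertised)
{
    uint32_t in_flight = s->last_byte_sent - s->last_byte_acked;
    return in_flight >= advertised ? 0 : advertised - in_flight;
}

/* Nonzero if writing y more bytes would overflow the send buffer,
   i.e. (LastByteWritten - LastByteAcked) + y > MaxSendBuffer. */
int write_would_block(const struct sender *s, uint32_t y)
{
    return (s->last_byte_written - s->last_byte_acked) + y
           > s->max_send_buffer;
}
```

For example, a receiver with a 4096-byte buffer that has read up to byte 1000 and expects byte 3001 next is holding 2000 bytes, so it advertises 2096.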
Flow Control
[Figure: the sending TCP's buffer (LastByteAcked, LastByteSent, LastByteWritten, and y new bytes from the application) and the receiving TCP's buffer (LastByteRead, NextByteExpected, LastByteRcvd)]
• Receiver
  • LastByteRcvd − LastByteRead ≤ MaxRcvBuffer
  • AdvertisedWindow = MaxRcvBuffer − ((NextByteExpected − 1) − LastByteRead)
  • Sends an ACK carrying the advertised window in response to arriving data segments
    • as long as all the preceding bytes have also arrived, and
    • until the advertised window reaches 0 (an ACK is still returned the first time it reaches 0)
• Sender
  • LastByteSent − LastByteAcked ≤ AdvertisedWindow
  • EffectiveWindow = AdvertisedWindow − (LastByteSent − LastByteAcked)
  • LastByteWritten − LastByteAcked ≤ MaxSendBuffer
  • Block the sending process if (LastByteWritten − LastByteAcked) + y > MaxSendBuffer
Flow Control with a Slower Receiver
[Figure: the receiving application reads slowly, so the receive buffer fills up while the sending application keeps writing y bytes at a time]
• Receiver
  • LastByteRcvd − LastByteRead ≤ MaxRcvBuffer
  • AdvertisedWindow = MaxRcvBuffer − ((NextByteExpected − 1) − LastByteRead) shrinks to 0
• Sender
  • LastByteSent − LastByteAcked ≤ AdvertisedWindow
  • EffectiveWindow = AdvertisedWindow − (LastByteSent − LastByteAcked) becomes 0
  • No more data is sent and no more ACKs arrive, so the window stays at the same value
  • LastByteWritten − LastByteAcked ≤ MaxSendBuffer
  • The sending process is blocked, since (LastByteWritten − LastByteAcked) + y > MaxSendBuffer
Flow Control
• Once the advertised window is 0, the sender will not send any more data.
• The receiver does not send a new advertised window on its own initiative.
• Then how can the sender find out when the receiver can receive more data?
Protection Against Wrap Around
• 32-bit SequenceNum
  • 2^32 numbers: 0 to 2^32 − 1
  • After 2^32 − 1? Start again from 0 (wrap around)
• MSL (Maximum Segment Lifetime) = 120 sec must be less than the wraparound time (the time taken to use up all sequence numbers)

Bandwidth             Time Until Wrap Around
T1 (1.5 Mbps)         6.4 hours
Ethernet (10 Mbps)    57 minutes
T3 (45 Mbps)          13 minutes
FDDI (100 Mbps)       6 minutes
STS-3 (155 Mbps)      4 minutes
STS-12 (622 Mbps)     55 seconds
STS-24 (1.2 Gbps)     28 seconds
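The wraparound times in the table follow from dividing the 2^32-byte sequence space by the link rate in bytes per second. A short C sketch of that arithmetic (function names are hypothetical, and the link is assumed to send at its full rate continuously):

```c
#include <assert.h>

/* Seconds until a 32-bit sequence space (2^32 bytes) is used up
   on a link that sends continuously at bandwidth_bps bits/second. */
double wrap_seconds(double bandwidth_bps)
{
    assert(bandwidth_bps > 0.0);
    return 4294967296.0 * 8.0 / bandwidth_bps;
}

/* Nonzero if the sequence numbers can wrap within one MSL, which is
   the situation the 32-bit SequenceNum field does not protect against. */
int wraps_within_msl(double bandwidth_bps, double msl_seconds)
{
    return wrap_seconds(bandwidth_bps) < msl_seconds;
}
```

With MSL = 120 seconds, `wraps_within_msl` is false for the rates up to STS-3 and true for STS-12 and STS-24, which is where the table drops below the MSL.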
Keeping the Pipe Full
• To use the full bandwidth, the sender must be able to transmit RTT × bandwidth worth of data before waiting for an ACK
• The sender's transmission is restricted by the receiver's advertised window
• The advertised window should therefore be large enough to accommodate RTT × bandwidth worth of data
• But the 16-bit AdvertisedWindow field allows at most 64 KB (2^16 bytes)

Bandwidth             RTT × Bandwidth (RTT = 100 msec)
T1 (1.5 Mbps)         18 KB
Ethernet (10 Mbps)    122 KB
T3 (45 Mbps)          549 KB
FDDI (100 Mbps)       1.2 MB
STS-3 (155 Mbps)      1.8 MB
STS-12 (622 Mbps)     7.4 MB
STS-24 (1.2 Gbps)     14.8 MB
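The delay × bandwidth figures can be checked with a few lines of C. This is a sketch with made-up function names; it treats 65535 bytes as the largest value a 16-bit AdvertisedWindow field can carry.

```c
#include <assert.h>

/* Bytes that must be in flight to keep a link of bandwidth_bps
   bits/second busy for one round-trip time of rtt_seconds. */
double delay_bandwidth_bytes(double bandwidth_bps, double rtt_seconds)
{
    assert(bandwidth_bps > 0.0 && rtt_seconds > 0.0);
    return bandwidth_bps * rtt_seconds / 8.0;
}

/* Nonzero if a 16-bit AdvertisedWindow (at most 65535 bytes)
   is large enough to keep such a pipe full. */
int fits_in_16bit_window(double bandwidth_bps, double rtt_seconds)
{
    return delay_bandwidth_bytes(bandwidth_bps, rtt_seconds) <= 65535.0;
}
```

At a 100 ms RTT only the T1 row (about 18 KB) fits in the 16-bit window; from 10 Mbps Ethernet upward the product exceeds 64 KB.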
Segment Transmission
A segment is transmitted:
• When the data to send reaches the Maximum Segment Size (MSS), which is usually chosen so that a segment fits in the Maximum Transmission Unit (MTU) of the directly connected network (the MTU minus the IP and TCP headers)
• When TCP receives a push operation, which flushes the unsent data: data is sent as it is written instead of waiting for a segment to fill (peek with tcpdump in programming assignment 3)
• When a timer fires
Silly Window Syndrome
• If you think of a TCP stream as a conveyor belt with "full" containers (data segments) going in one direction and empty containers (ACKs) going in the reverse direction, then MSS-sized segments correspond to large containers and 1-byte segments correspond to very small containers.
• If the sender aggressively fills an empty container as soon as it arrives, then any small container introduced into the system remains in the system indefinitely.
• That is, it is immediately filled and emptied at each end, and never coalesced with adjacent containers to create larger containers.
Silly Window Syndrome
[Figure: sender and receiver exchanging MSS-sized segments and one small segment, with the advertised window returned to the sender]
• If the sender aggressively takes advantage of any available window, and the receiver empties every window regardless of its size, small windows never disappear
• The problem occurs only when the sender transmits a small segment or the receiver opens the window by a small amount
• The receiver can delay ACKs so that it can advertise a larger window
  • How long should it wait?
• The sender should make the decision
  • Nagle's Algorithm (programming assignment 3)
Nagle's Algorithm
• If there is data to send but the window is open less than MSS, then we may want to wait some amount of time before sending the available data
• But how long?
  • If we wait too long, we hurt interactive applications like Telnet
  • If we don't wait long enough, we risk sending a bunch of tiny packets and falling into the silly window syndrome
• The solution is to introduce a timer and to transmit when the timer expires
Nagle's Algorithm
• We could use a clock-based timer, for example one that fires every 100 ms
• Nagle introduced an elegant self-clocking solution
• Key idea
  • As long as TCP has any data in flight, the sender will eventually receive an ACK
  • This ACK can be treated like a timer firing, triggering the transmission of more data
Nagle's Algorithm
    When the application produces data to send
        if both the available data and the window ≥ MSS
            send a full segment
        else
            if there is unACKed data at the sender    // ACK not yet received for previous data sent
                buffer the new data until an ACK arrives
            else                                      // no unACKed data
                send all the new data now
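The decision in the pseudocode can be expressed as a small C function. This sketch covers only the choice made when the application produces data, not the buffering or the sending itself; `nagle_decide`, its parameters, and the enum names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

enum nagle_action {
    SEND_FULL_SEGMENT,   /* enough data and window for one MSS   */
    BUFFER_UNTIL_ACK,    /* data in flight: wait for its ACK     */
    SEND_ALL_NOW         /* nothing in flight: send small data   */
};

/* available: bytes the application has handed to TCP and not yet sent
   window:    bytes the window currently allows
   mss:       maximum segment size
   unacked:   bytes sent but not yet acknowledged */
enum nagle_action nagle_decide(uint32_t available, uint32_t window,
                               uint32_t mss, uint32_t unacked)
{
    if (available >= mss && window >= mss)
        return SEND_FULL_SEGMENT;
    if (unacked > 0)
        return BUFFER_UNTIL_ACK;
    return SEND_ALL_NOW;
}
```

An interactive application that writes one byte at a time therefore gets its first byte sent at once, while later bytes are collected until the ACK for that first byte returns.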
Nagle's Algorithm
• An ACK works as a timer that fires a new segment transmission
• The algorithm intentionally delays packets
• Time-sensitive or real-time applications cannot afford such a delay
• TCP_NODELAY option in the socket interface: transmit data as soon as possible
  • setsockopt(sockfd, SOL_TCP, TCP_NODELAY, &intFlag, sizeof(intFlag))
Adaptive Retransmission
• TCP retransmits a segment if its ACK is not received within a timeout
• The timeout is determined from the RTT
• The RTT differs between different pairs of hosts in the Internet
• How to choose the timeout?
  • Adaptive retransmission
Adaptive Retransmission
Original Algorithm (keep a running average of the RTT)
• Measure SampleRTT for each segment/ACK pair
  • Record the time when you start sending
  • Record the time when you receive the ACK
  • Take the difference
• Compute a weighted average of the RTT
  • EstRTT = a × EstRTT + b × SampleRTT
  • where a + b = 1
  • a between 0.8 and 0.9
  • b between 0.1 and 0.2
• Set the timeout based on EstRTT
  • TimeOut = 2 × EstRTT
  • Why double? EstRTT cannot respond quickly to a SampleRTT that deviates from the average.
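The running average can be sketched in C as follows. The struct and function names are invented for this example, b is taken as 1 − a, and times are in arbitrary units such as milliseconds.

```c
#include <assert.h>

struct rtt_estimator {
    double est_rtt;   /* EstRTT, running weighted average      */
    double a;         /* weight of the old estimate, 0.8 - 0.9 */
};

/* EstRTT = a * EstRTT + b * SampleRTT, with b = 1 - a. */
void rtt_update(struct rtt_estimator *e, double sample_rtt)
{
    assert(e->a > 0.0 && e->a < 1.0);
    e->est_rtt = e->a * e->est_rtt + (1.0 - e->a) * sample_rtt;
}

/* TimeOut = 2 * EstRTT. */
double rtt_timeout(const struct rtt_estimator *e)
{
    return 2.0 * e->est_rtt;
}
```

With a = 0.875, an estimate of 100 ms, and a new sample of 200 ms, the estimate moves only to 112.5 ms, which illustrates how slowly EstRTT follows a deviating sample.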
Original Algorithm
• Problem
  • An ACK does not really acknowledge a transmission
  • It actually acknowledges the receipt of data
  • When a segment is retransmitted and then an ACK arrives at the sender
    • It is impossible to decide whether this ACK should be associated with the first or the second transmission when calculating the RTT
Karn/Partridge Algorithm
[Figure: two sender/receiver timelines, each showing an original transmission, a retransmission, an ACK, and the resulting SampleRTT]
• If the ACK is assumed to be for the original transmission but is actually for the retransmission
  • SampleRTT is too large
• If the ACK is assumed to be for the retransmission but is actually for the original transmission
  • SampleRTT is too small
Karn/Partridge Algorithm
• Do not sample the RTT when retransmitting
  • TCP cannot tell which transmission the latest ACK corresponds to
• Whenever TCP retransmits
  • Set the next timeout to double the previous value (similar to exponential backoff)
  • Congestion is the likely cause of the retransmission
  • Do not react aggressively; be more cautious as more timeouts happen
  • Retransmit segments conservatively
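The two Karn/Partridge rules can be combined with the running-average estimator in one small C sketch. It is an illustration under assumptions rather than an actual TCP implementation: the names are hypothetical, the weight `a` is as in the original algorithm, and the timeout is reset to 2 × EstRTT on the next unambiguous sample.

```c
#include <assert.h>

struct rto_state {
    double est_rtt;   /* running average of the RTT        */
    double timeout;   /* current retransmission timeout    */
    double a;         /* weight of the old estimate        */
};

/* Called when an ACK arrives. sample_rtt is the measured time;
   was_retransmitted is nonzero if the acknowledged segment was
   sent more than once, in which case the sample is ambiguous. */
void on_ack(struct rto_state *s, double sample_rtt, int was_retransmitted)
{
    if (was_retransmitted)
        return;                       /* rule 1: do not sample the RTT */
    s->est_rtt = s->a * s->est_rtt + (1.0 - s->a) * sample_rtt;
    s->timeout = 2.0 * s->est_rtt;
}

/* Called when the retransmission timer expires. */
void on_timeout(struct rto_state *s)
{
    s->timeout *= 2.0;                /* rule 2: exponential backoff   */
}
```

After a timeout the sender waits twice as long before the next retransmission, and the estimate itself is left untouched until an ACK arrives for a segment that was sent only once.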
Karn/Partridge Algorithm
• The Karn/Partridge algorithm was an improvement over the original approach, but it does not eliminate congestion
• We need to understand how the timeout is related to congestion
  • If you time out too soon, you may unnecessarily retransmit a segment, which adds load to the network
Karn/Partridge Algorithm
• The main problem with the original computation is that it does not take the variance of the SampleRTTs into consideration
• If the variance among the SampleRTTs is small
  • Then the EstRTT can be better trusted
  • There is no need to multiply it by 2 to compute the timeout
Karn/Partridge Algorithm
• On the other hand, a large variance in the samples suggests that the timeout value should not be tightly coupled to the EstRTT
• Jacobson/Karels proposed a new scheme for TCP retransmission