E N D
1. Advanced Networks 2002 1 Transport Layer Michalis Faloutsos
Many slides from Kurose-Ross
2. Advanced Networks 2002 2 Transport Layer Functionality Hide network from application layer
Transport layer resides at end points
Sees the network as a black box
3. Advanced Networks 2002 3 Transport Layers of the Internet TCP: reliable protocol
Guarantees end-to-end delivery
Self-controls rate: congestion and flow control
Connection oriented: handshake, state
Ordered delivery of packets to application
UDP: unreliable protocol
Non-regulated sending rate
Multiplexing-demultiplexing
4. Advanced Networks 2002 4 TCP overview
5. Advanced Networks 2002 5 TCP: What and How For more: RFCs: 793, 1122, 1323, 2018, 2581 full duplex data:
bi-directional data flow in same connection
MSS: maximum segment size
connection-oriented:
handshaking (exchange of control msgs) init’s sender, receiver state before data exchange
flow controlled:
sender will not overwhelm receiver point-to-point:
one sender, one receiver
reliable, in-order byte steam:
no “message boundaries”
pipelined:
TCP congestion and flow control set window size
send & receive buffers
6. Advanced Networks 2002 6 TCP segment structure
7. Advanced Networks 2002 7 TCP overview TCP is a sliding window protocol
Sender can have (Window) bytes in flight
Operates with cumulative ACKs
It includes control for the sending rate
Flow control: receiver-set sending rate
Congestion control: network-aware sending rate
8. Advanced Networks 2002 8 TCP seq. #’s and ACKs Seq. #’s:
byte stream “number” of first byte in segment’s data
ACKs:
seq # of next byte expected from other side
cumulative ACK
Q: how receiver handles out-of-order segments
A: TCP spec doesn’t say, - up to implementor
9. Advanced Networks 2002 9 TCP in a nutshell I. Slow start phase (actually this is fast increase)
Start with a window of 1 (or 2)
Successful ACK: Increase window by one 1 max size segment
Do this up to a threshold: sshthresh
II. Congestion control phase
Increase window by 1 max size segment every RTT
Drop window in half, if there is congestion
Packet loss: duplicate ACKs
Time expiration
10. Advanced Networks 2002 10 TCP Congestion Control end-end control (no network assistance)
transmission rate limited by congestion window size, Congwin, over segments:
11. Advanced Networks 2002 11 TCP congestion control: Intuition TCP is “probing” for usable bandwidth:
ideally: transmit as fast as possible (Congwin as large as possible) without loss
increase Congwin until loss (congestion)
loss: decrease Congwin, then begin probing (increasing) again
12. Advanced Networks 2002 12 TCP congestion control: TCP has two “phases”
slow start:
start from small, increase quickly
congestion avoidance:
Additive Increase Multiplicative Decrease
important variables:
Congwin
threshold: defines threshold between two slow start phase, congestion control phase
13. Advanced Networks 2002 13 TCP Slowstart exponential increase (per RTT) in window size
loss event: timeout (Tahoe TCP) and/or or three duplicate ACKs (Reno TCP)
14. Advanced Networks 2002 14 Why Call it Slow Start ? The original version of TCP suggested that the sender transmit as much as the Advertised Window permitted.
Routers may not be able to cope with this “burst” of transmissions.
Slow start is slower than the above version -- ensures that a transmission burst does not happen at once.
15. Advanced Networks 2002 15 TCP Congestion Avoidance
16. Advanced Networks 2002 16 TCP Congestion: Real Life is Hairy! Remember: bytes vs packets!
CW += MSS * MSS/CW
Thres = Max( 2* MSS,
InFlightData/2)
MSS: max segment size
InFlighData: un-ACK-ed data RFC 2581: TCP Congestion Control
17. Advanced Networks 2002 17 Fairness goal: if N TCP sessions share same bottleneck link, each should get 1/N of link capacity TCP congestion avoidance:
AIMD: additive increase, multiplicative decrease
increase window by 1 per RTT
decrease “window” by factor of 2 on loss event
18. Advanced Networks 2002 18 Why is TCP fair? Two competing sessions:
Additive increase gives slope of 1, as throughout increases
multiplicative decrease decreases throughput proportionally
19. Advanced Networks 2002 19 Macroscopic Description of Throughput Assume window toggling: W/2 to W
High rate: W * MSS / RTT
Low rate: W * MSS / 2 RTT
Rate increase is linearly between two extremes
Average throughput:
0.75 * W * MSS / RTT
20. Advanced Networks 2002 20 TCP: reliable data transfer
21. Advanced Networks 2002 21 TCP sender
22. Advanced Networks 2002 22 TCP Receiver: ACK generation [RFC 1122, RFC 2581]
23. Advanced Networks 2002 23 TCP: retransmission scenarios
24. Advanced Networks 2002 24 TCP Round Trip Time and Timeout Q: how to set TCP timeout value?
longer than RTT
note: RTT will vary
too short: premature timeout
unnecessary retransmissions
too long: slow reaction to segment loss Q: how to estimate RTT?
SampleRTT: measured time from segment transmission until ACK receipt
ignore retransmissions, cumulatively ACKed segments
SampleRTT will vary, want estimated RTT “smoother”
use several recent measurements, not just current SampleRTT
25. Advanced Networks 2002 25 TCP Round Trip Time and Timeout Setting the timeout
EstimtedRTT plus “safety margin”
large variation in EstimatedRTT -> larger safety margin
26. Advanced Networks 2002 26 A problem
27. Advanced Networks 2002 27 The Karn Patridge Algorithm Take SampleRTT measurements only for segments that have been sent once !
This eliminates the possibility that wrong RTT estimates are factored into the estimation.
Another change -- Each time TCP retransmits, it sets the next timeout to 2 X Last timeout --> This is called the Exponential Back-off (primarily for avoiding congestion).
28. Advanced Networks 2002 28 Jacobson Karels Algorithm An issue with the Karn/Patridge scheme is that it does not take into account the variation between RTT samples.
New method proposed -- the Jacobson Karels Algorithm.
Estimated RTT = Estimated RTT + d X Difference
Difference = Sample RTT - Estimated RTT
Deviation = Deviation + d (|Difference| - deviation)
Timeout = m Estimated RTT + f deviation.
The values of m and f are computed based on experience -- Typically m = 1 and f = 4.
29. Advanced Networks 2002 29 Silly Window Syndrome Suppose a MSS worth of data is collected and advertised window is MSS/2.
What should the sender do ? -- transmit half full segments or wait to send a full MSS when window opens ?
Early implementations were aggressive -- transmit MSS/2.
Aggressively doing this, would consistently result in small segment sizes -- called the Silly Window Syndrome.
30. Advanced Networks 2002 30 Issues .. We cannot eliminate the possibility of small segments being sent.
However, we can introduce methods to coalesce small chunks.
Delaying ACKs -- receiver does not send ACKs as soon as it receives segments.
How long to delay ? Not very clear.
Ultimate solution falls to the sender -- when should I transmit ?
31. Advanced Networks 2002 31 Nagle’s Algorithm If sender waits too long --> bad for interactive connections.
If it does not wait long enough -- silly window syndrome.
How do we solve this?
Timer -- clock based
If both available data and Window = MSS, send full segment.
Else, if there is unACKed data in flight, buffer new data until ACK returns.
Else, send new data now.
Note -- Socket interface allows some applications to turn off Nagle’s algorithm by setting the TCP-NODELAY option.
32. Advanced Networks 2002 32 TCP Connection Management Recall: TCP sender, receiver establish “connection” before exchanging data segments
initialize TCP variables:
seq. #s
buffers, flow control info (e.g. RcvWindow)
client: connection initiator
Socket clientSocket = new Socket("hostname","port number");
server: contacted by client
Socket connectionSocket = welcomeSocket.accept();
33. Advanced Networks 2002 33 TCP Set-up Three way handshake:
Step 1: client end system sends TCP SYN control segment to server
specifies initial seq #
Step 2: server end system receives SYN, replies with SYNACK control segment
ACKs received SYN
allocates buffers
specifies server-> receiver initial seq. #
Step 3: Client replies with an ACK (using servers seq number)
34. Advanced Networks 2002 34 TCP Connection Management (cont.) Closing a connection:
client closes socket: clientSocket.close();
Step 1: client end system sends TCP FIN control segment to server
Step 2: server receives FIN, replies with ACK. Closes connection, sends FIN.
Last ACK is never ACK-ed!!
35. Advanced Networks 2002 35 TCP Connection Management (cont.) Step 3: client receives FIN, replies with ACK.
Enters “timed wait” - will respond with ACK to received FINs
Step 4: server, receives ACK. Connection closed. Sends FIN.
Last ACK is never ACK-ed
36. Advanced Networks 2002 36 TCP Connection Management (cont)