Lecture 8: TCP and Congestion Control. ITCS 6166/8166 091 Spring 2007 Jamie Payton Department of Computer Science University of North Carolina at Charlotte February 5, 2007. Slides adapted from: Congestion slides for Computer Networks: A Systems Approach (Peterson and Davie)
Lecture 8: TCP and Congestion Control ITCS 6166/8166 091 Spring 2007 Jamie Payton Department of Computer Science University of North Carolina at Charlotte February 5, 2007 Slides adapted from: Congestion slides for Computer Networks: A Systems Approach (Peterson and Davie) Chapter 3 slides for Computer Networking: A Top Down Approach Featuring the Internet (Kurose and Ross)
Announcements • Textbook is on reserve in library • Homework 2 will be assigned on Wednesday • Due: Feb. 14
Transmission Control Protocol • Implementation of sliding window protocol • TCP uses sliding windows at sender and receiver (buffers) [Figure: TCP segment (packet) structure]
TCP Details • Views data as ordered streams of bytes (not packets) • Sequence numbers are over bytes • Seq # for segment = # of first byte • Acknowledgements are cumulative • Acks for next expected byte # • What happens on out-of-order segments? • Option 1: discard • Option 2: wait and see if other segments show up • This approach taken in practice
TCP Sequence Numbers • Seq. #’s: byte stream “number” of first byte in segment’s data • ACKs: seq # of next byte expected from other side; ACKs are cumulative • Q: how does the receiver handle out-of-order segments? • A: TCP spec doesn’t say; it is up to the implementor • Simple telnet scenario (Host A and Host B, time running downward): • User types ‘C’; Host A sends Seq=42, ACK=79, data = ‘C’ • Host B ACKs receipt of ‘C’ and echoes back ‘C’: Seq=79, ACK=43, data = ‘C’ • Host A ACKs receipt of echoed ‘C’: Seq=43, ACK=80
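The telnet exchange above can be traced in a few lines. This is a minimal sketch, not part of the original slides: the `Segment` class and the `reply` helper are hypothetical names, and the code assumes each data byte advances the sequence number by one, with no connection setup or teardown.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    seq: int            # byte-stream number of the first data byte
    ack: int            # next byte expected from the other side
    data: bytes = b""


def reply(received: Segment, my_next_seq: int, data: bytes = b"") -> Segment:
    """Build a segment that cumulatively ACKs everything in `received`."""
    return Segment(seq=my_next_seq,
                   ack=received.seq + len(received.data),
                   data=data)


# Host A: user types 'C'.
a1 = Segment(seq=42, ack=79, data=b"C")
# Host B: ACKs the 'C' and echoes it back, starting at the byte A expects.
b1 = reply(a1, my_next_seq=a1.ack, data=a1.data)
# Host A: ACKs the echo; it has no new data to send.
a2 = reply(b1, my_next_seq=b1.ack)

for name, seg in (("A->B", a1), ("B->A", b1), ("A->B", a2)):
    print(name, seg)
```

Because the ACK field is always "sequence number plus data length", a one-byte segment moves the ACK forward by exactly one, which is why 42 becomes 43 and 79 becomes 80.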
TCP: Sender • Assign sequence number to each segment (SeqNum) • Maintain three state variables: • send window size (SWS) – upper bound on the number of outstanding (unACKed) frames that the sender can send • last acknowledgment received (LAR) • last frame sent (LFS) • Maintain invariant: LFS - LAR <= SWS • Advance LAR when ACK arrives • Buffer up to SWS frames (in case retransmit required) • Timeout associated with each frame [Figure: sender window between LAR and LFS, with width ≤ SWS]
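A minimal sketch of the sender-side bookkeeping described above, counting in frames as the slide does. The class name `SlidingWindowSender` and its method names are illustrative assumptions; per-frame timers and retransmission are left out.

```python
class SlidingWindowSender:
    """Tracks LAR and LFS and enforces LFS - LAR <= SWS (hypothetical helper)."""

    def __init__(self, sws: int):
        self.sws = sws      # send window size
        self.lar = 0        # last acknowledgment received
        self.lfs = 0        # last frame sent
        self.unacked = {}   # seq -> payload, kept in case of retransmission

    def can_send(self) -> bool:
        return self.lfs - self.lar < self.sws

    def send(self, payload) -> int:
        if not self.can_send():
            raise RuntimeError("window full: wait for an ACK")
        self.lfs += 1
        self.unacked[self.lfs] = payload
        return self.lfs

    def on_ack(self, ack_num: int) -> None:
        """Cumulative ACK: every frame up to ack_num is acknowledged."""
        ack_num = min(ack_num, self.lfs)  # ignore ACKs for frames never sent
        while self.lar < ack_num:
            self.lar += 1
            self.unacked.pop(self.lar, None)


sender = SlidingWindowSender(sws=3)
for payload in ("a", "b", "c"):
    sender.send(payload)
print(sender.can_send())   # window is full at this point
sender.on_ack(2)
print(sender.can_send(), sorted(sender.unacked))
```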
TCP: Receiver • Maintain three state variables • receive window size (RWS) – upper bound on the number of out-of-order frames the receiver can accept • largest acceptable packet (LAP) • last packet received (LPR) • Maintain invariant: LAP - LPR <= RWS • Packet SeqNum arrives: • if LPR < SeqNum <= LAP, accept • if SeqNum <= LPR or SeqNum > LAP, discard • Send cumulative ACKs • Buffer out-of-order packets! [Figure: receiver window between LPR and LAP, with width ≤ RWS]
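The accept/discard rule and cumulative ACKs can be sketched as follows. This assumes LPR means the last packet received in order, so that LAP = LPR + RWS; the class name `SlidingWindowReceiver` and the `delivered` list are hypothetical.

```python
class SlidingWindowReceiver:
    """Accepts packets with LPR < SeqNum <= LAP and buffers out-of-order ones."""

    def __init__(self, rws: int):
        self.rws = rws        # receive window size
        self.lpr = 0          # last packet received in order (assumption)
        self.buffer = {}      # out-of-order packets waiting for a gap to fill
        self.delivered = []   # payloads handed to the application, in order

    @property
    def lap(self) -> int:
        # Largest acceptable packet, keeping LAP - LPR <= RWS.
        return self.lpr + self.rws

    def on_packet(self, seq: int, payload) -> int:
        """Process one packet and return the cumulative ACK number (LPR)."""
        if self.lpr < seq <= self.lap:
            self.buffer[seq] = payload
            # Deliver every packet that is now contiguous with LPR.
            while self.lpr + 1 in self.buffer:
                self.lpr += 1
                self.delivered.append(self.buffer.pop(self.lpr))
        # Anything outside the window is discarded; the ACK is sent either way.
        return self.lpr


rx = SlidingWindowReceiver(rws=4)
print(rx.on_packet(2, "b"))   # out of order: buffered, ACK stays at 0
print(rx.on_packet(1, "a"))   # fills the gap: ACK jumps to 2
print(rx.on_packet(7, "x"))   # beyond LAP: discarded
```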
TCP ACK generation [RFC 1122, RFC 2581]
Event at receiver -> TCP receiver action
• Arrival of in-order segment with expected seq #; all data up to expected seq # already ACKed -> Delayed ACK: wait up to 500 ms for next segment; if no next segment, send ACK
• Arrival of in-order segment with expected seq #; one other segment has ACK pending -> Immediately send single cumulative ACK, ACKing both in-order segments
• Arrival of out-of-order segment with higher-than-expected seq #; gap detected -> Immediately send duplicate ACK, indicating seq # of next expected byte
• Arrival of segment that partially or completely fills gap -> Immediately send ACK, provided that segment starts at lower end of gap
TCP Retransmission Timeout • RTT estimate important for efficient operation • Too long: long delay before retransmission • Too short: unnecessary retransmission • TCP RTT estimation • SampleRTT: measured time from segment transmission until ACK receipt • SampleRTT will vary, want estimated RTT “smoother” • average several recent measurements, not just current SampleRTT
TCP Round Trip Time • EstimatedRTT = (1 - α) * EstimatedRTT + α * SampleRTT • Exponential weighted moving average • influence of past sample decreases exponentially fast • typical value: α = 0.125
Setting the TCP Timeout • TimeoutInterval is EstimatedRTT plus a “safety margin” • large variation in EstimatedRTT -> larger safety margin • First, estimate how much SampleRTT deviates from EstimatedRTT: DevRTT = (1 - β) * DevRTT + β * |SampleRTT - EstimatedRTT| (typically, β = 0.25) • Then set the timeout interval: TimeoutInterval = EstimatedRTT + 4 * DevRTT
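The two moving averages and the timeout formula fit in a small class. This is a sketch under stated assumptions: the class name `RttEstimator` is hypothetical, the smoothing weights default to 0.125 and 0.25, the first sample seeds EstimatedRTT, DevRTT starts at 0, and DevRTT is updated using the EstimatedRTT from before the new sample is folded in. The slides do not specify the initial values or the update order.

```python
class RttEstimator:
    """EWMA-based RTT estimate and retransmission timeout (illustrative)."""

    def __init__(self, first_sample: float, alpha: float = 0.125,
                 beta: float = 0.25, initial_dev: float = 0.0):
        self.alpha = alpha
        self.beta = beta
        self.estimated_rtt = first_sample   # assumption: seed with first sample
        self.dev_rtt = initial_dev          # assumption: start with no deviation

    def update(self, sample_rtt: float) -> None:
        # Deviation first, so it is measured against the previous estimate.
        self.dev_rtt = ((1 - self.beta) * self.dev_rtt
                        + self.beta * abs(sample_rtt - self.estimated_rtt))
        self.estimated_rtt = ((1 - self.alpha) * self.estimated_rtt
                              + self.alpha * sample_rtt)

    @property
    def timeout_interval(self) -> float:
        return self.estimated_rtt + 4 * self.dev_rtt


est = RttEstimator(first_sample=100.0)   # milliseconds
est.update(180.0)                        # one unusually slow sample
print(est.estimated_rtt, est.dev_rtt, est.timeout_interval)
```

A single slow sample moves EstimatedRTT only a little (100 to 110 ms here), but it widens the safety margin noticeably, which is the behavior the slide describes.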
Flow Control • What happens if the receiving process is slow and the sending process is fast? • TCP provides flow control • sender won’t overflow receiver’s buffer by transmitting too much, too fast
How Flow Control Works • Receiver advertises spare room to sender: RcvWindow = RcvBuffer - [LastByteRcvd - LastByteRead] • Receiver includes RcvWindow in segments • Sender keeps track of how much spare room receiver has in its variable RcvWindow • Sender limits unACKed data to RcvWindow • guarantees receive buffer doesn’t overflow
Flow Control • Send buffer size: MaxSendBuffer • Receive buffer size: MaxRcvBuffer • Receiving side • LastByteRcvd - LastByteRead <= MaxRcvBuffer • AdvertisedWindow = MaxRcvBuffer - (NextByteExpected - NextByteRead) • Sending side • LastByteSent - LastByteAcked <= AdvertisedWindow • EffectiveWindow = AdvertisedWindow - (LastByteSent - LastByteAcked) • LastByteWritten - LastByteAcked <= MaxSendBuffer • block sender if (LastByteWritten - LastByteAcked) + y > MaxSendBuffer, where y is the number of bytes the sending process is trying to write • Always send ACK in response to arriving data segment • Persist when AdvertisedWindow = 0
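The window arithmetic on both sides can be checked with a few pure functions. This is a sketch, not a real TCP stack: the function names are hypothetical, the advertised window is written in terms of LastByteRcvd and LastByteRead, and clamping the effective window at zero is an added assumption for the case where the receiver shrinks its window.

```python
def advertised_window(max_rcv_buffer: int, last_byte_rcvd: int,
                      last_byte_read: int) -> int:
    """Receiver side: spare room left in the receive buffer."""
    return max_rcv_buffer - (last_byte_rcvd - last_byte_read)


def effective_window(advertised: int, last_byte_sent: int,
                     last_byte_acked: int) -> int:
    """Sender side: bytes that may still be put in flight (clamped at 0)."""
    return max(0, advertised - (last_byte_sent - last_byte_acked))


def must_block_writer(last_byte_written: int, last_byte_acked: int,
                      y: int, max_send_buffer: int) -> bool:
    """True if writing y more bytes would overflow the send buffer."""
    return (last_byte_written - last_byte_acked) + y > max_send_buffer


adv = advertised_window(max_rcv_buffer=4096, last_byte_rcvd=3000,
                        last_byte_read=1000)
print(adv)                                   # room the receiver advertises
print(effective_window(adv, last_byte_sent=5000, last_byte_acked=4000))
print(must_block_writer(9000, 4000, y=4000, max_send_buffer=8192))
```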
Congestion Control • Congestion, informally: “too many sources sending too much data too fast for network to handle” • Different from flow control! • Manifestations: • lost packets (buffer overflow at routers) • long delays (queueing in router buffers) • A top-10 problem in computer network research!
TCP Congestion Control • Idea • assumes best-effort network (FIFO or FQ routers) • each source determines network capacity for itself • uses implicit feedback • ACKs pace transmission (self-clocking) • Challenges • determining the available capacity in the first place • adjusting to changes in the available capacity
TCP Congestion Control Fundamentals • Additive Increase/Multiplicative Decrease • Slow start • Fast Retransmit and Fast Recovery
Additive Increase/Multiplicative Decrease • Objective: adjust to changes in the available capacity • New state variable per connection: CongestionWindow • Counterpart to flow control’s advertised window • limits how much data source has in transit • MaxWin = MIN(CongestionWindow, AdvertisedWindow) • EffWin = MaxWin - (LastByteSent - LastByteAcked) • Idea: • increase CongestionWindow when congestion goes down • decrease CongestionWindow when congestion goes up • Now the effective window (EffWin) includes both flow control and congestion control
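A short sketch of how the two limits combine. The function names are hypothetical, and clamping at zero is an added assumption for the case where a window shrinks below the amount already in flight.

```python
def max_window(congestion_window: int, advertised_window: int) -> int:
    """The sender obeys whichever limit is tighter."""
    return min(congestion_window, advertised_window)


def eff_window(congestion_window: int, advertised_window: int,
               last_byte_sent: int, last_byte_acked: int) -> int:
    """Bytes that may still be sent now, given data already in flight."""
    in_flight = last_byte_sent - last_byte_acked
    return max(0, max_window(congestion_window, advertised_window) - in_flight)


# Network is the bottleneck: CongestionWindow < AdvertisedWindow.
print(eff_window(8000, 12000, last_byte_sent=5000, last_byte_acked=1000))
# Receiver is the bottleneck: AdvertisedWindow < CongestionWindow.
print(eff_window(12000, 6000, last_byte_sent=5000, last_byte_acked=1000))
```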
AIMD (cont) • Question: how does the source determine whether or not the network is congested? • Answer: a timeout occurs • timeout signals that a packet was lost • packets are seldom lost due to transmission error • lost packet implies congestion
AIMD (cont) • Algorithm • increment CongestionWindow by one packet every time an entire window’s worth of data is sent successfully (linear increase) • divide CongestionWindow by two whenever a timeout occurs (multiplicative decrease) • think about the concepts in terms of packets, even though it’s implemented in bytes • In practice: increment a little for each ACK • Increment = (MSS * MSS)/CongestionWindow • CongestionWindow += Increment [Figure: Source/Destination timeline]
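The per-ACK increment and the halving on timeout can be sketched in bytes as follows. The MSS value of 1460 bytes and the floor of one MSS after a decrease are assumptions for illustration; the function names are hypothetical.

```python
MSS = 1460  # bytes; assumed segment size for this example


def on_ack(cwnd: float, mss: int = MSS) -> float:
    """Additive increase: grow by MSS*MSS/cwnd bytes for each ACK."""
    return cwnd + (mss * mss) / cwnd


def on_timeout(cwnd: float, mss: int = MSS) -> float:
    """Multiplicative decrease: halve the window (assumed floor: one MSS)."""
    return max(mss, cwnd / 2)


cwnd = 10.0 * MSS
for _ in range(10):          # one window's worth of ACKs
    cwnd = on_ack(cwnd)
print(cwnd / MSS)            # close to 11 segments: about +1 MSS per RTT
cwnd = on_timeout(cwnd)
print(cwnd / MSS)            # roughly half of that
```

Since a window of CongestionWindow bytes produces about CongestionWindow/MSS ACKs per round trip, the per-ACK increments add up to roughly one MSS per RTT, which matches the packet-level description.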
AIMD (cont) • Trace: sawtooth behavior • proven to provide more stability than additive increase/additive decrease • the consequences of having too large a window are much worse than those of having too small a window [Figure: sawtooth trace; y-axis: KB, x-axis: Time (seconds)]
Why AIMD? • Fairness • If you run at any of the other combinations (AIAD, MIAD, MIMD), common sequences of events can result in unfair distribution among competing flows • R. Jain and K. K. Ramakrishnan, Congestion Avoidance in Computer Networks with a Connectionless Network Layer: Concepts, Goals, and Methodology, in Proceedings of the Computer Networking Symposium, pp. 134-143, April 1988. • Stability • If you’re slow starting, you KNOW that ½ of the window is OK, so go back to there. • If you’re steady-state sending, then the reason your packet was dropped is likely because a new flow started up • V. Jacobson, Congestion Avoidance and Control, in Proceedings of the SIGCOMM Symposium, pp. 314-329, August 1988.
Slow Start • Objective: determine the available capacity at first • increase congestion window rapidly from a cold start • Idea: • begin with CongestionWindow = 1 packet • double CongestionWindow each RTT (increment by 1 packet for each ACK) [Figure: Source/Destination timeline]
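A small simulation shows why adding one packet per ACK doubles the window each round trip. It is a sketch that assumes no loss and exactly one ACK per packet (no delayed ACKs); the function names are hypothetical.

```python
def slow_start_rtt(cwnd_packets: int) -> int:
    """Advance one RTT: every packet in the window is ACKed once."""
    acks = cwnd_packets
    for _ in range(acks):
        cwnd_packets += 1      # increment by 1 packet for each ACK
    return cwnd_packets


def slow_start_trace(rtts: int, initial: int = 1) -> list:
    """Window size (in packets) at the start of each RTT."""
    trace = [initial]
    for _ in range(rtts):
        trace.append(slow_start_rtt(trace[-1]))
    return trace


print(slow_start_trace(5))
```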
Slow Start (cont.) • Why “slow”? • Starts slow in comparison to immediately filling the advertised window • Prevents routers from having to handle bursts of initial traffic.
Slow Start (cont) • when first starting connection • the source has no idea what resources are available • Slow start continues to double CongWin until a loss occurs, then enters additive increase/multiplicative decrease • when connection goes dead waiting for timeout • receiver reopens entire window • use slow start to ramp up to previous CongWin/2 • Trace • Problem: lose up to half a CongestionWindow’s worth of data • this happens if you hit the border of the network’s capacity (e.g., n bytes’ worth of data is sent successfully, so the window doubles to 2n, but the network can only support n)
Fast Retransmit and Fast Recovery • Problem: coarse-grain TCP timeouts lead to idle periods • Fast retransmit: use duplicate ACKs to trigger retransmission faster than normal • does not replace regular timeout but enhances it • a duplicate ACK suggests the receiver got a segment out of order, so an earlier segment may have been lost • three duplicate ACKs trigger a retransmit of the missing packet
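The duplicate-ACK trigger can be sketched as a small counter. The class name `DupAckDetector` is hypothetical, and the sketch only decides when to retransmit; it does not model the regular timeout or the window adjustment that follows.

```python
class DupAckDetector:
    """Signals a fast retransmit on the third duplicate of the same ACK."""

    def __init__(self, threshold: int = 3):
        self.threshold = threshold
        self.last_ack = None
        self.dup_count = 0

    def on_ack(self, ack_num: int) -> bool:
        """Return True when the segment starting at ack_num should be resent."""
        if ack_num == self.last_ack:
            self.dup_count += 1
            # Fire exactly once, when the threshold is reached.
            return self.dup_count == self.threshold
        # A new cumulative ACK: progress was made, so reset the counter.
        self.last_ack = ack_num
        self.dup_count = 0
        return False


detector = DupAckDetector()
acks = [100, 200, 200, 200, 200, 200, 300]
print([detector.on_ack(a) for a in acks])
```

The first ACK for 200 is not a duplicate; the retransmit fires on the third repeat after it, and further repeats do not fire again.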
Fast Retransmit Results • no more long, flat periods where no packets are sent, waiting for a timeout • eliminates about half of the timeouts in practice • Fast recovery • skip the slow start phase • go directly to half the last successful CongestionWindow (ssthresh)
Congestion Avoidance • TCP’s strategy • control congestion once it happens • repeatedly increase load in an effort to find the point at which congestion occurs, and then back off • i.e., cause congestion in order to control it • Alternative strategy • predict when congestion is about to happen • reduce rate before packets start being discarded • call this congestion avoidance, instead of congestion control
TCP Vegas • End host management of congestion avoidance • Watch for (implicit) signs that congestion is building • e.g. steady increase in RTT, flattening of the perceived throughput
TCP Vegas • While we’re increasing the congestion window, the perceived throughput stays about the same • i.e., the extra stuff is being queued • TCP Vegas tries to reduce the amount of this extra stuff
Algorithm • Let BaseRTT be the minimum of all measured RTTs (commonly the RTT of the first packet) • If not overflowing the connection, then ExpectedRate = CongestionWindow/BaseRTT • Source calculates sending rate (ActualRate) once per RTT • Record the send time for a particular packet, record how many bytes are transmitted before the ack comes, compute the sample RTT for that packet, and divide the number of bytes transmitted in the middle by the RTT • Source compares ActualRate with ExpectedRate: Diff = ExpectedRate - ActualRate • if Diff < a, increase CongestionWindow linearly (not using bandwidth effectively) • else if Diff > b, decrease CongestionWindow linearly (moving towards congestion) • else leave CongestionWindow unchanged
Algorithm (cont) • Parameters (in practice) • a = 1 packet • b = 3 packets
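The comparison step of the Vegas algorithm can be sketched as a single function. This is an illustration under assumptions: the function name `vegas_adjust` and the `step` parameter are hypothetical, and the thresholds a and b are passed in the same units as the rates (here packets per second), since the slides do not spell out the unit conversion.

```python
def vegas_adjust(cwnd: float, base_rtt: float, actual_rate: float,
                 a: float, b: float, step: float = 1.0) -> float:
    """Return the new CongestionWindow after one Vegas comparison."""
    expected_rate = cwnd / base_rtt
    diff = expected_rate - actual_rate
    if diff < a:
        return cwnd + step     # not using bandwidth effectively
    if diff > b:
        return cwnd - step     # moving towards congestion
    return cwnd                # between a and b: leave unchanged


# cwnd in packets, RTT in seconds, rates in packets per second.
print(vegas_adjust(10, 0.5, actual_rate=19.5, a=1, b=3))  # small Diff: grow
print(vegas_adjust(10, 0.5, actual_rate=18.0, a=1, b=3))  # in range: hold
print(vegas_adjust(10, 0.5, actual_rate=15.0, a=1, b=3))  # large Diff: shrink
```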