Modeling TCP Throughput: A Simple Model and its Empirical Validation

Ross Rosemark Penn State University Modeling TCP Throughput: A Simple Model and its Empirical Validation

Goals of Presentation • Intro to TCP • Particularly discuss the “Reno flavor” of TCP • Show how TCP is innately a self-tuning algorithm • Discuss the goals of this paper • State assumptions • Note: I do not discuss the math in detail • Discuss analysis • State goals of analysis • State analysis conclusions • Analyze paper • What contributions does it make? • Are these contributions significant?

What is TCP? • According to “Computer Networking” by James Kurose, TCP is an acronym for: • Transmission Control Protocol • TCP together with UDP forms the very core of today's Internet transport layer. • TCP is a protocol to send messages between two machines.

So what is the difference between TCP and UDP? • As discussed, there are two ways to send messages: • UDP • TCP • The difference between these protocols is that: • UDP does NOT guarantee delivery of a packet • It only takes a “best effort” approach • TCP guarantees delivery of a packet

How does TCP work? • When explaining TCP assume that there are two nodes (Node A and B) and Node A is sending a packet to Node B utilizing TCP. • High Level example of TCP under ideal circumstances • Node A sets a timer (TimeoutTimer) • Node A sends a packet (segment) to Node B • This segment contains a unique sequence number • Node B sends an Ack to Node A • Tells Node A that it received the msg. • Node A’s TimeoutTimer expires • Does nothing since received Ack from Node B • Node B stores the bytes it receives in a buffer • Application gets bytes from buffer

So what are some cases where TCP is not ideal? • In TCP a packet can be lost at several locations. • When the packet is transmitted from A to B • At Node B • When the packet is transmitted from B to A

How can you detect that packets have been lost? • There are two ways. • Timeout • Triple acknowledgement

Timeout • Typically congestion is the reason messages are lost due to timeouts • Node A knows its message is lost if: • Its timer expires and it has not received an Ack from Node B • If a message is lost • Node A resets its timer • Node A resends the packet • Same sequence number • Process repeats

Quick Discussion… Based on the discussion so far, how is TCP a self-tuning algorithm? • In TCP, if Node A’s timer expires before an Ack is received, it sets its timer to double the previous value. • For instance, suppose Node A waits 10 seconds for an Ack for packet 1 • If after 10 seconds Node A has received no Ack • Node A retransmits packet 1 • Sets its timer for 20 seconds • Once an Ack is received, it sets its timer back to the original 10 seconds.
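The doubling rule on this slide can be sketched in a few lines. This is a minimal illustration of the simplified rule as stated here, not real TCP code; the function name `next_timeout` is made up for the example, and the 10-second base value comes from the slide.

```python
def next_timeout(current: float, base: float, ack_received: bool) -> float:
    """Return the timer value to use for the next transmission.

    Doubles the timer after a timeout and resets it to the base value
    once an Ack arrives (the simplified rule from the slide).
    """
    if ack_received:
        return base
    return current * 2


# Node A starts with a 10 s timer, times out twice, then receives an Ack.
timer = 10.0
history = [timer]
for ack in (False, False, True):
    timer = next_timeout(timer, base=10.0, ack_received=ack)
    history.append(timer)
print(history)  # [10.0, 20.0, 40.0, 10.0]
```

Keeping the base value separate from the current value is what lets the timer fall back to its original setting after a successful Ack.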

Triple Acknowledgement • Another way of determining packet loss • First things to note • In TCP packets must be delivered in order • Also Node A (i.e. the sender) can send multiple packets at a time • Technically, what is a triple acknowledgement? • When a sender receives three duplicate Acks for the same packet

Triple Acknowledgement Example • Node A sends 5 packets • Node A never receives an Ack for the second one • Node B receives packets 1, 3, 4, 5 • When Node B receives packets 3, 4, and 5, it sends an Ack each time signifying that it has not seen packet 2 • When Node A receives 3 Acks for the same packet, it retransmits the lost packet
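The example above can be replayed in code. The sketch below assumes cumulative Acks that carry the number of the next packet the receiver expects, and it numbers whole packets rather than bytes for simplicity; the function names `receiver_acks` and `detect_fast_retransmit` are hypothetical.

```python
def receiver_acks(arrivals):
    """Return the cumulative Ack sent for each arriving packet.

    Each Ack carries the number of the next packet the receiver expects,
    so out-of-order arrivals repeat the same Ack number.
    """
    received = set()
    expected = 1
    acks = []
    for seq in arrivals:
        received.add(seq)
        while expected in received:
            expected += 1
        acks.append(expected)
    return acks


def detect_fast_retransmit(acks, dup_threshold=3):
    """Return the packet to retransmit after dup_threshold duplicate Acks, else None."""
    last, dups = None, 0
    for ack in acks:
        if ack == last:
            dups += 1
            if dups == dup_threshold:
                return ack
        else:
            last, dups = ack, 0
    return None


# Packet 2 is lost; Node B receives packets 1, 3, 4, 5.
acks = receiver_acks([1, 3, 4, 5])
lost = detect_fast_retransmit(acks)
```

Here every Ack asks for packet 2, and the third repeat tells the sender to retransmit it without waiting for the timer.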

How can you limit packet loss? • What happens if Node A sends packets quicker than Node B processes them? • In this case Node B’s buffer overflows • To ensure this does not happen TCP implements: • Flow control • A speed-matching service: matching the rate at which the sender is sending to the rate at which the receiving application is reading

How does TCP implement flow control? • Node B has a buffer • The available space varies depending on how many packets it has received from Node A as well as the rate at which Node B is processing them. • Every time Node B sends a packet to Node A, Node B specifies the remaining size of its buffer • Node A determines the number of packets it can send without overflowing Node B’s buffer • After Node A sends these packets it waits until it receives an Ack from Node B. • Once it receives an Ack, Node A can send another packet. • This window’s size varies over time.
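The bookkeeping behind flow control can be sketched as two small calculations. This is an illustration under assumptions: it counts bytes rather than packets, and the function and parameter names are invented for the example.

```python
def advertised_window(buffer_size, bytes_received, bytes_read):
    """Free space left in the receiver's buffer (what Node B reports to Node A)."""
    return buffer_size - (bytes_received - bytes_read)


def sendable(advertised, last_sent, last_acked):
    """Bytes the sender may still transmit without overflowing the receiver."""
    in_flight = last_sent - last_acked
    return max(0, advertised - in_flight)


# Node B has a 4096-byte buffer, has received 3000 bytes, and the
# application has read 1000 of them.
window = advertised_window(4096, bytes_received=3000, bytes_read=1000)

# Node A has sent up to byte 3500 and has Acks up to byte 3000.
allowed = sendable(window, last_sent=3500, last_acked=3000)
```

The window grows as the application reads data out of the buffer and shrinks as new data arrives, which is why its size varies over time.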

Example of Window Size

Do dropped packets affect congestion control? • Oops… not only can Node B experience congestion, but routers between Node A and Node B can experience congestion too • How do you adjust the rate at which A sends packets to account for these as well? • If Node A experiences packet loss as a result of a timeout • It sets a threshold at half its current window size and drops its window back to one packet • It then grows its window exponentially as it receives Acks (slow start) • When the window size hits the threshold, it grows its window linearly • If Node A experiences packet loss as a result of a triple Ack • Basic TCP reacts the same way: the threshold is set to half the window and slow start begins again • When the window size hits ½ the value it had before the loss, it grows the window linearly

Quick Discussion… Again how is TCP a self tuning algorithm? • As discussed TCP self tunes to alleviate congestion • I.e. in TCP a node’s window size fluctuates based on the ability of Node B to process packets.

Reno Flavor TCP • In this paper they only consider the Reno flavor of TCP • They argue it is the most widely used on the Internet • Difference between basic TCP and TCP Reno: • Whenever Node A receives a triple Ack, it skips slow start: it cuts its window in half and then grows it linearly • Known as fast recovery
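A rough sketch of how a Reno-style sender's window might evolve round by round is given below. It follows the standard behavior (exponential growth in slow start, linear growth above the threshold, halving on a triple Ack, restart from one packet on a timeout) but is heavily simplified: the window is counted in whole packets, one event is applied per round, and the name `next_cwnd` and the starting values are assumptions made for the example.

```python
def next_cwnd(cwnd, ssthresh, event):
    """Advance a simplified Reno congestion window by one round.

    event is "ack" (round completed without loss), "triple_dup", or "timeout".
    Returns the new (cwnd, ssthresh) pair, both in packets.
    """
    if event == "timeout":
        # Restart from one packet; remember half the old window as threshold.
        return 1, max(cwnd // 2, 2)
    if event == "triple_dup":
        # Fast recovery: halve the window and continue with linear growth.
        half = max(cwnd // 2, 2)
        return half, half
    if cwnd < ssthresh:
        return min(cwnd * 2, ssthresh), ssthresh  # slow start
    return cwnd + 1, ssthresh  # congestion avoidance


cwnd, ssthresh = 1, 8
trace = []
events = ["ack"] * 5 + ["triple_dup"] + ["ack"] * 2 + ["timeout"] + ["ack"] * 3
for event in events:
    cwnd, ssthresh = next_cwnd(cwnd, ssthresh, event)
    trace.append(cwnd)
```

Plotting `trace` gives the familiar sawtooth: a fast ramp, a linear climb, a drop to half after the triple Ack, and a drop to one packet after the timeout.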

Paper • Paper Title: “Modeling TCP Throughput: A Simple Model and its Empirical Validation” • Goals of paper: • Model TCP analytically (i.e. through a lot of math) and validate the model empirically • Predict throughput • Previous work • People have previously studied TCP through simulation • Not sufficient, as simulation does not offer a “TCP-friendly” throughput for a non-TCP flow that interacts with a TCP connection… • For instance FTPF interacting with TCP

Paper • In this paper the authors state: • Their approach captures • TCP’s fast retransmit mechanism (done in previous work) • The effect of TCP’s timeout mechanism on throughput (not done in previous work) • Their approach is able to accurately predict throughput over a significantly wider range of loss rates than before.

Assumptions • They model TCP congestion avoidance behavior in terms of rounds • i.e. a round starts with the back-to-back transmission of W packets, where W is the current size of the TCP congestion window. • Once all packets in this window are sent, no further packets are sent until the first Ack is received • The duration of a round is equal to the round trip time and is assumed to be independent of the window size. • Also assume that the time it takes to send all packets in a window is less than the round trip time. • A receiver sends one Ack for every two packets received

Assumptions (Cont) • A packet is lost in a round independently of any packets lost in other rounds • Packet loss is correlated among back-to-back transmissions in a round. • i.e. if one packet in a round is dropped, all subsequent packets in that round are dropped.
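The two loss assumptions can be made concrete with a small simulation of a single round. This is only a sketch of the assumption as stated on the slide; the function name `packets_delivered_in_round` and the explicit random-number generator argument are choices made for the example.

```python
import random


def packets_delivered_in_round(window, p, rng):
    """Number of packets delivered in one round under the slide's loss assumption.

    Each packet is lost with probability p until the first loss occurs;
    after that, every remaining packet in the round is treated as lost.
    """
    for sent in range(window):
        if rng.random() < p:
            return sent
    return window


# Each round draws its losses independently of every other round.
rng = random.Random(0)
delivered_per_round = [packets_delivered_in_round(8, 0.1, rng) for _ in range(5)]
```

Each call models one round, so independence between rounds corresponds to calling the function again with fresh random draws.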

Before we forget let’s analyze these assumptions based on what we know about TCP. • Are these assumptions acceptable? • They model TCP congestion avoidance behavior in terms of rounds • Makes sense… TCP does the same thing • The duration of a round is equal to the round trip time and is assumed to be independent of the window size. • I would assume the window size does impact the round time, since the window size dictates how many packets are sent. • Hence this affects the length of the timeout timer, since the timer for the last packet in the window will go off before the timer for the first packet in the window. • The time it takes to send all packets in a window is less than the round trip time. • Not necessarily. A node could set its timer too short and have to periodically adjust it as it learns more about the time it takes to send a packet to the receiver

Assumption Analysis • A receiver sends one Ack for every two packets received • Sounds good to me… I don’t know what the problem with this would be. • A packet is lost in a round independently of any packets lost in other rounds • Not valid. • What happens if Node A sends packet 1 to Node B in round 1 via Router C • This packet fills Router C’s queue • Node A realizes this packet is lost and resends this packet to B, which B receives and sends an Ack, which A receives • In another round Node A sends packet 2 to Node B via Router C • Since packet 1 is still filling up Router C’s buffer, this packet is dropped. • As you can see, packet 1 affects packet 2

Assumption Analysis • Packet loss is correlated among back-to-back transmissions in a round. • Very invalid… • In round 1 Node A sends packet 1 to Node B via router 1 • Packet lost at router 1 • Still in round 1, Node A then sends packet 2 to Node B via router 2 • Packet received by Node B • I also don’t understand how you can get triple Acks if all subsequent packets are lost • Anyone?

What do they develop empirical models for? • Develop a stochastic model of TCP congestion control in several steps, corresponding to its operating regimes • When loss indications are exclusively triple duplicate Acks • When loss indications are both timeouts and triple duplicate Acks • When the congestion window is limited by the receiver’s advertised window. • I don’t get into the math at all…
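For readers who want a feel for the result, the sketch below codes the approximate closed-form send rate as it is commonly quoted from the paper, alongside the simpler triple-duplicate-only form. Treat it as an assumption to check against the paper itself: here p is the loss probability, rtt the round trip time in seconds, t0 the initial timeout in seconds, b the number of packets acknowledged per Ack, and wmax the receiver's maximum window in packets. The function names are made up for this example.

```python
import math


def td_only_throughput(p, rtt, b=2):
    """Send rate (packets/s) when losses are detected only by triple duplicate Acks."""
    return math.sqrt(3 / (2 * b * p)) / rtt


def approx_throughput(p, rtt, t0, b=2, wmax=None):
    """Approximate send rate (packets/s) including timeouts and the window limit."""
    if not 0 < p <= 1:
        raise ValueError("p must be in (0, 1]")
    td_term = rtt * math.sqrt(2 * b * p / 3)
    to_term = t0 * min(1.0, 3 * math.sqrt(3 * b * p / 8)) * p * (1 + 32 * p ** 2)
    rate = 1.0 / (td_term + to_term)
    if wmax is not None:
        rate = min(wmax / rtt, rate)  # receiver's advertised window caps the rate
    return rate


# Example inputs (illustrative only): 1% loss, 100 ms RTT, 1 s timeout.
full = approx_throughput(0.01, rtt=0.1, t0=1.0)
td = td_only_throughput(0.01, rtt=0.1)
```

The three pieces line up with the three regimes on the slide: the first term covers triple duplicate Acks, the second adds timeouts, and the `wmax / rtt` cap covers the receiver-limited case.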

Measurements and Trace Analysis • Experimental setup • Obtained data from 37 TCP connections established between 18 hosts scattered across the US and Europe • Operating systems at these hosts varied (Win95, Sun, Linux) • Mostly flavors of Linux • TCP behavior varies between implementations • In Linux, 2 duplicate Acks instead of 3 count as a triple duplicate Ack • All data sets are for unidirectional bulk transfers • i.e. FTP

Measurements and Trace analysis • Example of data utilized in experiments • Shows an overview of the machine traces. • They point out that timeouts (TO) dominated the packet losses. • This data is used to validate their models

Measurements and Trace analysis • Broke trace data from 1 hour into 100 second intervals. • Each point is a 100 sec interval • Plotted as lines • Another paper’s model • Considers only triple duplicate Acks • Their model (proposed full) • TD -> no timeout, only triple duplicate Acks • T0 -> at least one timeout but no exponential backoff • T1 -> experiences one exponential backoff • i.e. timeout after a timeout • T2 -> experiences two exponential backoffs • T3 -> experiences three exponential backoffs
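One way to picture the interval labelling is sketched below. It is an interpretation of the slide, not the authors' tooling: loss events are assumed to arrive as (timestamp in seconds, kind) pairs with kind either "triple_dup" or "timeout", an interval is labelled by its longest run of back-to-back timeouts, and all names are hypothetical.

```python
def split_into_intervals(events, interval=100.0):
    """Group (timestamp, kind) loss events into consecutive fixed-length intervals."""
    buckets = {}
    for t, kind in events:
        buckets.setdefault(int(t // interval), []).append(kind)
    return buckets


def classify_interval(kinds):
    """Label a non-empty interval by its longest run of consecutive timeouts.

    No timeouts -> "TD"; a single timeout -> "T0"; a timeout followed
    immediately by another (one exponential backoff) -> "T1"; and so on.
    """
    longest = run = 0
    for kind in kinds:
        run = run + 1 if kind == "timeout" else 0
        longest = max(longest, run)
    if longest == 0:
        return "TD"
    return f"T{longest - 1}"


events = [(5, "triple_dup"), (50, "timeout"),
          (120, "timeout"), (130, "timeout"),
          (250, "triple_dup")]
labels = {i: classify_interval(k) for i, k in split_into_intervals(events).items()}
```

Intervals with no loss events at all do not appear in `labels`, since nothing is bucketed for them.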

Measurements and Trace analysis • Their approach estimates the observed data better. • Their approach • Proposed (approximate) • Other approach • TD only

Problems with their model • Their approach is not always good • In this graph the other model is a better estimator • They believe this is because a lot of duplicate Acks occur in this data • Recall that the other paper only tries to estimate duplicate Acks • In this paper they are also estimating timeouts, etc.

Questions?
