FAST TCP Cheng Jin David Wei Steven Low netlab.CALTECH.edu
Acknowledgments • Caltech • Bunn, Choe, Doyle, Hegde, Jayaraman, Newman, Ravot, Singh, X. Su, J. Wang, Xia • UCLA • Paganini, Z. Wang • CERN • Martin • SLAC • Cottrell • Internet2 • Almes, Shalunov • MIT Haystack Observatory • Lapsley, Whitney • TeraGrid • Linda Winkler • Cisco • Aiken, Doraiswami, McGugan, Yip • Level(3) • Fernes • LANL • Wu
Outline • Motivation & approach • FAST architecture • Window control algorithm • Experimental evaluation skip: theoretical foundation
Congestion control Example congestion measure pl(t): • Loss (Reno) • Queueing delay (Vegas) [figure: network of links with congestion measures pl(t) shared by sources with rates xi(t)]
TCP/AQM [figure: feedback loop between source rates xi(t) and link congestion measures pl(t)] TCP: • Reno • Vegas AQM: • DropTail • RED • REM/PI • AVQ • Congestion control is a distributed asynchronous algorithm to share bandwidth • It has two components • TCP: adapts sending rate (window) to congestion • AQM: adjusts & feeds back congestion information • They form a distributed feedback control system (see the toy sketch below) • Equilibrium & stability depend on both TCP and AQM • And on delay, capacity, routing, #connections
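To make the feedback-loop view concrete, here is a toy Python sketch (not from the slides; all constants are made up): sources adapt their rates to a fed-back congestion price, the link adjusts the price from excess demand, and the loop settles at a fair equilibrium.

```python
# Toy TCP/AQM feedback loop (illustrative only, not the FAST algorithm).
# Sources maximize U_i(x) - p*x with U_i(x) = log(x), a Vegas-like
# utility, so each picks rate x_i = 1/p; the link raises its price
# when aggregate demand exceeds capacity and lowers it otherwise.

CAPACITY = 100.0   # link capacity, pkts/ms (made up)
GAMMA = 0.0002     # price step size, chosen small enough for stability

price = 0.1                      # link congestion measure p_l(t)
rates = [30.0, 60.0]             # source rates x_i(t)
for _ in range(2000):
    # TCP side: each source adapts its rate to the fed-back price.
    rates = [min(1.0 / price, CAPACITY) for _ in rates]
    # AQM side: the price tracks excess demand at the link.
    price = max(1e-6, price + GAMMA * (sum(rates) - CAPACITY))

print([round(x, 1) for x in rates], round(price, 4))  # -> [50.0, 50.0] 0.02
```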
Difficulties at large window • Equilibrium problem • Packet level: AI too slow, MD too drastic • Flow level: required loss probability too small • Dynamic problem • Packet level: must oscillate on binary signal • Flow level: unstable at large window
Packet & flow level: Reno TCP • Packet level: ACK: W ← W + 1/W; Loss: W ← W − 0.5W • Flow level: equilibrium and dynamics • Equilibrium throughput (Mathis formula): x = 1.225 / (T·√p) pkts/sec (worked example below)
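A worked example (my numbers; packet size, rate, and RTT are assumptions) shows why the required loss probability becomes too small at large windows, using the Mathis formula from this slide:

```python
# Required loss probability for Reno at a target rate, from
# x = 1.225 / (T * sqrt(p))  =>  p = (1.225 / (x * T))**2.

RATE_BPS = 10e9        # target: 10 Gbps (assumed)
PKT_BITS = 1500 * 8    # 1500-byte packets (assumed)
RTT = 0.1              # 100 ms round-trip time (assumed)

x = RATE_BPS / PKT_BITS                # required rate in pkts/sec
p = (1.225 / (x * RTT)) ** 2           # required loss probability
print(f"rate = {x:.0f} pkts/s, window = {x * RTT:.0f} pkts")
print(f"required loss probability p = {p:.2e}")
print(f"i.e. at most one loss every {1 / (p * x) / 3600:.1f} hours")
# -> p ~ 2e-10: far smaller than real paths provide, which is the
#    flow-level equilibrium problem at large windows.
```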
Reno TCP • Packet level • Designed and implemented first • Flow level • Understood afterwards • Flow level dynamics determine • Equilibrium: performance, fairness • Stability • Approach: design the flow level equilibrium & stability, then implement those flow level goals at the packet level • Packet level design of FAST, HSTCP, STCP guided by flow level properties
Packet level • Reno AIMD(1, 0.5): ACK: W ← W + 1/W; Loss: W ← W − 0.5W • HSTCP AIMD(a(w), b(w)): ACK: W ← W + a(w)/W; Loss: W ← W − b(w)·W • STCP MIMD(a, b): ACK: W ← W + 0.01; Loss: W ← W − 0.125·W • FAST: periodic update driven by queueing delay (sketched below in the window control section; the loss-based updates are rendered in code below)
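As a reading aid, a minimal Python rendering of the loss-based updates above (my code; the HSTCP functions a(w) and b(w) are left as parameters since they are tabulated in RFC 3649):

```python
# Per-ACK and per-loss window updates from the slide above.

def reno(w, loss):                 # AIMD(1, 0.5)
    return w - 0.5 * w if loss else w + 1.0 / w

def hstcp(w, loss, a, b):          # AIMD(a(w), b(w)); a, b per RFC 3649 tables
    return w - b(w) * w if loss else w + a(w) / w

def stcp(w, loss):                 # MIMD(0.01, 0.125)
    return w - 0.125 * w if loss else w + 0.01
```

FAST differs in kind: it updates the window periodically from queueing delay rather than per loss event.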
Flow level: Reno, HSTCP, STCP, FAST • Similar flow level equilibrium: x = a / (T·p^b) pkts/sec (generalized Mathis formula) • a = 1.225, b = 0.5 (Reno); a = 0.120, b = 0.835 (HSTCP); a = 0.075, b = 1 (STCP)
Flow level: Reno, HSTCP, STCP, FAST • Common flow level dynamics: window adjustment = control gain × flow level goal, i.e. ẇi(t) = κi(t)·(1 − pi(t)/ui(t)) with ui = U′i(xi) • Different gain κi and utility Ui • They determine equilibrium and stability • Different congestion measure pi • Loss probability (Reno, HSTCP, STCP) • Queueing delay (Vegas, FAST)
Implementation strategy • Common flow level dynamics: window adjustment = control gain × flow level goal • Equation-based adjustment: • Small adjustment when close to target, large when far away • Needs an estimate of how far the current state is from the target • Scalable (see the sketch below) • Window-based adjustment: • Window adjustment independent of pi • Depends only on current window • Difficult to scale
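A one-line sketch of the equation-based strategy (my code; the gain value is a placeholder) shows why it scales: the step is proportional to the distance from the target, so it shrinks automatically near equilibrium.

```python
# Equation-based window adjustment: move in proportion to the distance
# between the measured congestion p_i and the target marginal utility u_i.
def adjust(w, p, u, kappa=0.5):
    # Far from the target (p << u or p >> u): large step.
    # Close to the target (p ~= u): step vanishes.
    # The ratio p/u is exactly the needed estimate of how far
    # the flow is from its flow-level goal.
    return w + kappa * (1.0 - p / u)
```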
Outline • Motivation & approach • FAST architecture • Window control algorithm • Experimental evaluation skip: theoretical foundation
Architecture [figure: FAST components operating at sub-RTT timescale, RTT timescale, and loss recovery timescale]
Architecture Each component • designed independently • upgraded asynchronously
Architecture Each component • designed independently • upgraded asynchronously • Focus here: Window Control
FAST TCP basic idea: use delay as congestion measure • Delay provides finer congestion information • Delay scales correctly with network capacity • Can operate with low queueing delay [figure: queue delay vs. window; loss-based TCP fills the buffer until loss, FAST settles at a small delay near capacity C]
Window control algorithm • Full utilization • regardless of bandwidth-delay product • Globally stable • exponential convergence • Fairness • weighted proportional fairness • parameter α (see the sketch below)
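A minimal sketch of the FAST window update as published in the Infocom 2004 paper cited at the end of this deck (my code; the parameter values are illustrative, not the deployment defaults):

```python
# FAST window control: once per RTT, move the window toward a target
# that keeps about `alpha` of this flow's packets queued in the network.
def fast_window(w, base_rtt, rtt, alpha=200, gamma=0.5):
    # base_rtt: minimum observed RTT (propagation delay estimate)
    # rtt:      current average RTT; rtt - base_rtt is queueing delay
    # alpha:    fairness parameter, packets kept in the queue per flow
    # gamma:    smoothing gain in (0, 1]
    target = (base_rtt / rtt) * w + alpha
    return min(2.0 * w, (1.0 - gamma) * w + gamma * target)  # capped doubling
```

At the fixed point, w·(1 − base_rtt/rtt) = α: each flow keeps α packets in the bottleneck queue, which yields the weighted proportional fairness claimed above.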
Outline • Motivation & approach • FAST architecture • Window control algorithm • Experimental evaluation • Abilene-HENP network • Haystack Observatory • DummyNet
Abilene Test [figure: Abilene-HENP path with OC48 and OC192 links] Periodic losses every 10 mins (Yang Xia, Harvey Newman, Caltech)
FAST backs off to make room for Reno. Periodic losses every 10 mins (Yang Xia, Harvey Newman, Caltech)
Haystack Experiments Lapsley, MIT Haystack
Haystack: 1 flow (Atlanta -> Japan) • Iperf used to generate traffic • Sender is a 2.6 GHz Xeon • Window was constant • Burstiness in rate due to host processing and ACK spacing Lapsley, MIT Haystack
Haystack – 2 Flows from 1 machine (Atlanta -> Japan) Lapsley, MIT Haystack
Linux Loss Recovery (on timeout) • 1. All outstanding packets marked as lost; SACKs reduce the number of packets marked • 2. Lost packets retransmitted slowly as cwnd is capped at 1 (bug)
DummyNet Experiments • Experiments using emulated network • 800 Mbps emulated bottleneck in DummyNet • Sender PC: dual Xeon 2.6 GHz, 2 GB, Intel GbE, Linux 2.4.22 • Receiver PC: dual Xeon 2.6 GHz, 2 GB, Intel GbE, Linux 2.4.22 • DummyNet PC: dual Xeon 3.06 GHz, 2 GB, FreeBSD 5.1
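For reference, a hedged sketch of how such an 800 Mbps bottleneck could be set up on the FreeBSD DummyNet PC using the standard ipfw/dummynet CLI (my script; the rule number, 120 ms delay, and 2000-packet queue are assumptions, not taken from the slides):

```python
# Configure an emulated 800 Mbps bottleneck with FreeBSD ipfw/dummynet.
# Run as root on the DummyNet PC sitting between sender and receiver.
import subprocess

def ipfw(args: str) -> None:
    subprocess.run(f"ipfw {args}", shell=True, check=True)

ipfw("add 100 pipe 1 ip from any to any")                  # all traffic through pipe 1
ipfw("pipe 1 config bw 800Mbit/s delay 120ms queue 2000")  # rate, delay, buffer (pkts)
```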
Dynamic sharing: 3 flows [figure: throughput traces for FAST and Linux TCP] Dynamic sharing on DummyNet • capacity = 800 Mbps • delay = 120 ms • 3 flows • iperf throughput • Linux 2.4.x (HSTCP: UCL)
Dynamic sharing: 3 flows [figure: steady throughput traces for FAST, Linux, HSTCP, and BIC]
[figure: 30-min queue, loss, and throughput traces for FAST, Linux, HSTCP, and STCP] Dynamic sharing on DummyNet • capacity = 800 Mbps • delay = 120 ms • 14 flows • iperf throughput • Linux 2.4.x (HSTCP: UCL)
[figure: 30-min queue, loss, and throughput traces for FAST, Linux, HSTCP, and BIC] Room for mice!
Average Queue vs Buffer Size DummyNet • capacity = 800 Mbps • delay = 200 ms • 1 flow • buffer size: 50, …, 8000 pkts (S. Hegde, B. Wydrowski, et al., Caltech)
netlab.caltech.edu/FAST • FAST TCP: motivation, architecture, algorithms, performance. IEEE Infocom, March 2004 • β-release: April 2004 • Source freely available for any non-profit use
Aggregate throughput [figure: near-ideal performance] DummyNet: capacity = 800 Mbps; delay = 50–200 ms; #flows = 1–14; 29 experiments
Aggregate throughput [figure: small window (800 pkts) vs. large window (8000 pkts)] DummyNet: capacity = 800 Mbps; delay = 50–200 ms; #flows = 1–14; 29 experiments
Fairness [figure: Jain's index; HSTCP ≈ Reno] DummyNet: capacity = 800 Mbps; delay = 50–200 ms; #flows = 1–14; 29 experiments
Stability [figure: stable in diverse scenarios] DummyNet: capacity = 800 Mbps; delay = 50–200 ms; #flows = 1–14; 29 experiments
netlab.caltech.edu/FAST • FAST TCP: motivation, architecture, algorithms, performance. IEEE Infocom, March 2004 • β-release: April 2004 • Source freely available for any non-profit use
BACKUP Slides
IP Rights • Caltech owns IP rights • applicable more broadly than TCP • leave all options open • IP freely available if FAST TCP becomes IETF standard • Code available on FAST website for any non-commercial use
NSF WAN in Lab Caltech: John Doyle, Raj Jayaraman, George Lee, Steven Low (PI), Harvey Newman, Demetri Psaltis, Xun Su, Yang Xia Cisco: Bob Aiken, Vijay Doraiswami, Chris McGugan, Steven Yip netlab.caltech.edu
Key Personnel • Caltech • Steven Low, CS/EE • Harvey Newman, Physics • John Doyle, EE/CDS • Demetri Psaltis, EE • Raj Jayaraman, CS • Xun Su, Physics • Yang Xia, Physics • George Lee, CS • 2 grad students • 3 summer students • Cisco • Bob Aiken • Vijay Doraiswami • Chris McGugan • Steven Yip • Cisco engineers
Spectrum of tools [figure: tools arranged along log(cost) vs. log(abstraction)] • Live networks: PlanetLab, Abilene, NLR, DataTAG, CENIC, WAIL, etc. • WAN in Lab • Emulation: DummyNet, EmuLab, ModelNet • Simulation: NS, SSFNet, QualNet, JavaSim • Math: Mathis formula, optimization, control theory, nonlinear models, stochastic models …we use them all
Spectrum of tools • live networks • WAN in Lab • emulation • simulation • math • Critical in development, e.g. Web100
Goal: state-of-the-art hybrid WAN • High speed, large distance • 2.5G → 10G • 50–200 ms • Wireless devices connected by optical core • Controlled & repeatable experiments • Reconfigurable & evolvable • Built-in monitoring capability
WAN in Lab • 5-year plan • 6 Cisco ONS 15454 • 4 routers • Tens of servers • Wireless devices • 800 km fiber • ~100 ms RTT V. Doraiswami (Cisco) R. Jayaraman (Caltech)
WAN in Lab • Year-1 plan • 3 Cisco ONS 15454 • 2 routers • Tens of servers • Wireless devices V. Doraiswami (Cisco) R. Jayaraman (Caltech)
Hybrid Network • Scenarios: • Ad hoc network • Cellular network • Sensor network • How does the optical core support wireless edges? X. Su (Caltech)