Net100 PIs: Wendy Huntoon/PSC, Tom Dunigan/ORNL, Brian Tierney/LBNL

High-Performance Network Research - SciDAC/Base
NET100: Developing network-aware operating systems
www.net100.org
MICS Program Manager: Thomas Ndousse
Date Prepared: 1/7/02
Tasks:
  - develop/deploy network probes/sensors
  - develop network metrics data base
  - develop transport protocol optimizations
  - develop network-tuning daemon
Net100 Novel Ideas
• Net100 will tune network-UNaware applications based on recent and current link characteristics
• Net100 will tune more than just transport buffer sizes, such as
  • TCP AIMD parameters
  • DUP threshold
  • Delayed ACK
• Net100 will determine optimal paths and whether to use multiple streams and/or multiple paths
• Net100 kernel utilizes passive monitoring from the Web100 kernel
Impact and Connections
• IMPACT:
  • increase throughput of bulk transfers over high bandwidth/delay networks (like DOE’s ESnet)
  • select optimal paths and transport parameters for distributed (Grid) applications (e.g., GridFTP)
  • provide network performance data base from active and passive monitoring
• CONNECTIONS:
  • SciDAC: Astrophysics, Bandwidth Estimation, Data Grid, INCITE, Logistical Networking
  • Base: Network Monitoring, Data Grid, Transport Protocols
Milestones/Dates/Status (target Mon/Yr, then DONE date where given)
• Network probes and sensors
  - initial sensor and tool deployment 12/01 (DONE 12/01)
  - data base design 4/02
  - initial data base implementation 9/02
  - final sensor/data base 6/03
• Transport protocol optimizations
  - protocol analysis 11/02
  - initial tuning daemon 3/02
  - bulk transfer tuning demos 8/02
  - final tuning daemon 6/03
• Multipath support
  - analytical analysis 8/02
  - proof-of-principle routing daemons 12/02
  - grid applications demos 4/03

Net100 project
• New DOE-funded (Office of Science) project ($1M/yr, 3 yrs)
• Principal investigators
  • Wendy Huntoon and the NCAR/PSC/Web100 team (Matt Mathis)
  • Brian Tierney, LBNL
  • Tom Dunigan, ORNL
• Objective: develop network-aware operating systems
  • optimize and understand end-to-end network and application performance
  • eliminate the “wizard gap”
• Motivation
  • DOE has a large investment in high speed networks (ESnet) and distributed applications
  • many network applications are not utilizing the available bandwidth

Net100 approach
• Develop Network Tools Analysis Framework (NTAF)
  • collect data for network tuning
  • Develop/evaluate/deploy network tools (Enable, NWS, iperf, pipechar, …)
  • aggregate and transform output from tools and Web100
  • Store/query/archive performance data
  • evaluate network applications over DOE’s ESnet (OC12, OC48, 10GigE…)
    • bulk transfers over high bandwidth/delay network
    • distributed applications (grid)
• Investigate TCP optimizations
  • simulate/emulate/deploy
  • Linux kernel mods
• Autotune network applications
  • WAD (workaround daemon)

Web100 summary
• NSF funded (NCAR/PSC) web100.org
• Modified Linux kernel (2.4.9)
  • instrumented kernel to read/set TCP variables for a specific flow
  • readable: RTT, counts (bytes, pkts, retransmits, dups), state (SACKs, windowscale, cwnd, ssthresh) (115 variables!)
  • settable: buffer sizes
• GUI to display/modify a flow’s TCP variables, real-time
• API for network-aware applications
• Early evaluators: ANL, SLAC, LBNL, ORNL, universities

Motivation
• bulk transfers are slow
  • faster links (OC12, OC48, 10GigE), but long delay
  • classic TCP tuning problem
  • also broken TCP stacks
• Under-provisioned routers/switches
• TCP is lossy, slow to recover
  • tune it or replace it?
• Compute/data grids
  • sense/probe link bandwidths/latencies
  • schedule/configure distributed application
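The "classic TCP tuning problem" on a fast, long-delay link comes down to the bandwidth-delay product: a sender can have at most one window of data in flight per round trip, so a small buffer caps throughput regardless of link speed. The sketch below works through the arithmetic. The 70 ms round-trip time and the 64 KB window are illustrative assumptions, not measurements from the project; the OC12 rate is the standard 622.08 Mb/s line rate.

```python
def bdp_bytes(bandwidth_bps: float, rtt_s: float) -> int:
    """Bandwidth-delay product in bytes: data in flight needed to fill the path."""
    return round(bandwidth_bps * rtt_s / 8)


def window_limited_throughput_bps(window_bytes: int, rtt_s: float) -> float:
    """Upper bound on throughput when at most one window is sent per round trip."""
    return window_bytes * 8 / rtt_s


# Illustrative values (assumptions for this sketch).
OC12_BPS = 622_080_000      # OC12 line rate
RTT_S = 0.070               # assumed cross-country round-trip time
SMALL_WINDOW = 64 * 1024    # assumed untuned 64 KB window

if __name__ == "__main__":
    print("buffer needed to fill the link:", bdp_bytes(OC12_BPS, RTT_S), "bytes")
    print("throughput with a 64 KB window:",
          round(window_limited_throughput_bps(SMALL_WINDOW, RTT_S) / 1e6, 1), "Mb/s")
```

Under these assumptions the path needs a window of roughly 5.4 MB, while a 64 KB window limits the transfer to about 7.5 Mb/s.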

TCP losses
[Figure: TCP transfer plot with "instantaneous" and "average" traces. Annotations: "Early packet drops", "Packet losses during startup, linear recovery", "Packet loss", "0.5 Mbs".]
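The "linear recovery" in the figure is why a single loss is so costly on a long, fast path. A rough estimate, assuming standard congestion avoidance (the window is halved on loss and then grows by one segment per round trip) and ignoring delayed ACKs, is sketched below. The link rate, round-trip time, and MSS values are illustrative assumptions.

```python
def recovery_time_s(bandwidth_bps: float, rtt_s: float, mss_bytes: int = 1460) -> float:
    """Seconds to grow back to a full window after one loss halves it.

    Assumes additive increase of one MSS per round trip (no delayed ACKs).
    """
    full_window_segments = bandwidth_bps * rtt_s / (8 * mss_bytes)
    round_trips_needed = full_window_segments / 2   # climb from W/2 back to W
    return round_trips_needed * rtt_s


if __name__ == "__main__":
    # Assumed OC12 path with a 70 ms round-trip time.
    print(round(recovery_time_s(622_080_000, 0.070), 1), "s with a 1460-byte MSS")
    print(round(recovery_time_s(622_080_000, 0.070, mss_bytes=8960), 1), "s with an 8960-byte MSS")
```

With these numbers, recovery takes on the order of two minutes at a 1460-byte MSS and about one sixth of that with a jumbo-frame MSS, since the window regrows one segment per round trip.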

TCP tuning (workarounds)
• Avoid losses
  • retain/probe for “optimal” buffer sizes
  • ECN capable routers/hosts
  • reduce bursts (TCP Vegas)
• Faster recovery
  • bigger MSS (jumbo frames)
  • speculative recovery (D-SACK)
  • modified congestion avoidance?
• Autotune (WAD variables)
  • Buffer size
  • Dupthresh
  • Del ACK, Nagle
  • AIMD
  • Virtual MSS
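For comparison with autotuning, this is what a network-aware application has to do by hand for two of the knobs listed above (buffer size and Nagle). It is a minimal sketch using the standard sockets API from Python; the helper name `make_tuned_socket` and the 256 KB request are assumptions for illustration, and it is not the WAD or Web100 interface. Knobs such as dupthresh or the AIMD parameters have no portable socket option and would need kernel support.

```python
import socket


def make_tuned_socket(buffer_bytes: int, disable_nagle: bool = False) -> socket.socket:
    """Create a TCP socket and request larger send/receive buffers.

    Buffers are requested before connect() so the window scale can be
    negotiated accordingly. The OS may cap or adjust the requested sizes.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, buffer_bytes)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, buffer_bytes)
    if disable_nagle:
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock


if __name__ == "__main__":
    s = make_tuned_socket(256 * 1024, disable_nagle=True)
    # Report what the OS actually granted, which may differ from the request.
    print("send buffer:", s.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF))
    print("recv buffer:", s.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF))
    s.close()
```

Reading the values back matters because the granted size depends on system-wide limits, which is part of the gap that autotuning is meant to close.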

Tuning opportunities
• Parallel streams (psockets)
  • how to choose number of streams, buffer sizes?
  • autotune?
• Application routing daemons
  • indirect TCP
  • alternate path (Wolski, UCSB)
  • multipath (Rao, ORNL)
• Other protocols (SCTP, DCP)
  • Out of order delivery
  • rate-based
• Are these fair?
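One way to reason about "how to choose number of streams" is the well-known steady-state loss model for TCP, rate ≈ (MSS/RTT) · C/√p with C = √(3/2), where p is the packet loss rate. The sketch below uses it to estimate how many streams would be needed to reach a target rate. It is a back-of-the-envelope aid under strong assumptions (every stream sees the same independent loss rate and the aggregate scales linearly), not the method used by psockets or Net100, and the input numbers are illustrative.

```python
import math


def loss_limited_throughput_bps(mss_bytes: int, rtt_s: float, loss_rate: float) -> float:
    """Steady-state TCP throughput estimate: (MSS/RTT) * sqrt(3/2) / sqrt(p)."""
    c = math.sqrt(3 / 2)
    return (mss_bytes * 8 / rtt_s) * c / math.sqrt(loss_rate)


def streams_needed(target_bps: float, mss_bytes: int, rtt_s: float, loss_rate: float) -> int:
    """Streams required if each one independently achieves the estimate above."""
    per_stream = loss_limited_throughput_bps(mss_bytes, rtt_s, loss_rate)
    return max(1, math.ceil(target_bps / per_stream))


if __name__ == "__main__":
    # Assumed path: 1460-byte MSS, 70 ms round trip, one loss per 10,000 packets.
    per_stream = loss_limited_throughput_bps(1460, 0.070, 1e-4)
    print("per-stream estimate:", round(per_stream / 1e6, 1), "Mb/s")
    print("streams to fill an OC12:", streams_needed(622_080_000, 1460, 0.070, 1e-4))
```

The same model illustrates the fairness question: N parallel streams claim roughly N times the share of a single competing flow.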

Work-around Daemon (WAD)
• Version 0
  • passively collect flow data
  • tune unknowing sender/receiver
  • config file with “tuning info”?
  • Based on Web100/Linux 2.4
• To be done
  • collecting tuning info
  • adding more knobs to kernel
• Related work
  • Feng’s Dynamic Right Sizing
  • Linux 2.4 auto-tuning/caching
  • Mathis TCP buffer tuning
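The slide leaves the form of the "tuning info" open. Purely as an illustration, a daemon of this kind could map destination networks to knob settings and look up each new flow by its destination address. Everything in the sketch below is hypothetical: the table layout, the knob names, the values, and the `lookup_tuning` helper are assumptions, and the addresses come from the reserved documentation ranges.

```python
import ipaddress

# Hypothetical tuning table: destination network -> knobs to apply to a flow.
TUNING_INFO = {
    "192.0.2.0/24": {"buffer_bytes": 4_000_000, "dupthresh": 3},
    "198.51.100.0/24": {"buffer_bytes": 1_000_000, "dupthresh": 9},
}


def lookup_tuning(dest_ip: str, table=TUNING_INFO):
    """Return the knobs for the most specific network containing dest_ip, or None."""
    addr = ipaddress.ip_address(dest_ip)
    best = None
    for cidr, knobs in table.items():
        net = ipaddress.ip_network(cidr)
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, knobs)
    return best[1] if best else None


if __name__ == "__main__":
    print(lookup_tuning("192.0.2.17"))    # matches the first entry
    print(lookup_tuning("203.0.113.5"))   # no entry: flow is left untuned
```

Returning None for unknown destinations keeps the default behavior for flows the daemon has no information about.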

Network Tool Analysis Framework (NTAF)
• Configure and launch network tools
  • measure bandwidth/latency (iperf, pchar, pipechar)
  • collect passive data (SNMP from routers, OS/Web100 counters)
  • forecast bandwidth/latency for grid resource scheduling
  • augment tools to report Web100 data
• Collect and transform tool results into a common format
• Save results for short-term auto-tuning and archive for later analysis
  • compare predicted to actual performance
  • measure effectiveness of tools and auto-tuning
• Auto-tune network applications
  • WAD (WorkAround Daemon)
  • tunable TCP stack
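To make "a common format" and "compare predicted to actual performance" concrete, here is a small sketch. The `Measurement` record, its field names, and the exponentially weighted moving average used as the forecaster are all assumptions chosen for brevity; they are a stand-in, not the NTAF schema or the forecasting used by tools such as NWS.

```python
from dataclasses import dataclass


@dataclass
class Measurement:
    """Hypothetical common record for one result from any tool."""
    tool: str             # e.g. "iperf" or "pipechar"
    src: str
    dst: str
    timestamp: float      # seconds since the epoch
    bandwidth_bps: float


def ewma_forecast(samples, alpha: float = 0.5) -> float:
    """Forecast the next bandwidth value as a weighted average favoring recent samples."""
    if not samples:
        raise ValueError("need at least one measurement")
    forecast = samples[0].bandwidth_bps
    for m in samples[1:]:
        forecast = alpha * m.bandwidth_bps + (1 - alpha) * forecast
    return forecast


def relative_error(predicted_bps: float, actual_bps: float) -> float:
    """Fractional difference between a forecast and the rate actually achieved."""
    return abs(predicted_bps - actual_bps) / actual_bps


if __name__ == "__main__":
    history = [
        Measurement("iperf", "hostA", "hostB", 0.0, 100e6),
        Measurement("iperf", "hostA", "hostB", 60.0, 200e6),
        Measurement("pipechar", "hostA", "hostB", 120.0, 100e6),
    ]
    predicted = ewma_forecast(history)
    print("forecast:", predicted / 1e6, "Mb/s")
    print("error vs. a 150 Mb/s transfer:", round(relative_error(predicted, 150e6), 3))
```

Keeping results from different tools in one record type is what lets a single forecaster and a single error measure be applied across them.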

Net100 interactions
• Net100 is both a producer and consumer of network performance data
  • Active probes (Claffy Bandwidth Estimation, INCITE)
  • Passive sensors (LBL Network monitoring)
• Auto-tuning
  • TCP optimizations (Feng/LANL, Linux 2.4)
  • smart transfer (IQecho, Logistical networking)
  • non-TCP protocols (DCP, STP, SCTP, rate-based, ?)
• Net100 tuning could be applied to distributed applications
  • Climate/Probe, SuperNova, DataGrids
  • interact with Grid metaware (forecasting, scheduling, tuning)
http://www.net100.org
