Web100 and Net100 Anant Mudambi, U. Virginia
Outline • The Problem • The Web100 solution • Web100 design • Web100 components/ implementation • Net100 • Net100 design • Net100 components
THE PROBLEM • TCP’s simple, uniform, reliable data delivery service is a good fit for most applications but … • Applications unable to utilize full network bandwidth because … • TCP hides details of lower layers/ implementation so difficult to “tweak” • Overly cautious default parameter values • Implementation bugs
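To make the "overly cautious default parameter values" point concrete: a TCP sender can have at most one window of data in flight per round trip, so throughput is bounded by window size divided by RTT. The sketch below only evaluates that bound; the function name and the 64 KiB / 100 ms figures are illustrative assumptions, not values from the slides.

```python
def window_limited_throughput_bps(window_bytes: int, rtt_seconds: float) -> float:
    """Upper bound on throughput (bits/s) when one window is sent per round trip."""
    if rtt_seconds <= 0:
        raise ValueError("rtt_seconds must be positive")
    return window_bytes * 8 / rtt_seconds


if __name__ == "__main__":
    # Illustrative numbers: a 64 KiB window on a path with a 100 ms RTT.
    bound = window_limited_throughput_bps(64 * 1024, 0.1)
    print(f"{bound / 1e6:.2f} Mbit/s")
```

With these example numbers the bound is roughly 5 Mbit/s regardless of how fast the underlying link is, which is why buffer sizes matter on long-delay paths.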
PURPOSE OF WEB100 • Ordinary network users often get performance that is orders of magnitude lower than what experts achieve: the “Wizard Gap” • Web100 aims to close this gap • By making TCP’s internal parameter values accessible at user level • By providing an interface to modify settable parameters • Diagnosis and solution (to an extent)
WEB100 DESIGN • Kernel network stack modifications to collect per-connection statistics on significant protocol events : TCP instrumentation • API to access instruments; implemented as a library • Set of application tools to read/ visualize these instruments and set parameters
WEB100 DESIGN Figure taken from [1]
WEB100 COMPONENTS: Kernel Instrument Set (KIS) • Implemented on Linux kernels • Each connected TCP socket has a structure attached to collect state information (variables) and change parameters (controls) • Variables, controls exposed using the /proc interface • More than 130 variables; more than 10 controls
KIS Examples Variables: • Local address/ port, remote address/ port • TCP state, SACK/ Timestamps enabled • Packets out/ in, bytes out/ in • Sequence numbers : oldest unACKed, initial, next to be sent, next expected • Congestion signals, current cwnd/ ssthresh • # Fast retx, # timeouts, packets/ bytes retxed • Sample/ smoothed RTT, RTT variance, RTO Controls: • Socket send/ receive buffer, Max. ssthresh, additive increase (AI), multiplicative decrease (MD)
WEB100 COMPONENTS: API Library • Each TCP connection assigned a Connection ID : CID • Directory /proc/web100/<CID> for each connection • Variables collected into ‘groups’; one file for each group in /proc/web100/<CID>/ • API provides routines to read individual variables or the whole group atomically (snapshot)
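The directory layout described above can be explored with a few lines of Python. This is a minimal sketch, not the Web100 API library: it assumes that CIDs appear as numeric directory names under the root and that every file inside such a directory is a group file, and it does not try to decode the contents of those files. The function name `list_connections` is hypothetical.

```python
import os
from typing import Dict, List


def list_connections(root: str = "/proc/web100") -> Dict[int, List[str]]:
    """Map each connection ID (numeric directory name) to its sorted group-file names."""
    connections: Dict[int, List[str]] = {}
    if not os.path.isdir(root):
        return connections  # not an instrumented kernel, or a wrong path
    for entry in os.listdir(root):
        path = os.path.join(root, entry)
        # Assumption: only numeric directory names are connection IDs.
        if entry.isdigit() and os.path.isdir(path):
            try:
                connections[int(entry)] = sorted(os.listdir(path))
            except OSError:
                # The connection may have closed between the two listings.
                continue
    return connections


if __name__ == "__main__":
    for cid, groups in sorted(list_connections().items()):
        print(cid, groups)
```

Taking the root as a parameter keeps the function testable against an ordinary directory tree on a machine without a Web100 kernel.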
WEB100 COMPONENTS: Application Tools • Some tools developed to use the web100 API • Command line tools to read/ write values; useful in scripts • GUI tools to see how variables evolve over time • GUI triage tool to estimate where performance bottleneck lies: sender, receiver or network
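The idea behind a triage tool can be illustrated with a small classifier. This sketch assumes that, for one connection, we already have three measurements of how long the transfer was limited by the sending application, by the receiver's advertised window, and by the network (congestion window). The parameter names and the "largest share wins" rule are assumptions for illustration, not the logic of the actual Web100 tool.

```python
def triage(time_sender_limited: float,
           time_receiver_limited: float,
           time_network_limited: float) -> str:
    """Return the dominant bottleneck: 'sender', 'receiver', 'network' or 'unknown'."""
    shares = {
        "sender": time_sender_limited,
        "receiver": time_receiver_limited,
        "network": time_network_limited,
    }
    if sum(shares.values()) <= 0:
        return "unknown"  # nothing measured yet
    # Pick the cause that accounts for the most limited time.
    return max(shares, key=shares.get)


if __name__ == "__main__":
    print(triage(time_sender_limited=0.5,
                 time_receiver_limited=8.0,
                 time_network_limited=1.5))
```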
NET100 • Web100 – Allows user to see inside/ below the TCP layer • But to use the full network capacity, user must know what to look for and what to tune • Net100 goal = OS directly tunes network flows, so application/ user need not be network aware : Network Aware OS • Accelerate TCP over high-speed, long delay networks by reducing packet loss and speeding recovery from loss
NET100 DESIGN • Instrumented kernel : Web100 • Network probes & monitors : Network Tool Analysis Framework (NTAF) • TCP modifications and a daemon to tune TCP flows : Work Around Daemon (WAD)
INSTRUMENTED KERNEL • Net100 leverages the KIS implemented in the Web100 kernel • Web100 kernel extended to permit tuning of more than just send/ receive buffers • Slow start and congestion avoidance/ recovery schemes made tunable
NETWORK TOOL ANALYSIS FRAMEWORK • The WAD, in order to work around specific network problems, requires external information not available to regular flows • NTAF provides this tuning information for specific paths • Framework for running network test tools (enhanced to collect Web100 data), storing results in a database • Currently 5 NTAF servers deployed • Tools : ping, pipechar, iperf, netest, GridFTP etc.
WORK AROUND DAEMON • A daemon to tune TCP flows as they come up • Written in C (a Python version also exists) • Tuning controlled by a configuration file • Static tuning : parameters specified in config. file are used • Dynamic tuning : tuning uses information from the NTAF
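The difference between static and dynamic tuning can be sketched as a single decision function. The slides do not say how the WAD turns NTAF data into a buffer size; the sketch below assumes the common rule of sizing the buffer to the bandwidth-delay product of the path, and all names (`choose_buffer_bytes`, its parameters, the 64 KiB fallback) are hypothetical.

```python
from typing import Optional


def choose_buffer_bytes(static_bytes: Optional[int],
                        measured_bandwidth_bps: Optional[float],
                        measured_rtt_s: Optional[float],
                        default_bytes: int = 65536) -> int:
    """Pick a socket buffer size for a flow.

    Static tuning: a value given in the configuration wins.
    Dynamic tuning (assumed rule): bandwidth-delay product from path measurements.
    Otherwise fall back to a default.
    """
    if static_bytes is not None:
        return static_bytes
    if measured_bandwidth_bps and measured_rtt_s:
        return round(measured_bandwidth_bps * measured_rtt_s / 8)
    return default_bytes


if __name__ == "__main__":
    # Illustrative path: 100 Mbit/s with an 80 ms round-trip time.
    print(choose_buffer_bytes(None, 100e6, 0.08))
```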
WAD IMPLEMENTATION • To be able to tune TCP flows WAD needs to know when a flow starts; 2 ways: • Kernel notifies WAD of connection tear down/ establishment using a netlink socket • WAD periodically checks for new/ closed connections by walking /proc/web100 • On noticing a new connection, WAD checks in the config. file if it is tunable • If so use Web100 API to tune the flow
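The polling variant of this loop can be sketched in two pieces: comparing the set of CIDs seen on successive polls, and checking a new flow against the configured rules. This is an illustration under stated assumptions, not the WAD source: the `Flow` and `Rule` types, the use of `None` as a wildcard, and the first-match-wins policy are all hypothetical, and the actual tuning call through the Web100 API is left out.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Set, Tuple


@dataclass(frozen=True)
class Flow:
    src_addr: str
    src_port: int
    dst_addr: str
    dst_port: int


@dataclass(frozen=True)
class Rule:
    # None means "match anything" (a wildcard convention assumed here).
    src_addr: Optional[str] = None
    src_port: Optional[int] = None
    dst_addr: Optional[str] = None
    dst_port: Optional[int] = None

    def matches(self, flow: Flow) -> bool:
        pairs = [
            (self.src_addr, flow.src_addr),
            (self.src_port, flow.src_port),
            (self.dst_addr, flow.dst_addr),
            (self.dst_port, flow.dst_port),
        ]
        return all(want is None or want == have for want, have in pairs)


def diff_connections(previous: Set[int],
                     current: Iterable[int]) -> Tuple[Set[int], Set[int]]:
    """Return (new CIDs, closed CIDs) between two polls."""
    current_set = set(current)
    return current_set - previous, previous - current_set


def first_matching_rule(flow: Flow, rules: List[Rule]) -> Optional[Rule]:
    """Return the first rule that marks this flow as tunable, or None."""
    for rule in rules:
        if rule.matches(flow):
            return rule
    return None
```

A daemon built on this would call `diff_connections` once per polling interval, look up the address and port of each new CID, and tune only the flows for which `first_matching_rule` returns a rule.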
WAD CONFIGURATION • Src address/ port, Dst address/ port • Send buffer, Rcv buffer size • AI, MD factors • Max. ssthresh : used in modified slow start • Scalable TCP AI • Floyd AIMD calculation
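To show what the AI and MD factors control, here is a minimal sketch of the congestion-avoidance updates with both factors as parameters. It works in units of segments and assumes that MD is the fraction of the window removed on a loss (so standard TCP is AI = 1, MD = 0.5); the WAD's own convention may differ. The Scalable TCP function uses the commonly cited constants of 0.01 segments per ACK and a 0.125 decrease. The Floyd AIMD calculation, where both factors depend on the current window, is not shown.

```python
def on_ack(cwnd: float, ai: float = 1.0) -> float:
    """Additive increase: grow by `ai` segments per RTT, i.e. ai/cwnd per ACK."""
    return cwnd + ai / cwnd


def on_loss(cwnd: float, md: float = 0.5) -> float:
    """Multiplicative decrease: remove a fraction `md` of the window (floor of 1)."""
    return max(1.0, cwnd * (1.0 - md))


def scalable_on_ack(cwnd: float, a: float = 0.01) -> float:
    """Scalable TCP increase: a fixed increment per ACK, independent of cwnd."""
    return cwnd + a


if __name__ == "__main__":
    cwnd = 100.0
    print(on_loss(cwnd))          # standard TCP halves the window
    print(on_loss(cwnd, 0.125))   # a gentler decrease keeps most of the window
```

On a path that needs a window of thousands of segments, recovering from a halving at one segment per RTT takes thousands of round trips, which is the motivation for making these factors tunable.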
REFERENCES • [1] M. Mathis, J. Heffner, R. Reddy, “Web100: Extended TCP Instrumentation,” ACM Computer Communications Review, July 2003 • [2] B. Tierney, T. Dunigan, M. Mathis, “Net100: Developing Network-aware Operating Systems,” MICS Final Report, Sept. 2004 • [3] T. Dunigan, M. Mathis, B. Tierney, “A TCP Tuning Daemon,” SuperComputing 2002 • [4] www.web100.org • [5] www.csm.ornl.gov/~dunigan/net100/
Thank you!