Network diagnostics made easy • Matt Mathis • 3/17/2005
The Wizard Gap • The non-experts are falling behind
Year   Experts    Non-experts   Ratio
1988   1 Mb/s     300 kb/s      3:1
1991   10 Mb/s
1995   100 Mb/s
1999   1 Gb/s
2003   10 Gb/s    3 Mb/s        3000:1
2004   40 Gb/s
Why?
TCP tuning requires expert knowledge • By design TCP/IP hides the ‘net from upper layers • TCP/IP provides basic reliable data delivery • The “hour glass” between applications and networks • This is a good thing, because it allows: • Old applications to use new networks • New applications to use old networks • Invisible recovery from data loss, etc • But then (nearly) all problems have the same symptom • Less than expected performance • The details are hidden from nearly everyone
TCP tuning is really debugging • Six classes of bugs limit performance • Too small TCP retransmission or reassembly buffers • Packet losses, congestion, etc • Packets arriving out of order or even duplicated • “Scenic” IP routing or excessive round trip times • Improper packet sizes (MTU/MSS) • Inefficient or inappropriate application designs
TCP tuning is painful debugging • All problems reduce performance • But the specific symptoms are hidden • But any one problem can prevent good performance • Completely masking all other problems • Trying to fix the weakest link of an invisible chain • General tendency is to guess and “fix” random parts • Repairs are sometimes “random walks” • Repair one problem at a time at best
The Web100 project • When there is a problem, just ask TCP • TCP has the ideal vantage point • In between the application and the network • TCP already “measures” key network parameters • Round Trip Time (RTT) and available data capacity • Can add more • TCP can identify the bottleneck • Why did it stop sending data? • TCP can even adjust itself • “autotuning” eliminates one of the 6 classes of bugs • See: www.web100.org
Key Web100 components • Better instrumentation within TCP • 120 internal performance monitors • Poised to become Internet standard “MIB” • TCP Autotuning • Selects the ideal buffer sizes for TCP • Eliminates the need for user expertise • Basic network diagnostic tools • Require less expertise than prior tools • Excellent for network admins • But still not useful for end users
Web100 Status • Two-year no-cost extension • Can only push standardization after most of the work • Ongoing support of research users • Partial adoption • Current Linux includes (most of) autotuning • John Heffner is maintaining patches for the rest of Web100 • Microsoft • Experimental TCP instrumentation • Working on autotuning (to support FTTH) • IBM “z/OS Communications Server” • Experimental TCP instrumentation
The next step • Web100 tools still require too much expertise • They are not really end user tools • Too easy to overlook problems • Current diagnostic procedures are still cumbersome • New insight from Web100 experience • Nearly all symptoms scale with round trip time • New NSF funding • Network Path and Application Diagnosis • 3 years, we are at the midpoint
Nearly all symptoms scale with RTT • For example • TCP Buffer Space, Network loss and reordering, etc • On a short path TCP can compensate for the flaw • Local Client to Server: all applications work • Including all standard diagnostics • Remote Client to Server: all applications fail • Leading to faulty implication of other components
Examples of flaws that scale • Chatty application (e.g., 50 transactions per request) • On 1ms LAN, this adds 50ms to user response time • On 100ms WAN, this adds 5s to user response time • Fixed TCP socket buffer space (e.g., 32kBytes) • On a 1ms LAN, limit throughput to 200Mb/s • On a 100ms WAN, limit throughput to 2Mb/s • Packet Loss (e.g., 1% loss with 9kB packets) • On a 1ms LAN, models predict 500 Mb/s • On a 100ms WAN, models predict 5 Mb/s
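The three examples above can be checked with a few lines of arithmetic. This is only a sketch: the function names are made up for illustration, the loss model is the familiar (MSS/RTT) * C / sqrt(p) form, and the constant C = 0.7 is an assumption rather than something stated on the slide. With 32 kBytes taken as 32,000 bytes the buffer limit comes out near 256 Mb/s and 2.56 Mb/s, and the loss limit near 504 Mb/s and 5.04 Mb/s, so the slide figures appear to be rounded.

```python
import math


def chatty_delay_s(transactions, rtt_s):
    """Added response time when every transaction costs one round trip."""
    return transactions * rtt_s


def window_limited_bps(buffer_bytes, rtt_s):
    """At most one socket buffer of data can be in flight per round trip."""
    return buffer_bytes * 8 / rtt_s


def loss_limited_bps(mss_bytes, rtt_s, loss_rate, c=0.7):
    """Loss-limited rate (MSS/RTT) * C / sqrt(p); C = 0.7 is an assumption."""
    return (mss_bytes * 8 / rtt_s) * c / math.sqrt(loss_rate)


for label, rtt_s in (("1 ms LAN", 0.001), ("100 ms WAN", 0.100)):
    delay = chatty_delay_s(50, rtt_s)
    buf = window_limited_bps(32_000, rtt_s) / 1e6
    loss = loss_limited_bps(9_000, rtt_s, 0.01) / 1e6
    print(f"{label}: chatty delay {delay:.2f} s, "
          f"32 kB buffer {buf:.2f} Mb/s, 1% loss {loss:.2f} Mb/s")
```

In every case the same flaw costs 100 times more on the path with 100 times the round trip time.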
Review • For nearly all network flaws • The only symptom is reduced performance • But the reduction is scaled by RTT • On short paths many flaws are undetectable • False pass for even the best conventional diagnostics • Leads to faulty inductive reasoning about flaw locations • This is the essence of the “end-to-end” problem • Current state-of-the-art relies on tomography and complicated inference techniques
Our new technique • Specify target performance from the server (S) to the remote client (RC) • Measure the performance from S to a local client (LC) • Use Web100 to collect detailed statistics • Loss, delay, queuing properties, etc • Use models to extrapolate results to RC • Assume that the rest of the path is ideal • Pass/Fail on the basis of extrapolated performance
Example diagnostic output
End-to-end goal: 4 Mb/s over a 200 ms path including this section
Tester at IP address: xxx.xxx.115.170 Target at IP address: xxx.xxx.247.109
Warning: TCP connection is not using SACK
Fail: Received window scale is 0, it should be 2.
Diagnosis: TCP on the test target is not properly configured for this path.
> See TCP tuning instructions at http://www.psc.edu/networking/perf_tune.html
Pass data rate check: maximum data rate was 4.784178 Mb/s
Fail: loss event rate: 0.025248% (3960 pkts between loss events)
Diagnosis: there is too much background (non-congested) packet loss.
The events averaged 1.750000 losses each, for a total loss rate of 0.0441836%
FYI: To get 4 Mb/s with a 1448 byte MSS on a 200 ms path the total end-to-end loss budget is 0.010274% (9733 pkts between losses).
Warning: could not measure queue length due to previously reported bottlenecks
Diagnosis: there is a bottleneck in the tester itself or test target (e.g insufficient buffer space or too much CPU load)
> Correct previously identified TCP configuration problems
> Localize all path problems by testing progressively smaller sections of the full path.
FYI: This path may pass with a less strenuous application:
Try rate=4 Mb/s, rtt=106 ms
Or if you can raise the MTU:
Try rate=4 Mb/s, rtt=662 ms, mtu=9000
Some events in this run were not completely diagnosed.
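The loss budget and the loss-rate failure in this output can be reproduced with a simple model. This is a sketch, not the tool's actual code: it assumes the (MSS/RTT) * C / sqrt(p) loss model with C = 0.7, a value chosen here because it happens to reproduce the reported 0.010274% budget. The function names are hypothetical.

```python
import math

C = 0.7  # assumed model constant; not taken from the diagnostic tool itself


def loss_budget(rate_bps, rtt_s, mss_bytes, c=C):
    """Largest loss rate that still allows rate_bps over a path with rtt_s."""
    window_pkts = rate_bps * rtt_s / (mss_bytes * 8)  # packets in flight
    return (c / window_pkts) ** 2


def extrapolated_bps(loss_rate, rtt_s, mss_bytes, c=C):
    """Rate the model predicts if this loss rate is seen over the full RTT."""
    return (mss_bytes * 8 / rtt_s) * c / math.sqrt(loss_rate)


# Goal from the output above: 4 Mb/s, 200 ms path, 1448 byte MSS.
budget = loss_budget(4e6, 0.200, 1448)

# Loss event rate measured on the tested section (0.025248%),
# extrapolated to the full 200 ms path with the rest assumed ideal.
measured = 0.025248 / 100
predicted = extrapolated_bps(measured, 0.200, 1448)
verdict = "Pass" if predicted >= 4e6 else "Fail"

print(f"loss budget: {budget * 100:.6f}% ({int(1 / budget)} pkts between losses)")
print(f"predicted rate on the full path: {predicted / 1e6:.2f} Mb/s -> {verdict}")
```

Under these assumptions the measured loss event rate is more than twice the budget, so the section fails for a 200 ms path even though it delivered more than 4 Mb/s on the short test path.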
Key features • Results are specific and less technical • Provides a list of action items to be corrected • Provides enough detail for escalation • Eliminates false pass test results • Test becomes more sensitive on shorter paths • Conventional diagnostics become less sensitive • Depending on models, perhaps too sensitive • New problem is false fail • Flaws no longer mask other flaws • A single test often detects several flaws • They can be repaired in parallel
Some demos
wget http://www.psc.edu/~mathis/src/diagnostic-client.c
cc diagnostic-client.c -o diagnostic-client
./diagnostic-client kirana.psc.edu 70 90
Local server information • Current servers are single threaded • Silent wait if busy • kirana.psc.edu • GigE attached directly to 3ROX • Outside the PSC firewall • Optimistic results to .61., .58. and .59. subnets • scrubber.psc.edu • GigE attached in WEC • Interfaces on .65. and .66. subnets • Can be run on other Web100 systems • E.g. Application Gateways
The future • Collect (local) network pathologies • Raghu Reddy is coordinating • Keep archived data to improve the tool • Harden the diagnostic server • Widen testers to include attached campuses • 3ROX (3 Rivers Exchange) customers • CMU, Pitt, PSU, etc • Expect to find many more “interesting” pathologies • Replicate server at NCAR (FRGP) for their campuses
Related work • Also looking at finding flaws in applications • An entirely different set of techniques • But symptom scaling still applies • Provide LAN tools to emulate ideal long paths • Support local bench testing • For example classic ssh • Long known performance problems • Recently diagnosed to be due to internal flow control • Chris Rapier developed a patch • Already running on many PSC systems • See: http://www.psc.edu/networking/projects/hpn-ssh/