Pathdiag: Automatic TCP Diagnosis Matt Mathis (PSC) John Heffner (PSC/Rinera) Peter O'Neil (NCAR/Mid-Atlantic Crossroads) Pete Siempsen (NCAR) 30 April 2008 http://staff.psc.edu/mathis/papers/PAM20080430.ppt
Outline • What is the problem? • The pathdiag solution • Details • The bigger problem
What is the problem? Internet2 weekly traffic statistics – the median bulk transfer rate is only about 3 Mb/s!
Why is end-to-end performance difficult? • By design, TCP/IP hides the ‘net from upper layers • TCP/IP provides basic reliable data delivery • The “hourglass” between applications and networks • This is a good thing, because it allows: • Invisible recovery from data loss, etc. • Old applications to use new networks • New applications to use old networks • But then (nearly) all problems have the same symptom • Less than expected performance • The details are hidden from nearly everyone
TCP tuning is painful debugging • All problems reduce performance • But the specific symptoms are hidden • Any one problem can prevent good performance • Completely masking all other problems • Trying to fix the weakest link of an invisible chain • The general tendency is to guess and “fix” random parts • Repairs are sometimes “random walks” • Repairing one problem at a time is the best case • The solution is to instrument TCP
The Web100 project • Use TCP's ideal diagnostic vantage point • Instrument TCP: what is limiting the data rate? • RFC 4898 TCP-ESTATS-MIB • Standards track • Prototypes for Linux (www.Web100.org) and Windows Vista • Fix TCP's part of the problem: autotuning • Automatically adjusts TCP socket buffers (a minimal check of the Linux settings is sketched below) • Linux 2.6.17 default maximum window size is 4 MBytes • Microsoft Vista default maximum window size is 8 MBytes • (Except IE) • Web100 is done • But still under limited support
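As a concrete companion to the autotuning bullet above, here is a minimal sketch (in Python, assuming a Linux host) that reads the standard kernel settings governing buffer autotuning and the maximum window size. It is not part of Web100 or pathdiag; the /proc paths are standard Linux TCP sysctls, but the values shown depend on the local kernel.

```python
# Minimal sketch: inspect the Linux TCP autotuning parameters mentioned above.
# These /proc paths are standard on Linux 2.6 kernels; values depend on the host.

def read_sysctl(path):
    """Return the contents of a /proc/sys entry as a list of integers."""
    with open(path) as f:
        return [int(v) for v in f.read().split()]

# tcp_moderate_rcvbuf = 1 means receive-buffer autotuning is enabled.
autotuning = read_sysctl("/proc/sys/net/ipv4/tcp_moderate_rcvbuf")[0]

# tcp_rmem / tcp_wmem are (min, default, max) buffer sizes in bytes;
# the max value bounds the largest window autotuning can reach.
rmem_min, rmem_default, rmem_max = read_sysctl("/proc/sys/net/ipv4/tcp_rmem")
wmem_min, wmem_default, wmem_max = read_sysctl("/proc/sys/net/ipv4/tcp_wmem")

print(f"receive autotuning enabled: {bool(autotuning)}")
print(f"max receive window: {rmem_max / 2**20:.1f} MBytes")
print(f"max send window:    {wmem_max / 2**20:.1f} MBytes")
```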
New insight: symptoms scale with RTT • Example flaws: • TCP buffer space: data rate ≤ Window / RTT • Packet loss: data rate ≈ (MSS / RTT) × (C / √p), for loss rate p • Think: RTT in the denominator converts “rounds” to elapsed time • A worked example follows below
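A small worked example of the two relations above. The segment size, the constant C ≈ 0.9 in the loss formula, and the sample flaws (a 64 KB window, 0.01% loss) are illustrative assumptions, not numbers from the talk.

```python
from math import sqrt

MSS = 1460          # bytes per segment (typical Ethernet MSS)
C = 0.9             # constant in the loss-rate formula (illustrative)

def buffer_limited_rate(window_bytes, rtt):
    """Rate limited by a fixed window: Rate = Window / RTT (bytes/s)."""
    return window_bytes / rtt

def loss_limited_rate(loss_prob, rtt):
    """Loss-limited rate: Rate ~ (MSS / RTT) * (C / sqrt(p)) (bytes/s)."""
    return (MSS / rtt) * (C / sqrt(loss_prob))

# The same flaws (a 64 KB window, 0.01% loss) on two different paths:
for rtt_ms in (1, 50):            # 1 ms local path vs 50 ms wide-area path
    rtt = rtt_ms / 1000.0
    buf = buffer_limited_rate(64 * 1024, rtt) * 8 / 1e6
    loss = loss_limited_rate(1e-4, rtt) * 8 / 1e6
    print(f"RTT {rtt_ms:3d} ms: 64 KB window -> {buf:8.1f} Mb/s, "
          f"0.01% loss -> {loss:8.1f} Mb/s")
```

On the 1 ms local path both flaws still permit several hundred Mb/s or more (a false pass), while on the 50 ms wide-area path the very same flaws cap throughput at roughly 10–20 Mb/s.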
Symptom scaling breaks diagnostics • Local Client to Server • Flaw has insignificant symptoms • All applications work, including all standard diagnostics • False pass for all diagnostic tests • Remote Client to Server: all applications fail • Leading to faulty implication of other components • It seems that the flaws are in the wide area network
The confounded problems • For nearly all network flaws • The only symptom is reduced performance • But the reduction is scaled by RTT • Therefore, flaws are undetectable on short paths • False pass for even the best conventional diagnostics • Leads to faulty inductive reasoning about flaw locations • Diagnosis often relies on tomography and complicated inference techniques • This is the real end-to-end performance problem
Goals • We want to automate debugging for “the masses” • But start with low hanging fruit • Who are the users? Assume: • Analytic (e.g. Non-network scientists) • Not afraid of math or measurements • Known data sources • Primary data direction is towards the users • That they have systems and network support • Only need to do first level diagnosis
More Goals • Automatic • “One click” in a web browser • Diagnose first-level problems • Easily expose all path bottlenecks that limit performance to less than 10 MByte/s • Easily expose all end-system/OS problems that limit performance to less than 10 MByte/s • Will become moot as autotuning is deployed • Empower the users to apply the proper motivation • Results need to be accurate, well explained, and common to both users and sys/net admins
The pathdiag solution • Test a short section of the path • Most often the first or last mile • Use Web100 to collect detailed TCP statistics • Loss, delay, queuing properties, etc. • Use models to extrapolate results to the full path • Assume that the rest of the path is ideal • The user has to specify the end-to-end goal • Data rate and RTT • Pass/Fail on the basis of the extrapolated performance • A simplified sketch of the extrapolation follows below
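A deliberately simplified sketch of the extrapolation and pass/fail step, not the actual pathdiag model: take the loss rate measured on the short test section, assume the rest of the path is ideal, project the loss-limited rate onto the user's full-path target RTT, and compare it with the target data rate. The function names and the example numbers are hypothetical.

```python
from math import sqrt

MSS = 1460   # bytes; typical segment size
C = 0.9      # constant in the loss-rate model (illustrative)

def extrapolate_rate(measured_loss, target_rtt_s):
    """Project the loss-limited rate onto the full-path (target) RTT,
    assuming the untested remainder of the path is ideal. Returns bits/s."""
    if measured_loss == 0:
        return float("inf")          # no loss observed on the test section
    return (MSS / target_rtt_s) * (C / sqrt(measured_loss)) * 8

def pathdiag_verdict(measured_loss, target_rtt_ms, target_rate_mbps):
    """Pass/fail on the basis of the extrapolated performance."""
    projected = extrapolate_rate(measured_loss, target_rtt_ms / 1000.0) / 1e6
    verdict = "PASS" if projected >= target_rate_mbps else "FAIL"
    return verdict, projected

# Example: the user wants 100 Mb/s over a 40 ms path;
# the local test section shows 0.005% loss.
print(pathdiag_verdict(measured_loss=5e-5, target_rtt_ms=40, target_rate_mbps=100))
```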
Deploy as a Diagnostic Server • Use pathdiag in a Diagnostic Server (DS) • Specify end-to-end target performance • From server (S) to client (C) (RTT and data rate) • Measure the performance from DS to C • Use Web100 in the DS to collect detailed statistics • On both the path and the client • Extrapolate performance assuming an ideal backbone • Pass/Fail on the basis of extrapolated performance
Demo • Click here for a live server
Key NPAD/pathdiag features • Results are intended to be self-explanatory • Provides a list of specific items to be corrected • Failed tests are show-stoppers for fast applications • Includes explanations and tutorial information • Clear differentiation between client and path problems • Accurate escalation to network or system admins • The reports are public and can be viewed by either • Coverage for a majority of OS and last-mile network flaws • Coverage is one-way – need to reverse client and server • Does not test the application – need application tools • Does not check routing – need traceroute • Does not check for middleboxes (NATs etc.) • Eliminates nearly all(?) false pass results
More features • Tests become more sensitive as the path gets shorter • Conventional diagnostics become less sensitive • Depending on models, perhaps too sensitive • The new problem is false fail (e.g. queue space tests) • Flaws no longer completely mask other flaws • A single test often detects several flaws • E.g. can find both OS and network flaws in the same run • They can be repaired concurrently • Archived DS results include raw Web100 data [Sample] • Can reprocess with updated reporting SW • New reports from old data • Critical feedback for the NPAD project • We really want to collect “interesting” failures
Under the covers • Same base algorithm as “Windowed Ping” [Mathis, INET’94] • Aka “mping” • See http://www.psc.edu/~mathis/wping/ • Killer diagnostic in use at PSC in the early 90s • Stopped being useful with the advent of “fast path” routers • Use a simple fixed-window protocol • Scan the window size in 1-second steps • Pathdiag clamps cwnd to control the TCP window • Varies step size – fine steps near interesting features • Measure data rate, loss rate, RTT, etc. as the window changes • Reports reflect key features of the measured data • A sketch of this scan loop is shown below
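A rough sketch of the windowed-scan loop described above: clamp the window, drive traffic for about one second per step, record rate/loss/RTT, and take finer steps near interesting features. The helper functions here are placeholders that simulate an ideal 10 ms, loss-free path; the real tool drives them from Web100 instrumentation.

```python
# Minimal sketch of the windowed-scan measurement loop described above.
# The three helpers are placeholders standing in for the real pathdiag/Web100
# instrumentation; here they just simulate an ideal 10 ms, loss-free path.

def clamp_cwnd(window_bytes):
    """Placeholder: in pathdiag this pins TCP's congestion window via Web100."""
    pass

def run_step(window_bytes, seconds=1):
    """Placeholder measurement step: returns (data rate B/s, loss rate, RTT s)."""
    rtt = 0.010
    return window_bytes / rtt, 0.0, rtt

def looks_interesting(results):
    """Placeholder: true when the most recent step shows rising loss."""
    return len(results) >= 2 and results[-1][2] > results[-2][2]

def scan_path(max_window, coarse_step, fine_step):
    """Scan the window size in ~1-second steps, recording rate/loss/RTT,
    and take finer steps near interesting features."""
    results, window = [], coarse_step
    while window <= max_window:
        clamp_cwnd(window)
        rate, loss, rtt = run_step(window, seconds=1)
        results.append((window, rate, loss, rtt))
        window += fine_step if looks_interesting(results) else coarse_step
    return results

for w, rate, loss, rtt in scan_path(256 * 1024, 32 * 1024, 8 * 1024):
    print(f"window {w // 1024:4d} KB: {rate * 8 / 1e6:7.1f} Mb/s, loss {loss:.4f}")
```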
The Bigger Picture • Download and Install • http://www.psc.edu/networking/projects/pathdiag/ • The hardest part is building a Linux kernel • Beyond end-of-funding, still under limited support • Barriers to adoption • User expectations • Our language • Network administrators
Need to recalibrate user expectations • Long history of very poor network performance • Users do not know what to expect • Users have become completely numb • Users have no clue about how poorly they are doing • Because TCP/IP hides the network all too well • We need to re-educate R&E users: • Less than 1/2 gigabyte per minute is not high speed • Everyone should be able to reach this rate • People who can’t should know why or be angry
Language problems • Nobody except network geeks uses bits/second • BTW, on the last slide: • 1/2 gigabyte/minute is about • 10 MByte/s or • 80 Mb/s • 17-year-old LAN technology (FDDI) • Nothing slower should be considered “High Speed”
Campus network administrators • Generally very underfunded, and know it • Can't support all users equally • Don't want users to compare results • Don't want to enable accurate user complaints • Don't want pathdiag • Workaround: deploy “upstream”
Closing • Satisfied our immediate technical goals • The bigger problem still requires a lot more work
What about the impact of the test traffic? • The pathdiag server is single-threaded • Only one test at a time • Same load as any well-tuned TCP application • Protected by TCP “fairness” • Large flows are generally “softer” than small flows • Large flows are easily disturbed by small flows • Note that any short-RTT flow is stiffer than a long-RTT flow
NPAD/pathdiag deployment • Why should a campus networking organization care? • “Zero effort” solution to mis-tuned end-systems • Accurate reports of real problems • You have the same view as the user • Saves time when there really is a problem • You can document reality for management • Suggestion: • Require pathdiag reports for all performance problems
Download and install • User documentation: http://www.psc.edu/networking/projects/pathdiag/ • Follow the link to “Installing a Server” • Easily customized with a site-specific skin • Designed to be easily upgraded with new releases • Roughly every 2 months • Improving reports through ongoing field experience • Drops into existing NDT servers • Plans for future integration • Enjoy!
The Wizard Gap Updated • Experts have topped out end systems & links • 10 Gb/s NIC bottleneck • 40 Gb/s “link” bandwidth (striped) • Median I2 bulk rate is 3 Mbit/s • See http://netflow.internet2.edu/weekly/ • Current gap is about 3000:1 (10 Gb/s expert rate vs. 3 Mb/s typical rate) • Closing the first factor of 30 should now be “easy”
Pathdiag • Initial version aimed at “NSF domain scientists” • People with a non-networking analytical background • Report designed to • Accurately identify the failing subsystem • Provide tutorial information • Provide good escalation to network or host admins • Support the user as the ultimate judge of success • Future plan to split reports • Even easier for non-experts • Better information for experts
Pathdiag • One-click automatic performance diagnosis • Designed for (non-expert) end users • Future versions will better support both expert and non-expert users • Accurate end-system and last-mile diagnosis • Eliminate most false pass results • Accurate distinction between host and path flaws • Accurate and specific identification of most flaws • Basic networking tutorial info • Help the end user understand the problem • Help train 1st-tier support (sysadmin or netadmin) • Backup documentation for support escalation • Empower the user to get it fixed • The same reports for users and admins