Tools Overview Richard Carlson January 30, 2007
Basic Premise • Application performance should meet your expectations! • If it doesn’t, you should complain!
Underlying Assumption • When problems exist, it’s the network’s fault!
Simple Network Picture Bob’s Host Network Infrastructure Carol’s Host
Network Infrastructure [Diagram: routers R1 through R9 and Switch 1 through Switch 4]
Tools, Tools, Tools • AMP • Advisor • Thrulay • Web100 • MonALISA • pathchar • NPAD • Pathdiag • Surveyor • Ethereal • CoralReef • MRTG • Skitter • Cflowd • Cricket • Net100 • Ping • Traceroute • Iperf • Tcpdump • Tcptrace • BWCTL • NDT • OWAMP
Basic Connectivity Tests • Ping • Confirms that remote host is ‘up’ • Some network operators block these packets • Traceroute • Identifies the routers along the path • Same blocking problem as above • Routers handle traceroute packets at lower priority
“ping” results • Intro message • Identifies remote host name and IP address • States size of packets being sent • Setting larger sizes may reveal hidden problems • Output lines • Who responded, and the RTT, maybe other details • Summary lines • Number of packets sent/received/lost • RTT statistics min/average/max • Note: 1 msec RTT ≈ 50 miles between hosts
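The rule of thumb in the note above can be turned into a quick estimate. This is a minimal sketch: the 50 miles per msec factor is the slide's approximation, and the function name and sample RTT values are hypothetical.

```python
# Rule of thumb from the slide: 1 msec of RTT is roughly 50 miles between hosts.
# This is an approximation, not a measured constant.
MILES_PER_MSEC_RTT = 50.0


def estimate_distance_miles(rtt_msec: float) -> float:
    """Rough host-to-host distance implied by a round-trip time."""
    if rtt_msec < 0:
        raise ValueError("RTT cannot be negative")
    return rtt_msec * MILES_PER_MSEC_RTT


if __name__ == "__main__":
    for rtt in (1, 22, 70):  # sample RTT values, chosen for illustration
        print(f"{rtt} msec RTT ~ {estimate_distance_miles(rtt):.0f} miles")
```

Queuing and indirect routing add delay, so the result is best read as an upper bound on distance.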
“traceroute” results • Intro messages • Name and address of remote host • Maximum number of hops before giving up • Status messages • One line per router in path • ‘*’ indicates router didn’t respond • Routers usually rate limit replies • No name indicates DNS entry is missing • Hops required to reach remote host or max number from above
Advanced user tools • Existing NDT tool • Allows users to test network path for a limited number of common problems • Existing NPAD tool • Allows users to test local network infrastructure while simulating a long path
Network Diagnostic Tool (NDT) • Measure performance to the user’s desktop • Identify real problems for real users • Network infrastructure is the problem • Host tuning issues are the problem • Make tool simple to use and understand • Make tool useful for users and network administrators
NDT user interface • Web-based Java applet allows testing from any browser • Command-line client allows testing from remote login shell
Finding Results of Interest • Duplex Mismatch • This is a serious error and nothing will work right. Reported on the main page, on the Statistics page, and as the mismatch: value on the More Details page • Packet Arrival Order • Inferred value based on TCP operation. Reported on the Statistics page (with loss statistics) and as the order: value on the More Details page
Finding Results of Interest • Packet Loss Rates • Calculated value based on TCP operation. Reported on the Statistics page (with out-of-order statistics) and as the loss: value on the More Details page • Path Bottleneck Capacity • Measured value based on TCP operation. Reported on the main page
Finding a Server • What? You don’t have one running at your site? • Install the Internet2 Network Performance Toolkit Knoppix Disk
NPAD/pathdiag • A new tool from researchers at Pittsburgh Supercomputing Center • Finds problems that affect long network paths • Uses Web100-enhanced Linux-based server • Web-based Java client
Long Path Problem • E2E application performance is dependent on distance between hosts • Full size frame time at 100 Mbps • Frame = 1500 Bytes • Time = 0.12 msec • In flight for 1 msec RTT = 8 packets • In flight for 70 msec RTT = 583 packets
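The in-flight figures above follow from dividing the RTT by the time one frame occupies the wire. A minimal sketch of that arithmetic, with hypothetical function names and the slide's 1500-byte frame and 100 Mbps link as defaults:

```python
def frame_time_msec(frame_bytes: int, link_bps: float) -> float:
    """Time to serialize one frame onto the link, in milliseconds."""
    return frame_bytes * 8 / link_bps * 1000.0


def packets_in_flight(rtt_msec: float, frame_bytes: int = 1500,
                      link_bps: float = 100e6) -> int:
    """Whole frames that fit into one round-trip time at line rate."""
    return int(rtt_msec / frame_time_msec(frame_bytes, link_bps))


if __name__ == "__main__":
    print(f"frame time: {frame_time_msec(1500, 100e6):.2f} msec")
    for rtt in (1, 70):
        print(f"RTT {rtt} msec -> {packets_in_flight(rtt)} packets in flight")
```

The same link needs roughly 70 times more unacknowledged data in the network to stay full on the 70 msec path.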
Long Path Problem [Diagram: routers R1 through R9 and Switch 1 through Switch 4, with hosts H1, H2, and H3; H1 – H2 is 1 msec, H1 – H3 is 70 msec; one point in the network is marked with an X]
TCP Congestion Avoidance • On packet loss, cut number of packets in flight by ½ • Then increase by 1 packet per RTT • LAN (RTT = 1 msec) • In flight changes to 4 packets • Time to increase back to 8 is 4 msec • WAN (RTT = 70 msec) • In flight changes to 292 packets • Time to increase back to 583 is 20.4 seconds
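The recovery times above can be reproduced with a few lines. This is a sketch of the simplified halve-then-add-one model on the slide, not of a real TCP stack; the function name is hypothetical, and rounding the halved window up is an assumption made to match the slide's 292.

```python
import math
from typing import Tuple


def congestion_recovery(window_packets: int, rtt_msec: float) -> Tuple[int, float]:
    """Return (window after a loss, seconds to grow back to the original window).

    Model: one loss halves the window, then it grows by 1 packet per RTT.
    """
    halved = math.ceil(window_packets / 2)   # rounded up, as on the slide
    rtts_needed = window_packets - halved    # one packet regained per RTT
    return halved, rtts_needed * rtt_msec / 1000.0


if __name__ == "__main__":
    for name, window, rtt in (("LAN", 8, 1), ("WAN", 583, 70)):
        halved, seconds = congestion_recovery(window, rtt)
        print(f"{name}: window drops to {halved}, recovers in {seconds:.3f} s")
```

A single loss costs the LAN flow a few milliseconds and the WAN flow about 20 seconds, which is why small loss rates hurt long paths so much.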
NPAD results - 1 Data rate test: Pass! Pass data rate check: maximum data rate was 8.969226 Mb/s • This is the maximum data rate the path actually achieved
NPAD results - 2 Loss rate test: Pass! Pass: measured loss rate 0.035214% (2839 packets between loss events). • Loss caused by TCP overdriving the path FYI: To get 7 Mb/s with a 1460 byte MSS on a 22 ms path the total end-to-end loss budget is 0.282486% (354 packets between losses). • Worst case scenario for packet loss
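The loss budget quoted above can be reproduced with a simple TCP throughput model. This is a sketch under stated assumptions: the slide does not say how the tool computes the budget, so the model form (rate = MSS/RTT × C/√p) and the constant C = 0.7 are assumptions, chosen because they yield the quoted 354 packets. All function names are hypothetical.

```python
import math

# Assumed model constant; not stated on the slide. It is used here only
# because it reproduces the quoted figure of 354 packets between losses.
MODEL_C = 0.7


def window_packets(rate_bps: float, rtt_sec: float, mss_bytes: int) -> float:
    """Packets that must be in flight to sustain rate_bps over rtt_sec."""
    return rate_bps * rtt_sec / (mss_bytes * 8)


def packets_between_losses(rate_bps: float, rtt_sec: float, mss_bytes: int,
                           c: float = MODEL_C) -> int:
    """Minimum loss-free run needed: from W = C / sqrt(p), 1/p = (W / C)^2."""
    w = window_packets(rate_bps, rtt_sec, mss_bytes)
    return math.floor((w / c) ** 2)


def loss_budget_percent(rate_bps: float, rtt_sec: float, mss_bytes: int,
                        c: float = MODEL_C) -> float:
    """Largest tolerable loss rate, expressed as a percentage."""
    return 100.0 / packets_between_losses(rate_bps, rtt_sec, mss_bytes, c)


if __name__ == "__main__":
    n = packets_between_losses(7e6, 0.022, 1460)
    pct = loss_budget_percent(7e6, 0.022, 1460)
    print(f"loss budget: {pct:.6f}% ({n} packets between losses)")
```

Under this model the budget shrinks with the square of the RTT, so doubling the path length cuts the tolerable loss rate to a quarter.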
NPAD results - 3 Suggestions for alternate tests FYI: This path may even pass with a more strenuous application: Try rate=7 Mb/s, rtt=62 ms Try rate=8 Mb/s, rtt=48 ms Or if you can raise the MTU: Try rate=7 Mb/s, rtt=383 ms, mtu=9000 bytes Try rate=8 Mb/s, rtt=299 ms, mtu=9000 bytes • Helpful hints: you might do better or go farther at this rate
NPAD results - 4 Network buffering test: Pass! Pass: The network bottleneck has sufficient buffering (queue space) in routers and switches. Measured queue size, Pkts: 36 Bytes: 52560 This corresponds to a 48.333600 ms drain time. To get 7 Mb/s with on a 22 ms path, you need 19250 bytes of buffer space. • Reports on the network devices in the path. A cheap switch would show up as something with small buffers • ‘[?]’ character indicates clickable icon for more details
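The buffer figures in this report are consistent with a bandwidth-delay product calculation. This is a sketch, assuming the required buffer is target rate × RTT and that the 36 queued packets each carry the 1460-byte MSS from the loss report; neither assumption is stated by the tool, and the function names are hypothetical.

```python
def required_buffer_bytes(rate_bps: float, rtt_sec: float) -> int:
    """Bandwidth-delay product in bytes: data in flight during one RTT."""
    return round(rate_bps * rtt_sec / 8)


def queue_bytes(packets: int, payload_bytes: int = 1460) -> int:
    """Queue size in bytes for a measured queue depth in packets."""
    return packets * payload_bytes


if __name__ == "__main__":
    needed = required_buffer_bytes(7e6, 0.022)
    measured = queue_bytes(36)
    verdict = "Pass" if measured >= needed else "Fail"
    print(f"needed {needed} bytes, measured {measured} bytes: {verdict}")
```

A switch with only a few kilobytes of queue space would fall below the required figure and fail this check.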
Network Admin Tools • BWCTL – Bandwidth Test Controller • Allows single person operation over wide area testing environment • Runs NLANR ‘iperf’ program • OWAMP – One-Way Active Measurement Protocol (one-way delay measurement) • Advanced ‘ping’ command • Allows single person operation over wide area testing environment
BWCTL Highlights • You must pre-install BWCTL software package • New Internet2 default permits basic TCP test from any member • Sites can restrict access to ‘known’ remote users
Using BWCTL: commands • bwctl -L90 -i2 -t20 -c bwctl.losa.net.internet2.edu • bwctl -L90 -i2 -t20 -s bwctl.newy.net.internet2.edu • bwctl = name of program • -L90 = wait up to 90 seconds for a test • -i2 = report intermediate results every 2 seconds • -t20 = run test for 20 seconds • -s name = remote end will send data to you • -c name = you will send data to the remote host
Third-party testing: command • bwctl -L90 -i2 -t20 -c bwctl.salt.net.internet2.edu -s bwctl.atla.net.internet2.edu • User can run a test between two remote hosts
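For scripted testing, the bwctl command lines shown on these slides can be assembled programmatically. This is a sketch that only builds the argument list from the flags the slides describe (-L, -i, -t, -c, -s). The helper name and its parameters are hypothetical, and it does not run bwctl or check that it is installed.

```python
from typing import List, Optional


def build_bwctl_command(receiver: Optional[str] = None,
                        sender: Optional[str] = None,
                        wait_sec: int = 90,
                        interval_sec: int = 2,
                        duration_sec: int = 20) -> List[str]:
    """Build a bwctl argument list using only the flags shown on the slides.

    receiver -> "-c name": that host receives the data
    sender   -> "-s name": that host sends the data
    Giving both requests a test between two remote hosts.
    """
    if receiver is None and sender is None:
        raise ValueError("give at least one of receiver or sender")
    cmd = ["bwctl", f"-L{wait_sec}", f"-i{interval_sec}", f"-t{duration_sec}"]
    if receiver is not None:
        cmd += ["-c", receiver]
    if sender is not None:
        cmd += ["-s", sender]
    return cmd


if __name__ == "__main__":
    print(" ".join(build_bwctl_command(receiver="bwctl.losa.net.internet2.edu")))
    print(" ".join(build_bwctl_command(receiver="bwctl.salt.net.internet2.edu",
                                       sender="bwctl.atla.net.internet2.edu")))
```

The returned list could be passed to `subprocess.run` on a host where bwctl is installed and the remote servers permit the test.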
OWAMP Highlights • You must pre-install OWAMP software package • User program is called ‘owping’ • New Internet2 default permits basic test from any member • Sites can restrict access to ‘known’ remote users
Using OWPING • owping owamp.salt.net.internet2.edu • owping = program name • owamp… = name of server • Output results • Separate statistics for both directions • Number of packets sent and lost • One-way delay statistics min/median/max • Number of IP hops in path • Number of packets that arrive out of order
Under Development • Emerging perfSONAR tool • Allows users to retrieve network path data from major national and international REN networks
perfSONAR – Next Steps in Performance Monitoring • New initiative involving multiple partners • ESnet (DOE labs) • GEANT (European Research and Education network) • Internet2 (Abilene and connectors)
perfSONAR Services • Measurement Archive (MA) • Measurement Point (MP) • Lookup Service (LS) • Topology Service (TS) • Authentication Service (AS)
perfSONAR – Router stats on a path • Demo ESnet tool https://performance.es.net/cgi-bin/perfsonar-trace.cgi • Paste output from traceroute into the window and view the MRTG graphs for the routers in the path • Author: Joe Metzger, ESnet
Google it! • Enter “tuning tcp” into the Google search engine. • Top 2 hits are: http://www.psc.edu/networking/perf_tune.html http://www-didc.lbl.gov/TCP-tuning/TCP-tuning.html