Internet Monitoring - Results

Les Cottrell, SLAC <cottrell@slac.stanford.edu>. Presented at the ICFA Meeting, CERN, Mar 1998. Partially funded by MICS joint SLAC/LBL proposal on Internet End-to-end Performance Monitoring (IEPM).

Presentation Transcript


  1. Internet Monitoring - Results Les Cottrell SLAC <cottrell@slac.stanford.edu> Presented at the ICFA Meeting, CERN, Mar 1998 Partially funded by MICS joint SLAC/LBL proposal on Internet End-to-end Performance Monitoring (IEPM) \\pcbackup\users\cottrell\icfa\icfa-mar98.ppt

  2. Outline of Talk
     • What, why & how are we (ESnet/HENP community) measuring?
     • What PingER measurement reports are available and what do they show?
       • (short), intermediate & long term
       • grouping and multi-site visualization
     • Traffic volume & Traceroute measurements
     • Summary
       • Deployment/development, Internet Performance, Next Steps
     • Collaborations
       • NIMI/IPWT

  3. Why go to the effort?
     • Apparent quality of Internet getting worse as size and demands increase
     • Internet woefully under-measured & under-instrumented
     • Internet very diverse - no single path typical
     • Users need:
       • realistic expectations, planning information
       • guidelines for setting and validating SLAs
       • information to help in identifying problems
       • help to decide where to apply resources

  4. Importance of Response Time
     • Time is the scarcest and most valuable commodity
     • Studies in the late 70s and early 80s showed the economic value of Rapid Response Time
       • 0-0.4s High productivity interactive response
       • 0.4-2s Fully interactive regime
       • 2-12s Sporadically interactive regime
       • 12s-600s Break in contact regime
       • >600s Batch regime
     • Complaints increase rapidly above a threshold of around 4-5s
     • Voice has a threshold around 100ms

  5. Perception of Poor Packet Loss
     • Above 4-6% packet loss, video conferencing becomes irritating and non-native language speakers become unable to communicate.
     • Long delays of 4 seconds or more, occurring at a frequency of 4-5% or more, are also irritating for interactive activities such as telnet and X windows.
     • Above 10-12% packet loss there is an unacceptable level of back-to-back packet loss and extremely long timeouts, connections start to get broken, and video conferencing is unusable.

  6. Our Main Metric is Ping
     • “Universally available”, easy to understand
       • no software for clients to install
     • Low network impact
     • Provides useful real world measures of loss, response time, reachability, unpredictability
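
The measures named on this slide can be derived from a single batch of ping results. The following is a minimal sketch, not the PingER implementation: the function name `summarize_pings` and the convention that `None` marks a lost packet are assumptions made for illustration. Unpredictability is left out because the slides do not define it.

```python
from statistics import median
from typing import Optional, Sequence


def summarize_pings(rtts_ms: Sequence[Optional[float]]) -> dict:
    """Summarize one batch of pings; None marks a lost packet (assumed convention)."""
    if not rtts_ms:
        raise ValueError("need at least one ping sample")
    replies = [r for r in rtts_ms if r is not None]
    sent = len(rtts_ms)
    return {
        "sent": sent,
        "loss_pct": 100.0 * (sent - len(replies)) / sent,
        "reachable": bool(replies),  # at least one reply came back
        "min_rtt_ms": min(replies) if replies else None,
        "median_rtt_ms": median(replies) if replies else None,
    }


if __name__ == "__main__":
    # Hypothetical batch: four pings sent, the second one lost.
    print(summarize_pings([120.0, None, 110.0, 130.0]))
```

The minimum round-trip time is reported alongside the median because slide 7 compares web response against the minimum ping response.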

  7. Ping Response vs Web Response 1/2 [Figure; axes: HTTP GET Response (ms), Minimum Ping Response (ms)]

  8. Ping Response vs Web Response 2/2

  9. Ranked packet loss for 3 months [Figure; sites shown: Stanford, Rome, UK, Cincinnati]

  10. Sawtooth Effect [Figure; annotations: 2 * capacity (+ 2 Mbps); added 45 Mbps (quadrupled capacity); 3 * capacity + 9 Mbps; holidays]

  11. RAL Last 180 Days plot
      • Lines are simply cubic spline fits to aid the eye
      • Upper green and black points are response time in ms
      • Red & blue are weekday loss
      • Cyan are weekend loss
      • Note weekend/weekday differences (cyan vs blue)
      • Note Xmas/New Year lull
      • Also note quick onset of saturation at end of August & September

  12. Italian sites look similar to each other

  13. Representative International HENP Site Loss Jan-95 thru Nov-97
      • Note RL (UK) sawtooths as UK-US bandwidth is added (Apr-96, Feb-97, Aug-97)

  14. Aggregation
      • Group measurements, for example:
        • by area (e.g. N. America W, N. America E, W. Europe/Japan, others, by country)
        • trans-oceanic links, intercontinental links
        • separation e.g. number of hops, time zones crossed, IXPs crossed
        • ISP (ESnet, vBNS/I2, ...)
        • by monitoring site
        • one site seen from multiple sites
        • common interest/affiliation (XIWT, HENP …)
        • user selectable
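
One of the groupings listed here, by country, can be sketched by keying each monitored link on the top level domain (TLD) of its two end hosts and taking the median loss per pair. This is an illustrative sketch only: the function names, the tuple layout and the `example.*` host names are assumptions, not part of the PingER tools.

```python
from collections import defaultdict
from statistics import median


def tld(host: str) -> str:
    """Return the top level domain of a host name, e.g. 'ch' for 'x.example.ch'."""
    return host.rsplit(".", 1)[-1].lower()


def median_loss_by_tld_pair(links):
    """Group (monitor_host, remote_host, loss_pct) tuples by TLD pair.

    Returns {(monitor_tld, remote_tld): median loss in percent}.
    """
    groups = defaultdict(list)
    for monitor, remote, loss_pct in links:
        groups[(tld(monitor), tld(remote))].append(loss_pct)
    return {pair: median(values) for pair, values in groups.items()}


if __name__ == "__main__":
    # Hypothetical measurements, for illustration only.
    links = [
        ("a.example.edu", "x.example.ch", 2.0),
        ("b.example.edu", "y.example.ch", 4.0),
        ("a.example.edu", "c.example.edu", 0.5),
    ]
    for pair, loss in sorted(median_loss_by_tld_pair(links).items()):
        print(pair, loss)
```

Pairs whose two TLDs match correspond to the "within TLD" diagonal described on slide 22. The median is used so that a single badly connected site does not dominate a group.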

  15. Group Selection (all sites monitoring CERN) [Figure: select one of these groups; sites shown: CMU, CNAF, RL, FNAL, SLAC, DESY, Carleton, RMKI, CERN, KEK]

  16. Group Response Time Jan-95 thru Nov-97
      • Improved between 1 and 2.5% / month
      • Response & Loss show similar improvements
      • Take care with new sites

  17. Network Quiescence
      • Frequency of zero packet loss (for all time - not cut on prime time)
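
Read as "the share of measurement samples that saw no packet loss at all", the quiescence figure can be computed as below. This is a sketch under that reading: the function name and the assumption that each input value is the loss percentage of one measurement interval are illustrative, not taken from the slides.

```python
from typing import Iterable


def quiescence_pct(loss_pcts: Iterable[float]) -> float:
    """Percentage of samples with exactly zero packet loss.

    All samples are counted, with no restriction to prime time,
    matching the wording on the slide.
    """
    samples = list(loss_pcts)
    if not samples:
        raise ValueError("need at least one sample")
    zero_loss = sum(1 for p in samples if p == 0)
    return 100.0 * zero_loss / len(samples)


if __name__ == "__main__":
    # Hypothetical loss percentages for four measurement intervals.
    print(quiescence_pct([0, 0, 2.5, 10]))
```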

  18. Ping Loss Quality
      • Want a quick-to-grasp indicator of link quality
      • Loss is the most sensitive indicator
        • loss of a packet requires ~ 4 sec TCP retry timeout
      • Studies on economic value of response time by IBM showed there is a threshold around 4-5 secs where complaints increase.
      • 0-1% = Good
      • 1-2.5% = Acceptable
      • 2.5%-5% = Poor
      • 5%-12% = Very Poor
      • > 12% = Bad
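
The quality bands on this slide translate directly into a small lookup. The sketch below is illustrative: the function name is hypothetical, and because the slide's ranges share their end points (1%, 2.5%, 5%, 12%), assigning a boundary value to the better band is an assumption.

```python
def loss_quality(loss_pct: float) -> str:
    """Map a packet loss percentage to the quality label from the slide."""
    if not 0 <= loss_pct <= 100:
        raise ValueError("loss must be a percentage between 0 and 100")
    # Assumption: a value exactly on a boundary falls in the better band.
    if loss_pct <= 1:
        return "Good"
    if loss_pct <= 2.5:
        return "Acceptable"
    if loss_pct <= 5:
        return "Poor"
    if loss_pct <= 12:
        return "Very Poor"
    return "Bad"


if __name__ == "__main__":
    for loss in (0.3, 2.0, 4.0, 8.0, 20.0):
        print(f"{loss:5.1f}% -> {loss_quality(loss)}")
```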

  19. Quality Distributions
      • ESnet median good quality
      • All other groups poor or very poor
      • Critical to have good peering

  20. Multi Collection Site Visualization [Figure; labels: Collection Sites, Remote Sites]

  21. Intercontinental Grouping (Loss)
      • Move mouse over ? to see # links
      • Looks pretty bad for intercontinental use

  22. Top Level Domain Grouping (Loss)
      • Mouseover on red dots gives more information on TLD (e.g. ch=Switzerland)
      • Diagonals are within TLD

  23. TLD (Response Time)

  24. Grouping Details
      • Select metric
      • Select group
      • Sort
      • Color for quality
      • Also provides Excel for DIY at bottom

  25. Recent Transoceanic trends

  26. By Monitoring Site

  27. CERN Monitoring TLDs

  28. ESnet bytes accepted by site for Jan ’98 [Figure; labels: Exchanges, LBL/ESnet]

  29. US HENP Traffic Growth
      • Exponential growth of 3-6% per month

  30. Multi Router Traffic Grapher (MRTG)
      • CERN-US E1 (2 Mbps) link
      • Added 2nd 2 Mbps link

  31. Traffic Volume for Germany (DFN) [Figures: DFN T1 utilization for 15 Jan ’98 (5 min averages), green = to US, blue = from US; number of 2 min periods in Dec-96 with peak utilization > y %, to US and from US]

  32. Capacity/Load Ratios
      • Looking at the link capacity/average load
      • Most ESnet links show ratios of a few to several tens
      • The international links (CERN-Perryman (~4), DFN (~5), Italy (~4), KEK (~10), Canada (15)) show ratios of 4-15
      • The worst link appears to be the MAE-W-ESnet link at about 1.5 ratio
      • However this may not be the bottleneck link
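
The ratio discussed on this slide is link capacity divided by average load, so its inverse is the average utilization. A minimal sketch of that arithmetic follows; the function names and the example figures in the usage section are hypothetical.

```python
def capacity_load_ratio(capacity_mbps: float, average_load_mbps: float) -> float:
    """Link capacity divided by average load (both in the same units)."""
    if capacity_mbps <= 0 or average_load_mbps <= 0:
        raise ValueError("capacity and load must be positive")
    return capacity_mbps / average_load_mbps


def average_utilization_pct(ratio: float) -> float:
    """Average utilization implied by a capacity/load ratio."""
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    return 100.0 / ratio


if __name__ == "__main__":
    # Hypothetical link: 2 Mbps capacity carrying 0.5 Mbps on average.
    ratio = capacity_load_ratio(2.0, 0.5)
    print(f"ratio {ratio:.1f}, average utilization {average_utilization_pct(ratio):.0f}%")
```

By this arithmetic, a ratio of about 1.5 corresponds to an average utilization of roughly 67%, and a ratio of 4 to 25%. These are averages, so peak utilization can be considerably higher.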

  33. Bottlenecks
      • Identification
        • Traceroute
          • from/to multiple sites can identify common path segments in the maps
        • Can see onset of losses with traceping
        • Pathchar can identify bottlenecks
      • Then need to work on:
        • avoiding bottlenecks (new peering)
        • getting bottleneck owners to improve
        • this is difficult, lots of potential bottlenecks, bottlenecks move, not under our control
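
The idea of finding common path segments from traceroutes taken at several sites can be sketched as counting how many paths cross each hop-to-hop segment. This is an illustration only: the function name, the input layout (a mapping from site to an ordered list of hop names) and the hop names are all assumptions, and real traceroute output would first need parsing.

```python
from collections import Counter
from typing import Dict, List, Tuple

Segment = Tuple[str, str]


def shared_segments(paths: Dict[str, List[str]]) -> List[Tuple[Segment, int]]:
    """Return segments crossed by more than one path, most shared first."""
    counts: Counter = Counter()
    for hops in paths.values():
        # A set keeps a routing loop from counting a segment twice for one path.
        counts.update(set(zip(hops, hops[1:])))
    return [(seg, n) for seg, n in counts.most_common() if n > 1]


if __name__ == "__main__":
    # Hypothetical paths from three monitoring sites to one destination.
    paths = {
        "siteA": ["a1", "core1", "core2", "dest"],
        "siteB": ["b1", "core1", "core2", "dest"],
        "siteC": ["c1", "c2", "dest"],
    }
    for segment, n in shared_segments(paths):
        print(segment, n)
```

A segment shared by many lossy paths is a candidate bottleneck, but as the slide notes, confirming it still takes tools such as traceping or pathchar.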

  34. TracePing (Oxford)
      • Multiple routes seen

  35. Traceroute From TRIUMF
      • Reverse traceroute servers
      • Traceping
      • TopologyMap
        • Ellipses show node on route
        • Open ellipse is measurement node
        • Blue ellipse not reachable
      • Keeping history

  36. GUI Traceroute (e.g. VisualRoute)

  37. Summary
      • Deployment/Development
        • ESnet/HENP has 14 Collection sites in 8 countries collecting data on > 500 links involving 22 countries
        • XIWT/IPWT deployed ~ 10 collection sites using PingER tools
        • 600MB/month/link, 6 bps/link, .25 FTE @ analysis site, 1.5-2.5 FTE on analysis
        • HEPNRC gathering, archiving
        • Long term reports being ported to HEPNRC from SLAC
        • Long term analysis today usually requires a tool like SAS

  38. Summary
      • Deployment/Development
      • Internet Performance
        • Performance within ESnet is good
        • Performance between ESnet & other sites is poor to very poor on average
          • one of the main causes is congestion points, so peering is critical
        • Intercontinental performance is very poor to bad
        • ESnet traffic accepted from major HENP labs growing by 3-6% per month
        • Response time improving by 1-2% / month
        • Packet loss improving between SLAC & other sites by 3% / month
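
To put the 3-6% per month traffic growth in perspective, a compound monthly rate can be converted into a doubling time. The sketch below assumes the growth compounds monthly at a constant rate, which is a simplification; the function name is hypothetical.

```python
import math


def doubling_time_months(monthly_growth_pct: float) -> float:
    """Months needed to double at a constant compound monthly growth rate."""
    if monthly_growth_pct <= 0:
        raise ValueError("growth rate must be positive")
    return math.log(2) / math.log(1 + monthly_growth_pct / 100)


if __name__ == "__main__":
    for rate in (3, 6):
        print(f"{rate}% per month doubles in {doubling_time_months(rate):.1f} months")
```

Under that assumption, 3% per month doubles traffic in roughly two years and 6% per month in roughly one year.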

  39. Summary
      • Deployment/Development
      • Internet Performance (continued):
        • Links to sites outside N. America vary from good (KEK) to bad
        • Some of the bad sites are to be expected (e.g. FSU, China, Czech Republic); some are surprises, such as the UK
        • CERN, France, Germany acceptable to poor

  40. Summary
      • Deployment/Development
      • Internet Performance
      • Next Steps
        • Improve tools:
          • Make long term reports at Analysis site available & understandable
          • Look into prediction (extrapolations, develop models, configure and validate with data)
        • Pursue IETF Surveyor & NIMI deployment

  41. National Internet Measurement Infrastructure (NIMI)
      • Secure, scalable infrastructure for scheduling monitoring, gathering data
      • Minimal amount of human intervention
      • Inexpensive probe built on PC FreeBSD platform
      • Dynamic - can add/modify measurement suites, initially includes:
        • Traceroute
        • TReno - measures bulk transfer throughput
        • Poip - one way ping

  42. Asymmetric One-way Delays [Figure: loss (0% to 20%) and delay (0 ms to 300 ms) for U Chicago to Advanced and Advanced to U Chicago]

  43. NIMI
      • Deployed at PSC, LBL, FNAL; platforms being configured at SLAC & CERN
      • As NIMI becomes more real, will start to use it as infrastructure for IPPM Surveyors
      • Security
        • allows full policy control over any box you own or delegation of all or subsets
        • uses ACLs with authentication for requests, and encryption to prevent sniffing

  44. Summary
      • Deployment/Development
      • Internet Performance
      • Next Steps
      • Lots of collaboration:
        • SLAC & HEPNRC
        • 14 collection sites, ~ 400 remote sites
        • Collection site tools CERN & CNAF/ICFA
        • Oxford/TracePing
        • MapPing/MAPNet/NLANR
        • TRIUMF Traceroute topology Map
        • NIMI/LBNL & Surveyor/IETF
        • XIWT/IPWT
        • Talks at IETF, XIWT, ICFA, ESCC ...

  45. More Information
      • ICFA Monitoring WG home page (links to status report, meeting notes, how to access data, and code)
        • http://www.slac.stanford.edu/xorg/icfa/ntf/home.html
      • WAN Monitoring at SLAC has lots of links
        • http://www.slac.stanford.edu/comp/net/wan-mon.html
      • Tutorial on WAN Monitoring
        • http://www.slac.stanford.edu/comp/net/wan-mon/tutorial.html
      • MapPing Tool:
        • http://www.slac.stanford.edu/~warrenm/work/java/newjava/mapping.html
      • NIMI
        • http://www.psc.edu/~mahdavi/nimi_paper/NIMI.html
