ESnet NMTF/NMFG - Status

ESnet NMTF/NMFG - Status. Les Cottrell, SLAC & Dave Martin, HEPNRC <cottrell@slac.stanford.edu>, <dem@hep.net>. Presented at the ESCC Meeting, JLAB, Oct 1997.


Presentation Transcript


  1. ESnet NMTF/NMFG - Status Les Cottrell, SLAC & Dave Martin, HEPNRC <cottrell@slac.stanford.edu>, <dem@hep.net> Presented at the ESCC Meeting, JLAB, Oct 1997 /afs/slac/u/sf/cottrell/talk/escc/oct97

  2. Outline of Talk • What happened to the NMTF/NMFG? • What are we measuring? • How are we measuring? • Tools we are using/developing • Coordination with others • Next Steps • Summary

  3. What happened to the NMTF/NMFG? • It evolved • Some of the original members (BNL & ORNL) were unable to continue the effort • SLAC & HEPNRC retained focus on monitoring • ICFA concerned about impact of network performance on HENP research • Created NTF with various WGs, one on Monitoring • More focus on HENP issues and international links • Embraced work done by NMTF/NMFG and supported continued development • Brought in new partners, in particular INFN and CERN, as well as other collection sites

  4. Mission etc. of the ICFA-NTF WG on Monitoring • Mission of Group: obtain as uniform a picture as possible of the present performance of the connectivity used by the ICFA community • Two meetings so far: CHEP97 (Apr-97) & Santa Fe (Sep-97) • Produced an interim status report for Sep-97 • Will update for Dec-97, with a final report in Apr-98

  5. Our Main Metric is Ping • “Universally available”, easy to understand • No software for clients to install • Low network impact • Provides loss, response time, reachability, unpredictability • Select hosts carefully: concerns over routers, loaded hosts etc. (provide guidelines) • Does provide useful measures

  6. Ping Response Time vs Bytes

  7. Ping Response vs Web Response [Plot; axis labels: HTTP GET Response (ms), Minimum Ping Response (ms)]

  8. Method • Measurement • Each collection site keeps a list of remote hosts to ping at sites it is interested in • Every 30 mins, ping each remote host with 11 × 100-byte pings followed by 10 × 1000-byte pings • Min separation of pings is 1 second, timeout 20 seconds • Throw away first ping • Measure response, packet loss, host unreachable (no answer to any ping) • Record data and make available
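
The per-set bookkeeping described on the Method slide can be sketched as follows. This is only an illustration, not the timeping code: the function name, the dictionary keys, and the use of `None` for a timed-out ping are assumptions, and "unreachable" is taken here to mean that no ping in the set, including the discarded first one, was answered.

```python
from statistics import mean


def summarize_ping_set(rtts_ms, discard_first=True):
    """Summarize one set of pings to a single remote host.

    rtts_ms holds one round-trip time in ms per ping sent, with None
    marking a ping that timed out (hypothetical input convention).
    """
    # The slide says to throw away the first ping of the measurement.
    kept = rtts_ms[1:] if discard_first else list(rtts_ms)
    answered = [r for r in kept if r is not None]
    sent = len(kept)
    loss_pct = 100.0 * (sent - len(answered)) / sent if sent else 0.0
    return {
        "sent": sent,
        "received": len(answered),
        "loss_pct": loss_pct,
        # Assumption: unreachable means no answer to any ping at all.
        "unreachable": bool(rtts_ms) and all(r is None for r in rtts_ms),
        "min_ms": min(answered) if answered else None,
        "avg_ms": mean(answered) if answered else None,
        "max_ms": max(answered) if answered else None,
    }


# Made-up sample values, purely for illustration.
summary = summarize_ping_set([250.0, 80.0, None, 82.0, 90.0])
print(summary["loss_pct"], summary["min_ms"])
```

Keeping the minimum alongside the average is a deliberate choice here, since slide 7 plots minimum ping response against web response.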

  9. Architecture • Three Types of Sites • Remote Sites - need only to respond to ping packets • Collecting Sites • Collecting Data: Perl Script Pings Nodes, Records Data in common documented format • Serving Data: CGI/Perl Script makes Data Available to Analysis Sites • WWW CGI tools make reports available • Analysis Sites • Retrieving Data: Perl Script Retrieves Data from Collecting Sites • Analysis: SAS Program Analyzes Data and Generates Graphs • Reports: WWW Form Makes Customized Reports Available

  10. Architecture [Diagram labels: Remote sites; Pings; Collecting sites; Cache; HTTP; Analysis sites (e.g. HEPNRC, e.g. SLAC); Archive; WWW Reports & Data]

  11. Available Tools - Data Collection • Collect data (timeping) • HEPNRC rearchitected, developed & documented • Deployed at 12 sites in 6 countries • ARM, BNL, CERN, CMU, DoE/GMTN, HEPNRC/FNAL, INFN/CNAF, KEK, Hungary, RAL, SLAC, UMD • DESY, IN2P3, TRIUMF, MSU, Beijing also expressed interest, plus commercial sites • Data available (pingdata) in common format • Data collected is available from the collection site via HTTP • Allows data for specific times to be retrieved

  12. Current Deployment [Map labels: CERN, DESY, KEK, HEPNRC/FNAL, RAL, CMU, SLAC, BNL, RMKI/KFKI, UMD, INFN/CNAF. Legend: Monitoring Site; ESnet Site (monitored from SLAC); N. American Site (monitored from SLAC); International Site (monitored from SLAC)]

  13. Analysis / Archive Site • Gathers & archives data • HEPNRC gathers data from collection sites a few times daily • Archives the data (200 Mbytes/month) • Works with collection sites to resolve problems • Provides Web access to archive data via form (ping_data.pl)

  14. Access to Raw Data

  15. Analysis / Archive Site • Gathers & archives data • HEPNRC gathers data from collection sites a few times daily • Archives the data (200 Mbytes/month) • Works with collection sites to resolve problems • Provides Web access to archive data via form (ping_data.pl) • Provides Web form to allow simple plotting (graph_pings.pl), uses SAS for speed

  16. Form to Select Analysis Graphs

  18. Analysis Tools for Collection Sites • Short-term analysis / reports • Recent data (e.g. last 30 days cached) • Web sortable table of latest measurements, colored for quality

  19. Ping Loss Quality • 0-1% Good, 1-5% Acceptable, 5-12% Poor, 12-25% Poor, >25% Unusable • Similar to Internet Weather Report (<6%, <12%, >12%)
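
The loss bands above map naturally onto a small lookup. The sketch below copies the labels exactly as they appear in this transcript, where both the 5-12% and 12-25% bands read "Poor". The function name and the treatment of boundary values (a loss exactly on a boundary falls in the lower band, consistent with ">25% Unusable") are assumptions, not something the slide specifies.

```python
# Upper bound of each band in percent, with the label from the slide.
LOSS_BANDS = [
    (1.0, "Good"),
    (5.0, "Acceptable"),
    (12.0, "Poor"),
    (25.0, "Poor"),  # the transcript repeats "Poor" for this band
]


def loss_quality(loss_pct):
    """Return the quality label for a packet loss percentage."""
    if not 0.0 <= loss_pct <= 100.0:
        raise ValueError("loss_pct must be between 0 and 100")
    for upper, label in LOSS_BANDS:
        if loss_pct <= upper:  # assumed boundary handling
            return label
    return "Unusable"


for pct in (0.5, 3.0, 8.0, 20.0, 30.0):
    print(pct, loss_quality(pct))
```

A table of (upper bound, label) pairs keeps the thresholds in one place, which is convenient if the bands are later tuned or used to pick the table colors.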

  20. Analysis Tools for Collection Sites • Short-term analysis / reports • Recent data (e.g. last 30 days cached) • Web sortable table of latest measurements, colored for quality, with output (TSV) for Excel (connectivity.pl)
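
Tab-separated output of the kind mentioned above is straightforward to produce. The following is a minimal sketch only: the column names and the sample row are hypothetical, and it does not reproduce the actual output format of connectivity.pl, which the slides do not show.

```python
import csv
import io


def to_tsv(rows, columns):
    """Render a list of dict rows as tab-separated text with a header."""
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter="\t", lineterminator="\n")
    writer.writerow(columns)
    for row in rows:
        writer.writerow([row[c] for c in columns])
    return buf.getvalue()


# Hypothetical host name and measurements, for illustration only.
rows = [{"host": "remote.example.org", "loss_pct": 2.5, "min_ms": 80.0}]
print(to_tsv(rows, ["host", "loss_pct", "min_ms"]), end="")
```

Using the standard `csv` module with a tab delimiter, rather than joining strings by hand, means fields that happen to contain tabs or quotes are still escaped correctly.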

  21. Latest Ping Measurements

  22. Raw Data from last 24 Hours

  23. Latest Ping Measurements

  24. Ping Performance for Last 180 Days

  25. Analysis Tools for Collection Sites • Short-term analysis / reports • Recent data (e.g. last 30 days cached) • Web sortable table of latest measurements, colored for quality, with output (TSV) for Excel (connectivity.pl) • Web form to select sites and time frames to be plotted (ping_data_plot.pl)

  26. Request Plot of Collection Site Data

  27. Plot from Collection Site

  28. Tools in Development • Re-engineering SLAC long term reports • exception report

  29. Exception Reports [Screenshot: Last 10 Weeks Ping Data. Callouts: Click to sort by column; Click here to burrow down to more information; Color highlights extent of exception]

  30. Tools in Development • Re-engineering SLAC long term reports • exception report • last 180 days

  31. 180 Days SLAC - Stanford [Plot, Feb-97 to Aug-97. Annotations: Direct connect; Via ESnet; 5.5 ms, 20 ms, 30 ms; Loss 3-6%; Loss < 1%; Uwave & Routing problems]

  32. Tools in Development • Re-engineering SLAC long term reports • exception report • last 180 days • monthly points going back for years in tabular form with quality coloring, sorting & hyperlinks • Loss (by site, and by group of sites) • Response (by site, and by group of sites) • Reachability (by site, and by group of sites) • % time network “Quiescent” or “Busy”

  33. Ping Loss History

  34. TSV Output to Excel for Further Analysis

  35. Ping Response by Group

  36. Prime-time Packet Loss by Group

  37. “Quiescent” Frequency by Group

  38. International Site “Busy” Frequency [Plot annotations: UK - US link upgraded; CERN & IN2P3 track RL.UK; Italian nodes track & look good]

  39. Tools in Development • Re-engineering SLAC long term reports • exception report • last 180 days • monthly points going back for years in tabular form with quality coloring, sorting & hyperlinks • Loss (by site, and by group of sites) • Response (by site, and by group of sites) • Reachability (by site, and by group of sites) • % time network “Quiescent” or “Busy” • Ten Worst links in HEP

  40. Ten Worst HEP Links Ranked by % Packets Lost
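
A ranking like the one named in this slide title could be computed roughly as below. This is a sketch under assumptions: the function name, the input shape (packets sent and lost per link), and the link names and counts in the example are all hypothetical and are not taken from the report.

```python
def worst_links(counts, n=10):
    """Rank links by percentage of packets lost, worst first.

    counts maps a link name to a (packets_sent, packets_lost) pair.
    Links with no packets sent are skipped.
    """
    loss_pct = {
        link: 100.0 * lost / sent
        for link, (sent, lost) in counts.items()
        if sent > 0
    }
    ranked = sorted(loss_pct.items(), key=lambda item: item[1], reverse=True)
    return ranked[:n]


# Made-up counts for three hypothetical links.
sample = {
    "SiteA-SiteB": (200, 6),
    "SiteA-SiteC": (200, 55),
    "SiteA-SiteD": (200, 1),
}
for link, pct in worst_links(sample, n=2):
    print(link, pct)
```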

  41. What are Typical Uses? • Setting Expectations • Service Level Contract • Choosing ISPs • Identifying problems, and verifying solutions • Planning for upgrades

  42. Summary to Help Choose Upgrades

  43. Prime Time Packet Loss Jun-Aug 97

  44. Coordination etc. • XIWT/IPWT interest/deployment

  45. XIWT/IPWT interest • Austin meeting in Sep-97 • Available tools presented by developers: IWR, CAIDA/NLANR, Intel, Auto Industry/Bellcore, IETF/IPPM Surveyor … • XIWT/IPWT want to: • Measure performance of members' own networks • Get tests to validate and understand what to recommend to other commercial customers and for what purposes • Build a community within XIWT so that it can evolve to address harder issues • Selected our tools for initial deployment at 6 sites, including Intel, SBC, HAI, BellSouth, CNRI, NIST

  46. Coordination etc. • XIWT/IPWT interest/deployment • MICS funded joint SLAC/LBL proposal on Internet end-to-end performance monitoring for 1 year • LBL/NIMI project

  47. NIMI (1) • NIMI = National Internet Measurement Infrastructure, a collaboration between LBL and PSC (V. Paxson, M. Mathis, J. Mahdavi) • It is a software suite (not hardware), deployed on “measurement hosts” around the Internet for black-box infrastructure measurements • Ready for deployment Nov-97 • Perl daemon with treno, Poisson packet generation for loss & delays • Hooks for other tools such as pathchar, tcpanaly
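
"Poisson packet generation" generally means sending probes at times whose gaps are drawn from an exponential distribution, so that probe times form a Poisson process rather than a fixed-period schedule. The sketch below illustrates that general idea only. It is not NIMI code, and the function name, parameters, and the 10-second mean interval are assumptions made for the example.

```python
import random


def poisson_send_times(mean_interval_s, duration_s, seed=None):
    """Return probe send times (seconds from start) within duration_s.

    Gaps between consecutive probes are exponentially distributed with
    the given mean, which makes the send times a Poisson process.
    """
    rng = random.Random(seed)
    times = []
    t = 0.0
    while True:
        # expovariate takes the rate, i.e. 1 / mean gap.
        t += rng.expovariate(1.0 / mean_interval_s)
        if t >= duration_s:
            return times
        times.append(t)


# One hour of probes averaging one every 10 seconds (assumed values).
schedule = poisson_send_times(10.0, 3600.0, seed=1)
print(len(schedule), "probes scheduled")
```

Randomized gaps avoid locking onto periodic behavior in the network, which a fixed schedule such as one measurement every 30 minutes can do.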

  48. NIMI (2) • Challenges: accurate clock synchronization (for one-way measurements), scaling to millions of nimids (NB: end-to-end measurement strategies are usually not cost free, and some things may be over-measured), data retrieval, new measurement strategies • There is no central management • Both HEPNRC & SLAC plan to install NIMI hosts (PCs running FreeBSD) at their sites

  49. Coordination etc. • XIWT/IPWT interest/deployment • MICS funded joint SLAC/LBL proposal on Internet end-to-end performance monitoring for 1 year • LBL/NIMI project • Proposed joint work with NLANR to extend Mapnet Java tools to view our data

  50. NLANR Mapnet Tool • Java Applet • Zoom & pan • Select ISPs • Color: ISP, bandwidth • Mouse over: link details, node details
