
Internet2 E2E piPEs Project


Presentation Transcript


  1. Internet2 E2E piPEs Project Eric L. Boyd

  2. Internet2 E2E piPEs Goals • Enable end-users & network operators to: • determine E2E performance capabilities • locate E2E problems • contact the right person to get an E2E problem resolved. • Enable remote initiation of partial path performance tests • Make partial path performance data publicly available • Interoperable with other performance measurement frameworks
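
BWCTL's third-party mode is what makes the "remote initiation" goal concrete: a client on one machine can ask two remote bwctld beacons to run a test between themselves. A minimal sketch, assuming bwctl is installed locally and the (hypothetical) beacon hostnames run bwctld:

```python
# Minimal sketch of remotely initiating a partial-path test: the local
# bwctl client schedules an Iperf run between two remote bwctld beacons
# (-s names the sender, -c the receiver). Hostnames are hypothetical.
import subprocess

def remote_test(sender: str, receiver: str, seconds: int = 10) -> str:
    cmd = ["bwctl", "-s", sender, "-c", receiver, "-t", str(seconds)]
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout

if __name__ == "__main__":
    print(remote_test("beacon-a.example.edu", "beacon-b.example.edu"))
```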

  3. Sample piPEs Deployment

  4. Project Phases • Phase 1: Tool Beacons • BWCTL (Complete), http://e2epi.internet2.edu/bwctl • OWAMP (Complete), http://e2epi.internet2.edu/owamp • NDT (Complete), http://e2epi.internet2.edu/ndt • Phase 2: Measurement Domain Support • General Measurement Infrastructure (Prototype) • Abilene Measurement Infrastructure Deployment (Complete), http://abilene.internet2.edu/observatory • Phase 3: Federation Support • AA (Prototype – optional AES key, policy file, limits file) • Discovery (Measurement Nodes, Databases) (Prototype – nearest NDT server, web page) • Test Request/Response Schema Support (Prototype – GGF NMWG Schema)

  5. piPEs Deployment

  6. NDT (Rich Carlson) • Network Diagnostic Tool • Developed at Argonne National Lab • Ongoing integration into piPEs framework • Redirects from well-known host to “nearest” measurement node • Detects common performance problems in the “first mile” (edge to campus DMZ) • In deployment on Abilene: • http://ndt-seattle.abilene.ucaid.edu:7123

  7. NDT Milestones • New Features added • Configuration file support • Scheduling/queuing support • Simple server discovery protocol • Federation mode support • Command line client support • Open Source Shared Development • http://sourceforge.net/projects/ndt/
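
A hedged sketch of scripting the command-line client mentioned above; the binary name (web100clt) and the -n server flag follow the SourceForge NDT distribution, so treat both as assumptions if your build differs:

```python
# Hedged sketch: driving NDT's command-line client against a measurement
# node. The web100clt binary and -n flag are assumptions based on the
# NDT distribution.
import subprocess

def ndt_test(server: str) -> str:
    out = subprocess.run(["web100clt", "-n", server],
                         capture_output=True, text=True, check=True)
    return out.stdout

print(ndt_test("ndt-seattle.abilene.ucaid.edu"))
```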

  8. NDT Future Directions • Focus on improving problem detection algorithms • Duplex mismatch • Link detection • Complete deployment in Abilene POPs • Expand deployment into University campus/GigaPoP networks

  9. How Can you Participate? • Set up BWCTL, OWAMP, NDT Beacons • Set up a measurement domain • Place tool beacons “intelligently” • Determine locations • Determine policy • Determine limits • “Register” beacons • Install piPEs software • Run regularly scheduled tests • Store performance data • Make performance data available via web service • Make visualization CGIs available • Solve Problems / Alert us to Case Studies
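
The "run regularly scheduled tests / store performance data" items reduce to a small amount of glue. A minimal sketch with hypothetical beacon names and storage layout; run it from cron to get the regular schedule:

```python
# One measurement pass: run owping against each registered beacon and
# write the raw output to a date-stamped file per beacon. Beacon names
# and the owamp-results/ layout are hypothetical.
import pathlib
import subprocess
import time

BEACONS = ["beacon-a.example.edu", "beacon-b.example.edu"]  # hypothetical
DATA_DIR = pathlib.Path("owamp-results")
DATA_DIR.mkdir(exist_ok=True)

for beacon in BEACONS:
    # owping runs a one-way-delay test in both directions by default
    proc = subprocess.run(["owping", beacon], capture_output=True, text=True)
    stamp = time.strftime("%Y%m%dT%H%M%S")
    (DATA_DIR / f"{beacon}.{stamp}.txt").write_text(proc.stdout)
```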

  10. Example piPEs Use Cases • Edge-to-Middle (On-Demand) • Automatic 2-Ended Test Set-up • Middle-to-Middle (Regularly Scheduled) • Raw Data feeds for 3rd-Party Analysis Tools • http://vinci.cacr.caltech.edu:8080/ • Quality Control of Network Infrastructure • Edge-to-Edge (Regularly Scheduled) • Quality Control of Application Communities • Edge-to-Campus DMZ (On-Demand) • Coupled with Regularly Scheduled Middle-to-Middle • End User determines who to contact about a performance problem, armed with proof

  11. Test from the Edge to the Middle • Divide and conquer: Partial Path Analysis • Install OWAMP and/or BWCTL • Where are the nodes? • http://e2epi.internet2.edu/pipes/pmp/pmp-dir.html • Begin testing (see the sketch below): • http://e2epi.internet2.edu/pipes/ami/bwctl/ (Key Required) • http://e2epi.internet2.edu/pipes/ami/owamp/ (No Key Required)
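
For the edge-to-middle case a two-party test is enough: when bwctl is given only -c, the local host acts as the sender, so the test runs from your edge machine to the chosen measurement node. A sketch (the node name is illustrative; pick a real one from the directory page above):

```python
# Edge-to-middle throughput test: local host sends, the Abilene
# measurement node receives. The node name below is illustrative.
import subprocess

node = "nms1-snva.abilene.ucaid.edu"  # pick from the pmp directory page
result = subprocess.run(["bwctl", "-c", node, "-t", "10"],
                        capture_output=True, text=True)
print(result.stdout)
```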

  12. Example piPEs Use Cases • Edge-to-Middle (On-Demand) • Automatic 2-Ended Test Set-up • Middle-to-Middle (Regularly Scheduled) • Raw Data feeds for 3rd-Party Analysis Tools • http://vinci.cacr.caltech.edu:8080/ • Quality Control of Network Infrastructure • Edge-to-Edge (Regularly Scheduled) • Quality Control of Application Communities • Edge-to-Campus DMZ (On-Demand) • Coupled with Regularly Scheduled Middle-to-Middle • End User determines who to contact about a performance problem, armed with proof

  13. Abilene Measurement Domain • Part of the Abilene Observatory: http://abilene.internet2.edu/observatory • Regularly scheduled OWAMP (1-way latency) and BWCTL/Iperf (Throughput, Loss, Jitter) Tests • Web pages displaying: • Latest results http://abilene.internet2.edu/ami/bwctl_status.cgi/TCP/now • “Weathermap” http://abilene.internet2.edu/ami/bwctl_status_map.cgi/TCP/now • Worst 10 Performing Links http://abilene.internet2.edu/ami/bwctl_worst_case.cgi/TCP/now • Data available via web service (see the sketch below): http://abilene.internet2.edu/ami/webservices.html
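
Because the results are served over plain HTTP, pulling the latest data into a script is straightforward. A sketch; the payload format (HTML for the CGIs, XML for the web service) depends on which endpoint you hit, so parsing is left out:

```python
# Fetch the "latest results" BWCTL status page listed above and peek at
# the first few hundred bytes of the payload.
import urllib.request

URL = "http://abilene.internet2.edu/ami/bwctl_status.cgi/TCP/now"
with urllib.request.urlopen(URL) as resp:
    print(resp.read(500).decode("utf-8", errors="replace"))
```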

  14. Quality Control of Abilene Measurement Infrastructure (1) • Problem Solving Approach • Ongoing measurements start detecting a problem • Ad-hoc measurements used for problem diagnosis • Ongoing Measurements • Expect Gbps flows on Abilene • Stock TCP stack (albeit tuned) • Very sensitive to loss • “Canary in a coal mine” • Web100 just deployed for additional reporting • Skeptical eye • Apparent problem could reflect interface contention

  15. Quality Control of Abilene Measurement Infrastructure (2) • Regularly Scheduled Tests • Track TCP and UDP Flows (BWCTL/Iperf) • Track One-way Delays (OWAMP) • IPv4 and IPv6 • Observe: • Worst 10 TCP flows • First percentile TCP flow • Fiftieth percentile TCP flow • Which percentile breaks the 900 Mbps threshold (see the sketch below) • General Conclusions: • On Abilene, IPv4 and IPv6 are statistically indistinguishable • Consistently low values to one host or across one path indicate a problem
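
The percentile bookkeeping above is simple enough to show directly. A sketch over a list of measured TCP throughputs in Mb/s (the sample data is made up):

```python
# Worst-10, 1st/50th percentile, and the lowest percentile that still
# clears the 900 Mb/s threshold, using nearest-rank percentiles.
def summarize(throughputs_mbps, threshold=900.0):
    s = sorted(throughputs_mbps)
    n = len(s)

    def percentile(p):
        return s[min(n - 1, max(0, round(p / 100.0 * n) - 1))]

    worst10 = s[:10]
    first_ok = next((i for i, v in enumerate(s) if v >= threshold), n)
    breaking_pct = 100.0 * first_ok / n  # flows below this pct miss 900
    return worst10, percentile(1), percentile(50), breaking_pct

worst10, p1, p50, brk = summarize([980, 975, 940, 910, 870, 522] * 20)
print(f"p1={p1} Mb/s  p50={p50} Mb/s  900 Mb/s first cleared at ~{brk:.0f}th pct")
```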

  16. A (Good) Day in the Life of Abilene

  17. First two weeks in March: 50th percentile right at 980 Mb/s; 1st percentile about 900 Mb/s. Take it as a baseline.

  18. Beware the Ides of March: 1st percentile down to 522 Mb/s. Circuit problems along the west coast. nb: 50th percentile very robust.

  19. Recovery, sort of; life through 29 April: 1st percentile back up to the mid-800s, but lower and shakier. nb: 50th percentile still very robust.

  20. Ah, sudden improvement through 5 May: 1st percentile back up above 900 Mb/s and more stable. But why??

  21. Then, while Matt Z is tearing up the tracks: 1st percentile back down to the 500s. Diagnosis: something is killing Seattle. Oh, and Sunnyvale is off the air.

  22. Matt fixes Sunnyvale, and things get (slightly) worse: both Seattle and Sunnyvale are bad. 1st percentile right at 500 Mb/s. Diagnosis: web100 interaction.

  23. Matt fixes the web100 interaction. 1st percentile cruising through 700 Mb/s. Life is good.

  24. Friday the (almost) 13th; JUNOS upgrade induces packet loss for about four hours along many links. 1st percentile falls to 63 Mb/s. Long-distance paths chiefly impacted.

  25. A “Known” Problem • Mid-May: routers all got a new software load to enable a new feature • Everything seemed to come up, but on some links, utilization did not rebound • Worst-10 reflected very low performance across those links • QoS parameter configuration format change…

  26. Nice weekend. 1st percentile rises to 968 Mb/s. But why??

  27. We Found It First • Streams over SNVA-LOSA link all showed problems • NOC responded: Found errors on SNVA-LOSA link • (NOC is now tracking errors more closely…) • Live (URL subject to change): http://abilene.internet2.edu/ami/bwctl_percentile.cgi/TCPV4/1/50/14118254811367342080_14169839516075950080

  28. Example piPEs Use Cases • Edge-to-Middle (On-Demand) • Automatic 2-Ended Test Set-up • Middle-to-Middle (Regularly Scheduled) • Raw Data feeds for 3rd-Party Analysis Tools • http://vinci.cacr.caltech.edu:8080/ • Quality Control of Network Infrastructure • Edge-to-Edge (Regularly Scheduled) • Quality Control of Application Communities • ESnet / ITECs (3+3) [See Joe Metzger’s talk to follow] • eVLBI • Edge-to-Campus DMZ (On-Demand) • Coupled with Regularly Scheduled Middle-to-Middle • End User determines who to contact about a performance problem, armed with proof

  29. Example Application Community: VLBI (1) • Very-Long-Baseline Interferometry (VLBI) is a high-resolution imaging technique used in radio astronomy. • VLBI techniques involve using multiple radio telescopes simultaneously in an array to record data, which is then stored on magnetic tape and shipped to a central processing site for analysis. • Goal: electronic transmission of VLBI data over high-bandwidth networks (known as “e-VLBI”).

  30. Example Application Community: VLBI (2) • Haystack <-> Onsala • Abilene, Eurolink, GEANT, NorduNet, SUNET • Users: David Lapsley, Alan Whitney • Constraints • Lack of administrative access (needed for Iperf) • Heavily scheduled, limited windows for testing • Problem • Insufficient performance • Partial Path Analysis with BWCTL/Iperf (see the sketch below) • Isolated packet loss to local congestion in the Haystack area • Upgraded bottleneck link
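
The loss-isolation step lends itself to a short loop: fixed-rate UDP tests (bwctl's -u and -b flags mirror Iperf's UDP mode and target bandwidth) toward successive beacons along the path, watching where loss first appears. Beacon names here are hypothetical:

```python
# Walk the path from the edge outward with 100 Mb/s UDP tests and print
# each test's loss summary line. Iperf's UDP report ends with a loss
# percentage, which is all we grep for here.
import subprocess

PATH = ["campus-dmz.example.edu", "gigapop.example.edu",
        "abilene-node.example.edu"]  # hypothetical beacons, edge outward

for beacon in PATH:
    out = subprocess.run(["bwctl", "-u", "-b", "100m", "-c", beacon],
                         capture_output=True, text=True).stdout
    loss = [line for line in out.splitlines() if "%" in line]
    print(beacon, "->", loss[-1].strip() if loss else "no loss summary")
```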

  31. Example Application Community: VLBI (3) • Result • First demonstration of real-time, simultaneous correlation of data from two antennas (32 Mbps, work continues) • Future • Optimize time-of-day for non-real-time data transfers • Deploy BWCTL at 3 more sites beyond Haystack, Onsala, and Kashima

  32. TSEV8 Experiment • Intensive experiment • Data • 18 scans, 13.9 GB of data • Antennas: • Westford, MA and Kashima, Japan • Network: • Haystack, MA to Kashima, Japan • Initially, 100 Mbps commodity Internet at each end, Kashima link upgraded to 1 Gbps just prior to experiment
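
For a sense of scale, a back-of-the-envelope calculation (ours, not the slide's) of ideal transfer times for the 13.9 GB data set at the two access rates mentioned:

```python
# Ideal (protocol-overhead-free) transfer times for 13.9 GB.
bits = 13.9 * 8e9  # decimal gigabytes -> bits
for rate_mbps in (100, 1000):
    minutes = bits / (rate_mbps * 1e6) / 60
    print(f"{rate_mbps:>4} Mb/s: {minutes:.0f} min")
# ~19 min at 100 Mb/s, ~2 min at 1 Gb/s; real transfers run longer once
# TCP dynamics and loss enter the picture.
```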

  33. TSEV8 e-VLBI Network

  34. Network Issues • In the week leading up to the experiment, the network showed extremely poor throughput: ~1 Mbps! • Network analysis/troubleshooting required: • Traditionally: pair-wise Iperf testing between hosts along the transfer path, plus step-by-step tracing of link utilization via the Internet2/TransPAC-APAN network monitoring websites: time consuming, error prone, not conclusive • New approach: automated Iperf testing using Internet2’s BWCTL tool (allows partial path analysis), with link utilization statistics integrated into a single website • No maintenance required once set up; for the first time, an overall view of the network and of bandwidth on a segment-by-segment basis (see the sketch below)
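
The segment-by-segment view can be generated with the same loop pattern: one BWCTL test per adjacent beacon pair, summarized as a table. A sketch with hypothetical beacon names; the regex assumes Iperf's usual "NNN Mbits/sec" summary line:

```python
# One throughput test per path segment, printed as a compact table.
import re
import subprocess

SEGMENTS = [  # hypothetical beacons along the Haystack -> Kashima path
    ("haystack.example.edu", "abilene-east.example.edu"),
    ("abilene-east.example.edu", "abilene-west.example.edu"),
    ("abilene-west.example.edu", "kashima.example.jp"),
]

for sender, receiver in SEGMENTS:
    out = subprocess.run(["bwctl", "-s", sender, "-c", receiver, "-t", "10"],
                         capture_output=True, text=True).stdout
    m = re.search(r"([\d.]+)\s+Mbits/sec", out)
    print(f"{sender} -> {receiver}: {m.group(1) + ' Mb/s' if m else 'no result'}")
```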

  35. E-VLBI Network Monitoring http://web.haystack.mit.edu/staff/dlapsley/tsev7.html

  36. E-VLBI Network Monitoring http://web.haystack.mit.edu/staff/dlapsley/tsev7.html

  37. E-VLBI Network Monitoring • Use of centralized/integrated network monitoring helped enable identification of the bottleneck (a hardware fault) • Automated monitoring allows view of network throughput variation over time • Highlights route changes, network outages • Automated monitoring also helps to highlight any throughput issues at end points: • E.g. Network Interface Card failures, Untuned TCP Stacks • Integrated monitoring provides overall view of network behavior at a glance

  38. Result • Successful UT1 experiment completed June 30, 2004. • New record time for transfer and calculation of the UT1 offset: • 4.5 hours (down from 21 hours)

  39. Acknowledgements • Yasuhiro Koyama, Masaki Hirabaru and colleagues at the National Institute of Information and Communications Technology • Brian Corey, Mike Poirier and colleagues from MIT Haystack Observatory • Internet2, TransPAC/APAN, JGN2 networks • Staff at the APAN Tokyo XP • Tom Lehman - University of Southern California - Information Sciences Institute East

  40. Example piPEs Use Cases • Edge-to-Middle (On-Demand) • Automatic 2-Ended Test Set-up • Middle-to-Middle (Regularly Scheduled) • Raw Data feeds for 3rd-Party Analysis Tools • http://vinci.cacr.caltech.edu:8080/ • Quality Control of Network Infrastructure • Edge-to-Edge (Regularly Scheduled) • Quality Control of Application Communities • Edge-to-Campus DMZ (On-Demand) • Coupled with Regularly Scheduled Middle-to-Middle • End User determines who to contact about a performance problem, armed with proof

  41. How Can you Participate? • Set up BWCTL, OWAMP, NDT Beacons • Set up a measurement domain • Place tool beacons “intelligently” • Determine locations • Determine policy • Determine limits • “Register” beacons • Install piPEs software • Run regularly scheduled tests • Store performance data • Make performance data available via web service • Make visualization CGIs available • Solve Problems / Alert us to Case Studies

  42. Extra Slides

  43. American / European Collaboration Goals • Awareness of ongoing Measurement Framework Efforts / Sharing of Ideas (Good / Not Sufficient) • Interoperable Measurement Frameworks (Minimum) • Common means of data extraction • Partial path analysis possible along transatlantic paths • Open Source Shared Development (Possibility, In Whole or In Part) • End-to-end partial path analysis for transatlantic research communities • VLBI: Haystack, Mass. <-> Onsala, Sweden • HENP: Caltech, Calif. <-> CERN, Switzerland

  44. American / European Collaboration Achievements • UCL E2E Monitoring Workshop 2003 • http://people.internet2.edu/~eboyd/ucl_workshop.html • Transatlantic Performance Monitoring Workshop 2004 • http://people.internet2.edu/~eboyd/transatlantic_workshop.html • Caltech <-> CERN Demo • Haystack, USA <-> Onsala, Sweden • piPEs Software Evaluation (In Progress) • Architecture Reconciliation (In Progress)

  45. Example Application Community: ESnet / Abilene (1) • 3+3 Group • US Govt. Labs: LBL, FNAL, BNL • Universities: NC State, OSU, SDSC • http://measurement.es.net/ • Observed: • 400 usec 1-way Latency Jump (sanity check below) • Noticed by Joe Metzger • Detected: • Circuit connecting a router in the CentaurLab to the NCNI edge router moved to a different path on a metro DWDM system • 60 km optical distance increase • Confirmed by John Moore
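
The numbers hang together: light in fiber propagates at roughly c/1.47, about 5 microseconds per kilometre one way, so a quick sanity check (ours, not the slide's) attributes most of the observed jump to the extra fiber:

```python
# Propagation delay added by 60 km of extra optical distance.
C_FIBER = 3.0e8 / 1.47          # m/s, refractive index ~1.47
extra_m = 60_000
print(extra_m / C_FIBER * 1e6)  # ~294 us of the ~400 us observed jump
```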

  46. Example Application Community: ESnet / Abilene (2)

  47. American/European Demonstration Goals • Demonstrate ability to do partial path analysis between “Caltech” (Los Angeles Abilene router) and CERN. • Demonstrate ability to do partial path analysis involving nodes in the GEANT network. • Compare and contrast measurement of a “lightpath” versus a normal IP path. • Demonstrate interoperability of piPEs and analysis tools such as Advisor and MonALISA.
