DICE: Performance Update Eric L. Boyd (Internet2) Joe Metzger (ESnet) Nicolas Simar (G2 – JRA1)
Vision: Performance Information is … • Available • People can find it (Discovery) • “Community of trust” allows access across administrative domain boundaries (AA) • Ubiquitous • Widely deployed (Paths of interest covered) • Reliable (Consistently configured correctly) • Valuable • Actionable (Analysis suggests course of action) • Automatable (Applications act on data)
Getting There: Build & Empower the Community Decouple the Problem Space: • Analysis and Visualization • Performance Data Sharing • Performance Data Generation Grow the Footprint: • Clean APIs and protocols between each layer • Widespread deployment of measurement infrastructure • Widespread deployment of common performance measurement tools
perfSONAR Credits • perfSONAR is a joint effort: ESnet, Fermilab, GÉANT2 JRA1, Internet2, RNP • Internet2 includes: University of Delaware, Georgia Tech, Internet2 staff • GÉANT2 JRA1 includes: Arnes, Belnet, Carnet, Cesnet, DANTE, DFN, FCCN, GRNet, GARR, ISTF, PSNC, Nordunet (Uninett), Renater, RedIRIS, Surfnet, SWITCH
perfSONAR: Project Activity Meter • Interactions • 1-2 conf calls/week • 1 new service/month (accelerating) • 3-4 development workshops/year • 3-4 paper submissions/year • Recruitment • RNP has joined the effort • Outreach to LHC community • GaTech beginning six month commitment
perfSONAR: Services (1) • Measurement Point Service • Enables the initiation of performance tests • Measurement Archive Service • Stores performance monitoring results • Lookup Service • Allows the client to discover the existing services and other LS services. • Dynamic: services register themselves with the LS and advertise their capabilities; they can also leave, or be removed if a service goes down. • AuthN/Z Service • Internet2 MAT, GN2-JRA5 (eduGAIN) • Authorization functionality for the framework • Users can have several roles; authorisation is based on the user's role. • Trust relationships defined between users affiliated with different administrative domains.
perfSONAR Services (2) • Transformation Service • Transform the data (aggregation, concatenation, correlation, translation, etc). • Topology Service • Make the network topology information available to the framework. • Find the closest MP, provide topology information for visualisation tools • Resource protector • Arbitrate the consumption of limited resources between multiple services.
Types of perfSONAR Services • Core Services • Set released by perfSONAR Team • e.g. LS, AA, 3 MPs, 2 MAs, RP, ToS, TS • Tested for interoperability • Serve as examples for affiliated developers • Targeted at next generation network needs (e.g. GÉANT2, Internet2 New Network, etc.) • Affiliated Services • Released by perfSONAR partners, lag Core • May share development infrastructure (Bugzilla, Website, Mailing Lists) • Candidates for migration to Core Services • Unaffiliated Services
perfSONAR: Core Status Update • Production release of core services package v1.0 ready (pending licensing completion) • Core services include: • Single domain LS solution (PSNC) • RRD MA (PSNC) • Affiliate services and client applications supporting this version will soon follow: • BWCTL MP (DFN) • perfSONAR UI (ISTF) • Ongoing work • AA Design (Internet2, JRA1, JRA5) • Multi-LS (PSNC, RNP, UDel) • ToS (DFN, UDel)
perfSONAR Process Status Update • We have processes … ;-) • Release management process implemented (Internet2, RedIRIS, UDel) • Bugzilla up and running (UDel) • Migrated from CVS to SVN (Internet2) • Functional testing under construction (GRnet) • Monitoring deployed services with Tomcat (ISTF) • Installation process eased significantly (DANTE, PSNC, UDel) • www.perfsonar.net under development (Internet2, Renater) • Development information will stay on the Wiki • Adopter information will migrate to website
perfSONAR: Affiliate Status Update • Affiliated Services • Command Line Interface MP (Ping, OWAMP, Traceroute) (RNP, released) • BWCTL MP (DFN, released) • SQL MA (PSNC, released) • L2-specific MA (DANTE) • SSH MP (Looking Glass) (Belnet, released) • ABW MP (bandwidth packet capture cards) (Cesnet) • NMS MP (SDH status) (DANTE) • Hades MA (OWD, Jitter, OWPL) (DFN) • Flow Replicator MA (Surfnet, Carnet) • User Interfaces • CNM (DFN) • perfSONAR UI (ISTF) • Visual PerfSONAR (Carnet) • Looking Glass (Belnet) • ICE/NeTraMet (RNP)
What You See Is What You Get • perfsonarUI • Retrieval of published data • RRD MA • Hades MA • Visualisation of OWD, IPDV and packet loss between Hades MPs • Parsing of arbitrary IPv4 or IPv6 traceroute commands • CNM – map based • GEANT2 + NREN maps • VisualperfSONAR • Looking Glass
RRD MA features • Wrapper around RRDtool. • Request/reply interface. • Write into RRD. • LS registration. • Installation scripts. • Test configuration files available.
Lookup Service Features • Centralized LS (a distributed LS is under development) • Service registration (including updates) functionality • Service deregistration functionality • Lookup/query functionality (XQuery/XPath) • Service keep-alives • including database cleanup, scheduled functionality • Registration component available for services. • Installation scripts.
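The registration, deregistration, keep-alive and cleanup features listed above can be illustrated with a minimal in-memory sketch. This is not the perfSONAR LS implementation: the class and method names (`LookupService`, `register`, `keep_alive`, `cleanup`, `lookup`), the TTL value and the plain-Python filter standing in for XQuery/XPath are all assumptions made for illustration.

```python
import time
from dataclasses import dataclass


@dataclass
class Registration:
    """One registered service (all field names are hypothetical)."""
    service_id: str
    service_type: str      # e.g. "MA" or "MP"
    capabilities: set
    last_seen: float       # time of the last registration or keep-alive


class LookupService:
    """Toy centralized registry with keep-alive based cleanup."""

    def __init__(self, ttl_seconds=60.0, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self.entries = {}

    def register(self, service_id, service_type, capabilities):
        # Registering again with the same id acts as an update.
        self.entries[service_id] = Registration(
            service_id, service_type, set(capabilities), self.clock())

    def deregister(self, service_id):
        self.entries.pop(service_id, None)

    def keep_alive(self, service_id):
        # Returns False if the service is unknown and must register again.
        entry = self.entries.get(service_id)
        if entry is None:
            return False
        entry.last_seen = self.clock()
        return True

    def cleanup(self):
        # Scheduled task: drop services whose keep-alive has expired.
        now = self.clock()
        expired = [sid for sid, e in self.entries.items()
                   if now - e.last_seen > self.ttl]
        for sid in expired:
            del self.entries[sid]
        return expired

    def lookup(self, service_type=None, capability=None):
        # Plain filtering stands in for the XQuery/XPath query interface.
        return sorted(
            e.service_id for e in self.entries.values()
            if (service_type is None or e.service_type == service_type)
            and (capability is None or capability in e.capabilities))
```

A service would call `register` once, then `keep_alive` periodically; the LS runs `cleanup` on a schedule so that services that went down silently disappear from query results.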
PerfSONAR Next steps • Formal partnership • License, Partnership Agreement • Interim solution • Upgrade existing user base (currently using prototype) • Data exchange policy (measurement peering agreement) • Consistent offer of services. • What services package to suggest to networks. • L2 status monitoring.
ESnet Joe Metzger
Last months (Jan – May) - 1 • Services • Lookup Service • Centralized • Registration / deregistration • Lookup query • Result code • SQL MA • Stores data in relational database • Supports • Utilization • L2 status • Result code. • HADES MA • Provides access to the data archive of Hades measurements from GEANT2 network
Last months - 2 • Tools integration • Telnet / SSH MP • On-demand requests for device specific information • Cisco/Juniper/Quagga support • Resource protection mechanisms • To verify the parameters sent in the commands • To prevent a flood of requests • BWCTL / OWAMP MP • BWCTL • TCP throughput measurements • OWAMP • OWD, PL measurements
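The two resource protection mechanisms named above (parameter verification and flood prevention) can be sketched as follows. This is only an illustration, not the Telnet/SSH MP code: the command whitelist, the `RequestGuard` class and the sliding-window limit are hypothetical choices.

```python
import ipaddress
import time
from collections import deque


def is_ip_address(text):
    """Accept only a literal IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(text)
        return True
    except ValueError:
        return False


# Hypothetical whitelist: command name -> validator for its single argument.
ALLOWED_COMMANDS = {"ping": is_ip_address, "traceroute": is_ip_address}


class RequestGuard:
    """Validates command parameters and limits requests per time window."""

    def __init__(self, max_requests, window_seconds, clock=time.monotonic):
        self.max_requests = max_requests
        self.window = window_seconds
        self.clock = clock
        self.recent = deque()   # timestamps of accepted requests

    def check(self, command, argument):
        validator = ALLOWED_COMMANDS.get(command)
        if validator is None or not validator(argument):
            return "rejected: invalid command or parameter"
        now = self.clock()
        # Forget accepted requests that have left the sliding window.
        while self.recent and now - self.recent[0] >= self.window:
            self.recent.popleft()
        if len(self.recent) >= self.max_requests:
            return "rejected: too many requests"
        self.recent.append(now)
        return "accepted"
```

Validating against a whitelist, rather than filtering out known-bad input, means that an argument such as `192.0.2.1; reboot` never reaches the router's command line.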
Last months - 3 • Tools integration • Passive • ABW • Counts the number of captured packets and bytes and computes used bandwidth • Short timescale intervals • Tracefile Capture Measurement Point (TCMP) • Used for capturing packets of selected flows using either regular Ethernet cards or special DAG or COMBO6 cards • SNMP MP • Web Service access to SNMP • Supports Get (for now) and OID discovery
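The ABW calculation described above (count packets and bytes, then derive used bandwidth over short intervals) can be sketched as below. The function name and the `(timestamp, size)` input format are assumptions for illustration; the real ABW MP works on hardware capture cards.

```python
from collections import defaultdict


def bandwidth_per_interval(packets, interval_seconds):
    """Aggregate captured packets into fixed-length intervals.

    packets: iterable of (timestamp_seconds, size_bytes) tuples.
    Returns {interval_index: (packet_count, byte_count, bits_per_second)}.
    """
    counts = defaultdict(lambda: [0, 0])
    for timestamp, size in packets:
        bucket = int(timestamp // interval_seconds)
        counts[bucket][0] += 1        # packets seen in this interval
        counts[bucket][1] += size     # bytes seen in this interval
    return {
        bucket: (n, total, total * 8 / interval_seconds)
        for bucket, (n, total) in sorted(counts.items())
    }
```

Shorter intervals expose bursts that a coarse average (for example a 5-minute SNMP counter poll) would smooth away, at the cost of more data points to store.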
Last months - 4 • Alcatel NMS MP • Web Service access to SDH and WDM monitoring parameters such as SES, ES, UAS and also the G.709 metric BBE • Acts as a reference implementation • Can be used by other NRENs in order to build perfSONAR compliant services which can retrieve data from NMS • AA • Designing and developing a perfSONAR AA service making use of JRA5's eduGAIN • Topology Service • Common schema with SA3
Meet the NRENs sessions • Powerful tools and useful information • Design (the MA, MP, … approach is good) • The number of deployed services is high • Friendly user interfaces • Tools motivate attendees to install services • Sharing of information between projects is useful • Need to integrate the tools in a single visualisation application • Too few network nodes are running the services • Not enough data available • Not enough information available about perfSONAR • Would like to have libraries/APIs • Requirement for having its network perfSONAR enabled
Next dissemination workshops • SEEREN2 workshop in Heraklion. • E2E service status services deployment for NRENs next week in Muenchen. • Three more installation workshops planned over the next 12 months.
Data Exchange for E2E Monitoring – Archive scenario • NREN: in charge of retrieving the data from the NMS/DB, analysing it and passing the information to a Java class. About 700-1000 lines of code for GÉANT (15 days). • JRA1: provides the “mySQL MA service” code and maintains it; provides the script to write into the DB. • JRA4: in charge of the E2E NOC visualisation. Connect. Communicate. Collaborate
Year 3 Objectives • Improving the visualisation and tools features (NOC, PERT, project) • Integration of AA. • Services deployments. • Going operational (with SA3 WI15). • Mastering the amount of data. • L1-L2 • Dissemination workshops for NRENs.
Timeline
Phase III • End of November 2006 • Going operational (SA3 WI-15) : • RRD MA • LS (plus LS registration for the other services) • SNMP MP • perfsonarUI • CNM • Hades and RIPE TTM MA • BWCTL MP. • L2 status MP. • Novelties • Netflow Integration • Topology Service • VisualperfSONAR • Multi-LS • Push interface
Phase IV • End of May 2007 • Going operational (SA3 WI-15) : • Multi-LS • Topology Service • Hades MP • BWCTL MA • VisualperfSONAR • Novelties • First set of services using JRA5 Authentication with some Authorization. • Performance anomaly detection
Phase V • End of November 2007 • Going operational (SA3 WI-15) : • Authentication Service
Internet2 Eric Boyd
Result: No more mystery … • Increase network awareness • Set user expectations accurately • Reduce diagnostic costs • Performance problems noticed early • Performance problems addressed efficiently • Network engineers can see & act outside their turf • Transform application design • Incorporate network intuition into application behavior
Immediate Game-plan: • Internet2 is leveraged to help provide diagnostic information for “backbone” portion of problem • Create *some* diagnostic tools • Make Abilene data as public as is reasonable • Work on efforts to more widely make performance data available (perfSONAR) • Contribute to ‘base’ development • Integrate ‘our’ diagnostic tools as ‘good’ example MP/MA services
BWCTL (Bandwidth Controller) • What is it? A resource allocation and scheduling daemon for arbitration of iperf tests • Typical Solution • Run “iperf” or similar tool on two endpoints and hosts on intermediate paths • Typical road blocks • Need permissions on all systems involved • Need to coordinate testing with others • Need to run software on both sides with specified test parameters
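The arbitration idea behind BWCTL (tests must not overlap on a host, and a policy limits what a requester may ask for) can be illustrated with a small slot scheduler. This sketch is not BWCTL's actual protocol or code; the `SlotScheduler` class, its single `max_duration` policy and the time units are hypothetical.

```python
class SlotScheduler:
    """Grants non-overlapping time slots for throughput tests on one host."""

    def __init__(self, max_duration):
        self.max_duration = max_duration   # policy limit per test, in seconds
        self.reservations = []             # sorted, non-overlapping (start, end)

    def request(self, earliest_start, duration):
        """Return the granted start time, or None if policy denies the test."""
        if duration > self.max_duration:
            return None
        start = earliest_start
        for res_start, res_end in self.reservations:
            if start + duration <= res_start:
                break              # the test fits in the gap before this slot
            if start < res_end:
                start = res_end    # overlap: try right after this slot
        self.reservations.append((start, start + duration))
        self.reservations.sort()
        return start
```

Because the daemon hands out the slot, the two endpoints no longer need shell accounts on each other's hosts or an e-mail exchange to agree on when to run the test.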
OWAMP: One-Way Active Measurement Protocol • What is it? • Measures one-way latency: 1-way ping • Control connection used to broker test request based upon policy restrictions and available resources. (Bandwidth/disk limits) • Specification • http://tools.ietf.org/wg/ippm/draft-ietf-ippm-owdp/draft-ietf-ippm-owdp-14.txt
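As a rough illustration of what a "1-way ping" yields, the sketch below derives one-way delay and loss from per-packet send and receive timestamps. It assumes the sender and receiver clocks are synchronized, which one-way measurement depends on; the function name and the dictionary-based input format are hypothetical and unrelated to the OWAMP wire format.

```python
def one_way_stats(sent, received):
    """Compute one-way delays and loss for a test session.

    sent:     {sequence_number: send_time} recorded by the sender.
    received: {sequence_number: receive_time} recorded by the receiver.
    Times must use the same unit and come from synchronized clocks.
    """
    delays = {seq: received[seq] - sent[seq] for seq in sent if seq in received}
    lost = sorted(seq for seq in sent if seq not in received)
    return {
        "delays": delays,
        "lost": lost,
        "loss_ratio": len(lost) / len(sent) if sent else 0.0,
        "min_delay": min(delays.values()) if delays else None,
    }
```

Running the same calculation in each direction separately is what distinguishes this from round-trip ping: an asymmetric path or congestion in only one direction becomes visible.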
Thrulay Overview • Network capacity and delay tester • Same class of tools as iperf, netperf, nettest, nuttcp, ttcp, etc. • Unique features not found in other tools: • TCP: measures round-trip delay along with goodput • UDP: measures: • One-way delay, with quantiles • Packet loss • Packet duplication • Reordering • UDP: ability to send precisely positioned true Poisson streams (microsecond errors in sending times) • Human and machine-readable (ready to be fed to gnuplot)
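Two of the ideas listed above, Poisson send schedules and delay quantiles, can be sketched in a few lines. This is not thrulay's code and ignores the hard part (hitting the scheduled times with microsecond precision); the function names are hypothetical. A Poisson stream is obtained by drawing independent exponential gaps between packets.

```python
import random
import statistics


def poisson_send_times(rate_per_second, count, seed=None):
    """Return `count` send times (seconds) forming a Poisson stream.

    Gaps between consecutive packets are exponentially distributed
    with mean 1 / rate_per_second.
    """
    rng = random.Random(seed)
    t = 0.0
    times = []
    for _ in range(count):
        t += rng.expovariate(rate_per_second)
        times.append(t)
    return times


def delay_quantiles(delays):
    """Summarize measured one-way delays by median and 95th percentile."""
    cuts = statistics.quantiles(delays, n=20, method="inclusive")
    return {"median": statistics.median(delays), "p95": cuts[18]}
```

Quantiles are reported instead of a mean because delay distributions are typically long-tailed; a handful of queued packets can move the mean without changing the typical experience.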
Thrulay Update • New release v0.8 • Tests with multiple TCP streams • Set DSCP (a.k.a. first 6 bits of the TOS byte) • Report MTU and/or MSS (whichever the OS makes available) • More UDP statistics: duplication, reordering, quantiles of delay • SPARC/Solaris support • Mac OS X support • IPv6 support • Non-busy-waiting UDP mode (less precise, but can run more concurrent tests) • Documentation: manual pages have been added • Basic client authorization based on IP address • Integration of TSC timekeeping projects for faster and more precise timestamping
NDT: Network Diagnostic Tool • Web100 enhanced server handles testing and diagnostic services • Java based and command line clients allow testing from any client (local or remote) • Performance and configuration faults reported back to client • Drill-down functions provide more details & error reporting capabilities • Grant from NIH/NLM to explore duplex mismatch detection
NDT Flow Diagram (figure labels): Well-known NDT server • NDT server • Client • Web browser • Web server • Web request • Redirect msg • Web page request • Web page response • Testing engine • Java applet • Test request • Control channel • Spawn child • Child test engine • Specific test channels
Bulk Transport • Build a library / tool for bulk transport that does not require kernel level modifications yet achieves the performance of kernel-level solutions • VFER library • Congestion control hooks • Implements loss-based congestion control • Working on delay-based version • File transfer utility • An initial version demoed
Everything we work on is available • Tools are open source, supported, well-documented • BWCTL/Iperf, OWAMP, NDT are deployed across Abilene backbone and at many partners • You can: • See ongoing measurement results at the Abilene Observatory • Test to/from the Abilene backbone
Network Performance Measurement Workshops • Example Course Materials: • http://e2epi.internet2.edu/npw/presentations.html Goals: • Grow installed base of BWCTL/Iperf, OWAMP, and NDT at GigaPoP and regional campuses. • http://e2epi.internet2.edu/pipes/pmp/pmp-dir.html • Begin integration into IT support processes. • Create an installed base for perfSONAR deployment. • Give each participant tool-specific cookbooks.
Network Performance Measurement Workshop Locations and Dates • Completed: SOX / GaTech (03/05), CENIC / UCLA (06/05), JT – Vancouver (07/05), OARNet / OSU (09/05), MAGPI / FMM (09/05), MAX / College Park (12/05), APAN (01/06), JT - Albuquerque (02/06), MERIT (02/06), Columbia / NYSERNet (04/06), University of Virginia (04/06) • Planned: Wisconsin (07/06) • Under Consideration: Alaska, …