Explore the evolving story of measurement on the Internet2 Network, including the Internet2 Observatory and its capabilities for data collection and research.
Measurement on the Internet2 Network: an evolving story
Matt Zekauskas
Joint Techs, Minneapolis, 11-Feb-2007
Outline
• The Internet2 Observatory
• What we are measuring today
• The perfSONAR vision
• What is happening in the near term
• LHC OPN “e2emon”
The Observatory
• Collect data for operations
  – Understanding the network, and how well it is operating
  – How we started
• Collect data for research
  – Part of Internet2’s long-standing commitment to network research
The Observatory
• Two components:
  – Data collected by the NOC and Internet2 itself
  – Ability for researchers to collocate equipment when necessary
The New Internet2 Network
• Expanded Layer 1, 2, and 3 facilities
  – Includes SONET and wave equipment
  – Includes Ethernet services
  – Greater IP services
• Requires an expanded Observatory
In Brief
• Extends to all optical add/drop sites, adding the capability to:
  – Run the control software
  – Perform other out-of-band management tasks
• Refresh of the Observatory
  – Refresh PCs
  – 10G capabilities on IPO
  – 10G capability on the Ciena network (planned, next year)
  – Experimental NetFPGA cards (planned, next year)
• Standing up each node as it is installed
The New Internet2 Observatory
• Seek input from the community, both engineers and network researchers
• Current thinking is to support three types of services:
  – Measurement (as before)
  – Collocation (as before)
  – Experimental servers to support specific projects, for example Phoebus (this is new)
• Support different types of nodes:
  – Optical nodes
  – Router nodes
Existing Observatory Capabilities
• One-way latency, jitter, loss
  – IPv4 and IPv6 (“owamp”)
• Regular TCP/UDP throughput tests at ~1 Gbps
  – IPv4 and IPv6; on-demand tests available (“bwctl”)
• SNMP
  – Octets, packets, errors; collected once per minute
• Flow data
  – Addresses anonymized by zeroing the low-order 11 bits (see the sketch below)
• Routing updates
  – Both IGP and BGP; the measurement device participates in both
• Router configuration
  – Visible Backbone: collected hourly from all routers
• Dynamic updates
  – Syslog; also alarm generation (~nagios); polling via router proxy
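The 11-bit anonymization is easy to make concrete. A minimal sketch in Python, assuming IPv4; the function name and the use of the ipaddress module are illustrative, not the Observatory's actual implementation:

```python
import ipaddress

def anonymize(addr: str, low_bits: int = 11) -> str:
    """Zero the low-order bits of an IPv4 address before export,
    as the Observatory does for flow data (11 bits by default)."""
    ip = ipaddress.IPv4Address(addr)
    mask = (2**32 - 1) ^ (2**low_bits - 1)  # 0xFFFFF800 for 11 bits, i.e. a /21
    return str(ipaddress.IPv4Address(int(ip) & mask))

# Hosts within the same /21 become indistinguishable:
print(anonymize("192.0.2.130"))  # -> 192.0.0.0
print(anonymize("192.0.2.200"))  # -> 192.0.0.0
```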
Observatory Hardware
• Dell 1950 and Dell 2950 servers
  – Dual-core 3.0 GHz Xeon processors
  – 2 GB memory
  – Dual RAID 146 GB disks
  – Integrated 1 GE copper interfaces
  – 10 GE interfaces
• Hewlett-Packard 10 GE switches
• 9 servers at router sites; 3 planned at optical-only sites (initially 1, for control)
Observatory Databases – Data Types
• Data is collected locally and stored in distributed databases
• Databases:
  – Usage data
  – Netflow data
  – Routing data
  – Latency data
  – Throughput data
  – Router data
  – Syslog data
Lots of Work to be Done
• Internet2 Observatory realization inside the racks is set for initial deployment, including planning for research projects (NetFPGA, Phoebus)
  – Software and links are easily changed
  – Hardware could be added or changed depending on costs
• Researcher tools, new datasets
• Consensus on passive data
New Challenges
• Operations and characterization of new services
  – Finding problems with stitched-together VLANs
• Collecting and exporting data from the Dynamic Circuit Service...
  – Ciena performance counters
  – Control-plane setup information
  – Circuit usage (not utilization, although that is also nice; see the sketch below)
  – Similar data for the underlying Infinera equipment
• And consider inter-domain issues
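Utilization itself is already derivable from the SNMP octet counters collected today. A minimal sketch, assuming standard 64-bit interface counters (ifHCInOctets-style semantics); the function name is illustrative:

```python
def utilization(prev_octets: int, cur_octets: int,
                interval_s: float, link_bps: float,
                counter_bits: int = 64) -> float:
    """Fraction of link capacity used between two SNMP counter samples
    (the Observatory collects octet counters once per minute)."""
    delta = (cur_octets - prev_octets) % (2 ** counter_bits)  # survive counter wrap
    return (delta * 8) / (interval_s * link_bps)

# Example: 45 GB moved in one minute on a 10 Gb/s link -> 60% utilized.
print(utilization(0, 45_000_000_000, 60.0, 10e9))  # -> 0.6
```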
Observatory Requirements Strawman
• Small group: Dan Magorian, Joe Metzger, and Internet2
• See the document off of http://measurement.internet2.edu/
• Want to start a working group under the new Network Technical Advisory Committee
• Interested? Talk to Matt, or watch the NTAC wiki on wiki.internet2.edu; the measurement page will also have some information…
Strawman: Potential New Focus Areas
• Technology issues
  – Is it working? How well? How do we debug problems?
• Economy issues – interdomain circuits
  – How are they used? Are they used effectively?
  – Monitor violations of any rules (e.g., for short-term circuits)
  – Compare with “vanilla” IP services?
Strawman: Potential High-Level Goals
• Extend research datasets to the new equipment
• Circuit “weathermap”; optical proxy
• Auditing circuits:
  – Who requested them (at suitable granularity)?
  – What for? (e.g., bulk data, streaming media, experiment control)
  – Why? (additional bandwidth, required characteristics, application isolation, security)
Inter-Domain Issues Are Important
• New services (various circuits)
• A new control plane
  – It must work across domains
  – It will require some agreement among the various providers
• Want to allow for diversity…
Sharing Observatory Data
We want to make Internet2 Network Observatory data:
• Available:
  – Access to existing active and passive measurement data
  – Ability to run new active measurement tests
• Interoperable:
  – Common schema and semantics, shared across other networks
  – A single format
  – XML-based discovery of what’s available
What is perfSONAR?
• Performance middleware
• perfSONAR is an international consortium in which Internet2 is a founder and leading participant
• perfSONAR is a set of protocol standards for interoperability between measurement and monitoring systems
• perfSONAR is a set of open-source web services that can be mixed and matched and extended to create a performance monitoring framework (see the sketch below)
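To make the web-service idea concrete, here is a minimal sketch of the interaction pattern: a client POSTs an XML query to a measurement archive and receives measurement metadata back. The endpoint URL is hypothetical and the message is abbreviated; real messages follow the OGF NM-WG schemas and are typically wrapped in a SOAP envelope:

```python
import urllib.request

# Hypothetical endpoint; real deployments publish their own service URLs.
MA_URL = "http://ma.example.net:8080/perfSONAR_PS/services/snmpMA"

# Abbreviated NM-WG style request asking what utilization data exists.
request_xml = """<nmwg:message type="MetadataKeyRequest"
    xmlns:nmwg="http://ggf.org/ns/nmwg/base/2.0/">
  <nmwg:metadata id="meta1">
    <nmwg:eventType>http://ggf.org/ns/nmwg/characteristic/utilization/2.0</nmwg:eventType>
  </nmwg:metadata>
  <nmwg:data id="data1" metadataIdRef="meta1"/>
</nmwg:message>"""

req = urllib.request.Request(MA_URL, data=request_xml.encode(),
                             headers={"Content-Type": "text/xml"})
with urllib.request.urlopen(req) as resp:
    print(resp.read().decode())  # XML response describing available data
```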
perfSONAR Design Goals
• Standards-based
• Modular
• Decentralized
• Locally controlled
• Open source
• Extensible
• Applicable to multiple generations of network monitoring systems
• Grows “beyond our control”
• Customized for individual science disciplines
perfSONAR Integrates
• Network measurement tools
• Network measurement archives
• Discovery
• Authentication and authorization
• Data manipulation
• Resource protection
• Topology
perfSONAR Credits
perfSONAR is a joint effort of ESnet, GÉANT2 JRA1, Internet2, and RNP.
• ESnet includes: ESnet/LBL staff, Fermilab
• Internet2 includes: University of Delaware, Georgia Tech, SLAC, Internet2 staff
• GÉANT2 JRA1 includes: Arnes, Belnet, Carnet, Cesnet, CYNet, DANTE, DFN, FCCN, GRNet, GARR, ISTF, PSNC, Nordunet (Uninett), Renater, RedIRIS, Surfnet, SWITCH
perfSONAR Adoption
• R&E networks: Internet2, ESnet, GÉANT2, European NRENs, RNP
• Application communities: LHC, GLORIAD (distributed virtual NOC); roll-out to other application communities in 2007
• Distributed development:
  – Individual projects (10 before the first release) write components that integrate into the overall framework
  – Individual communities (5 before the first release) write their own analysis and visualization software
Proposed Data to be Made Available via perfSONAR
• First priorities:
  – Link status (Ciena data)
  – SNMP data
  – OWAMP
  – BWCTL
• Second priorities:
  – Additional Ciena data: Ethernet stats, SONET (severely errored seconds, etc.), light levels
  – Similar Infinera data
• Later: flow data
• Feedback? Alternate priorities?
What Will (Eventually) Consume the Data?
• We intend to create a series of web pages that will display the data
• Third-party analysis and visualization tools:
  – European and Brazilian UIs
  – SLAC-built analysis software
  – LHC OPN E2EMON
  – More…
• Real applications:
  – Network-aware applications that consume performance data, react to network conditions, and request dynamic provisioning (see the sketch below)
  – Future example: Phoebus
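What a “network-aware application” might look like, as a hypothetical sketch: every name here (fetch_owamp_stats, request_circuit, the thresholds) is an illustrative stand-in, not a real Observatory or perfSONAR API:

```python
def fetch_owamp_stats(src: str, dst: str) -> dict:
    """Stand-in for querying a latency/loss archive for the path src -> dst."""
    return {"loss_pct": 0.8, "median_delay_ms": 23.5}  # canned example data

def request_circuit(src: str, dst: str, mbps: int) -> None:
    """Stand-in for asking a dynamic circuit service for dedicated capacity."""
    print(f"requesting a {mbps} Mb/s circuit from {src} to {dst}")

def plan_transfer(src: str, dst: str, gigabytes: float) -> None:
    stats = fetch_owamp_stats(src, dst)
    # React to conditions: sustained loss cripples TCP on the shared IP
    # path, so fall back to a dedicated circuit for the bulk transfer.
    if stats["loss_pct"] > 0.5:
        request_circuit(src, dst, mbps=1000)
    else:
        print(f"IP path looks clean; sending {gigabytes} GB over TCP")

plan_transfer("chicago", "new-york", gigabytes=500)
```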
JRA4 E2EMon Slides
From Mauro Campanella, GARR, November 2006
Demo: http://cnmdev.lrz-muenchen.de/e2e/html/G2_E2E_index.html
Problem Space
Goal: (near) real-time monitoring (link status) of the constituent DomainLinks (and the links between domains) as well as the whole end-to-end link A–B. This applies to the GÉANT2+ service and the cross-border fibres.
[Diagram: E2ELink A–B runs from Point A to Point B across Domains A, B, and C.]
Divide & Conquer (JRA4 E2Emon Info Model)
The JRA4 view of the world: note the WDM systems and static lambdas.
[Diagram of the JRA4 E2Emon information model.]
Approach
• Each domain exposes DomainLink and (partial) ID_Link status through a perfSONAR Measurement Point (MP) or Measurement Archive (MA)
• The E2Emon correlator combines the per-domain statuses into an end-to-end view (see the sketch below), feeding the E2ECU operators and a “weathermap” view for users
[Diagram: perfSONAR MPs/MAs in Domains A, B, and C feed the E2Emon correlator for E2ELink A–B.]
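The correlator's core job reduces to combining per-segment statuses into one end-to-end status. A minimal sketch of that logic, assuming a simple worst-status-wins model; E2Emon's actual status vocabulary and rules are richer:

```python
# Status values and their ordering are an assumption for this sketch.
UP, UNKNOWN, DEGRADED, DOWN = "up", "unknown", "degraded", "down"
SEVERITY = {UP: 0, UNKNOWN: 1, DEGRADED: 2, DOWN: 3}

def e2e_status(segment_statuses: list[str]) -> str:
    """An end-to-end link is only as healthy as its worst segment."""
    if not segment_statuses:
        return UNKNOWN
    return max(segment_statuses, key=lambda s: SEVERITY[s])

# E2ELink A-B: three DomainLinks plus two inter-domain links.
print(e2e_status([UP, UP, DEGRADED, UP, UP]))  # -> degraded
```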
LHC-OPN e2e Monitoring
• An e2e lightpath from CNAF (Bologna, Italy) to Karlsruhe (Germany)
• The logical topology built for the e2e monitoring system abstracts the internal topology of each domain, producing a simpler topology
[Diagram: the lightpath from CNAF through GARR (BO, MI, PD), the SWITCH WDM system at Manno, and the DFN WDM system to Karlsruhe.]
LHC-OPN e2e Monitoring
[Diagram: the same path partitioned into Domains 1–5, marking End Points (EP), Demarcation Points (DP), ID Links, and Domain Links.]
LHC-OPN e2e Monitoring
[Diagram: in each domain, an MP polls the network for acquisition, and a domain aggregation and XML generation script produces the domain MA; the E2E Monitoring System performs interdomain aggregation via web services and presents the result to the user. A sketch of the per-domain script follows.]
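A minimal sketch of what the per-domain “domain aggregation and XML generation script” might publish for the central system to poll; the element and attribute names (and the example link IDs) are invented for illustration, not the real E2Emon exchange format, which is defined by GÉANT2 JRA4:

```python
from xml.etree.ElementTree import Element, SubElement, tostring

def domain_status_xml(domain: str, links: dict[str, str]) -> bytes:
    """Render one domain's link statuses as a small XML document."""
    root = Element("domainStatus", {"domain": domain})
    for link_id, status in links.items():
        SubElement(root, "link", {"id": link_id, "status": status})
    return tostring(root)

# Each domain publishes one such document; the correlator polls them all.
print(domain_status_xml("GARR", {
    "CNAF-MI-lambda": "up",
    "MI-Manno-lambda": "up",
}).decode())
```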
CNAF - CERN: GARR Monitoring Flow
GINS (the GARR network monitoring system) checks the status of the logical circuits in the GARR domain and provides the result to the GARR MP. The central e2e measurement system queries each domain and provides the global e2e status. This demonstrates the independence of the domains, how easily the information can be aggregated, and the scalability of the approach.
[Diagram: GINS monitors the GARR domain from the CNAF end point toward GÉANT2 (MPLS LSP, IP/L2 links); it passes XML data to the GARR MP, which the E2E MS queries.]
GARR Monitoring Domain
[Diagram: within GARR, GINS and the GINS e2e service (status aggregation) check the status of the segments (lambdas, MPLS, IP); the E2E Monitoring System consumes the aggregated status.]
CNAF - CERN: E2E MS User Interface
[Screenshot of the E2E Monitoring System user interface.]
CNAF - CERN: GARR GINS User Interface
[Screenshot of the GARR GINS user interface.]
(Slides from Marco Marletta and Giovanni Cesaroni, GARR)
Measurement System Future Work - Wish List
• Define and implement a “degraded” link status
• Add a scheduled-maintenance indication
• Add more detail to the data model
  – Break DomainLink down into its constituent parts? (e.g., OCh trails)
  – Use more information from the equipment