
perfSONAR Multi-Domain Monitoring Service Deployment and Support: The LHC-OPN Use Case

Explore the challenges and solution of deploying and supporting a large-scale Multi-Domain Monitoring Service for the Large Hadron Collider Optical Private Network (LHC-OPN), utilizing perfSONAR. Discover the topology, requirements, benefits, and seamless monitoring capabilities across domains.


Presentation Transcript


  1. perfSONAR Multi-Domain Monitoring Service Deployment and Support: The LHC-OPN Use Case • Fausto Vetter, Domenico Vicinanza • DANTE • TNC 2010, Vilnius, 2 June 2010

  2. Agenda • Large Hadron Collider Optical Private Network (LHC-OPN) • Multi-Domain monitoring challenge: • perfSONAR • GÉANT Multi Domain Monitoring Service • GÉANT Service Desk • The LHCOPN case: • Deployment • Support • Monitoring

  3. LHC-OPN • Large Hadron Collider – Optical Private Network (LHC-OPN): • Dedicated network to support the LHC experiment • Large amount of data in a grid environment • Network architecture is organized in Tiers • 1 Tier0, 11 Tier1, 140+ Tier2 • Primary users are researchers at different institutes • Requirement: Large amount of data being exchanged • Strategy: Keep traffic segregated from the Internet • Solution: Optical Private Network (LHC-OPN) among Tier0/1 sites • Challenge: Monitoring effectively in a multi-domain environment

  4. LHC-OPN Topology • Dual-star topology • 10 Gb/s links • Cross-border fibers • Resiliency • Multi-domain [Figure: LHC-OPN Topology]

  5. Monitoring the LHC-OPN: The requirements • Focus of monitoring: • Network Layer (IP) • Physical Layer (Links) • Regular Active Point-to-Point Measurements • One-Way Delay, One-Way Delay Variation, Achievable Bandwidth, Historical Traceroute Changes • Regular Passive Point-to-Point Measurements • Utilization, Input Errors, Packet Discards • End-to-End link monitoring • Managed service • Unified view of the network status and information across all sites • Homogeneous installations and centralized operations
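
The two delay metrics named on this slide can be illustrated with a minimal sketch. It is not taken from the presentation or from perfSONAR: the `Probe` record and both function names are hypothetical, and it assumes sender and receiver clocks are synchronized, since a one-way delay is meaningless otherwise. Delay variation is computed here as the difference between consecutive one-way delays, which is one common definition.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Probe:
    """One timestamped test packet (hypothetical record, times in microseconds)."""
    send_us: int  # timestamp taken by the sender
    recv_us: int  # timestamp taken by the receiver (clocks assumed synchronized)


def one_way_delays(probes: List[Probe]) -> List[int]:
    """One-way delay of each probe: receive time minus send time."""
    return [p.recv_us - p.send_us for p in probes]


def delay_variation(delays: List[int]) -> List[int]:
    """Difference between each delay and the one before it."""
    return [later - earlier for earlier, later in zip(delays, delays[1:])]


# Three probes sent one second apart between two sites.
probes = [
    Probe(send_us=0, recv_us=10_000),
    Probe(send_us=1_000_000, recv_us=1_012_000),
    Probe(send_us=2_000_000, recv_us=2_011_000),
]
delays = one_way_delays(probes)      # [10000, 12000, 11000]
variation = delay_variation(delays)  # [2000, -1000]
```

Integer microseconds are used instead of floating-point seconds so that the differences stay exact.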

  6. Monitoring the LHC-OPN: The solution - perfSONAR • The Tool: perfSONAR • GÉANT multi-domain monitoring (MDM) tool • Based on Open Grid Forum Standard Monitoring Protocol • Customized, fully managed and supported for LHCOPN • Objective: • Identify network problems across multiple domains • Correctly, efficiently and quickly • Allowing proactive actions • Strategy: • Perform network monitoring actions in different network domains • Make the information available thanks to a common protocol • Cross-domain monitoring capability • Access network performance metrics from across multiple domains

  7. perfSONAR as unifying layer across domains • Each domain has its own local monitoring [Diagram labels: perfSONAR Services; Domain 1, Domain 2, Domain 3, Domain 4; perfSONAR UI (visualization); Scripts/API]

  8. Monitoring the LHC-OPN: The benefits • Effective monitoring across the several LHC-OPN domains • perfSONAR enables multi-domain monitoring • Problems can be tracked through the participating domains from a single interface • …proactively solving problems across domains • Effective, distributed monitoring can identify problems even before users suffer them • … through a customized web portal • Monitoring portal designed according to LHCOPN needs • Easy to integrate into the involved NOCs' workflows • Fewer disruptions and faster recovery • Makes it easy to start and foster collaborative efforts • Fully managed solution: • Low overhead for the Tier0/1 network operators involved • Configuration, Operation and Support carried out by GÉANT SD

  9. perfSONAR at LHC-OPN • 12 sites (1 Tier0, CERN, and 11 Tier1) involved • Several countries across Europe, Asia and America • Access to network measurement data from multiple network domains • Customized version of perfSONAR MDM service for Tier0/1 sites (main contributor to LHCOPN operations) • Customized visualization tools: • Dedicated web portal • Specific weather maps and further diagnosis tools to visualize measurement results • Monitoring tools, hardware and operating system packed in monitoring boxes: • To be easily deployed at any location • Remotely accessible by the service desk for operations and support

  10. GÉANT MDM Service Design for LHCOPN • Two servers installed at each site (Tier0 and Tier1): • Server 1 (HADES): • One-way delay, one-way delay variation, achievable bandwidth, historical traceroute changes • Server 2 (MDM): • Regular passive measurements carried out to collect interface utilisation, input error and packet discard statistics from the sites' network elements • Each site provided: • Gigabit port on the border router • Switch • Time Sources • DNS Servers
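
As an illustration of the interface utilisation figure collected by the passive measurements, the sketch below derives it from two readings of an interface octet counter. This is an assumption about how such a figure is typically computed, not a description of the MDM server's actual implementation; the function name, its parameters and the 64-bit counter width are hypothetical.

```python
def utilisation(prev_octets: int, curr_octets: int, interval_s: float,
                link_bps: float, counter_bits: int = 64) -> float:
    """Fraction of link capacity used between two octet-counter readings.

    The modulo handles a counter that wrapped around between the readings
    (assuming it wrapped at most once).
    """
    delta_octets = (curr_octets - prev_octets) % (1 << counter_bits)
    return delta_octets * 8 / (interval_s * link_bps)


# 37.5 GB transferred in a 5-minute polling interval on a 10 Gb/s link.
example = utilisation(prev_octets=0, curr_octets=37_500_000_000,
                      interval_s=300, link_bps=10_000_000_000)
```

In this example the link carried 3e11 bits out of a possible 3e12, so `example` is about 0.1, i.e. 10% utilisation.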

  11. perfSONAR MDM in LHC-OPN

  12. The result as displayed by the LHC-OPN Portal

  13. Weather-map E2Emon Link Status

  14. Weather-map E2Emon Link Status

  15. GÉANT Application Service Desk • Deployment carried out by the GÉANT Application Service Desk • Dedicated Staff • Manages the User Relationship • Responsible for Incident Management • Interacts with Problem Management/Product Management to Improve Products • Acts as a Single Point of Contact: • Usage of Products • Deployment of Products • Debugging Issues on Products • Focus on transition and operations of the services delivered

  16. GÉANT MDM Service Transition • Service deployment: two workflows • Server 1: OS and Software installed and configured by a GÉANT partner • Server 2: OS and Software entirely installed and configured remotely • Phase details: • Pre-Shipment: gather information about deployment details • Pre-Shipment Form • Shipment: shipment of servers to GÉANT partner and customer • Receive Boxes: customer and configuration partner receive boxes • Preparation: • Pre-Deployment Form • Third party supplier prepares servers • Physical Installation • Deployment: software installation • Configuration: service configuration • Validation

  17. MDM Service Deployment Agenda

  18. perfSONAR services monitoring • Service Monitoring Infrastructure (based on Nagios+Cacti): • Customised set of testing scripts and health checks • 35 checks per server, covering hardware, software and services • Automatic notification, detailed history • Three-layer monitoring: • Hardware layer: CPU, MEM, disk space, network interfaces, TCP/UDP traffic, temperature • Resource layer: login attempts, Tomcat RTT, eXist RTT, MySQL, NTP • Service layer: perfSONAR services availability and performance • Additional tools: • Syslog server (with MySQL support) • Security log auditing (with automatic email report tools)
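
A health check of the kind listed under the hardware layer can be sketched as follows. This is not one of the 35 checks from the presentation: the function names and the 80%/90% thresholds are hypothetical. The only convention relied on is the standard Nagios plugin contract, in which a check reports its state through the exit codes 0 (OK), 1 (WARNING), 2 (CRITICAL) and 3 (UNKNOWN) together with a one-line status message.

```python
import shutil
from typing import Tuple

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def classify(value: float, warn: float, crit: float) -> int:
    """Map a measured value to a Nagios state using two thresholds."""
    if value >= crit:
        return CRITICAL
    if value >= warn:
        return WARNING
    return OK


def check_disk(path: str, warn_pct: float = 80.0,
               crit_pct: float = 90.0) -> Tuple[int, str]:
    """Return (exit code, status line) for disk usage at `path`."""
    try:
        usage = shutil.disk_usage(path)
    except OSError as exc:
        # The measurement itself failed, so the state is UNKNOWN, not CRITICAL.
        return UNKNOWN, f"DISK UNKNOWN - {exc}"
    if usage.total == 0:
        return UNKNOWN, f"DISK UNKNOWN - zero-sized filesystem at {path}"
    used_pct = 100.0 * usage.used / usage.total
    code = classify(used_pct, warn_pct, crit_pct)
    return code, f"DISK {LABELS[code]} - {used_pct:.1f}% used on {path}"
```

A wrapper script would print the status line and exit with the returned code, for example `code, msg = check_disk("/"); print(msg); sys.exit(code)`, so that the monitoring server can pick up both the state and the message.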

  19. GÉANT MDM Service Operations: the monitoring interfaces

  20. GÉANT MDM Service Operations: incident management procedures • Well-defined procedures for Incident Management [Diagram labels: Third party supplier involved; Incident Management]

  21. Conclusions • GÉANT Application Service Desk: • Effective single point of contact in complex deployments • LHC-OPN use case: • Great opportunity for service & support infrastructure • Reasons for a successful deployment: • Preparation phase is crucial • Adequate tools for event and incident management • Customer collaboration was the main factor in the deployment • Continuous service improvement • Periodic meetings with involved parties • Quality audits of the deployment

  22. Final Remarks • Thanks to: • perfSONAR community • GÉANT partners • DANTE • perfSONAR development team • CERN and its partners • Thanks for your attention • Any questions and/or comments?
