
LHCOPN Update

September 2010 GDB. LHCOPN Update. John Shade / CERN IT-CS.


Presentation Transcript


  1. September 2010 GDB • LHCOPN Update • John Shade / CERN IT-CS

  2. Working Groups
     • LHCOPN Operations and Monitoring WGs
     • F2F LHCOPN meetings
       • London, 8/9 March (http://indico.cern.ch/conferenceDisplay.py?confId=80755)
       • Barcelona, 28/29 June (http://indico.cern.ch/conferenceDisplay.py?confId=88698)
     • Operations WG
       • Quarterly phone conferences
       • Track correlation between outages and GGUS tickets
     • Monitoring WG
       • Conference calls in May/June, numerous e-mail exchanges
       • perfSONAR MDM setup & deployment
       • LHCOPN Dashboard design
     • Mailing list: LHCOPN-Interest@cern.ch
     J. Shade/GDB LHCOPN Update

  3. Monitoring
     • Working with DANTE to get a robust MDM solution in place (perfSONAR rollout had stalled)
     • Clarified how to access performance data, and defined requirements for a dashboard for visualisation:
       • See https://twiki.cern.ch/twiki/bin/view/LHCOPN/MonWG
     • Comments on the Requirements document are still welcome!

  4. Monitoring
     • Missing a central view of LHCOPN
     • Weathermap and e2emon applications restricted to GEANT portal
     • HADES data:
       • Bandwidth Test Control / Achievable Bandwidth (BWCTL, automated 1 Gbit/s TCP Bandwidth Control Test)
       • One Way Delay (OWD)
       • One Way Delay Variance / Jitter (OWDV)
       • Packet loss
       • Traceroute (number of hops between two HADES nodes)
       • Duplicate packets
       • Out of order packets

  5. Initial (simple) algorithm
     a) Site status is up when OWD is within +/-15% of baseline and packet loss is less than 0.1% per five minutes
     b) Site status is down when packet loss is 100% per five minutes
     c) Site status is degraded when measurement values are between a) and b)
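
The rule on slide 5 can be sketched in a few lines of Python. This is only an illustration of the three conditions as stated on the slide, not the dashboard's actual code: the `Sample` class, its field names, the `site_status` function and the way the baseline is supplied are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    """One five-minute measurement window (hypothetical structure)."""
    owd_ms: float           # measured one-way delay, in milliseconds
    baseline_owd_ms: float  # expected one-way delay for this path
    packet_loss: float      # fraction of packets lost, 0.0 to 1.0


def site_status(sample: Sample,
                owd_tolerance: float = 0.15,
                loss_threshold: float = 0.001) -> str:
    """Classify a site as 'up', 'down' or 'degraded'.

    up:       OWD within +/-15% of baseline and loss below 0.1%
    down:     100% packet loss
    degraded: anything in between
    """
    if sample.baseline_owd_ms <= 0:
        raise ValueError("baseline OWD must be positive")

    # Total loss means no delay measurement is meaningful: the site is down.
    if sample.packet_loss >= 1.0:
        return "down"

    # Relative deviation of the measured delay from the baseline.
    deviation = abs(sample.owd_ms - sample.baseline_owd_ms) / sample.baseline_owd_ms
    if deviation <= owd_tolerance and sample.packet_loss < loss_threshold:
        return "up"

    return "degraded"


if __name__ == "__main__":
    print(site_status(Sample(owd_ms=10.0, baseline_owd_ms=10.0, packet_loss=0.0)))
    print(site_status(Sample(owd_ms=12.0, baseline_owd_ms=10.0, packet_loss=0.0)))
    print(site_status(Sample(owd_ms=10.0, baseline_owd_ms=10.0, packet_loss=1.0)))
```

Checking for total loss first keeps the "down" case independent of the delay value, which is undefined when no packets arrive.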

  6. Prototype Dashboard

  7. Prototype Dashboard

  8. Where do we go from here?
     • DANTE baulked at the idea of developing their prototype further and supporting it
     • SARA and CERN have picked up the gauntlet
     • SARA developers have tested XML query/responses against the central HADES repository at DFN
     • TOM team leader is evaluating how best to develop/integrate the LHCOPN dashboard
     • Sites already have local monitoring, but we need to provide a central view!
     • Nagios probes for sites are also expected

  9. Upcoming Events
     • Next F2F LHCOPN meeting will take place at CERN on 7th-8th October
       • Agenda: http://indico.cern.ch/conferenceDisplay.py?confId=102716
       • Includes participants from Internet2, DANTE, T1s etc.
     • Topics to be covered include:
       • Tier2 Connectivity Requirements
       • Service Level Definition
       • GGUS
       • Monitoring
       • Operations

