Monitoring Appliance Status Otto Kreiter, DANTE LHCOPN, Geneva, 17.06.2008
Agenda • Monitoring parameters & monitoring scheduling (proposal) • Deployment process for the LHCOPN • perfSONAR MDM v3.0 & E2EMON features
Monitoring parameters & Monitoring scheduling (proposal)
Typical T0-T1 measurement appliance installation (reminder) [Diagram labels: LHCOPN; Box-1, Box-2, Box-3. Components shown: HADES Agent (delay); Pinger/Tracert-MP; Cacti, RRD-MA (count); SQL MA (L2 stat); E2ESync; Lookup service; BWCTL tool – MP (TCP); perfSONARBuoy; Auth. Service]
Metrics collected & scheduling • Delay, scheduled measurements: One-Way Delay (OWD), IP Packet Delay Variation (IPDV), One-Way Packet Loss (OWPL); every 60 s • Achievable bandwidth, scheduled measurements: TCP throughput transfer (max. 950 Mbit/s); every 6 h • Achievable bandwidth, on-demand measurements: TCP and UDP throughput transfer (max. 950 Mbit/s) • Traceroute, scheduled measurements: hop list; every 5 min
Metrics collected & scheduling cont. • Router interface statistics: link capacity, link utilisation, interface input errors, interface output drops; every 5 min • L2 circuit status: domain and/or inter-domain circuit status (UP/DOWN); every 5 min
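As a rough illustration of the load this schedule implies, the sketch below converts each interval from the two slides above into a number of scheduled runs per day. The intervals come from the slides; the dictionary layout and the function names are illustrative assumptions, not part of any perfSONAR configuration format.

```python
# Intervals (in seconds) as listed on the slides.
# The structure and names here are hypothetical, for illustration only.
SCHEDULE_SECONDS = {
    "delay (OWD, IPDV, OWPL)": 60,
    "achievable bandwidth (TCP)": 6 * 3600,
    "traceroute (hop list)": 5 * 60,
    "router interface statistics": 5 * 60,
    "L2 circuit status": 5 * 60,
}

SECONDS_PER_DAY = 24 * 3600


def runs_per_day(interval_seconds):
    """Number of scheduled runs that fit into one day."""
    return SECONDS_PER_DAY // interval_seconds


def daily_counts(schedule):
    """Map each metric name to its number of scheduled runs per day."""
    return {name: runs_per_day(interval) for name, interval in schedule.items()}


if __name__ == "__main__":
    for name, count in daily_counts(SCHEDULE_SECONDS).items():
        print(f"{name}: {count} runs/day")
```

With these intervals, a 60 s schedule gives 1440 runs per day, a 5 min schedule gives 288, and a 6 h schedule gives 4.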
Delay - scheduled measurements Enable • to identify routing issues (OWD) • to identify congestion, with or without packet loss (OWD, IPDV, OWPL) • to identify a high packet loss rate (OWPL) • to identify path instability (IPDV) • to keep a historical trace of changes at a fine granularity (60 s) • to validate the recovery of the service after failure • to verify consistency before and after maintenance
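To make the three delay metrics concrete, here is a minimal sketch of how OWD, IPDV and OWPL could be derived from per-packet send and receive timestamps. It assumes synchronised clocks at both ends, timestamps in integer microseconds, and `None` for a lost packet; the function names are hypothetical and this is not the HADES implementation. IPDV is taken here as the delay difference between consecutive packets that both arrived.

```python
def one_way_delays(sent_us, received_us):
    """OWD per packet in microseconds; None where the packet was lost."""
    return [
        None if rx is None else rx - tx
        for tx, rx in zip(sent_us, received_us)
    ]


def ipdv(delays_us):
    """Delay variation between consecutive packets that both arrived."""
    return [
        later - earlier
        for earlier, later in zip(delays_us, delays_us[1:])
        if earlier is not None and later is not None
    ]


def packet_loss_ratio(delays_us):
    """OWPL as the fraction of packets that never arrived."""
    if not delays_us:
        raise ValueError("no packets in sample")
    lost = sum(1 for d in delays_us if d is None)
    return lost / len(delays_us)


if __name__ == "__main__":
    # Hypothetical sample: four packets sent one second apart, third one lost.
    sent = [0, 1_000_000, 2_000_000, 3_000_000]
    received = [10_000, 1_012_000, None, 3_011_000]
    delays = one_way_delays(sent, received)
    print(delays, ipdv(delays), packet_loss_ratio(delays))
```

In this sample the delays are 10 ms, 12 ms, lost, 11 ms; only the first pair yields an IPDV value (2 ms), and the loss ratio is 0.25.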
Achievable Bandwidth - scheduled measurements Enable • to identify performance degradation • to compare the data transfer rate against a historical baseline from a well-tuned host • to validate the recovery of the service after failure
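One simple way to compare a new throughput result against a historical baseline is sketched below. The choice of the median as baseline, the 20 % tolerance and the function name are assumptions made for illustration; the slides do not specify how degradation is detected.

```python
from statistics import median


def is_degraded(history_mbps, latest_mbps, tolerance=0.2):
    """Flag a result that falls more than `tolerance` below the median of past results.

    Raises statistics.StatisticsError if `history_mbps` is empty.
    """
    baseline = median(history_mbps)
    return latest_mbps < baseline * (1 - tolerance)


if __name__ == "__main__":
    # Hypothetical results of earlier 6-hourly TCP tests, in Mbit/s.
    history = [900, 920, 910, 905]
    print(is_degraded(history, 700))  # well below the baseline
    print(is_degraded(history, 880))  # within tolerance
```

The median is used rather than the mean so that a single earlier outage does not drag the baseline down.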
Achievable Bandwidth - on-demand measurements Enable • to troubleshoot TCP throughput performance by running tests from a well-tuned host and comparing them against historical information • to validate the recovery of the service after failure • to verify consistency of performance before and after maintenance
Traceroute - scheduled measurements Enable • to identify IP path stability over time • to identify routing issues • to validate the path recovery of the service after failure • to verify the path consistency before and after maintenance
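Path stability can be checked by comparing each scheduled hop list with the previous one. The sketch below assumes a hop list is stored as a list of address strings; the function name is hypothetical and the addresses are documentation-range placeholders, not real LHCOPN hops.

```python
def path_changes(hop_lists):
    """Return the indices of samples whose hop list differs from the previous sample."""
    return [
        i
        for i in range(1, len(hop_lists))
        if hop_lists[i] != hop_lists[i - 1]
    ]


if __name__ == "__main__":
    # Hypothetical 5-minute samples; the third one takes a different middle hop.
    samples = [
        ["192.0.2.1", "192.0.2.2", "192.0.2.3"],
        ["192.0.2.1", "192.0.2.2", "192.0.2.3"],
        ["192.0.2.1", "192.0.2.9", "192.0.2.3"],
        ["192.0.2.1", "192.0.2.2", "192.0.2.3"],
    ]
    print(path_changes(samples))
```

For these samples the function reports indices 2 and 3: the path moves away at sample 2 and returns to the original route at sample 3, which is the kind of trace that helps confirm recovery after a failure or maintenance window.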
Router Interface statistics Enable • to identify traffic load for troubleshooting (congestion, heavy utilisation), long term trend and planning • to estimate the available bandwidth • to identify short term and long term congestion with losses (output drops) • to identify faulty links (input errors) • to verify traffic recovery after failure • to verify consistency before and after maintenance
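Link utilisation and a first estimate of available bandwidth can be derived from two readings of an interface octet counter. The sketch below assumes SNMP-style octet counters that wrap at 2^32 or 2^64 and at most one wrap between readings; the function names and the 1 Gbit/s capacity in the example are illustrative assumptions, not values from the slides.

```python
def utilisation(octets_start, octets_end, interval_s, capacity_bps, counter_bits=64):
    """Fraction of link capacity used between two octet-counter readings.

    The modulo handles a single counter wrap between the two readings.
    """
    if interval_s <= 0 or capacity_bps <= 0:
        raise ValueError("interval and capacity must be positive")
    delta_octets = (octets_end - octets_start) % (2 ** counter_bits)
    bits_per_second = delta_octets * 8 / interval_s
    return bits_per_second / capacity_bps


def available_bandwidth_bps(capacity_bps, used_fraction):
    """Estimate of unused capacity in bit/s."""
    return capacity_bps * (1 - used_fraction)


if __name__ == "__main__":
    # Hypothetical 5-minute sample on a 1 Gbit/s interface.
    capacity = 1_000_000_000
    used = utilisation(0, 18_750_000_000, 300, capacity)
    print(used, available_bandwidth_bps(capacity, used))
```

In this example 18.75 GB transferred in 300 s corresponds to 500 Mbit/s, so utilisation is 0.5 and roughly 500 Mbit/s remains available, averaged over the interval.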
L2 circuit status Enable • to identify the status of a circuit segment in a given domain.
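Per-segment status can be combined into a view of a whole circuit. The sketch below assumes a simple rule (the circuit is UP only if every domain and inter-domain segment is UP) and hypothetical segment names; the slides do not state how E2EMON aggregates segment states, so treat this as an illustration only.

```python
def circuit_status(segment_status):
    """Return 'UP' only if every segment reports 'UP', otherwise 'DOWN'."""
    if not segment_status:
        raise ValueError("no segments reported")
    all_up = all(state == "UP" for state in segment_status.values())
    return "UP" if all_up else "DOWN"


def down_segments(segment_status):
    """Names of the segments that are not UP, in sorted order."""
    return sorted(name for name, state in segment_status.items() if state != "UP")


if __name__ == "__main__":
    # Hypothetical segments of one T0-T1 circuit.
    segments = {
        "domain-A": "UP",
        "domain-A/domain-B": "DOWN",
        "domain-B": "UP",
    }
    print(circuit_status(segments), down_segments(segments))
```

Listing the segments that are not UP points directly at the domain to contact, which is the purpose the slide describes.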
Conclusion • Parameters and scheduling are well established but not yet finalised. • Scheduling can be customised to customer needs. • Next step: demonstrate the UI at the next LHCOPN meeting.
Deployment process for the LHCOPN
perfSONAR MDM LHCOPN Site Deployment Site Deployment Steps: • Site survey • Hardware purchase • Shipment of the hardware to site • Hardware installation • Software installation • Service configuration
perfSONAR MDM LHCOPN deployment steps • Deployment planned in phases • Each phase involves 4 sites • Each site will be addressed individually and have a dedicated Service Desk person • Successful deployment depends upon site collaboration
perfSONAR MDM 3.0 • Significant improvements to software installation: RPM and Debian packages • New 'Web Admin' interface: web-based interface for easier configuration, administration and support of the software • Authentication and Directory services (Lookup): validation of identities specified by eduGAIN; identities from various Identity Providers • The bundle contains 10 different web services. http://wiki.perfsonar.net/jra1-wiki/index.php/PerfSONAR_v3.0
E2EMON enhancement • SNMP traps for Nagios sensors, per project and per link, sent to selected recipients (requested by PIC and IN2P3) • Feature available mid-July.