January 31 2010, perfSONAR-PS Developers Meeting Jason Zurawski, Internet2 Brian Tierney, ESnet Nagios Integration
Outline • Idea • Configuration • Static configuration (i.e. the easy stuff) • Dynamic configuration (i.e. the hard stuff) • GUIs • Monitoring other instances • Visualizing the data on the toolkit
Idea • Nagios will be used to monitor the health of the toolkit • Ensuring processes are running (or not running) • Ensuring data is meeting thresholds • Alerting in the event of a problem • Visualizing stability over time • Do not want to recommend development on 3.1 versions • See suggested dev freeze in ‘LiveCD’ topic • Would be included on 3.2
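The "data is meeting thresholds" idea maps onto the standard Nagios plugin convention, where a check reports its state through an exit code (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN) plus one line of status text. A minimal sketch follows; the function names `classify` and `plugin_line` and the "higher is worse" rule are illustrative assumptions, not part of the toolkit.

```python
# Standard Nagios plugin return codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
STATUS_NAMES = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def classify(value, warn, crit):
    """Map a measurement to a Nagios status, assuming higher values are worse."""
    if value is None:
        # No measurement available: report UNKNOWN rather than guessing.
        return UNKNOWN
    if value >= crit:
        return CRITICAL
    if value >= warn:
        return WARNING
    return OK


def plugin_line(label, value, warn, crit):
    """Return (exit_code, status_text) in the usual one-line plugin style."""
    code = classify(value, warn, crit)
    return code, f"{label} {STATUS_NAMES[code]} - value={value}"
```

A wrapper script would print the text and exit with the code, which is all Nagios needs in order to alert and to chart stability over time.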
Configuration • Custom configuration will be needed • Each toolkit will choose different setup options, so each has a different set of things to monitor • Where to send email • Which processes to monitor • Which data sets to monitor • Different definitions for key thresholds (time based, data based) • Will break this into two types: • ‘static’ is the same for all instances • ‘dynamic’ depends on the environment
Configuration - Static • Process Monitoring • Httpd • Ntpd (running + synced) • Service watcher process (re-starting pS daemons) • Host Monitoring • Disk (too full) • Load (too high) • Process count (too high) • General configuration • Need to know an SMTP host to relay through • Send the email to someone (admin GUI will record this) • Send certain notifications to pS?
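The static host checks (disk too full, load too high) could look roughly like the sketch below, which uses only the Python standard library. The thresholds and the helper names (`disk_percent_used`, `status_for`, `check_host`) are assumptions for illustration; in practice the stock Nagios plugins for disk and load would likely be used instead. `os.getloadavg` is available on Unix-like systems only.

```python
import os
import shutil

OK, WARNING, CRITICAL = 0, 1, 2  # Nagios plugin return codes


def disk_percent_used(path="/"):
    """Percentage of the filesystem holding `path` that is in use."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        return 0.0
    return 100.0 * usage.used / usage.total


def load_one_minute():
    """One-minute load average (Unix only)."""
    return os.getloadavg()[0]


def status_for(value, warn, crit):
    """Higher is worse: compare a value against warning/critical thresholds."""
    if value >= crit:
        return CRITICAL
    if value >= warn:
        return WARNING
    return OK


def check_host(disk_warn=80.0, disk_crit=90.0, load_warn=4.0, load_crit=8.0):
    """Run both checks; the overall status is the worst individual status."""
    results = {
        "disk": status_for(disk_percent_used(), disk_warn, disk_crit),
        "load": status_for(load_one_minute(), load_warn, load_crit),
    }
    return max(results.values()), results
```

Because these checks are the same on every instance, the thresholds could ship as fixed defaults with the toolkit.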
Configuration - Dynamic • Process Monitoring • ssh (if enabled) • Measurement daemons (owamp/bwctl/ndt/npad/pSB master and collector) – as enabled • pS Daemons (LS, SNMP, pSB, PingER) – as enabled • Process running + responding to WS requests • mysqld (if running) • Data sets • Can check in one of two ways: • Through the WS • Direct DB Query • Could also do both
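One way to handle the "as enabled" part is to generate Nagios service definitions from whatever the toolkit reports as enabled. The sketch below is hypothetical: the `enabled` mapping, the port table (conventional defaults for ssh, owamp, bwctl and MySQL), and the use of the `generic-service` template and `check_tcp` command from the Nagios sample configuration are all assumptions. A TCP check only confirms that a port is listening; confirming that a pS daemon answers WS requests would need a separate check.

```python
# Hypothetical service-to-port table; these are assumed default ports.
DEFAULT_PORTS = {"ssh": 22, "owamp": 861, "bwctl": 4823, "mysqld": 3306}

# Nagios object definition; braces are doubled so str.format leaves them alone.
SERVICE_TEMPLATE = """define service {{
    use                 generic-service
    host_name           {host}
    service_description {name}
    check_command       check_tcp!{port}
}}
"""


def render_services(enabled, host="localhost", ports=DEFAULT_PORTS):
    """Emit one service block per enabled service that has a known port."""
    blocks = []
    for name in sorted(enabled):
        if enabled[name] and name in ports:
            blocks.append(
                SERVICE_TEMPLATE.format(host=host, name=name, port=ports[name])
            )
    return "\n".join(blocks)
```

For example, `render_services({"owamp": True, "bwctl": False, "ssh": True})` yields two service blocks (owamp and ssh) and skips bwctl, so the generated file tracks the choices made on each toolkit instance.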
Configuration - Dynamic • Data Sets (cont.) • Data above or below a threshold • Errors on an interface • Utilization too high • BWCTL expectation too low • OWAMP loss/jitter too high • Data older than a time period • Will need to see regular testing config for this (e.g. older than the data expectation interval) • Data flapping between states • OWAMP/PingER latency • Interface status
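The "data older than a time period" and "flapping between states" checks can be sketched as below. The `slack` multiplier, the 50% flap threshold, and the function names are illustrative assumptions; the flap measure is a simple unweighted percentage of state changes, not Nagios's own weighted flap-detection algorithm.

```python
import time


def is_stale(last_timestamp, expected_interval, now=None, slack=2.0):
    """True if the newest data point is older than slack * expected_interval.

    `expected_interval` would come from the regular testing configuration;
    timestamps and intervals are in seconds.
    """
    now = time.time() if now is None else now
    return (now - last_timestamp) > slack * expected_interval


def percent_state_change(states):
    """Percentage of consecutive pairs in `states` whose state differs."""
    if len(states) < 2:
        return 0.0
    changes = sum(1 for a, b in zip(states, states[1:]) if a != b)
    return 100.0 * changes / (len(states) - 1)


def is_flapping(states, high_threshold=50.0):
    """Flag a check whose recent states change more often than the threshold."""
    return percent_state_change(states) >= high_threshold
```

Here `states` would be the recent history of one check, for example the OK/CRITICAL results of an OWAMP latency threshold or the up/down status of an interface.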
Configuration - Dynamic • Host Monitoring • Monitor the health of related machines? • Custom alerts • RAM • Disk • Processing • General configuration • How often to alert?
GUIs • Use existing Nagios GUIs to show a mesh of deployments (or the entire pSPT world) • Other GUIs out there • http://www.debianhelp.co.uk/nagiosweb.htm
For more information, visit http://code.google.com/p/perfsonar-ps/wiki/20100131Meet