
CrossGrid Task 3.3 Grid Monitoring


Presentation Transcript


  1. CrossGrid Task 3.3 Grid Monitoring
  Trinity College Dublin (TCD, AC14 – CR11): Brian Coghlan, Stuart Kenny, David O’Callaghan
  CYFRONET Academic Computer Center, Krakow (CYFRO, C01): Bartosz Balis, Slawomir Zielinski, Kazimierz Balos
  ICM, University of Warsaw (ICM, AC2 – C01): Krzysztof Nawrocki, Adam Padee

  2. CrossGrid Task 3.3 Grid Monitoring. Provides monitoring information from four main sources:
  • Applications (OCM-G) - gathers performance data from an executing application; used by application developers to understand an application’s behavior and improve its performance.
  • Infrastructure (JIMS) - gathers and exposes information on the state of the devices used to build a grid environment; notifies the user not only about simple events but about derived ones as well; takes managerial action in case of failures.
  • Instruments/Networks (SANTA-G) - allows information captured by external monitoring instruments to be introduced into the Grid information system; used in validation and calibration of both intrusive monitoring systems and systemic models, and also for performance analysis.

  3. CrossGrid Task 3.3 Grid Monitoring
  • Derived Results - gathers information from the other monitoring tools and presents it through one consistent user interface; generates forecasts of future grid state using Kalman filters and neural networks.

  4. Grid Monitoring System [Architecture diagram: infrastructure monitoring and application monitoring feeding R-GMA/OGSA information]

  5. Task 3.3.1 OCM-G, Current State • OCM-G integrated with GT. • Secure communication between components based on globus_io (authentication, possibly encryption). • Service Managers run on a "well-known" port (3331, configurable). • Configuration via local config files (user home dir or /opt/cg/etc) • No longer any need for a shared filesystem! • Still one central Service Manager • can handle multi-site applications unless firewalls block communication • Registration of application processes improved • Locks to get rid of a race condition while forking LMs • Support for user-defined events (probes) added. • CVS status: • code up to date. • building with autobuild, on RH6.2. • need to make changes to comply with the developers guide.
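The lock-based registration fix mentioned above can be sketched with an exclusive file lock; this is a minimal Python illustration of the idea (serialising registration so concurrently forked Local Monitors cannot race), with hypothetical file names, not the OCM-G code:

```python
import fcntl
import os
import tempfile

# Hypothetical registry file; OCM-G's actual registration mechanism differs.
REGISTRY = os.path.join(tempfile.gettempdir(), "ocmg-registry.txt")

def register_process(pid):
    # Hold an exclusive lock for the whole critical section, so two processes
    # forked at the same time cannot interleave their registry updates.
    with open(REGISTRY, "a") as fh:
        fcntl.flock(fh, fcntl.LOCK_EX)
        fh.write("%d\n" % pid)
        fh.flush()
        fcntl.flock(fh, fcntl.LOCK_UN)

open(REGISTRY, "w").close()          # start with an empty registry
register_process(os.getpid())
entries = open(REGISTRY).read().split()
print(entries)
```

With the lock in place, each registration appends exactly one complete record, regardless of how many processes fork at once.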

  6. Task 3.3.1 OCM-G, Task Contacts • Task 2.4 - G-PM fully integrated with OCM-G in its current functionality. • G-PM now needs a user certificate to connect to the OCM-G.

  7. Task 3.3.1 OCM-G, Integration • Smooth integration with G-PM. • Communication based on globus_io. • No dependencies on other Globus/EDG components.

  8. Task 3.3.1 OCM-G, Problems and Issues • Building under RH7.3 – problems with the globus_io development package. • Interface to Grid Benchmarks should be defined.

  9. Task 3.3.2 SANTA-G, Current State • Improve the schema of information available: - Done, still more to do • Add more SQL parsing support: - Done, added more WHERE predicates - Supports =, >, < queries • Add on-line data acquisition: - Sensor now starts/stops TCPdump at startup/shutdown - Allows querying of dynamically generated network traffic • Integrate Sensor and QueryEngine components: - Sensor now contacts the QueryEngine at startup, informs it when a new log file is generated, and informs the QE of shutdown • Enhance Viewer functionality - Improved Viewer GUI. - Graphical packet display, displays timestamps in correct format, automatically resolves IP addresses… - Query Builder added to allow the user to construct complex queries
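The supported query subset (WHERE predicates with =, > and <) can be illustrated with an in-memory SQLite stand-in; the packet table and its columns below are hypothetical, chosen only to mimic the kind of TCPdump-derived records SANTA-G exposes through its QueryEngine:

```python
import sqlite3

# Stand-in packet table; in SANTA-G the QueryEngine reads TCPdump log files,
# but SQLite lets us demonstrate the same SQL subset on comparable records.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE packet (id INTEGER, src TEXT, dst TEXT, length INTEGER)")
conn.executemany(
    "INSERT INTO packet VALUES (?, ?, ?, ?)",
    [(1, "10.0.0.1", "10.0.0.2", 60),
     (2, "10.0.0.2", "10.0.0.1", 1500),
     (3, "10.0.0.3", "10.0.0.2", 40)],
)

# A query using only the supported predicate forms (=, >, <):
rows = conn.execute(
    "SELECT id, length FROM packet WHERE length > 50"
).fetchall()
print(rows)  # -> [(1, 60), (2, 1500)]
```

A consumer issuing such SQL through R-GMA would get back the matching rows as a result set, without needing to know how the underlying log files are stored.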

  10. Task 3.3.2 SANTA-G, Task Contacts • EDG WP3 - SANTA-G makes use of the EDG R-GMA. - It has also contributed to it: the CanonicalProducer was an extension to the EDG R-GMA developed as part of Task 3.3.2. • Task 3.3.3 JIMS - integration with this task has begun - work should be completed by the end of the summer (see next slide).

  11. Task 3.3.2 SANTA-G, Integration [Architecture diagram: JMX Client, JMX Request, SANTA-G (R-GMA Producer), R-GMA SQL, R-GMA Consumer API, JMX ResultSet]

  12. Task 3.3.2 SANTA-G, Integration [Architecture diagram: R-GMA Producer Code, R-GMA Producer API, JIMS (MBean Server), JMX Request, R-GMA SQL, JMX ResultSet]

  13. Task 3.3.2 SANTA-G, Problems and Issues • Need the most recent EDG R-GMA RPMs - the CanonicalProducer is not in earlier releases! • R-GMA RPMs are Red Hat 7.3 only! • Still to do: - Expand the schema of available information - Improve SQL support - Complete SANTA-G/JMX integration - Testing - Investigate security

  14. Task 3.3.3 JIMS, Current State • JIRO-based Infrastructure Monitoring System – JIMS - ported from JDMK to the pure JMX reference implementation • Host monitoring module ready. • SNMP module in progress • SOAP Gateway for integration with other CG tasks • exposes a Web Services based interface • makes integration with OGSA (Open Grid Services Architecture) easier: • Web Services Gateway module • simple SOAP client for testing purposes

  15. JIMS, SOAP Gateway architecture

  16. JIMS, SOAP Gateway Facilities • The Web Services Gateway serves as a mediator between the MBean Servers on monitored stations and external applications • Provides a place for registering active monitored stations and removing ones that no longer exist
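The registration facility described above can be sketched as a small registry keyed by station name; the class, method, and station names below are hypothetical and only illustrate the register/unregister behaviour, not the JIMS gateway implementation:

```python
# Minimal sketch of the gateway's station registry: monitored stations
# register their monitoring endpoint on startup and are removed again when
# they disappear. All names/URLs here are illustrative.
class StationRegistry:
    def __init__(self):
        self._stations = {}   # station name -> endpoint URL

    def register(self, name, endpoint):
        self._stations[name] = endpoint

    def unregister(self, name):
        # Removing an unknown (already gone) station is a no-op.
        self._stations.pop(name, None)

    def endpoints(self):
        return dict(self._stations)

registry = StationRegistry()
registry.register("wn01", "http://wn01.example.org:8080/jims")
registry.register("wn02", "http://wn02.example.org:8080/jims")
registry.unregister("wn02")
print(sorted(registry.endpoints()))  # -> ['wn01']
```

An external application would then consult the registry for the list of currently active stations instead of probing every node itself.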

  17. JIMS, Test SOAP Client Interface

  18. Task 3.3.3 JIMS, Problems and Issues What is done: • Host monitoring system - JIMS - ready • SOAP Gateway - before deadline • Open (non-commercial) implementation of discovery services - before deadline To do: • integration with CVS and the autobuild process, by the end of this week • Simplifying the installation process • Adding functionality: • other mechanisms for unregistering monitored stations • security when connecting modules via Web Services (SOAP/XML)

  19. Task 3.3.4 PostProcessing, Current State • Forecaster based on a linear Kalman filter implemented and available as an RPM. • More work needed to put it in CVS; this will be done during the meeting. • The current solution for real monitoring data from clusters is VO-Centric Ganglia
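The forecasting idea can be sketched with a one-dimensional linear Kalman filter smoothing a noisy scalar metric (e.g. CPU load); the noise parameters q and r are assumed values, and this toy sketch is an illustration of the technique, not the Task 3.3.4 code:

```python
# One-dimensional linear Kalman filter with a constant-state model:
# the state estimate x tracks the metric, p is its variance.
def kalman_forecast(measurements, q=1e-3, r=0.1):
    x, p = measurements[0], 1.0        # initial estimate and variance
    for z in measurements[1:]:
        p = p + q                      # predict step (process noise q)
        k = p / (p + r)                # Kalman gain (measurement noise r)
        x = x + k * (z - x)            # update with the new measurement
        p = (1 - k) * p
    # Under the constant-state model the one-step-ahead forecast is the
    # last filtered estimate.
    return x

loads = [0.50, 0.52, 0.48, 0.51, 0.95, 0.50, 0.49]
forecast = kalman_forecast(loads)
print(round(forecast, 2))
```

Because the filter weights new measurements by the gain k, a single load spike moves the forecast only partway towards the outlier rather than dominating it.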

  20. Task 3.3.4 PostProcessing, Integration • For the integration meeting we will provide 2 RPMs: • ganglia-monitor-core-mcastmin-2.4.1-1.i386.rpm • serves as the monitoring daemon on worker nodes • gmmetad-2.2-1.i386.rpm • located on the cluster CE; gathers information from the monitoring daemons and passes it to the central monitoring host • RPM slightly altered with respect to the original • Would like to install these on X# clusters for testing during the integration meeting.

  21. Task 3.3.4 PostProcessing, Problems and Issues • The third part, which binds the forecaster and the data sources, is under development • Not ready for the integration meeting.
