
Performance and Exception Monitoring workshop




  1. Performance and Exception Monitoring workshop

  2. "PEM" projects • Focus on • NGOP (Fermilab) http://www-isd.fnal.gov/ngop/ • PEM (CERN) http://proj-pem.cern.ch/proj-pem/ • History and timescales • Differences and similarities • Requirements • Architecture • Understanding the system • Areas of common interest • No implementation details yet...

  3. Common history • Autumn 1999/Winter 2000 • User requirements, tools survey • February • Meeting at CHEP • March • Workshop and meeting at Fermilab • Spring • HEPiX Braunschweig • NGOP/PEM meeting at CERN

  4. Project timescales • PEM • May: End of analysis phase • June: Design/prototyping • Summer: Implementation and deployment on selected clusters • Time constraint: increasing number of PCs • NGOP • May-August: Design • September-October: Testing • Winter 2001: Integration and release • Time constraint: Run II

  5. User requirements • Mostly the same • Different focus on operators • NGOP - and OPEP2 - include host management tools • PEM allows only automatic recovery actions; alarm severities are not built in • Scope • Both only partially include the implementation of monitoring agents • PEM does not include user interfaces, although it will provide some

  6. Architecture • NGOP • based on exception monitoring • performance monitoring added separately • PEM • based on performance monitoring • exception monitoring added on top • Both foresee • events/measurements that cannot be generated locally • metric hierarchies
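
The PEM layering on this slide (performance measurements first, exception monitoring on top) can be illustrated with a minimal sketch. This is not code from either project: the names `Measurement`, `ExceptionEvent`, `threshold_rule` and `detect_exceptions`, the `load_avg` metric and the threshold value are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional


@dataclass(frozen=True)
class Measurement:
    """One performance sample (hypothetical layout)."""
    host: str
    metric: str
    value: float


@dataclass(frozen=True)
class ExceptionEvent:
    """An exception derived from a measurement (hypothetical layout)."""
    host: str
    metric: str
    message: str


# A rule looks at one measurement and returns an event, or None if all is well.
Rule = Callable[[Measurement], Optional[ExceptionEvent]]


def threshold_rule(metric: str, limit: float) -> Rule:
    """Build a rule that fires when `metric` exceeds `limit`."""
    def check(m: Measurement) -> Optional[ExceptionEvent]:
        if m.metric == metric and m.value > limit:
            return ExceptionEvent(m.host, m.metric,
                                  f"{m.metric}={m.value} exceeds {limit}")
        return None
    return check


def detect_exceptions(measurements: Iterable[Measurement],
                      rules: Iterable[Rule]) -> List[ExceptionEvent]:
    """Exception layer: apply every rule to every performance sample."""
    rules = list(rules)
    events: List[ExceptionEvent] = []
    for m in measurements:
        for rule in rules:
            event = rule(m)
            if event is not None:
                events.append(event)
    return events


if __name__ == "__main__":
    samples = [Measurement("node01", "load_avg", 0.5),
               Measurement("node02", "load_avg", 9.0)]
    for event in detect_exceptions(samples, [threshold_rule("load_avg", 5.0)]):
        print(event.host, event.message)
```

In the NGOP ordering the roles would be reversed: exceptions are the primary data and performance values are collected separately.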

  7. Metric hierarchy • PEM glossary • Service • A predefined set of functionalities provided to users on a set of hosts • NGOP equivalent: cluster • Host • A piece of computing equipment with a network interface • Host type • The property of a set of hosts of having a common goal • Metric (simple or composite) • From GQM • NGOP equivalent: system/subsystem/component
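
The glossary terms above can be read as a small data model. The sketch below is only one possible reading, not the PEM schema: the class layout, the example names, and the choice to combine a composite metric by taking the largest child value are assumptions.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Metric:
    """Common base for simple and composite metrics."""
    name: str

    def evaluate(self) -> float:
        raise NotImplementedError


@dataclass
class SimpleMetric(Metric):
    """A directly measured value."""
    value: float = 0.0

    def evaluate(self) -> float:
        return self.value


@dataclass
class CompositeMetric(Metric):
    """A metric built from other metrics."""
    children: List[Metric] = field(default_factory=list)

    def evaluate(self) -> float:
        # Assumption: a composite reports its worst (largest) child value.
        return max((c.evaluate() for c in self.children), default=0.0)


@dataclass
class Host:
    """A piece of computing equipment with a network interface."""
    name: str
    host_type: str  # the common goal shared by a set of hosts
    metrics: List[Metric] = field(default_factory=list)


@dataclass
class Service:
    """Functionality provided to users on a set of hosts (NGOP: cluster)."""
    name: str
    hosts: List[Host] = field(default_factory=list)

    def hosts_of_type(self, host_type: str) -> List[Host]:
        return [h for h in self.hosts if h.host_type == host_type]


if __name__ == "__main__":
    health = CompositeMetric("host_health", [SimpleMetric("cpu_load", 0.2),
                                             SimpleMetric("disk_usage", 0.9)])
    batch = Service("batch", [Host("node01", "worker", [health])])
    print(batch.hosts_of_type("worker")[0].metrics[0].evaluate())
```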

  8. PEM Framework Diagram

  9. PEM sub-systems • Monitoring Client • Monitoring Server • Measurement DB • Configuration DB • Correlation Server • Alarm GUI • Notifier • Access Server • History Display • Report Generator

  10. NGOP sub-systems Sensor DBServer Monitoring Agent Central Server Component Name Server Looping Mon Agent Mon Agent Persis. Config Data Alarm GUI Notifier History Display Report Generator

  11. What to measure • What quantities have to be monitored? • Use GQM • Goal to be achieved • How to achieve it • What to measure in order to verify • Example • Goal: Provide information about machine configuration • Question: What is the hardware configuration? • Metric: Number, type and clock of CPU(s)
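
The Goal/Question/Metric example on this slide can be written down as a record, which makes it easy to derive the list of quantities an agent has to collect. This is a minimal sketch: `GQMEntry` and `metrics_to_collect` are hypothetical names, and the second entry is invented purely to show how shared metrics are merged.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class GQMEntry:
    goal: str                 # what is to be achieved
    question: str             # how to tell whether it is achieved
    metrics: Tuple[str, ...]  # what to measure in order to verify


def metrics_to_collect(entries: Iterable[GQMEntry]) -> List[str]:
    """Flatten all entries into one ordered list without duplicates."""
    collected: List[str] = []
    for entry in entries:
        for metric in entry.metrics:
            if metric not in collected:
                collected.append(metric)
    return collected


# The example from the slide.
machine_config = GQMEntry(
    goal="Provide information about machine configuration",
    question="What is the hardware configuration?",
    metrics=("number of CPUs", "type of CPUs", "clock of CPUs"),
)

# Hypothetical second entry (not from the slides), sharing one metric.
capacity = GQMEntry(
    goal="Plan cluster capacity",
    question="How much CPU is installed?",
    metrics=("number of CPUs",),
)

if __name__ == "__main__":
    for name in metrics_to_collect([machine_config, capacity]):
        print(name)
```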

  12. Discussion (1) • Areas of collaboration • GQM • Configuration • Correlation language and engine • Interface to repositories • Agents to collect information

  13. Discussion (2) • Open points • Do we need different glossaries? • Do we need multiple architectures? • Role of operators • Role of system openness • Contacts • tim.smith@cern.ch • alessandro.miotto@cern.ch
