
Monitoring Grid Services

Yin Chen (s0231189@sms.ed.ac.uk), June 2003



  1. Monitoring Grid Services Yin Chen s0231189@sms.ed.ac.uk June 2003

  2. Contents • Issues of Monitoring • Project Proposal

  3. Issues of Monitoring • What are the goals of Grid monitoring? • What are the characteristics of a Grid system? • What may need to be monitored? • What are the characteristics of monitoring data? • Related Work

  4. What are the goals of Grid monitoring? • Propagate errors to users/management • Performance monitoring, to tune the application and use the Grid more efficiently • The question is not how to measure resources, but how to deliver information to end-users and system/Grid

  5. What are the characteristics of a Grid system? • Complex distributed system => unexpectedly low performance is often observed • Where is the bottleneck? - application - operating system - disks - network adapters on either the sending or the receiving host - network switches, routers • Experience of the NetLogger group: - 40% network, 40% application, 20% host problems - application: 50% client, 50% server process problems

  6. What are the characteristics of a Grid system? (cont.) • Dynamic environment • World-wide distributed environment with - high latency - frequent faults - very heterogeneous resources

  7. What may need to be monitored? • Disk space, processor speed, network bandwidth, CPU load, memory load, network load, network communication time, number of parallel streams, stripes, TCP/IP buffer size, and disk access time, which includes the time to copy data to or from the local hard disk on the server [2][3] • Some of this information is relatively static, while the rest is dynamic run-time information.

  8. What are the characteristics of monitoring data? • Run-time monitoring data goes "old" quickly • The producer should be near the monitored entities • Data must be transported rapidly and efficiently from producer to consumer • Information should be explicit, e.g. carry timestamps • Updates are frequent • Performance information is often stochastic
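
The points on slide 8 about data going "old" quickly and carrying timestamps can be illustrated with a minimal sketch. It is not taken from the presentation: the `Measurement` record, the `is_fresh` helper, the resource name and the 30-second validity window are all assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Measurement:
    """One monitoring sample, stamped by the producer at measurement time."""
    resource: str      # hypothetical resource name, e.g. "node01.cpu_load"
    value: float
    timestamp: float   # seconds since the epoch, set by the producer


def is_fresh(sample: Measurement, now: float, max_age: float) -> bool:
    """Return True if the sample is at most max_age seconds old at time `now`."""
    return (now - sample.timestamp) <= max_age


# A CPU-load sample taken at t = 1000 s, checked against a 30 s validity window.
sample = Measurement("node01.cpu_load", 0.75, timestamp=1000.0)
fresh_at_1010 = is_fresh(sample, now=1010.0, max_age=30.0)   # sample is 10 s old
fresh_at_1100 = is_fresh(sample, now=1100.0, max_age=30.0)   # sample is 100 s old
```

Because the producer attaches the timestamp, a consumer can judge staleness for itself, however long the data spent in transit.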

  9. Related Work • Monitoring and Discovery Service (MDS) • Grid Monitoring Architecture (GMA) • Relational Grid Monitoring Architecture (R-GMA) • Hawkeye • Globus Heartbeat Monitor (HBM) • Network Weather Service (NWS) • GridRM

  10. MDS Architecture

  11. GMA Architecture

  12. R-GMA Architecture

  13. Hawkeye Architecture

  14. HBM Architecture

  15. NWS Architecture

  16. The Global Layer of GridRM

  17. The Local GridRM Layer

  18. Summary and Conclusion • A variety of monitoring systems exist • Each system has its own strengths and weaknesses • They tend to use standard and open components • GMA is the architecture advocated by the GGF

  19. Summary and Conclusion (cont.) • Similarities in architecture: • At the lowest level, a sensor or other program generates a piece of data • Some systems allow data to be aggregated from a set of resources • At the resource level, the data from several information collectors is gathered into one component • A directory component • A decentralised hierarchical structure, which gives better fault tolerance • Differences: use of a push or a pull mechanism
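
The layering common to these systems (sensors at the bottom, a resource-level collector above them, and a directory for lookup) can be sketched roughly as follows. This is an illustration only and does not reproduce the API of MDS, GMA, R-GMA or any other system listed; the `Collector` and `Directory` classes, the sensor names and the values are hypothetical.

```python
from typing import Callable, Dict

# A sensor is modelled as any zero-argument function returning one reading.
Sensor = Callable[[], float]


class Collector:
    """Resource-level component that gathers data from several sensors."""

    def __init__(self, sensors: Dict[str, Sensor]) -> None:
        self.sensors = sensors

    def gather(self) -> Dict[str, float]:
        # Read every sensor once and return the readings keyed by sensor name.
        return {name: read() for name, read in self.sensors.items()}


class Directory:
    """Directory component mapping a resource name to its collector."""

    def __init__(self) -> None:
        self._collectors: Dict[str, Collector] = {}

    def register(self, resource: str, collector: Collector) -> None:
        self._collectors[resource] = collector

    def lookup(self, resource: str) -> Collector:
        return self._collectors[resource]


# Two stub sensors with fixed values stand in for real probes.
collector = Collector({"cpu_load": lambda: 0.5, "free_disk_gb": lambda: 120.0})
directory = Directory()
directory.register("node01", collector)
snapshot = directory.lookup("node01").gather()
```

A consumer only needs to know the directory; which sensors exist on a resource stays a detail of that resource's collector.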

  20. Project Proposal • Goal • Requirement • Architecture -- Pull Model • Specification • Implementation • Testing • Schedule

  21. Goal • Realisation • Lightweight & Simple design • Reliability & Robustness

  22. Architecture • What is the Pull model? • The monitor sends requests to the service for information. This implies repeated queries of resource attributes over some time period at a specific frequency • In the Push model, on the other hand, the service sends out notifications to a subscribed sink.
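
The difference between the two models can be shown with a small simulation. It is a sketch under stated assumptions, not the proposed implementation: the `Service` class, its `query` and `subscribe` methods, and the tick-based clock are hypothetical, and time is simulated rather than slept so the result is deterministic.

```python
from typing import Callable, List, Tuple


class Service:
    """Hypothetical monitored service exposing a single attribute, `load`."""

    def __init__(self) -> None:
        self.load = 0.0
        self._sinks: List[Callable[[float], None]] = []

    def query(self) -> float:
        # Pull side: the monitor asks and the service answers.
        return self.load

    def subscribe(self, sink: Callable[[float], None]) -> None:
        # Push side: a sink subscribes once and is notified on every update.
        self._sinks.append(sink)

    def set_load(self, value: float) -> None:
        self.load = value
        for sink in self._sinks:
            sink(value)


def run(load_per_tick: List[float], poll_every: int) -> Tuple[List[float], List[float]]:
    """Replay one load value per tick and record what each model observes."""
    service = Service()
    pulled: List[float] = []
    pushed: List[float] = []
    service.subscribe(pushed.append)
    for tick, load in enumerate(load_per_tick):
        service.set_load(load)              # push: one notification per update
        if tick % poll_every == 0:          # pull: one query every poll_every ticks
            pulled.append(service.query())
    return pulled, pushed


pulled, pushed = run([0.1, 0.9, 0.2, 0.4, 0.8, 0.3], poll_every=2)
```

With these inputs the pull monitor records three samples, one per polling period, while the push sink receives all six updates. The monitor controls the data rate in the pull case, at the cost of missing changes between polls.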

  23. Benefits of Pull • Less network traffic: collections are initiated only from the top • No time synchronisation problem: data is collected from all resources at the same time • The server can determine the size of the file, select the appropriate alternate server, and passively control the bandwidth and storage space • According to Globus, the "push" model "generates a large amount of data and results in constant updates to the MDS." • Standard LDAP databases are not designed to handle frequent updates.

  24. Benefits of Pull (cont.) • The Pull model is based on distributing intelligence to the asset site, so it becomes automated. • Using machine-to-machine communications with connected sensors and autonomic computing, the asset performs self-diagnostics, maintains and repairs itself, re-routes energy flows, schedules non-routine maintenance, and reports any out-of-the-ordinary activity that poses a security threat. • IBM calls this autonomic computing: machine-to-machine communications take place to optimise the performance of computing and network resources.

  25. Problems of Pull • Must gather current measurements from all resources. • If the data volume is large, real-time collection may cause a bottleneck. • May not be useful for fault detection: heartbeat events are valid only for a short time interval and must be delivered within that time constraint. • May not be useful for dynamic sensor management. • The push model is the most efficient in terms of bandwidth, as requests are not sent, just responses from the service.
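
The fault-detection point rests on heartbeats being valid only for a short interval. A minimal sketch of timeout-based detection over pushed heartbeats follows. It is an illustration only and does not reproduce the Globus HBM interface; the `HeartbeatMonitor` class, the node names and the 5-second timeout are assumptions.

```python
from typing import Dict, List


class HeartbeatMonitor:
    """Tracks the last heartbeat per source and flags sources that go silent."""

    def __init__(self, timeout: float) -> None:
        self.timeout = timeout                  # validity interval in seconds
        self._last_seen: Dict[str, float] = {}

    def beat(self, source: str, now: float) -> None:
        # Called whenever a heartbeat from `source` arrives at time `now`.
        self._last_seen[source] = now

    def suspected_failed(self, now: float) -> List[str]:
        # A source is suspected once its latest heartbeat is older than the timeout.
        return sorted(
            source
            for source, seen in self._last_seen.items()
            if now - seen > self.timeout
        )


monitor = HeartbeatMonitor(timeout=5.0)
monitor.beat("node-a", now=0.0)
monitor.beat("node-b", now=0.0)
monitor.beat("node-a", now=4.0)                       # node-b stays silent
failed_at_6 = monitor.suspected_failed(now=6.0)       # node-b has been silent for 6 s
failed_at_10 = monitor.suspected_failed(now=10.0)     # node-a has been silent for 6 s too
```

A pull monitor whose polling period is longer than the timeout could not make this distinction in time, which is the concern the slide raises.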

  26. Monitoring Grid Services Thanks
