
General Grid Monitoring Infrastructure (GGMI)

Peter Kacsuk and Norbert Podhorszki, MTA SZTAKI


Presentation Transcript


  1. General Grid Monitoring Infrastructure (GGMI). Peter Kacsuk and Norbert Podhorszki, MTA SZTAKI

  2. General Grid Monitoring Infrastructure (GGMI)
     [Diagram: the tools PULSE, PROVE, GRM and R-GMA Browser; GGMI comprises the Grid Status Monitoring Infrastructure, GSMI (R-GMA), and the Grid Application Monitoring Infrastructure, GAMI]

  3. Performance comparison of GSMI and GAMI
     • Performance measurement
     • Test program: a loop instrumented with GRM
     • A loop of N iterations generates 2N+2 events (loop begin and loop end for each iteration, plus start and exit)
     • 1 machine
     • 3 machines
       • M1: Producer and ProducerServlet
       • M2: All the other servlets, including ConsumerServlet
       • M3: Consumer
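The event count on this slide can be checked with a small sketch. The function and event names below are illustrative assumptions, not part of the GRM instrumentation library; only the 2N+2 formula comes from the slide.

```python
def expected_event_count(n_iterations: int) -> int:
    """Events produced by the instrumented test loop: 2N + 2."""
    return 2 * n_iterations + 2


def instrumented_loop(n_iterations: int):
    """Yield hypothetical event names in the order the test loop would emit them."""
    yield "start"
    for _ in range(n_iterations):
        yield "loop_begin"
        yield "loop_end"
    yield "exit"


events = list(instrumented_loop(1000))
print(len(events), expected_event_count(1000))  # prints "2002 2002"
```

For N=1000 this gives the 2002 events that the GSMI measurements refer to.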

  4. Performance of GSMI
     • AppTotal = time between inserting the first and the last event
     • Total = time between inserting the first event and receiving the last event in the Consumer
     • On 3 machines with N=1000, R-GMA starts losing events: only 1848 and 1780 of the 2002 events were received
     • For N=10K, the test never finishes (it ran for at least one night)
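The two timing metrics and the event loss can be written down as a minimal sketch. The `RunTimes` record and its field names are assumptions made for illustration, and the slides give no timestamps, so only the received-event counts below come from the source.

```python
from dataclasses import dataclass


@dataclass
class RunTimes:
    """Timestamps (seconds) of one test run; field names are hypothetical."""
    first_insert: float   # producer inserts the first event
    last_insert: float    # producer inserts the last event
    last_receive: float   # consumer receives the last event


def app_total(run: RunTimes) -> float:
    """AppTotal: time between inserting the first and the last event."""
    return run.last_insert - run.first_insert


def total(run: RunTimes) -> float:
    """Total: time from the first insert to the last receive in the consumer."""
    return run.last_receive - run.first_insert


def loss_fraction(sent: int, received: int) -> float:
    """Fraction of the sent events that never reached the consumer."""
    return (sent - received) / sent


sent = 2 * 1000 + 2  # N=1000 gives 2002 events
for received in (1848, 1780):
    print(f"{received}/{sent} received, {loss_fraction(sent, received):.1%} lost")
```

With the counts from the slide, the two N=1000 results correspond to roughly 7.7% and 11.1% of the events being lost.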

  5. Performance of GAMI
     • AppTotal = time between inserting the first and the last event
     • Total = time between inserting the first event and receiving the last event in the Consumer
     • No events lost
     • Linear scaling

  6. R-GMA vs GAMI on 3 machines
     [Charts: R-GMA; GAMI]
     GAMI: 1 machine vs 3 machines
     [Charts annotated "10.000 !!!" and "100.000 !!!"]

  7. GAMI structure
     [Diagram: PROVE runs on the user's local host. Site 1 and Site 2 each run a Main Monitor (MM). Each host at a site (labelled Host 1, Host 2 and Host 1) runs a Local Monitor (LM), which serves the application processes on that host.]

  8. GAMI
     • Purpose: to deliver trace data from the application to the user efficiently
     • Uses TCP socket communication
     • Data is in XDR format and could be optimised for TCP transmission
     • Two software hops between the application and GRM: the local and main monitors
     • One hardware hop: the host of the main monitor
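As a rough illustration of sending XDR-encoded trace data over a stream socket, the sketch below hand-encodes one event using the standard XDR rules (big-endian fields, strings length-prefixed and zero-padded to a multiple of 4 bytes). The record layout (process id, timestamp, event name) is an assumption; the slides do not describe GRM's actual wire format. `socket.socketpair()` stands in for the TCP connection between an application and its local monitor.

```python
import socket
import struct


def xdr_string(text: str) -> bytes:
    """XDR string: 4-byte length, the bytes, zero padding to a multiple of 4."""
    data = text.encode("utf-8")
    padding = (4 - len(data) % 4) % 4
    return struct.pack(">I", len(data)) + data + b"\x00" * padding


def encode_event(pid: int, timestamp_us: int, name: str) -> bytes:
    """Hypothetical trace record: unsigned int, hyper (64-bit), string."""
    return struct.pack(">Iq", pid, timestamp_us) + xdr_string(name)


def decode_event(buf: bytes):
    """Inverse of encode_event; the trailing padding is ignored."""
    pid, timestamp_us, length = struct.unpack_from(">IqI", buf, 0)
    name = buf[16:16 + length].decode("utf-8")
    return pid, timestamp_us, name


def recv_all(sock: socket.socket) -> bytes:
    """Read from a stream socket until the peer closes it."""
    chunks = []
    while True:
        chunk = sock.recv(4096)
        if not chunk:
            return b"".join(chunks)
        chunks.append(chunk)


record = encode_event(4242, 1_000_000, "loop_begin")

# A connected socket pair stands in for the application-to-monitor connection.
sender, receiver = socket.socketpair()
sender.sendall(record)
sender.close()
received = recv_all(receiver)
receiver.close()

print(len(received), decode_event(received))
```

A fixed binary layout like this keeps each event small, which is one plausible reading of "could be optimised for TCP transmission"; a real monitor would also need framing so that many records can share one connection.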

  9. Steps of application monitoring
     • Step 1: the user submits a job (and gets a GID from the broker)
     • Step 2: the user starts PROVE with the GID as a parameter
     • Step 3: PROVE looks up the execution site (search in R-GMA)
     • Step 4: PROVE looks up the address of the GAMI Main Monitor of the execution site (search in R-GMA)
     • Step 5: PROVE subscribes to the application trace at the GAMI Main Monitor
     • Step 6: the GAMI Main Monitor associates the application job id (GID) with the Unix process ids
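The steps above can be sketched as a small simulation. Everything here is hypothetical: `Registry` is an in-memory stand-in for the R-GMA searches, `MainMonitor` models only subscription and GID-to-PID association, and the job id, site name and address are made up for the example.

```python
class Registry:
    """In-memory stand-in for the R-GMA lookups of steps 3 and 4."""

    def __init__(self):
        self.execution_site = {}  # GID -> execution site
        self.main_monitor = {}    # site -> Main Monitor address


class MainMonitor:
    """Toy Main Monitor: forwards events of a job's processes to subscribers."""

    def __init__(self):
        self.gid_by_pid = {}
        self.subscribers = {}  # GID -> list of callbacks

    def subscribe(self, gid, callback):
        # Step 5: PROVE subscribes to the trace of job `gid`.
        self.subscribers.setdefault(gid, []).append(callback)

    def associate(self, gid, pids):
        # Step 6: tie the job id to the Unix process ids.
        for pid in pids:
            self.gid_by_pid[pid] = gid

    def publish(self, pid, event):
        gid = self.gid_by_pid.get(pid)
        for callback in self.subscribers.get(gid, []):
            callback(pid, event)


def start_monitoring(gid, registry, monitors, on_event):
    """Steps 3 to 5 as PROVE would perform them for a given GID."""
    site = registry.execution_site[gid]      # step 3
    address = registry.main_monitor[site]    # step 4
    monitors[address].subscribe(gid, on_event)  # step 5
    return address


registry = Registry()
registry.execution_site["job-001"] = "siteY"
registry.main_monitor["siteY"] = "mm.siteY:5000"
monitors = {"mm.siteY:5000": MainMonitor()}

trace = []
address = start_monitoring("job-001", registry, monitors,
                           lambda pid, event: trace.append((pid, event)))
monitors[address].associate("job-001", [311, 312])
monitors[address].publish(311, "start")
monitors[address].publish(999, "start")  # not part of the job, so not delivered
print(trace)
```

The sketch makes the dependency explicit: steps 3 and 4 only work if the broker and the Main Monitor have published their information where PROVE can search for it.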

  10. Problems of Application Monitoring
      • Problem 1: PROVE must find the execution site of the application
        • Where is it running? -> machineX.siteY
      • Problem 2: PROVE must find the monitor to connect to
        • What is the address of the GAMI Main Monitor running at siteY?
      • Problem 3: the GAMI Main Monitor must find the application
        • Which processes (PIDs) belong to application GID?
      • Solution: the information needed to solve these problems should be published in R-GMA => integration of R-GMA and GAMI is needed

  11. Problems
      • Problem 1: PROVE must find the execution site of the application
        • Where is it running? -> machineX.siteY
        • Broker -> R-GMA (discussion with WP1)
      • Problem 2: PROVE must find the monitor to connect to
        • What is the address of the GAMI Main Monitor running at siteY?
        • GAMI Main Monitor -> R-GMA

  12. Problems
      • Problem 3: the GAMI Main Monitor must find the application
        • Which processes (PIDs) belong to application GID?
      • Problem to be solved: 5 levels of job/process ids
        • GID (generated by the resource broker)
        • Condor-G ID
        • GRAM ID
        • Local job manager ID
        • Process ID
      • Discussion with WP1
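To show why five levels of ids are awkward, the sketch below resolves a GID to process ids by following one lookup table per level. The tables and all id values are invented for illustration; the slides do not say where each mapping would actually be stored.

```python
ID_LEVELS = ["GID", "Condor-G ID", "GRAM ID", "local job manager ID", "process ID"]


def resolve_pids(gid, mappings):
    """Follow four lookup tables from the GID down to the process ids.

    `mappings` holds one dict per step between adjacent levels of ID_LEVELS;
    the last dict maps a local job manager id to a list of PIDs.
    """
    current = gid
    for level, table in zip(ID_LEVELS[1:], mappings):
        if current not in table:
            raise KeyError(f"no {level} published for {current!r}")
        current = table[current]
    return current


# Hypothetical id values, one table per level transition.
mappings = [
    {"gid-42": "condor-7.0"},    # GID -> Condor-G ID
    {"condor-7.0": "gram-91"},   # Condor-G ID -> GRAM ID
    {"gram-91": "pbs-1234"},     # GRAM ID -> local job manager ID
    {"pbs-1234": [311, 312]},    # local job manager ID -> process IDs
]

print(resolve_pids("gid-42", mappings))  # prints "[311, 312]"
```

If any one of the four mappings is not published, the chain breaks, which is the difficulty this slide raises for discussion with WP1.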

  13. Temporary solution for the 3rd problem
      • The user defines a unique id for the application
      • The application process publishes this id to the GAMI Local Monitor
      • PROVE will use this id for collecting trace data
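A minimal sketch of this workaround, assuming a Local Monitor that keeps traces keyed by the user-defined id. The class and method names are hypothetical and do not reflect the real GAMI interfaces.

```python
from collections import defaultdict


class LocalMonitor:
    """Toy Local Monitor that groups trace events by a user-defined application id."""

    def __init__(self):
        self.app_id_by_pid = {}
        self.trace_by_app_id = defaultdict(list)

    def register(self, pid, app_id):
        # The application process publishes its user-defined id on start-up.
        self.app_id_by_pid[pid] = app_id

    def record(self, pid, event):
        # Events are stored under the application id, not under the PID.
        app_id = self.app_id_by_pid[pid]
        self.trace_by_app_id[app_id].append((pid, event))

    def collect(self, app_id):
        # PROVE asks for the trace using the same user-defined id.
        return list(self.trace_by_app_id.get(app_id, []))


monitor = LocalMonitor()
monitor.register(311, "my-simulation-run-1")
monitor.register(312, "my-simulation-run-1")
monitor.record(311, "start")
monitor.record(312, "start")
print(monitor.collect("my-simulation-run-1"))
```

Because the application itself announces the id, none of the intermediate job ids from the broker, Condor-G, GRAM or the local job manager are needed.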

  14. User support tools
      [Diagram: General Grid Monitoring Infrastructure (GGMI), comprising the Grid Status Monitoring Infrastructure, GSMI (R-GMA), and the Grid Application Monitoring Infrastructure, GAMI; tools: PULSE, PROVE, R-GMA Browser, GRM]

  15. Tools
      • Pulse:
        • Analysis and presentation of Grid performance data
      • R-GMA browser:
        • Web-based browser for the schemas and producers available within R-GMA
      • GRM:
        • Instrumentation library for trace collection
        • On-line and off-line monitoring of sequential and MPI applications
      • PROVE:
        • On-line and off-line visualization of traces for sequential and MPI applications

  16. Documents and reports
      • User's Manual for the stand-alone GRM/PROVE
      • GRM/PROVE User's Guide
      • Versions of GRM
      • Performance Monitoring, Analysis and Presentation for Grid Applications
      • Technical report about GRM within the EU-DataGrid project
      • http://www.lpds.sztaki.hu/~pnorbert/grm/

  17. Publications
      • From Cluster Monitoring to Grid Monitoring Based on GRM
        • EuroPar'2001, Manchester
      • Application Monitoring in the Grid with GRM and PROVE
        • Proc. of the International Conference on Computational Science - ICCS 2001, San Francisco
      • Presentation and Analysis of Grid Performance Data
        • EuroPar'2003, Klagenfurt
      • Pulse: A Tool for Presentation and Analysis of Grid Performance Data
        • MIPRO'2003, Opatija
      • http://www.lpds.sztaki.hu/~pnorbert/edg/publications/

  18. Summary
      • Advantages of the concept:
        • Gives a full Grid monitoring infrastructure including both
          • Status monitoring
          • Application monitoring
        • Supports on-line and off-line MPI application monitoring and visualization
        • Increases the chance that it will be used by LCG-2
        • No special or extra requirement for R-GMA
        • Integration will be done by SZTAKI
        • Gives the potential of competing with the US solutions
          • Already two prestigious papers at EuroPar'01 and EuroPar'03
          • Further potential publication in JOGC
