GridView - A Monitoring & Visualization tool for LCG

Rajesh Kalmady, Phool Chand, Kislay Bhatt, D. D. Sonvane, Kumar Vaibhav (B.A.R.C.). BARC-CERN/LCG Meeting, 15.09.2006. Gridview: New Developments (27th April to 15th September).

Presentation Transcript


  1. GridView - A Monitoring & Visualization tool for LCG Rajesh Kalmady, Phool Chand, Kislay Bhatt, D. D. Sonvane, Kumar Vaibhav B.A.R.C. BARC-CERN/LCG Meeting 15.09.2006

  2. Gridview : New Developments (During 27th April to 15th September) • Enhancements to Gridftp file transfer monitoring • Development of summarization and presentation modules for • Job Monitoring • Service Availability Monitoring • Deployment of all the new developments to production system

  3. File Transfer Monitoring • Enhanced Gridftp summarization and presentation modules for • VO-wise distribution of overall data transfers • VO-wise distribution of data transfers per Site • Site-wise distribution of data transfers per VO • Developed graphs and reports for data transfers from all sites to a given site (Hourly, Daily reports)
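The VO-wise and site-wise breakdowns listed on slide 3 amount to grouping transfer records and computing each group's share of the total. The following is a minimal sketch of that idea, not GridView's actual summarization module: the record layout, the site names and the function names are all hypothetical, and the byte counts are made-up illustration data.

```python
from collections import defaultdict

# Hypothetical transfer records: (source_site, dest_site, vo, bytes_transferred).
# All values are made up for illustration only.
transfers = [
    ("SITE-X", "SITE-A", "atlas", 400),
    ("SITE-X", "SITE-A", "cms", 100),
    ("SITE-X", "SITE-B", "cms", 500),
]

def vo_share(records):
    """Fraction of the transferred volume attributable to each VO."""
    totals = defaultdict(int)
    for _src, _dst, vo, nbytes in records:
        totals[vo] += nbytes
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {vo: n / grand_total for vo, n in totals.items()}

def vo_share_per_site(records):
    """VO-wise distribution of transfers, computed per destination site."""
    by_site = defaultdict(list)
    for rec in records:
        by_site[rec[1]].append(rec)  # rec[1] is the destination site
    return {site: vo_share(recs) for site, recs in by_site.items()}

overall = vo_share(transfers)
per_site = vo_share_per_site(transfers)
```

Swapping the grouping key (VO first, then site) would give the site-wise distribution per VO in the same way.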

  4. File Transfer Monitoring : Overall VO-wise Details

  5. File Transfer Monitoring : Site-wise details for a particular VO

  6. Job Monitoring • Developed summarization module for computation of job statistics • Developed presentation module to display periodic Graphs and Reports for • Job Status (Total Number of Jobs in various States) • Job Success Rate • Job Resource Utilization (Elapsed time, CPU, Memory) • Average Job Turnaround time (RB Waiting, Site Waiting, Execution Time) • Site, VO and RB-wise distribution • Hourly, Daily, Weekly and Monthly reports

  7. Job Monitoring (Cont…) • Developed periodic Graphs and Reports for • Overall Summary • sites with high/low job execution rate • sites with high/low job success rate • VOs running more/less jobs etc • Possible to view job statistics for any user selected combination of VO, Site and RB
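The job statistics on slides 6 and 7 (success rate, average turnaround components, and filtering by any combination of VO, Site and RB) can be illustrated with a short sketch. This is an assumption-laden illustration rather than GridView's code: the `Job` record, its field names, the helper functions and the sample values are all hypothetical, and treating total turnaround as the sum of the three components is an assumption not stated in the slides.

```python
from dataclasses import dataclass

@dataclass
class Job:
    # Hypothetical job record; field names are assumptions.
    vo: str
    site: str
    rb: str
    succeeded: bool
    rb_wait: float    # seconds spent waiting at the RB
    site_wait: float  # seconds spent waiting at the site
    exec_time: float  # seconds spent executing

def select(jobs, vo=None, site=None, rb=None):
    """Keep jobs matching every criterion that is not None."""
    return [
        j for j in jobs
        if (vo is None or j.vo == vo)
        and (site is None or j.site == site)
        and (rb is None or j.rb == rb)
    ]

def success_rate(jobs):
    """Fraction of jobs that succeeded, or None for an empty selection."""
    return sum(j.succeeded for j in jobs) / len(jobs) if jobs else None

def avg_turnaround(jobs):
    """Average of each turnaround component, plus their sum (assumed total)."""
    if not jobs:
        return None
    n = len(jobs)
    parts = {
        "rb_wait": sum(j.rb_wait for j in jobs) / n,
        "site_wait": sum(j.site_wait for j in jobs) / n,
        "exec_time": sum(j.exec_time for j in jobs) / n,
    }
    parts["total"] = sum(parts.values())
    return parts

# Made-up sample data for illustration only.
jobs = [
    Job("atlas", "SITE-A", "rb1", True, 10, 20, 100),
    Job("atlas", "SITE-B", "rb1", False, 30, 40, 0),
    Job("cms", "SITE-A", "rb2", True, 20, 60, 200),
]
atlas_rate = success_rate(select(jobs, vo="atlas"))
site_a_turnaround = avg_turnaround(select(jobs, site="SITE-A"))
```

Returning `None` for an empty selection keeps "no jobs" distinct from a 0% success rate, which matters when a user picks a VO, Site and RB combination that saw no activity.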

  8. Job Status : State-wise Distribution

  9. Job Status : VO-wise Distribution

  10. Job Status : RB-wise Distribution

  11. Job Status : Site-wise Distribution

  12. Job Monitoring : Job Success Rate

  13. Job Monitoring : Average Job Turnaround time

  14. Service Availability Monitoring • Developed summarization module for computation of Service Availability • based on SAM Test Results • AND (critical services) of OR (redundant services) • Developed presentation module to display periodic Graphs and Reports for • Central Service Availability (FTS, LFC, RB) • Aggregate tier-1 site Availability • Site-wise availability for individual tier-1 sites • Site-wise service availability of tier-2 sites (grouped by associated VOs) • Detailed availability of various services (CE, SE, SRM) and their individual instances running at a particular site
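The rule on slide 14, "AND (critical services) of OR (redundant services)", reads as: a site counts as available when every critical service has at least one working instance. A minimal sketch of that rule follows. The function names, the data layout, the choice of which services are critical, and the averaging of hourly flags into a period availability are assumptions made for illustration, not details taken from the GridView implementation.

```python
def site_is_available(instance_status, critical_services):
    """AND over critical services of OR over each service's instances.

    instance_status maps a service name to a list of booleans, one per
    instance (True = the instance passed its tests). A critical service
    with no known instances counts as unavailable.
    """
    return all(
        any(instance_status.get(service, []))
        for service in critical_services
    )

def availability_fraction(hourly_flags):
    """Share of sampled hours in which the site was available (assumed metric)."""
    if not hourly_flags:
        return None
    return sum(hourly_flags) / len(hourly_flags)

# Made-up test outcomes for one site at one point in time.
status = {"CE": [False, True], "SE": [True], "SRM": [False]}

with_srm = site_is_available(status, ["CE", "SE", "SRM"])  # SRM has no working instance
without_srm = site_is_available(status, ["CE", "SE"])      # one CE instance is enough
daily = availability_fraction([True, True, False, True])
```

Here the failed CE instance does not bring the site down because a redundant instance passes, while the single failed SRM instance does once SRM is treated as critical.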

  15. Service Availability Monitoring (Cont…) • Reports on Hourly, Daily, Weekly and Monthly basis • Traceability from Aggregate Availability to Individual Service Instance Availability • Provision for saving user preferences based on certificates

  16. Service Availability Monitoring : Central Service Availability

  17. Service Availability Monitoring : FTS Instance Availability

  18. Service Availability Monitoring : Aggregate T1 Site Availability

  19. Service Availability Monitoring : Tier-1 Site Availability

  20. Service Availability Monitoring : Site Detail Availability

  21. On-going Work • Presentation of Detailed SAM test results for traceability from Availability Graphs to corresponding tests • Development of Weekly and Monthly reports for All to Given site data transfers • Modification to Gridftp file transfer GUI and Reports in order to enable Multiple site selection (new request)

  22. Future Work • Visualization of FTS Statistics • Archival of Job data for jobs submitted directly to CE • Interfacing GridView with Information System (Top level BDII) for Resource Availability • Compute nodes (WNs), Storage etc.

  23. Future Work : Visualization of FTS Statistics • Currently GridView visualizes gridftp data transfer rates across the sites. • FTS statistics to be visualized include • Successful transfers • Failure rates • VO-wise, FTS server-wise and Channel-wise details of data transfers

  24. Problems • No data has been published to the R-GMA table JobMonitor for the past 2 months (in spite of repeated reminders) • Gridview Availability Depends on • R-GMA Service • Oracle Database Service • SAM/SFT tests • Instabilities in Gridview service caused by • R-GMA Instabilities • Registry failures, Monbox failures, Data loss etc. • Occasional Oracle downtime • Unannounced software upgrades on production machines leading to broken code • Subsequently, the Gridview address was added to the cern-quattor-announce mailing list and upgrades are now done manually by the Gridview team

  25. Thank You • Your comments and suggestions please
