PhEDEx Monitoring
Nicolò Magini, CERN IT-ES-VOS
For the PhEDEx development team
Computing and Offline Monitoring Workshop, 11/05/2011
Outline
- PhEDEx monitoring
  - PhEDEx webpage & datasvc
  - PhEDEx plots
  - PhEDEx shift monitoring
  - PhEDEx latency monitoring
  - PhEDEx agent monitoring
  - PhEDEx storage monitoring
NOTE: This is a quick summary; for more details see the talk at O&C week:
https://indico.cern.ch/materialDisplay.py?contribId=9&sessionId=21&materialId=slides&confId=132001
PhEDEx Datasvc
- PhEDEx Datasvc will be the single service for accessing information in the PhEDEx DB
  - For PhEDEx web
  - For external monitoring tools
  - For your own PhEDEx monitoring tool!
  - https://cmsweb.cern.ch/phedex/datasvc/doc
- Main areas of work in 2011
  - Performance
  - Validation of writable APIs
  - Consistency of arguments and output
  - Adding new APIs as requested/needed: SlowFiles, SlowSubscriptions, DataTypeUsage…
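As a starting point for "your own PhEDEx monitoring tool", the sketch below builds a datasvc request URL and pulls records out of a decoded JSON reply. It assumes a `<base>/<format>/<instance>/<api>` URL layout and a top-level `phedex` object in the response; both should be checked against the datasvc documentation linked above. The sample payload and node names are made up for illustration, and no network call is made.

```python
import json
from urllib.parse import urlencode

# Base URL taken from the slide (without the trailing /doc).
BASE_URL = "https://cmsweb.cern.ch/phedex/datasvc"


def datasvc_url(api, params=None, fmt="json", instance="prod"):
    """Build a request URL, ASSUMING a <base>/<format>/<instance>/<api> layout."""
    url = f"{BASE_URL}/{fmt}/{instance}/{api}"
    if params:
        # doseq=True lets a list value expand into repeated query arguments.
        url += "?" + urlencode(params, doseq=True)
    return url


def extract_records(payload, key):
    """Return the list stored under `key` in a JSON reply.

    ASSUMES the reply wraps its results in a top-level "phedex" object;
    returns an empty list if either level is missing.
    """
    return json.loads(payload).get("phedex", {}).get(key, [])


# Hypothetical sample reply, for illustration only.
sample = '{"phedex": {"node": [{"name": "T1_Example_Buffer"}, {"name": "T2_Example"}]}}'
names = [record["name"] for record in extract_records(sample, "node")]
```

In a real tool the payload would come from an HTTP GET on the URL returned by `datasvc_url`; keeping URL construction and parsing separate makes both easy to test offline.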
PhEDEx webpage
- Existing webpage is not maintainable
  - Single file of 10000 lines of Perl code
- Next-gen prototype not widely used
  - Unfamiliar, missing functionality
- Gradually replace pages in the old webpage with next-gen modules using datasvc as backend
  - First example: new request panel
  - Upcoming: subscriptions page
    - Eventually with shopping cart
  - Rest over the course of 2011
- https://cmsweb.cern.ch/phedex
PhEDEx monitoring plots
- Porting to Overview/Plotfairy framework
  - Note: Plotfairy backend support is independent of maintenance of the Overview page
  - Working to complete by this summer
- Other options explored, e.g. protovis
PhEDEx shift monitoring
- First next-gen monitoring panel for shifters has been available for a few months
  - Others will be added in the coming months
- Other specialized monitoring panels are already provided in the next-gen prototype, but are not widely used
Block latency monitoring
- Latency monitor schema/agents
  - Debugging/understanding content of current table
  - Will extend schema to record more events, e.g. 25%/50%/75%/95% block completion marks
  - In progress, should be on Testbed by end of the month
- Latency visualisation (in summer)
  - Datasvc API
  - Latency plots
- To explore: publish per-file latency stats from FilePump logs
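To make the completion-mark idea concrete, here is a minimal sketch of how the 25%/50%/75%/95% marks of a block could be derived from per-file arrival times. This is not the PhEDEx schema or agent code; the function name, the input shape, and the choice to count files rather than bytes are assumptions made for illustration.

```python
def completion_marks(arrival_times, total_files, marks=(25, 50, 75, 95)):
    """Map each percentage mark to the time at which it was reached.

    arrival_times: timestamps of the files that have arrived so far.
    total_files:   number of files in the block.
    A mark that has not been reached yet maps to None.
    """
    arrived = sorted(arrival_times)
    result = {}
    for pct in marks:
        # Smallest file count covering pct% of the block (integer ceiling).
        needed = (total_files * pct + 99) // 100
        if 0 < needed <= len(arrived):
            result[pct] = arrived[needed - 1]
        else:
            result[pct] = None
    return result


# Hypothetical block of 8 files, 6 of which have arrived.
example = completion_marks([10, 20, 30, 40, 50, 60], total_files=8)
```

Subtracting the block's subscription time from each mark would give the latency to reach that completion level, which is the kind of quantity a latency plot could show.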
PhEDEx agent monitoring
- PHEDEX_4_0_0 includes improvements for site agent health monitoring
  - Information from all site agents is collected by the local watchdog agent
  - Watchdog now produces a daily report on agent activity
    - Agent alerts, agent CPU/mem usage, etc.
    - Report content can be customized with a site-specific plugin
  - Watchdog report can then be sent to site admins by various methods
  - Could also be collected centrally for shifter monitoring, complementing the Agent Status webpage
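The plugin idea can be sketched as follows: a core report lists every agent with its alert state and resource usage, and an optional site-specific callable appends extra lines. This is only an illustration in Python, not the watchdog's actual plugin interface; the record fields, function names, and memory limit are all hypothetical.

```python
def build_report(agent_records, plugin=None):
    """Build a plain-text daily report from per-agent records.

    agent_records: list of dicts with the HYPOTHETICAL keys
                   "agent", "alert", "cpu_percent" and "mem_mb".
    plugin:        optional callable returning extra report lines.
    """
    lines = []
    for rec in sorted(agent_records, key=lambda r: r["agent"]):
        status = "ALERT" if rec["alert"] else "ok"
        lines.append(
            f"{rec['agent']}: {status} cpu={rec['cpu_percent']}% mem={rec['mem_mb']}MB"
        )
    if plugin is not None:
        # Site-specific additions go after the standard section.
        lines.extend(plugin(agent_records))
    return "\n".join(lines)


def high_memory_plugin(agent_records, limit_mb=500):
    """Example site plugin: flag agents above a memory limit."""
    return [
        f"high memory: {rec['agent']}"
        for rec in agent_records
        if rec["mem_mb"] > limit_mb
    ]


# Made-up records for illustration.
records = [
    {"agent": "FileDownload", "alert": False, "cpu_percent": 12, "mem_mb": 650},
    {"agent": "BlockDownloadVerify", "alert": True, "cpu_percent": 3, "mem_mb": 120},
]
report = build_report(records, plugin=high_memory_plugin)
```

The resulting text could then be mailed to site admins or shipped to a central collector, matching the two delivery options mentioned on the slide.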
PhEDEx storage monitoring
- PhEDEx Namespace
  - Framework for efficient interaction with local storage
    - Caching, storage dumps, directories…
  - Currently used by the BlockDownloadVerify agent
  - Could also be used more widely by other scripts or local tools, e.g. FileDownloadVerify scripts
- Also evaluating use of Namespace to generate space accounting reports of local storage
  - Including storage areas not in PhEDEx, e.g. /store/user
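A space accounting report built from a storage dump could look roughly like the sketch below, which sums file sizes per top-level directory. The dump format (one `path size_in_bytes` pair per line) and the function name are assumptions for illustration; real storage dumps differ between storage technologies, and this is not the Namespace framework's API.

```python
from collections import defaultdict


def space_by_directory(dump_lines, depth=2):
    """Sum file sizes by the first `depth` components of each path.

    dump_lines: iterable of "path size_in_bytes" strings
                (a HYPOTHETICAL dump format).
    """
    usage = defaultdict(int)
    for line in dump_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Split on the last whitespace run so paths may contain spaces.
        path, size = line.rsplit(None, 1)
        parts = [p for p in path.split("/") if p]
        key = "/" + "/".join(parts[:depth])
        usage[key] += int(size)
    return dict(usage)


# Made-up dump covering an area outside PhEDEx (/store/user) as well.
dump = [
    "/store/user/alice/f1.root 100",
    "/store/user/bob/f2.root 250",
    "/store/data/Run2011A/f3.root 1000",
]
totals = space_by_directory(dump)
```

Raising `depth` to 3 would break `/store/user` down per user, which is one way to account for areas that PhEDEx itself does not track.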
Summary
- Datasvc
  - Framework for providing information from TMDB. Any operator SQL can (should!) become an API
- Website
  - Take the good from the next-gen prototype lesson: richer navigation, filtering and presentation
- Watchdog agent
  - Report summaries and alerts, as desired by the user
- Namespace framework
  - Generic, lightweight framework for SE interaction; can be used by sites for all sorts of SE tools