
The Grid Observatory

The Grid Observatory. Operated by Laboratoire de Recherche en Informatique, Laboratoire de l’Accélérateur Linéaire, and Imperial College London. With the support of France Grilles (French NGI of EGI), EGI-Inspire, and the Ile de France council (Software and Complex Systems programme).


Presentation Transcript


  1. The Grid Observatory

  2. Operated by • Laboratoire de Recherche en Informatique • Laboratoire de l’Accélérateur Linéaire • Imperial College London

  3. With the support of • France Grilles – French NGI of EGI • EGI-Inspire • Ile de France council (Software and Complex Systems programme) • INRIA – Saclay (ADT programme) • CNRS (PEPS programme) • University Paris Sud (MRM programme)

  4. Digital Curation

  5. for the EGI behavioral data

  6. Production since October 2008 (CCGrid 2011)

  7. Traces available through the portal

  8. No grid certificate required

  9. Try at our booth!

  10. Try at our booth!

  11. On top of EGI monitoring, anonymized • Sources: Torque CE, Logging & Bookkeeping, BDII, IC RTM, WMS • Access protocols: SQL, LDAP, HTTP, SFTP • Pipeline: Incoming, Anonymisation, Upload • Grid Observatory Portal • DPM via HTTPS • Storage Elements

  12. The GO users

  13. Lessons learned: Sociology • Running a production system for usage by computer science and engineering is nearly uncharted territory; we are among a few explorers • Verified that 80% of the cost of Data Mining is in pre-processing

  14. Lessons learned: Technique • Build on existing monitoring tools • No fancy technology: the goal is usage, not the tool

  15. The Green Computing Observatory

  16. The first barrier to improved energy efficiency is the difficulty of collecting data on the energy usage of individual components, and the lack of overall data collection

  17. The GCO monitors energy usage at a large computing center, and publishes the measurements through the Grid Observatory.

  18. A second barrier is making the collected data usable, consistent and complete. GCO adopts an ontological approach in order to rigorously define the semantics of the data and the context of their production.

  19. The GRIF-LAL computing room • The LAL Computing Room: 240 machines, 2200+ cores, 500 TB of storage • Mainly a Tier 2 in the EGI grid, but also includes local services and the StratusLab Cloud testbed • Accessible approximation of a data center

  20. Sensors • 1-minute sampling period

  21. Source: http://www.netways.de/uploads/media/Werner_Fischer_-The-Power-Of-IPMI.pdf

  22. The visualizer

  23. Extension of DOLCE

  24. Models, Policies, Autonomics

  25. A very complex system

  26. Complex systems description

  27. Statistical and Learning models

  28. Optimization and Autonomics

  29. Stationarity?

  30. Stationarity? Then heavy tails

  31. The physical process is not stationary

  32. The physical process is not stationary

  33. Dealing with non-stationarity • Adaptive clustering with application to fault diagnosis: Toward Autonomic Grids: Analyzing the Job Flow with Affinity Streaming. SIGKDD'2009 • MDL segmentation applied to workload: Discovering Piecewise Linear Models of Grid Workload. CCGRID 2010

  34. Intelligibility: how to build knowledge? • Supervised learning? No reference, and experts are too rare • Let’s build it on-line! Model-free policies, e.g. Reinforcement Learning! • Unfortunately, tabula rasa policies and vanilla ML methods are too often defeated [Rish & Tesauro 2006]. Exploration/exploitation tradeoff

  35. Intelligibility • Fault models: Distributed Monitoring with Collaborative Prediction. CCGRID 2012 • Cloud management: Characterizing E-Science File Access Behavior via Latent Dirichlet Allocation. UCC 2011
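
The exploration/exploitation tradeoff named on slide 34 can be illustrated with a minimal sketch. The code below is an epsilon-greedy agent on a toy multi-armed bandit. It is a generic textbook illustration, not the method used by the Grid Observatory authors or by the cited papers. The function name `epsilon_greedy`, the Gaussian reward model, and all parameter values are assumptions made for this example.

```python
import random


def epsilon_greedy(true_means, epsilon, steps, seed=0):
    """Run an epsilon-greedy agent on a toy Gaussian bandit (illustrative only).

    true_means: hypothetical mean reward of each arm (unknown to the agent).
    epsilon:    probability of picking a random arm (exploration).
    Returns (estimates, counts, total_reward).
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    estimates = [0.0] * n_arms  # running mean reward observed per arm
    counts = [0] * n_arms       # number of times each arm was pulled
    total_reward = 0.0
    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: try an arm at random, regardless of current estimates.
            arm = rng.randrange(n_arms)
        else:
            # Exploit: pick the arm with the best estimate so far.
            arm = max(range(n_arms), key=lambda a: estimates[a])
        reward = rng.gauss(true_means[arm], 1.0)
        counts[arm] += 1
        # Incremental update of the running mean for the chosen arm.
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total_reward += reward
    return estimates, counts, total_reward


if __name__ == "__main__":
    estimates, counts, total = epsilon_greedy([0.0, 0.5, 5.0], epsilon=0.1, steps=5000)
    print("pulls per arm:", counts)
    print("estimated means:", [round(e, 2) for e in estimates])
```

With a small `epsilon` the agent spends most pulls on the arm it currently believes is best, yet keeps sampling the others. A policy that starts from no prior knowledge has to pay this exploration cost on the live system, which is one way to read the slide's caution about tabula rasa policies.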
