1 / 14

Grid services based architectures

Explore the concepts and buzzwords surrounding Grid services for computing grids. Learn about the architecture, job auditing, provenance information services, metadata catalogs, file management, workload management, and more. Discover the importance of scalable solutions and practical systems. Dive into the evolution of metadata cataloging and file management services, and understand the challenges and opportunities in this critical area.

mmeza
Download Presentation

Grid services based architectures

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Grid services based architectures Some buzz words, sorry… • Growing consensus that Grid services is the right concept for building the computing grids; • Recent ARDA work has provoked quite a lot of interest: • In experiments; • SC2 GTA group; • EGEE • Personal opinion - this the right concept that arrived at the right moment: • Experiments need practical systems; • EDG is not capable to provide one; • Need for pragmatic, scalable solution without having to start from scratch.

  2. Tentative ARDA architecture Job Auditing Provenance Information Service 1: 2: Authentication 3: Authorisation User Interface API 6: 4: Accounting Metadata DB Proxy Catalogue 14: Grid 5: Monitoring 13: 12: File 7: Catalogue Workload Management 10: 9: Package Data Manager 8: Management Discuss more these parts in the following 11: Computing 15: Element Job Monitor Storage Element

  3. Metadata catalogue (Bookkeeping database) (1) • LHCb Bookkeeping: • Very flexible schema ; • Storing objects (jobs, qualities, others ?) ; • Available as a services (XML-RPC interface) ; • Basic schema is not efficient for generic queries: • need to build predefined views (nightly ?) ; • views fit the query by the web form, what about generic queries ? • data is not available immediately after production ; Needs further development, thinking, searching …

  4. Metadata catalogue (Bookkeeping database) (2) Possible evolution: keep strong points, work on weaker ones • Introduce hierarchical structure • HEPCAL recommendation for eventual DMC ; • AliEn experience ; • Better study sharing parameters between thejob and file objects ; • Other possible ideas. This is a critical area – worth investigation ! But… (see next slide)

  5. Metadata catalogue (Bookkeeping database) (3) • Man power problems : • For development, but also for maintenance ; • We need the best possible solution : • Evaluate other solutions: • AliEn, DMC eventually ; • Contribute to the DMC development ; • LHCb Bookkeeping as a standard service : • Replaceable if necessary ; • Fair test of other solutions.

  6. Metadata catalogue (Bookkeeping database) (4) • Some work has started in Marseille: • AliEn FileCatalogue installed and populated by the information from the LHCb Bookkeeping ; • Some query efficiencies measurements done ; • The results are not yet conclusive : • Clearly fast if the search in the hierarchy of directories; • Not so fast if more tags are included in the query; • Very poor machine used in CPPM – not fair to compare to the CERN Oracle server. • The work started to provide single interface to both AliEn FileCatalogue and LHCb Bookkeeping ; • How to continue: • CERN group – possibilities to contribute ? • CPPM group will continue to follow this line, but resources are limited ; • Collaboration with other projects is essential;

  7. File Catalogue (Replica database) (1) • The LHCb Bookkeeping was not conceived with the replica management in mind – added later ; • File Catalog needed for many purposes: • Data; • Software distribution; • Temporary files (job logs, stdout, stderr, etc); • Input/Output sandboxes ; • Etc, etc • Absolutely necessary for DC2004; • File Catalog must provide controlled access to its data (private group, user directories) ; In fact we need a full analogue of a distributed file system

  8. File Catalogue (Replica database) (2) • We should look around for possible solutions : • Existing ones (AliEn, RLS) : • Will have Grid services wrapping soon ; • Will comply with the ARDA architecture eventually; • Large development teams behind (RLS, EGEE ?) • This should be coupled with the whole range of the data management tools: • Browsers; • Data transfers, both scheduled and on demand; • I/O API (POOL, user interface). This is a huge enterprise, and we should rely on using one of the available systems

  9. File Catalogue (Replica database) (3) • Suggestion is to start with the deployment of the AliEn FileCatalogue and data management tools: • Partly done; • Pythonify the AliEn API: • This will allow developing GANGA and other applicationplugins; • Should be easy as the C++ API (almost) exist. • Should be interfaced with the DIRAC workload management (see below); • Who ? CPPM group, others are very welcome; • Where ? Install the server at CERN. • Follow the evolution of the File Catalogue Grid services (RLS team will not yield easily !); This is a huge enterprise, we should rely on using one of the available systems

  10. Workload management (1) • The present production service is OK for the simulation production tasks ; • We need more: • Data Reprocessing in production (planned); • User analysis (spurious); • Flexible policies: • Quotas; • Accounting; • Flexible job optimizations (splitting, input prefetching, output merging, etc) ; • Flexible job preparation (UI) tools ; • Various job monitors (web portals, GANGA plugins, report generators, etc); • Job interactivity; • …

  11. Workload management (2) • Possibilities to choose from: • Develop the existing service ; • Use another existing service ; • Start developing the new one. • Suggestion – a mixture of all of choices: • Start developing the new workload management service using existing agents based infrastructure and borrowing some ideas from the AliEn workload management: • Already started actually (V. Garonne); • First prototype expected next week; • Will also try OGSI wrapper for it (Ian Stokes-Rees); • Keep the existing service as jobs provider for the new one.

  12. Workload management architecture GANGA Workload management Production service Optimizer 1 Job Receiver Optimizer 1 Optimizer 1 Command line UI Job DB Site Agent 1 CE 1 Job queue Match Maker Agent 2 CE 2 Agent 3 CE 3

  13. Workload management (3) • Technology: • JDL job description; • Condor Classad library for matchmaking; • MySQL for Job DB and Job Queues; • SOAP (OGSI) external interface; • SOAP and/or Jabber internal interfaces; • Python as development language; • Linux as deployment platform. • Dependencies: • File catalog and Data management tools: • Input/Output sandboxes; • CE • DIRAC CE ; • EDG CE wrapper.

  14. Conclusions • Most experiment dependant services are to be developed within the DIRAC project: • MetaCatalog (Job Metadata Catalog); • Workload management (with experiment specific policies and optimizations); • Can be eventually our contribution to the common pool of services. • Get other services from the emerging Grid services market: • Security/Authentication/Authorization, FileCatalog, DataMgmt, SE, CE, Information,… • Aim at having DC2004 done with the new (ARDA) services based architecture • Should be ready for deployment January 2004

More Related