Grid services based architectures
Some buzz words, sorry…
• Growing consensus that Grid services are the right concept for building the computing grids;
• Recent ARDA work has provoked quite a lot of interest:
  • In experiments;
  • SC2 GTA group;
  • EGEE.
• Personal opinion: this is the right concept arriving at the right moment:
  • Experiments need practical systems;
  • EDG is not capable of providing one;
  • Need for a pragmatic, scalable solution without having to start from scratch.
Tentative ARDA architecture
[Diagram: the User Interface / API connects to the numbered ARDA Grid services – Authentication, Authorisation, Auditing, Accounting, Job Provenance, Information Service, Metadata Catalogue, File Catalogue, DB Proxy, Grid Monitoring, Workload Management, Data Management, Package Manager, Job Monitor, Computing Element and Storage Element.]
Some of these parts are discussed in more detail in the following slides.
Metadata catalogue (Bookkeeping database) (1)
• LHCb Bookkeeping:
  • Very flexible schema;
  • Storing objects (jobs, qualities, others?);
  • Available as a service (XML-RPC interface; a query sketch follows this slide);
• Basic schema is not efficient for generic queries:
  • Need to build predefined views (nightly?);
  • Views fit the queries made through the web form, but what about generic queries?
  • Data are not available immediately after production.
Needs further development, thinking, searching…
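As a rough illustration of what "available as a service (XML-RPC interface)" means for a client, the following minimal Python sketch queries such a service by metadata tags; the endpoint URL, the method name and the tag names are assumptions for illustration, not the actual Bookkeeping interface.

```python
# Hypothetical client-side query of the LHCb Bookkeeping XML-RPC service.
# The URL, the "query" method and the tag names are illustrative only.
import xmlrpc.client

# Connect to the (hypothetical) bookkeeping service endpoint
server = xmlrpc.client.ServerProxy("http://lhcb-bookkeeping.example.cern.ch/RPC")

# A generic query: select datasets by a few metadata tags.  With the basic
# schema such queries are slow, which is why predefined views are built.
result = server.query({"EventType": "41000000",
                       "ProgramVersion": "v12r3",
                       "ProductionID": 1234})

for record in result:
    print(record)
```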
Metadata catalogue (Bookkeeping database) (2)
Possible evolution: keep the strong points, work on the weaker ones.
• Introduce a hierarchical structure:
  • HEPCAL recommendation for the eventual DMC;
  • AliEn experience;
• Study further how parameters are shared between the job and file objects;
• Other possible ideas.
This is a critical area – worth investigating!
But… (see next slide)
Metadata catalogue (Bookkeeping database) (3)
• Manpower problems:
  • For development, but also for maintenance;
• We need the best possible solution:
  • Evaluate other solutions:
    • AliEn, DMC eventually;
  • Contribute to the DMC development;
• LHCb Bookkeeping as a standard service:
  • Replaceable if necessary;
  • Fair test of other solutions.
Metadata catalogue (Bookkeeping database) (4)
• Some work has started in Marseille:
  • AliEn FileCatalogue installed and populated with the information from the LHCb Bookkeeping;
  • Some query efficiency measurements done;
  • The results are not yet conclusive:
    • Clearly fast if the search follows the hierarchy of directories;
    • Not so fast if more tags are included in the query;
    • A very modest machine was used at CPPM – not fair to compare with the CERN Oracle server.
  • Work has started to provide a single interface to both the AliEn FileCatalogue and the LHCb Bookkeeping (see the sketch after this slide);
• How to continue:
  • CERN group – possibilities to contribute?
  • The CPPM group will continue to follow this line, but resources are limited;
  • Collaboration with other projects is essential.
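One possible shape for the "single interface" mentioned above is a common query class with one back-end per catalogue; this is a minimal sketch, and the class and method names are assumptions rather than the actual project code.

```python
# Illustrative common interface in front of the AliEn FileCatalogue and the
# LHCb Bookkeeping.  Names are hypothetical.
class MetadataCatalogue:
    """Common interface for metadata/file catalogue back-ends."""

    def find_files(self, tags):
        """Return logical file names matching a dict of metadata tags."""
        raise NotImplementedError


class AliEnCatalogue(MetadataCatalogue):
    def find_files(self, tags):
        # Would translate the tags into a directory-hierarchy search plus tag cuts
        ...


class LHCbBookkeeping(MetadataCatalogue):
    def find_files(self, tags):
        # Would translate the tags into a query against the bookkeeping views
        ...
```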
File Catalogue (Replica database) (1)
• The LHCb Bookkeeping was not conceived with replica management in mind – it was added later;
• A File Catalogue is needed for many purposes:
  • Data;
  • Software distribution;
  • Temporary files (job logs, stdout, stderr, etc.);
  • Input/Output sandboxes;
  • Etc., etc.
• Absolutely necessary for DC2004;
• The File Catalogue must provide controlled access to its data (private group and user directories).
In fact we need a full analogue of a distributed file system.
File Catalogue (Replica database) (2)
• We should look around for possible solutions:
  • Existing ones (AliEn, RLS):
    • Will have Grid services wrappers soon;
    • Will comply with the ARDA architecture eventually;
    • Have large development teams behind them (RLS, EGEE?).
• This should be coupled with the whole range of data management tools:
  • Browsers;
  • Data transfers, both scheduled and on demand;
  • I/O API (POOL, user interface).
This is a huge enterprise, and we should rely on using one of the available systems.
File Catalogue (Replica database) (3)
• The suggestion is to start with the deployment of the AliEn FileCatalogue and data management tools:
  • Partly done;
  • Pythonify the AliEn API (see the sketch after this slide):
    • This will allow developing GANGA and other application plugins;
    • Should be easy, as the C++ API (almost) exists.
  • Should be interfaced with the DIRAC workload management (see below);
  • Who? The CPPM group; others are very welcome;
  • Where? Install the server at CERN.
• Follow the evolution of the File Catalogue Grid services (the RLS team will not yield easily!).
This is a huge enterprise; we should rely on using one of the available systems.
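To give a feel for what the Pythonified API could look like to a GANGA or DIRAC plugin, here is a hedged stub sketch; the class, method and argument names are invented for illustration and do not reproduce the real AliEn bindings.

```python
# Hypothetical shape of a Pythonified AliEn FileCatalogue wrapper.
# All names are illustrative assumptions.
class FileCatalogue:
    """Thin Python wrapper that would delegate to the AliEn C++ API."""

    def __init__(self, server):
        self.server = server   # e.g. the catalogue server installed at CERN

    def register_replica(self, lfn, pfn, storage_element):
        """Record that a physical copy of the logical file exists at an SE."""
        ...

    def list_replicas(self, lfn):
        """Return the known physical replicas of a logical file name."""
        ...

# Example use from an application plugin:
fc = FileCatalogue("fc-server.example.cern.ch")
fc.register_replica(lfn="/lhcb/DC2004/00001/file_001.dst",
                    pfn="castor.cern.ch:/lhcb/file_001.dst",
                    storage_element="CERN-Castor")
```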
Workload management (1)
• The present production service is OK for the simulation production tasks;
• We need more:
  • Data reprocessing in production (planned);
  • User analysis (spurious);
  • Flexible policies:
    • Quotas;
    • Accounting;
  • Flexible job optimizations (splitting, input prefetching, output merging, etc.);
  • Flexible job preparation (UI) tools;
  • Various job monitors (web portals, GANGA plugins, report generators, etc.);
  • Job interactivity;
  • …
Workload management (2)
• Possibilities to choose from:
  • Develop the existing service;
  • Use another existing service;
  • Start developing a new one.
• Suggestion – a mixture of all of these choices:
  • Start developing the new workload management service using the existing agent-based infrastructure and borrowing some ideas from the AliEn workload management:
    • Already started, actually (V. Garonne);
    • First prototype expected next week;
    • Will also try an OGSI wrapper for it (Ian Stokes-Rees);
  • Keep the existing service as a job provider for the new one.
Workload management architecture
[Diagram: GANGA, the Production service and a command-line UI submit jobs to a Job Receiver, which stores them in the Job DB; Optimizers move jobs into the Job Queue; a Match Maker serves them to site Agents (Agent 1–3), each driving a Computing Element (CE 1–3).]
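A minimal sketch of the pull model shown in the diagram: a site Agent periodically asks the Match Maker for a job suited to its CE and hands it over for execution. The service URL, the XML-RPC transport and the method names are illustrative assumptions, not the actual DIRAC code.

```python
# Toy site-Agent loop for the pull model; names and transport are assumptions.
import time
import xmlrpc.client

MATCHMAKER_URL = "http://dirac-wms.example.cern.ch/MatchMaker"  # hypothetical

def submit_to_ce(job):
    # Placeholder: in the real service this would go through the CE back-end
    print("Running job", job.get("JobID"))

def agent_loop(ce_description, poll_interval=60):
    """Pull jobs matching this CE's capabilities and run them."""
    matchmaker = xmlrpc.client.ServerProxy(MATCHMAKER_URL)
    while True:
        # Hypothetical call: returns a job description, or an empty result
        job = matchmaker.request_job(ce_description)
        if job:
            submit_to_ce(job)
        else:
            time.sleep(poll_interval)   # nothing matched, wait and retry
```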
Workload management (3)
• Technology:
  • JDL job description;
  • Condor ClassAd library for matchmaking (the idea is sketched after this slide);
  • MySQL for the Job DB and Job Queues;
  • SOAP (OGSI) external interface;
  • SOAP and/or Jabber internal interfaces;
  • Python as the development language;
  • Linux as the deployment platform.
• Dependencies:
  • File Catalogue and data management tools:
    • Input/Output sandboxes;
  • CE:
    • DIRAC CE;
    • EDG CE wrapper.
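To illustrate the matchmaking step only, here is a pure-Python toy: a job's JDL-style requirements are compared against descriptions of the computing elements. The real service would use the Condor ClassAd library; the attribute names below are invented for the example.

```python
# Toy matchmaking between job requirements and CE descriptions.
# Attribute names are hypothetical; the real service uses Condor ClassAds.
job = {"Platform": "Linux", "MaxCPUTime": 86400}

computing_elements = [
    {"Name": "CE1", "Site": "CERN", "MaxCPUTime": 172800, "Platform": "Linux"},
    {"Name": "CE2", "Site": "CPPM", "MaxCPUTime": 43200,  "Platform": "Linux"},
]

def matches(job, ce):
    """A CE matches if it offers enough CPU time on the right platform."""
    return (ce["Platform"] == job["Platform"]
            and ce["MaxCPUTime"] >= job["MaxCPUTime"])

eligible = [ce["Name"] for ce in computing_elements if matches(job, ce)]
print(eligible)  # -> ['CE1']
```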
Conclusions
• Most experiment-dependent services are to be developed within the DIRAC project:
  • MetaCatalog (Job Metadata Catalogue);
  • Workload management (with experiment-specific policies and optimizations);
  • Can eventually become our contribution to the common pool of services.
• Get other services from the emerging Grid services market:
  • Security/Authentication/Authorization, FileCatalog, DataMgmt, SE, CE, Information, …
• Aim at having DC2004 done with the new (ARDA) services-based architecture:
  • Should be ready for deployment by January 2004.