GridPP2: Data and Storage Management

GridPP2: Data and Storage Management. Gavin McCance - University of Glasgow Jens Jensen - RAL GridPP9, NeSC, Edinburgh. GridPP2 Middleware Data and Storage Management. Work areas. UK metadata management group Storage management. Metadata Management.


Presentation Transcript


  1. GridPP2: Data and Storage Management • Gavin McCance - University of Glasgow • Jens Jensen - RAL • GridPP9, NeSC, Edinburgh

  2. GridPP2 Middleware: Data and Storage Management

  3. Work areas • UK metadata management group • Storage management

  4. Metadata Management • The focus is upon Grid-enabling metadata services for the experiments • Building upon our previous work in this area • Building upon experiments’ existing work in this area • Formation of a UK metadata group with GridPP2 • 1 generic Grid metadata post @ Glasgow • ~1 post per experiment • ATLAS @ Glasgow, LHCb @ Oxford, CMS @ Bristol/IC • US expts, others?? • These posts were described yesterday – the UK metadata group should form part of their work • Input from the UK data management support teams

  5. GridPP2 Metadata Group • Purpose will be to • Take overall responsibility for common experiment metadata technologies in order to Grid-enable the experiments’ metadata • Identify the commonalities and experience across experiments and make sure these are recognized • i.e. technologies, schema: data product navigational problem • Come to agreement and feed this back into the wider ARDA process • Work directly with interested groups forming the ARDA • EGEE JRA1 Data Management Group (@CERN) • LCG Deployment Teams (@CERN) • LCG Experiments • IT Database group (@CERN)

  6. Metadata Responsibilities • Generic metadata post: • Concentration on the technologies used to create scalable, manageable and fault-tolerant metadata services • The underlying Grid software stack • Emphasis upon the service, not just the product • 24/7 supportable production services • Not prescribing things like the schema, or saying the ‘API must look like Spitfire’: prototype interfaces should be based upon experiments’ existing metadata interfaces • Will track, develop and adopt as necessary Grid metadata access standards • Feed into standards to make sure we’re in a position to benefit from the future production products that implement these standards • Feed PPE use-case and experience back into the wider world

  7. Metadata Responsibilities • Experiment metadata posts (~1 per experiment): • Document existing implementations from the experiments and make sure all the experiments’ use-cases are satisfied by the products and the technologies being proposed by the group • Work within the group to ensure that commonalities and experience across experiments are recognized and effort is not wasted • At the technology level – e.g. using the same underlying Grid software stack • At the interface level – e.g. GANGA • Possibly at the schema level… • Feed this understanding and agreement back into the wider ARDA process and back into their own experiments • ARDA terminology: • Dataset metadata → ARDA Metadata service • Data product navigation → ARDA Job Provenance service

  8. Storage Management • Two areas of work (based at RAL) • SRM interface to UK storage sites • Site local data management

  9. SRM interface to UK Storage • Initial deliverable will be to provide an SRM (Storage Resource Manager) v1 interface to the Atlas DataStore at RAL • Subsequent migration to the more advanced features offered by e.g. SRM v2 • Perform an analysis of the UK Tier-2 storage sites and how these can be exposed via the common SRM interface • Implementation of SRM interfaces to these storage systems • Deployment on all the Tier-2 sites and support • Contribution to the SRM standardisation process • Work closely with the EGEE JRA1 and LCG deployment groups • Work with support staff for Tier-1 and Tier-2

  10. Site-local Data Management • Management of data and files within a site • How you access the grid storage from the worker nodes • Cleanup of volatile data resources that a job no longer needs (Tier2) – cache management • Evaluation of existing technologies • dCache, SAM, EDG Zambo prototype, Condor, … • Development and deployment of these local data management solutions (@ Tier-2) • Interaction with Tier-2 site managers is vital • Feed back solutions into LCG / EGEE
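
  The cache-management item on slide 10 (cleanup of volatile data that a job no longer needs) could look roughly like the sketch below. This is only an illustration, not any of the tools named on the slide: the function names, the age-based policy and the use of file modification time as the "no longer needed" signal are all assumptions made for the example.

```python
import time
from pathlib import Path


def find_stale_files(cache_dir, max_age_seconds, now=None):
    """List files under cache_dir whose last modification is older than max_age_seconds.

    Hypothetical policy: age since last modification stands in for
    "the job no longer needs this file".
    """
    now = time.time() if now is None else now
    stale = []
    for path in Path(cache_dir).rglob("*"):
        if path.is_file() and now - path.stat().st_mtime > max_age_seconds:
            stale.append(path)
    return sorted(stale)


def clean_cache(cache_dir, max_age_seconds, now=None, dry_run=True):
    """Delete stale files (or only report them when dry_run is True)."""
    stale = find_stale_files(cache_dir, max_age_seconds, now)
    if not dry_run:
        for path in stale:
            path.unlink()
    return stale
```

  With dry_run left at its default, clean_cache only reports what it would delete, which seems a sensible first step on a shared Tier-2 scratch area; a real policy would more likely be driven by job state or pinning information than by file age alone.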

  11. GridPP2 Support: Data and Storage Management

  12. Data Management Support • UK data management support posts • Aim: to provide first-level support for all DM software • first stop for UK system administrators • Work directly with the development and deployment teams (GridPP2, EGEE and LCG) • Provide hands-on deployment help for data challenge support • Develop how-to portal to collect deployment experience • Feed back sys-admin issues and experience to developers • Site policies, quotas, firewalls – survey sysadmins • Develop site validation tools • Responsible for developing the overall support plan for the data management services beyond GridPP2 • Need to fit all this in with the rest of the UK Support Plan
