GridPP2: Data and Storage Management
Gavin McCance - University of Glasgow
Jens Jensen - RAL
GridPP9, NeSC, Edinburgh
Work areas
• UK metadata management group
• Storage management
Metadata Management
• The focus is upon Grid-enabling metadata services for the experiments
  • Building upon our previous work in this area
  • Building upon experiments’ existing work in this area
• Formation of a UK metadata group with GridPP2
  • 1 generic Grid metadata post @ Glasgow
  • ~1 post per experiment
    • ATLAS @ Glasgow, LHCb @ Oxford, CMS @ Bristol/IC, US expts, others??
  • These posts were described yesterday – the UK metadata group should form part of their work
• Input from the UK data management support teams
GridPP2 Metadata Group
• Purpose will be to
  • Take overall responsibility for common experiment metadata technologies in order to Grid-enable the experiments’ metadata
  • Identify the commonalities and experience across experiments and make sure these are recognized
    • i.e. technologies, schema: data product navigational problem
  • Come to agreement and feed this back into the wider ARDA process
• Work directly with interested groups forming the ARDA
  • EGEE JRA1 Data Management Group (@CERN)
  • LCG Deployment Teams (@CERN)
  • LCG Experiments
  • IT Database group (@CERN)
Metadata Responsibilities
• Generic metadata post:
  • Concentration on the technologies used to create scalable, manageable and fault-tolerant metadata services
    • The underlying Grid software stack
  • Emphasis upon the service, not just the product
    • 24/7 supportable production services
  • Not prescribing things like the schema, or saying the ‘API must look like Spitfire’: prototype interfaces should be based upon experiments’ existing metadata interfaces
  • Will track, develop and adopt as necessary Grid metadata access standards
    • Feed into standards to make sure we’re in a position to benefit from the future production products that implement these standards
  • Feed PPE use-case and experience back into the wider world
Metadata Responsibilities
• Experiment metadata posts (~1 per experiment):
  • Document existing implementations from the experiments and make sure all the experiments’ use-cases are satisfied by the products and the technologies being proposed by the group
  • Work within the group to ensure that commonalities and experience across experiments are recognized and effort is not wasted
    • At the technology level – e.g. using the same underlying Grid software stack
    • At the interface level – e.g. GANGA
    • Possibly at the schema level…
  • Feed this understanding and agreement back into the wider ARDA process and back into their own experiments
• ARDA terminology:
  • Dataset metadata → ARDA Metadata service
  • Data product navigation → ARDA Job Provenance service
Storage Management
• Two areas of work (based at RAL)
  • SRM interface to UK storage sites
  • Site local data management
SRM interface to UK Storage
• Initial deliverable will be to provide an SRM (Storage Resource Manager) v1 interface to the Atlas DataStore at RAL
  • Subsequent migration to the more advanced features offered by e.g. SRM v2
• Perform an analysis of the UK Tier-2 storage sites and how these can be exposed via the common SRM interface
  • Implementation of SRM interfaces for these storage systems
  • Deployment on all the Tier-2 sites and support
• Contribution to the SRM standardisation process
• Work closely with the EGEE JRA1 and LCG deployment groups
• Work with support staff for Tier-1 and Tier-2
Site-local Data Management
• Management of data and files within a site
  • How you access the grid storage from the worker nodes
  • Cleanup of volatile data resources that a job no longer needs (Tier2) – cache management
• Evaluation of existing technologies
  • dCache, SAM, EDG Zambo prototype, Condor, …
• Development and deployment of these local data management solutions (@ Tier-2)
• Interaction with Tier-2 site managers is vital
• Feed back solutions into LCG / EGEE
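The cache-management bullet above (cleaning up volatile data that a job no longer needs) can be illustrated with a minimal sketch. It is not taken from dCache, SAM or any other product named on the slide. The `CachedFile` record, the `select_for_cleanup` function, the job IDs and the least-recently-used ordering are all hypothetical, chosen only to show one possible cleanup policy.

```python
from dataclasses import dataclass, field
from typing import Iterable, List, Set


@dataclass
class CachedFile:
    """A volatile file held in a site's cache (hypothetical record)."""
    path: str
    size_bytes: int
    last_access: float                                 # seconds since the epoch
    needed_by: Set[str] = field(default_factory=set)   # IDs of jobs that use it


def select_for_cleanup(files: Iterable[CachedFile],
                       running_jobs: Set[str],
                       bytes_to_free: int) -> List[CachedFile]:
    """Pick files that no running job needs, least recently used first,
    until at least bytes_to_free would be reclaimed or candidates run out."""
    # A file is a candidate only if none of the jobs that need it is running.
    candidates = [f for f in files if not (f.needed_by & running_jobs)]
    candidates.sort(key=lambda f: f.last_access)

    chosen: List[CachedFile] = []
    freed = 0
    for f in candidates:
        if freed >= bytes_to_free:
            break
        chosen.append(f)
        freed += f.size_bytes
    return chosen


# Illustrative data only: paths, sizes and job IDs are made up.
cache = [
    CachedFile("/cache/a.root", 400, last_access=100.0, needed_by={"job1"}),
    CachedFile("/cache/b.root", 300, last_access=50.0, needed_by={"job2"}),
    CachedFile("/cache/c.root", 500, last_access=75.0),
    CachedFile("/cache/d.root", 200, last_access=90.0, needed_by={"job3"}),
]
victims = select_for_cleanup(cache, running_jobs={"job1"}, bytes_to_free=700)
print([f.path for f in victims])
```

The function only selects files and does not delete them, so a site could review or log the choice before acting on it. A real Tier-2 policy would also have to account for site quotas and for files pinned through the storage interface, which this sketch ignores.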
Data Management Support
• UK data management support posts
  • Aim: to provide first-level support for all DM software
    • first stop for UK system administrators
  • Work directly with the development and deployment teams (GridPP2, EGEE and LCG)
  • Provide hands-on deployment help for data challenge support
  • Develop how-to portal to collect deployment experience
  • Feed back sys-admin issues and experience to developers
    • Site policies, quotas, firewalls – survey sysadmins
  • Develop site validation tools
• Responsible for developing the overall support plan for the data management services beyond GridPP2
  • Need to fit all this in with the rest of the UK Support Plan