WP5 Storage Element John Gordon CLRC e-Science Centre
StorageElement • SE sits in front of traditional storage, providing a Grid interface • SE sits behind Grid services like the Replica Manager, and behind future data cutters and object servers • SE is accessible directly by jobs and users
[Diagram: RC and RM in front of the SE, which fronts an HSM]
[Diagram: RMs with RC-LHCb and RC-CMS catalogues in front of the SE, which fronts an HSM]
[Diagram, Month 9: Rest of Grid (RepCat, JS, IS) exchanges LFNs/PFNs and GDMPv2 traffic with the Local Site (GDMPv2, CloseCE/local CE, rfiod, HSM)]
An SE in Testbed1 • A disk server • running GDMP v2.0 • running information providers • running data servers (RFIO, GridFTP) • optionally NFS mount user data on local CE • optionally stage replicated data to MSS
Testbed1 • GDMPv2 stores data on disk • Optional exits stage data to/from an HSM • Supported protocols are NFS and RFIO • neither of which is grid-enabled • this works because jobs are local to the SE • GSI-enabled RFIO will allow remote CEs • Can query the Information Service for information on the SE and its files
Coming Soon • EDG Replica Manager • Distributed Replica Catalogue • Integrated SE
[Diagram: CE, RepMan, JobSched and NetMon use the SE's Data, Info and Control interfaces; the SE fronts Disk and HSM]
Interfaces • Data • file access put/get/delete • Posix-like open/close/get/put/seek • Information • on SE - size, space free, policies, CEs nearby • on files - size, timestamps, latency • Control • reservation, pinning - hints to HSM
Protocols • GridFTP as the wire protocol for moving data • a GGF protocol, used by other Grids • RFIO as the API • already used by many tools and systems • other APIs can be used or developed on top of the supported protocols • but the model supports any protocol agreed between client and server • e.g. AFS, HTTP, ...
DataGrid Filenames • Logical FileName (LFN) • user view: lfn://cms.data/sim/2002/file23 • mapped one-to-many by the RepCat into... • Physical FileName (PFN) • pfn://se01.cern.ch/cms/sim/2002/file23 • pfn://se23.fnal.gov/abc.2/sim01 • identifying the SE • the SE publishes the protocols it supports • the client takes a PFN and a protocol and produces a... • TFN • http://se01.cern.ch/cms/sim/2002/file23 • /afs/fnal.gov/datagrid/abc.2/sim01
Remote Access • The SE will be accessible from anywhere by anyone.... • ...if policies and practices allow.
[Diagram: CEs, RepMans and JobScheds at multiple sites access one SE's Data, Info and Control interfaces; the SE fronts Disk and HSM]
New Services • There is scope for many new services to be developed • Data-moving or processing services should be implemented close to the data (i.e. at the SE) • Access to the data can be optimised if we have hints on how it will be accessed. Please talk to us about possible special uses.
[Diagram: DataCutter, ObjectServer and Event Server built on the SE's Data, Info and Control interfaces; the SE fronts Disk and HSM]
SE Architecture [Diagram: Clients → Top layer (Interface 1, Interface 2, Interface 3, Session Manager) → Core (MetaData, Message Queue, System Log, House Keeping) → Bottom layer (MSS Interfaces → MSS1, MSS2)]
[Diagram: the Interface Layer (Interface, Queue Manager, Request Manager) talks over the network to the Core and Bottom Layer (Pipe Manager, Handler, Pipe Store), which reach Disk, MSM and Tape via named pipes]
WP5 - Overview • The Storage Elephant (Element) • MSS access • GridFTP/GridRFIO/OGSA • Castor/Atlas Datastore/HPSS
Architecture • SOAP-style architecture • XML • Support for clustering • Simple design • Multi-process model • Minimum of IPC • Portable code • Technologies used • Unix/Linux • C/C++ • Unix named/unnamed pipes • Globus toolkit (security, GridFTP …)
Progress • Short-term • Existing testbed described earlier • Publish API in June • Medium-term • GridFTP server interfaced to an HSM • Long-term • Integrated StorageElement Manager • Prototype of core handlers exists; backend to disk almost ready • First version for EDG Testbed 2 - September
Members of WP5 • UK-run project • Co-operation between CLRC eScience and PPD • Based at RAL • By September will have written the first version of the SE • Front end (T. Eves/J. Jensen) • Support GridFTP and /grid • Intercepting file-system calls from daemons • Converting them to XML for the back end • Back end (R. Tam/O. Synge) • approximately 100 modules, 2-3 libraries • Interfacing with Castor / HPSS / Atlas Datastore
Summary • You will use the StorageElement to access your data... • ....but you won’t use it to find your data • Mainly used by Replica Manager, Job Scheduler, etc • You will only need SE metadata for diagnosis • If you have new protocols to suggest, let us know.