Information and GLUE schema • CASTOR external conference, CERN, 13-15 Nov 2006
What is a Storage Element? • Grid interface to storage • Provides: • Control protocol • E.g., SRM 1.1 or SRM 2.2 • Data transfer protocol • E.g., GridFTP or RFIO • Information service (this talk)
Storage & The Grid… • [Diagram; labels: RM, FPS, PhEDEx, Fireman, lcg-*, GFAL, Computing Element, FTS, LFC, srmcp, SE (three instances), GridFTP, Storage]
Why information service (theory) • To enable clients to discover: • Which protocols (control & data xfer) the SE supports • How much space is “available” and “used” (more…) • Storage with specific properties (more…) • Availability to my VO
Why information service (practice) • Low-level SRM-only tools work without one • srmcp • Higher-level Grid tools cannot access the SE without one • lcg-* • GFAL • SRM implementers implement SRMs • Sometimes without knowing the higher-level Grid software stack
What is information? • LDAP implementation • (In GT2; in GT3 OGSI, in GT4 WSRF) • Globus MDS (“Monitoring and Discovery Service”) • Each Element provides a GRIS (“Globus Resource Information Service”)
Crash Course in GLUE Storage Schema • Version 1.2 Overview • [Diagram; labels: SE, Control Protocol (1..*), Access Protocol (1..*), Policy, SA (1..*), State, AccessControlBase; SA types: volatile, durable, permanent] • In theory, each SA could publish multiple VOs • In practice, each SA is published once per VO
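Since each SA is published once per VO in 1.2, a client can find "its" SA with a single LDAP filter. A minimal sketch, assuming the usual LDAP rendering of GLUE 1.2: the object class `GlueSA` and the attributes `GlueSAAccessControlBaseRule`, `GlueSAStateAvailableSpace` and `GlueSAStateUsedSpace` are quoted from memory and should be checked against the deployed schema. The helper names are hypothetical.

```python
# Sketch only: builds an LDAP search filter string; it does not contact a GRIS.

# Attributes a client might request (assumed GLUE 1.2 LDAP names).
SPACE_ATTRIBUTES = ["GlueSAStateAvailableSpace", "GlueSAStateUsedSpace"]


def escape_filter_value(value):
    """Escape the characters that are special in an LDAP filter value."""
    out = []
    for ch in value:
        if ch in "\\*()\x00":
            out.append("\\%02x" % ord(ch))
        else:
            out.append(ch)
    return "".join(out)


def sa_filter_for_vo(vo):
    """Filter matching the SA entries published for one VO."""
    rule = escape_filter_value(vo)
    return "(&(objectClass=GlueSA)(GlueSAAccessControlBaseRule=%s))" % rule


if __name__ == "__main__":
    print(sa_filter_for_vo("cms"), SPACE_ATTRIBUTES)
```

The resulting string would be passed to any LDAP client together with the attribute list.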
Crash Course in GLUE Storage Schema • Version 1.3 Proposal Overview • [Diagram; labels: SE, Control Protocol (1..*), Access Protocol (1..*), Policy, SA (1..*), State, RetentionPolicy, AccessLatency, ExpirationMode, VOInfo (1..*), AccessControlBase]
SEs in 1.3 • Status: Production, Draining, Closing, Queueing • Analogous to CE • Total online and nearline size • Implementation name and version • Can query grid to see who has upgraded
Types of storage • In GLUE 1.2 • FileLifetime: volatile, durable, permanent • In GLUE 1.3 • RetentionPolicy: replica, output, custodial • AccessLatency: offline, nearline, online • ExpirationMode: releaseWhenExpired, warnWhenExpired, neverExpire
LCG Storage Classes • Tape is custodial • But custodial doesn’t have to be tape • Tape is nearline (or offline) • Publishing LCG classes in 1.3 • Tape1Disk1 → online custodial • Tape0Disk1 → online replica or online output • Tape1Disk0 → nearline custodial
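The mapping from LCG classes to GLUE 1.3 qualities can be written as a lookup table. A minimal sketch: the enum values come from the slides, while the table name `LCG_CLASSES` and the function `glue13_qualities` are hypothetical.

```python
from enum import Enum


class RetentionPolicy(Enum):
    REPLICA = "replica"
    OUTPUT = "output"
    CUSTODIAL = "custodial"


class AccessLatency(Enum):
    OFFLINE = "offline"
    NEARLINE = "nearline"
    ONLINE = "online"


# Hypothetical table: each LCG class maps to the (latency, retention)
# pairs under which it could be published in GLUE 1.3.
LCG_CLASSES = {
    "Tape1Disk1": [(AccessLatency.ONLINE, RetentionPolicy.CUSTODIAL)],
    "Tape0Disk1": [
        (AccessLatency.ONLINE, RetentionPolicy.REPLICA),
        (AccessLatency.ONLINE, RetentionPolicy.OUTPUT),
    ],
    "Tape1Disk0": [(AccessLatency.NEARLINE, RetentionPolicy.CUSTODIAL)],
}


def glue13_qualities(lcg_class):
    """Return the publishable (AccessLatency, RetentionPolicy) value pairs."""
    return [(lat.value, ret.value) for lat, ret in LCG_CLASSES[lcg_class]]


if __name__ == "__main__":
    for name in LCG_CLASSES:
        print(name, glue13_qualities(name))
```

Tape0Disk1 is the only class with two candidate publications, so a site has to choose between replica and output.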
Accounting for “free” space • Multiple copies on disk recyclable? • Files with expired pins • “Volatile” files with expired lifetime • Deleted files or other gaps on tape • Race when used for selection (flocking) • Free != Available • Across resources: 1 + 1 != 2
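One reading of the "across resources" point, assuming a single file must fit entirely within one resource: summing free space over resources overstates what a client can actually store. The function names below are hypothetical.

```python
def total_free(free_per_resource):
    """Naive aggregate: sum of the free space published by each resource."""
    return sum(free_per_resource)


def largest_storable_file(free_per_resource):
    """Largest single file that fits, assuming a file cannot span resources."""
    return max(free_per_resource, default=0)


if __name__ == "__main__":
    free = [1, 1]  # two resources with 1 unit free each
    print(total_free(free), largest_storable_file(free))
```

With two resources of 1 unit each, the aggregate is 2 but no file larger than 1 unit can be written.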
Accounting for “used” space • Are deleted files “used”? • Gap (on tape) may not be reclaimable • Disk overhead? • Tape1Disk1 counted twice? • Account for nearline and online separately • Multiple pinned copies are “used”? • Internal optimisation vs DiskN for N>1
Accounting • Also publish reserved space (?) • Reserved but unused is not available • Total size • Also separate for nearline (tape) and online (disk) • Is Total = Available + Used + Reserved? • Not necessarily
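Why the sum need not hold can be seen with a small bookkeeping sketch. The class `SpaceAccount`, its field names and the numbers are hypothetical; the gap stands for anything outside the three categories, such as unreclaimable tape gaps or disk overhead.

```python
from dataclasses import dataclass


@dataclass
class SpaceAccount:
    """Published sizes for one storage type, all in the same unit (e.g. GB)."""
    total: int
    used: int
    reserved: int   # reserved but not yet used
    available: int

    def unaccounted(self):
        """Space covered by none of available, used or reserved."""
        return self.total - (self.available + self.used + self.reserved)


# Nearline (tape) and online (disk) are kept separate, as suggested.
accounts = {
    "online": SpaceAccount(total=100, used=60, reserved=10, available=25),
    "nearline": SpaceAccount(total=500, used=300, reserved=0, available=200),
}

if __name__ == "__main__":
    for kind, acct in accounts.items():
        print(kind, acct.unaccounted())
```

In this made-up example the online account has 5 units that are neither available, used nor reserved, so Total != Available + Used + Reserved.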
Accounting • srmGetSpaceMetadata • “default” space for space that isn’t • TMetaDataSpace • RetentionPolicy • “Owner” • Total, guaranteed, used size • “Lifetime” (in seconds)
Implementation at RAL • First version accounted for Disk only (CK) • Fairly hairy query • Second query accounted for tape (JJ) • Queried vmgr db only • Assumes SAs do not share tape pools • Counts deleted files and disabled tapes as “used” • Counts compressed data • Even hairier query, not deployed yet
Implementation at CERN • Jean-Philippe wrote his own query • Uses the name server • Counting compressed files on tape
Implementation TODO • Needed since October ’05 • Adapt to new interpretations of {available-free, used, reserved, total} • Decide how we map between: • LCG service classes (TapeMDiskN) • The StorageArea in 1.2 • The StorageArea in 1.3 • Service classes (and other internals)
Implementation TODO • [Diagram: spaces labelled default, CMS, Atlas and LHCb, grouped to compare SA in 1.2, SA in 1.3, VOInfo in 1.3 and SA in 2.0?]
Spaces in 1.3 information system • Select VOInfo: • In: VO name OR AccessControl (e.g. FQAN) • Out: Path OR space token description • Select SA: • In: qualities • Out: find appropriate VOInfo • Missing: selection by protocol (in 2.0?)
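The two selections can be sketched over in-memory records. This is an illustration only, not a client for a real information system: the classes, field names, paths and space token descriptions are hypothetical, and "qualities" is assumed to mean RetentionPolicy plus AccessLatency.

```python
from dataclasses import dataclass, field


@dataclass
class VOInfo:
    vo_name: str
    access_control: list          # e.g. FQANs
    path: str
    space_token_description: str


@dataclass
class StorageArea:
    retention_policy: str
    access_latency: str
    vo_infos: list = field(default_factory=list)


def select_voinfo(storage_areas, vo_name=None, access_control=None):
    """In: VO name OR access control rule. Out: matching VOInfo records."""
    matches = []
    for sa in storage_areas:
        for info in sa.vo_infos:
            by_name = vo_name is not None and info.vo_name == vo_name
            by_acl = (access_control is not None
                      and access_control in info.access_control)
            if by_name or by_acl:
                matches.append(info)
    return matches


def select_sa(storage_areas, retention_policy, access_latency):
    """In: qualities. Out: SAs whose VOInfo records can then be searched."""
    return [sa for sa in storage_areas
            if sa.retention_policy == retention_policy
            and sa.access_latency == access_latency]


# Made-up example data: one custodial nearline SA shared by two VOs.
tape_sa = StorageArea("custodial", "nearline", [
    VOInfo("cms", ["/cms/Role=production"], "/example/cms", "CMS_TAPE"),
    VOInfo("atlas", ["/atlas"], "/example/atlas", "ATLAS_TAPE"),
])
disk_sa = StorageArea("replica", "online", [
    VOInfo("cms", ["/cms"], "/example/disk/cms", "CMS_DISK"),
])
ALL_SAS = [tape_sa, disk_sa]

if __name__ == "__main__":
    chosen = select_sa(ALL_SAS, "custodial", "nearline")
    print([v.path for v in select_voinfo(chosen, vo_name="cms")])
```

Selection by protocol is deliberately absent, matching the "Missing" bullet.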
Implementation TODO • The SA is a space • In 1.3, each SA publishes multiple VOs • Compatibility problem with SA.Path • Locate VOInfo for your SA • New version is VOInfo.Path • If clients need SA.Path, still need to publish each SA for each VO
Conclusion • Information is important! • More work is needed • Test with higher-level middleware • Track ongoing GLUE process