Naming and Metadata
Jamie Shiers
Application Software and Databases Group
Information Technology Division, CERN
http://wwwinfo.cern.ch/asd/cernlib/rd45/index.html

Introduction
• Naming and Metadata in FATMEN
• Naming on the Web
• Naming for LHC data
Jamie.Shiers@cern.ch

FATMEN Naming
• Think of it as a file catalog
• no special characters, case insensitive
• Each “file” is identified by a Unix-style path name
  • //CERN/L3/PROD/DATA/PDRE/CC045QX2
  • //CERN/DELPHI/P01_ALLD/MDST/PHYS/Y95V02/SUMT/C0515
• “Nicknames” corresponding to sets of files
  • :NICK.RAWD91
  • :GNAME.P01_ALLD/RAWD/NONE/Y91V00/*/R
  • :DESC.RAW data of ALL events; 1991 data
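
The naming rules above (Unix-style path names, case insensitive, wildcards selecting sets of files) can be sketched in a few lines of Python. This is an illustration only, not FATMEN code: the helper names `normalize` and `matches` are hypothetical, and the assumption that `*` matches exactly one path component is ours rather than something the slides state.

```python
from fnmatch import fnmatchcase


def normalize(name: str) -> list[str]:
    """Split a generic name into upper-cased path components."""
    return [part.upper() for part in name.strip("/").split("/") if part]


def matches(pattern: str, name: str) -> bool:
    """Compare component by component, ignoring case.

    Assumption: a '*' in the pattern stays within a single path component.
    """
    p, n = normalize(pattern), normalize(name)
    return len(p) == len(n) and all(
        fnmatchcase(comp, pat) for pat, comp in zip(p, n)
    )


# The two generic names quoted on the slide.
catalog = [
    "//CERN/L3/PROD/DATA/PDRE/CC045QX2",
    "//CERN/DELPHI/P01_ALLD/MDST/PHYS/Y95V02/SUMT/C0515",
]

# A lower-case pattern with one wildcard component still selects the DELPHI entry.
hits = [g for g in catalog if matches("//cern/delphi/p01_alld/mdst/phys/*/sumt/c0515", g)]
```

Normalising to upper case before comparing is one simple way to get case-insensitive lookup; a real catalog would also have to resolve nicknames to such patterns.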

FATMEN Metadata
• 0.5KB metadata per filename
  • DSN, hostname, data representation, media type, location code
  • host type & OS details
  • VSN/VID/fseq, density, volseq
  • start/end record/block
  • record format, length, blocksize (DCB), filesize
  • creation / catalog / use date & time
  • creator username, account, job, node
  • protection mask
  • user words (10) & comment (80 bytes)
• Largely irrelevant for ODBMS, except creation info (++)
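
To make the shape of such a catalog entry concrete, here is a minimal Python sketch covering a handful of the fields listed above. The class name `FileEntry`, the field names, the choice of integers for the user words and all example values are assumptions made for illustration; only the limits of 10 user words and an 80-byte comment come from the slide.

```python
from dataclasses import dataclass, field


@dataclass
class FileEntry:
    """A hypothetical subset of the per-file metadata listed on the slide."""

    generic_name: str
    dsn: str
    hostname: str
    media_type: str
    creator: str
    # Assumption: user words are modelled as ten integers.
    user_words: list[int] = field(default_factory=lambda: [0] * 10)
    comment: str = ""

    def __post_init__(self) -> None:
        if len(self.user_words) != 10:
            raise ValueError("expected exactly 10 user words")
        if len(self.comment.encode("ascii")) > 80:
            raise ValueError("comment is limited to 80 bytes")


# Example values are invented for the sketch.
entry = FileEntry(
    generic_name="//CERN/L3/PROD/DATA/PDRE/CC045QX2",
    dsn="example.dsn",
    hostname="examplehost",
    media_type="tape",
    creator="someuser",
    comment="illustrative entry",
)
```

Most of these fields describe where and how a file is stored, which is why the slide judges them largely irrelevant once an ODBMS manages storage; the creator information is the part that stays useful.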

Web
• By convention, most sites are located by http://www.name.com [.org .gov .int .ch]
• OK for (very) high-level entry points
  • but is it www.british-airways.com, www.britishairways.com, www.britishair.com or www.ba.co.uk?
  • www.altavista.com is not the AltaVista search engine
• More complicated addresses are best found by navigation
  • www.cern.ch → R&D → RD45
  • or via search engines, bookmarks etc.

Naming for LHC Event Data
• How many entities will need to be named?
  • Can one avoid naming e.g. the collection of raw data corresponding to run 123, year 2007?
  • A simple naming scheme for such data may be sufficient
• How many analysis collections will there be?
• Naming is clearly insufficient to describe the data
  • and, in general, a poor way of finding it!
• Job information (creator) should probably be an association to a persistent “job object” in the “production database”
• Can other metadata be “standardised”, or is it simply “tag+attribute”?
  • re-use of “generic tag” concept (implementation?)
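
The last two points can be sketched as follows: a collection carries a reference to a job object instead of copied creator fields, and its remaining metadata is a free-form set of tag+attribute pairs that can be queried directly. This is a plain in-memory illustration, not an ODBMS schema; the names `Job`, `Collection`, `find` and every attribute value are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Job:
    """Stand-in for a persistent job object in a production database."""

    creator: str
    node: str


@dataclass
class Collection:
    name: str
    job: Job  # association to the job object, not a copy of its fields
    tags: dict[str, object] = field(default_factory=dict)


def find(collections: list[Collection], **criteria: object) -> list[Collection]:
    """Select collections whose tags match every given attribute value."""
    return [
        c for c in collections
        if all(c.tags.get(key) == value for key, value in criteria.items())
    ]


production = Job(creator="someuser", node="examplenode")
collections = [
    Collection("c1", production, {"kind": "raw", "run": 123, "year": 2007}),
    Collection("c2", production, {"kind": "raw", "run": 124, "year": 2007}),
]

# The raw data for run 123 of 2007 is found by attributes, not by its name.
selected = find(collections, kind="raw", run=123, year=2007)
```

Here the collection names carry no meaning at all, which is the point of the slide: selection goes through attributes, and a name is only needed as a handle.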