Measurement Data Object Descriptor Specification - Present Status
Giridhar Manepalli
Corporation for National Research Initiatives
Measurement Data - Metadata
• Measurement Data Object Descriptor (MDOD) describes
  • I&M data captured or streamed during GENI experiments
  • Data from various phases & levels of an experiment
• An I&M Data Object is loosely defined as an aggregation of data units that is self-contained and makes sense on its own
• MDOD elements are classified into
  • Identifiers
  • Descriptors
  • Holders
MDOD
• Identifiers (mandatory elements)
  • Identifier
    • Primary and any secondaries
    • URN or other kinds
  • Source (as in owner/holder of the object)
  • Note: Primary ID must be a URN
    • Domain:sub-domain+object_type+object_name
    • Object_name is unique for a given domain & sub-domain
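The primary ID pattern on this slide can be sketched in code. This is only an illustration, not part of the specification: the function names, the `PrimaryId` type, and the example values are hypothetical, and the sketch assumes that `:` and `+` are reserved as delimiters and may not appear inside a component. The slide does not say whether a `urn:` prefix or namespace precedes the pattern, so none is added here.

```python
from typing import NamedTuple


class PrimaryId(NamedTuple):
    """Components of the pattern Domain:sub-domain+object_type+object_name."""
    domain: str
    sub_domain: str
    object_type: str
    object_name: str


def build_primary_id(domain: str, sub_domain: str,
                     object_type: str, object_name: str) -> str:
    """Join the four components using the delimiters shown on the slide."""
    for part in (domain, sub_domain, object_type, object_name):
        # Assumption: delimiters are reserved and components are non-empty.
        if not part or ":" in part or "+" in part:
            raise ValueError(f"invalid component: {part!r}")
    return f"{domain}:{sub_domain}+{object_type}+{object_name}"


def parse_primary_id(text: str) -> PrimaryId:
    """Split an ID string back into its four components."""
    domain, sep, rest = text.partition(":")
    fields = rest.split("+")
    if not sep or not domain or len(fields) != 3 or not all(fields):
        raise ValueError(f"not a Domain:sub-domain+type+name ID: {text!r}")
    return PrimaryId(domain, *fields)


# Hypothetical example values, for illustration only.
example_id = build_primary_id("example.net", "lab1", "file", "run42")
example_parts = parse_primary_id(example_id)
```

Because the domain, sub-domain, and object type are embedded in the string, renaming any of them changes the ID itself, which is the concern raised later under "Identifiers cannot be semantic".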
MDOD
• Identifier (optional elements)
  • Title
  • Abstract
  • Subject
  • Keywords
  • Annotations
MDOD
• Descriptors
• Descriptor (mandatory elements)
  • Level (numbering scheme to support descriptions of both objects & their components within the same MDOD)
  • Object Type (stored file, stream, etc.)
  • Slice ID
  • Locator
    • View (scope of availability)
    • Type (via web or file system, etc.)
  • Object format (perfSONAR API, OML DB, GUI, etc.)
  • Interpretation method
  • Is Encrypted
MDOD
• Descriptor (optional elements)
  • Collection
  • Geographic location
  • Start and end date
  • Project ID
  • Experiment ID
  • Run ID
  • Target
  • Category
  • Encryption method
  • Annotations
MDOD
• Holders
• Holder (mandatory elements)
  • ID
  • Order
  • Domain
  • Sub-domain
  • Slice ID
  • User ID
  • Contact Info
  • Collection & inheritance status
  • Anonymization
  • Sharing
  • Disposal
  • Transactions (what & when)
MDOD
• Holder (optional elements)
  • Collection policy
  • Sharing policy
  • Anonymization method
  • Annotations on transactions
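The mandatory/optional split across the three MDOD sections lends itself to a simple completeness check. The sketch below is a rough illustration only: the element names come from the slides, but the snake_case keys, the flat layout of each section, and the `missing_mandatory` helper are assumptions, not the specification's actual encoding.

```python
# Mandatory element names per MDOD section, taken from the slides.
# Assumption: each section is treated as a flat record keyed by snake_case names.
MANDATORY = {
    "identifier": ("identifier", "source"),
    "descriptor": (
        "level", "object_type", "slice_id", "locator",
        "object_format", "interpretation_method", "is_encrypted",
    ),
    "holder": (
        "id", "order", "domain", "sub_domain", "slice_id", "user_id",
        "contact_info", "collection_inheritance_status",
        "anonymization", "sharing", "disposal", "transactions",
    ),
}


def missing_mandatory(section: str, record: dict) -> list:
    """Return the mandatory element names that are absent from a record.

    A value of None counts as absent; False (e.g. is_encrypted=False)
    is a legitimate value and counts as present.
    """
    return [name for name in MANDATORY[section] if record.get(name) is None]


# Hypothetical, deliberately incomplete descriptor record.
example_descriptor = {
    "level": 1,
    "object_type": "stored file",
    "slice_id": "slice-123",
    "is_encrypted": False,
    "category": "throughput",  # optional element; ignored by the check
}
example_missing = missing_mandatory("descriptor", example_descriptor)
```

Optional elements are simply ignored by the check, which keeps the validator aligned with the mandatory-only scope suggested later in the deck.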
Good Things
• Excellent start
• Collaborative specification
• Great coverage
• Nicely broken down into elements
• Mandatory vs. optional elements identified
• Genuine use cases
  • Gathering, transferring, and sharing
Zurawski’s Comments
• Too many secondary identifiers
• Descriptors should be contextualized
  • Variations based on the type of object
• GENI-specific descriptions should be clearly marked and separated
• Slight changes to names & enclosing elements recommended
Metadata Practices
• Too many optional elements
  • Too many choices given to users
  • Users are bound to take the path of least resistance
  • Restrict the scope to mandatory elements only, at least in the beginning
  • Try those out. Implement them.
• One size fits all? No!
  • The spec captures descriptions, formats, policies, transactions, etc. in a monolithic fashion
  • Register individual components separately
    • E.g., capture legal formats & interpretations in their own records, and reference them here
    • E.g., same with accepted policies
• Identifiers cannot be semantic
  • Domain, sub-domain, and object type are currently part of an ID
  • World view changes frequently
  • Non-semantic IDs are worth every penny
  • Search engines & registries mask the opaqueness
  • After all, IDs are just entities behind the scenes
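The two recommendations above, registering components separately and keeping identifiers non-semantic, can be sketched together. This is a minimal in-memory illustration under stated assumptions: the `Registry` class, its methods, the record fields, and the example values are all hypothetical, and random UUIDs stand in for whatever opaque identifier scheme would actually be used.

```python
import uuid


class Registry:
    """Toy in-memory registry that hands out opaque, non-semantic IDs."""

    def __init__(self):
        self._records = {}

    def register(self, record: dict) -> str:
        """Store a copy of the record and return a random opaque ID for it."""
        opaque_id = uuid.uuid4().hex
        self._records[opaque_id] = dict(record)
        return opaque_id

    def resolve(self, opaque_id: str) -> dict:
        """Return the stored record for an ID (raises KeyError if unknown)."""
        return self._records[opaque_id]

    def find(self, **criteria) -> list:
        """Search by metadata, so users never need to read meaning into IDs."""
        return [oid for oid, rec in self._records.items()
                if all(rec.get(k) == v for k, v in criteria.items())]


registry = Registry()

# Formats and policies get their own records...
format_id = registry.register({"kind": "format", "name": "OML DB"})
policy_id = registry.register({"kind": "policy", "name": "share-within-project"})

# ...and the descriptor only references them by ID.
descriptor_id = registry.register({
    "kind": "descriptor",
    "domain": "example.net",      # metadata, not part of the ID
    "format_ref": format_id,
    "policy_ref": policy_id,
})

# The domain can change without invalidating the identifier.
registry.resolve(descriptor_id)["domain"] = "renamed.example.org"
```

The search method plays the role the slide assigns to search engines and registries: people look objects up by their metadata, while the opaque ID stays stable behind the scenes.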
Metadata Practices
• The Object Type controlled vocabulary enumerates apples and oranges
  • Collection, flow, directory, file, database, and GUI are not mutually exclusive
  • Doesn’t help the recipient make any decision when looking at the descriptor
  • Bundle type & format into the format interpretation method
• Covers too many corner cases, e.g., flow-rate
• Expects too many details, e.g., locator (type, access method, etc.)