
The Metadata Perspective

Peter Kunszt, CERN. GGF10 PNPA Workshop, Berlin.


Presentation Transcript


  1. The Metadata Perspective
     Peter Kunszt, CERN
     GGF10 PNPA Workshop, Berlin

  2. Overview
     • Metadata – what is it? An overview.
     • Lessons learned
     • Requirements
     • Suggestions

  3. Definition of Metadata
     Metadata is data too! ~ Descriptive data
     • Describe the data itself: what the data is about, parameters, characteristics, statistics, ...
     • Describe methods: algorithms, input/output parameters, ...
     • Describe middleware: service data, parameters, configuration, versions, owner, ...
     • Describe authentication and authorization: user lists, passwords, access control lists, tokens, ...
     • Describe modeling: UML diagrams, database schemata, ...
     • Describe history: provenance data, who has generated what data using what method, ...
     • Describe virtualization: virtual data generation parameters, pipelining
     • Describe operation: logging and monitoring, ...

  4. Aspects of Metadata
     Bound to a context.
     • Semantics specific to the context
     • Usage patterns specific to the semantics
     • Requirements specific to the context and semantics
     What is the context?
     • Application data – e.g. metadata on HEP events
     • Middleware specific – e.g. service description (storage, computing, ...)
     • Virtual Organization, resource provider – e.g. security policies
     • Logging and monitoring – e.g. LDAP, MDS, R-GMA, ...

  5. Where is it?
     • Explicitly in the context (like the job description language, input/output files, etc.). Not the topic here; I talk about metadata (MD) catalogs.
     • Dedicated catalog in application space. Examples:
         • CMS RefDB
         • ATLAS Metadata Interface (AMI)
         • BaBar Metadata Catalog
     • Dedicated catalog in middleware space
         • MCAT (virtual data catalog)
         • EDG Replica Metadata Catalog
         • VO management service
     • Metadata Grid Interface provisioning to existing catalogs
         • OGSA-DAI
         • Spitfire

  6. How was it used to date?
     Lessons learned
     • Dedicated services like CMS RefDB work well.
     • Generic one-size-fits-all metadata catalogs are not used as much (RMC).
     • Frameworks are hard to adopt and to use (Spitfire).
     • Lack of dedicated catalogs may lead to the abuse of monitoring and information services.
     • The boundary between the application and middleware layers is blurred.
     Conclusions
     • The narrower the context, the better.
     • Everyone doing their own metadata is good, BUT
     • Everyone defining a proprietary interface is bad.
     • User-controllable metadata is good.

  7. Requirements – ideas
     • Metadata catalogs must have a clear context
     • Differentiation between the grid middleware and application layer
     • Commonalities to be standardized on:
         • Common security mechanisms
         • Common exposure of interfaces (WSDL)
         • Common mechanism of describing the data content (like common methods to expose the schema)
         • Common query mechanisms
         • Common error reporting (SOAP Faults)
     • Catalogs should be able to call each other
     • Users should be able to store their own metadata (e.g. big success of SDSS SkyServer MyDB)
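The commonalities listed on slide 7 (schema exposure, a shared query mechanism, uniform error reporting) can be illustrated with a small sketch. Everything named here is hypothetical and not taken from the talk: `MetadataCatalog`, `CatalogFault` and `InMemoryCatalog` are invented for illustration, and a plain Python exception stands in for a SOAP Fault.

```python
from abc import ABC, abstractmethod


class CatalogFault(Exception):
    """Hypothetical common error type, standing in for a SOAP Fault."""

    def __init__(self, code, message):
        super().__init__(f"{code}: {message}")
        self.code = code
        self.message = message


class MetadataCatalog(ABC):
    """Hypothetical common interface that every catalog would expose."""

    @abstractmethod
    def describe_schema(self):
        """Return a mapping of attribute name to type name."""

    @abstractmethod
    def query(self, **conditions):
        """Return all entries whose attributes equal the given values."""


class InMemoryCatalog(MetadataCatalog):
    """Toy catalog for one narrow context, backed by a list of dicts."""

    def __init__(self, schema, entries):
        self._schema = dict(schema)
        self._entries = list(entries)

    def describe_schema(self):
        return dict(self._schema)

    def query(self, **conditions):
        unknown = sorted(set(conditions) - set(self._schema))
        if unknown:
            # Same fault shape regardless of which catalog raised it.
            raise CatalogFault("UnknownAttribute", ", ".join(unknown))
        return [
            entry for entry in self._entries
            if all(entry.get(k) == v for k, v in conditions.items())
        ]


# Example data is made up for illustration.
runs = InMemoryCatalog(
    {"run": "int", "detector": "str"},
    [{"run": 1, "detector": "tracker"}, {"run": 2, "detector": "calorimeter"}],
)
tracker_runs = runs.query(detector="tracker")
```

A client that only knows `MetadataCatalog` can discover the attributes through `describe_schema()` and handle `CatalogFault` the same way for every catalog, while each catalog keeps its own narrow context.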

  8. A Metadata Scenario
     • Virtual files / virtual collection concept (from HEPCAL)
     [Diagram not reproduced. Labels, in extracted order: Query Interface needs standardization; Metadata Catalog; Virtual MD; Query; Result; FileList; File Catalog]
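One possible reading of the scenario on slide 8 is that a metadata query yields a list of logical files (a virtual collection), which a separate file catalog then resolves to physical locations. The sketch below assumes that reading; the catalog contents, the `lfn:` naming and the function names are all invented for illustration.

```python
# Hypothetical metadata catalog: logical file name -> descriptive attributes.
METADATA_CATALOG = {
    "lfn:run1.dat": {"experiment": "demo", "year": 2003},
    "lfn:run2.dat": {"experiment": "demo", "year": 2004},
    "lfn:calib.dat": {"experiment": "other", "year": 2004},
}

# Hypothetical file catalog: logical file name -> physical replica locations.
FILE_CATALOG = {
    "lfn:run1.dat": ["site-a:/data/run1.dat"],
    "lfn:run2.dat": ["site-a:/data/run2.dat", "site-b:/data/run2.dat"],
    "lfn:calib.dat": ["site-b:/data/calib.dat"],
}


def query_metadata(**conditions):
    """Return the sorted logical file names whose metadata match all conditions."""
    return sorted(
        lfn for lfn, attrs in METADATA_CATALOG.items()
        if all(attrs.get(k) == v for k, v in conditions.items())
    )


def resolve(file_list):
    """Look up each logical file name in the file catalog."""
    return {lfn: FILE_CATALOG.get(lfn, []) for lfn in file_list}


# Step 1: the metadata query produces a file list (the virtual collection).
virtual_collection = query_metadata(experiment="demo")
# Step 2: the file catalog maps that list to physical replicas.
replicas = resolve(virtual_collection)
```

Keeping the two lookups separate means only the query step depends on application-specific semantics, which is where the slide says standardization is needed.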

  9. Suggestions on how to proceed
     • Accept Web Service interfaces as the common base interface framework.
     • Define interfaces inside application- and middleware-specific domains based on existing services and the specific needs of the given community.
     • Identify missing interfaces or required interfaces from clients and users.
     • Compare the interfaces at a common forum (like this).
     • Define how to proceed: factor out commonalities or standardize commonalities. The aim is to be interoperable. Propagate findings to groups in GGF wherever relevant; spawn a new working group with a very specific focus!
     • Iterative process ...

  10. Conclusion
     • Metadata is closely tied to context and the semantics thereof.
     • Generic metadata services vs. specialized services:
         • A generic service to store key-value pairs might be useful to users to store their own data (exploit DAIS).
         • Try to use common mechanisms for security, discovery, query and error reporting.
         • Suggestion to work on specialized services, solving a well-understood problem of a user community. Identify commonalities as a second step (bottom-up approach).
     • Maintain good communication between metadata service providers – GGF can be the forum for this.
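The generic key-value service mentioned in the conclusion could look roughly like the following in-memory sketch. It is an assumption-laden toy: `UserMetadataStore` and its methods are hypothetical names, it has no connection to the DAIS interfaces, and it omits the security and error-reporting mechanisms the slide calls for.

```python
from collections import defaultdict


class UserMetadataStore:
    """Toy generic store: each user owns a private set of key-value pairs."""

    def __init__(self):
        self._data = defaultdict(dict)

    def put(self, user, key, value):
        """Store or overwrite one key-value pair for the given user."""
        self._data[user][key] = value

    def get(self, user, key, default=None):
        """Return the user's value for key, or default if absent."""
        # .get on the outer dict avoids creating entries for unknown users.
        return self._data.get(user, {}).get(key, default)

    def keys(self, user):
        """Return the user's keys in sorted order."""
        return sorted(self._data.get(user, {}))


# Example users and keys are made up for illustration.
store = UserMetadataStore()
store.put("alice", "selection", "year == 2004")
store.put("alice", "comment", "first pass")
store.put("bob", "comment", "calibration check")
```

Because the store attaches no semantics to keys or values, it suits users' own annotations, whereas the specialized catalogs discussed earlier carry the context-specific schema.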
