
Minimal Metadata for Data Services Through DIALOGUE

Minimal Metadata for Data Services Through DIALOGUE. Neil Chue Hong AHM2007. It’s Good To Talk. Data Integration Applications: Linking Organisations to Gain Understanding and Experience (DIALOGUE) EPSRC supported sister project network grant running from 2005-2007


Presentation Transcript


  1. Minimal Metadata for Data Services Through DIALOGUE Neil Chue Hong AHM2007

  2. It’s Good To Talk • Data Integration Applications: Linking Organisations to Gain Understanding and Experience (DIALOGUE) • EPSRC supported sister project network grant running from 2005-2007 • Stimulate discussion between people involved in data access and integration • http://www.datagrids.org

  3. Minimum Requirements for Information Exchange • Requirements for agreements so that information can be effectively interchanged between DAI technologies. • Identification of data sources • Description of data sources • Identification of data • Description of data representations • What’s the least information I need?

  4. Service Types • Core Services / Baseline Services • essential to infrastructure • security, registry, index, discovery • Data Services / Data Access Services • exposes a queryable data resource • Analytical Services / Data Processing Services / Computational Services • provide operations that act on data • Data Transfer Services • provide transfer of data between endpoints • Data Storage Services • provide management of data, inc replication

  5. Interoperability Points between Components • Compatible naming for services • Compatible naming for data objects • Managed data transfer between any two endpoints • Data formats • Data discovery

  6. Searching for data • Standard way of accessing a metadata catalogue • Standard format for describing a data resource • sufficient to access it • access protocols supported • description of security policies • transfer protocols supported • available replicas • QoS policies • sufficient to understand what’s in it • quality of data • provenance • sufficient to choose the right source • quality of service • productivity, availability, responsiveness, reliability, accessibility
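The resource description on this slide can be pictured as a catalogue record that a client filters on. The following is a minimal sketch only: the `DataResource` class, its field names, the `find_sources` helper and the sample entries are all assumptions made for illustration, not an agreed DIALOGUE format.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class DataResource:
    """Hypothetical catalogue entry describing one data resource."""
    name: str
    access_protocols: set[str] = field(default_factory=set)
    transfer_protocols: set[str] = field(default_factory=set)
    replicas: list[str] = field(default_factory=list)
    availability: float = 0.0  # assumed QoS figure in the range [0, 1]


def find_sources(catalogue, access_protocol, min_availability=0.0):
    """Return resources that support the protocol and meet the QoS floor,
    ordered with the most available source first."""
    matches = [
        r for r in catalogue
        if access_protocol in r.access_protocols
        and r.availability >= min_availability
    ]
    return sorted(matches, key=lambda r: r.availability, reverse=True)


# Made-up catalogue contents, for illustration only.
catalogue = [
    DataResource("survey-a", {"sql"}, {"gridftp"}, availability=0.99),
    DataResource("survey-b", {"xpath"}, {"http"}, availability=0.95),
    DataResource("survey-c", {"sql", "xpath"}, {"http"}, availability=0.90),
]
chosen = find_sources(catalogue, "sql", min_availability=0.95)
```

Here "sufficient to access it" maps to the protocol check and "sufficient to choose the right source" maps to the availability threshold; quality of data and provenance are left out of the sketch.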

  7. Core Set of Metadata for all our Data Services • Service name • Service ID? • Does this evolve too quickly? • Is this only useful for resolution? • Should this be described elsewhere? • Service version • Service owner • Service maintainer • Service description: human readable summary • Service types implemented • Minimal set of annotations on operations to allow discovery • Minimal set of management information? • Link to service policies (including security, QoS)? • Should we use WSDL and WS-MetadataExchange?
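As a rough illustration of what a "core set" check could look like, the sketch below treats the items the slide states without a question mark as required and the questioned items as optional. That split, the field names and the `missing_core_fields` helper are assumptions; the slide itself leaves the question open.

```python
# Assumed split: items listed without a question mark on the slide are
# treated as required, questioned items as optional.
CORE_FIELDS = (
    "name",
    "version",
    "owner",
    "maintainer",
    "description",
    "service_types",
)
OPTIONAL_FIELDS = ("service_id", "policies_url")


def missing_core_fields(record: dict) -> list:
    """Return the core fields that are absent or empty in a service record."""
    return [f for f in CORE_FIELDS if not record.get(f)]


# Hypothetical service record with made-up values.
record = {
    "name": "example-data-service",
    "version": "1.0",
    "owner": "Example Organisation",
    "description": "Exposes a queryable relational data resource.",
    "service_types": ["data-access"],
}
missing = missing_core_fields(record)
```

A registry could refuse to publish a service until `missing` is empty, which is one way of enforcing a minimal set without constraining the optional extensions.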

  8. Extra Metadata for Data Access Services • Access protocols supported • query languages supported • result representations supported • Description of security policies • Transfer protocols supported • Available replicas • QoS policies • Is this DAIS? • Extensions for each type: SQL Data Access Services, DICOM, XPath, CQL, etc.

  9. Extra Metadata for Data Transformation Services • Schema mapping • store schema maps (e.g. A->B) rather than schema • but no current agreed way of representing a schema map • allow schema maps to be discovered • allow optimisation over maps e.g. A->B->C => A->C if present • Quality and trust of third party schema maps
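The discovery and optimisation points above can be sketched as a search over stored schema maps. This is an illustrative sketch, assuming each map is stored only as a (source, target) pair; `find_map_chain` is a hypothetical helper, and since the slide notes there is no agreed representation of a schema map, the maps carry no content here.

```python
from collections import deque


def find_map_chain(maps, source, target):
    """Breadth-first search over stored schema maps.

    Returns the shortest chain of (source, target) pairs leading from
    `source` to `target`, or None if no chain exists. Because the search
    is breadth-first, a direct map A->C is preferred over A->B->C
    whenever it is present.
    """
    graph = {}
    for src, dst in maps:
        graph.setdefault(src, []).append(dst)

    queue = deque([(source, [])])
    seen = {source}
    while queue:
        node, chain = queue.popleft()
        if node == target:
            return chain
        for nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, chain + [(node, nxt)]))
    return None


stored_maps = [("A", "B"), ("B", "C")]
via_b = find_map_chain(stored_maps, "A", "C")
direct = find_map_chain(stored_maps + [("A", "C")], "A", "C")
```

With only A->B and B->C stored, the chain goes through B; once a direct A->C map is registered, the same call returns the single-step chain. Weighting edges by the quality or trust of third-party maps would need a cost-based search instead of plain breadth-first search.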

  10. Core set of Metadata for representing a data object / record • Identifier • Structure / representational format • Provenance data • Human readable description • Extended sets for • relational data • XML data • file collections • Feature based processing, content based processing

  11. Core set of information to store in a provenance record for data integration • Who created • When created • Which service created • Service configuration parameters (inc inputs) • But how do you evolve this information?
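A provenance record holding the four items above might look like the sketch below. The class and field names are assumptions, the sample values are made up, and the `record_version` field is only one possible answer to the slide's question about evolving the information, not something the slide proposes.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class ProvenanceRecord:
    """Hypothetical provenance record for a data integration step."""
    created_by: str        # who created
    created_at: datetime   # when created
    service: str           # which service created
    parameters: dict = field(default_factory=dict)  # configuration, inc inputs
    record_version: int = 1  # assumption: lets the record layout evolve

    def to_json(self) -> str:
        """Serialise the record, writing the timestamp in ISO 8601 form."""
        data = asdict(self)
        data["created_at"] = self.created_at.isoformat()
        return json.dumps(data, sort_keys=True)


# Arbitrary example values, for illustration only.
record = ProvenanceRecord(
    created_by="alice",
    created_at=datetime(2007, 1, 1, 12, 0, tzinfo=timezone.utc),
    service="example-transform-service",
    parameters={"input": "dataset-1", "schema_map": "A->B"},
)
encoded = record.to_json()
```

Marking the record as frozen reflects the idea that provenance is written once; a consumer can read `record_version` to decide how to interpret older records.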

  12. What needs to be agreed • What does need to be agreed to make different services interoperate? • DIALOGUE allowed discussion amongst software engineers involved in data integration • paper to be published • But look at Dublin Core… • … and the many formats for addresses
