Metadata Standards A presentation to IRMAC Metadata SIG by Ray Gates March 12, 2003 Toronto, Canada
Speaker Profile • Ray Gates • Active in metadata standardization since 1987 • Member of Canadian Advisory Committee to Standards Council of Canada for ISO/IEC JTC1/SC32 Data Management & Interchange • Head of Canadian Delegation from CAC/SC32 to ISO/IEC JTC1/SC32/WG2 Working Group on Metadata standards • Day job: IT Solutions Architect with Manulife Financial • mailto:gatesray@rogers.com
Basic Concepts • Metadata: • Data that is descriptive • Descriptive data about data • Metadata: • A “relative” term (like the relative term “above” or “below”) • No data is inherently “metadata” • Only “metadata” in relation to something (else) Credit: Frank Farance, Farance Inc.
Motivation for Metadata Standards • Two primary motivations: • Improving the quality of metadata and the data it describes • Improving inter-operability when using the metadata and the data it describes • Standards help achieve this by enabling • Consistent representation of data and metadata • Consistent interpretation of data and metadata
Categories of Metadata Standards • Framework/Architecture standards • Used primarily to guide the development of other standards • Metamodel/Technology standards • Used primarily to specify standard interfaces etc. for implementations • Data and Message Definition standards • Used to specify data representation and semantics • Both generic syntax, and specific instances
Standards Development Organizations (for example) • Organizations shown: DCMI, UN/CEFACT, OASIS, W3C, OMG, UDDI, Open Forum on Metadata Registries, ISO TC 37, ISO TC 46, ISO TC 154, ISO TC 204, ISO TC 211, ISO TC 215, ISO/IEC JTC1 SC32 • Standards and subject areas shown: Dublin Core Element Set; ebXML Registry (Reg/Rep); XML; UDDI; MOF, UML, CWM, XMI; ISO/IEC 11179 MDR; Metamodel Framework; SQL Catalog; Terminology; Commerce; Intelligent Transportation Systems; Geographic Information; Health Informatics Based on a slide by Bruce Bargmeyer, Lawrence Berkeley National Laboratory
IRDS Framework • Published in 1990 • Was to be the basis of a series of IRDS standards, but vendor support evaporated • Basic concepts since incorporated into OMG’s Meta Object Facility (MOF)
IRDS Framework – Data Levels Source: ISO/IEC 10027:1990
Object Management Group • A not-for-profit software consortium http://www.omg.org/ • “The OMG manages an open, vendor-neutral process which proposes technologies and invites proposals and invites feedback from any member company before coming to consensus on a final specification, which becomes an adopted standard.”
OMG’s Model Driven Architecture Source: ISO/IEC WD 19763-1 Framework for metamodel interoperability – reference model
Electronic Data Interchange • Legacy: ANSI X12 and EDIFACT • IFX Framework • Interactive Financial eXchange • Early proponent of XML for EDI • New: ebXML and Open EDI • Provide frameworks and generic formats • Industry-specific message formats are defined using the generic formats within the framework, e.g. see http://www.xml.org/xml/registry.jsp • Sponsored by OASIS and UN/CEFACT in liaison with ISO/IEC JTC1/SC32/WG1
OMG Metadata Standards • Meta Object Facility™ (MOF™) • The foundation for the other standards • Unified Modeling Language™ (UML™) • “a graphical language for visualizing, specifying, constructing, and documenting the artifacts of distributed object systems” • Common Warehouse Metamodel™ (CWM™) • “CWM provides a framework for representing metadata about data sources, data targets, transformations and analysis, and the processes and operations that create and manage warehouse data and provide lineage information about its use.” • XML Metadata Interchange™ (XMI™) • “a model-driven XML Integration framework for defining, interchanging, manipulating and integrating XML data and objects”
OMG Metadata Architecture *Levels correspond to IRDS Framework Levels
CWM Implementations (WIP) • Dimension EDI • Genesis Development • Meta Integration Technology Inc • Oracle Corporation • PrudSys XELOPES library for embedded data mining • SAS (search for "CWM") • UBS (implementation of Genesis) Source: http://www.omg.org/cwm/
MOF implementations • Java™ Metadata Interface (JMI) • A mapping of MOF to the Java language • Developed by Unisys as part of the Java Community Process • A reference implementation is available.
ISO/IEC 11179 Overview • History • 1st Edition: Standardization and specification of data elements • 2nd Edition: Metadata Registries (in progress) • Associated projects • Tech reports on Metadata Content Consistency • Interoperability and Bindings Standard
ISO/IEC 11179 Overview • Part 1: Framework • An overview of the rest of the standard • Part 2: Classification • Part 3: Registry metamodel & basic attributes • Part 4: Formulation of data definitions • Part 5: Naming and identification • Part 6: Registration
ISO/IEC 11179 Core Model • Data element concept • Concept about data described independently of its representation • Value domain • Set of permissible values • Conceptual domain • Set of permissible value meanings • Data element • Data element concept + Value domain
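ISO/IEC 11179 Core Model (relationships) • Data Element Concept (0..N) having Conceptual Domain (1..1); Conceptual Domain specifying Data Element Concept • Data Element Concept (1..1) expressed by Data Element (0..N); Data Element expressing Data Element Concept • Conceptual Domain (1..1) represented by Value Domain (0..N); Value Domain representing Conceptual Domain • Data Element (0..N) represented by Value Domain (1..1); Value Domain representing Data Element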
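The core model can be sketched as a few plain data classes. This is an illustration only, not the standard's registry metamodel: the class and field names are hypothetical, and the country-of-residence example with two ISO 3166 alpha-2 codes is an assumed sample.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet


@dataclass
class ConceptualDomain:
    """A set of permissible value meanings."""
    name: str
    value_meanings: FrozenSet[str]


@dataclass
class DataElementConcept:
    """A concept about data, independent of any representation."""
    name: str
    conceptual_domain: ConceptualDomain  # exactly one (1..1)


@dataclass
class ValueDomain:
    """A set of permissible values, each mapped to its meaning."""
    name: str
    conceptual_domain: ConceptualDomain  # exactly one (1..1)
    permissible_values: Dict[str, str]   # value -> value meaning


@dataclass
class DataElement:
    """Data element = data element concept + value domain."""
    concept: DataElementConcept
    value_domain: ValueDomain

    def is_valid(self, value: str) -> bool:
        # A value is acceptable only if the value domain permits it.
        return value in self.value_domain.permissible_values


countries = ConceptualDomain("Countries", frozenset({"Canada", "United States"}))
residence = DataElementConcept("Person country of residence", countries)
alpha2 = ValueDomain(
    "ISO 3166 alpha-2 country codes (sample)",
    countries,
    {"CA": "Canada", "US": "United States"},
)
residence_code = DataElement(residence, alpha2)
```

The same concept could be paired with a different value domain (for example, country names in French) to give a second data element that shares its meaning but not its representation.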
ISO/IEC 11179 Functions • General Registry Functions for: • Definition, Identification, and Naming • Administration • Classification • Specific Component Functions for: • Data Elements • Data Element Concepts • Value Domains • Conceptual Domains • Summary Credit: Doug Mann, US EPA
11179 Basic Attributes • Basic attributes are defined for use outside the context of a metadata registry • Name, Identifier • Context name, Context identifier • Definition, Definition language identifier • Registration status • Responsible organization • Submitting organization • and more
ISO/IEC 11179 MDR • Who is using it? • Mostly government agencies and a few large organizations • Can I buy an 11179 Metadata Registry? • Most have been home grown • Oracle Consulting has some specialists • http://www.unece.org/stats/documents/2000/11/metis/crp.7.e.pdf • Whitemarsh Metabase has some support • http://www.wiscorp.com/products_mb.htm • Farance Inc is developing a reference implementation • http://farance.com/
ISO/IEC 11179 Users • Reference Implementations: • U.S. EPA’s System of Registries (SoR): http://www.epa.gov/sor/ • Australian Institute of Health and Welfare’s The Knowledgebase (NHIK): http://www.aihw.gov.au/knowledgebase/index.html • IEEE Intelligent Transportation Systems Data Registry: http://standards.ieee.org/regauth/its/index.html • U.S. Federal Aviation Administration Data Registry: http://fdr.faa.gov • Statistics Canada: http://www.statcan.ca/english/freepub/11-533-XIE/about.htm#Integrated%20Metadata • Metadata Registry Implementation Coalition: • Contact: Genevieve Speier ( mailto:GSpeier@hcfa.gov ) Credit: Larry Fitzwater and Doug Mann, U.S. EPA
ISO/IEC 20944 Metadata Registries Interoperability and Bindings • A Framework for Harmonized/Consistent Bindings and Encodings • Diagram labels: Requirements; Functionality; Conceptual Model; Semantics; Bindings: APIs; Bindings: Codings; Bindings: Protocols; Encodings: Calling Conventions; Encodings: Data Formats; Encodings: Various Communication Layers; Topic-Specific Normative Wording; Topic-Specific Informative Wording; Cross-Topic APIs: Normative Wording (Java, JavaScript, C/C++, Perl, Tcl, VB); Cross-Topic APIs: Informative Wording; Cross-Topic Codings: XML; Cross-Topic Protocols, e.g. Session Layers; Various Standards Credit: Frank Farance, Farance Inc.
Types of Registries • ISO 11179 Metadata Registries – Data semantics • OASIS/ebXML (Organization for the Advancement of Structured Information Standards/electronic business XML) XML Registries – XML Artifacts • UDDI (Universal Description, Discovery, and Integration) Registries – Web-based business services • Database System Registries (System Catalogs/Data Dictionaries/Repositories) – Schema, integrity & operational information • CASE Tool Registries (Encyclopedias/Repositories) – Data model and application program logic • Ontological Registries – Concept structures • Software Component Registries – Software components • Dublin Core Registries – Descriptive records for information resources Credit: Bruce Bargmeyer, Lawrence Berkeley National Laboratory
Registry Types • Diagram: ISO 11179 Registries, OASIS/ebXML Registries, UDDI Registries, Ontological Registries, Database Catalogs, Software Component Registries, CASE Tool Repositories, and Dublin Core Registries, each holding Common Content and linked through Cooperation/Interoperation Credit: Bruce Bargmeyer, Lawrence Berkeley National Laboratory
Dublin Core Metadata Initiative • Dublin Core Metadata Initiative • An open forum engaged in the development of interoperable online metadata standards • Dublin Core element set (DCES) • a standard for cross-domain information resource description, typically for electronic documents. • being processed through ISO TC 46 as DIS 15836 • The element set includes: • Title, Creator, Subject, Description, Publisher, Contributor, Date, Resource Type, Format, Resource Identifier, Source, Language, Relation, Coverage, Rights.
Dublin Core Metadata Initiative • Each element in the set is described using attributes based on ISO/IEC 11179-3:1994 • Required attributes: • URI, label, definition, type of term, status, date issued • Optional attributes: • comment, see, references, refines, qualifies • DCES often specified in HTML <meta> tag • E.g. <meta name="dc.title" content="My presentation"> • DCMI providing guidelines for using DCES in XML • E.g. <dc:title>My presentation</dc:title> • An XML Schema and RDF Schema are provided
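The two encodings mentioned on this slide can be produced from one record. The sketch below uses only the Python standard library; the helper names `dc_meta_tags` and `dc_xml`, the wrapper element `metadata`, and the sample record are assumptions made for illustration, not part of any DCMI guideline.

```python
from html import escape
import xml.etree.ElementTree as ET

# Namespace URI of the Dublin Core element set, version 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"


def dc_meta_tags(record):
    """Render each Dublin Core element as an HTML <meta> tag."""
    return [
        f'<meta name="dc.{name}" content="{escape(value, quote=True)}">'
        for name, value in record.items()
    ]


def dc_xml(record, root_tag="metadata"):
    """Render the same record as namespace-qualified XML elements."""
    ET.register_namespace("dc", DC_NS)
    root = ET.Element(root_tag)  # hypothetical wrapper element
    for name, value in record.items():
        ET.SubElement(root, f"{{{DC_NS}}}{name}").text = value
    return ET.tostring(root, encoding="unicode")


record = {"title": "My presentation", "creator": "Ray Gates"}
meta_tags = dc_meta_tags(record)
xml_text = dc_xml(record)
```

Keeping the record as a plain mapping from element name to value means the same content can feed either encoding, which is the point of a shared element set.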
Dublin Core Users • National Library of Canada • http://www.nlc-bnc.ca/ (Use View Source to see meta tags) • US Library of Congress • http://www.loc.gov/ • National Library of Australia • http://www.nla.gov.au/ (Use View Source to see meta tags) • National Science Foundation • http://www.nsf.org/ • European Commission Diffuse project • http://www.diffuse.org (Use View Source to see meta tags) • More… • See http://dublincore.org/about/participants/
Standards Registry Metadata • Specification for a Registry of Standards • Builds on Dublin Core Element Set to describe Standards documents, including the organization(s) that specify or maintain the standards • Working Draft developed by a sub-committee of OASIS • Now passed to JTC1/SC32/WG2 for completion and processing within ISO/IEC
W3C Metadata Standards • XML DTD – Document Type Definition • Part of the base XML standard • Allows the structure of a document type to be specified, so that document instances can be validated • Syntax of a DTD is not XML • All data is treated as character data (no datatypes) • Does not support XML Namespaces • a particular tag name can mean only one thing • makes it hard to share definitions
W3C Metadata Standards • XML Schema • Extends the capabilities of DTDs • Allows the structure of a document type to be specified, so that document instances can be validated • Syntax of XML Schema is XML • Supports predefined and user-defined datatypes • Supports XML Namespaces • tags can be prefixed with a namespace id, to differentiate otherwise identical tags
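A small sketch of how namespaces keep otherwise identical tags apart, using Python's standard `xml.etree.ElementTree`. The document, the `hr` prefix, and the `http://example.org/hr` namespace are invented for the example; only the Dublin Core namespace URI is a real one.

```python
import xml.etree.ElementTree as ET

# Two <title> elements with different meanings, told apart by namespace.
DOC = """
<record xmlns:dc="http://purl.org/dc/elements/1.1/"
        xmlns:hr="http://example.org/hr">
  <dc:title>Metadata Standards</dc:title>
  <hr:title>IT Solutions Architect</hr:title>
</record>
"""

# Prefix-to-URI map used for lookups; the prefixes here are local
# to this code and need not match those used in the document.
NS = {
    "dc": "http://purl.org/dc/elements/1.1/",
    "hr": "http://example.org/hr",
}

root = ET.fromstring(DOC)
document_title = root.find("dc:title", NS).text
job_title = root.find("hr:title", NS).text

# Internally each tag is the pair (namespace URI, local name).
qualified_tags = [child.tag for child in root]
```

The parser compares namespace URIs, not prefixes, so two documents may use different prefixes for the same vocabulary and still be read the same way.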
W3C Metadata Standards • RDF – Resource Description Framework • A language for representing metadata about web resources using XML syntax • Much more powerful than the HTML <meta> tag • Represents properties as name/value pairs associated with a named resource • First published 1999 – currently under revision
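W3C Metadata Standards • Simple RDF Example
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://metadata-stds.org/">
    <dc:title>Metadata Standards Home Page</dc:title>
  </rdf:Description>
</rdf:RDF>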
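To show the name/value-pair reading of such a description, the sketch below extracts (resource, property, value) statements with Python's standard library. It assumes the usual `rdf` and `dc` namespace URIs are declared on the root element and handles only this flat form of RDF/XML; it is not a general RDF parser, and the function name `statements` is hypothetical.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

RDF_XML = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://metadata-stds.org/">
    <dc:title>Metadata Standards Home Page</dc:title>
  </rdf:Description>
</rdf:RDF>"""


def statements(xml_text):
    """Return (resource, property URI, value) for each simple property."""
    root = ET.fromstring(xml_text)
    found = []
    for description in root.findall(f"{{{RDF_NS}}}Description"):
        resource = description.get(f"{{{RDF_NS}}}about")
        for prop in description:
            # ElementTree tags look like "{namespace}localname".
            namespace, _, local_name = prop.tag[1:].partition("}")
            found.append((resource, namespace + local_name, prop.text))
    return found


result = statements(RDF_XML)
```

Each tuple names the resource, the property (as a full URI), and the value, which is how the one-line example reads as a statement about the home page.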
W3C Metadata Standards • RDF Schema • Intended to provide a way of validating RDF statements • Allows types and value spaces to be associated with resources that may then be associated as properties of other resources
Other Web Metadata Standards • RDDL – Resource Directory Description Language • Allows metadata to be associated with hyperlinks in XML documents (XLINKs) • Allows the target resource to be described by: • A title, nature and purpose • The nature of a referenced resource is a property of the referenced resource itself • The purpose is a property of the link. • Some common natures and purposes have been defined
Other Web Metadata Standards • RELAX NG • A simpler alternative to XML Schema • Developed by OASIS • Combines two earlier schema languages: RELAX and TREX • RELAX NG validators are available
Web Services (WIP) • WSDL – Web Services Description Language • A model and an XML format for describing web services, typically: • Definitions of request and reply messages • Instructions on how to access the service • WSFL – Web Services Flow Language • Proposal from IBM for orchestration flow across multiple web services • XLANG – Microsoft equivalent of WSFL
Web Services (WIP) • UDDI – Universal Description Discovery & Integration • To facilitate the discovery of web services, both at design time, and dynamically at run-time • UDDI Registry contains information on: • Business entities providing services • Classification information about the types and location of services provided • Details on how to invoke the services • http://www.uddi.org • Maintenance of the standard passed to OASIS • Implementations are available
Data Element Standards • Country codes (ISO 3166) • Language codes (ISO 639) • Human sex codes (ISO/IEC 5218) • ISBN (ISO 2108) • Universal Product Code (UPC) • and many more
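Coded standards like these are usually applied as lookup tables that reject anything outside the permitted set. The sketch below encodes the four ISO/IEC 5218 sex codes in Python; the function name `decode_sex` and the error-handling choice are assumptions made for the example.

```python
# ISO/IEC 5218 codes for the representation of human sexes.
SEX_CODES = {
    0: "not known",
    1: "male",
    2: "female",
    9: "not applicable",
}


def decode_sex(code):
    """Translate an ISO/IEC 5218 code into its meaning.

    Raises ValueError for any code outside the standard's value set,
    so bad data is caught instead of silently passed along.
    """
    try:
        return SEX_CODES[code]
    except KeyError:
        raise ValueError(f"{code!r} is not an ISO/IEC 5218 code") from None


decoded = [decode_sex(code) for code in (1, 2, 9)]
```

Exchanging the code rather than a free-text label is what lets two systems agree on meaning without agreeing on wording or language.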
Industry Message Standards • ACORD – Insurance • BIPS - Bank Internet Payment System • HL7 – Healthcare • MFDS – Mutual Fund Data Systems • SWIFT – Financial services • and many more
Health Information Standards • Canadian Conceptual Health Data Model (CHDM) • a reference tool for organizing high-level health information and data. • CIHI Data Dictionary • standardized data elements, definitions and values for elements. • HL7 version 2.4 • an international standard for message format and content for communication among health systems • HL7 Clinical Context integration (CCOW) • an end user focused standard to facilitate the integration of software applications at point of use. • HL7 Clinical Document Architecture • provides a structured model for clinical documents making them readable to both machines and humans.
Appendix • The following slides are provided for reference purposes
Organization: ISO/IEC • ISO/IEC JTC1 Joint Technical Committee for Information Technology • ISO/IEC JTC1/SC32 Data Management & Interchange • WG1 – Open EDI • WG2 – Metadata • WG3 – Database Languages (SQL) • WG4 – SQL Multimedia • ISO/IEC JTC1/SC7 Software Engineering • ISO/IEC JTC1/SC22 Programming Languages
SC32/WG2 Metadata Standards • Information Resource Dictionary System (1990s) • IRDS Framework (Historical – still valid) • IRDS Services Interface (Not implemented) • Standardization of Data Elements • ISO/IEC 11179 1st Edition (1990s) • Metadata Registries (MDR) • ISO/IEC 11179 2nd Edition (2003-2004) • Technical reports on achieving content consistency (TR20943) • New project on Interoperability & Bindings (ISO/IEC 20944) • IT-enablement of widely used coded domains • ISO/IEC 18022 (in progress) • Framework for Metamodel Interoperability • ISO/IEC 19763 (new project)
SC32/WG2 Metadata Standards • ISO/IEC 10027:1990 IRDS Framework • ISO/IEC 10728:1993 IRDS Services Interface • ISO/IEC 11179-1: Data Elements – Framework • ISO/IEC 11179-2: Data Elements – Classification • ISO/IEC 11179-3: Data Elements – Basic Attributes • ISO/IEC 11179-4: Data Elements – Data Definitions • ISO/IEC 11179-5: Data Elements – Naming principles • ISO/IEC 11179-6: Data Elements – Registration • ISO/IEC 11179-3:2003 Metadata Registry (MDR) • ISO/IEC 20943 Technical Reports on Content Consistency • ISO/IEC 20944-n MDR Interoperability & Bindings • ISO/IEC 19763 Framework for Metamodel Interoperability