Extended Metadata Registry (XMDR) ISO/IEC JTC 1/SC 32/WG 2 November 2004 Bruce Bargmeyer +1 (510) 495-2905 bebargmeyer@lbl.gov
Topics • XMDR project direction • What, when, who, how • XMDR relationship to WG 2 projects • A view of semantics based computing • Some specific semantics challenges • Some technology choices • A preview of issues to be raised for Parts 2 and 3 • Frank Olken will describe some of the content and ontology issues and approaches
XMDR Project Direction • Extend the capabilities of 11179 metadata registries to register complex metadata structures (concept structures, terminologies) • Ontologies, Graphs, Taxonomies, Thesauri, … This presentation will use the term “concept structures” as synonymous with “complex metadata structures” • Extend the capabilities of 11179 metadata registries to record correlations and interrelations between data (e.g., data elements & domains) and other concept structures. • Extend the capabilities of 11179 metadata registries to record correlations and interrelations between the various concept structures themselves.
XMDR & SC 32/WG 2 • Propose draft text for 11179 Part 2, Version 3 • Preview November 2004 WG 2 meeting • Proposals April 2005 WG 2 meeting • Propose issues for 11179 Part 3, Version 3 • Test & demo extended 11179 capabilities in a reference implementation • Tests & demo starting March 2005 • Register concept structures nominated by participants
XMDR Project Direction • Part of an Interagency/International Cooperation on Ecoinformatics • Sponsors & participants: EPA, EEA, USGS, DOD, NCI, Mayo Clinic • Extend semantics management capabilities for ISO/IEC 11179 • Produce a design for the next generation of operational ISO/IEC 11179 registries • Test & demo extended 11179 capabilities in a reference implementation • Research, develop, evaluate, adapt, extend, and demonstrate techniques and technologies for semantics based computing • Facilitate early adoption of these technologies • Establish best practices for the semantic web and semantics based computing Forging Semantics Based Computing
Project Direction - Ecoinformatics Information science and information technology for the environment • Sound information as the basis for environmental policy, decisions, and action • Information technology that supports and enables development of sound information • Facilitate interaction with environmental information • Human - Computer • Computer - Computer
People Involved • XMDR Project at LBNL (Ecoinformatics +) • LBNL: Bruce Bargmeyer, Frank Olken, Kevin Keck, John McCarthy (ret. consulting) • DOD: Nancy Lawler & Sam Chance • EPA: Larry Fitzwater, Howard Tsai, Linda Spencer, William Sonntag • USGS: Gail Hodge (IIA, for USGS) (Lisa Zolly, USGS, is joining L8) • Mayo Clinic: Harold Solbrig • NCI: Sherri De Coronado, Denise Warzel • SC 32/WG 2, INCITS-L8 • Ashton Computing & Management: Judith Newton • Farance Inc. (consulting)
XMDR Liaison Activities • OMG Ontology Development Metamodel (ODM) • W3C Semantic Web Best Practices and Deployment Working Group • Ecoterm • EU Joint Research Center (EDEN-IW) • National Science Foundation (Ecoinformatics) • Interagency/International Cooperation on Ecoinformatics
Project Direction • What the project is not: • An attempt to make 11179 metadata registries be a development and maintenance facility for every type of concept structure • An attempt to standardize the complete range of terminology servers
XMDR Relationship to WG 2 Projects - Introduction Real World Metamodel constructs: CDIF Core, MOF, XML-Schema, RDFS, ODM, OWL, Common Logic (CL) Modeling Tools Methodologies/tools: EDR, NIAM, O-O, RDF, UML, Ontology Model Artifacts and Exchange CDIF, (UML: MDL, XMI), OWL, CL & SCL (KIF) Applications SQL (relational), Object, Semantic Web (might be agent based, grids, etc.)
Real World Metamodel constructs: CDIF Core, MOF, XML-Schema, RDFS, ODM, OWL, CL MDR Semantics Management: Data elements, Domains, Concepts, Terms … Modeling Tools Methodologies: EDR, NIAM, O-O, RDF, UML, Ontology Model Artifacts and Exchange CDIF, (UML- MDL, XMI), OWL, CL & SCL (KIF) Applications SQL, Object, RDF, Semantic Web 11179 Semantics Management
MDR – Keeping Track of the Real World [Figure: an 11179 Metadata Registry tracks change in the real world. Sources shown: ISO, ANSI, Industry, Gov’t. Country changes: Past: CZ Czechoslovakia*; Present: CZ Czech Republic**, LO Slovakia**.]
Real World MDR Semantics Management: Data elements, Domains, Concepts, Terms … Modeling Tools Model Artifacts and Exchange Applications SC 32 Standards & Projects WG2 - 20943 WG 2 – 11179 Metamodel constructs: CDIF Core, MOF, XML-Schema, RDFS, ODM, OWL, CL WG 2 - MMF (19763), CL (24707) & MOF PAS submission Methodologies: EDR, NIAM, O-O, RDF, UML, Ontology CDIF, (UML- MDL, XMI), OWL, CL & SCL (KIF) WG 2 - MMF (19763), CL (24707) & XMI PAS submission WG 2 - 20944 SQL, Object, RDF, Semantic Web WG 3 - SQL
Real World MDR Semantics Management: Data elements, Domains, Concepts, Terms … Modeling Tools Model Artifacts and Exchange Applications XMDR Focus XMDR project WG2 - 20943 WG 2 – 11179 Parts 2 & 3 Metamodel constructs: CDIF Core, MOF, XML-Schema, RDFS, ODM, OWL, CL WG 2 - MMF (19763), CL (24707) & MOF PAS submission Methodologies: EDR, NIAM, O-O, RDF, UML, Ontology CDIF, (UML- MDL, XMI), OWL, CL & SCL (KIF) WG 2 - MMF (19763), CL (24707) & XMI PAS submission WG 2 - 20944 XMDR project XMDR project SQL, Object, RDF, Semantic Web WG 3 - SQL
Real World MDR Semantics Management: Data elements, Domains, Concepts, Terms … Modeling tool Model artifact/exchange Application A Current Example Ontology Works Inc. (OWI) IODE data modeling tool Domain ontology expressed in Simple Common Logic (based on Draft ISO/IEC 24707) OWI Knowledge Server Application
Real World MDR Semantics Management: Data elements, Domains, Concepts, Terms … Modeling tool Model artifact/exchange Application Another Current Example Protégé ontology tool Domain ontology expressed as an OWL ontology OWI Knowledge Server Application, possibly built on Objectivity as the persistent object store (DBMS)
Semantics Based Computing • What is it? • Evolution of semantics management • Evolution of technologies that utilize semantics
Semantics based computing • Computation based on the meaning of data rather than on the manipulation of syntactic structures.
Semantic Mapping [Figure: three layers linked by semantic mappings. Global Ontology: Observation, Station, Unit, Determinant, AnalyticalFraction, TimeStamp, Medium. Local Ontology: NERITime, NERIObservationCharacteristics, NERIStation. Local DB Schema: Table(x), Table(y), Table(z), Table(m).]
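The layering in this figure (local DB tables mapped to a local ontology, which in turn maps to the global ontology) can be sketched in Python. This is only an illustration: the column names and the specific pairings are hypothetical, since the slide names the concepts but does not say which maps to which.

```python
# Hypothetical pairings: the slide lists the concepts, not the mappings.
LOCAL_SCHEMA_TO_LOCAL_ONTOLOGY = {
    ("Table(x)", "obs_time"): "NERITime",
    ("Table(y)", "station_id"): "NERIStation",
    ("Table(z)", "parameter"): "NERIObservationCharacteristics",
}

LOCAL_TO_GLOBAL_ONTOLOGY = {
    "NERITime": "TimeStamp",
    "NERIStation": "Station",
    "NERIObservationCharacteristics": "Determinant",
}


def global_concept(table, column):
    """Resolve a local table column to a global ontology concept, or None."""
    local = LOCAL_SCHEMA_TO_LOCAL_ONTOLOGY.get((table, column))
    if local is None:
        return None
    return LOCAL_TO_GLOBAL_ONTOLOGY.get(local)


def columns_for(concept):
    """Find every local column whose meaning is the given global concept."""
    return sorted(
        key
        for key, local in LOCAL_SCHEMA_TO_LOCAL_ONTOLOGY.items()
        if LOCAL_TO_GLOBAL_ONTOLOGY.get(local) == concept
    )
```

A query phrased against the global concept (for example `columns_for("TimeStamp")`) can then be routed to the local columns that carry that meaning, without the caller knowing the local schema.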
Metadata Registries: Semantics Management Evolution Initial “data standards”, evolved to stronger semantics management • Common data across information systems (data standards) • Database (schema) integration • Data use - metadata • Warehouse support – schema and metadata • XML support (schema) • “Backed into” concept/terminology support (deeper semantics) • Next: Semantics servers -- for semantic web and semantics based computing
Past, Present, … Future? [Figure: many users reaching many information systems and data sources (EEA, DOE, DoD, EPA, Others . . .). Each source holds text and data under its own subject headings, some in English (environ, agriculture, climate, human health, industry, tourism, soil, water, air) and some in Spanish, and under its own numeric codes.] Lots of users • Lots of information systems • Lots of Data Sources
Data Standards • Avoid a combinatorial explosion of data content, description, and metadata arrangements for information access and exchange. Data standards and metadata registries can help.
Data Element Concept • Name: Country Identifiers • Unique ID: 5769 • Other attributes (blank on the slide): Context, Definition, Conceptual Domain, Maintenance Org., Steward, Classification, Registration Authority, Others
Data Elements • Three data elements, each with Name, Context, Definition, Unique ID, Value Domain, Maintenance Org., Steward, Classification, Registration Authority, Others • Unique IDs shown: 4572, 3820, 1047 • Value domains shown: ISO 3166 English Name, ISO 3166 3-Alpha Code, ISO 3166 3-Numeric Code
ISO 3166 English Name    ISO 3166 3-Alpha Code    ISO 3166 3-Numeric Code
Afghanistan              AFG                      004
Belgium                  BEL                      056
China                    CHN                      156
Denmark                  DNK                      208
Egypt                    EGY                      818
France                   FRA                      250
Germany                  DEU                      276
…………                     …………                     …………
11179 Metadata Registry • English names: Afghanistan, Belgium, China, Denmark, Egypt, France, Germany, … • 3-alpha codes: AFG, BEL, CHN, DNK, EGY, FRA, DEU, … • 3-numeric codes: 004, 056, 156, 208, 818, 250, 276, …
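The idea behind these two slides, one data element concept represented through several value domains, can be sketched as follows. This is a minimal illustration, not the 11179 metamodel: the class names and the `translate` helper are assumptions, and the English name is used as a stand-in for a proper value-meaning identifier.

```python
from dataclasses import dataclass, field


@dataclass
class ValueDomain:
    name: str
    # Maps a shared value meaning (here, the country) to its permissible value.
    permissible_values: dict


@dataclass
class DataElementConcept:
    name: str
    unique_id: str
    value_domains: dict = field(default_factory=dict)

    def register(self, domain):
        self.value_domains[domain.name] = domain

    def translate(self, value, source, target):
        """Translate a value between two value domains via the shared meaning."""
        src = self.value_domains[source].permissible_values
        tgt = self.value_domains[target].permissible_values
        for meaning, permissible in src.items():
            if permissible == value:
                return tgt[meaning]
        raise KeyError(f"{value!r} is not a permissible value of {source}")


# Rows taken from the slide: English name, 3-alpha code, 3-numeric code.
ROWS = [
    ("Afghanistan", "AFG", "004"),
    ("Belgium", "BEL", "056"),
    ("China", "CHN", "156"),
    ("Denmark", "DNK", "208"),
    ("Egypt", "EGY", "818"),
    ("France", "FRA", "250"),
    ("Germany", "DEU", "276"),
]
DOMAIN_NAMES = [
    "ISO 3166 English Name",
    "ISO 3166 3-Alpha Code",
    "ISO 3166 3-Numeric Code",
]

country = DataElementConcept("Country Identifiers", "5769")
for column, domain_name in enumerate(DOMAIN_NAMES):
    # row[0] (the English name) doubles as the value-meaning key in this sketch.
    country.register(ValueDomain(domain_name, {row[0]: row[column] for row in ROWS}))
```

Because every value domain hangs off the same concept, a system that stores 3-alpha codes and one that stores 3-numeric codes can exchange data through the registry rather than through pairwise conversion tables.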
Then there is one point of access to our environmental data resources: [Figure, June 1996. Separate Environmental Media Legislation: CAA, CWA, RCRA, TSCA, State Laws. Separate Regs/Procedures: Fed Air Reg, Fed Water Reg, Fed RCRA Reg, Fed TSCA Reg, State Regs. Separate Data Repositories. Complete Warehouse Repository. Regulated Facility. Public / Environmental Regulators / Environmental Community.]
Data and Semantics Management DBMS/XML/Documents Dictionary Data Elements Keyword Thesaurus Semantic Web Terms Ontology Concepts 11179 Metadata Registry
Metadata Registry Terminology Thesaurus Themes Ontology GEMET Data Standards Structured Metadata ISO/IEC 11179 Metadata Registries Evolving toward stronger semantics management
Users World Wide Web Companies Data Services Metadata Registries Universities Environmental Data Grid Semantic Services Terminology Thesaurus Ontology Taxonomy Computation Services Agencies Data Standards Structured Metadata Others Environmental Semantics Grid Software: Models, Visualization, Analysis Agent systems Semantics Based Computing Environmental Computer Grid High Performance, cluster, Personal September 2004
What is it? [2] Semantics based computing: Applications that take the meaning of data into account to direct the processing. • Establish linkage between concepts referenced in text and related data in databases • Semantic Web • Support agent-based development of actionable data, for informed decision making.
Some Challenges • Translate the 11179 UML model into an ontology, manually. • Translate the 11179 UML model into an ontology, automated. • Identify emerging technology for building reference implementation, develop architecture • Identify test concept structures and sources • Characterize concept structures • Identify extensions needed for 11179 • Propose extensions needed for 11179
Manual Translation: UML 11179 to an Ontology • Use the Protégé tool and the OWL specification • Frank will show and tell all about it
Automated Translation 11179 UML Metamodel to an OWL Ontology Part 3 metamodel as Rational Rose UML MDL file
ISO/IEC 11179 Expressed as an Ontology
<?xml version="1.0" encoding="ISO-8859-1"?>
<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
    xmlns:owl="http://www.w3.org/2002/07/owl#"
    xmlns="http://www.owl-ontologies.com/unnamed.owl#"
    xml:base="http://www.owl-ontologies.com/unnamed.owl">
  <owl:Ontology rdf:about=""/>
  <owl:Class rdf:ID="Registrar">
    <rdfs:subClassOf rdf:resource="http://www.w3.org/2002/07/owl#Thing"/>
    <rdfs:subClassOf>
      <owl:Restriction>
        <owl:cardinality rdf:datatype="http://www.w3.org/2001/XMLSchema#int">1</owl:cardinality>
        <owl:onProperty>
          <owl:ObjectProperty rdf:ID="contact"/>
        </owl:onProperty>
      </owl:Restriction>
    </rdfs:subClassOf>
    <rdfs:subClassOf>
      <owl:Restriction>
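One way to read such an ontology back is to pull out each cardinality restriction, for instance "a Registrar has exactly 1 contact". The sketch below uses only Python's standard `xml.etree.ElementTree`. The embedded document is an assumed, closed-off version of the slide excerpt that keeps only the first restriction; it is not the project's actual output file, and `cardinality_restrictions` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
OWL = "http://www.w3.org/2002/07/owl#"

# Assumption: the truncated slide excerpt, closed off after its first restriction.
OWL_DOC = """
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
         xmlns:owl="http://www.w3.org/2002/07/owl#">
  <owl:Class rdf:ID="Registrar">
    <rdfs:subClassOf>
      <owl:Restriction>
        <owl:cardinality
            rdf:datatype="http://www.w3.org/2001/XMLSchema#int">1</owl:cardinality>
        <owl:onProperty>
          <owl:ObjectProperty rdf:ID="contact"/>
        </owl:onProperty>
      </owl:Restriction>
    </rdfs:subClassOf>
  </owl:Class>
</rdf:RDF>
"""


def cardinality_restrictions(owl_xml):
    """Return (class, property, cardinality) triples found in an RDF/XML string."""
    root = ET.fromstring(owl_xml)
    found = []
    for cls in root.iter(f"{{{OWL}}}Class"):
        class_id = cls.get(f"{{{RDF}}}ID")
        for restriction in cls.iter(f"{{{OWL}}}Restriction"):
            card = restriction.find(f"{{{OWL}}}cardinality")
            prop = restriction.find(f"{{{OWL}}}onProperty/{{{OWL}}}ObjectProperty")
            if card is not None and prop is not None:
                found.append((class_id, prop.get(f"{{{RDF}}}ID"), int(card.text)))
    return found
```

Plain XML parsing is enough for this narrow check. A real registry would more likely load the file into an RDF or OWL toolkit so that equivalent serializations are handled uniformly.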
Potential Standards/Technologies • DBMS • Object, XML, Relational, RDF/Graph, Logic, Text, Document, Multimedia • Knowledge Representation • Web Ontology Language (OWL) • Simple Common Logic (SCL) • Middleware/Messaging • Cocoon 2, Jini, CoABS, JMS, XMLBlaster, SOAP • XML [Semantic] Web Services • Axis, JWSDP • Agent Development • ABLE, JADE • Engines/Servers • OMS (IBM), Federator/OMS (OWI) • Jess
Content and Content Characterization [Figure: a three-level graph with one A node, three B nodes, and five C nodes.] Directed Acyclic Graphs, Cyclic, Undirected, … Frank Olken to tell about this.
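Characterizing a concept structure along the lines named on this slide (directed acyclic, cyclic, undirected) can be sketched with a standard topological-sort check. The function name, the returned labels, and the example edges are assumptions made for illustration; the slide does not give the edges of its figure.

```python
def characterize(nodes, edges, directed=True):
    """Classify a structure as 'undirected', 'directed acyclic', or 'directed cyclic'."""
    if not directed:
        return "undirected"
    # Kahn's algorithm: repeatedly remove nodes that have no incoming edges.
    indegree = {n: 0 for n in nodes}
    children = {n: [] for n in nodes}
    for parent, child in edges:
        children[parent].append(child)
        indegree[child] += 1
    ready = [n for n in nodes if indegree[n] == 0]
    removed = 0
    while ready:
        node = ready.pop()
        removed += 1
        for child in children[node]:
            indegree[child] -= 1
            if indegree[child] == 0:
                ready.append(child)
    # Any node that could not be removed sits on, or downstream of, a cycle.
    return "directed acyclic" if removed == len(nodes) else "directed cyclic"


# Hypothetical edges: C1 has two parents, so this is a DAG but not a tree.
NODES = ["A", "B1", "B2", "B3", "C1", "C2"]
TAXONOMY = [("A", "B1"), ("A", "B2"), ("A", "B3"),
            ("B1", "C1"), ("B2", "C1"), ("B3", "C2")]
WITH_LOOP = TAXONOMY + [("C1", "A")]
```

Knowing the class of a structure up front matters for a registry: a traversal that is safe on a taxonomy may never terminate on a cyclic thesaurus unless it tracks visited nodes.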
Preview: Suggested Changes for P2/P3 (cont.) • Issue 1. Make "relation" an administered item. Relationships could be managed as part of the structure in which they are involved. Alternatively, in Clauses 4.10 and 4.11, possibly treat the subject role as an aggregate association. This is an alternative way of administering relationships, more in line with current practice.
Preview: Suggested Changes for P2/P3 (cont.) • Issue 2. Rename the "horizontal" role and association names in Clause 4.7.3, Figure 3. E.g., Value_Domain should not have the role "representing" going in two directions. The association name "data_element_representation" may be affected by the change in role names. Note that the roles between the top two boxes are labeled "having" and "specifying", while the roles between the bottom two boxes are labeled "represented_by" and "representing". The relationship between the upper two boxes and the bottom two should be symmetric. Also, "having" could better be "specified_by". (We also have possible alternate proposals for labels.)
Preview of Issues (cont.) • Issue 3: Identify the types of correspondences between concepts. The point is to record the types of overlap between the concepts. An alternative is translation tables, which record pairs of IDs linking concepts without any more specific "type" information.
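The difference between the two options in Issue 3 can be sketched as follows. The set of correspondence types and the concept IDs are hypothetical; the slide does not enumerate either.

```python
from dataclasses import dataclass
from enum import Enum


class CorrespondenceType(Enum):
    # Hypothetical vocabulary for the "types of overlap" between concepts.
    EQUIVALENT = "equivalent"
    BROADER = "broader"
    NARROWER = "narrower"
    OVERLAPPING = "overlapping"


@dataclass(frozen=True)
class Correspondence:
    source_id: str
    target_id: str
    kind: CorrespondenceType


def safe_substitutes(concept_id, correspondences):
    """Targets that can stand in for concept_id without changing the meaning."""
    return [
        c.target_id
        for c in correspondences
        if c.source_id == concept_id and c.kind is CorrespondenceType.EQUIVALENT
    ]


# Typed correspondences (IDs are made up for illustration).
TYPED = [
    Correspondence("epa:water", "eea:water", CorrespondenceType.EQUIVALENT),
    Correspondence("epa:water", "eea:surface-water", CorrespondenceType.NARROWER),
]

# The translation-table alternative: the same pairs with no type information.
TRANSLATION_TABLE = [(c.source_id, c.target_id) for c in TYPED]
```

With the typed form, a consumer can tell that only `eea:water` is a safe replacement for `epa:water`; the bare translation table lists both targets and cannot distinguish them.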
Preview: Suggested Changes for P2/P3 • Issue 4. Use directed relationships as a replacement for "association" and "related to" in Clauses 4.10 and 4.11. Note that in Clause 4.11, "concept_relation" is directed, but no inverse is specified. • Issue 5. Replace the "string" value with a relation instance. This applies to • Clause 4.10: classification_scheme_item_relationship_type_description • Clause 4.11: data_element_concept_relationship_type_description • concept_relationship_type_description
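Issues 4 and 5 together suggest relations that are declared objects with a direction and a named inverse, not free-text strings. A minimal sketch of that idea follows; the class, its method names, and the example relation names are assumptions and are not taken from the 11179 text.

```python
from collections import defaultdict


class DirectedRelations:
    """Stores directed relations and keeps each declared inverse in step."""

    def __init__(self):
        self._inverse = {}
        self._edges = defaultdict(set)  # (subject, relation) -> set of objects

    def declare(self, relation, inverse):
        # A relation must be declared with its inverse before it can be used.
        self._inverse[relation] = inverse
        self._inverse[inverse] = relation

    def relate(self, subject, relation, obj):
        if relation not in self._inverse:
            raise ValueError(f"undeclared relation: {relation}")
        self._edges[(subject, relation)].add(obj)
        # Record the inverse direction so both ends can be queried.
        self._edges[(obj, self._inverse[relation])].add(subject)

    def objects(self, subject, relation):
        return sorted(self._edges.get((subject, relation), ()))


relations = DirectedRelations()
relations.declare("broader_than", "narrower_than")  # hypothetical relation names
relations.relate("water", "broader_than", "groundwater")
```

Rejecting undeclared relation names is the practical difference from a string-valued description: a misspelled relation fails at registration time and does not silently create a new relation type.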
Next Year: Proposed • Service Oriented Architecture
Eighth International Open Forum on Metadata Registries • Semantic Interoperability: Where Meaning Meets Metadata • Open Forum 2005 • April 11-14, 2005 • Berlin, Germany • Berlinopenforum.de