Speeding up ontology creation of scientific terms. Luis Bermudez, John Graybeal, Monterey Bay Aquarium Research Institute http://marinemetadata.org December 7, 2005.
Why are ontologies important? At AGU we have 31 abstracts and 2 entire sessions related to ontologies.
Problem: Semantic Interoperability. SSDS: "get me data for Parameter temperature_1 (deg C)". AOSN: "get me data for Variable ocean_temperature (C)".
Need for controlled vocabulary. A controlled vocabulary is a restricted set of words used by an information community when describing resources or discovering data. It prevents misspellings and avoids arbitrary, duplicative, or confusing words that cause inconsistencies when cataloging data.
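A minimal sketch of how a controlled vocabulary can be enforced when cataloging data. The three vocabulary terms and the function name `check_terms` are illustrative assumptions, not part of any MMI vocabulary or tool.

```python
# Hypothetical controlled vocabulary; the terms are illustrative only.
CONTROLLED_VOCABULARY = {"ocean_temperature", "salinity", "chlorophyll"}


def check_terms(terms, vocabulary=CONTROLLED_VOCABULARY):
    """Split terms into those found in the vocabulary and those rejected.

    Matching ignores surrounding whitespace and letter case; the original
    spelling of each term is what gets returned.
    """
    accepted, rejected = [], []
    for term in terms:
        normalized = term.strip().lower()
        if normalized in vocabulary:
            accepted.append(term)
        else:
            rejected.append(term)
    return accepted, rejected


accepted, rejected = check_terms(["ocean_temperature", "temperature_1", "Salinity "])
print("accepted:", accepted)
print("rejected:", rejected)
```

Rejected terms such as `temperature_1` are the ones that would need to be related to a standard term before the data can be discovered consistently.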
Problem: Semantic Interoperability (diagram labels: standard vocabularies, semantics)
Harmonization: HTML, Comma Separated Values, Tab Separated Values, DTD, Web Ontology Language (OWL), XML/XSD, Relational Database, RDF
Web Ontology Language: OWL • World Wide Web Consortium (W3C) Recommendation (February 2004) for formally expressing ontologies. • Based on the Resource Description Framework (RDF). • Can be serialized in XML. • Supporting tools: Jena, Protégé, SWOOP, Sesame, Pangloss, Kuwari, VINE, Voc2OWL
Fast introduction to OWL • RDF Triples • RDF Resources • Classes - individuals - properties • RDF Graph
RDF: Resource. A resource is anything on the Web that has a unique identifier. Examples: • URI: urn:aosn.mbari.org.recordVariable.id:1900 • URL: http://mmi.org/2005/08/gcmd-keyw#Chlorophyll • URL: ftp://mmi.org/data-example (Diagram labels: Resources, Literal)
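To make the triple idea concrete, here is a small sketch that models resources, literals, and triples with plain Python data classes. It does not use any RDF library. The `Keyword` class URI is a hypothetical name introduced for illustration; the Chlorophyll URL comes from the slide, and the `rdf:type` and `rdfs:label` URIs are the standard W3C ones.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Resource:
    """Anything identified by a URI."""
    uri: str


@dataclass(frozen=True)
class Literal:
    """A plain value such as a string label."""
    value: str


@dataclass(frozen=True)
class Triple:
    """One statement: subject, predicate, object."""
    subject: Resource
    predicate: Resource
    obj: Union[Resource, Literal]


RDF_TYPE = Resource("http://www.w3.org/1999/02/22-rdf-syntax-ns#type")
RDFS_LABEL = Resource("http://www.w3.org/2000/01/rdf-schema#label")

chlorophyll = Resource("http://mmi.org/2005/08/gcmd-keyw#Chlorophyll")
# Hypothetical class, assumed only for this example.
keyword_class = Resource("http://mmi.org/2005/08/gcmd-keyw#Keyword")

# An RDF graph is a set of triples.
graph = {
    Triple(chlorophyll, RDF_TYPE, keyword_class),
    Triple(chlorophyll, RDFS_LABEL, Literal("Chlorophyll")),
}


def objects(graph, subject, predicate):
    """Return every object stated for the given subject and predicate."""
    return [t.obj for t in graph if t.subject == subject and t.predicate == predicate]


print(objects(graph, chlorophyll, RDFS_LABEL))
```

The object of a triple is either another resource (which links the graph together) or a literal (which ends a path in the graph).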
Classes, Individuals, Properties (diagram annotations: "Looks like a class": Parameter; "Property (Attributes)"; "Looks like individuals of (members of) the class Parameter")
How are ontologies created? • Conceptual direction strategy: • Top-down • Bottom-up • Automation approach: • Manual • Automatic
Bottom-up approach. Example: 1. Properties of real-world objects are identified. 2. Similarities are identified. 3. Concepts are created 4. and are expressed as a class. 5. Classes are related. Diagram labels: objects Lake and River; properties "is inland body", "has a relatively defined channel", "has water"; class Body of Water with subclasses Lake and River.
Bottom-up approach. Example: Real-world objects: parameters in observatory systems. They all have similar properties (id, description and units). Make them a resource: an instance of a class Parameter. Diagram labels: ssds:Parameter, rdf:type, aosn:Variable
Bottom-up approach (cont.) Diagram labels: sweet:Property, mmi:Parameter, ssds:Parameter, aosn:Variable
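Relating classes bottom-up can be sketched as a table of subclass links. The arrangement below (ssds:Parameter and aosn:Variable under mmi:Parameter, which sits under sweet:Property) is an assumption about what the slide's diagram showed; only the four class names come from the slide.

```python
# Assumed subclass links (child -> parent); the diagram itself is not
# recoverable from the transcript, so this layout is a guess.
SUBCLASS_OF = {
    "ssds:Parameter": "mmi:Parameter",
    "aosn:Variable": "mmi:Parameter",
    "mmi:Parameter": "sweet:Property",
}


def superclasses(cls, subclass_of=SUBCLASS_OF):
    """Walk upward from cls and return its ancestors, nearest first."""
    chain = []
    while cls in subclass_of:
        cls = subclass_of[cls]
        chain.append(cls)
    return chain


def common_superclasses(a, b, subclass_of=SUBCLASS_OF):
    """Return the ancestors of a that are also ancestors of b, nearest first."""
    ancestors_of_b = set(superclasses(b, subclass_of))
    return [c for c in superclasses(a, subclass_of) if c in ancestors_of_b]


print(superclasses("aosn:Variable"))
print(common_superclasses("ssds:Parameter", "aosn:Variable"))
```

Under this assumed layout, two locally defined classes become comparable because they share a common superclass.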
Manual (Ontology editor) List of more than 50 editors: http://www.xml.com/2002/11/06/Ontology_Editor_Survey.html Protégé
Automatic transformation (diagram labels: Properties file, Software Program, Ontology in OWL)
Automatic • Advantages: • Fast • Preserves a connection with the source (backward compatibility) • Avoids typing and copy/paste errors • Disadvantage: • Only works with simple vocabularies (flat vocabularies and some taxonomies)
VOC2OWL • Tool created by MMI • Creates ontologies automatically, bottom-up, from two basic structures of simple vocabularies: • Flat vocabularies (e.g. phone directory) • Hierarchical vocabularies (e.g. taxonomies) • Java (Eclipse) standalone application
Conversion Properties: I/O • Format of the ASCII file to transform: tab or csv • Location of the ASCII file • Location where the ontology in OWL will be saved
Ontology Conversion Properties • Namespace of the resources • One class (at least) is always created; more than one class can be created • Column from which the local names of the resources (individuals) will be created
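The conversion properties above can be illustrated with a short sketch that turns a flat tab- or comma-separated vocabulary into OWL written as Turtle text. This is not VOC2OWL's code or its actual output format: the function `flat_to_turtle`, its parameters, the sample rows, and the `http://example.org/vocab#` namespace are all assumptions made for illustration.

```python
import csv
import io
import re


def local_name(text):
    """Turn a cell value into a name that is safe to use after the ':' prefix."""
    return re.sub(r"[^A-Za-z0-9_]", "_", text.strip())


def escape(text):
    """Escape backslashes and double quotes for a Turtle string literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def flat_to_turtle(ascii_text, delimiter, namespace, class_name, name_column):
    """Create one class, one individual per row, one property per other column."""
    rows = list(csv.DictReader(io.StringIO(ascii_text), delimiter=delimiter))
    columns = [c for c in rows[0] if c and c != name_column] if rows else []
    lines = [
        "@prefix owl: <http://www.w3.org/2002/07/owl#> .",
        f"@prefix : <{namespace}> .",
        f":{class_name} a owl:Class .",
    ]
    lines += [f":{local_name(c)} a owl:DatatypeProperty ." for c in columns]
    for row in rows:
        parts = [f":{local_name(row[name_column])} a :{class_name}"]
        parts += [f':{local_name(c)} "{escape(row[c])}"' for c in columns if row[c]]
        lines.append(" ;\n    ".join(parts) + " .")
    return "\n".join(lines)


# Hypothetical tab-separated vocabulary with a header row.
SAMPLE = "name\tunits\ntemperature_1\tdeg C\nsalinity\tPSU\n"
turtle = flat_to_turtle(SAMPLE, "\t", "http://example.org/vocab#", "Parameter", "name")
print(turtle)
```

Because every individual's local name is derived from a source column, the generated ontology keeps a traceable link back to the original vocabulary file.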
Ontology Conversion Properties (hierarchy) • If the vocabulary is treated as a hierarchy, there is no such primary class. • All the lines in the ASCII file represent a hierarchy.
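For the hierarchical case, a rough sketch follows. It assumes each line of the file lists terms from broadest to narrowest, one per column, and that each term becomes a subclass of the term to its left. That file layout and the function `hierarchy_pairs` are assumptions; the slides do not specify how VOC2OWL reads hierarchies. The sample reuses the Body of Water, Lake, and River example from the earlier slide.

```python
import csv
import io


def hierarchy_pairs(ascii_text, delimiter=","):
    """Return (subclass, superclass) pairs in first-seen order, without repeats."""
    pairs = []
    for row in csv.reader(io.StringIO(ascii_text), delimiter=delimiter):
        # Drop empty cells so ragged lines still form a clean path.
        terms = [cell.strip() for cell in row if cell.strip()]
        for parent, child in zip(terms, terms[1:]):
            if (child, parent) not in pairs:
                pairs.append((child, parent))
    return pairs


SAMPLE = "Body of Water,Lake\nBody of Water,River\n"
for child, parent in hierarchy_pairs(SAMPLE):
    print(f"{child} is a subclass of {parent}")
```

No primary class is needed here because every term in the file becomes a class, and the column order alone supplies the subclass links.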
Has been tested! About 50 vocabularies were converted to OWL for the MMI workshop "Advancing Domain Vocabularies" (August 2005).
Why do we need all these ontologies? The workshop was about relating terms from one controlled vocabulary to terms in another. Microsoft Excel was too hard to use for this purpose :-)
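Relating terms across vocabularies can be sketched as a list of mapping statements that a query layer consults. The relation name `sameAs` and the single mapping below are illustrative assumptions; they are not results from the workshop.

```python
# Hypothetical mapping statements: (term, relation, term).
MAPPINGS = [
    ("ssds:temperature_1", "sameAs", "aosn:ocean_temperature"),
]


def equivalents(term, mappings=MAPPINGS):
    """Return the terms mapped as equivalent to term, in either direction."""
    found = set()
    for subject, relation, obj in mappings:
        if relation != "sameAs":
            continue
        if subject == term:
            found.add(obj)
        elif obj == term:
            found.add(subject)
    return found


# A request phrased in one system's vocabulary can be expanded so that
# data labeled with the other system's term is also found.
print(equivalents("aosn:ocean_temperature"))
```

Stored as triples, such mappings can live alongside the ontologies themselves, which is harder to manage in a spreadsheet once many vocabularies are involved.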
Mapping results 47 participants and 12 hours of mapping time
More… • Advancing marine knowledge: 250,000 RDF triples (ontologies + mappings) • They are available as: • SOAP web services at: http://marinemetadata.org/webservices • Ontology files at: http://marinemetadata.org/ns
Conclusions • Solving semantic interoperability issues is fun. • We need to relate data producers' vocabularies to standard vocabularies. • OWL is growing in popularity, and more tools will become available. • VOC2OWL can help you!
Our Guides Executive Committee • John Graybeal, MBARI. (PI) • Philip Bogden, SURA/SCOOP • Stephen Miller, SIO. • Francisco Chavez, MBARI. • Stephanie Watson, Texas A&M Steering Committee • Roy Lowry, BODC • Robert Arko, LDEO • Julie Bosch, NOAA • Ben Domenico, Unidata • Karen Stocks, SDSC • Steve Hankin, NOAA - Ocean.US/DMAC • Mark Musen, Stanford Univ • Michael Parke, Univ of Hawaii • Lola Olsen, NASA Goddard • Bob Weller, WHOI • Dawn Wright, Oregon State University
MMI: Your Handy Reference Guide • MMI: http://marinemetadata.org • Voc2OWL: http://marinemetadata.org/voc2owl • Vine: http://marinemetadata.org/vine • Help Line: ask@marinemetadata.org • Ontologies: http://marinemetadata.org/ns • Term Search: http://mmi.mbari.org:9600/mmi2/search.jsp • Tethys: http://marinemetadata.org/tethys