Page 1 ACHIEVING SEMANTIC INTEROPERABILITY WITH HYDROLOGIC ONTOLOGIES FOR THE WEB
6th International Conference on HydroScience and Engineering
Michael Piasecki, Luis Bermudez
Drexel University, College of Engineering
Page 2 Content
• Overview of metadata
• Metadata interoperability problems
• A possible solution: Hydrologic Ontologies for the Web
Page 3 Metadata
• Answers the what, when, where, how, who, and why of the described data.
• Helps to discover, access, evaluate, and use data.
Example metadata: Creator: USGS; Keyword: Gage Height
Page 4 Hydrologic Information Communities (HIC) need a metadata agreement
• Which descriptors can be used? Keyword or topic? Author or creator?
• Which values are possible? Gage height or water elevation?
Is there any metadata agreement available to describe hydrologic data?
Page 5 Metadata specifications related to hydrology
• ISO 19115:2003
• FGDC-STD-001-1998
• Ecological Metadata Language
• Geography Markup Language
• USGS Hydrologic Markup Language
• Earth Science Markup Language
• Dublin Core Metadata Initiative
Page 6 Problem 1: Metadata specifications lack domain-specific elements
• For example, they do not say whether area and outlet location should be defined when a watershed is described.
• For example, they do not incorporate a list of the possible stations and variables related to surface water collected by a particular HIC.
What is the problem with these omissions?
Page 7 HIC A creates an HTML form to collect metadata
Form fields:
• EX_GeographicIdentifier: geographicIdentifier, MD_Identifier, code …
• Descriptive Keywords: MD_Keywords, keyword
• Citation …
[Figure: HIC A operates stations #23, #34, and #56 in watershed W and measures stage height. With free-text fields, a user can enter code "#24" or keyword "Water elev.", so the metadata is not consistent.]
Page 8 Need to incorporate domain vocabulary to get consistent metadata
Form fields:
• EX_GeographicIdentifier: geographicIdentifier, MD_Identifier, code …
• Descriptive Keywords: MD_Keywords, keyword
• Citation …
[Figure: code is chosen from #23, #34, #56 and keyword from discharge, stage height, so both entries are consistent.]
Page 9 Problem 2: Metadata standards do not solve semantic heterogeneities
Search the metadata repository for: Stage Height
• Metadata (ISO) about dataset X: keyword = Stage Height, thesaurusName = GCMD
• Metadata (FGDC) about dataset Y: Theme_Keyword = Gage Height, Theme_Keyword_Thesaurus = USGS
The search finds only dataset X and not dataset Y.
Page 10 Possible solutions to our problems
How can domain vocabulary be incorporated into metadata specifications?
• Create a new metadata specification.
• Rewrite and extend an existing one.
• Hardcode semantics into the application.
• Dynamic extension with ontologies.
Page 11 Extending metadata specifications to meet the specific needs of a HIC
• Express metadata specifications and vocabularies as ontologies.
• Use the knowledge inference capabilities of ontologies to link the metadata elements with selected vocabulary terms.
Page 12 Ontologies
Specification of conceptualizations
Example:
1. Properties of real-world objects are identified (has water, is an inland body, has a defined channel).
2. Similarities are identified.
3. Concepts are created.
4. Each concept is expressed as a class.
5. Classes are related (class: Body of Water; subclasses: Lake, River).
Page 13 Web Ontology Language: OWL
• Expressed in XML
• W3C Recommendation since 02/2004
Example (Body of Water with subclasses River and Lake):
<owl:Class rdf:ID="Body_of_Water"/>
<owl:Class rdf:ID="River"><rdfs:subClassOf rdf:resource="#Body_of_Water"/></owl:Class>
<owl:Class rdf:ID="Lake"><rdfs:subClassOf rdf:resource="#Body_of_Water"/></owl:Class>
Page 14 Metadata specs expressed in ontologies
The specification maps to OWL classes (MD_Metadata, MD_Identification), object properties (identificationInfo), and datatype properties (fileIdentifier, language, abstract).
MD_Metadata
+ fileIdentifier[0..1] : CharacterString
+ language[0..1] : CharacterString
…
+ identificationInfo 1..* : MD_Identification
MD_Identification
…
+ abstract : CharacterString
…
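Page 15
Class: Hydrologic Unit
Subclasses: Region, Subregion, Accounting Unit, Cataloging Unit
Instances: Mid Atlantic (Region), Delaware (Subregion), Lower Delaware (Accounting Unit), Schuylkill (Cataloging Unit)
Asserted: Delaware isPartOf Mid Atlantic; Lower Delaware isPartOf Delaware; Schuylkill isPartOf Lower Delaware
isPartOf is transitive, so further isPartOf relations can be inferred (for example, Schuylkill isPartOf Mid Atlantic).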
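The inference on this slide can be sketched as a walk up the chain of asserted isPartOf facts. This is a minimal illustration, not the authors' implementation: the dictionary `ASSERTED` and the function `infer_is_part_of` are hypothetical names, and the direction of each fact (smaller unit is part of the larger one) is an assumption read from the figure.

```python
# Asserted isPartOf facts (child -> parent), assumed from the slide's figure.
ASSERTED = {
    "Schuylkill": "Lower Delaware",
    "Lower Delaware": "Delaware",
    "Delaware": "Mid Atlantic",
}

def infer_is_part_of(unit, asserted=ASSERTED):
    """Return every unit that `unit` is part of, following isPartOf transitively."""
    ancestors = []
    seen = {unit}  # guards against accidental cycles in the asserted facts
    current = asserted.get(unit)
    while current is not None and current not in seen:
        ancestors.append(current)
        seen.add(current)
        current = asserted.get(current)
    return ancestors

print(infer_is_part_of("Schuylkill"))
# ['Lower Delaware', 'Delaware', 'Mid Atlantic']
```

Only the first entry was asserted for Schuylkill; the other two follow from transitivity, which is what an OWL reasoner would derive from a transitive property declaration.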
Page 16 More about knowledge inference
Stations A and B are part of W; station C is part of Y:
<Station rdf:ID="A"> <isPartOf rdf:resource="#W"/> </Station>
<Station rdf:ID="B"> <isPartOf rdf:resource="#W"/> </Station>
<Station rdf:ID="C"> <isPartOf rdf:resource="#Y"/> </Station>
How can a program infer the stations that are only in W? Define a class by its property value:
<owl:Class rdf:ID="W-Station"> <!-- type of Station that has property isPartOf = W --> </owl:Class>
Inferred: W-Stations = A, B
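A rough sketch of what the program infers here: membership in W-Station is decided by the value of isPartOf, similar in spirit to an OWL restriction on that property. The names `IS_PART_OF` and `members_with_value` are hypothetical, and the lookup is a plain closed-world filter rather than a real reasoner.

```python
# Station -> unit it isPartOf, as asserted on the slide.
IS_PART_OF = {"A": "W", "B": "W", "C": "Y"}

def members_with_value(assertions, value):
    """Individuals whose property value equals `value`."""
    return sorted(s for s, v in assertions.items() if v == value)

W_STATIONS = members_with_value(IS_PART_OF, "W")
print(W_STATIONS)  # ['A', 'B']
```

Nobody lists the members of W-Station by hand; they follow from the station descriptions, so adding a new station in W extends the class automatically.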
Page 17 Dynamic extension with ontologies
e.g. Restrict the descriptor code to only have W-station values.
• Metadata specification: MD_Identifier (+ code : CharacterString …)
• Domain vocabulary: W-station (isPartOf = W), with members A and B
• Extension: MD_Identifier_Extension (+ code : CharacterString …) with a restriction onProperty: code, allValuesFrom: W-station
A program could infer the allowed values and build a dynamic HTML form using the extension, so that code can only be A or B.
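One way to picture the dynamic HTML form: the allowed values for `code` are computed from the vocabulary, then used both to render the field and to validate input. This is a sketch under assumptions; `allowed_codes`, `render_code_field`, and `is_valid_code` are hypothetical helpers, and the vocabulary is hard-coded instead of being read from an ontology.

```python
from html import escape

# Hypothetical domain vocabulary: station -> unit it isPartOf (from the figure).
IS_PART_OF = {"A": "W", "B": "W", "C": "Y"}

def allowed_codes(is_part_of, unit):
    """Values permitted for `code`: the stations that are part of `unit`."""
    return sorted(s for s, u in is_part_of.items() if u == unit)

def render_code_field(codes):
    """Build the <select> element for the `code` descriptor from the allowed values."""
    options = "".join(
        f'<option value="{escape(c)}">{escape(c)}</option>' for c in codes
    )
    return f'<select name="code">{options}</select>'

def is_valid_code(value, codes):
    """Reject any code outside the restricted vocabulary."""
    return value in codes

codes = allowed_codes(IS_PART_OF, "W")
print(render_code_field(codes))
print(is_valid_code("C", codes))  # False: C is part of Y, not W
```

Because the form is generated from the vocabulary, changing the vocabulary changes the form without editing the application.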
Page 18 Ontologies provide a means to resolve semantic heterogeneities
Page 19 Use of ontologies to map metadata specifications
<owl:Class rdf:about="&iso;MD_Keywords">
  <owl:equivalentClass rdf:resource="&fgdc;Keywords"/>
</owl:Class>
<owl:DatatypeProperty rdf:about="&iso;title">
  <owl:equivalentProperty rdf:resource="&fgdc;title"/>
</owl:DatatypeProperty>
Page 20 Use of ontologies to solve semantic heterogeneities among different domain vocabularies
<gcmd:Variable rdf:about="&gcmd;Stage_Height">
  <owl:sameAs rdf:resource="&noaa;stage"/>
  <owl:sameAs rdf:resource="&usgs;gage_Height"/>
  <owl:differentFrom rdf:resource="&events;Stage_Height"/>
</gcmd:Variable>
Page 21 Semantic interoperability
e.g. Search the metadata repository for: Stage Height
• Metadata (ISO) about dataset X: keyword = Stage Height, thesaurusName = GCMD
• Metadata (FGDC) about dataset Y: Theme_Keyword = Gage Height, Theme_Keyword_Thesaurus = USGS
• Mapper between metadata specifications: ISO and FGDC
• Mapper between hydrologic vocabularies: GCMD and USGS
The search finds datasets X and Y.
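The effect of the two mappers can be sketched with in-memory tables. Everything here is a hypothetical stand-in: `FIELD_MAP` plays the role of the mapping between metadata specifications, `SAME_AS` the mapping between vocabularies, and `REPOSITORY` the metadata repository. A real system would derive these from the ontologies rather than hard-code them.

```python
# Spec mapper: which fields hold the keyword and its thesaurus in each specification.
FIELD_MAP = {
    "ISO": ("keyword", "thesaurusName"),
    "FGDC": ("Theme_Keyword", "Theme_Keyword_Thesaurus"),
}

# Vocabulary mapper: groups of (thesaurus, term) pairs declared to mean the same thing.
SAME_AS = [frozenset({("GCMD", "Stage Height"), ("USGS", "Gage Height")})]

REPOSITORY = [
    {"dataset": "X", "spec": "ISO",
     "keyword": "Stage Height", "thesaurusName": "GCMD"},
    {"dataset": "Y", "spec": "FGDC",
     "Theme_Keyword": "Gage Height", "Theme_Keyword_Thesaurus": "USGS"},
]

def naive_search(term, repository=REPOSITORY):
    """Literal match on the ISO-style 'keyword' field only."""
    return [r["dataset"] for r in repository if r.get("keyword") == term]

def equivalents(term):
    """(thesaurus, term) pairs equivalent to `term`; empty if the term is unmapped."""
    for group in SAME_AS:
        if term in {label for _, label in group}:
            return group
    return frozenset()

def semantic_search(term, repository=REPOSITORY):
    """Match through both mappers: field names per spec, then equivalent terms."""
    wanted = equivalents(term)
    hits = []
    for record in repository:
        keyword_field, thesaurus_field = FIELD_MAP[record["spec"]]
        pair = (record[thesaurus_field], record[keyword_field])
        # Fall back to a literal match only when the term has no mapping at all.
        if pair in wanted or (not wanted and record[keyword_field] == term):
            hits.append(record["dataset"])
    return hits

print(naive_search("Stage Height"))     # ['X']
print(semantic_search("Stage Height"))  # ['X', 'Y']
```

Matching on (thesaurus, term) pairs rather than on the bare label keeps a term that merely shares the label in another vocabulary from being treated as equivalent.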
Page 22 Why is XML Schema not good enough?
Page 23 XML Schema cannot express semantics
E.g. defining that a watershed has only one outlet location and only one unique identifier:
…
<xsd:element name="watershed" type="watershedType"/>
<xsd:complexType name="watershedType">
  <xsd:sequence>
    <xsd:element name="outletLoc" type="xsd:nonNegativeInteger" minOccurs="1" maxOccurs="1"/>
    <xsd:element name="id" type="xsd:nonNegativeInteger" minOccurs="1" maxOccurs="1"/>
    …
  </xsd:sequence>
</xsd:complexType>
Page 24 XML Schema cannot express semantics
Document 1 (valid XML document):
… <watershed> <outletLoc>567</outletLoc> <name>X</name> <id>101</id> </watershed> …
Document 2 (valid XML document):
… <watershed> <outletLoc>838</outletLoc> <name>X</name> <id>101</id> </watershed> …
Semantically they are not correct: the same watershed (name X, id 101) has two different outlet locations (567 ≠ 838).
XML Schema is good for validating the structure of a document, but not its semantics.
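The missing check can be made concrete with a small procedural sketch: each document passes structural validation on its own, but comparing them reveals that watershed 101 has two outlet locations. The function `outlet_conflicts` is a hypothetical helper, and the elided parts of the slide's documents are left out; in an ontology this rule would be stated declaratively rather than coded by hand.

```python
import xml.etree.ElementTree as ET

# The two watershed fragments from the slide (surrounding content omitted).
DOC_1 = "<watershed><outletLoc>567</outletLoc><name>X</name><id>101</id></watershed>"
DOC_2 = "<watershed><outletLoc>838</outletLoc><name>X</name><id>101</id></watershed>"

def outlet_conflicts(documents):
    """Return {watershed id: sorted outlet locations} for ids with more than one outlet."""
    outlets = {}
    for text in documents:
        root = ET.fromstring(text)
        # iter() also yields the root itself when its tag is 'watershed'.
        for ws in root.iter("watershed"):
            outlets.setdefault(ws.findtext("id"), set()).add(ws.findtext("outletLoc"))
    return {wid: sorted(locs) for wid, locs in outlets.items() if len(locs) > 1}

print(outlet_conflicts([DOC_1, DOC_2]))  # {'101': ['567', '838']}
```

Hard-coding the rule like this is exactly the kind of application-level semantics the talk argues ontologies can replace.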
Page 25 Hydrologic ontologies will help to:
• Extend standards
• Solve semantic heterogeneities
• Interoperate between systems
  e.g. Find a numerical model and data to compute runoff for a specific location with a specific resolution.
• Provide systems engineering benefits
  • Efforts are not duplicated because the conceptual models can be reused and shared.
  • Semantics do not need to be hard-coded in computer programs.
Page 26 Acknowledgements
• Drexel team (Luis Bermudez, Saiful Islam, Bora Beran)
• Stephane Fellah (member, ISO TC 211 Canada team), who will submit 19115 in OWL to ISO as a draft document
• NOPP NAG 13 0040 (Web-based dissemination portal)
• NSF GEO Directorate grant from the EAR division to create Hydrologic Metadata for CUAHSI, prototype Hydrologic Information System (HIS), in the Neuse River Basin
• Discussion lists: Protégé, Jena, W3C