A Really Brief Crash Course in Semantic Web Technologies Rocky Dunlap Spencer Rugaber Georgia Tech
Languages you may encounter... • XML (eXtensible Markup Language) • XML Schema • XPath (navigate an XML document) • XQuery (query an XML document) • XSLT (Extensible Stylesheet Language Transformations) • RDF (Resource Description Framework) • RDF Schema • OWL (Web Ontology Language) • SPARQL (Query language for RDF triples) • SQL (Structured Query Language – for RDBMS) • UML (Unified Modeling Language – conceptual) • SKOS (Simple Knowledge Organization System) – glossary
XML • General purpose markup language • Mechanism for structured data exchange between heterogeneous systems • Basically: elements (tags) and attributes • Not really for human consumption, although it is easy for us to read and write in small amounts • An XML file is often called an instance document
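The "elements and attributes" point can be sketched with Python's standard library. The instance document below is hypothetical (its element and attribute names are illustrative assumptions, not taken from any Curator schema).

```python
import xml.etree.ElementTree as ET

# Hypothetical instance document; names are illustrative only.
DOC = """<meeting name="Curator meeting">
  <location>GFDL</location>
  <starts>2007-10-18</starts>
  <ends>2007-10-19</ends>
</meeting>"""

root = ET.fromstring(DOC)

# Attributes live on the element itself; child elements carry nested content.
meeting_name = root.attrib["name"]
fields = {child.tag: child.text for child in root}

print(meeting_name)
print(fields)
```

The same data could have been modeled with attributes only or elements only; XML leaves that choice to the schema designer.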
XML Schema • Defines the allowed structure of a set of instance documents • Defines a set of “types” -- valid chunks of XML • Typically the schema is defined up front and applications are written to process valid or schema-conforming instance documents • The schema is a way to achieve standardization – like a contract • “If you provide a valid document, we’ll provide you with tools that do X, Y, and Z.”
RDF • A knowledge representation language • Conceptual in nature • It is independent of XML, although an XML serialization (RDF/XML) exists • A way to make statements about pretty much anything you want: • “The Curator meeting is at GFDL.” • “The Curator meeting is Oct 18-19.” • “Balaji works at GFDL.”
RDF Statements: “The Curator meeting is at GFDL.”
  Curator meeting (subject) --hasLocation (predicate)--> GFDL (object)
RDF Statements: “The Curator meeting is Oct 18-19.”
  Curator meeting --hasLocation--> GFDL (resource)
  Curator meeting --starts--> “18 Oct 2007” (literal)
  Curator meeting --ends--> “19 Oct 2007” (literal)
RDF Statements: “Balaji works at GFDL.”
  Balaji --worksAt--> GFDL
  Curator meeting --hasLocation--> GFDL
  Curator meeting --starts--> “18 Oct 2007”
  Curator meeting --ends--> “19 Oct 2007”
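The three statements can be sketched as a set of (subject, predicate, object) triples. This is a minimal illustration in plain Python, not an RDF library; short strings stand in for the URIs a real store would use.

```python
# Each statement is one (subject, predicate, object) triple.
# Plain strings stand in for URIs; this is an illustrative assumption.
triples = {
    ("CuratorMeeting", "hasLocation", "GFDL"),
    ("CuratorMeeting", "starts", "18 Oct 2007"),
    ("CuratorMeeting", "ends", "19 Oct 2007"),
    ("Balaji", "worksAt", "GFDL"),
}

def objects(subject, predicate):
    """Return all objects asserted for a subject and predicate."""
    return {o for s, p, o in triples if s == subject and p == predicate}

def subjects_pointing_at(obj):
    """Return every subject linked to obj by any predicate."""
    return {s for s, p, o in triples if o == obj}

print(objects("CuratorMeeting", "hasLocation"))
print(subjects_pointing_at("GFDL"))
```

Because both the meeting and Balaji point at the same GFDL node, separate statements join into one graph without any extra bookkeeping.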
RDF XML Representation
  <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
           xmlns:esc="http://www.earthsystemcurator.org">
    <rdf:Description rdf:about="http://....#OctCuratorMeeting">
      <esc:hasLocation rdf:resource="http://....#GFDL"/>
      <esc:starts>18 Oct 2007</esc:starts>
      <esc:ends>19 Oct 2007</esc:ends>
    </rdf:Description>
    <rdf:Description rdf:about="http://....#Balaji">
      <esc:worksAt rdf:resource="http://....#GFDL"/>
    </rdf:Description>
  </rdf:RDF>
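A rough sketch of how the XML form maps back to triples, using only the standard library. It handles just the simple rdf:Description pattern shown on the slide and is not a general RDF/XML parser. The slide elides its resource URIs, so http://example.org/curator is a placeholder assumption.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Placeholder URIs (assumption): the slide elides the real ones.
RDF_XML = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:esc="http://www.earthsystemcurator.org">
  <rdf:Description rdf:about="http://example.org/curator#OctCuratorMeeting">
    <esc:hasLocation rdf:resource="http://example.org/curator#GFDL"/>
    <esc:starts>18 Oct 2007</esc:starts>
  </rdf:Description>
</rdf:RDF>"""

def extract_triples(rdf_xml):
    """Turn each property element under rdf:Description into one triple."""
    root = ET.fromstring(rdf_xml)
    triples = []
    for desc in root.findall(f"{{{RDF_NS}}}Description"):
        subject = desc.get(f"{{{RDF_NS}}}about")
        for prop in desc:
            # ElementTree tags look like "{namespace}local"; keep the local
            # name for readability (RDF proper uses the full URI).
            predicate = prop.tag.split("}", 1)[1]
            resource = prop.get(f"{{{RDF_NS}}}resource")
            if resource is not None:
                triples.append((subject, predicate, ("resource", resource)))
            else:
                triples.append((subject, predicate, ("literal", prop.text)))
    return triples

for triple in extract_triples(RDF_XML):
    print(triple)
```

The rdf:resource attribute marks an object that is itself a node in the graph, while element text becomes a literal value.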
RDF Schema • Defines a domain-specific data model for RDF • Includes classes and properties (along with subclasses and subproperties) • Properties are first class (they are not defined as part of a particular class)
RDF Schema
  Classes: Event, Flight, Meeting, Person, Place
  Properties:
    hasLocation  domain: Event   range: Place
    starts       domain: Event   range: date
    ends         domain: Event   range: date
    worksAt      domain: Person  range: Place
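In RDF Schema, domain and range act as inference rules rather than validation checks: using a property implies the class of its subject and object. A minimal sketch of that rule, with the property table copied from the slide and short strings standing in for URIs (an assumption):

```python
# Property declarations from the slide: property -> (domain, range).
PROPERTIES = {
    "hasLocation": ("Event", "Place"),
    "starts": ("Event", "date"),
    "ends": ("Event", "date"),
    "worksAt": ("Person", "Place"),
}
# Ranges that denote literal datatypes rather than classes of resources.
LITERAL_TYPES = {"date"}

def infer_types(triples):
    """Apply the domain rule to subjects and the range rule to resource objects."""
    inferred = set()
    for s, p, o in triples:
        domain, range_ = PROPERTIES[p]
        inferred.add((s, "rdf:type", domain))
        if range_ not in LITERAL_TYPES:
            inferred.add((o, "rdf:type", range_))
    return inferred

facts = [
    ("CuratorMeeting", "hasLocation", "GFDL"),
    ("CuratorMeeting", "starts", "18 Oct 2007"),
    ("Balaji", "worksAt", "GFDL"),
]
types = infer_types(facts)
print(sorted(types))
```

Nothing in the input says that Balaji is a Person; that follows from the declared domain of worksAt.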
OWL (Web Ontology Language) • Builds on RDF by adding increased expressivity • Every OWL file is RDF (but not necessarily the reverse)
RDF vs. OWL
  RDF: Classes, Subclasses, Properties, Subproperties, Individuals
  OWL adds:
  • Property constraints: allValuesFrom, someValuesFrom, hasValue
  • Cardinality constraints on properties: cardinality (exact), minCardinality, maxCardinality
  • Class definitions: intersection, union, complement, equivalentClass, disjointWith, oneOf (enum)
  • Transitive Properties
  • Symmetric Properties
  • Individuals: sameAs, differentFrom
Things you can NOT say in RDF, but can say in OWL • The class TriangularUnstructuredGrid is the intersection of TriangularGrid and UnstructuredGrid • UnstructuredGrid is the complement of StructuredGrid • A Dataset is generated by exactly one Model • A Model is made up of at least one Component • An AtmosphereComponent is a Component with ScienceType equal to “Atmosphere” • subComponent is transitive: X subComponent Y, Y subComponent Z ⇒ X subComponent Z
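The transitivity rule in the last bullet is the kind of inference an OWL reasoner performs. A minimal sketch of it in plain Python, using the slide's X, Y, Z as node names (this illustrates the rule only and is not how any particular reasoner is implemented):

```python
def transitive_closure(pairs):
    """Repeatedly add (a, d) whenever (a, b) and (b, d) are both present."""
    closure = set(pairs)
    while True:
        new = {(a, d) for a, b in closure for c, d in closure if b == c} - closure
        if not new:
            return closure
        closure |= new

# Asserted subComponent statements from the slide.
asserted = {("X", "Y"), ("Y", "Z")}
inferred = transitive_closure(asserted)
print(sorted(inferred))
```

The pair ("X", "Z") is never asserted; it appears only because the property is declared transitive.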
Things you can NOT say in RDF, but can say in OWL • The class Model is equivalent to ConfiguredModel • ScienceType is the exact enumeration Atmosphere, Ocean, Ice, and Land • ObservationDataset is disjoint from SimulationDataset • Dataset123 is the same object as DatasetXYZ
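A disjointness axiom lets a reasoner detect contradictions: an individual asserted to belong to both classes makes the ontology inconsistent. A minimal sketch of that check follows; the individual names DatasetA and DatasetB are hypothetical.

```python
# Disjointness axiom from the slide, stored as an unordered pair of classes.
DISJOINT = [frozenset({"ObservationDataset", "SimulationDataset"})]

def disjointness_violations(types_by_individual):
    """Return individuals asserted to belong to two disjoint classes."""
    violations = []
    for individual, classes in sorted(types_by_individual.items()):
        if any(pair <= classes for pair in DISJOINT):
            violations.append(individual)
    return violations

# Hypothetical individuals for illustration.
asserted_types = {
    "DatasetA": {"ObservationDataset"},
    "DatasetB": {"ObservationDataset", "SimulationDataset"},
}
print(disjointness_violations(asserted_types))
```

Without the disjointWith statement, nothing would prevent DatasetB from being both an observation and a simulation dataset.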
SPARQL • A language for querying RDF/OWL triples • Example query:
  PREFIX foaf: <http://xmlns.com/foaf/0.1/>
  SELECT ?x ?name
  WHERE { ?x foaf:name ?name }
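The example query asks for every pair (?x, ?name) such that a triple "?x foaf:name ?name" exists. A toy matcher for a single triple pattern shows the idea; it is not a SPARQL engine, and the example.org URIs and data are assumptions.

```python
FOAF_NAME = "http://xmlns.com/foaf/0.1/name"

# Hypothetical data; example.org URIs are placeholders.
triples = [
    ("http://example.org/people#balaji", FOAF_NAME, "Balaji"),
    ("http://example.org/people#balaji",
     "http://example.org/terms#worksAt", "http://example.org/orgs#GFDL"),
    ("http://example.org/people#rocky", FOAF_NAME, "Rocky Dunlap"),
]

def match(pattern, triples):
    """Return one binding dict per triple that matches the pattern.

    Terms starting with '?' are variables; other terms must match exactly.
    """
    results = []
    for triple in triples:
        binding = {}
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # A repeated variable must bind to the same value each time.
                if binding.setdefault(term, value) != value:
                    break
            elif term != value:
                break
        else:
            results.append(binding)
    return results

rows = match(("?x", FOAF_NAME, "?name"), triples)
for row in rows:
    print(row["?x"], row["?name"])
```

The worksAt triple is skipped because its predicate does not match; a full SPARQL engine additionally joins bindings across several patterns.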
Curator’s Current Strategy • Curator data model written in XML Schema • Models and Datasets (Resources*) annotated with conforming XML instance documents • Portions of XML translated into RDF and exposed by CDP-Curator faceted search • This means: • Low level details remain in XML instance • Higher level concepts pulled out into the RDF • Can we confirm this strategy?
Technical Challenges • XML to RDF translation • From hierarchical, low level to graph-based, conceptual • Is there a need to go from RDF back to XML? • What stays in XML? What goes to RDF? • Automation of translation • Schema level (e.g., schema evolution) • Instance level (e.g., submission of new resource to CDP-Curator)
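One way to picture instance-level translation is a small function that walks the XML tree, turns nesting into explicit edges, and promotes only selected fields to RDF. This is a sketch under stated assumptions: the instance document, the `id` attribute convention, the `PROMOTED` set, and the `has<Tag>` predicate naming are all hypothetical and do not describe the actual Curator translation.

```python
import xml.etree.ElementTree as ET

# Hypothetical instance document; the real Curator schema is not shown here.
XML_DOC = """<model id="ModelA">
  <component id="AtmosComp">
    <scienceType>Atmosphere</scienceType>
    <timestep>1800</timestep>
  </component>
</model>"""

# Only these element names are promoted to RDF; everything else stays in XML.
PROMOTED = {"scienceType"}

def xml_to_triples(element, triples=None):
    """Translate identified elements and promoted fields into triples."""
    triples = [] if triples is None else triples
    subject = element.get("id")
    for child in element:
        if child.get("id") is not None:
            # Nesting becomes an explicit edge between two resources.
            predicate = "has" + child.tag.capitalize()
            triples.append((subject, predicate, child.get("id")))
            xml_to_triples(child, triples)
        elif child.tag in PROMOTED:
            triples.append((subject, child.tag, child.text))
    return triples

result = xml_to_triples(ET.fromstring(XML_DOC))
for triple in result:
    print(triple)
```

Here the low-level timestep value stays in the XML instance while scienceType is pulled into the graph, which mirrors the "what stays, what goes" question above.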