500 likes | 624 Views
SPARQL Query Language for RDF. Agenda. Introduction Graph Patterns Query Execution and Ordering Query Forms Testing Values SPARQL Support. The XML approach is to "wrap" each data item in start/end tags. <Aircraft> <wingspan> 14.8 meters </wingspan>
E N D
Agenda • Introduction • Graph Patterns • Query Execution and Ordering • Query Forms • Testing Values • SPARQL Support
The XML approach is to "wrap" each data item in start/end tags <Aircraft> <wingspan>14.8 meters</wingspan> <weight>512 kilograms</weight> <cruise-speed>70 knots</cruise-speed> <range>400 nautical miles</range> <description> medium-altitude, long-endurance unmanned aerial vehicle </description> </Aircraft> RQ-1.xml
Why use OWL? • The purpose of this document is to describe the role that OWL plays in data interoperability. [Note: this is not the only use of OWL, but it is an important one.] • Contents: • Understanding Syntax versus Semantics • An example that shows why standardizing syntax is necessary but not sufficient • Migrating from defining semantics on a per-application basis to standardized semantics
Syntax versus Semantics • Syntax: the structure of your data • e.g., XML mandates that you structure your data by "wrapping" each data item within a start tag and an end tag pair, with the end tag being preceded by / and both tags in <…> brackets. • That is, XML specifies the syntax of your data. • Semantics: the meaning of your data • Two conditions necessary for interoperability: 1. Adopt a common syntax: this enables applications to parse the data. XML provides a common syntax, and thus is a critical first step. 2. Adopt a means for understanding the semantics: this enables applications to use the data. OWL provides a standard way of expressing the semantics.
What is this XML snippet talking about, i.e., what are the semantics? What is a Predator? <Predator> … </Predator>
Predator - which one? • Predator: a medium-altitude, long-endurance unmanned aerial vehicle system. • Predator : one that victimizes, plunders, or destroys, especially for one's own gain. • Predator : an organism that lives by preying on other organisms. • Predator: a company which specializes in camouflage attire. • Predator: a video game. • Predator: software for machine networking. • Predator: a chain of paintball stores.
Meaning (semantics) applied on a per-application basis Semantics: A Predator is type of Aircraft. Actions: These actions must be performed on the Predator data: - identify ground control station. - determine onboard sensors. - determine ordnance. <Predator> … </Predator> application
Meaning (semantics) applied on a per-application basis XML app#1 Semantics: Code to interpret the data Action: Code to process the data app#2 Semantics: Code to interpret the data Action: Code to process the data
Problem with attaching semantics on a per-application basis Problems with burying semantic definitions within each application: - Duplicate effort - Each application must express the semantics - Variability of interpretation - Each application can take its own interpretation - Example: Mars probe disaster - one application interpreted the data in inches, another application interpreted the data in centimeters. - No ad-hoc discovery and exploitation - Applications have the semantics pre-wired. Thus, when new data (e.g., new type of aircraft) is encountered an application may not be able to effectively process it. This makes for brittle applications. application Semantics: Code to interpret the data Action: Code to process the data What's a better approach?
Better approach:(1) Extricate semantic definitions from applications (2) Express semantic definitions in a standard vocabulary XML app#1 Action: Code to process the data app#2 Action: Code to process the data OWL Document Semantic Definitions
OWL provides an agreed-upon vocabulary for expressing semantics A Sampling of the OWL Vocabulary: subClassOf: this OWL element is used to assert that one class of items is a subset of another class of items. Example: Predator is a subClassOfAircraft. FunctionalProperty: this OWL element is used to assert that a property has a unique value. Example: sensorID is a FunctionalProperty, i.e., sensorID has a unique value. equivalentClass: this OWL element is used to assert that one Class is equivalent to another Class. Example: Platform is an equivalentClass to Aircraft.
OWL Enables Machines to Understand Data! Semantics OWL XML/DTD/XML Schemas Syntax OWL enables machine-processable semantics!
Ontology (definition) • An Ontology is the collection of semantic definitions for a domain. • Example: an Aircraft Ontology is the set of semantic definitions for the Aircraft domain, e.g., • Predator is a subClassOfAircraft. • sensorID is a FunctionalProperty. • Platform is an equivalentClass to Aircraft.
Why use OWL? • Benefits to application developers: • Less code to write (save $$$). • Less chance of misinterpretation (save $$$). • Benefits to community at large: • Everyone can understand each other's data's semantics, since they are in a common language. • OWL uses the XML syntax to express semantics, i.e., it builds on an existing technology. • Don't have to learn new syntax. • Common XML tools (e.g., parsers) can work on OWL. • OWL will soon be a W3C recommendation.
XQuery Databases • XQuery is an XML query language. • It can be used to efficiently and easily extract information from Native XML Databases (NXD). • It can be used to query XML views of relational data.
Database Systems supporting XQuery • The following database systems offer XQuery support: • Native XML Databases: • Berkeley DB XML • eXist • MarkLogic • Software AG Tamino • Raining Data TigerLogic • Documentum xDb (X-Hive/DB) • Relational Databases: • IBM DB2 • Microsoft SQL Server • Oracle
Native XML Database (NXD) • Native XML databases have an XML-based internal model, i.e., their fundamental unit of storage is XML.
Introduction • RDF – flexible and extensible way to represent information about WWW resources • SPARQL - query language for getting information from RDF graphs. It provides facilities to: • extract information in the form of URIs, blank nodes, plain and typed literals. • extract RDF subgraphs. • construct new RDF graphs based on information in the queried graphs • matching graph patterns • variables – global scope; indicated by ‘?‘ or ‘$‘ • query terms – based on Turtle syntax • terms delimited by "<>" are relative URI references • data description format - Turtle
Graph Patterns Basic Graph Pattern – set of Triple Patterns Group Pattern - a set of graph patterns must all match Value Constraints - restrict RDF terms in a solution Optional Graph Patterns .- additional patterns may extend the solution Alternative Graph Pattern – two or more possible patterns are tried Patterns on Named Graphs - patterns are matched against named graphs
Basic Graph Pattern • Set of Triple Patterns • Triple Pattern – similar to an RDF Triple (subject, predicate, object), but any component can be a query variable; literal subjects are allowed • Matching a triple pattern to a graph: bindings between variables and RDF Terms • Matching of Basic Graph Patterns • A Pattern Solution of Graph Pattern GP on graph G is any substitution S such that S(GP) is a subgraph of G.
Basic Graph Pattern - Multiple Matches Data Query Group Graph Pattern (set of graph patterns) also! Query Result
Basic Graph Pattern - Blank Nodes Data Query Query Result
Graph Patterns Basic Graph Pattern – set of Triple Patterns Group Pattern - a set of graph patterns must all match Value Constraints - restrict RDF terms in a solution Optional Graph Patterns .- additional patterns may extend the solution Alternative Graph Pattern – two or more possible patterns are tried Patterns on Named Graphs - patterns are matched against named graphs
Graph Patterns Basic Graph Pattern – set of Triple Patterns Group Pattern - a set of graph patterns must all match Value Constraints - restrict RDF terms in a solution Optional Graph Patterns .- additional patterns may extend the solution Alternative Graph Pattern – two or more possible patterns are tried Patterns on Named Graphs - patterns are matched against named graphs
Value Constraints Data Query Query Result
Graph Patterns Basic Graph Pattern – set of Triple Patterns Group Pattern - a set of graph patterns must all match Value Constraints - restrict RDF terms in a solution Optional Graph Patterns .- additional patterns may extend the solution Alternative Graph Pattern – two or more possible patterns are tried Patterns on Named Graphs - patterns are matched against named graphs
Optional graph patterns Data Query Query Result
Multiple Optional Blocks Data Query Query Result
Graph Patterns Basic Graph Patterns – set of Triple Patterns Group Patterns - a set of graph patterns must all match Value Constraints - restrict RDF terms in a solution Optional Graph Patterns .- additional patterns may extend the solution Alternative Graph Patterns – two or more possible patterns are tried Patterns on Named Graphs - patterns are matched against named graphs
Alternative Graph Patterns Data Query Query Result
Graph Patterns Basic Graph Pattern – set of Triple Patterns Group Pattern - a set of graph patterns must all match Value Constraints - restrict RDF terms in a solution Optional Graph Patterns .- additional patterns may extend the solution Alternative Graph Pattern – two or more possible patterns are tried Patterns on Named Graphs - patterns are matched against named graphs
RDF Dataset • RDF data stores may hold multiple RDF graphs: • record information about each graph • queries that involve information from more than one graph • RDF Dataset in SPARQL terminology • the background graph, which does not have a name, and zero or more named graphs, identified by URI reference • the relationship between named and background graphs: • (i) to have information in the background graph that includes provenance information about the named graphs (the application is not directly trusting the information in the named graphs ) • (ii) to include the information in the named graphs in the background graph as well.
RDF Dataset- The Relationship between Named and Background Graphs (I)
RDF Dataset- The Relationship between Named and Background Graphs (II)
Query Execution and Ordering • Optional-1: an optional pattern that has a common variable with a(more) basic graph pattern(s) must be executed after the basic graph pattern(s) • Optional-2: there can't be two optionals with a common variable, if that variable does not occur in a basic graph pattern as well • Constraint: constraints are evaluated after variables are assigned values
Query forms: • SELECT • returns all, or a subset of the variables bound in a query pattern match • formats : XML or RDF/XML • CONSTRUCT • returns an RDF graph constructed by substituting variables in a set of triple templates • DESCRIBE • returns an RDF graph that describes the resources found. • ASK • returns whether a query pattern matches or not.
CONSTRUCT Examples(I) PREFIX foaf: <http://xmlns.com/foaf/0.1/> PREFIX vcard: <http://www.w3.org/2001/vcard-rdf/3.0#> CONSTRUCT { <http://example.org/person#Alice> vcard:FN ?name } WHERE { ?x foaf:name ?name }
CONSTRUCT Examples(II) accesing a graph conditional on other information contained in the metadata about named graphs in the dataset
Testing Values • Named functions and syntactically constructed operations: • operands: subset of XML Schema DataTypes {xsd:string, xsd:decimal, xsd:double, xsd:dateTime} and types derived from xsd:decimal. • Subset of XPath functions and operators • Operands: xs:string, xs:double, xs:float, xs:decimal, xs:integer, xs:dateTime • additional operators: sop:RDFterm-equal, sop:bound , sop:isURI, sop:isBlank, sop:isLiteral, sop:str , sop:lang, sop:datatype, sop:logical-or, sop:logical-and • Type Promotion : xs:double, xs:float, xs:decimal • each of the numeric types is promoted to any type higher in the above list when used as an argument to function expecting that higher type.
Support for SPARQL • SPARQL and Jena • module called ARQ that implements SPARQL; also parses queries expressed in RDQL or its own internal language. • not yet part of the standard Jena distribution; availbale from either Jena‘s CVS repository or as a self-contained download • Twinkle • simple Java interface that wraps the ARQ SPARQL Processor library (the add-on to Jena). • Redland • set of free software packages that provide support for RDF, including querying with RDQL and SPARQL using the Rasqal RDF Query Library. .