SPARQL Intro: A query language for RDF
Alexandra Cristea & Matthew Yau

What is SPARQL
• A query language for RDF(S)
• W3C Recommendation, 15th January 2008
• Provides a standard format for writing queries that target RDF data
• and a set of standard rules for processing queries and returning the results
• http://www.w3.org/TR/rdf-sparql-query/
• Read also: course website!

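RDF Statements
• An RDF statement is a triple: Subject, Predicate, Object.
• Example: subject http://www.w3schools.com/RDF, predicate author, object Jan Egil Refsnes
• A triple pattern may use a variable in any position: ?subject ?predicate ?object
• SPARQL searches for all sub-graphs that match the graph described by the triples in the query.
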
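The matching idea can be sketched in Python. This only illustrates the semantics and is not how a real SPARQL engine works; the helper names `match_triple` and `select` and the second sample triple are assumptions made for the example.

```python
# A tiny RDF graph as a set of (subject, predicate, object) tuples.
GRAPH = {
    ("http://www.w3schools.com/RDF", "author", "Jan Egil Refsnes"),
    ("http://www.w3schools.com/RDF", "title", "RDF Tutorial"),  # hypothetical extra triple
}

def match_triple(pattern, triple):
    """Return variable bindings if `pattern` matches `triple`, else None.

    Terms starting with '?' are variables; every other term must be equal.
    """
    bindings = {}
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            if bindings.setdefault(term, value) != value:
                return None  # same variable would need two different values
        elif term != value:
            return None
    return bindings

def select(pattern, graph):
    """Collect the bindings for every triple in `graph` that matches `pattern`."""
    results = []
    for triple in sorted(graph):
        bindings = match_triple(pattern, triple)
        if bindings is not None:
            results.append(bindings)
    return results

print(select(("?subject", "author", "?object"), GRAPH))
```

A pattern made only of variables, such as ("?s", "?p", "?o"), matches every triple in the graph.
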
A sample of SPARQL (Turtle notation)
    SELECT ?student
    WHERE { ?student b:studies bmod:CS328 }
• Similar to SQL

Prefixes & namespaces
In RDF:
    <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
             xmlns:cd="http://www.recshop.fake/cd#">
In SPARQL:
    PREFIX b: <http://...>
    PREFIX bmod: <http://www2.warwick.ac.uk/fac/sci/dcs/teaching/material/>
• The PREFIX keyword is SPARQL’s version of an xmlns: namespace declaration and works in basically the same way.

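What a prefix declaration does can be sketched in Python: a prefixed name is shorthand for the prefix IRI followed by the local part. The function name `expand` is an assumption for this example, and the b: prefix is left out because its IRI is elided on the slide.

```python
# Prefix table mirroring PREFIX declarations that appear in these slides.
PREFIXES = {
    "bmod": "http://www2.warwick.ac.uk/fac/sci/dcs/teaching/material/",
    "foaf": "http://xmlns.com/foaf/0.1/",
}

def expand(prefixed_name, prefixes):
    """Expand 'prefix:local' into a full IRI using the declared prefixes."""
    prefix, sep, local = prefixed_name.partition(":")
    if not sep or prefix not in prefixes:
        raise KeyError(f"undeclared prefix in {prefixed_name!r}")
    return prefixes[prefix] + local

print(expand("bmod:CS328", PREFIXES))
print(expand("foaf:name", PREFIXES))
```
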
SPARQL basics
• Different syntax from XML
• Variables begin with ? or $
• Triple patterns are enclosed within braces {}
• Result: the variables listed after SELECT (as in SQL)

Combining conditions
    PREFIX b: <http://www2.warwick.ac.uk/rdf/>
    PREFIX bmod: <http://www2.warwick.ac.uk/fac/sci/dcs/teaching/material/>
    PREFIX foaf: <http://xmlns.com/foaf/0.1/>
    SELECT ?name
    WHERE {
      ?student b:studies bmod:CS328 .
      ?student foaf:name ?name
    }

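Two triple patterns that share a variable act as a join: ?student must take the same value in both. The Python sketch below illustrates this under stated assumptions. The student identifiers and names are invented sample data, and `extend` and `solve` are hypothetical helpers, not part of any SPARQL library.

```python
# Invented sample data: (subject, predicate, object) triples.
GRAPH = [
    ("stu:alice", "b:studies", "bmod:CS328"),
    ("stu:bob", "b:studies", "bmod:CS909"),
    ("stu:alice", "foaf:name", "Alice"),
    ("stu:bob", "foaf:name", "Bob"),
]

def extend(pattern, triple, binding):
    """Extend `binding` so that `pattern` equals `triple`; None if impossible."""
    new = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            if new.setdefault(term, value) != value:
                return None  # conflicts with an earlier binding
        elif term != value:
            return None
    return new

def solve(patterns, graph, binding=None):
    """Yield every binding that satisfies all patterns at once."""
    binding = {} if binding is None else binding
    if not patterns:
        yield binding
        return
    first, rest = patterns[0], patterns[1:]
    for triple in graph:
        extended = extend(first, triple, binding)
        if extended is not None:
            yield from solve(rest, graph, extended)

query = [
    ("?student", "b:studies", "bmod:CS328"),
    ("?student", "foaf:name", "?name"),
]
names = [b["?name"] for b in solve(query, GRAPH)]
print(names)
```
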
FOAF
• FOAF (Friend Of A Friend)
• An experimental project using RDF that defines a standardised vocabulary.
• Goal: make personal homepages machine-readable and understandable, and create an internet-wide connected database of people.
• Defined at: http://xmlns.com/foaf/0.1/

FOAF
• Idea: most personal homepages contain similar information.
• E.g., a person’s name, where they live, where they work, details of their current work, links to friends.
• FOAF defines RDF predicates to represent these.
• So pages can be understood and manipulated by computers.
• So a database can be queried for:
  • “What projects are my friends working on?”
  • “Do any of my friends know the director of BigCorp?” etc.

A sample FOAF RDF document
    <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
             xmlns:foaf="http://xmlns.com/foaf/0.1/">
      <foaf:Person>
        <foaf:name>Matthew Yau</foaf:name>
        <foaf:mbox rdf:resource="mailto:C.Y.Yau@warwick.ac.uk"/>
      </foaf:Person>
    </rdf:RDF>

FOAF: another example
    <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
             xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
             xmlns:foaf="http://xmlns.com/foaf/0.1/">
      <foaf:Person>
        <foaf:name>Leigh Dodds</foaf:name>
        <foaf:firstName>Leigh</foaf:firstName>
        <foaf:surname>Dodds</foaf:surname>
        <foaf:mbox_sha1sum>71b88e951</foaf:mbox_sha1sum>
        <foaf:knows>
          <foaf:Person>
            <foaf:name>Dan Brickley</foaf:name>
            <foaf:mbox_sha1sum>241021fb0</foaf:mbox_sha1sum>
            <rdfs:seeAlso rdf:resource="http://rdfweb.org/people/danbri/foaf.rdf"/>
          </foaf:Person>
        </foaf:knows>
      </foaf:Person>
    </rdf:RDF>

Extracting multiple results
    PREFIX b: <http://www2.warwick.ac.uk/rdf/>
    PREFIX bmod: <http://www2.warwick.ac.uk/fac/sci/dcs/teaching/material/>
    PREFIX foaf: <http://xmlns.com/foaf/0.1/>
    SELECT ?module ?name
    WHERE {
      ?student b:studies ?module .
      ?student foaf:name ?name
    }

Using the same subject
    PREFIX b: <http://www2.warwick.ac.uk/rdf/>
    PREFIX bmod: <http://www2.warwick.ac.uk/fac/sci/dcs/teaching/material/>
    PREFIX foaf: <http://xmlns.com/foaf/0.1/>
    SELECT ?module ?name
    WHERE {
      ?student b:studies ?module ;
               foaf:name ?name
    }

Abbreviating multiple objects
    SELECT ?module ?name
    WHERE {
      ?student b:studies ?module .
      ?student b:studies bmod:CS328 ;
               foaf:name ?name
    }
is identical to:
    SELECT ?module ?name
    WHERE {
      ?student b:studies ?module , bmod:CS328 ;
               foaf:name ?name
    }

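The two abbreviations are purely syntactic: ";" repeats the subject and "," repeats the subject and predicate. A small Python sketch makes the expansion explicit. The function name `expand_abbreviation` and its input shape are assumptions chosen for the example.

```python
def expand_abbreviation(subject, predicate_objects):
    """Expand the ';' and ',' shorthand into full triple patterns.

    `predicate_objects` is a list of (predicate, [objects]) pairs:
    ';' separates the pairs, ',' separates the objects within one pair.
    """
    return [
        (subject, predicate, obj)
        for predicate, objects in predicate_objects
        for obj in objects
    ]

short_form = expand_abbreviation(
    "?student",
    [("b:studies", ["?module", "bmod:CS328"]), ("foaf:name", ["?name"])],
)
long_form = [
    ("?student", "b:studies", "?module"),
    ("?student", "b:studies", "bmod:CS328"),
    ("?student", "foaf:name", "?name"),
]
print(short_form == long_form)
```
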
Optional graph components
    SELECT ?student ?email
    WHERE {
      ?student b:studies bmod:CS328 .
      ?student foaf:mbox ?email
    }
• Problem: if a student has no e-mail address registered with a foaf:mbox predicate, the query will not match that student!

Optional graph components
    SELECT ?student ?email
    WHERE {
      ?student b:studies bmod:CS328 .
      OPTIONAL { ?student foaf:mbox ?email }
    }
• OPTIONAL: match the pattern if possible, but do not reject the overall pattern if it cannot be matched.

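The difference between a required and an optional pattern resembles an inner join versus a left outer join. The Python sketch below uses invented students and e-mail addresses, and assumes each student has at most one module and one mailbox to keep the code short.

```python
# Invented sample data: student -> module, and student -> mailbox.
STUDIES = {"stu:alice": "bmod:CS328", "stu:bob": "bmod:CS328", "stu:carol": "bmod:CS909"}
MBOX = {"stu:alice": "mailto:alice@example.org"}

def students_with_optional_email(module):
    """OPTIONAL { ?student foaf:mbox ?email }: keep the row, bind ?email if known."""
    rows = []
    for student in sorted(STUDIES):
        if STUDIES[student] != module:
            continue
        row = {"?student": student}
        if student in MBOX:
            row["?email"] = MBOX[student]
        rows.append(row)
    return rows

def students_with_required_email(module):
    """Without OPTIONAL: rows lacking a mailbox are dropped entirely."""
    return [row for row in students_with_optional_email(module) if "?email" in row]

print(students_with_optional_email("bmod:CS328"))
print(students_with_required_email("bmod:CS328"))
```

With OPTIONAL, stu:bob still appears, just without a value for ?email.
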
More optional graph components … and filters
    SELECT ?module ?name ?phone
    WHERE {
      ?student b:studies ?module .
      ?student foaf:name ?name .
      OPTIONAL {
        ?student b:contactpermission true .
        ?student b:phone ?phone
      }
    }

    SELECT ?module ?name ?age
    WHERE {
      ?student b:studies ?module .
      ?student foaf:name ?name .
      OPTIONAL {
        ?student b:age ?age .
        FILTER (?age > 25)
      }
    }

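A FILTER placed inside an OPTIONAL block restricts only the optional part: a student who fails the filter still appears, with ?age left unbound. The Python sketch below illustrates this with invented students and ages; `None` stands in for an unbound variable.

```python
# Invented sample data; stu:carol has no recorded age.
STUDENTS = ["stu:alice", "stu:bob", "stu:carol"]
AGES = {"stu:alice": 31, "stu:bob": 22}

def optional_age_over(limit):
    """OPTIONAL { ?student b:age ?age . FILTER (?age > limit) }."""
    rows = []
    for student in STUDENTS:
        age = AGES.get(student)
        # The optional part matches only if an age exists and passes the filter.
        bound_age = age if age is not None and age > limit else None
        rows.append((student, bound_age))
    return rows

print(optional_age_over(25))
```
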
Further optional examples …
    SELECT ?student ?email ?home
    WHERE {
      ?student b:studies bmod:CS328 .
      OPTIONAL {
        ?student foaf:mbox ?email .
        ?student foaf:homepage ?home
      }
    }

    SELECT ?student ?email ?home
    WHERE {
      ?student b:studies bmod:CS328 .
      OPTIONAL { ?student foaf:mbox ?email } .
      OPTIONAL { ?student foaf:homepage ?home }
    }

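The two queries behave differently. A single OPTIONAL block containing two patterns is all-or-nothing, while two separate OPTIONAL blocks are matched independently. A Python sketch with invented data, assuming at most one mailbox and one homepage per student:

```python
# Invented sample data: properties known for each student.
STUDENTS = {
    "stu:alice": {
        "foaf:mbox": "mailto:alice@example.org",
        "foaf:homepage": "http://example.org/alice",
    },
    "stu:bob": {"foaf:mbox": "mailto:bob@example.org"},  # no homepage
    "stu:carol": {},
}

def single_optional_block(props):
    """OPTIONAL { mbox . homepage }: both must match, or neither is bound."""
    if "foaf:mbox" in props and "foaf:homepage" in props:
        return props["foaf:mbox"], props["foaf:homepage"]
    return None, None

def separate_optional_blocks(props):
    """OPTIONAL { mbox } . OPTIONAL { homepage }: each is bound on its own."""
    return props.get("foaf:mbox"), props.get("foaf:homepage")

for student, props in sorted(STUDENTS.items()):
    print(student, single_optional_block(props), separate_optional_blocks(props))
```

For stu:bob the first form returns no e-mail address at all, because the homepage pattern in the same block fails.
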
Combining matches
    SELECT ?student
    WHERE {
      ?student foaf:mbox ?email .
      { ?student b:studies bmod:CS328 }
      UNION
      { ?student b:studies bmod:CS909 }
    }
• When patterns are combined using the UNION keyword, the resulting combined pattern will match if any of the subpatterns is matched.

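UNION can be pictured as concatenating the solutions of each branch and then joining them with the rest of the pattern. The Python sketch below uses invented data and assumes one mailbox per student. Note that a student who matches both branches appears twice, since UNION does not remove duplicates.

```python
# Invented sample data.
MBOX = {
    "stu:alice": "mailto:alice@example.org",
    "stu:bob": "mailto:bob@example.org",
    "stu:dave": "mailto:dave@example.org",
}
STUDIES = [
    ("stu:alice", "bmod:CS328"),
    ("stu:alice", "bmod:CS909"),
    ("stu:bob", "bmod:CS909"),
    ("stu:carol", "bmod:CS328"),  # no mailbox, so never returned
    ("stu:dave", "bmod:CS101"),   # hypothetical module outside the union
]

def students_in(module):
    """Solutions of one branch: { ?student b:studies <module> }."""
    return [student for student, m in STUDIES if m == module]

def union_query():
    """Both branches concatenated, then joined with the foaf:mbox pattern."""
    branches = students_in("bmod:CS328") + students_in("bmod:CS909")
    return [student for student in branches if student in MBOX]

print(sorted(union_query()))
```
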
Multiple graphs and the dataset
• All the previous queries targeted a single RDF graph.
• Often there are multiple RDF graphs.
• RDF graphs are identified by an IRI.
• Note: the IRI that represents the graph does not have to be the actual IRI of the graph file,
• although the program processing the query will need to somehow relate the IRI to an actual RDF graph stored somewhere.

Stating the dataset
    SELECT ?student ?email ?home
    FROM <http://www2.warwick.ac.uk/rdf/student>
    WHERE {
      ?student b:studies bmod:CS909 .
      OPTIONAL {
        ?student foaf:mbox ?email .
        ?student foaf:homepage ?home
      }
    }
By using several FROM declarations, you can combine several graphs in the dataset:
    SELECT ?student ?email ?home
    FROM <http://www2.warwick.ac.uk/rdf/student>
    FROM <http://www2.warwick.ac.uk/rdf/foaf>
    WHERE {
      ?student b:studies bmod:CS909 .
      OPTIONAL {
        ?student foaf:mbox ?email .
        ?student foaf:homepage ?home
      }
    }

Multiple graphs
• A dataset has one (optional) default graph
• plus any number of named graphs.
• FROM specifies the default graph.
• With several FROM keywords, the graphs are merged into the default graph.
• Additionally, named graphs are declared with FROM NAMED.
• However, to match patterns in a named graph you must use the GRAPH keyword to state which graph!

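How FROM and FROM NAMED shape the dataset can be sketched in Python. The `STORE` dictionary, its triples, and the function `build_dataset` are assumptions made for this illustration; a real processor decides for itself how an IRI maps to stored data.

```python
# Hypothetical store mapping graph IRIs to sets of triples.
STUDENT_IRI = "http://www2.warwick.ac.uk/rdf/student"
FOAF_IRI = "http://www2.warwick.ac.uk/rdf/foaf"
STORE = {
    STUDENT_IRI: {("stu:alice", "b:studies", "bmod:CS909")},
    FOAF_IRI: {("stu:alice", "foaf:mbox", "mailto:alice@example.org")},
}

def build_dataset(store, from_iris=(), from_named_iris=()):
    """Merge all FROM graphs into the default graph; keep FROM NAMED graphs apart."""
    default_graph = set()
    for iri in from_iris:
        default_graph |= store[iri]
    named_graphs = {iri: store[iri] for iri in from_named_iris}
    return default_graph, named_graphs

# Two FROM clauses: one merged default graph, no named graphs.
merged, no_named = build_dataset(STORE, from_iris=[STUDENT_IRI, FOAF_IRI])
# One FROM and one FROM NAMED: the FOAF graph is reachable only via GRAPH.
default, named = build_dataset(STORE, from_iris=[STUDENT_IRI], from_named_iris=[FOAF_IRI])
print(len(merged), len(default), sorted(named))
```
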
Examples of named graphs
    SELECT ?student ?email ?home
    FROM NAMED <http://www2.warwick.ac.uk/rdf/student>
    FROM NAMED <http://www2.warwick.ac.uk/rdf/foaf>
    WHERE {
      GRAPH <http://www2.warwick.ac.uk/rdf/student> {
        ?student b:studies bmod:CS909
      } .
      GRAPH <http://www2.warwick.ac.uk/rdf/foaf> {
        OPTIONAL {
          ?student foaf:mbox ?email .
          ?student foaf:homepage ?home
        }
      }
    }

Abbreviation using prefixes
    PREFIX brdf: <http://www2.warwick.ac.uk/rdf/>
    SELECT ?student ?email ?home
    FROM NAMED <http://www2.warwick.ac.uk/rdf/student>
    FROM NAMED <http://www2.warwick.ac.uk/rdf/foaf>
    WHERE {
      GRAPH brdf:student {
        ?student b:studies bmod:CS909
      } .
      GRAPH brdf:foaf {
        OPTIONAL {
          ?student foaf:mbox ?email .
          ?student foaf:homepage ?home
        }
      }
    }

Using named and default graph together
    PREFIX brdf: <http://www2.warwick.ac.uk/rdf/>
    SELECT ?student ?email ?home
    FROM <http://www2.warwick.ac.uk/rdf/student>
    FROM NAMED <http://www2.warwick.ac.uk/rdf/foaf>
    WHERE {
      ?student b:studies bmod:CS909 .
      GRAPH brdf:foaf {
        OPTIONAL {
          ?student foaf:mbox ?email .
          ?student foaf:homepage ?home
        }
      }
    }

Graph as a query
• The graph named after GRAPH can also be a variable.
• Use this to query which graph in the dataset holds a particular relationship,
• or to decide which graph to search based on data in another graph.
• It is not mandatory to declare all graphs in the query.
• Even if specified, the dataset can be overridden on a per-query basis.

Which graph is it in?
    PREFIX brdf: <http://www2.warwick.ac.uk/rdf/>
    SELECT ?student ?graph
    WHERE {
      ?student b:studies bmod:CS909 .
      GRAPH ?graph {
        ?student foaf:mbox ?email
      }
    }
• The output variable ?graph holds the URL of the graph that matches the student to an e-mail address.
• Presumption: the query processor has knowledge of a finite set of graphs and their locations.

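With a variable after GRAPH, the processor in effect tries the inner pattern against each named graph it knows and binds the variable to the IRI of every graph that matches. A Python sketch with invented triples; the function name `which_graph` is an assumption for the example.

```python
# Invented dataset: a default graph plus two named graphs keyed by IRI.
DEFAULT_GRAPH = {
    ("stu:alice", "b:studies", "bmod:CS909"),
    ("stu:bob", "b:studies", "bmod:CS909"),
}
NAMED_GRAPHS = {
    "http://www2.warwick.ac.uk/rdf/foaf": {
        ("stu:alice", "foaf:mbox", "mailto:alice@example.org"),
    },
    "http://www2.warwick.ac.uk/rdf/student": {
        ("stu:bob", "foaf:name", "Bob"),
    },
}

def which_graph(module):
    """Return (student, graph IRI) for each named graph holding the student's mbox."""
    rows = []
    for student, predicate, obj in sorted(DEFAULT_GRAPH):
        if predicate != "b:studies" or obj != module:
            continue
        for iri in sorted(NAMED_GRAPHS):  # GRAPH ?graph ranges over the named graphs
            for s, p, _ in sorted(NAMED_GRAPHS[iri]):
                if s == student and p == "foaf:mbox":
                    rows.append((student, iri))
    return rows

print(which_graph("bmod:CS909"))
```

stu:bob is missing from the result because no named graph holds a foaf:mbox triple for him.
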
Re-using the graph reference
    PREFIX brdf: <http://www2.warwick.ac.uk/rdf/>
    SELECT ?student ?email
    WHERE {
      ?student b:studies bmod:CS909 .
      ?student rdfs:seeAlso ?graph .
      GRAPH ?graph {
        ?student foaf:mbox ?email
      }
    }
• Note: if a student doesn’t have an rdfs:seeAlso property that points to a graph holding their e-mail address, they will not appear in the result at all.

Sorting results of a query
    SELECT ?name ?module
    WHERE {
      ?student b:studies ?module .
      ?student foaf:name ?name .
    }
    ORDER BY ?name

    SELECT ?name ?age
    WHERE {
      ?student b:age ?age .
      ?student foaf:name ?name .
    }
    ORDER BY DESC(?age) ASC(?name)

Limiting the number of results
    SELECT ?name ?module
    WHERE {
      ?student b:studies ?module .
      ?student foaf:name ?name .
    }
    LIMIT 20

Extracting subsets of the results
    SELECT ?name ?module
    WHERE {
      ?student b:studies ?module .
      ?student foaf:name ?name .
    }
    ORDER BY ?name
    OFFSET 10
    LIMIT 20

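ORDER BY, OFFSET and LIMIT act on the finished list of solutions: sort first, then skip, then cut. A Python sketch with invented rows; the function names are assumptions for the example.

```python
# Invented solution rows, as a query might produce them before sorting.
ROWS = [
    {"name": "Carol", "age": 22},
    {"name": "Alice", "age": 31},
    {"name": "Bob", "age": 31},
    {"name": "Dave", "age": 19},
]

def order_desc_age_asc_name(rows):
    """ORDER BY DESC(?age) ASC(?name): oldest first, ties broken by name."""
    return sorted(rows, key=lambda row: (-row["age"], row["name"]))

def page(rows, offset, limit):
    """OFFSET skips rows; LIMIT caps how many of the remaining rows are returned."""
    return rows[offset:offset + limit]

ordered = order_desc_age_asc_name(ROWS)
print([row["name"] for row in ordered])
print([row["name"] for row in page(ordered, offset=1, limit=2)])
```

Paging is only stable when the results are ordered, which is why the OFFSET example also uses ORDER BY.
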
Obtaining a Boolean result
• Is any student studying any module?
    ASK { ?student b:studies ?module }
• Is any student studying CS909?
    ASK { ?student b:studies bmod:CS909 }
• Is student 029389 studying CS909?
    ASK { bstu:029389 b:studies bmod:CS909 }
• Is anyone whom 029389 knows studying CS909?
    ASK { bstu:029389 foaf:knows ?x . ?x b:studies bmod:CS909 }
• Is any student aged over 30 studying CS909?
    ASK { ?student b:studies bmod:CS909 . ?student b:age ?age . FILTER (?age > 30) }

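ASK reports only whether at least one solution exists, which maps naturally onto Python's `any`. In the sketch below the data is invented, student 031415 is a hypothetical identifier, and the function names are assumptions for the example.

```python
# Invented sample data: (student, module) pairs and known ages.
STUDIES = {("bstu:029389", "bmod:CS909"), ("bstu:031415", "bmod:CS328")}
AGES = {"bstu:029389": 24, "bstu:031415": 35}

def ask_studies(student=None, module=None):
    """ASK { student b:studies module }; None plays the role of a variable."""
    return any(
        (student is None or s == student) and (module is None or m == module)
        for s, m in STUDIES
    )

def ask_older_than(module, age):
    """ASK with FILTER (?age > age); students without a known age cannot match."""
    return any(m == module and s in AGES and AGES[s] > age for s, m in STUDIES)

print(ask_studies())                             # any student, any module
print(ask_studies(module="bmod:CS909"))          # anyone studying CS909
print(ask_studies("bstu:031415", "bmod:CS909"))  # a specific student
print(ask_older_than("bmod:CS909", 30))
```
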
Obtaining unique results
    SELECT ?student
    WHERE { ?student b:studies ?module }
versus:
    SELECT DISTINCT ?student
    WHERE { ?student b:studies ?module }

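Without DISTINCT, a student appears once per module studied, because each matching triple yields its own solution. A Python sketch with invented data:

```python
# Invented sample data: stu:alice studies two modules.
STUDIES = [
    ("stu:alice", "bmod:CS328"),
    ("stu:alice", "bmod:CS909"),
    ("stu:bob", "bmod:CS909"),
]

def select_students():
    """SELECT ?student: one row per matching triple, duplicates included."""
    return [student for student, _module in STUDIES]

def select_distinct_students():
    """SELECT DISTINCT ?student: duplicates removed, first occurrence kept."""
    return list(dict.fromkeys(select_students()))

print(select_students())
print(select_distinct_students())
```
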
Constructing an RDF result
    CONSTRUCT { ?student b:studyFriend ?friend }
    WHERE {
      ?student b:studies ?module .
      ?student foaf:knows ?friend .
      ?friend b:studies ?module
    }
• If there is more than one search result, the triples from each result are combined.

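CONSTRUCT fills the template once per solution and collects the resulting triples into one graph. The Python sketch below uses invented data and a set for the output, which reflects that an RDF graph holds each triple only once.

```python
# Invented sample data.
STUDIES = [
    ("stu:alice", "bmod:CS328"),
    ("stu:bob", "bmod:CS328"),
    ("stu:carol", "bmod:CS909"),
]
KNOWS = [("stu:alice", "stu:bob"), ("stu:alice", "stu:carol"), ("stu:bob", "stu:alice")]

def modules_of(student):
    """All modules a given student studies."""
    return {module for s, module in STUDIES if s == student}

def construct_study_friends():
    """Build b:studyFriend triples for acquaintances who share a module."""
    result = set()
    for student, friend in KNOWS:
        if modules_of(student) & modules_of(friend):
            result.add((student, "b:studyFriend", friend))
    return result

for triple in sorted(construct_study_friends()):
    print(triple)
```
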
Summary
• The SPARQL language is used for querying RDF.
• SPARQL does not use XML syntax; its syntax resembles SQL.
• The building blocks of SPARQL queries are graph patterns that include variables. The result of a query is the set of values the variables must take to match the RDF graph.
• A SPARQL query can return results in several different ways, as determined by the query.
• SPARQL queries can also be used for OWL querying.

Tools for processing SPARQL queries
• Download Jena from http://jena.sourceforge.net/
• Protégé 3.4 and above: http://protege.stanford.edu/