CS 253: Topics in Database Systems: C4 Dr. Alexandra I. Cristea http://www.dcs.warwick.ac.uk/~acristea/
Previously we looked at: • XML, and its query language(s) • RDF • Next: • RDF query languages
Proposals • SPARQL • http://www.w3.org/TR/rdf-sparql-query/ • RDQL • http://www.w3.org/Submission/RDQL/ • RQL • http://139.91.183.30:9090/RDF/RQL/ • SeRQL • http://www.openrdf.org/doc/sesame/users/ch06.html • Triple: • http://triple.semanticweb.org/ • N3: • http://www.w3.org/DesignIssues/Notation3 • Comparison of languages: • http://www.aifb.uni-karlsruhe.de/WBS/pha/rdf-query/rdfquery.pdf
Introduction SeRQL • "Sesame RDF Query Language", pronounced "circle" • new RDF/RDFS query language • currently being developed by Aduna as part of Sesame: http://www.openrdf.org/ • It combines (best?) features of other (query) languages (RQL, RDQL, N-Triples, N3) and adds some of its own.
Sesame • open source RDF framework with support for RDF Schema inferencing and querying. • Originally, it was developed by Aduna (then known as Aidministrator) as a research prototype for the EU research project On-To-Knowledge. • further developed and maintained by Aduna in cooperation with NLnet Foundation, developers from Ontotext, and a number of volunteer developers
SeRQL's features • Graph transformation. • RDF Schema support. • XML Schema datatype support. • Expressive path expression syntax. • Optional path matching.
SeRQL basic building blocks • URIs, • literals and • variables
Variables • identified by names. • must start with a letter or an underscore ('_') and can be followed by zero or more letters, numbers, underscores, dashes ('-') or dots ('.'). • Examples: Var1 _var2 unwise.var-name_isnt-it • SeRQL keywords are not allowed to be used as variable names.
(reserved) Keywords • Currently: select, construct, from, where, using, namespace, true, false, not, and, or, like, label, lang, datatype, null, isresource, isliteral, sort, in, union, intersect, minus, exists, forall, distinct, limit, offset. • Keywords are case-insensitive (unlike variable names).
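The naming rule for variables and the reserved-keyword list can be checked mechanically. The following is a minimal Python sketch, not part of SeRQL or Sesame; the function name `is_valid_variable` is hypothetical, and restricting "letters" to ASCII is an assumption of this sketch.

```python
import re

# Keywords copied from the slide; they are matched case-insensitively.
SERQL_KEYWORDS = {
    "select", "construct", "from", "where", "using", "namespace", "true",
    "false", "not", "and", "or", "like", "label", "lang", "datatype", "null",
    "isresource", "isliteral", "sort", "in", "union", "intersect", "minus",
    "exists", "forall", "distinct", "limit", "offset",
}

# A letter or underscore, then zero or more letters, digits, '_', '-' or '.'.
# ASCII-only letters are an assumption of this sketch.
VARIABLE_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_.\-]*")


def is_valid_variable(name: str) -> bool:
    """Return True if `name` follows the naming rule and is not a keyword."""
    if VARIABLE_NAME.fullmatch(name) is None:
        return False
    return name.lower() not in SERQL_KEYWORDS
```

With this sketch, the three example names from the Variables slide are accepted, while `Select` (a keyword in any casing) and `1var` (starts with a digit) are rejected.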
URIs • full URIs • abbreviated URIs (QNames)
Full URIs • must be surrounded with "<" and ">". • Tend to be long (!!) • Examples: <http://www.openrdf.org/index.html> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <mailto:sesame@openrdf.org> <file:///C:\rdffiles\test.rdf>
Abbreviated URIs (QNames) • Components: a defined prefix (for the namespace), a colon (":"), then the local name, i.e. the part of the URI that follows the namespace • Examples: sesame:index.html rdf:type foaf:Person
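Expanding a QName back into a full URI is a simple prefix lookup. The Python sketch below illustrates the idea; `expand_qname` is a hypothetical helper, and the `sesame` prefix mapping is an assumption chosen so that `sesame:index.html` expands to the full-URI example shown earlier.

```python
# Prefix-to-namespace table. The rdf and foaf namespaces are the standard ones;
# the "sesame" mapping is an assumption made for this illustration.
NAMESPACES = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "foaf": "http://xmlns.com/foaf/0.1/",
    "sesame": "http://www.openrdf.org/",
}


def expand_qname(qname: str, namespaces=NAMESPACES) -> str:
    """Turn 'prefix:local' into '<namespace + local>'."""
    prefix, colon, local = qname.partition(":")
    if not colon:
        raise ValueError("not a QName (no colon): " + qname)
    if prefix not in namespaces:
        raise KeyError("undeclared prefix: " + prefix)
    return "<" + namespaces[prefix] + local + ">"
```

For example, `expand_qname("rdf:type")` yields the full URI for `rdf:type` listed on the Full URIs slide.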
Literals • Parts: • label, • language tag (optional), and • datatype (optional) • Language tag and datatype are mutually exclusive • Examples: "foo" "foo"@en "<foo/>"^^<http://www.w3.org/1999/02/22-rdf-syntax-ns#XMLLiteral> "<foo/>"^^rdf:XMLLiteral
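The three parts of a literal, and the rule that language tag and datatype cannot both be present, can be modelled as a small data class. This is a Python sketch only; the class `Literal` and its `serql()` method are hypothetical names, and escaping of quotes inside the label is deliberately left out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Literal:
    label: str
    lang: Optional[str] = None       # e.g. "en"
    datatype: Optional[str] = None   # full datatype URI, without angle brackets

    def __post_init__(self):
        # Language tag and datatype are both optional but mutually exclusive.
        if self.lang is not None and self.datatype is not None:
            raise ValueError("language tag and datatype are mutually exclusive")

    def serql(self) -> str:
        """Render the literal in the notation used on the slide."""
        text = '"' + self.label + '"'
        if self.lang is not None:
            return text + "@" + self.lang
        if self.datatype is not None:
            return text + "^^<" + self.datatype + ">"
        return text
```

`Literal("foo", lang="en").serql()` gives `"foo"@en`, matching the second example on the slide.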
Blank nodes • RDF has a notion of blank nodes: nodes that are not labelled with a URI or literal. • Interpretation (∃): "there exists a node such that..." • Blank nodes have internal identifiers • Shortcut in SeRQL: _:bnode1 • Attention: these identifiers are internal to a repository, so queries that use them are not portable!
Path expressions • expressions that match specific paths through an RDF graph • usually, triples = path expressions of length 1 • in SeRQL: arbitrary length
Basic path expressions • Query: • persons who work for (companies that are) IT companies.
Original (possible) RDF:
<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:foo="http://www.mycompany.smthg/company#">
  <rdf:Description rdf:about="http://www.mycompany.smthg/company/Person">
    <foo:worksFor rdf:resource="http://www.mycompany.smthg/company/Company"/>
  </rdf:Description>
  <rdf:Description rdf:about="http://www.mycompany.smthg/company/Company">
    <rdf:type rdf:resource="http://www.mycompany.smthg/company/CompanySchema#ITCompany"/>
  </rdf:Description>
</rdf:RDF>
Basic path expressions • Query: • persons who work for (companies that are) IT companies. • Path expression: {Person} foo:worksFor {Company} rdf:type {foo:ITCompany} • [Figure: graph with edges Person <foo:worksFor> Company and Company <rdf:type> <foo:ITCompany>; each edge is one triple (path of length 1)]
Multiple Path Expressions • Separated with commas • Example: {Person} ex:worksFor {Company}, {Company} rdf:type {ex:ITCompany}
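To see what matching a path expression means, the two comma-separated expressions above can be evaluated by hand over a small set of triples. The Python sketch below is not how Sesame evaluates queries; the data (`ex:alice`, `ex:acme`, and so on) and the function name `it_employees` are invented for illustration.

```python
RDF_TYPE = "rdf:type"

# Hypothetical example data, written as (subject, predicate, object) triples.
TRIPLES = {
    ("ex:alice", "ex:worksFor", "ex:acme"),
    ("ex:bob", "ex:worksFor", "ex:bakery"),
    ("ex:acme", RDF_TYPE, "ex:ITCompany"),
    ("ex:bakery", RDF_TYPE, "ex:Shop"),
}


def it_employees(triples):
    """Bindings for {Person} ex:worksFor {Company}, {Company} rdf:type {ex:ITCompany}."""
    results = []
    for person, predicate, company in triples:
        if predicate != "ex:worksFor":
            continue
        # The shared variable Company joins the two path expressions.
        if (company, RDF_TYPE, "ex:ITCompany") in triples:
            results.append((person, company))
    return sorted(results)
```

Only `ex:alice` is returned, because `ex:bakery` is not typed as `ex:ITCompany`.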
Non-interesting nodes • Can be left empty • Examples: {Person} ex:worksFor {} rdf:type {ex:ITCompany} {Painting} ex:painted_by {} ex:name {"Picasso"}
Path expression short cuts • Multi-value nodes • Branches • Reified statements
Multi-valued nodes • Multiple objects: {subj1} pred1 {obj1, obj2, obj3} • Multiple subjects: {subj1, subj2, subj3} pred1 {obj1} • Condition: the values within a multi-valued node must be pairwise disjoint (no two of them may be equal)!
Branches {subj1} pred1 {obj1}; pred2 {obj2} Equivalent to: {subj1} pred1 {obj1}, {subj1} pred2 {obj2}
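Both shortcuts are pure abbreviations: each expands into a list of basic (length 1) path expressions. A minimal Python sketch of the expansion follows; `expand_multi` and `expand_branch` are hypothetical helper names, and patterns are represented as plain tuples of strings.

```python
def expand_multi(subjects, predicate, objects):
    """Expand {s1, s2, ...} pred {o1, o2, ...} into one basic pattern per pair."""
    return [(s, predicate, o) for s in subjects for o in objects]


def expand_branch(subject, branches):
    """Expand {subj} pred1 {obj1}; pred2 {obj2} into basic patterns sharing subj."""
    return [(subject, predicate, obj) for predicate, obj in branches]
```

For example, `expand_branch("subj1", [("pred1", "obj1"), ("pred2", "obj2")])` produces the two patterns given as the equivalent form on the Branches slide. The sketch does not enforce the disjointness condition on multi-valued nodes.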
Reified statements • { {reifSubj} reifPred {reifObj} } pred {obj} • Equivalent to: • {_Statement} rdf:type {rdf:Statement}, {_Statement} rdf:subject {reifSubj}, {_Statement} rdf:predicate {reifPred}, {_Statement} rdf:object {reifObj}, {_Statement} pred {obj}
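The reification shortcut can be expanded in the same mechanical way. The Python sketch below generates the five basic patterns listed above; `expand_reified` is a hypothetical helper, and `_Statement` is the placeholder variable name used on the slide.

```python
def expand_reified(reif_subj, reif_pred, reif_obj, pred, obj, stmt="_Statement"):
    """Expand { {reifSubj} reifPred {reifObj} } pred {obj} into basic patterns."""
    return [
        (stmt, "rdf:type", "rdf:Statement"),
        (stmt, "rdf:subject", reif_subj),
        (stmt, "rdf:predicate", reif_pred),
        (stmt, "rdf:object", reif_obj),
        (stmt, pred, obj),
    ]
```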
Optional Path Expressions {Person} ex:name {Name}; ex:age {Age}; [ex:email {EmailAddress}]
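An optional path expression behaves like a left outer join: the name and age must match, while the e-mail address is bound only if it exists. The Python sketch below mimics this over a hypothetical dictionary of people; the data, the function name `name_age_email`, and the assumption that each property has at most one value are all illustrative.

```python
# Hypothetical data: resource -> {property: value}.
PEOPLE = {
    "ex:p1": {"ex:name": "Michael", "ex:age": 30, "ex:email": "michael@example.org"},
    "ex:p2": {"ex:name": "Rubens", "ex:age": 41},   # no e-mail: still a result
    "ex:p3": {"ex:age": 25},                        # no name: mandatory part fails
}


def name_age_email(people):
    """Rows for {Person} ex:name {Name}; ex:age {Age}; [ex:email {EmailAddress}]."""
    rows = []
    for props in people.values():
        if "ex:name" not in props or "ex:age" not in props:
            continue  # the non-optional parts must all match
        # The optional part contributes a value if present, otherwise None.
        rows.append((props["ex:name"], props["ex:age"], props.get("ex:email")))
    return sorted(rows)
```

Here Rubens appears in the result with `None` for the e-mail address, whereas the third resource is dropped entirely.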
Queries in SeRQL • Select queries: • returning a table of values, or a set of variable-value bindings. • SELECT, FROM, WHERE, LIMIT, OFFSET and USING NAMESPACE • Construct queries: • returns a true RDF graph • CONSTRUCT, FROM, WHERE, LIMIT, OFFSET and USING NAMESPACE
Select queries • SELECT C FROM {C} rdf:type {rdfs:Class} • returns all URIs of classes • SELECT DISTINCT * FROM {Country1} ex:borders {} ex:borders {Country2} USING NAMESPACE ex = <http://example.org/things#>
Construct queries • CONSTRUCT {Parent} ex:hasChild {Child} FROM {Child} ex:hasParent {Parent} USING NAMESPACE ex = <http://example.org/things#> • CONSTRUCT * FROM {SUB} rdfs:subClassOf {SUPER} • This query extracts all rdfs:subClassOf relations from an RDF graph.
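The first construct query builds new triples from matched ones: every `ex:hasParent` statement yields the inverse `ex:hasChild` statement. A Python sketch of that graph transformation, with hypothetical data and a hypothetical function name `construct_has_child`:

```python
def construct_has_child(triples):
    """CONSTRUCT {Parent} ex:hasChild {Child} FROM {Child} ex:hasParent {Parent}."""
    return {
        (parent, "ex:hasChild", child)
        for child, predicate, parent in triples
        if predicate == "ex:hasParent"
    }
```

The result is itself a set of triples, i.e. an RDF graph, rather than a table of variable bindings as in a select query.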
WHERE clause • Optional; • Specifies Boolean constraints SELECT Country FROM {Country} ex:population {Population} WHERE Population < "1000000"^^xsd:positiveInteger USING NAMESPACE ex = <http://example.org/things#>
Nested WHERE clauses • Query 1 (normal WHERE-clause): SELECT Name, EmailAddress FROM {Person} foaf:name {Name}; [ex:email {EmailAddress}] WHERE EmailAddress LIKE "g*" • Query 2 (nested WHERE-clause): SELECT Name, EmailAddress FROM {Person} foaf:name {Name}; [ex:email {EmailAddress} WHERE EmailAddress LIKE "g*"] • at most one nested WHERE-clause per optional path expression, and at most one 'normal' WHERE-clause
Results WHERE queries
• Query 1 (normal WHERE):
Name | EmailAddress
Giancarlo | "giancarlo@example.work"
• Query 2 (nested WHERE):
Name | EmailAddress
Michael | (empty)
Rubens | (empty)
Giancarlo | "giancarlo@example.work"
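The difference between the two results can be reproduced with a small Python sketch. The underlying data is not shown on the slides, so the sketch assumes that Michael has an address not starting with "g" and that Rubens has none; `query1` and `query2` are hypothetical names.

```python
# Assumed data: (name, e-mail address or None).
PEOPLE = [
    ("Michael", "michael@example.org"),       # assumed: does not start with "g"
    ("Rubens", None),                         # assumed: no address at all
    ("Giancarlo", "giancarlo@example.work"),
]


def starts_with_g(email):
    """Stand-in for the constraint EmailAddress LIKE "g*"."""
    return email is not None and email.startswith("g")


def query1(people):
    # Normal WHERE: the constraint applies to the whole row, so rows with a
    # missing or non-matching address are removed.
    return [(name, email) for name, email in people if starts_with_g(email)]


def query2(people):
    # Nested WHERE: the constraint applies only to the optional part, so a
    # failing address is left unbound but the row itself is kept.
    return [(name, email if starts_with_g(email) else None) for name, email in people]
```

Under these assumptions `query1` returns only Giancarlo, while `query2` returns all three names with an address only for Giancarlo.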
LIKE operator SELECT Country FROM {Country} ex:name {Name} WHERE Name LIKE "netherlands" IGNORE CASE USING NAMESPACE ex = <http://example.org/things#>
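A pattern match of this kind can be sketched in Python with a regular expression. The sketch assumes that "*" matches any run of characters (as in the "g*" pattern used earlier) and that the pattern must cover the whole string; both points are assumptions of this illustration, and `serql_like` is a hypothetical name.

```python
import re


def serql_like(value: str, pattern: str, ignore_case: bool = False) -> bool:
    """Match `value` against `pattern`, where '*' stands for any characters."""
    # Escape everything literally, then let each '*' become '.*'.
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    flags = re.DOTALL
    if ignore_case:
        flags |= re.IGNORECASE
    return re.fullmatch(regex, value, flags) is not None
```

With `ignore_case=True`, the value "Netherlands" matches the pattern "netherlands"; without it, the match fails because of the capital letter.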
Built-in predicates • {X} serql:directSubClassOf {Y} • {X} serql:directSubPropertyOf {Y} • {X} serql:directType {Y}
Set combinatory operations • Union • Intersect • Minus
Union SELECT title FROM {book} dc10:title {title} UNION SELECT title FROM {book} dc11:title {title} USING NAMESPACE dc10 = <http://purl.org/dc/elements/1.0/>, dc11 = <http://purl.org/dc/elements/1.1/>
Intersect SELECT creator FROM {album} dc10:creator {creator} INTERSECT SELECT creator FROM {album} dc11:creator {creator} USING NAMESPACE dc10 = <http://purl.org/dc/elements/1.0/>, dc11 = <http://purl.org/dc/elements/1.1/>
Minus (difference) SELECT title FROM {album} dc10:title {title} MINUS SELECT title FROM {album} dc10:title {title}; dc10:creator {creator} WHERE creator like "Paul" USING NAMESPACE dc10 = <http://purl.org/dc/elements/1.0/>, dc11 = <http://purl.org/dc/elements/1.1/>
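The three operators correspond to the usual set operations on query results. The Python sketch below applies them to the results of two simple selections over hypothetical triples; `select_objects` and the book titles are invented for illustration. Python sets drop duplicates, which may or may not match how a given engine treats duplicate rows.

```python
def select_objects(triples, predicate):
    """Values bound to the object position for the given predicate."""
    return {obj for _subject, pred, obj in triples if pred == predicate}


# Hypothetical data using the two Dublin Core prefixes from the slides.
TRIPLES = {
    ("ex:book1", "dc10:title", "RDF Primer"),
    ("ex:book2", "dc11:title", "RDF Primer"),
    ("ex:book3", "dc11:title", "Querying RDF"),
}

titles_dc10 = select_objects(TRIPLES, "dc10:title")
titles_dc11 = select_objects(TRIPLES, "dc11:title")

union_result = titles_dc10 | titles_dc11       # UNION
intersect_result = titles_dc10 & titles_dc11   # INTERSECT
minus_result = titles_dc11 - titles_dc10       # MINUS
```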
NULL values • SELECT * • FROM {X} Y {Z} • WHERE isLiteral(Z) AND datatype(Z) = NULL • checks that a literal doesn't have a datatype.
Query Nesting • IN • ANY, ALL • EXISTS
IN SELECT name FROM {} rdf:type {ex:Person}; ex:name {name} WHERE name IN ( SELECT n FROM {} rdf:type {ex:Author}; ex:name {n} ) USING NAMESPACE ex = <http://example.org/things#> • retrieve all names of Persons, but only those names that also appear as names of Authors.
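The IN operator is a membership test against the result of the inner query. A Python sketch with hypothetical name lists (`names_in` is an invented helper):

```python
# Hypothetical results of the outer and inner queries.
PERSON_NAMES = ["Alice", "Bob", "Carol"]
AUTHOR_NAMES = ["Bob", "Carol", "Dave"]


def names_in(outer, inner):
    """Keep the outer values that also occur in the inner query's result."""
    inner_set = set(inner)  # evaluate the inner query once
    return [name for name in outer if name in inner_set]
```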
ANY, ALL SELECT highestValue FROM {node} ex:value {highestValue} WHERE highestValue >= ALL ( SELECT value FROM {} ex:value {value} ) USING NAMESPACE ex = <http://example.org/things#>
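ALL requires the comparison to hold for every value returned by the inner query, while ANY requires it to hold for at least one. A Python sketch using the built-ins `all` and `any` on hypothetical data; the function names are invented for this illustration.

```python
# Hypothetical data: node -> ex:value.
VALUES = {"ex:a": 3, "ex:b": 7, "ex:c": 5}


def highest_values(values):
    """value >= ALL (inner values): only the maximum qualifies."""
    inner = list(values.values())
    return sorted({v for v in inner if all(v >= other for other in inner)})


def above_some_value(values):
    """value > ANY (inner values): everything except the minimum qualifies."""
    inner = list(values.values())
    return sorted({v for v in inner if any(v > other for other in inner)})
```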
EXISTS SELECT name, hobby FROM {} rdf:type {ex:Person}; ex:name {name}; ex:hobby {hobby} WHERE EXISTS ( SELECT n FROM {} rdf:type {ex:Author}; ex:name {n}; ex:authorOf {} WHERE n = name ) USING NAMESPACE ex = <http://example.org/things#>
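EXISTS succeeds when the inner query returns at least one result; here the inner query is correlated with the outer one through `n = name`. A Python sketch with hypothetical data and a hypothetical function name:

```python
# Hypothetical data: persons as (name, hobby), authors as (name, work).
PERSONS = [("Alice", "chess"), ("Bob", "rowing")]
AUTHORS = [("Bob", "ex:book1"), ("Dave", "ex:book2")]


def persons_who_are_authors(persons, authors):
    """Keep a person row if some author with the same name has written something."""
    return [
        (name, hobby)
        for name, hobby in persons
        if any(n == name for n, _work in authors)  # EXISTS (... WHERE n = name)
    ]
```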
RDF Query Languages Conclusion • We have learned: • There is strong competition to provide the definitive RDF query language • No standard as yet • We have looked in more detail at one of them, SeRQL, as it is an implementers' language paired with an existing RDF repository tool, Sesame • Many features of SeRQL are reminiscent of SQL, so the learning threshold should be low
Next: • OWL