
SPARQLing Constraints for RDF

Michael Schmidt. EDBT 2008, March 28. Joint work with Prof. Georg Lausen and Michael Meier.


Presentation Transcript


  1. SPARQLing Constraints for RDF
     Michael Schmidt, EDBT 2008, March 28
     Joint work with Prof. Georg Lausen and Michael Meier

  2. SPARQLing Constraints for RDF
     Extension of RDF by constraints
     • With fixed semantics
     • Integration into the framework
     Based on:
     • RDF Data Format
       • Machine-readable information
       • Established in the Semantic Web
     • Constraints
       • Primary and foreign keys
       • Cardinality constraints, …
     • SPARQL Query Language
       • Declarative language
       • W3C Recommendation since Jan.
     The role of SPARQL in this context
     • Extracting constraints
     • Checking constraints
     • Optimization of SPARQL queries under constraints

  3. Why Constraints?
     • Restricting the state space of the database
     • Maintenance of data consistency (e.g. when data is updated)
     • Semantic Query Optimization
     • Better understanding of the data
     • Here: translation of relational schemata to RDF without loss of information

  4. The RDF Data Format
     • "Triples of knowledge": (t1, name, "Joe"), (t1, faculty, "CS"), (t1, knows, t2)
     [Figure: RDF graph with nodes t1 and t2 of rdf:type Teachers; edges name, faculty, age and knows; literals "Joe", "Fred", "CS" and "43".]
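
The triple view can be tried out directly. The following is a minimal Python sketch that stores the three triples written out on the slide as tuples in a set and collects everything stated about t1. The helper name `describe` is a hypothetical name chosen for illustration; it is not part of RDF or of any library.

```python
# A tiny RDF graph as a set of (subject, predicate, object) tuples.
# Only the three triples written out on the slide are used here.
graph = {
    ("t1", "name", "Joe"),
    ("t1", "faculty", "CS"),
    ("t1", "knows", "t2"),
}


def describe(graph, subject):
    """Return a dict mapping each predicate of `subject` to its sorted objects."""
    result = {}
    for s, p, o in graph:
        if s == subject:
            result.setdefault(p, []).append(o)
    return {p: sorted(objs) for p, objs in result.items()}


# Everything the graph states about t1 (hypothetical helper, see above).
print(describe(graph, "t1"))
```

A set of tuples is enough here because an RDF graph is a set of triples: adding the same triple twice changes nothing.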

  5. The RDF Data Format
     Three elementary types
     • URIs (describe physical/logical entities and properties)
     • Literals (string values)
     • Blank nodes (not considered)
     [Figure: the Teachers graph from slide 4.]

  6. A Relational Data Schema
     Tables: Teachers, Students, Courses, Participants
     + NOT NULL constraints on each column

  7. A Translation into RDF
     Problem: constraints are only implicitly given!
     [Figure: RDF graph of the translated data. Teachers t1, t2 carry name and faculty; Students s1, s2 carry name and matric; Courses c1, c2 carry name and taught_by; Participants p1, p2 carry c_id and s_id. Each node is linked to its class by rdf:type. Literals shown: "Joe", "Fred", "CS", "Ed", "John", "11111", "22222", "Web", "DB".]

  8. Constraints for RDF
     • Encoding in the schema layer
     • New namespace "rdfc" provides a constraint vocabulary with fixed semantics
       • rdfc:Key for primary keys
       • rdfc:FKey for foreign keys
       • rdfc:ref links foreign keys to primary keys
     • Uses the built-in RDF container class rdf:Seq

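  9. Encoding Constraints
     [Figure: schema-layer encoding for the running example. Teachers is linked by rdfc:Key to the node T_Key, which is typed rdfc:Key and rdf:Seq and lists name as rdf:_1. Courses is linked by rdfc:FKey to the node C_FKey, which is typed rdfc:FKey and rdf:Seq and lists taught_by as rdf:_1. rdfc:ref links C_FKey to T_Key. The instance data (t1, t2, c1, c2 with name, faculty and taught_by) is shown alongside.]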

  10. Types of Constraints
      Let C, C1, C2 be classes and Qi, Ri properties
      • Primary keys, foreign keys: Key(C,[Q1,…,Qn]), FKey(C1,[Q1,…,Qn],C2,[R1,…,Rn])
      • Cardinality constraints: Min(C,n,R), Max(C,n,R) for n ∈ ℕ
      • Functionality constraints, totality constraints: Func(C,Q), Total(C,Q)
      • Many more in the full paper: singleton, subclass, subproperty, property domain, property range
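
To make the cardinality-style constraints concrete, here is a small Python sketch over a graph stored as a set of tuples. The slide only names the constraints, so the readings used below are assumptions: Min(C,n,R) is read as "every instance of C has at least n values for R", Max(C,n,R) as "at most n", Func(C,Q) as Max with n = 1 and Total(C,Q) as Min with n = 1. The exact definitions are in the full paper. All function names and the sample data are hypothetical.

```python
# Graph as a set of (subject, predicate, object) tuples; the data is made up.
graph = {
    ("t1", "rdf:type", "Teachers"), ("t1", "name", "Joe"), ("t1", "faculty", "CS"),
    ("t2", "rdf:type", "Teachers"), ("t2", "name", "Fred"),
}


def instances(graph, cls):
    """All subjects typed as `cls`."""
    return {s for s, p, o in graph if p == "rdf:type" and o == cls}


def value_count(graph, subject, prop):
    """Number of distinct objects of `prop` for `subject`."""
    return len({o for s, p, o in graph if s == subject and p == prop})


def holds_min(graph, cls, n, prop):
    # Assumed reading of Min(C, n, R): each instance of C has >= n values of R.
    return all(value_count(graph, x, prop) >= n for x in instances(graph, cls))


def holds_max(graph, cls, n, prop):
    # Assumed reading of Max(C, n, R): each instance of C has <= n values of R.
    return all(value_count(graph, x, prop) <= n for x in instances(graph, cls))


def holds_func(graph, cls, prop):
    # Assumed reading of Func(C, Q): at most one value per instance.
    return holds_max(graph, cls, 1, prop)


def holds_total(graph, cls, prop):
    # Assumed reading of Total(C, Q): at least one value per instance.
    return holds_min(graph, cls, 1, prop)
```

On the sample data, `holds_total(graph, "Teachers", "name")` holds, while `holds_total(graph, "Teachers", "faculty")` fails because t2 has no faculty value.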

  11. Satisfiability
      Given an RDF vocabulary and a set of constraints: is there a non-empty RDF graph that satisfies the constraints?
      • In general undecidable
      • Shown by reduction from the key implication problem in relational databases
      • In the paper, we indicate
        • satisfiable constraint subclasses
        • decidable constraint subclasses

  12. The SPARQL Query Language
      • Declarative language
      • Based on graph patterns that are matched against the input graph
      • Different operators to combine these patterns: AND ("."), OPTIONAL, UNION, FILTER
      Example:
        SELECT ?name ?faculty ?title
        WHERE {
          ?teacher rdf:type Teachers.
          ?teacher name ?name.
          ?teacher faculty ?faculty.
          OPTIONAL { ?teacher title ?title. }
        }

  13. SPARQL Query Evaluation
      Variables are matched against the input graph
        SELECT ?name ?faculty ?title
        WHERE {
          ?teacher rdf:type Teachers.
          ?teacher name ?name.
          ?teacher faculty ?faculty.
          OPTIONAL { ?teacher title ?title. }
        }
      [Figure: the query matched against the Teachers graph. ?teacher, ?name and ?faculty bind to t1, t2 and their name and faculty values ("Joe", "Fred", "CS"). One teacher has the title "Professor"; for the other, ?title stays unbound.]
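
The matching step, including the unbound ?title, can be imitated in a few lines of Python. This is a simplified sketch, not the SPARQL semantics in full: it handles only triple patterns combined by AND plus a single OPTIONAL block, and the function names are hypothetical. Which teacher carries the title is not recoverable from the transcript, so the sample data assigns it to t1 as an assumption.

```python
def match_pattern(graph, pattern, binding):
    """Yield extensions of `binding` that match one triple pattern.

    Terms starting with '?' are variables; everything else is a constant.
    """
    for triple in graph:
        new = dict(binding)
        ok = True
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # Bind the variable, or check it against its earlier binding.
                if new.setdefault(term, value) != value:
                    ok = False
                    break
            elif term != value:
                ok = False
                break
        if ok:
            yield new


def match_all(graph, patterns, binding=None):
    """AND: match every triple pattern, sharing variable bindings."""
    bindings = [binding or {}]
    for pattern in patterns:
        bindings = [b2 for b in bindings for b2 in match_pattern(graph, pattern, b)]
    return bindings


def match_optional(graph, required, optional):
    """OPTIONAL: keep every required match; extend it when the optional part matches."""
    results = []
    for b in match_all(graph, required):
        extended = match_all(graph, optional, b)
        results.extend(extended if extended else [b])
    return results


# Sample data (assumption: t1 is the teacher with the title).
graph = {
    ("t1", "rdf:type", "Teachers"), ("t1", "name", "Joe"),
    ("t1", "faculty", "CS"), ("t1", "title", "Professor"),
    ("t2", "rdf:type", "Teachers"), ("t2", "name", "Fred"),
    ("t2", "faculty", "CS"),
}
required = [("?teacher", "rdf:type", "Teachers"),
            ("?teacher", "name", "?name"),
            ("?teacher", "faculty", "?faculty")]
optional = [("?teacher", "title", "?title")]
rows = match_optional(graph, required, optional)
```

Each element of `rows` is a dict of variable bindings. The row for the teacher without a title simply has no "?title" entry, which corresponds to "?title: unbound" on the slide.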

  14. Extracting Key Constraints
        SELECT ?keyname ?class ?keyatt
        WHERE {
          ?class rdfc:Key ?keyname.
          ?keyname rdf:type rdfc:Key.
          ?keyname ?seq ?keyatt.
          FILTER (?seq!=rdf:type)
        }
      [Figure: the query applied to the key encoding from slide 9 (Teachers rdfc:Key T_Key; T_Key typed rdfc:Key and rdf:Seq; T_Key rdf:_1 name).]
      • Extraction of foreign keys is very similar
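
The same extraction can be sketched in Python over a graph stored as a set of tuples. This is an illustration under assumptions, not the authors' implementation: the function name `extract_keys` is hypothetical, and instead of filtering out rdf:type as the query does, the sketch selects the rdf:_1, rdf:_2, … membership properties directly so that the attribute order of the key is preserved.

```python
# Schema-layer encoding of Key(Teachers, [name]) plus one instance triple.
graph = {
    ("Teachers", "rdfc:Key", "T_Key"),
    ("T_Key", "rdf:type", "rdfc:Key"),
    ("T_Key", "rdf:type", "rdf:Seq"),
    ("T_Key", "rdf:_1", "name"),
    ("t1", "rdf:type", "Teachers"),
    ("t1", "name", "Joe"),
}


def extract_keys(graph):
    """Return {class: [key attributes in rdf:_1, rdf:_2, ... order]}."""
    keys = {}
    for cls, p, keyname in graph:
        # Only follow rdfc:Key edges whose target is itself typed rdfc:Key.
        if p != "rdfc:Key" or (keyname, "rdf:type", "rdfc:Key") not in graph:
            continue
        members = [
            (int(pred[len("rdf:_"):]), att)
            for s, pred, att in graph
            if s == keyname and pred.startswith("rdf:_")
        ]
        keys[cls] = [att for _, att in sorted(members)]
    return keys


print(extract_keys(graph))
```

Sorting by the numeric suffix of the membership property matters for composite keys, where the position in the rdf:Seq determines which attribute is first.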

  15. Checking Constraints with SPARQL
      • Use the SPARQL "ASK" query form (returns "yes" exactly if the query has at least one result, "no" otherwise)
      • Constraint checks are possible for many types of constraints
      Definition: a SPARQL query checks a constraint C if it returns "yes" for each graph that violates C, and "no" otherwise.

  16. Checking Constraints with SPARQL
      • Checking primary key constraints Key(C,[p1,…,pn]):
          ASK {
            ?x rdf:type C.
            ?y rdf:type C.
            ?x p1 ?p1; [...]; pn ?pn.
            ?y p1 ?p1; [...]; pn ?pn.
            FILTER (?x!=?y)
          }
        Returns "yes" exactly if the constraint is violated.
      • Checking foreign keys is a little more complicated, but also possible
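
The logic of the ASK query can be mirrored in plain Python: two distinct instances of C that share a value for every key property violate the key. The sketch below is a hypothetical helper, not part of any library, and works on a graph stored as a set of tuples with made-up sample data. It returns the offending pairs, so the ASK answer is "yes" exactly when the returned list is non-empty.

```python
from collections import defaultdict
from itertools import product


def key_violations(graph, cls, props):
    """Return pairs of distinct instances of `cls` that agree on all `props`."""
    values = defaultdict(set)  # (subject, property) -> set of objects
    for s, p, o in graph:
        values[(s, p)].add(o)
    members = {s for s, p, o in graph if p == "rdf:type" and o == cls}
    carriers = defaultdict(set)  # key value tuple -> instances having it
    for x in members:
        # An instance lacking a key property yields no tuple, just as the
        # ASK pattern would not match it.
        for combo in product(*(values[(x, p)] for p in props)):
            carriers[combo].add(x)
    return sorted({(x, y) for group in carriers.values()
                   for x in group for y in group if x < y})


# Hypothetical data: t1 and t2 share the same name.
graph = {
    ("t1", "rdf:type", "Teachers"), ("t1", "name", "Joe"),
    ("t2", "rdf:type", "Teachers"), ("t2", "name", "Joe"),
    ("t3", "rdf:type", "Teachers"), ("t3", "name", "Fred"),
}
violations = key_violations(graph, "Teachers", ["name"])
```

Unlike the ASK query, which would match both (t1, t2) and (t2, t1), the sketch reports each unordered pair once.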

  17. Semantic Query Optimization
      • Idea: use constraint knowledge to find a more efficient query execution plan
      • Has been studied in the context of relational and Datalog databases …
      • … and is now applicable in the context of RDF and SPARQL

  18. Semantic Query Optimization
        SELECT ?teachername ?coursename ?studentname
        WHERE {
          ?course rdf:type Courses;
                  taught_by ?teachername;
                  name ?coursename.
          ?participant rdf:type Participants;
                       c_id ?coursename;
                       s_id ?studentmatric.
          ?teacher rdf:type Teachers;
                   name ?teachername.
          OPTIONAL {
            ?student rdf:type Students;
                     matric ?studentmatric;
                     name ?studentname.
          }
        }

  19. A Solution Candidate Subgraph
      [Figure: the RDF graph from slide 7 (Teachers t1, t2; Students s1, s2; Courses c1, c2; Participants p1, p2 with their name, faculty, matric, taught_by, c_id and s_id edges), showing a subgraph that is a candidate for a solution of the query.]

  20. Semantic Query Optimization
        SELECT ?teachername ?coursename ?studentname
        WHERE {
          ?course rdf:type Courses;
                  taught_by ?teachername;
                  name ?coursename.
          ?participant rdf:type Participants;
                       c_id ?coursename;
                       s_id ?studentmatric.
          ?teacher rdf:type Teachers;
                   name ?teachername.
          OPTIONAL {
            ?student rdf:type Students;
                     matric ?studentmatric;
                     name ?studentname.
          }
        }
      Constraints that apply to the OPTIONAL block: FKey(Participants, [s_id], Students, [matric]), Key(Students, [matric]), Total(Students, name)
      Every participant therefore references a student that has a name, so the OPTIONAL block always matches and OPTIONAL can be replaced by AND.

  21. Semantic Query Optimization
        SELECT ?teachername ?coursename ?studentname
        WHERE {
          ?course rdf:type Courses;
                  taught_by ?teachername;
                  name ?coursename.
          ?participant rdf:type Participants;
                       c_id ?coursename;
                       s_id ?studentmatric.
          ?teacher rdf:type Teachers;
                   name ?teachername.
          ?student rdf:type Students;
                   matric ?studentmatric;
                   name ?studentname.
        }
      Constraints that apply to the Teachers pattern: FKey(Courses, [taught_by], Teachers, [name]), Key(Teachers, [name])
      Every taught_by value therefore references exactly one teacher name, so the Teachers pattern adds no information and can be dropped.

  22. Semantic Query Optimization
        SELECT ?teachername ?coursename ?studentname
        WHERE {
          ?course rdf:type Courses;
                  taught_by ?teachername;
                  name ?coursename.
          ?participant rdf:type Participants;
                       c_id ?coursename;
                       s_id ?studentmatric.
          ?student rdf:type Students;
                   matric ?studentmatric;
                   name ?studentname.
        }
      • Many more optimizations possible
        • Rewriting of filter expressions
        • Elimination of redundant rdf:type specifications
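
The effect of the two rewriting steps can be checked on a small example. The Python sketch below evaluates the original query (with the Teachers pattern and the OPTIONAL block) and the optimized query (neither) on a graph that satisfies the constraints, and compares the answers. Everything here is an assumption made for illustration: the tiny matcher covers only AND and one OPTIONAL block, the function names are hypothetical, the assignment of values to nodes is made up, and c_id is assumed to hold the course name.

```python
def solve(graph, patterns, binding=None):
    """Return all variable bindings matching every triple pattern (AND)."""
    binding = binding or {}
    if not patterns:
        return [binding]
    first, rest = patterns[0], patterns[1:]
    out = []
    for triple in graph:
        b = dict(binding)
        # Variables start with '?'; bind them or compare with earlier bindings.
        if all(b.setdefault(t, v) == v if t.startswith("?") else t == v
               for t, v in zip(first, triple)):
            out.extend(solve(graph, rest, b))
    return out


def solve_optional(graph, required, optional):
    """Evaluate `required` and extend each match by `optional` when possible."""
    out = []
    for b in solve(graph, required):
        out.extend(solve(graph, optional, b) or [b])
    return out


def project(rows, variables):
    """Project bindings onto `variables`; unbound variables become None."""
    return sorted({tuple(r.get(v) for v in variables) for r in rows}, key=repr)


def entity(node, cls, **props):
    """Triples for one node: its rdf:type plus one triple per property."""
    return {(node, "rdf:type", cls)} | {(node, p, o) for p, o in props.items()}


# Hypothetical instance data in the shape of the running example.
graph = (
    entity("t1", "Teachers", name="Joe", faculty="CS")
    | entity("t2", "Teachers", name="Fred", faculty="CS")
    | entity("s1", "Students", name="Ed", matric="11111")
    | entity("s2", "Students", name="John", matric="22222")
    | entity("c1", "Courses", name="DB", taught_by="Joe")
    | entity("c2", "Courses", name="Web", taught_by="Fred")
    | entity("p1", "Participants", c_id="DB", s_id="11111")
    | entity("p2", "Participants", c_id="Web", s_id="22222")
)

course = [("?c", "rdf:type", "Courses"), ("?c", "taught_by", "?teachername"),
          ("?c", "name", "?coursename")]
participant = [("?p", "rdf:type", "Participants"), ("?p", "c_id", "?coursename"),
               ("?p", "s_id", "?studentmatric")]
teacher = [("?t", "rdf:type", "Teachers"), ("?t", "name", "?teachername")]
student = [("?s", "rdf:type", "Students"), ("?s", "matric", "?studentmatric"),
           ("?s", "name", "?studentname")]
VARS = ["?teachername", "?coursename", "?studentname"]

original = project(solve_optional(graph, course + participant + teacher, student), VARS)
optimized = project(solve(graph, course + participant + student), VARS)
```

On this data both variants return the same two rows. If the graph violated Total(Students, name), for example by removing the name of s2, the original query would still return a row with ?studentname unbound while the rewritten one would drop it, which is why the rewriting is only sound under the constraints.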

  23. Future Work
      • Study of other types of constraints and the interaction between constraints
      • Development of a schematic approach to Semantic Query Optimization
        • Mapping to SQL/Datalog?
        • SPARQL-specific semantic optimizations?
      • Efficient constraint checking algorithms

  24. Thank you for your attention!
      • Resource Description Framework (RDF): Concepts and Abstract Syntax. http://www.w3.org/TR/rdf-concepts/. W3C Recommendation, February 10, 2004.
      • RDF Vocabulary Description Language 1.0: RDF Schema. http://www.w3.org/TR/rdf-schema/. W3C Recommendation, February 10, 2004.
      • RDF Semantics. http://www.w3.org/TR/rdf-mt/. W3C Recommendation, February 10, 2004.
      • S.T. Shenoy and Z.M. Ozsoyoglu. A System for Semantic Query Optimization. In SIGMOD, pages 181-195, 1987.
      • SPARQL Query Language for RDF. http://www.w3.org/TR/rdf-sparql-query/. W3C Proposed Recommendation, November 12, 2007.
      • G.E. Weddell. A Theory of Functional Dependencies for Object-Oriented Data Models. In DOOD, pages 165-184, 1989.
      • C. Bizer. D2R MAP - A Database to RDF Mapping Language. In WWW (Posters), 2003.
      • C. Bizer, R. Cyganiak, J. Garbers, and O. Maresch. D2RQ: Treating Non-RDF Relational Databases as Virtual RDF Graphs. User Manual and Language Specification.
      • J.J. King. QUIST: A System for Semantic Query Optimization in Relational Databases. Distributed Systems, Vol. II, pages 287-294, 1986.
      • G. Lausen. Relational Databases in RDF. In Joint ODBIS & SWDB Workshop on Semantic Web, Ontologies, Databases, 2007.
      • B. Motik, I. Horrocks, and U. Sattler. Bridging the Gap Between OWL and Relational Databases. In WWW, pages 807-816, 2007.
      • J. Pérez, M. Arenas, and C. Gutierrez. Semantics and Complexity of SPARQL. In CoRR Technical Report cs.DB/0605124, 2006.

  25. Additional Resources

  26. Checking Constraints with SPARQL
      • Checking foreign key constraints FKey(C,[p1,…,pn],D,[q1,…,qn]):
          ASK {
            ?x rdf:type C; p1 ?p1; [...]; pn ?pn.
            OPTIONAL { ?y rdf:type D; q1 ?p1; [...]; qn ?pn. }
            FILTER (!bound(?y))
          }
        • First pattern: bind objects of type C, with properties bound to ?p1, …, ?pn
        • OPTIONAL: bind the (referenced) object to variable ?y, if any
        • FILTER: only keep results for which no referenced object exists
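
The three steps of the query (bind the referencing objects, look for a referenced object, keep the ones without a partner) translate directly into Python. The sketch below is a hypothetical helper over a graph stored as a set of tuples, with made-up sample data. It returns the instances of C that have no referenced object in D, so the ASK answer is "yes" exactly when the returned list is non-empty.

```python
from itertools import product


def fkey_violations(graph, cls, props, ref_cls, ref_props):
    """Instances of `cls` whose values under `props` match no instance of `ref_cls`."""

    def tuples(subject, properties):
        # All value combinations of `subject` for the listed properties.
        cols = [[o for s, p, o in graph if s == subject and p == prop]
                for prop in properties]
        return set(product(*cols))

    def members(c):
        return {s for s, p, o in graph if p == "rdf:type" and o == c}

    # Value tuples offered by the referenced class (the OPTIONAL part).
    referenced = set()
    for y in members(ref_cls):
        referenced |= tuples(y, ref_props)
    # Keep instances with at least one value tuple lacking a partner
    # (the FILTER (!bound(?y)) part).
    return sorted(x for x in members(cls) if tuples(x, props) - referenced)


# Hypothetical data: course c2 is taught by a teacher that does not exist.
graph = {
    ("t1", "rdf:type", "Teachers"), ("t1", "name", "Joe"),
    ("c1", "rdf:type", "Courses"), ("c1", "taught_by", "Joe"),
    ("c2", "rdf:type", "Courses"), ("c2", "taught_by", "Bob"),
}
dangling = fkey_violations(graph, "Courses", ["taught_by"], "Teachers", ["name"])
```

As in the ASK query, an instance that lacks one of the properties p1, …, pn is not reported here; a missing value would be caught by a separate totality check.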

  27. RDFS Constraints
      Let Ci denote classes, Qi denote properties
      • Subclass constraint: SubC(C1,C2)
      • Subproperty constraint: SubP(Q1,Q2)
      • Property domain/range: PropD(Q,C), PropR(Q,C)
      • Restrict the state space of the database
      • No "axioms" that are used for inferencing

  28. Satisfiability
      Given an RDF vocabulary and a set of constraints: is there a non-empty RDF graph that satisfies the constraints?
      • In general undecidable
      • Always satisfiable for the combination of:
        • Primary keys + foreign keys
        • Singleton
        • Max-cardinality
        • Subclass + subproperty
        • Property domain + property range

  29. Satisfiability
      Given an RDF vocabulary and a set of constraints: is there a non-empty RDF graph that satisfies the constraints?
      • In general undecidable
      • Undecidable for the combination of:
        • Primary keys + foreign keys
        • Singleton
        • Max-cardinality
        • Subclass + subproperty
        • Property domain + property range
        • Min-cardinality

  30. Satisfiability
      Given an RDF vocabulary and a set of constraints: is there a non-empty RDF graph that satisfies the constraints?
      • In general undecidable
      • Decidable in ExpTime for the combination of:
        • Unary primary keys
        • Unary foreign keys
        • Min-cardinality + max-cardinality
        • Subclass + subproperty
        • Property domain + property range

  31. The SPARQL Query Language
      Operator AND (".")
        SELECT ?name ?faculty
        WHERE {
          ?teacher rdf:type Teachers.
          ?teacher name ?name.
          ?teacher faculty ?faculty.
        }
      [Figure: Teachers graph with t1, t2, names "Joe" and "Fred", faculty "CS".]

  32. The SPARQL Query Language
      Operator UNION
        SELECT ?name ?faculty
        WHERE {
          {
            ?teacher rdf:type Teachers.
            ?teacher name ?name.
            ?teacher faculty ?faculty.
            FILTER (?name="Joe")
          }
          UNION
          {
            ?teacher rdf:type Teachers.
            ?teacher name ?name.
            ?teacher faculty ?faculty.
            FILTER (?name="Fred")
          }
        }
      [Figure: Teachers graph with t1, t2, names "Joe" and "Fred", faculty "CS".]

  33. The SPARQL Query Language
      Operator FILTER
        SELECT ?name ?faculty
        WHERE {
          ?teacher rdf:type Teachers.
          ?teacher name ?name.
          ?teacher faculty ?faculty.
          FILTER (?name="Joe")
        }
      [Figure: Teachers graph with t1, t2, names "Joe" and "Fred", faculty "CS".]

  34. The SPARQL Query Language
      Operator OPTIONAL
        SELECT ?name ?faculty ?title
        WHERE {
          ?teacher rdf:type Teachers.
          ?teacher name ?name.
          ?teacher faculty ?faculty.
          OPTIONAL { ?teacher title ?title. }
        }
      [Figure: Teachers graph with t1, t2, names "Joe" and "Fred", faculty "CS"; one teacher has the title "Professor".]
