
SPARQL Query Rewriting for Implementing Data Integration over Linked Data

Gianluca Correndo, Manuel Salvadores, Ian Millard, Hugh Glaser, Nigel Shadbolt.


Presentation Transcript


  1. SPARQL Query Rewriting for Implementing Data Integration over Linked Data. Gianluca Correndo, Manuel Salvadores, Ian Millard, Hugh Glaser, Nigel Shadbolt

  2. Linked Data access • Retrieving RDF content via HTTP requests • Instance-based vs. schema-based access • Accessing SPARQL endpoints • Schema-based vs. instance-based access [Figure label: SPARQL + HTTP]

  3. Linked Data – Schema-based integration (SPARQL) [Figure labels: Query, Ontology, Co-reference, Data set; source, target] OA = <SO, TO, TD, EA>, where SO: Source Ontologies, TO: Target Ontologies, TD: Target Dataset, EA: Entity Alignments • Datasets can use more than one ontology to describe their data • More than one dataset can use the same set of ontologies coherently (e.g. RKB) • More than one ontology can be used to define a SPARQL query • Ontologies contain many entities to be aligned

  4. Query Rewriting Architecture [Figure labels: <source> SPARQL query; SPARQL query rewriter; <target> SPARQL query (<KISTI> SPARQL query, <dbpedia> SPARQL query); Alignments; voiD]

  5. Ontology Alignment • DL primitives are used to describe concept alignments (i.e. equivalence and subsumption) • The implementation of the underlying ontological mediation is usually not provided, or relies on reasoners • Ontological mediation is usually applied to data, not queries • Rule systems exploit alignments to translate data • [Euzenat] SPARQL for integrating data:
     CONSTRUCT { ?x rdf:type vc:VCard } WHERE { ?x rdf:type foaf:Person }
     How to write such queries?

  6. Anatomy of a SPARQL query • Query type: SELECT, DESCRIBE, CONSTRUCT, ASK • Basic Graph Pattern (BGP): the graph pattern that resulting triples must satisfy • Filter section: additional constraints over variables present in the BGP
     PREFIX id: <http://southampton.rkbexplorer.com/id/>
     PREFIX akt: <http://www.aktors.org/ontology/portal#>
     SELECT DISTINCT ?a WHERE {
       ?paper akt:has-author id:person-02686 .
       ?paper akt:has-author ?a .
     }

  7. SPARQL BGP
     PREFIX id: <http://southampton.rkbexplorer.com/id/>
     PREFIX akt: <http://www.aktors.org/ontology/portal#>
     SELECT DISTINCT ?a WHERE {
       ?paper akt:has-author id:person-02686 , ?a .
     }
     • “DISTINCT ?a” is not represented in this graph • Constraints over nodes can be represented either in the graph or within the FILTER section
     [Figure: node ?paper with two akt:has-author edges, one to ?a and one to id:person-02686]

  8. Entity Alignment as Graph Rewriting • Query rewriting based on BGP graph rewriting • Entity Alignment EA = <LHS, RHS, FD> • LHS : Triple to match (open variables to bind) • RHS : Set of triples to instantiate (depending on previous bindings on open variables) • FD : Functional dependencies (between variables)
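The EA = <LHS, RHS, FD> triple on this slide can be pictured as a small data structure plus a matching function. The following is only a sketch under stated assumptions: the names `EntityAlignment`, `is_open` and `match` are hypothetical (the authors' Java/Jena code is not shown in the slides), triples are plain string tuples, and blank-node labels such as `_:1` play the role of open variables.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple

Triple = Tuple[str, str, str]


@dataclass
class EntityAlignment:
    """Hypothetical container for one alignment EA = <LHS, RHS, FD>."""
    lhs: Triple            # triple pattern to match
    rhs: List[Triple]      # triples to instantiate
    # Assumed encoding of FD: target node -> (function, node it depends on)
    fd: Dict[str, Tuple[Callable[[str], str], str]] = field(default_factory=dict)


def is_open(term: str) -> bool:
    """Blank-node labels act as open variables inside an alignment."""
    return term.startswith("_:")


def match(lhs: Triple, triple: Triple) -> Optional[Dict[str, str]]:
    """Bind the open variables of `lhs` to the terms of `triple`, or return None."""
    bindings: Dict[str, str] = {}
    for pattern_term, query_term in zip(lhs, triple):
        if is_open(pattern_term):
            # A repeated open variable must bind to the same term each time.
            if bindings.setdefault(pattern_term, query_term) != query_term:
                return None
        elif pattern_term != query_term:
            return None
    return bindings


example = EntityAlignment(
    lhs=("_:1", "rdf:type", "source:A"),
    rhs=[("_:1", "rdf:type", "target:B")],
)
print(match(example.lhs, ("?p", "rdf:type", "source:A")))
```

Keeping FD as a mapping from a target node to a function and its input node is one possible design; it makes the "functional dependency between variables" explicit without committing to any particular alignment API.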

  9. Entity Alignment as Graph Rewriting • Using the graph rewriting formalism, we can rewrite queries defined for one data set (or ontology) to integrate results from other data sets • We can also generate CONSTRUCT queries to integrate entire data sets

  10. SPARQL Rewriting • Each triple from the BGP is matched against the LHSs (generating variable bindings in the process) • Any functional dependencies are resolved (enriching the bindings with new associations) • The corresponding RHS is instantiated with the resulting bindings and replaces the original triple • Unbound variables generate new variables
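The four steps on this slide can be sketched as a short loop. This is an illustrative reading, not the authors' implementation: `rewrite_bgp`, the tuple encoding of alignments and the `?newN` naming scheme are assumptions, and the sketch does not check that a fresh name is unused in the original query. The example alignment is the akt:has-author to kisti:CreatorInfo / kisti:hasCreator mapping used elsewhere in the deck.

```python
from itertools import count
from typing import Callable, Dict, List, Optional, Tuple

Triple = Tuple[str, str, str]
FDs = Dict[str, Tuple[Callable[[str], str], str]]
Alignment = Tuple[Triple, List[Triple], FDs]  # (LHS, RHS, FD)


def match(lhs: Triple, triple: Triple) -> Optional[Dict[str, str]]:
    """Step 1: match a query triple against an LHS, collecting bindings."""
    bindings: Dict[str, str] = {}
    for pattern_term, query_term in zip(lhs, triple):
        if pattern_term.startswith("_:"):
            if bindings.setdefault(pattern_term, query_term) != query_term:
                return None
        elif pattern_term != query_term:
            return None
    return bindings


def rewrite_bgp(bgp: List[Triple], alignments: List[Alignment]) -> List[Triple]:
    fresh = count(1)
    rewritten: List[Triple] = []
    for triple in bgp:
        for lhs, rhs, fds in alignments:
            bindings = match(lhs, triple)
            if bindings is None:
                continue
            # Step 2: resolve functional dependencies, enriching the bindings.
            for target, (function, source) in fds.items():
                bindings[target] = function(bindings[source])
            # Steps 3 and 4: instantiate the RHS; unbound nodes get new variables.
            for rhs_triple in rhs:
                terms = []
                for term in rhs_triple:
                    if term.startswith("_:"):
                        if term not in bindings:
                            bindings[term] = f"?new{next(fresh)}"
                        term = bindings[term]
                    terms.append(term)
                rewritten.append((terms[0], terms[1], terms[2]))
            break
        else:
            rewritten.append(triple)  # no alignment applies: keep the triple
    return rewritten


has_author: Alignment = (
    ("_:1", "akt:has-author", "_:2"),
    [("_:1", "kisti:CreatorInfo", "_:3"), ("_:3", "kisti:hasCreator", "_:2")],
    {},
)
query_bgp = [
    ("?paper", "akt:has-author", "id:person-02686"),
    ("?paper", "akt:has-author", "?a"),
]
result = rewrite_bgp(query_bgp, [has_author])
for t in result:
    print(" ".join(t), ".")
```

Because the bindings are rebuilt for every query triple, each akt:has-author triple gets its own intermediate variable (`?new1`, `?new2`) rather than sharing one.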

  11. SPARQL Rewriting • Example: • LHS1 = <_:1, rdf:type, source:A> • RHS1 = {<_:1, rdf:type, target:B>} • FD1 = {} • <?p, rdf:type, source:A> = LHS1[_:1/?p] • RHS1[_:1/?p] = <?p, rdf:type, target:B> • _:1 is the RDF notation for a blank node; within a graph, blank nodes are treated as existentially quantified variables. • Triple(v1, rdf:type, source:A) is rewritten to Triple(v1, rdf:type, target:B)

  12. Ontology Alignments – Class Eq. • Source query: SELECT * WHERE { ?s a source:User . … } with LHS <_:1, rdf:type, source:User> • Target query: SELECT * WHERE { ?s a target:Agent . … } with RHS <_:1, rdf:type, target:Agent>

  13. Ontology Alignments – Class Partition • Source query: SELECT * WHERE { ?s a source:WhiteWine . … } with LHS <_:1, rdf:type, source:WhiteWine> • Target query: SELECT * WHERE { ?s a target:Vin ; target:has-color "blanc"@fr … } with RHS <_:1, rdf:type, target:Vin>, <_:1, target:has-color, "blanc"@fr>

  14. Ontology Alignments – Property Eq. • Source query: SELECT * WHERE { ?s source:has-name ?n . … } with LHS <_:1, source:has-name, _:2> • Target query: SELECT * WHERE { ?s target:fullName ?n . … } with RHS <_:1, target:fullName, _:2>

  15. Ontology Alignments – Property Eq. • Source query: SELECT * WHERE { ?p akt:has-author ?a . … } with LHS <_:1, akt:has-author, _:2> • Target query: SELECT * WHERE { ?p kisti:CreatorInfo ?i . ?i kisti:hasCreator ?a … } with RHS <_:1, kisti:CreatorInfo, _:3>, <_:3, kisti:hasCreator, _:2>

  16. Ontology Alignments – Property Eq. • Source query: SELECT * WHERE { ?p source:temp "10"^^C . … } with LHS <_:1, source:temp, _:2> • Target query: SELECT * WHERE { ?p target:farenheit "50"^^F … } with RHS <_:1, target:farenheit, _:2> • Binding Celsius values directly to Fahrenheit is wrong: the two values are linked by a functional dependency (celsius2farenheit; the figure labels the nodes involved _:2 and _:3).
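A minimal sketch of how such a functional dependency might be resolved for a constant literal follows. It assumes the converted value is bound to a separate node `_:3`, as the figure labels suggest; the literal parsing, the `fds` encoding and the shorthand datatypes `C` and `F` are taken from the slide's notation or invented for illustration. A query that has a variable in the object position would need different handling (for example a FILTER), which this sketch leaves out.

```python
import re
from typing import Callable, Dict, Tuple

# Matches the slide's shorthand for typed literals, e.g. "10"^^C
TYPED_LITERAL = re.compile(r'^"([^"]*)"\^\^(\S+)$')


def celsius2farenheit(literal: str) -> str:
    """Convert a "<n>"^^C literal into the corresponding "<m>"^^F literal."""
    parsed = TYPED_LITERAL.match(literal)
    if parsed is None or parsed.group(2) != "C":
        raise ValueError(f"not a Celsius literal: {literal!r}")
    fahrenheit = float(parsed.group(1)) * 9 / 5 + 32
    return f'"{fahrenheit:g}"^^F'


# Assumed FD encoding: target node -> (function, node it depends on)
fds: Dict[str, Tuple[Callable[[str], str], str]] = {
    "_:3": (celsius2farenheit, "_:2"),
}

# Bindings obtained by matching <_:1, source:temp, _:2> against the query triple
bindings = {"_:1": "?p", "_:2": '"10"^^C'}
for target, (function, source) in fds.items():
    bindings[target] = function(bindings[source])

rhs = ("_:1", "target:farenheit", "_:3")
rewritten = tuple(bindings.get(term, term) for term in rhs)
print(rewritten)
```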

  17. SPARQL Rewriting
     PREFIX id: <http://southampton.rkbexplorer.com/id/>
     PREFIX akt: <http://www.aktors.org/ontology/portal#>
     SELECT DISTINCT ?a WHERE {
       ?paper akt:has-author id:person-02686 .
       ?paper akt:has-author ?a .
     }
     [Figure: the query BGP (?paper with akt:has-author edges to ?a and to id:person-02686) next to the alignment from <_:1, akt:has-author, _:2> to <_:1, kisti:CreatorInfo, _:3>, <_:3, kisti:hasCreator, _:2>]

  18. SPARQL Rewriting [Figure: the two akt:has-author edges of the BGP are rewritten one at a time into kisti:CreatorInfo / kisti:hasCreator paths through the new variables ?new1 and ?new2] • Problem: in the KISTI dataset, <http://southampton.rkbexplorer.com/id/person-02686> is unknown.

  19. Co-reference integration • Constants in the query (such as URIs) must be translated in order to retrieve correct results • URI equivalences are maintained by co-reference services such as http://sameas.org, accessible via a REST interface • Co-reference is modeled as a functional dependency between variables • The function returns the equivalent URI that satisfies a regex pattern • Datasets maintain URIs that are recognizable by a common pattern (at least a shared prefix, e.g. http://dbpedia.org/resource/*)
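The co-reference function described above can be sketched without any network access. Everything below is an assumption made for illustration: `EQUIVALENCE_BUNDLES` is a hypothetical in-memory stand-in for a co-reference service (the real REST interface is not modeled), the function name `sameas` is borrowed from the figure label, and the full KISTI URI is a guess at how the prefixed name kisti:PER_000000000105047 expands.

```python
import re
from typing import Iterable, Optional, Set

# Hypothetical local stand-in for the equivalence bundles a co-reference
# service would return; the expanded KISTI URI is an assumption.
EQUIVALENCE_BUNDLES = [
    {
        "http://southampton.rkbexplorer.com/id/person-02686",
        "http://kisti.rkbexplorer.com/id/PER_000000000105047",
    },
]


def sameas(uri: str, pattern: str,
           bundles: Iterable[Set[str]] = EQUIVALENCE_BUNDLES) -> Optional[str]:
    """Return a URI equivalent to `uri` that fully matches `pattern`, if any."""
    regex = re.compile(pattern)
    for bundle in bundles:
        if uri not in bundle:
            continue
        # Sorting makes the choice deterministic when several URIs qualify.
        for candidate in sorted(bundle):
            if regex.fullmatch(candidate):
                return candidate
    return None


kisti_uri = sameas(
    "http://southampton.rkbexplorer.com/id/person-02686",
    r"http://kisti.rkbexplorer.com/id/\S*",
)
print(kisti_uri)
```

Used as a functional dependency, such a function would replace a source-dataset constant in the rewritten triple with the URI the target dataset knows; returning `None` signals that no equivalent URI is available.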

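  20. Co-reference integration [Figure: the akt:has-author to kisti:CreatorInfo / kisti:hasCreator alignment extended with sameas functional dependencies, using the pattern http://kisti.rkbexplorer.com/id/\S*; node labels _:11, _:12, _:3, _:21, _:22; example: id:person-02686 is mapped to kisti:PER_000000000105047]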

  21. Implementation • Java package based on Jena API for SPARQL Query rewriting • Code not released yet (planning to integrate it with INRIA ontology alignment API)

  22. Progress report • Contact with Francois Scharffe and Jerome Euzenat • Partial mapping to the EDOAL ontology alignment specification (work in progress) • SPARQL query rewriter to be implemented in the Alignment API (partially done)

  23. EDOAL - Expressive and Declarative Ontology Alignment Language • Construction of entities from other entities can be expressed through algebraic operators • Restrictions can be expressed on entities in order to narrow their scope. • Transformations of property values can be specified. Property values using different encoding or units can be aligned using transformations.

  24. EDOAL - Example
     <http://oms.omwg.org/wine-vin/MappingRule_3>
         :entity1 wine:Bordeaux ;
         :entity2 [ edoal:and ( vin:Vin
                                [ a edoal:AttributeValueRestriction ;
                                  edoal:comparator xsd:equals ;
                                  edoal:onAttribute [ edoal:compose ( vin:hasTerroir proton:locatedIn ) ;
                                                      a edoal:Relation ] ;
                                  edoal:value vin:Aquitaine ] ) ;
                    a edoal:Class ] ;
         :measure "1."^^xsd:float ;
         :relation "SubsumedBy" ;
         a :Cell .

  25. Internal Representation [Figure: LHS <_:6, rdf:type, wine:Bordeaux>; RHS <_:6, rdf:type, vin:Vin>, <_:6, vin:hasTerroir, _:9>, <_:9, proton:locatedIn, vin:Aquitaine>]

  26. Progress report • Graph pattern rewriting can also be used to create CONSTRUCT queries that translate RDF graphs between different ontologies.
     CONSTRUCT {
       ?9 <http://proton.semanticweb.org/locatedIn> <http://ontology.deri.org/vin#Aquitaine> .
       ?6 <http://ontology.deri.org/vin#hasTerroir> ?9 .
       ?6 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://ontology.deri.org/vin#Vin> .
     } WHERE {
       ?6 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/TR/2003/CR-owl-guide-20030818/wine#Bordeaux> .
     }
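One way to derive such a CONSTRUCT query from an alignment is to use the RHS as the template and the LHS as the WHERE pattern. The sketch below is an assumption-laden illustration, not the authors' generator: `construct_query` is a hypothetical name, functional dependencies are ignored, prefixed names are used without emitting PREFIX declarations, and nodes that do not occur in the LHS are kept as template blank nodes rather than turned into variables. That last choice is deliberate: in standard SPARQL, a template triple containing a variable left unbound by the WHERE clause is omitted from the output, whereas a blank node in the template yields a fresh node for each solution.

```python
from typing import Iterable, Tuple

Triple = Tuple[str, str, str]


def construct_query(lhs: Triple, rhs: Iterable[Triple]) -> str:
    """Build a CONSTRUCT query with `rhs` as template and `lhs` as WHERE pattern."""
    # Only nodes that occur in the LHS are bound by the WHERE clause.
    bound = {term for term in lhs if term.startswith("_:")}

    def render(term: str) -> str:
        # Bound nodes become variables; constants and other blank nodes stay as-is.
        return "?" + term[2:] if term in bound else term

    def line(triple: Triple) -> str:
        return "  " + " ".join(render(term) for term in triple) + " ."

    template = "\n".join(line(triple) for triple in rhs)
    return "CONSTRUCT {\n" + template + "\n} WHERE {\n" + line(lhs) + "\n}"


bordeaux_lhs = ("_:6", "rdf:type", "wine:Bordeaux")
bordeaux_rhs = [
    ("_:6", "rdf:type", "vin:Vin"),
    ("_:6", "vin:hasTerroir", "_:9"),
    ("_:9", "proton:locatedIn", "vin:Aquitaine"),
]
query = construct_query(bordeaux_lhs, bordeaux_rhs)
print(query)
```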

  27. Thanks. Questions?

  28. Outline • Linked Data • Data topology • Data access • Query Rewriting • Ontology Alignment • Entity Alignment • SPARQL rewriting

  29. Linked Data topology • Foreign URIs for referring to external entities • Co-references for referring to instance “equivalence”
