gStore: Answering SPARQL Queries Via Subgraph Matching
Lei Zou (1), Jinghui Mo (1), Lei Chen (2), M. Tamer Özsu (3), Dongyan Zhao (1)
(1) Peking University, (2) Hong Kong University of Science and Technology, (3) University of Waterloo
Outline: Background & Related Work; Overview of gStore; Encoding Technique; VS*-tree & Query Algorithm; Experiments; Conclusions
Semantic Web “Semantic Web Technologies” is a collection of standard technologies to realize a Web of Data.
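RDF Data Model [Figure: example RDF triples, with labels marking URIs and literals]
RDF Graph [Figure: the same data drawn as a graph of entity vertices and literal vertices]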
SPARQL Queries
SPARQL Query:
Select ?name Where {
  ?m <hasName> ?name .
  ?m <BornOnDate> "1809-02-12" .
  ?m <DiedOnDate> "1865-04-15" .
}
[Figure: the corresponding query graph]
Naïve Triple Store
SPARQL Query: Select ?name Where { ?m <hasName> ?name . ?m <BornOnDate> "1809-02-12" . ?m <DiedOnDate> "1865-04-15" . }
SQL: Select T3.Object From T as T1, T as T2, T as T3 Where T1.Predicate = 'BornOnDate' and T1.Object = '1809-02-12' and T2.Predicate = 'DiedOnDate' and T2.Object = '1865-04-15' and T3.Predicate = 'hasName' and T1.Subject = T2.Subject and T2.Subject = T3.Subject
Too many self-joins: every triple pattern in the query adds another copy of table T to the join.
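The self-join pattern can be tried out with a minimal sketch using Python's built-in sqlite3 module. The table layout T(Subject, Predicate, Object), the entity identifiers, and the sample rows are assumptions made for illustration; they are not taken from the slides. The query selects the Object column of the hasName row so that it returns the name.

```python
import sqlite3

# Hypothetical three-column triple table; column names are an assumption.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE T (Subject TEXT, Predicate TEXT, Object TEXT)")
conn.executemany(
    "INSERT INTO T VALUES (?, ?, ?)",
    [
        # Illustrative rows only.
        ("y:Abraham_Lincoln", "hasName", "Abraham Lincoln"),
        ("y:Abraham_Lincoln", "BornOnDate", "1809-02-12"),
        ("y:Abraham_Lincoln", "DiedOnDate", "1865-04-15"),
        ("y:Other_Person", "hasName", "Other Person"),
        ("y:Other_Person", "BornOnDate", "1809-02-12"),
    ],
)

# One alias of T per triple pattern: three patterns need two self-joins.
self_join_sql = """
SELECT T3.Object
FROM T AS T1, T AS T2, T AS T3
WHERE T1.Predicate = 'BornOnDate' AND T1.Object = '1809-02-12'
  AND T2.Predicate = 'DiedOnDate' AND T2.Object = '1865-04-15'
  AND T3.Predicate = 'hasName'
  AND T1.Subject = T2.Subject AND T2.Subject = T3.Subject
"""
self_join_names = [row[0] for row in conn.execute(self_join_sql)]
print(self_join_names)
```

A query with k triple patterns needs k aliases of the same table in this layout, which is the cost the slide points at.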
Existing Solutions
Three categories of solutions have been proposed to speed up query processing:
1. Property Table: Jena [K. Wilkinson et al. SWDB 03], …
2. Vertically Partitioned Solution: SW-store [D. J. Abadi et al. VLDB 07], …
3. Exhaustive Indexing: RDF-3X [T. Neumann et al. VLDB 08], Hexastore [C. Weiss et al. VLDB 08], …
Existing Solutions - Property Table
SPARQL Query: Select ?name Where { ?m <hasName> ?name . ?m <BornOnDate> "1809-02-12" . ?m <DiedOnDate> "1865-04-15" . }
SQL: Select People.hasName From People Where People.BornOnDate = '1809-02-12' and People.DiedOnDate = '1865-04-15'
Reducing the number of join steps.
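For comparison, here is a minimal sqlite3 sketch of the property-table layout. The People table with one column per property follows the SQL on the slide; the Subject key column and the sample rows are assumptions added for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# One row per entity, one column per property (hypothetical schema).
conn.execute(
    "CREATE TABLE People ("
    "Subject TEXT PRIMARY KEY, hasName TEXT, BornOnDate TEXT, DiedOnDate TEXT)"
)
conn.executemany(
    "INSERT INTO People VALUES (?, ?, ?, ?)",
    [
        ("y:Abraham_Lincoln", "Abraham Lincoln", "1809-02-12", "1865-04-15"),
        # A missing property becomes NULL in this layout.
        ("y:Other_Person", "Other Person", "1809-02-12", None),
    ],
)

# All three triple patterns are answered from a single row: no joins.
property_table_names = [
    row[0]
    for row in conn.execute(
        "SELECT hasName FROM People WHERE BornOnDate = ? AND DiedOnDate = ?",
        ("1809-02-12", "1865-04-15"),
    )
]
print(property_table_names)
```

The query stays join-free only while every property it touches lives in the same table.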
Existing Solutions - Vertically Partitioned Solution: Fast merge join
Existing Solutions - Exhaustive Indexing
Range query & merge join: each SPARQL query statement can be translated into one "range query".
SPARQL Query: Select ?name Where { ?m <hasName> ?name . ?m <BornOnDate> "1809-02-12" . ?m <DiedOnDate> "1865-04-15" . }
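The range-query-plus-merge-join idea can be sketched with sorted Python lists standing in for the on-disk indexes. This is a simplified illustration, not how RDF-3X or Hexastore are implemented: only two of the possible orderings are built, and the function names, entity identifiers, and sample triples are hypothetical.

```python
from bisect import bisect_left

# Illustrative (subject, predicate, object) triples.
triples = [
    ("y:Abraham_Lincoln", "hasName", "Abraham Lincoln"),
    ("y:Abraham_Lincoln", "BornOnDate", "1809-02-12"),
    ("y:Abraham_Lincoln", "DiedOnDate", "1865-04-15"),
    ("y:Other_Person", "hasName", "Other Person"),
    ("y:Other_Person", "BornOnDate", "1809-02-12"),
]

# Two sorted orderings of the same triples.
pos = sorted((p, o, s) for s, p, o in triples)  # predicate, object, subject
pso = sorted((p, s, o) for s, p, o in triples)  # predicate, subject, object

def range_scan(index, first, second):
    """Return the third column of all rows whose first two columns match."""
    i = bisect_left(index, (first, second, ""))
    out = []
    while i < len(index) and index[i][:2] == (first, second):
        out.append(index[i][2])
        i += 1
    return out  # sorted, because the index is sorted

def merge_join(a, b):
    """Intersect two sorted lists in a single linear pass."""
    i = j = 0
    out = []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return out

born = range_scan(pos, "BornOnDate", "1809-02-12")
died = range_scan(pos, "DiedOnDate", "1865-04-15")
joined_subjects = merge_join(born, died)
indexed_names = [n for m in joined_subjects for n in range_scan(pso, "hasName", m)]
print(indexed_names)
```

Each triple pattern becomes one contiguous scan, and because every scan returns sorted values, the joins need no extra sorting step.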
Some Limitations: Difficult to handle "wildcard queries". Difficult to handle updates.
Intuition of gStore Finding Matches over a Large Graph is not a trivial task.
Preliminaries [Figure: RDF graph with entity vertices and literal vertices]
Storage Schema in gStore: Encoding all neighbors of a vertex into a "bit-string", called a signature.
Encoding Technique (1) [Figure: each adjacent edge of a vertex, here (hasName, "Abraham Lincoln"), (BornOnDate, "1809-02-12"), (DiedOnDate, "1865-04-15") and (DiedIn, "y:Washington_D.c"), is hashed to a bit string; a label such as "Abraham Lincoln" is first split into 3-grams "Abr", "bra", "rah", "aha", …; the per-edge bit strings are combined with bitwise OR into the vertex signature]
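The encoding can be illustrated with a small Python sketch. The bit widths, the use of zlib.crc32 as the hash function, and the function names are assumptions chosen for the example; the slides do not specify the actual hash functions or signature lengths.

```python
import zlib

PROP_BITS = 12   # hypothetical width of the property part
LABEL_BITS = 16  # hypothetical width of the neighbor-label part

def ngrams(text, n=3):
    """Split a label into overlapping n-grams ("Abr", "bra", "rah", ...)."""
    return [text[i:i + n] for i in range(len(text) - n + 1)] or [text]

def edge_signature(prop, value):
    """Hash one (property, neighbor label) pair to a bit string."""
    prop_part = 1 << (zlib.crc32(prop.encode()) % PROP_BITS)
    label_part = 0
    for gram in ngrams(value):
        label_part |= 1 << (zlib.crc32(gram.encode()) % LABEL_BITS)
    # Property bits occupy the high part, label bits the low part.
    return (prop_part << LABEL_BITS) | label_part

def vertex_signature(edges):
    """OR the signatures of all adjacent edges into one vertex signature."""
    sig = 0
    for prop, value in edges:
        sig |= edge_signature(prop, value)
    return sig

lincoln_edges = [
    ("hasName", "Abraham Lincoln"),
    ("BornOnDate", "1809-02-12"),
    ("DiedOnDate", "1865-04-15"),
    ("DiedIn", "y:Washington_D.c"),
]
lincoln_sig = vertex_signature(lincoln_edges)
print(format(lincoln_sig, f"0{PROP_BITS + LABEL_BITS}b"))
```

Because the combination step is a bitwise OR, every bit set by any single edge is also set in the vertex signature, which is what makes the signature usable as a filter.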
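A Straightforward Solution (1) [Figure: query vertices u1 and u2 with lists L1 and L2]
A Straightforward Solution (2) [Figure: joining L1 and L2] Large join space!
Pruning Technique [Figure: query vertices u1 and u2, with bit string 10010] Reduced join space!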
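A signature-based filter of this kind can be sketched as follows. This is a simplified stand-in, not the VS*-tree algorithm: it scans all vertices linearly, hashes whole values instead of n-grams, and uses a hypothetical 64-bit signature with zlib.crc32 as the hash. A data vertex survives the filter only if its signature contains every bit of the query vertex's signature; survivors are then verified exactly.

```python
import zlib

BITS = 64  # hypothetical signature width

def bit(token):
    return 1 << (zlib.crc32(token.encode()) % BITS)

def signature(edges):
    """OR together one bit per property and one per constant value."""
    sig = 0
    for prop, value in edges:
        sig |= bit("p:" + prop)
        if value is not None:  # None stands for a query variable
            sig |= bit("v:" + value)
    return sig

def exact_match(vertex_edges, query_edges):
    """Verification step: check the query edges against the real edges."""
    props = {p for p, _ in vertex_edges}
    return all(
        (p in props) if v is None else ((p, v) in vertex_edges)
        for p, v in query_edges
    )

# Illustrative data graph: vertex -> list of (property, neighbor label).
data = {
    "y:Abraham_Lincoln": [
        ("hasName", "Abraham Lincoln"),
        ("BornOnDate", "1809-02-12"),
        ("DiedOnDate", "1865-04-15"),
    ],
    "y:Other_Person": [("hasName", "Other Person")],
}

# Query vertex ?m from the running example; ?name is a variable.
query_edges = [
    ("hasName", None),
    ("BornOnDate", "1809-02-12"),
    ("DiedOnDate", "1865-04-15"),
]
q_sig = signature(query_edges)

# Filter: keep v only if q_sig AND v_sig equals q_sig (no false negatives).
candidates = [v for v, e in data.items() if signature(e) & q_sig == q_sig]
# Verify: hash collisions can let false positives through the filter.
answers = [v for v in candidates if exact_match(data[v], query_edges)]
print(answers)
```

The filter can only discard vertices that cannot match, so the lists that enter the join shrink while the final answer set stays the same.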
An Example for Pruning Effect
Query:
?x1 y:hasGivenName ?x5
?x1 y:hasFamilyName ?x6
?x1 rdf:type <wordnet_scientist_110560637>
?x1 y:bornIn ?x2
?x1 y:hasAcademicAdvisor ?x4
?x2 y:locatedIn <Switzerland>
?x3 y:locatedIn <Germany>
?x4 y:bornIn ?x3
Conclusions: Vertex Encoding Technique; An Efficient Index Structure: VS-tree; A Novel Filtering Technique.
Q/A Thank You! zoulei@pku.edu.cn
Updates - Deletion in VS*-tree [Figure: VS*-tree with the entry to be deleted marked]
A Straightforward Solution (1) u u & 001 = u