Comparison of Triple Stores Omar Aaziz
Contents • Introduction • RDF • Ontology • Different Architectures • An Example: Jena SDB • Evaluations • Evaluations using LUBM/DBpedia • Open Research Issues • Which RDF store to choose for a particular application?
Introduction • What is an RDF store? • A system that provides persistent storage of, and access to, RDF graphs.
Resource Description Framework • A framework (not a language) for describing resources • Model for data • Syntax to allow exchange and use of information stored in various locations • The point is to facilitate reading and correct use of information by computers, not necessarily by people
Identification and description • RDF identifies resources with URIs • Often, though not always, the same as a URL • Anything that can have a URI is a RESOURCE • RDF describes resources with properties and property values • A property is a resource that has a name • Ex. Author, Book, Address, Client, Product • A property value is the value of the Property • Ex. “Joanna Santillo,” http://www.someplace.com/, etc. • A property value can be another resource, allowing nested descriptions.
Statements • A statement consists of a subject, a predicate, and an object. • Predicates are not the same as English language verbs: they specify a relationship between the subject and the object.
Example • Statement: "The author of http://www.w3schools.com/RDF is Jan Egil Refsnes". • Subject: http://www.w3schools.com/RDF • Predicate: author • Object: Jan Egil Refsnes
Binary predicates • RDF offers only binary predicates. • Think of them as P(x, y), where P is the relationship between the objects x and y. • From the example: x = http://www.w3schools.com/RDF, y = Jan Egil Refsnes, P = author • [Diagram: http://www.w3schools.com/RDF, linked by "author" to Jan Egil Refsnes]
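A minimal Python sketch of the P(x, y) view, assuming a statement is stored as a plain (subject, predicate, object) tuple. The names `Triple` and `holds` are made up for this illustration and do not come from any RDF library.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One RDF statement (hypothetical helper type for this sketch)."""
    subject: str
    predicate: str
    obj: str


def holds(p, x, y, statements):
    """Return True if the binary predicate P(x, y) is asserted in `statements`."""
    return Triple(x, p, y) in statements


# The single statement from the example slide.
statements = {
    Triple("http://www.w3schools.com/RDF", "author", "Jan Egil Refsnes"),
}

print(holds("author", "http://www.w3schools.com/RDF", "Jan Egil Refsnes", statements))
```

The order of x and y matters: swapping subject and object asks about a different statement, which is not asserted here.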
<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:cd="http://www.recshop.fake/cd#">
  <rdf:Description rdf:about="http://www.recshop.fake/cd/Empire Burlesque">
    <cd:artist>Bob Dylan</cd:artist>
    <cd:country>USA</cd:country>
    <cd:company>Columbia</cd:company>
    <cd:price>10.90</cd:price>
    <cd:year>1985</cd:year>
  </rdf:Description>
  <rdf:Description rdf:about="http://www.recshop.fake/cd/Hide your heart">
    <cd:artist>Bonnie Tyler</cd:artist>
    <cd:country>UK</cd:country>
    <cd:company>CBS Records</cd:company>
    <cd:price>9.90</cd:price>
    <cd:year>1988</cd:year>
  </rdf:Description>
  …
</rdf:RDF>
• <rdf:RDF>: root element of RDF documents • xmlns:rdf: source of the namespace for elements with the rdf prefix • xmlns:cd: source of the namespace for elements with the cd prefix • <rdf:Description>: describes the resource identified by the rdf:about attribute • cd:country etc. are properties of the resource.
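To make the link between this document and subject/predicate/object statements concrete, here is a rough Python sketch that reads a shortened copy of the document with the standard-library `xml.etree.ElementTree` parser. It assumes the flat shape shown on the slide (one `rdf:Description` per resource, literal-valued child elements only) and is not a general RDF/XML parser. The function name `triples` is made up for the illustration.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Shortened copy of the slide's document (first resource, two properties).
DOC = """<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:cd="http://www.recshop.fake/cd#">
  <rdf:Description rdf:about="http://www.recshop.fake/cd/Empire Burlesque">
    <cd:artist>Bob Dylan</cd:artist>
    <cd:country>USA</cd:country>
  </rdf:Description>
</rdf:RDF>"""


def triples(xml_text):
    """Yield (subject, predicate, object) for each property of each Description."""
    root = ET.fromstring(xml_text)
    for desc in root.findall(f"{{{RDF_NS}}}Description"):
        subject = desc.get(f"{{{RDF_NS}}}about")
        for prop in desc:
            # ElementTree writes qualified names as "{namespace}local";
            # joining the two parts gives the predicate URI.
            namespace, local = prop.tag[1:].split("}")
            yield (subject, namespace + local, prop.text)


for triple in triples(DOC):
    print(triple)
```

Each child element of a `rdf:Description` becomes one statement about the resource named in `rdf:about`, which is the same breakdown the RDF validator reports.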
RDF validator • Check the correctness of an RDF document: • http://www.w3.org/RDF/Validator/ • Result shows the subject, predicate and object of each element of the document and a graph of the model.
OWL • Web Ontology Language • Official W3C Standard since Feb 2004 • A Web Language: Based on RDF(S) • An Ontology Language: Based on logic
OWL Ontologies • What’s inside an OWL ontology • Classes + class-hierarchy • Properties (Slots) / values • Relations between classes (inheritance) • Restrictions on properties (type, cardinality)
OWL Use Cases • At least two different user groups • OWL used for terminologies or knowledge models • OWL used as data exchange language (define interfaces of services and agents)
Classes • Sets of individuals with common characteristics • Individuals are instances of at least one class • [Diagram labels: City, Beach (classes); Cairns, Sydney, BondiBeach, CurrawongBeach (individuals)]
ObjectProperties • Link two individuals together • Relationships (0..n, n..m) • [Diagram labels: Sydney, BondiBeach, FourSeasons (individuals); hasPart, hasAccomodation (properties)]
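Classes and object properties can both be written down as triples. The Python sketch below is illustrative only: the slide diagrams are not reproduced here, so which individual belongs to which class, and which individuals each property links, are assumptions read off the label names. `rdf:type` is the standard RDF membership property; the helper names are made up.

```python
RDF_TYPE = "rdf:type"

# ASSUMPTION: class membership and links are guessed from the diagram labels.
facts = {
    ("Cairns", RDF_TYPE, "City"),
    ("Sydney", RDF_TYPE, "City"),
    ("BondiBeach", RDF_TYPE, "Beach"),
    ("CurrawongBeach", RDF_TYPE, "Beach"),
    ("Sydney", "hasPart", "BondiBeach"),
    ("Sydney", "hasAccomodation", "FourSeasons"),
}


def instances_of(cls):
    """Individuals asserted to be members of the class `cls`."""
    return sorted(s for s, p, o in facts if p == RDF_TYPE and o == cls)


def linked(subject, prop):
    """Individuals that `subject` is linked to through the object property `prop`."""
    return sorted(o for s, p, o in facts if s == subject and p == prop)


print(instances_of("City"))
print(linked("Sydney", "hasPart"))
```

An object property may link one individual to zero, one, or many others, which is what the (0..n, n..m) note refers to.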
Different Architectures • Based on their implementation, RDF stores can be divided into 3 broad categories: in-memory, native, and non-native non-memory. • In-memory: the RDF graph is stored as triples in main memory. E.g. storing an RDF graph using the Jena API or the Sesame API. • Native: persistent storage systems with their own database implementation. E.g. Sesame Native, Virtuoso, AllegroGraph, Oracle 11g. • Non-native non-memory: persistent storage systems set up to run on third-party databases. E.g. Jena SDB.
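As an illustration of the in-memory category, the following Python sketch keeps triples in a set plus one per-predicate index. It is a toy under stated assumptions, not how the Jena or Sesame APIs are implemented. The class name `InMemoryStore` and its methods are invented for this example.

```python
from collections import defaultdict


class InMemoryStore:
    """Toy in-memory triple store: everything is lost when the process exits."""

    def __init__(self):
        self._triples = set()
        self._by_predicate = defaultdict(set)  # predicate -> triples using it

    def add(self, s, p, o):
        triple = (s, p, o)
        self._triples.add(triple)
        self._by_predicate[p].add(triple)

    def match(self, s=None, p=None, o=None):
        """Return triples matching the pattern; None acts as a wildcard."""
        # Use the predicate index when a predicate is given, else scan everything.
        candidates = self._triples if p is None else self._by_predicate.get(p, set())
        return sorted(
            t for t in candidates
            if (s is None or t[0] == s) and (o is None or t[2] == o)
        )


store = InMemoryStore()
store.add("http://www.w3schools.com/RDF", "author", "Jan Egil Refsnes")
print(store.match(p="author"))
```

Native and non-native persistent stores answer the same kind of pattern, but against data on disk; the categories differ in where the triples live and who implements the storage layer.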
Jena SDB • SDB is a persistent triple store that uses relational databases. • SDB is basically a Java loader. • Multiple stores are supported: MySQL, PostgreSQL, Oracle, DB2. • It takes incoming triples and breaks them down into components ready for the database. • SPARQL is supported. • (Non) Interest Declaration: I was previously an intern at HP Labs with the Jena team.
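The idea of breaking incoming triples into components for a relational database can be sketched in Python with the standard-library `sqlite3` module. This is an assumption-laden illustration: the two-table layout (`nodes`, `triples`) and all function names are invented here and are not SDB's actual schema or API, and SQLite is used only because it ships with Python (it is not one of the stores listed above).

```python
import sqlite3


def open_store(path=":memory:"):
    """Create the hypothetical two-table layout: node terms plus id triples."""
    db = sqlite3.connect(path)
    db.executescript("""
        CREATE TABLE IF NOT EXISTS nodes (
            id  INTEGER PRIMARY KEY,
            lex TEXT UNIQUE NOT NULL
        );
        CREATE TABLE IF NOT EXISTS triples (
            s INTEGER NOT NULL REFERENCES nodes(id),
            p INTEGER NOT NULL REFERENCES nodes(id),
            o INTEGER NOT NULL REFERENCES nodes(id),
            PRIMARY KEY (s, p, o)
        );
    """)
    return db


def node_id(db, lex):
    """Store a term once and return its integer id."""
    db.execute("INSERT OR IGNORE INTO nodes (lex) VALUES (?)", (lex,))
    return db.execute("SELECT id FROM nodes WHERE lex = ?", (lex,)).fetchone()[0]


def load(db, triples):
    """Break each triple into three node ids and insert one row per triple."""
    for s, p, o in triples:
        db.execute(
            "INSERT OR IGNORE INTO triples (s, p, o) VALUES (?, ?, ?)",
            (node_id(db, s), node_id(db, p), node_id(db, o)),
        )
    db.commit()


def objects(db, subject, predicate):
    """Look up object terms for a subject/predicate pair by joining back to nodes."""
    rows = db.execute(
        """
        SELECT o.lex FROM triples t
        JOIN nodes s ON s.id = t.s
        JOIN nodes p ON p.id = t.p
        JOIN nodes o ON o.id = t.o
        WHERE s.lex = ? AND p.lex = ?
        ORDER BY o.lex
        """,
        (subject, predicate),
    )
    return [row[0] for row in rows]


db = open_store()
load(db, [("http://www.w3schools.com/RDF", "author", "Jan Egil Refsnes")])
print(objects(db, "http://www.w3schools.com/RDF", "author"))
```

Storing each term once and referring to it by id keeps the triple table narrow, at the cost of joins at query time; that trade-off is one place where database layouts can differ between systems.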
Evaluations • Third-party evaluations for Sesame, Jena SDB, Virtuoso • Company evaluations for Oracle 11g • Methodology • LUBM: Lehigh University Benchmark • DBpedia: http://wiki.dbpedia.org/Ontology?show_comments=1 • Multiple queries • Load times
Evaluations • DBpedia: database of structured information extracted from Wikipedia (information about places, persons, etc.) • LUBM: synthetically generated RDF data containing universities, departments, students, etc. • Dataset sizes: • Dataset 1: 15,472,624 triples; 2.1 GB • Dataset 2: LUBM 50 (2.75 million) and LUBM 1000 (55.09 million) • 3 queries
Results – Query 1 • Simple SELECT query with 2 variables
Query 2 • Unconstrained SELECT query: only the predicate was specified
Query 3 • Complex query that uses a FILTER
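The slides do not reproduce the benchmark queries themselves, so the Python sketch below only illustrates the three query shapes named above, on made-up data borrowed from the CD example. The `select` helper, the `?`-prefixed variable convention, and the exact patterns (in particular the guess that Query 1 fixes the subject) are assumptions for this illustration, not the queries that were actually run.

```python
def select(triples, pattern, keep=None):
    """Match one triple pattern; terms starting with '?' are variables.

    Returns a list of {variable: value} bindings, optionally filtered by `keep`.
    """
    results = []
    for triple in triples:
        binding = {}
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # A repeated variable must bind to the same value each time.
                if binding.setdefault(term, value) != value:
                    break
            elif term != value:
                break
        else:
            if keep is None or keep(binding):
                results.append(binding)
    return results


# Hypothetical data, shortened from the CD example.
data = [
    ("cd:EmpireBurlesque", "cd:artist", "Bob Dylan"),
    ("cd:EmpireBurlesque", "cd:year", 1985),
    ("cd:HideYourHeart", "cd:artist", "Bonnie Tyler"),
    ("cd:HideYourHeart", "cd:year", 1988),
]

# Shape 1: simple select with two variables (assumed here: subject fixed).
q1 = select(data, ("cd:EmpireBurlesque", "?p", "?o"))
# Shape 2: unconstrained select, only the predicate is specified.
q2 = select(data, ("?s", "cd:year", "?o"))
# Shape 3: a pattern plus a filter on a bound value.
q3 = select(data, ("?s", "cd:year", "?year"), keep=lambda b: b["?year"] > 1985)

print(len(q1), len(q2), q3)
```

The second shape touches every triple that uses the predicate, so its cost grows with the dataset, while the filter in the third shape is applied after matching unless the store can push it into an index.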
Observations • Native stores perform better than systems using third-party stores. • Optimizations are possible. • Each of the systems uses a different database layout. • Virtuoso • Hashing on SDB performs very poorly.
Open Research Issues • Query optimization • The better performance of native stores points in that direction. • Some work exists on optimizing SPARQL queries for in-memory stores.
Which RDF store to choose for an app? • Frequency of loads that the application would perform. • Single scaling factor and linear load times. • Level of inferencing required. • Which query languages are supported; W3C recommendations.