Comparison of Triple Stores


Presentation Transcript


  1. Comparison of Triple Stores Omar Aaziz

  2. Contents • Introduction • RDF • Ontology • Different Architectures • An Example : Jena SDB • Evaluations • Evaluations using LUBM/DBPedia • Open Research Issues • Which RDF Store to choose for a particular application?

  3. Introduction • What is an RDF store? A system to provide a mechanism for persistent storage and access of RDF graphs.

  4. Resource Description Framework • A framework (not a language) for describing resources • Model for data • Syntax to allow exchange and use of information stored in various locations • The point is to facilitate reading and correct use of information by computers, not necessarily by people

  5. Identification and description • RDF identifies resources with URIs • Often, though not always, the same as a URL • Anything that can have a URI is a RESOURCE • RDF describes resources with properties and property values • A property is a resource that has a name • Ex. Author, Book, Address, Client, Product • A property value is the value of the Property • Ex. “Joanna Santillo,” http://www.someplace.com/, etc. • A property value can be another resource, allowing nested descriptions.

  6. Statements • subject, predicate, object of a statement • Predicates are not the same as English language verbs. • Specify a relationship between the subject and the object

  7. Example • Statement: "The author of http://www.w3schools.com/RDF is Jan Egil Refsnes". • Subject: http://www.w3schools.com/RDF • Predicate: author • Object: Jan Egil Refsnes

  8. Binary predicates • RDF offers only binary predicates. • Think of them as P(x, y), where P is the relationship between the objects x and y. • From the example: x = http://www.w3schools.com/RDF, y = Jan Egil Refsnes, P = author • [Figure labels: http://www.w3schools.com/RDF, author, Jan Egil Refsnes]
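The binary-predicate view above can be sketched in a few lines of Python. This is only an illustration: the `Triple` type and the `holds` helper are hypothetical names introduced here, not part of RDF or of any RDF library.

```python
from collections import namedtuple

# Hypothetical container for one RDF statement (not from any RDF library).
Triple = namedtuple("Triple", ["subject", "predicate", "object"])

# The statement from the example slide, written as P(x, y) with P = author.
statement = Triple(
    subject="http://www.w3schools.com/RDF",
    predicate="author",
    object="Jan Egil Refsnes",
)

# A graph is modelled here simply as a set of triples.
graph = {statement}


def holds(graph, p, x, y):
    """Return True if the binary predicate p(x, y) is asserted in the graph."""
    return Triple(x, p, y) in graph
```

Note that the predicate is directional: `holds(graph, "author", x, y)` says nothing about `holds(graph, "author", y, x)`.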

  9. RDF document (RDF/XML):

         <?xml version="1.0"?>
         <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
                  xmlns:cd="http://www.recshop.fake/cd#">
           <rdf:Description rdf:about="http://www.recshop.fake/cd/Empire Burlesque">
             <cd:artist>Bob Dylan</cd:artist>
             <cd:country>USA</cd:country>
             <cd:company>Columbia</cd:company>
             <cd:price>10.90</cd:price>
             <cd:year>1985</cd:year>
           </rdf:Description>
           <rdf:Description rdf:about="http://www.recshop.fake/cd/Hide your heart">
             <cd:artist>Bonnie Tyler</cd:artist>
             <cd:country>UK</cd:country>
             <cd:company>CBS Records</cd:company>
             <cd:price>9.90</cd:price>
             <cd:year>1988</cd:year>
           </rdf:Description>
           …
         </rdf:RDF>

     • rdf:RDF: root element of RDF documents • xmlns:rdf: source of namespace for elements with the rdf prefix • xmlns:cd: source of namespace for elements with the cd prefix • The rdf:Description element describes the resource identified by the rdf:about attribute. • cd:country etc. are properties of the resource.
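To make the link between the document and its triples concrete, here is a minimal sketch that reads a shortened copy of the first description with Python's standard `xml.etree.ElementTree`. It assumes the flat shape shown on the slide (`rdf:Description` elements with `rdf:about` and literal-valued children) and is not a general RDF/XML parser; `description_triples` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Shortened copy of the slide's document (first description, two properties).
RDF_XML = """<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:cd="http://www.recshop.fake/cd#">
  <rdf:Description rdf:about="http://www.recshop.fake/cd/Empire Burlesque">
    <cd:artist>Bob Dylan</cd:artist>
    <cd:country>USA</cd:country>
  </rdf:Description>
</rdf:RDF>"""


def description_triples(xml_text):
    """Return (subject, predicate, object) tuples for flat rdf:Description blocks."""
    root = ET.fromstring(xml_text)
    triples = []
    for desc in root.findall(f"{{{RDF_NS}}}Description"):
        subject = desc.get(f"{{{RDF_NS}}}about")
        for prop in desc:
            # ElementTree spells qualified names as "{namespace}local".
            namespace, local = prop.tag[1:].split("}")
            triples.append((subject, namespace + local, prop.text))
    return triples
```

Each child element of a description becomes one triple whose predicate is the namespace URI joined with the element's local name, for example `http://www.recshop.fake/cd#artist`.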

  10. RDF validator • Check the correctness of an RDF document: • http://www.w3.org/RDF/Validator/ • Result shows the subject, predicate and object of each element of the document and a graph of the model.

  11. OWL • Web Ontology Language • Official W3C Standard since Feb 2004 • A Web Language: Based on RDF(S) • An Ontology Language: Based on logic

  12. OWL Ontologies • What’s inside an OWL ontology • Classes + class hierarchy • Properties (slots) / values • Relations between classes (inheritance) • Restrictions on properties (type, cardinality)

  13. OWL Use Cases • At least two different user groups • OWL used for terminologies or knowledge models • OWL used as a data exchange language (to define interfaces of services and agents)

  14. Classes • Sets of individuals with common characteristics • Individuals are instances of at least one class • [Figure labels: City, Beach, Cairns, Sydney, BondiBeach, CurrawongBeach]

  15. ObjectProperties • Link two individuals together • Relationships (0..n, n..m) • [Figure labels: hasPart, hasAccomodation, Sydney, BondiBeach, FourSeasons]
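The two ideas above (classes as sets of individuals, object properties as links between individuals) can be sketched with plain Python sets. The figures themselves did not survive in this transcript, so which individual belongs to which class and which pairs each property links are assumptions made for illustration; `classes_of` and `linked` are hypothetical helper names.

```python
# ASSUMPTION: the membership and the linked pairs below are a guess at what
# the lost slide figures showed; only the names come from the slides.
classes = {
    "City": {"Cairns", "Sydney"},
    "Beach": {"BondiBeach", "CurrawongBeach"},
}

# Each object property is a set of (individual, individual) pairs.
object_properties = {
    "hasPart": {("Sydney", "BondiBeach")},
    "hasAccomodation": {("Sydney", "FourSeasons")},
}


def classes_of(individual):
    """Return the sorted names of all classes that contain the individual."""
    return sorted(name for name, members in classes.items() if individual in members)


def linked(prop, subject):
    """Return the sorted individuals that `subject` is linked to via `prop`."""
    return sorted(o for s, o in object_properties[prop] if s == subject)
```

Because a property is just a set of pairs, the same individual may be linked to zero, one, or many others, which is the 0..n and n..m case mentioned on the slide.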

  16. Different Architectures • Based on their implementation, RDF stores can be divided into 3 broad categories: In-Memory, Native, and Non-Native Non-Memory. • In-Memory: the RDF graph is stored as triples in main memory. E.g. storing an RDF graph using the Jena API or Sesame API. • Native: persistent storage systems with their own database implementation. E.g. Sesame Native, Virtuoso, AllegroGraph, Oracle 11g. • Non-Native Non-Memory: persistent storage systems set up to run on third-party DBs. E.g. Jena SDB.
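As a rough sketch of the in-memory category, the toy class below keeps triples in a Python set and answers pattern queries by scanning it. It is not the Jena or Sesame API; `InMemoryTripleStore` and its methods are hypothetical names, and a real store would add indexes rather than scan every triple.

```python
class InMemoryTripleStore:
    """Toy in-memory store: a set of (s, p, o) tuples with wildcard matching."""

    def __init__(self):
        self._triples = set()

    def add(self, s, p, o):
        """Store one triple; adding the same triple twice has no effect."""
        self._triples.add((s, p, o))

    def match(self, s=None, p=None, o=None):
        """Yield every stored triple that fits the pattern; None is a wildcard."""
        pattern = (s, p, o)
        for triple in self._triples:
            if all(want is None or want == got for want, got in zip(pattern, triple)):
                yield triple

    def __len__(self):
        return len(self._triples)
```

Everything is lost when the process exits, which is the practical difference from the two persistent categories.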

  17. Jena SDB • SDB is a persistent triple store that uses relational databases. • SDB is basically a Java loader. • Multiple stores supported: MySQL, PostgreSQL, Oracle, DB2. • Takes incoming triples and breaks them down into components ready for the database. • SPARQL supported. • (Non) Interest Declaration: I was previously an intern at HP Labs with the Jena team.
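The idea of breaking incoming triples into components for a relational database can be sketched as follows. This is an assumption-laden illustration, not SDB's actual schema or loader: SQLite (via Python's standard `sqlite3`) stands in for MySQL, PostgreSQL, Oracle, or DB2, and the `nodes`/`triples` tables and all function names are hypothetical.

```python
import sqlite3


def open_store():
    """Create an in-memory database with a node table and a triple table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE nodes (id INTEGER PRIMARY KEY, value TEXT UNIQUE NOT NULL);
        CREATE TABLE triples (
            s INTEGER NOT NULL, p INTEGER NOT NULL, o INTEGER NOT NULL,
            PRIMARY KEY (s, p, o)
        );
    """)
    return conn


def node_id(conn, value):
    """Return the id for a term, inserting it into nodes if it is new."""
    conn.execute("INSERT OR IGNORE INTO nodes (value) VALUES (?)", (value,))
    return conn.execute("SELECT id FROM nodes WHERE value = ?", (value,)).fetchone()[0]


def load_triple(conn, s, p, o):
    """Break a triple into its three terms and store their ids as one row."""
    ids = tuple(node_id(conn, term) for term in (s, p, o))
    conn.execute("INSERT OR IGNORE INTO triples (s, p, o) VALUES (?, ?, ?)", ids)


def objects_of(conn, s, p):
    """Return the object values for a given subject and predicate."""
    rows = conn.execute(
        """SELECT n.value FROM triples t JOIN nodes n ON n.id = t.o
           WHERE t.s = (SELECT id FROM nodes WHERE value = ?)
             AND t.p = (SELECT id FROM nodes WHERE value = ?)""",
        (s, p),
    )
    return [row[0] for row in rows]
```

Storing each distinct term once and referring to it by id keeps the triple table compact, at the cost of a join whenever values are read back.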

  18. Evaluations • Third-party evaluations for Sesame, Jena SDB, Virtuoso • Company evaluations for Oracle 11g • Methodology • LUBM: Lehigh University Benchmark • DBpedia: http://wiki.dbpedia.org/Ontology?show_comments=1 • Multiple queries • Load times

  19. Evaluations • DBpedia: database of structured information extracted from Wikipedia; information about places, persons, etc. • LUBM: synthetically generated RDF data containing universities, departments, students, etc. • Dataset sizes • DataSet 1: 15,472,624 triples; 2.1 GB • DataSet 2: LUBM 50 (2.75 million) and LUBM 1000 (55.09 million) • 3 queries

  20. Loading Time-DataSet1

  21. Results – Query 1 • Simple select query – 2 variables

  22. Query 2 • Unconstrained Select Query – only predicate was specified.

  23. Query 3 • Complex Query – Uses filter

  24. Oracle 11g – DataSet 2

  25. Observations • Native Stores perform better than systems using third party stores. • Optimizations are possible • Each of the systems uses different database layouts. • Virtuoso • Hashing on SDB is very bad.

  26. Open Research Issues • Query optimization • The better performance of native stores points in that direction. • Some work exists on optimizing SPARQL queries for in-memory stores.

  27. Which RDF store to choose for an app? • Frequency of loads that the application would perform. • Single scaling factor and linear load times. • Level of inferencing. • Support for which query language. W3C recommendations.

  28. Questions?
