Sesame: A generic architecture for storing and querying RDF and RDF Schema
2008. 12. 08
Written by Jeen Broekstra, Arjohn Kampman
Summarized by Gihyun Gong
About the Sesame project
• Developed within the IST On-To-Knowledge project, which ran from 1999 to 2002
• Focused on Knowledge and Contents Technology, Software Technology
• It is now further developed and maintained by Aduna in cooperation with the NLnet Foundation and developers from Ontotext
What is Sesame?
• Sesame is an open source Java framework for storing, querying and reasoning with RDF and RDF Schema
• It can be used as:
  • Standalone Server: A database for RDF and RDF Schema
  • Java Library: For applications that need to work with RDF internally
• Sesame is similar to Jena:
  • Supports triple storage
  • Supports reasoning
  • Supports Web services
Sesame's Architecture
[Figure: clients connect over HTTP or SOAP to the HTTP Protocol Handler and the SOAP Protocol Handler. Inside Sesame, the Admin Module, Query Module and Export Module sit on top of the Repository Abstraction Layer (RAL), which talks to the Repository.]
The Repository
• DBMSs
  • Currently, Sesame is able to use:
    • PostgreSQL
    • MySQL
    • Oracle (9i or newer)
• Existing RDF stores
• RDF files
• RDF network services
  • Using multiple Sesame servers to retrieve results for queries
  • This opens up the possibility of a highly distributed architecture for RDF storing and querying
Repository Abstraction Layer (RAL)
• The RAL offers a stable, high-level interface for talking to repositories
• It is defined by an API that offers these functionalities:
  • Add data
  • Retrieve data
  • Delete data
• Data is returned in streams (Scalability)
  • Only a small amount of data is kept in memory
  • Suitable for use in highly constrained environments such as portable devices
• Caching data (Performance)
  • E.g. caching RDF schema data, which is needed very frequently
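The add / retrieve / delete interface and the streaming behaviour can be sketched in a few lines of Python. This is only an illustration: the class name InMemoryRepository and its method names are hypothetical and are not Sesame's actual Java API. Retrieval is written as a generator so that a caller receives one statement at a time instead of a fully materialized result list.

```python
from typing import Iterator, Optional, Tuple

Statement = Tuple[str, str, str]  # (subject, predicate, object)


class InMemoryRepository:
    """Hypothetical stand-in for a repository behind a RAL-like interface."""

    def __init__(self) -> None:
        self._statements = set()

    def add_statement(self, s: str, p: str, o: str) -> None:
        """Add one statement; duplicates are ignored."""
        self._statements.add((s, p, o))

    def get_statements(
        self,
        s: Optional[str] = None,
        p: Optional[str] = None,
        o: Optional[str] = None,
    ) -> Iterator[Statement]:
        """Yield matching statements one by one; None acts as a wildcard."""
        for stmt in self._statements:
            if all(want is None or want == got
                   for want, got in zip((s, p, o), stmt)):
                yield stmt

    def remove_statements(
        self,
        s: Optional[str] = None,
        p: Optional[str] = None,
        o: Optional[str] = None,
    ) -> int:
        """Delete matching statements and return how many were removed."""
        # Materialize the matches first so the set is not changed mid-iteration.
        doomed = list(self.get_statements(s, p, o))
        self._statements.difference_update(doomed)
        return len(doomed)


if __name__ == "__main__":
    repo = InMemoryRepository()
    repo.add_statement("hasWritten", "domain", "Writer")
    repo.add_statement("hasWritten", "range", "Book")
    for statement in repo.get_statements(s="hasWritten"):
        print(statement)
```

A DBMS-backed repository would implement the same three operations on top of SQL, which is the kind of substitution the RAL is meant to allow.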
Admin Module
• Allows inserting or deleting RDF data in a repository
• Retrieves its information from an RDF(S) source and parses it using an RDF parser
• Checks each (S, P, O) statement for consistency and infers implied information where necessary, for instance:
  • If P equals type, it infers that O must be a class
  • If P equals subClassOf, it infers that S and O must be classes
  • If P equals subPropertyOf, it infers that both S and O must be properties
  • If P equals domain or range, it infers that S must be a property and O must be a class
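The four inference rules listed above can be written down as a small function. This is a minimal sketch, not the Admin Module's real code: the function name infer_implied is hypothetical, and the prefixed names such as rdf:type and rdfs:Class are assumed here as plain-string spellings of the RDF(S) vocabulary.

```python
from typing import List, Tuple

Statement = Tuple[str, str, str]  # (subject, predicate, object)

# Assumed string spellings of the RDF(S) vocabulary terms.
TYPE = "rdf:type"
SUB_CLASS_OF = "rdfs:subClassOf"
SUB_PROPERTY_OF = "rdfs:subPropertyOf"
DOMAIN = "rdfs:domain"
RANGE = "rdfs:range"
CLASS = "rdfs:Class"
PROPERTY = "rdf:Property"


def infer_implied(s: str, p: str, o: str) -> List[Statement]:
    """Return the statements implied by (s, p, o) under the four rules."""
    if p == TYPE:
        # O is used as a type, so O must be a class.
        return [(o, TYPE, CLASS)]
    if p == SUB_CLASS_OF:
        # Both ends of subClassOf must be classes.
        return [(s, TYPE, CLASS), (o, TYPE, CLASS)]
    if p == SUB_PROPERTY_OF:
        # Both ends of subPropertyOf must be properties.
        return [(s, TYPE, PROPERTY), (o, TYPE, PROPERTY)]
    if p in (DOMAIN, RANGE):
        # S is the constrained property, O the class it is constrained to.
        return [(s, TYPE, PROPERTY), (o, TYPE, CLASS)]
    return []


if __name__ == "__main__":
    for implied in infer_implied("hasWritten", DOMAIN, "Writer"):
        print(implied)
```

In this sketch, an uploader would call the function once per parsed statement and add the returned statements to the repository along with the original one.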
Query Module
• Evaluates RQL queries posed by the user
• It is independent of the underlying repository
  • As a consequence, it cannot use the optimizations and query evaluation offered by specific DBMSs
• RQL queries are translated into a set of calls to the RAL
  • E.g. when a query contains a join operation over two subqueries, each of the subqueries is evaluated, and the join operation is then executed by the query engine on the results
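The join example can be made concrete with a short sketch. It assumes a simplified setting that is not Sesame's real query engine: a subquery is a single triple pattern whose variables start with "?", the repository is a plain list of triples, and the helper names match and join are hypothetical. Each pattern is evaluated separately, and the join is then done in the engine itself rather than in the database.

```python
from typing import Dict, Iterable, Iterator, Tuple

Statement = Tuple[str, str, str]
Binding = Dict[str, str]


def match(statements: Iterable[Statement],
          pattern: Tuple[str, str, str]) -> Iterator[Binding]:
    """Evaluate one triple pattern; terms starting with '?' are variables."""
    for stmt in statements:
        binding: Binding = {}
        for term, value in zip(pattern, stmt):
            if term.startswith("?"):
                # A variable used twice must bind to the same value.
                if binding.setdefault(term, value) != value:
                    break
            elif term != value:
                break
        else:
            yield binding


def join(left: Iterable[Binding], right: Iterable[Binding]) -> Iterator[Binding]:
    """Nested-loop join of two binding streams on their shared variables."""
    right_rows = list(right)
    for l_row in left:
        for r_row in right_rows:
            if all(l_row[k] == r_row[k] for k in l_row.keys() & r_row.keys()):
                yield {**l_row, **r_row}


if __name__ == "__main__":
    data = [
        ("twain", "hasWritten", "book1"),
        ("book1", "type", "Book"),
    ]
    written = match(data, ("?w", "hasWritten", "?b"))
    books = match(data, ("?b", "type", "Book"))
    for row in join(written, books):
        print(row)
```

The nested loop keeps the sketch short. It also illustrates the cost noted above: a join that a DBMS could optimize internally is carried out by the engine over the streamed results.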
RDF Export Module
• This module allows for the extraction of the complete schema and/or data from a model in RDF format
• It supplies the basis for using Sesame with other RDF tools
SeRQL (Sesame RDF Query Language)
• Extension of RQL
• Some of the built-in predicates:
  • {X} serql:directSubClassOf {Y}
  • {X} serql:directSubPropertyOf {Y}
• Some of the built-in functions:
  • isLiteral()
  • isResource()
Important Features of Sesame
• Portability
  • It is written completely in Java
• Repository independence
  • Provided by the RAL
• Extensibility
  • Other functional modules can be created and plugged in
• Flexible communication by using protocol handlers
  • The architecture separates the communication details from the actual functionality through the use of protocol handlers
Using PostgreSQL as Repository
• PostgreSQL is an open-source object-relational DBMS
• It supports subtable relations between its tables
• Subtable relations are also transitive
• These relations can be used to model the subsumption reasoning of RDF Schema
Example RDF Schema & Data
[Figure. Schema: hasWritten has domain Writer and range Book; FamousWriter is a subClassOf Writer. Data: …/twain/mark (type FamousWriter) hasWritten …/ISBN00023423442 (type Book).]
Storing Schema (PostgreSQL)
[Figure: schema tables Class, SubClassOf, SubPropertyOf, Property, Domain, Range]
Storing Data (PostgreSQL)
[Figure: data tables Resource, Book, Writer, FamousWriter, hasWritten]
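The effect of transitive subtable relations on subsumption can be imitated in plain Python. This is a sketch under stated assumptions and not how PostgreSQL or Sesame implement it: the helper names, the extra class PrizeWinner and the resource names w1, w2, w3 and b1 are all hypothetical. Asking for the instances of Writer also returns the instances of FamousWriter and of any class below it.

```python
from typing import Iterable, Set, Tuple


def subclasses_of(cls: str, sub_class_of: Iterable[Tuple[str, str]]) -> Set[str]:
    """Return cls together with all of its direct and indirect subclasses."""
    pairs = list(sub_class_of)  # (child, parent) pairs
    found = {cls}
    frontier = [cls]
    while frontier:
        parent = frontier.pop()
        for child, par in pairs:
            if par == parent and child not in found:
                found.add(child)
                frontier.append(child)
    return found


def instances_of(cls: str,
                 sub_class_of: Iterable[Tuple[str, str]],
                 type_of: Iterable[Tuple[str, str]]) -> Set[str]:
    """Return resources typed as cls or as any subclass of cls."""
    classes = subclasses_of(cls, sub_class_of)
    return {resource for resource, c in type_of if c in classes}


if __name__ == "__main__":
    # PrizeWinner is a made-up class, added only to show transitivity.
    hierarchy = [("FamousWriter", "Writer"), ("PrizeWinner", "FamousWriter")]
    types = [("w1", "Writer"), ("w2", "FamousWriter"),
             ("w3", "PrizeWinner"), ("b1", "Book")]
    print(sorted(instances_of("Writer", hierarchy, types)))
```

With subtables, the database answers the same question directly: selecting from the Writer table also returns the rows stored in its subtables, so no closure has to be computed by the application.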
Scalability Issues
• An experiment using Sesame:
  • Uploading and querying a collection of nouns from WordNet (http://www.semanticweb.org/library)
  • The collection consists of about 400,000 RDF statements
  • Run on a workstation (Sun UltraSPARC 5, 256MB RAM)
• Uploading the WordNet nouns took 94 minutes
• Querying was quite slow (with MySQL)
  • The data is distributed over multiple tables, so retrieving it requires many joins across tables
Future Work
• Transaction rollback support
  • Aims at an ACID-compliant storage system
• Versioning support
• Adding and extending functional modules
• Support for an ‘Update’ operation
• DAML+OIL support
Comparison with others • Database Compatibility • API Compatibility
Comparison with others • Tool Support • Query Language Support
Comparison with others • Reasoning level and Scalability
References
• Sesame: A Generic Architecture for Storing and Querying RDF and RDF Schema, http://sesame.aidministrator.nl/
• Create Scalable Semantic Applications with Database-Backed RDF Stores, www.devx.com
• OpenRDF - Sesame Benchmark, http://bklab.snu.ac.kr/blog/kwangsub/53