Tools & Frameworks for the Semantic Web. Semantic Web - Fall 2005, Computer Engineering Department, Sharif University of Technology.
Outline • Jena • A Semantic Web framework • Sesame • An architecture for storing and querying RDF data. • Protégé • An environment for creating and editing ontologies and knowledge bases.
Introduction • Jena is a Java framework for building Semantic Web applications. • Jena is open source software developed at HP Labs. • The Jena framework includes: • An RDF API • Reading and writing RDF in RDF/XML, N3 and N-Triples • An OWL API • In-memory and persistent storage • RDQL support
Jena Versions • Two versions: • Jena 1 • Expressive support for RDF • Limited reasoning facilities (RDQL) • Jena 2 • Ontology API included • Support for OWL included
RDF Model • Resource: • Anything that can be described with an RDF expression. • Property: • A trait, attribute or relation used to describe a resource. • Literal: • A simple typed value (String, Integer, etc). • Statement: • A resource together with a property and that property's value (a subject, predicate, object triple). • An RDF Model is a set of statements.
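The definitions on this slide can be sketched in plain Java without any RDF library. The `Statement` record and the use of plain strings for resources, properties and values are illustrative assumptions, not Jena's own classes.

```java
import java.util.LinkedHashSet;
import java.util.Set;

class RdfModelSketch {
    // Hypothetical minimal statement type: a resource, a property and a value.
    record Statement(String resource, String property, String value) {}

    // A model is a set of statements, so adding the same statement twice has no effect.
    static Set<Statement> exampleModel() {
        Set<Statement> model = new LinkedHashSet<>();
        model.add(new Statement("http://somewhere/JohnSmith", "vcard:FN", "John Smith"));
        model.add(new Statement("http://somewhere/JohnSmith", "vcard:FN", "John Smith")); // duplicate
        model.add(new Statement("http://somewhere/JohnSmith", "vcard:Given", "John"));
        return model;
    }

    public static void main(String[] args) {
        for (Statement s : exampleModel()) {
            System.out.println(s.resource() + " " + s.property() + " " + s.value());
        }
    }
}
```

Records supply value equality, which is what lets the set discard the duplicate statement.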
RDF API of Jena • Allows creating and manipulating RDF Models from Java applications. • Provides Java classes to represent: • Models. • Resources. • Properties. • Literals. • Statements.
Create an RDF Model
    // Jena 2 packages
    import com.hp.hpl.jena.rdf.model.*;
    import com.hp.hpl.jena.vocabulary.VCARD;

    String personURI = "http://somewhere/JohnSmith";
    String givenName = "John";
    String familyName = "Smith";
    String fullName = givenName + " " + familyName;
    // create an empty model
    Model model = ModelFactory.createDefaultModel();
    // create the resource and attach its properties
    Resource johnSmith = model.createResource(personURI);
    johnSmith.addProperty(VCARD.FN, fullName);
    // the structured name is a blank node with its own properties
    johnSmith.addProperty(VCARD.N,
        model.createResource()
             .addProperty(VCARD.Given, givenName)
             .addProperty(VCARD.Family, familyName));
Writing and Reading the Model
• To serialize the model as RDF/XML:
    model.write(System.out);
• To load a model into memory:
    Model model = ModelFactory.createDefaultModel();
    model.read("file:c:/example.owl");
Navigating the Model
• Getting information via the URI of the resource:
    // retrieve the resource John Smith
    String johnSmithURI = "http://somewhere/JohnSmith";
    Resource jSmith = model.getResource(johnSmithURI);
    // retrieve the value of the property N (a resource)
    Resource name = jSmith.getProperty(VCARD.N).getResource();
    // retrieve the value of the property FN (a literal, read as a String;
    // casting getObject() to String would fail because it returns an RDFNode)
    String fullName = jSmith.getProperty(VCARD.FN).getString();
Querying a Model
• Searching information in a model:
    // retrieve all the resources of the type vcard
    // (assuming that all such resources have a property FN)
    ResIterator it = model.listSubjectsWithProperty(VCARD.FN);
    while (it.hasNext()) {
        Resource r = it.nextResource();
        System.out.println(r);
    }
• More advanced querying:
  • Using listStatements(Selector s).
  • Using RDQL.
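The idea behind selecting statements by pattern can be sketched in plain Java as follows. This is a minimal illustration, not Jena's `Selector` implementation: the `Triple` record and the `select` helper are hypothetical, and a `null` component is assumed to act as a wildcard.

```java
import java.util.ArrayList;
import java.util.List;

class SelectorSketch {
    record Triple(String s, String p, String o) {}

    // Returns every triple that matches the pattern; null means "match anything".
    static List<Triple> select(List<Triple> model, String s, String p, String o) {
        List<Triple> hits = new ArrayList<>();
        for (Triple t : model) {
            boolean match = (s == null || s.equals(t.s()))
                         && (p == null || p.equals(t.p()))
                         && (o == null || o.equals(t.o()));
            if (match) {
                hits.add(t);
            }
        }
        return hits;
    }

    static List<Triple> sample() {
        return List.of(
            new Triple("ex:JohnSmith", "vcard:FN", "John Smith"),
            new Triple("ex:JohnSmith", "vcard:Given", "John"),
            new Triple("ex:BeckySmith", "vcard:FN", "Becky Smith"));
    }

    public static void main(String[] args) {
        // All statements whose property is vcard:FN, whatever the subject and value.
        select(sample(), null, "vcard:FN", null).forEach(System.out::println);
    }
}
```

Fixing only the property reproduces the FN search on this slide, while fixing the subject instead lists everything known about one resource.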
Operations on Models
• A model is a set of statements.
• Support of the following operations:
  • Union.
  • Intersection.
  • Difference.
    // reading the RDF models
    model1.read(new InputStreamReader(in1), "");
    model2.read(new InputStreamReader(in2), "");
    // merging the RDF models
    Model model = model1.union(model2);
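Because a model is a set of statements, the three operations behave like ordinary set operations. The sketch below illustrates this with `java.util.Set`; the `Triple` record and the sample data are assumptions made for the example, and it does not use Jena.

```java
import java.util.HashSet;
import java.util.Set;

class ModelOpsSketch {
    record Triple(String s, String p, String o) {}

    // Statements present in either model.
    static Set<Triple> union(Set<Triple> a, Set<Triple> b) {
        Set<Triple> result = new HashSet<>(a);
        result.addAll(b);
        return result;
    }

    // Statements present in both models.
    static Set<Triple> intersection(Set<Triple> a, Set<Triple> b) {
        Set<Triple> result = new HashSet<>(a);
        result.retainAll(b);
        return result;
    }

    // Statements present in the first model but not in the second.
    static Set<Triple> difference(Set<Triple> a, Set<Triple> b) {
        Set<Triple> result = new HashSet<>(a);
        result.removeAll(b);
        return result;
    }

    public static void main(String[] args) {
        Triple fn = new Triple("ex:JohnSmith", "vcard:FN", "John Smith");
        Triple given = new Triple("ex:JohnSmith", "vcard:Given", "John");
        Triple email = new Triple("ex:JohnSmith", "vcard:EMAIL", "john@somewhere");
        Set<Triple> model1 = Set.of(fn, given);
        Set<Triple> model2 = Set.of(given, email);
        System.out.println(union(model1, model2).size());        // 3
        System.out.println(intersection(model1, model2).size()); // 1
        System.out.println(difference(model1, model2).size());   // 1
    }
}
```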
Ontology API of Jena • Supports RDF Schema, DAML, DAML+OIL and OWL. • Language independent.
Creation of an Ontology Model
• Use of method createOntologyModel().
• Possible to specify:
  • The ontology language used.
  • The associated reasoner.
    String fileName = "c:/ejemplo.owl";
    String baseURI = "file:///" + fileName;
    OntModel model = ModelFactory.createOntologyModel(ProfileRegistry.OWL_DL_LANG);
    // read the ontology from the file named above
    model.read(new FileReader(fileName), baseURI);
Classes
• Classes are the basic building blocks.
• Represented as OntClass.
• Example: Obtain subclasses of class Camera.
    OntClass camera = model.getOntClass(camNS + "Camera");
    // print the local name of each subclass
    for (Iterator i = camera.listSubClasses(); i.hasNext();) {
        OntClass c = (OntClass) i.next();
        System.out.println(c.getLocalName());
    }
Properties
• Represented via OntProperty.
    OntModel m = ModelFactory.createOntologyModel();
    OntClass Camera = m.createClass(camNS + "Camera");
    OntClass Body = m.createClass(camNS + "Body");
    ObjectProperty part = m.createObjectProperty(camNS + "part");
    ObjectProperty body = m.createObjectProperty(camNS + "body");
    // body is a sub-property of part, relating a Camera to a Body
    body.addSuperProperty(part);
    body.addDomain(Camera);
    body.addRange(Body);
Complex Classes
• It is possible to define classes by means of union, intersection and complement operations.
• Example:
    <owl:Class rdf:ID="SLR">
      <owl:intersectionOf rdf:parseType="Collection">
        <owl:Class rdf:about="#Camera"/>
        <owl:Restriction>
          <owl:onProperty rdf:resource="#viewFinder"/>
          <owl:hasValue rdf:resource="#ThroughTheLens"/>
        </owl:Restriction>
      </owl:intersectionOf>
    </owl:Class>
    // create instance throughTheLens
    OntClass Window = m.createClass(camNS + "Window");
    Individual throughTheLens = m.createIndividual(camNS + "ThroughTheLens", Window);
    // create property viewFinder (same URI as in the OWL example)
    ObjectProperty viewfinder = m.createObjectProperty(camNS + "viewFinder");
    // create restriction hasValue
    HasValueRestriction viewThroughLens =
        m.createHasValueRestriction(null, viewfinder, throughTheLens);
    // create class Camera
    OntClass Camera = m.createClass(camNS + "Camera");
    // create intersection for defining class SLR
    IntersectionClass SLR = m.createIntersectionClass(camNS + "SLR",
        m.createList(new RDFNode[] {viewThroughLens, Camera}));
Schema vs Instance Data • Schema • Possible to define: • Classes • Properties (DatatypeProperty, ObjectProperty) • Restrictions • Data types • Cardinality • Instance Data • Defining instances (individuals) of the Schema elements.
    <owl:Class rdf:ID="Camera">
      <rdfs:subClassOf rdf:resource="#Item"/>
    </owl:Class>

    <owl:DatatypeProperty rdf:ID="name">
      <rdfs:domain rdf:resource="#Camera"/>
      <rdfs:range rdf:resource="xsd:string"/>
    </owl:DatatypeProperty>

    <camera:Camera rdf:ID="camera1">
      <camera:name>Kodak</camera:name>
    </camera:Camera>
Managing Instance Data
    String URI = "http://test/camera/";
    OntClass c = model.getOntClass(URI + "#Camera");
    OntProperty p = model.getOntProperty(URI + "#name");
    Individual ind = model.getIndividual(URI + "#camera1");
    // read the value of property p only if the individual has it
    if (ind.hasProperty(p)) {
        Statement st = ind.getProperty(p);
        RDFNode l = st.getObject();
    }
Managing Instance Data • Other operations: • model.listIndividuals() • Returns all instances of the model • individual.hasProperty(Property p, Object o) • Returns true if the individual has property p with value o • ontClass.listInstances() • Returns all instances of the class
Jena Inference Support • Inference: deducing additional information that is implied by the data but not stated in it • The task of inferencing is carried out by reasoners • Jena comprises a set of basic reasoners • OWL Reasoner • DAML Reasoner • RDFS Rule Reasoner • Generic Rule Reasoner • There is a way to include new reasoners • For example, a rule for the transitivity of subClassOf: (?A rdfs:subClassOf ?B), (?B rdfs:subClassOf ?C) -> (?A rdfs:subClassOf ?C)
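To make the subClassOf rule on this slide concrete, here is a minimal forward-chaining sketch in plain Java. It applies that single rule until nothing new can be derived. The `SubClassOf` record and the class names are illustrative assumptions, and this is not how Jena's rule engines are implemented.

```java
import java.util.HashSet;
import java.util.Set;

class SubClassRuleSketch {
    record SubClassOf(String sub, String sup) {}

    // Applies (?A subClassOf ?B), (?B subClassOf ?C) -> (?A subClassOf ?C)
    // repeatedly until no new statement is derived (a fixpoint).
    static Set<SubClassOf> closure(Set<SubClassOf> asserted) {
        Set<SubClassOf> known = new HashSet<>(asserted);
        boolean changed = true;
        while (changed) {
            Set<SubClassOf> derived = new HashSet<>();
            for (SubClassOf ab : known) {
                for (SubClassOf bc : known) {
                    if (ab.sup().equals(bc.sub())) {
                        derived.add(new SubClassOf(ab.sub(), bc.sup()));
                    }
                }
            }
            // addAll returns true only if at least one derived statement was new
            changed = known.addAll(derived);
        }
        return known;
    }

    public static void main(String[] args) {
        Set<SubClassOf> asserted = Set.of(
            new SubClassOf("SLR", "Camera"),
            new SubClassOf("Camera", "Item"));
        // The inferred statement SLR subClassOf Item is now part of the result.
        System.out.println(closure(asserted).contains(new SubClassOf("SLR", "Item"))); // true
    }
}
```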
Jena Inference Support
• To reason, an Inference Model should be created
• Example:
    Reasoner reasoner = ReasonerRegistry.getOWLReasoner();
    // specialize the reasoner to the schema, then apply it to the data
    reasoner = reasoner.bindSchema(schema);
    InfModel modelInf = ModelFactory.createInfModel(reasoner, data);
Ontology Validation in Jena
    Model schema = ModelLoader.loadModel("file:c:/Schema.owl");
    Model data = ModelLoader.loadModel("file:c:/example.owl");
    Reasoner reasoner = ReasonerRegistry.getOWLReasoner();
    reasoner = reasoner.bindSchema(schema);
    InfModel modelInf = ModelFactory.createInfModel(reasoner, data);
    // validate the data against the schema
    ValidityReport vrp1 = modelInf.validate();
    if (vrp1.isValid()) {
        System.out.println("Valid OWL");
    } else {
        System.out.println("Not valid OWL");
        // print each reported problem
        for (Iterator i = vrp1.getReports(); i.hasNext();) {
            System.out.println(" - " + i.next());
        }
    }
    <camera:Camera rdf:ID="camera1">
      <camera:name>KODAK</camera:name>
    </camera:Camera>

    <owl:DatatypeProperty rdf:ID="name">
      <rdfs:domain rdf:resource="#Camera"/>
      <rdfs:range rdf:resource="xsd:integer"/>
    </owl:DatatypeProperty>

    Error (range check): Incorrectly typed literal due to range (prop, value)
    Culprit = http://www.xfront.com/owl/ontologies/camera/#camera1
    Implicated node: http://www.xfront.com/owl/ontologies/camera/#name
    Implicated node: 'KODAK'
    <owl:Class rdf:ID="Camera">
      <rdfs:subClassOf>
        <owl:Restriction>
          <owl:onProperty rdf:resource="#name" />
          <owl:maxCardinality rdf:datatype="xsd:nonNegativeInteger">1</owl:maxCardinality>
        </owl:Restriction>
      </rdfs:subClassOf>
    </owl:Class>

    <camera:Camera rdf:ID="camera1">
      <camera:name>KODAK</camera:name>
      <camera:name>OLIMPUS</camera:name>
    </camera:Camera>

    Error (too many values): Too many values on max-N property (prop, class)
    Culprit = http://www.xfront.com/owl/ontologies/camera/#camera1
    Implicated node: http://www.xfront.com/owl/ontologies/camera/#name
    Implicated node: http://www.xfront.com/owl/ontologies/camera/#Camera
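The "too many values" check in the example can be approximated in plain Java by counting the values each individual has for the restricted property. This is only a sketch under simplifying assumptions (distinct literals count as distinct values, and the `Triple` record and `violations` helper are hypothetical). It is not Jena's validator.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class MaxCardinalitySketch {
    record Triple(String s, String p, String o) {}

    // Returns the subjects that have more than `max` values for `property`.
    static List<String> violations(List<Triple> data, String property, int max) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (Triple t : data) {
            if (t.p().equals(property)) {
                counts.merge(t.s(), 1, Integer::sum);
            }
        }
        List<String> culprits = new ArrayList<>();
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            if (e.getValue() > max) {
                culprits.add(e.getKey());
            }
        }
        return culprits;
    }

    public static void main(String[] args) {
        List<Triple> data = List.of(
            new Triple("#camera1", "camera:name", "KODAK"),
            new Triple("#camera1", "camera:name", "OLIMPUS"),
            new Triple("#camera2", "camera:name", "CANON"));
        // maxCardinality 1 on camera:name, as in the restriction on this slide
        System.out.println(violations(data, "camera:name", 1)); // prints [#camera1]
    }
}
```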
RDQL
• q1 contains a query:
    SELECT ?x
    WHERE (?x <http://www.w3.org/2001/vcard-rdf/3.0#FN> "John Smith")
• To execute q1 against the model in m1.rdf:
    java jena.rdfquery --data m1.rdf --query q1
• The outcome is:
    x
    =============================
    <http://somewhere/JohnSmith/>
Using RDQL from Java Code • RDQL queries can be run from a Java application. • The following classes are used for this: • Query • QueryExecution • QueryEngine • QueryResults • ResultBinding
RDQL Example
    SELECT ?x, ?fname
    WHERE (?x <http://www.w3.org/2001/vcard-rdf/3.0#FN> ?fname)

    Query query = new Query("SELECT...");
    // the model to run the query against
    query.setSource(model);
    QueryExecution qe = new QueryEngine(query);
    QueryResults results = qe.exec();
    // each result binds the variables x and fname
    for (Iterator iter = results; iter.hasNext();) {
        ResultBinding res = (ResultBinding) iter.next();
        Resource x = (Resource) res.get("x");
        Literal fname = (Literal) res.get("fname");
        System.out.println("x: " + x + " fname: " + fname);
    }
    // release the resources held by the result iterator
    results.close();
Persistent Models • Jena can create persistent models: • for example, models stored in a relational database. • Jena 2 supports: • MySQL • Oracle • PostgreSQL • To create a persistent model: • ModelFactory.createModelRDBMaker(conn).createModel()
Example
    // Create a connection to DB
    DBConnection c = new DBConnection(DB_URL, DB_USER, DB_PASS, DB_TYPE);
    // Create a ModelMaker for persistent models
    ModelMaker maker = ModelFactory.createModelRDBMaker(c);
    // Create a new model
    Model model = maker.createModel("modelo_1");
    // Start transaction
    model.begin();
    // Read a model from an RDF/XML file
    model.read(in, null);
    // Commit the transaction
    model.commit();
Sesame
When they were out of sight Ali Baba came down, and, going up to the rock, said, "Open, Sesame."
--Tales of 1001 Nights
Querying Levels • RDF documents can be considered at three different levels of abstraction: • At the syntactic level they are XML documents. • At the structure level they consist of a set of triples. • At the semantic level they constitute one or more graphs with partially predefined semantics. • Which level is best for querying?
Querying at the Syntactic Level • At this level we just have an XML document. • So we can query RDF using an XML query language (e.g. XQuery). • But RDF is not just an XML dialect. • XML: • Has a tree structure data model. • Only nodes are labeled. • RDF: • Has a graph structure data model. • Both edges (properties) and nodes (subjects/objects) are labeled. • The same information can be encoded in XML in different ways, so a query written against one encoding may miss another.
Querying at the Structure Level • At this level an RDF document represents a set of triples: • (type, Book, Class) • (subClassOf, FamousWriter, Writer) • (hasWritten, twain/mark, ISBN00023423442) • (type, twain/mark, FamousWriter) • Advantage: Independent of the specific XML syntax. • A successful query: • SELECT ?x FROM … WHERE (type ?x FamousWriter) • An unsuccessful query (twain/mark is a Writer only by implication through subClassOf, which plain triple matching does not follow): • SELECT ?x FROM … WHERE (type ?x Writer)
Querying at the Semantic Level • We need a query language that is sensitive to the RDF Schema primitives: • e.g. Class, subClassOf, Property, … • RQL • RDF Query Language • The first proposal for a declarative query language for RDF and RDF Schema. • Output of queries is again legal RDF schema code, which can be used as input of another query. • A sample query: • SELECT Y FROM FamousWriter {X}. hasWritten {Y}
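The difference between the structure level and the semantic level can be sketched in plain Java using the FamousWriter triples from the structure-level slide. The `Triple` record and both query helpers are hypothetical, written only to illustrate the idea. They are not RQL or Sesame code.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

class QueryLevelsSketch {
    record Triple(String s, String p, String o) {}

    static final List<Triple> DATA = List.of(
        new Triple("Book", "type", "Class"),
        new Triple("FamousWriter", "subClassOf", "Writer"),
        new Triple("twain/mark", "hasWritten", "ISBN00023423442"),
        new Triple("twain/mark", "type", "FamousWriter"));

    // Structure level: only explicitly stated type triples match.
    static Set<String> instancesStructural(String cls) {
        Set<String> result = new TreeSet<>();
        for (Triple t : DATA) {
            if (t.p().equals("type") && t.o().equals(cls)) {
                result.add(t.s());
            }
        }
        return result;
    }

    // Semantic level: also collect instances of every direct or indirect subclass.
    static Set<String> instancesSemantic(String cls) {
        Set<String> classes = new HashSet<>();
        Deque<String> todo = new ArrayDeque<>(List.of(cls));
        while (!todo.isEmpty()) {
            String c = todo.pop();
            if (!classes.add(c)) {
                continue; // already visited
            }
            for (Triple t : DATA) {
                if (t.p().equals("subClassOf") && t.o().equals(c)) {
                    todo.push(t.s());
                }
            }
        }
        Set<String> result = new TreeSet<>();
        for (String c : classes) {
            result.addAll(instancesStructural(c));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(instancesStructural("Writer")); // prints []
        System.out.println(instancesSemantic("Writer"));   // prints [twain/mark]
    }
}
```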
Sesame – Introduction & History • Sesame: An Architecture for Storing and Querying RDF Data and Schema Information. • The European On-To-Knowledge project kicked off in Feb. 2000: • This project aims at developing ontology-driven knowledge management tools. • In this project Sesame fulfills the role of storage and retrieval middleware for ontologies and metadata expressed in RDF and RDF Schema.
On-To-Knowledge Project • Sesame is positioned as a central tool in this project. • OntoExtract: extracts ontological conceptual structures from natural-language documents. • OntoEdit: An ontology editor. • RDF Ferret: A user front-end that provides search and query. [Figure: RDF Ferret, Sesame, OntoEdit, OntoExtract]
What is Sesame? • Sesame is an open source Java framework for storing, querying and reasoning with RDF and RDF Schema. • It can be used as: • Standalone Server: A database for RDF and RDF Schema. • Java Library: For applications that need to work with RDF internally.
Sesame’s Architecture [Figure: layered architecture, top to bottom] • HTTP, SOAP • HTTP Protocol Handler, SOAP Protocol Handler • Sesame: Admin Module, Query Module, Export Module • Repository Abstraction Layer (RAL) • Repository
The Repository • DBMSs • Currently, Sesame is able to use: • PostgreSQL • MySQL • Oracle (9i or newer) • SQL Server • Existing RDF stores • RDF flat files • RDF network services • Using multiple Sesame servers to retrieve results for queries. • This opens up the possibility of a highly distributed architecture for RDF(S) storing and querying.
Repository Abstraction Layer (RAL) • The RAL offers a stable, high-level interface for talking to repositories. • It is defined by an API that offers these functionalities: • Add data • Retrieve data • Delete data • Data is returned in streams. (Scalability) • Only a small amount of data is kept in memory. • Suitable for use in highly constrained environments such as portable devices. • Caching data (Performance) • E.g. caching RDF schema data, which is needed very frequently.
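A repository abstraction of this kind might look like the following sketch. The `Repository` interface, its method names and the in-memory implementation are assumptions made for illustration. They are not Sesame's actual RAL API. Retrieval returns an iterator so that callers consume one statement at a time instead of receiving a fully built result list.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

class RalSketch {
    record Triple(String s, String p, String o) {}

    // Hypothetical repository interface: add, delete and streamed retrieval.
    interface Repository {
        void add(Triple t);
        boolean delete(Triple t);
        // null components act as wildcards; results are produced lazily
        Iterator<Triple> retrieve(String s, String p, String o);
    }

    // One possible backend; a DBMS-backed class could implement the same interface.
    static class InMemoryRepository implements Repository {
        private final List<Triple> store = new ArrayList<>();

        public void add(Triple t) { store.add(t); }

        public boolean delete(Triple t) { return store.remove(t); }

        public Iterator<Triple> retrieve(String s, String p, String o) {
            // The stream filters elements only as the iterator is advanced.
            return store.stream()
                .filter(t -> (s == null || s.equals(t.s()))
                          && (p == null || p.equals(t.p()))
                          && (o == null || o.equals(t.o())))
                .iterator();
        }
    }

    public static void main(String[] args) {
        Repository repo = new InMemoryRepository();
        repo.add(new Triple("twain/mark", "type", "FamousWriter"));
        repo.add(new Triple("twain/mark", "hasWritten", "ISBN00023423442"));
        Iterator<Triple> it = repo.retrieve("twain/mark", null, null);
        while (it.hasNext()) {
            System.out.println(it.next());
        }
    }
}
```

Code written against the interface does not change when the backend does, which is the repository independence the RAL is meant to provide.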
Admin Module • Allows incrementally inserting or deleting RDF data in/from a repository. • Retrieves its information from an RDF(S) source • Parses it using an RDF parser • Checks each (S, P, O) statement it gets from the parser for consistency with the information already present in the repository, and infers implied information if necessary. For instance: • If P equals type, it infers that O must be a class. • If P equals subClassOf, it infers that S and O must be classes. • If P equals subPropertyOf, then it infers that both S and O must be properties. • If P equals domain or range, then it infers that S must be a property and O must be a class.
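The four inference rules listed on this slide can be sketched as a single pass over the parsed statements. The `Triple` and `Schema` records and the unqualified predicate names are assumptions made for the example. This is not Sesame's Admin Module code, and the consistency check against the repository is left out.

```java
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

class AdminInferenceSketch {
    record Triple(String s, String p, String o) {}
    record Schema(Set<String> classes, Set<String> properties) {}

    // Derives which names must be classes or properties, one statement at a time.
    static Schema inferAll(List<Triple> statements) {
        Set<String> classes = new TreeSet<>();
        Set<String> properties = new TreeSet<>();
        for (Triple t : statements) {
            switch (t.p()) {
                case "type" -> classes.add(t.o());
                case "subClassOf" -> { classes.add(t.s()); classes.add(t.o()); }
                case "subPropertyOf" -> { properties.add(t.s()); properties.add(t.o()); }
                case "domain", "range" -> { properties.add(t.s()); classes.add(t.o()); }
                default -> { } // other predicates imply nothing about the schema here
            }
        }
        return new Schema(classes, properties);
    }

    public static void main(String[] args) {
        Schema schema = inferAll(List.of(
            new Triple("twain/mark", "type", "FamousWriter"),
            new Triple("FamousWriter", "subClassOf", "Writer"),
            new Triple("hasWritten", "domain", "Writer")));
        System.out.println(schema.classes());    // prints [FamousWriter, Writer]
        System.out.println(schema.properties()); // prints [hasWritten]
    }
}
```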
Query Module • Evaluates RQL queries posed by the user • It is independent of the underlying repository, so it cannot use the optimizations and query evaluation offered by specific DBMSs. • RQL queries are translated into a set of calls to the RAL. • E.g. when a query contains a join operation over two subqueries, each of the subqueries is evaluated, and the join operation is then executed by the query engine on the results.
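A join executed in the query engine rather than in the database can be sketched as a nested loop over the two subquery results. Representing each result row as a map from variable name to value is an assumption made for this example, and the code is not taken from Sesame.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class JoinSketch {
    // Nested-loop join: combine every pair of rows that agree on all shared variables.
    static List<Map<String, String>> join(List<Map<String, String>> left,
                                          List<Map<String, String>> right) {
        List<Map<String, String>> out = new ArrayList<>();
        for (Map<String, String> l : left) {
            for (Map<String, String> r : right) {
                boolean compatible = true;
                for (Map.Entry<String, String> e : l.entrySet()) {
                    String other = r.get(e.getKey());
                    if (other != null && !other.equals(e.getValue())) {
                        compatible = false;
                        break;
                    }
                }
                if (compatible) {
                    Map<String, String> merged = new HashMap<>(l);
                    merged.putAll(r);
                    out.add(merged);
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Subquery 1: instances X of FamousWriter. Subquery 2: pairs X hasWritten Y.
        List<Map<String, String>> writers = List.of(Map.of("X", "twain/mark"));
        List<Map<String, String>> written = List.of(
            Map.of("X", "twain/mark", "Y", "ISBN00023423442"),
            Map.of("X", "doe/jane", "Y", "ISBN00000000001"));
        // Only the row whose X agrees on both sides survives the join.
        System.out.println(join(writers, written).size()); // 1
    }
}
```

The trade-off stated on the slide shows up here: the engine works with any repository, but it does the pairing itself instead of delegating it to a DBMS optimizer.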
RDF Export Module • This module allows for the extraction of the complete schema and/or data from a model in RDF format. • It supplies the basis for using Sesame with other RDF tools.
Important Features of Sesame • Powerful query language • Portability • It is written completely in Java. • Repository independence • Extensibility • Other functional modules can be created and plugged into it. • Flexible communication by using protocol handlers • The architecture separates the communication details from the actual functionality through the use of protocol handlers.