Semantic Web Servers. Engineering the Semantic Web. Graham Moore Ontopia moore@ontopia.net. Overview. A vision of the Semantic Web The State We’re In The Generic Missing Piece – Semantic Web Protocol Use cases for distributed processing Aspects of semantic web servers
Semantic Web Servers Engineering the Semantic Web Graham Moore Ontopia moore@ontopia.net
Overview • A vision of the Semantic Web • The State We’re In • The Generic Missing Piece – Semantic Web Protocol • Use cases for distributed processing • Aspects of semantic web servers • An evaluation of existing/proposed protocols • Semantic web protocols: • RDF Net API • Topic Maps and Fragment Processing • Conclusions, Issues and Further work
A Vision of the Semantic Web [Diagram labels: My Published Diary; My Dentist’s Schedule; SW Data (twice); Semantic Web; My Agents; Agents]
The Reality of the Semantic Web [Diagram labels: My Published Diary; My Dentist’s Schedule; SW Data (twice); Semantic Web Unplugged; My Agents; Agents]
The State We’re In • We currently have • Standards that can support SW activities • Representation • RDF, Topic Maps • Constraints • RDF Schema, OWL • And coming soon… TMCL, • Query Languages • Again any moment now… TMQL, RDF QL • Tools to make it happen • There are now a number of RDF and Topic Map tools that can be used for the management and deployment of topic maps solutions. • Sounds Great! – So what is the problem…
The State We’re In • The tools and standards we have right now are great for stand-alone, single-server knowledge push solutions. • e.g. Ontopia is developing an educational tool that allows teachers and students to collaboratively develop topic maps that describe a subject area. • The architecture is an OKS server accessed by users via a web application • On the wire there is only HTTP and HTML, despite the SW technology on the server • This is a very common architecture for projects using the Topic Map or RDF paradigm. • But it is just one of the architectures required in order to fulfil the vision of the semantic web. • What we are missing is the • ‘Standardisation of operations that can be invoked in a distributed environment’
Semantic Web Layer Cake Where is the Protocol?
Semantic Web Protocol • Communication protocols for the semantic web have been ignored. For the Semantic Web to really work and gain adoption, this needs to be rectified.
Semantic Web Protocol • For semantic web clients to be able to talk with servers of RDF and Topic Maps, or for semantic web peers to communicate, we need standardized protocols. • Currently the only mechanisms we have for semantic web communication are: • Using existing HTTP protocols to get and put RDF and Topic Map XML documents on web servers • Proprietary protocols • HTTP protocols are not SW aware and thus the power of the data (RDF / Topic Maps) cannot be exposed or exploited! • Proprietary protocols – well, this is not really the way we want the web to develop
Semantic Web Protocol Use Cases • Why do we need a SW Protocol? • Web Clients that wish to pose a SW query about a resource they are displaying to the user. • Internet explorer has a SW window that shows SW data about a resource • Client applications that are creating SW data that they want to aggregate on a server to share it with other client applications. • My calendar client application creates SW data and wishes to publish it from a single central server • Business applications that are producing SW data based on SW data exposed by other business applications • A stock control system queries a transaction processing system and then publishes sw data for review by the store manager • Data integration from multiple distributed data applications • A web application wishes to expose data from a number of sources. It needs to query several SW applications and then expose this aggregated data as dynamic web pages
Aspects of a SW Server • Defn: Semantic Web Server • ‘A piece of software that implements semantic web protocols in order to service clients that may want to query and update semantic web data’ • Criteria: • Update capability • Query capability • Ease of deployment • Transaction Support • Ease of implementation • Server Introspection • Identity resolution • Security and Auditing • Implementation footprint
An evaluation of existing/proposed protocols • HTTP • URIQA
HTTP • We don’t want RDF exposed as RDF XML files • We don’t want Topic Maps exposed as XTM files • Note: this doesn’t mean in either case that we don’t want to receive some data in these formats; we just don’t want to operate in terms of these formats. • No way to query • No way to perform updates unless working at the file level • No introspection (as to the semantic behaviours) • No transaction support or identity resolution • Don’t want to mandate a single implementation strategy.
HTTP (2) • This is part of the reason why the semantic web has yet to gain traction! • Clients don’t know how to access semantic web data in a standard way
URIQA • URI Query Agent, Patrick Stickler (Nokia) • HTTP extension that, given a URI, returns a concise bounded description: • All statements where the subject of the statement is the URI in question. • Iteratively, for all statements included in the description thus far, for all anonymous node objects, all statements where the subject of the statement is that anonymous node. • Iteratively, for all statements included in the description thus far, all statements relating to their reification. • Easy to deploy and implement • Basic query support • Not very expressive • Good enough to do something very useful • No update support • Small footprint • No introspection • Not a general mechanism for interacting with RDF Models.
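The first two steps of the concise bounded description can be sketched as a small graph walk. This is a minimal illustration rather than URIQA's actual implementation: the `Triple` type, the `_:` prefix for anonymous nodes and the function names are all assumptions made for the example, and the reification step is left out for brevity.

```python
from typing import NamedTuple, Set


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


def is_blank(node: str) -> bool:
    # Assumption: anonymous (blank) nodes are written with a "_:" prefix.
    return node.startswith("_:")


def concise_bounded_description(graph: Set[Triple], uri: str) -> Set[Triple]:
    """Collect statements about `uri`, following anonymous node objects.

    Reification statements (the third step on the slide) are not handled.
    """
    description: Set[Triple] = set()
    pending = [uri]
    visited: Set[str] = set()
    while pending:
        node = pending.pop()
        if node in visited:
            continue
        visited.add(node)
        for triple in graph:
            if triple.subject == node:
                description.add(triple)
                # Anonymous objects are expanded in turn; named ones are not.
                if is_blank(triple.obj):
                    pending.append(triple.obj)
    return description
```

Statements about other named resources never enter the description, which is what keeps the result "bounded".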
Next generation SWS protocols • While the above protocols provide some of the features that are desirable in a SWS protocol they are far from adequate to truly enable the semantic web. • Next we present two related protocols that attempt to fulfil the key SWS requirements. • RDF Net API • Topic Map Fragment Processing
RDF Net API • Background: • Developed by Andy Seaborne (HP) and Graham Moore after separate but related works • Joseki • Empolis k42 Semantic Web Server • First drafted at the 1st Semantic Web Conference in Sardinia • Since revised and recently submitted to W3C. • Goal: To define a protocol that would enable the semantic web by providing a remote protocol for querying and updating RDF Models.
SWS Architecture Overview [Diagram labels: RDF Model Impl (Jena); RDF Net API Processing Layer; Messages (up), SW Data (down); Client Application (twice)]
SWS Architecture Overview (2) [Diagram labels: Business Application (twice); RDF Layer; RDF Net API Processing Layer; Messages (up), SW Data (down); Client Application (twice)]
Definition of an abstract protocol • We did not want an XML based language • We did not want to do the syntax first • We defined an abstract protocol • This allows many different implementations to be written • This allows different transport/message protocols to be used • This should allow us to define the semantics of the operations in a robust fashion • It seemed like the sensible thing to do.
RDF Net API - Overview • Query • GetStatements • InsertStatements • RemoveStatements • PutStatements • UpdateStatements • Options
RDF Net API – Query RDF Net API Processing Layer Op-Prototype: query(ModelReference, Query, QueryLanguage, ResultsFormat) => StatementSet ModelReference: Reference to the target model for this operation Query: The query to be executed QueryLanguage: Indication of the query language ResultsFormat: Indication of the format of the results to be returned as a set of statements StatementSet: Set of statements returned Client
RDF Net API – GetStatements RDF Net API Processing Layer Op-Prototype: getStatements(ModelReference, Subject, Predicate, Object) => StatementSet ModelReference: Reference to the target model for this operation Subject: URI or * (wildcard) Predicate: URI or * Object: URI,literal or * StatementSet: Set of statements returned Client
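The wildcard matching behind getStatements can be illustrated with a few lines of Python. This is only a sketch under assumptions: statements are modelled as plain `(subject, predicate, object)` string tuples, the model is any iterable of them, and the name `get_statements` is hypothetical rather than part of the RDF Net API.

```python
from typing import Iterable, List, Tuple

Statement = Tuple[str, str, str]  # (subject, predicate, object)
WILDCARD = "*"


def get_statements(
    model: Iterable[Statement],
    subject: str = WILDCARD,
    predicate: str = WILDCARD,
    obj: str = WILDCARD,
) -> List[Statement]:
    """Return every statement matching the pattern; "*" matches anything."""

    def matches(pattern: str, value: str) -> bool:
        return pattern == WILDCARD or pattern == value

    return [
        s
        for s in model
        if matches(subject, s[0]) and matches(predicate, s[1]) and matches(obj, s[2])
    ]
```

Calling `get_statements(model)` with no pattern returns the whole model, while fixing only the subject returns everything known about that resource.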
RDF Net API – InsertStatements RDF Net API Processing Layer Op-Prototype: insertStatements(ModelReference, StatementSet) ModelReference: Reference to the target model for this operation StatementSet: Set of RDF statements for the operation Client
RDF Net API – RemoveStatements RDF Net API Processing Layer Op-Prototype: removeStatements(ModelReference, StatementSet) ModelReference: Reference to the target model for this operation StatementSet: Set of RDF statements for the operation Client
RDF Net API – PutStatements RDF Net API Processing Layer Op-prototype: putStatements(ModelReference, StatementSet) ModelReference: Reference to the target model for this operation StatementSet: Set of RDF statements for the operation Client
RDF Net API - UpdateStatements RDF Net API Processing Layer Op-prototype: updateStatements(ModelReference, RemoveSet , InsertSet) ModelReference: Reference to the target model for this operation RemoveSet: Set of RDF statements to be removed InsertSet: Set of RDF statements to be inserted Client
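The insert, remove and update operations can be sketched against an in-memory set of statements. The class and method names below are hypothetical, and two behaviours are assumptions rather than anything the slides state: that updateStatements applies the removals before the insertions, and that the new state is swapped in as a single assignment.

```python
from typing import Iterable, Set, Tuple

Statement = Tuple[str, str, str]  # (subject, predicate, object)


class InMemoryModel:
    """Toy stand-in for the model addressed by a ModelReference."""

    def __init__(self) -> None:
        self.statements: Set[Statement] = set()

    def insert_statements(self, statement_set: Iterable[Statement]) -> None:
        self.statements |= set(statement_set)

    def remove_statements(self, statement_set: Iterable[Statement]) -> None:
        # Removing a statement that is not present is treated as a no-op.
        self.statements -= set(statement_set)

    def update_statements(
        self, remove_set: Iterable[Statement], insert_set: Iterable[Statement]
    ) -> None:
        # Assumption: removals are applied first, then insertions, and the
        # result replaces the old state in one step.
        self.statements = (self.statements - set(remove_set)) | set(insert_set)
```

Because a statement is never edited in place, changing a value is expressed as removing the old statement and inserting the new one in the same call.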
RDF Net API - Options RDF Net API Processing Layer Op-prototype: options(ModelReference) => StatementSet ModelReference: Reference to the target model for this operation StatementSet: Results of the operation Client
RDF Net API - Bindings • Although we have an abstract protocol we wanted the document to have some concrete implementation bindings. • We chose to define an HTTP and SOAP binding for the protocol
RDF Net API – HTTP Binding • Uses GET with parameters for Query and GetStatements e.g. • GET http://example.com/foo HTTP/1.1 • POST is used for Update, Insert and Remove • PutStatements uses HTTP PUT • Options are retrieved by using HTTP OPTIONS • All data is received and sent as RDF XML, although could support N3 etc
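A client-side view of the HTTP binding might look like the sketch below. The mapping of operations to HTTP methods follows the slide; the query parameter names (`op`, `subject`, `predicate`, `object`) are purely illustrative assumptions, since the slide does not give them.

```python
from urllib.parse import urlencode

# Operation-to-method mapping as described on the slide.
HTTP_METHOD_FOR_OPERATION = {
    "query": "GET",
    "getStatements": "GET",
    "insertStatements": "POST",
    "removeStatements": "POST",
    "updateStatements": "POST",
    "putStatements": "PUT",
    "options": "OPTIONS",
}


def get_statements_url(
    model_url: str, subject: str = "*", predicate: str = "*", obj: str = "*"
) -> str:
    """Build a GET URL for getStatements (parameter names are hypothetical)."""
    params = {
        "op": "getStatements",
        "subject": subject,
        "predicate": predicate,
        "object": obj,
    }
    # urlencode percent-encodes the values so URIs can be passed safely.
    return model_url + "?" + urlencode(params)
```

The model URL plays the role of the ModelReference, so the same server can expose several models at different URLs.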
RDF Net API – SOAP Binding • To enable Semantic Web Servers to be generic Semantic Web Services we decided to define a SOAP binding for the API. • But before we could start mapping the operations we needed to define an RDF Data Model representation in XML Schema.
RDF Data Model in XML Schema
<types>
  <schema targetNamespace='http://www.semanticwebserver.com/rdfnetservice' …>
    <!-- A single RDF statement; isObjectLiteral distinguishes literal objects from URIs -->
    <complexType name="rdfstatement">
      <sequence>
        <element name="subject" type="xsd:string" />
        <element name="predicate" type="xsd:string" />
        <element name="object" type="xsd:string" />
        <element name="isObjectLiteral" type="xsd:boolean"/>
      </sequence>
    </complexType>
    <!-- A list of statements; sequence is used because all does not permit maxOccurs="unbounded" -->
    <complexType name="rdfstatementvector">
      <sequence>
        <element name="item" type="tns:rdfstatement" minOccurs="0" maxOccurs="unbounded"/>
      </sequence>
    </complexType>
  </schema>
</types>
Example SOAP Binding – Update Statements
<message name="updateStatementsRequest">
  <part name="modelid" type="xsd:string" />
  <part name="statementToRemove" type="tns:rdfstatementvector"/>
  <part name="statementToAdd" type="tns:rdfstatementvector"/>
</message>
<operation name="updateStatements">
  <input message="updateStatementsRequest" />
</operation>
RDF Net API - Evaluation • Simple yet powerful set of operations • Easy to implement and deploy • Some transaction support (enabled by the no edit policy) • Small footprint • Open query capability (this is the best we can do until something is standardized!) • Update operations provided • Introspection support in an extensible fashion
Issues & Questions • No security or auditing • Although one could argue that HTTPS and digital signatures could be used to achieve this to some degree • Why can’t the query language do it all? • If we have update in RDFQL why do we need these operations? • Where is the formal definition of the operation semantics? • Why is it so simple?
RDF Net API - Summary • RDF Net API is intended to be a SW enabler. We see it as being the SAX of the Semantic Web. • This protocol provides a starting point to get the semantic web talking – this is an issue that has been ignored for too long!
Now hang on just one minute!! • Where is the Topic Map protocol?
Topic Map Servers – SWS in disguise • Topic Map Server • Topic Map Fragments • Fragment processing protocol • Some aspects of the approach taken with the RDF Net API can be borrowed in defining a Topic Map Server. • HTTP, SOAP bindings tactic • The no update policy • Effectively we will re-use the infrastructure and the general shape of the protocol but replace RDF with Topic Maps.
Basic Building Block – Topic Map Fragments • Issue: A topic map is a graph of interconnected nodes. In a distributed application it is undesirable to transport all of these nodes from one application to another. What is required is the ability to send a small part of the graph, a topic map fragment, to another application. • The core issue is how to deal with /resolve • Dangling pointers • How much graph to grab • There have been a number of proposals floated for how this should actually work.
Topic Map Fragment Algorithm • When creating a fragment there are three variables that define which part, and how much of the topic map graph to return • The first piece of information that we need is where to start from: • So we need some Topic or Association identifier (this could also be a TMQL query) • Given that a topic map can be seen as a graph we need a ‘depth’ property to indicate how many ‘hops’ in the graph we should make. • Finally, Topics themselves are complex structures consisting of names, occurrences, identities etc, so we want a ‘detail’ property. This is some indicator about which topic details should be returned as part of the Topic. This could also be a localised TMQL expression.
Example • Fragment request : • Topic Subject Address : http://www.ontopia.net/docs/fragments.html • Depth : 3 • Detail : ./names[1] • Would return the topic indicated plus topics to within a depth of 3 (there are several alternative interpretations of depth) and any topics returned will only contain one name (if it exists). • A refinement on this could be to map detail expressions to depth i.e. • dd(1, ./names[1] ; ./occurs[2]), … dd(4 , ./names[1])
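One possible reading of the start/depth/detail parameters is a breadth-first walk over the topic graph. The sketch below is an illustration under several assumptions: topics are reduced to an id, a list of names and a set of neighbouring topic ids reachable through associations; "depth" counts association hops from the start topic (the slide notes there are other interpretations); and the detail expression `./names[1]` is simplified to a `max_names` limit. None of these names come from a Topic Maps specification.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class Topic:
    topic_id: str
    names: List[str] = field(default_factory=list)
    # Ids of topics linked to this one by an association (simplified).
    neighbours: Set[str] = field(default_factory=set)


def build_fragment(
    topic_map: Dict[str, Topic], start_id: str, depth: int, max_names: int
) -> Dict[str, List[str]]:
    """Return {topic id: trimmed names} for topics within `depth` hops."""
    fragment: Dict[str, List[str]] = {}
    queue = deque([(start_id, 0)])
    while queue:
        topic_id, hops = queue.popleft()
        if topic_id in fragment:
            continue
        topic = topic_map[topic_id]
        # "Detail": keep at most max_names names for each returned topic.
        fragment[topic_id] = topic.names[:max_names]
        if hops < depth:
            for neighbour in sorted(topic.neighbours):
                if neighbour not in fragment:
                    queue.append((neighbour, hops + 1))
    return fragment
```

Topics just beyond the depth limit are simply omitted here; a fuller version would return them as identifier-only stubs so the client can request further fragments.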
A More generalised approach • Replace the selection of fragment constituents with a single TMQL query.
Dangling Topics When the fragment is generated special system specific ids are used. These are recognised by the client application and can be used in subsequent requests to retrieve further fragments.
Topic Map Fragment as an enabler • Whatever the actual algorithm adopted we can make use of Topic Map fragments in conjunction with a control protocol to implement a Topic Map Server.
A Protocol for Topic Map Server • Issues that don’t exist with RDF: • RDF has no complex structures • i.e. only triples • This makes the no update policy easier to support • Operations: • Query • Add • Remove • Update • Options
Topic Map Server - Query • Essentially the same as the RDF Net API except in the nature of the query (TMQL) and the nature of the result (A Topic Map fragment)
Topic Map Server – Add / Remove Issues • Both add and remove send a topic map fragment to the server and expect the information contained within the fragment to be added/ or removed from the topic map. • While this is fine for adding or removing complete topics and associations it is not adequate for the addition of names, occurrences and identities to topics nor for their removal.
Topic Map Server – Fragment Contexts <topic id="t-1"> <baseName> <baseNameString>Semantic Web Server</baseNameString> </baseName> </topic> If we are adding this, do we want to add a new topic with a new name, or add a name to an existing Topic?
Topic Map Server – Fragment Contexts • Goal: to indicate whether the topic should be added as a new topic or added to an existing one • Several options: • The ‘id’ property contains an ‘internal system identity’ • A new property is defined on topic - isContextTopic • A special type is used within the topic • The system id exists as a subject address for this topic • All act to inform the processing TM server about what kind of add or remove to perform.
Example Add Add : <topic id="t1"> <subjectIdentity>ontopia:system:topic:23</subjectIdentity> <baseName>Graham Moore</baseName> <baseName>gdm</baseName> </topic> <topic id="t2"> <baseName>Lars Marius Garshol</baseName> </topic>
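How a server might process the example add can be sketched as follows, picking just one of the options listed earlier: a system identity carried as the topic's subject identity marks it as a context topic to be added to, while a topic without one is created as new. The store layout (a dict from system identity to names), the function names and the id allocation scheme are all assumptions made for illustration.

```python
from typing import Dict, List, Optional

SYSTEM_PREFIX = "ontopia:system:topic:"


def next_system_identity(store: Dict[str, List[str]]) -> str:
    # Hypothetical id allocation: the first unused numeric suffix.
    n = len(store) + 1
    while f"{SYSTEM_PREFIX}{n}" in store:
        n += 1
    return f"{SYSTEM_PREFIX}{n}"


def apply_add(
    store: Dict[str, List[str]], system_identity: Optional[str], names: List[str]
) -> str:
    """Add names to an existing topic, or create a new topic.

    Returns the system identity of the topic that was changed or created.
    """
    if system_identity is not None and system_identity.startswith(SYSTEM_PREFIX):
        if system_identity not in store:
            raise KeyError(f"unknown context topic: {system_identity}")
        for name in names:
            # Names already present are not duplicated.
            if name not in store[system_identity]:
                store[system_identity].append(name)
        return system_identity
    new_identity = next_system_identity(store)
    store[new_identity] = list(names)
    return new_identity
```

With the fragment above, topic t1 would extend the existing topic 23 with the name "gdm", and t2 would become a new topic named "Lars Marius Garshol".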