Technology of the Semantic Web Alan Ruttenberg Oral Diagnostic Sciences Clinical and Translational Data Exchange
Goals of this section • Discuss a number of core standards and technologies related to the Semantic Web • URIs • HTTP • RDF • OWL • SPARQL • Discuss several practices I’ve found useful • Adopt a common set of principles. Work with others • Be relentless in reuse • MIREOT • Build and release ontologies like software
Uniform Resource Identifiers • Globally unique identifiers • Namespace collisions avoided through domain name ownership • Many are associated with network protocols (e.g. http:), but some are not (e.g. urn:) • HTTP URIs are favored for Semantic Web usage • httpRange-14: an attempt to clarify that an HTTP URI can refer to anything (not just web pages)
The case for using URIs as identifiers and the http URI scheme • A global namespace promotes generic tools • Query, inference, cross-reference, data integration • URIs coordinate with web standards • Created with and for the Web • IETF and W3C recommended for naming • HTTP, HTML, RDF, OWL, SPARQL • http: URIs are universally understood • Most people will know what to do with an http: URI • http: URIs can identify anything • Not only a web page, but any kind of entity • http: URIs are as reliable as anything else • Durability doesn’t depend on protocol • http: URIs are not tied to the HTTP protocol • It is easy to be helpful and give back documentation about an entity when the URI identifier is put into a web browser.
Cool URIs don’t change • To build stable, predictable resources, we want people to be able to keep accessing the same URI over time • Changes in organizational structure, jobs, and funding conspire to interfere with that • Solution: use two levels of indirection through a public redirection server (http://purl.org), which gives us the option to fix things if they break.
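A minimal sketch of why indirection keeps the public URI stable. Every URI and the `resolve` helper here are hypothetical and only illustrate the idea; this is not how purl.org is implemented.

```python
# Hypothetical redirect table: public URI -> intermediate hop -> current file location.
REDIRECTS = {
    "http://purl.example.org/obo/EX_0000001": "http://redirect.example.org/obo/EX_0000001",
    "http://redirect.example.org/obo/EX_0000001": "http://files.example.org/releases/ex.owl",
}


def resolve(uri, redirects, max_hops=5):
    """Follow redirects from uri and return every URI visited, in order."""
    seen = [uri]
    while uri in redirects:
        if len(seen) > max_hops:
            raise RuntimeError("too many redirects")
        uri = redirects[uri]
        seen.append(uri)
    return seen


def relocate(redirects, hop, new_target):
    """Return a new table in which one hop points somewhere else."""
    updated = dict(redirects)
    updated[hop] = new_target
    return updated
```

If the files move, only one table entry changes; the URI that people have already cited stays the same.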
OBO URI Strategy: cheap, relocatable = sustainable • Example: http://purl.obofoundry.org/obo/OBI_0000225 • http: network protocol • purl.obofoundry.org: domain name we own, currently a synonym (CNAME) for purl.org, so we can redirect somewhere else if purl.org goes belly up • /obo/: path segment that lets us coexist with other PURLs • OBI_0000225: prefix plus numerical accession • Same term via purl.org: http://purl.org/obo/OBI_0000225 • ID policy: http://obofoundry.org/id-policy.shtml
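The parts of the example URI can be assembled and taken apart with a few lines of Python. This is a sketch, not an official OBO tool: the function names are made up, and the 7-digit zero padding is an assumption read off the single example on the slide.

```python
from urllib.parse import urlparse

# Base taken from the example URI on the slide.
PURL_BASE = "http://purl.obofoundry.org/obo/"


def make_obo_uri(prefix, accession, width=7):
    """Build a term URI from an ontology prefix and a numerical accession.

    width=7 is an assumption based on the slide's example (OBI_0000225).
    """
    return f"{PURL_BASE}{prefix}_{accession:0{width}d}"


def split_obo_uri(uri):
    """Split a term URI into the parts labelled on the slide."""
    parts = urlparse(uri)
    namespace, local_id = parts.path.rsplit("/", 1)
    prefix, _, accession = local_id.partition("_")
    return {
        "protocol": parts.scheme,
        "domain": parts.netloc,
        "namespace": namespace,
        "prefix": prefix,
        "accession": accession,
    }
```

For example, `make_obo_uri("OBI", 225)` rebuilds the URI shown above, and `split_obo_uri` recovers its protocol, domain, namespace, prefix, and accession.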
RDF • An extremely simple language • No classes • All assertions are ‘triples’ subject predicate object. • Can’t be inconsistent (except in uninteresting case) • A well defined semantics (see http://www.w3.org/TR/rdf-mt/) Though that doesn’t prevent it from being misunderstood • Introduces model-theoretic semantics, interpretation, graphs • Provides a basis for semantic extensions
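To make the triple model concrete, here is a sketch that stores a graph as a plain Python set of (subject, predicate, object) tuples. It deliberately avoids any RDF library; the example.org URIs and the `match` helper are hypothetical.

```python
# A graph is just a set of (subject, predicate, object) triples.
EX = "http://example.org/"  # hypothetical namespace
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"

graph = {
    (EX + "alice", EX + "knows", EX + "bob"),
    (EX + "bob", EX + "knows", EX + "carol"),
    (EX + "alice", RDFS_LABEL, "Alice"),
}


def match(triples, s=None, p=None, o=None):
    """Return the triples matching a pattern; None acts as a wildcard."""
    return sorted(
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )
```

Because every assertion has the same three-part shape, one generic pattern matcher is enough to query any graph, which is the intuition behind SPARQL's triple patterns.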
RDFS • Extends RDF with • Classes • Domain, range of properties • Subclass relation • Subproperty relation • A few data structures (bag, list, etc., though these have problems) • Some standard annotation properties such as rdfs:label, rdfs:comment … • No more inconsistencies than RDF
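A sketch of what the subclass relation buys you: two RDFS entailment patterns (subclass transitivity, and propagating rdf:type up the subclass hierarchy) applied until nothing new is derived. It covers only those two patterns, uses abbreviated names instead of full URIs, and the `ex:` terms are hypothetical.

```python
TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"


def rdfs_subclass_closure(triples):
    """Add triples entailed by subclass transitivity and type propagation."""
    inferred = set(triples)
    while True:
        new = set()
        for (sub, p, sup) in inferred:
            if p != SUBCLASS:
                continue
            for (s2, p2, o2) in inferred:
                # sub ⊑ sup and sup ⊑ o2  =>  sub ⊑ o2
                if p2 == SUBCLASS and s2 == sup:
                    new.add((sub, SUBCLASS, o2))
                # s2 is a sub and sub ⊑ sup  =>  s2 is a sup
                if p2 == TYPE and o2 == sub:
                    new.add((s2, TYPE, sup))
        if new <= inferred:
            return inferred
        inferred |= new
```

Note that nothing here can produce a contradiction: the rules only ever add triples, which matches the point that RDFS admits no more inconsistencies than RDF.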
OWL • OWL is the first Semweb language that allows for inconsistencies. It allows work that is good enough to be wrong • Some aspects of OWL 2 • Fragment of first order logic • Profiles allow one to choose expressivity vs performance • Well documented conformance criteria • Maximum expressivity while still decidable • Allows metalogical statements (“think post-it notes”) about anything in the language, even expressions.
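A toy illustration of how OWL makes it possible to be wrong. The same type assertions that are harmless in RDF become contradictory once a disjointness axiom is added. This is only a sketch of one kind of clash, not an OWL reasoner; the `ex:` names and the helper are hypothetical.

```python
RDF_TYPE = "rdf:type"
OWL_DISJOINT = "owl:disjointWith"


def disjointness_clashes(triples):
    """Return (individual, class_a, class_b) for asserted types that are declared disjoint."""
    types = {}
    for (s, p, o) in triples:
        if p == RDF_TYPE:
            types.setdefault(s, set()).add(o)
    clashes = []
    for (a, p, b) in triples:
        if p == OWL_DISJOINT:
            for individual, classes in types.items():
                if a in classes and b in classes:
                    clashes.append((individual, a, b))
    return sorted(clashes)
```

A real reasoner detects far more than this, including clashes that only appear after inference, but the sketch shows the key shift: the language can now express statements that cannot all be true together.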
Relentless Reuse • OBO Foundry aims at building orthogonal interoperating resources • Reuse is successfully achieved by using the identifier from the source ontology in your own ontology. • See http://obofoundry.org/ OBI: cell CL: cell
There are many existing ontologies (but make sure they are good)
MIREOT • Minimum Information to Reference an External Ontology Term • ~ copy/paste of terms into your own ontology • Terms in OBO Foundry ontologies stand on their own • If their meaning changes, they are deprecated => the denotation of individual terms remains stable => they can be seen as individual units of meaning
MIREOT in practice • xxx.owl: main ontology file; imports the external.owl and externalDerived.owl files • external.owl: contains minimal information about mapped classes • externalDerived.owl: contains additional information about mapped classes; produced by a script
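A sketch of what the minimal information in external.owl might look like as data. It assumes the minimal set described in the MIREOT guideline (source term URI, source ontology URI, and the target superclass in the importing ontology). The `ex:importedFrom` predicate, all example.org URIs, and the helper names are hypothetical placeholders.

```python
from dataclasses import dataclass

SUBCLASS_OF = "rdfs:subClassOf"
IMPORTED_FROM = "ex:importedFrom"  # hypothetical annotation property


@dataclass(frozen=True)
class MireotTerm:
    source_term: str        # URI of the external term, reused unchanged
    source_ontology: str    # URI of the ontology the term comes from
    target_superclass: str  # where the term is placed in your own ontology


def minimal_triples(term):
    """Triples a file like external.owl would need for one imported term."""
    return [
        (term.source_term, IMPORTED_FROM, term.source_ontology),
        (term.source_term, SUBCLASS_OF, term.target_superclass),
    ]
```

Anything beyond this, such as labels or definitions, is the kind of additional information that belongs in externalDerived.owl.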
Have a deliberate release process • The goal of a deliberate release process is to prepare a version aimed at users (who might be other developers or biologists) rather than at the developers of the ontology • Typical activities • Create a dated release directory in the repository • Copy project files to the release directory • Update MIREOT-related files • Merge into one release file for ease of use • Run a reasoner to add inferred axioms • OWL and OBO formats • Quality checks • Create dated and "latest" PURLs as stable URIs for your ontology
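The first two activities and the PURL naming can be sketched as a small script. This is an illustration under assumptions, not an actual OBO release tool: the directory layout, the `*.owl` file pattern, and the PURL shape are all hypothetical.

```python
import shutil
from datetime import date
from pathlib import Path


def make_release(project_dir, releases_root, when):
    """Create a dated release directory and copy the project's .owl files into it."""
    release_dir = Path(releases_root) / when.isoformat()
    # exist_ok=False: refuse to overwrite a release that was already published.
    release_dir.mkdir(parents=True, exist_ok=False)
    for src in sorted(Path(project_dir).glob("*.owl")):
        shutil.copy2(src, release_dir / src.name)
    return release_dir


def release_purls(purl_base, name, when):
    """Return a dated and a 'latest' PURL for one release (hypothetical URI shape)."""
    return {
        "dated": f"{purl_base}/{when.isoformat()}/{name}.owl",
        "latest": f"{purl_base}/{name}.owl",
    }
```

The dated PURL always points at the same frozen files, while the "latest" PURL is redirected to each new release, so users can choose between reproducibility and currency.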
Adopt shared principles • From http://www.obofoundry.org/wiki/index.php/Category:Accepted
Links • http://www.w3.org/TR/webarch/ • http://www.w3.org/wiki/AwwswHome • http://www.ietf.org/rfc/rfc2396.txt (superseded by http://tools.ietf.org/html/rfc3986) • http://www.w3.org/TR/rdf-mt/ • http://www.w3.org/TR/rdf-schema/ • http://www.w3.org/TR/owl2-overview/