Representing URI Resolution in OWL
Alan Ruttenberg (responsible for errors), Matthias Samwald, Jonathan Rees
Alan Ruttenberg, Oct 3, 2006 HCLS F2F
What goes wrong with URLs
• The server disappears (s)
• The content disappears - 404 (c)
• The content might change and you want to know and communicate what it used to be (d)
• Access to the content is too slow (s)
• Access to the content is too public (p)
• The content is very big (b)
• You don't know if a URI is an information resource or not (w)
• You want to record and access metadata - information about some information resource - and you don't know where to get it (m)
• You don't know what format an information resource is encoded in (f)
Existing proposals: LSIDs
• Authority
• “Location independence”
• Data/Metadata distinction
• Access method independence
• Versioning
• (similar: ARK, PURL, DOI…)
Existing proposals: httpRange-14
• Use http.
• Use the result code to recognize potential non-information resources
• Result code 2xx = information resource
• Result code 3xx = any resource, pointer to more information
• Result code 4xx = unknown type
• “303” to get more information about the thing
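The result-code rule on this slide can be sketched as a small classifier. This is only an illustration: the function name `classify_by_status` and the returned labels are assumptions made here, and the slide says nothing about 1xx or 5xx codes.

```python
def classify_by_status(status: int) -> str:
    """Map an HTTP result code to what the slide says it tells us about the URI."""
    if 200 <= status < 300:
        return "information resource"
    if 300 <= status < 400:
        # 303 in particular points to more information about the thing.
        return "any resource, pointer to more information"
    if 400 <= status < 500:
        return "unknown type"
    # Assumption: codes outside 2xx-4xx are not covered by the slide.
    return "no conclusion"


for code in (200, 303, 404):
    print(code, "->", classify_by_status(code))
```

Note that the answer is only available after the request has been made, which is the "late" objection raised later in the deck.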
Existing proposals: Content negotiation
• Agent asks for resource
• Server responds with list of content types and where to get each
• Agent chooses which to retrieve
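The agent's side of this exchange might look like the following sketch. It assumes the server's reply has already been parsed into a mapping from content type to location; `choose_variant` and the example.org locations are hypothetical.

```python
from typing import Dict, List, Optional, Tuple


def choose_variant(offered: Dict[str, str],
                   preferred: List[str]) -> Optional[Tuple[str, str]]:
    """Return the first (content type, location) the agent prefers, if any."""
    for content_type in preferred:
        if content_type in offered:
            return content_type, offered[content_type]
    return None


# Hypothetical server reply: content types and where to get each.
offered = {
    "text/html": "http://example.org/foo.html",
    "application/rdf+xml": "http://example.org/foo.rdf",
}
print(choose_variant(offered, ["application/rdf+xml", "text/html"]))
```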
“Short statements”: John Barkley
• Identify MIME types of URIs.
• Identify versions of URIs.
• URIs should dereference to something, even if it is only documentation, e.g., rdfs:comment.
• Use LSIDs.
“Short statements”: Phil Lord
• URIs should identify one thing only.
• URI allocation schemes should encourage stability over time.
• Resources identified by URIs should be checksummable.
“Short statements”: Matthias Samwald
• Be careful what you are talking about - use separate names
• Distinguish information resource from non-information resource
• Distinguish making a query from resolution
“Short statements”: David Booth (Information resources/metadata)
• http://example.org/foo#bar might identify a thing (non-information resource); http://example.org/foo can be used to seek metadata about it.
• Dereferencing non-information resources yields an HTTP 303 result code.
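The first point can be illustrated by stripping the fragment from the URI of the thing to get the URI where metadata is sought. A minimal sketch using the standard library; the helper name `metadata_uri` is an assumption made here.

```python
from urllib.parse import urldefrag


def metadata_uri(uri: str) -> str:
    """Drop the fragment: the remaining URI is where metadata can be sought."""
    base, _fragment = urldefrag(uri)
    return base


print(metadata_uri("http://example.org/foo#bar"))
```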
Selected issues with proposed solutions
• httpRange-14 - “late”: you don’t know anything until you do the retrieval
• Content negotiation - confusion over what the thing is, e.g. a FOAF human-readable document and RDF at the same address. Try talking about the ugly font.
• LSID - requires server deployment. Based on web services (slow). Unclear semantics of “versions”, “metadata”, “data”
• No single proposal deals with all issues
An Alternative
• Use our SW tools to help solve this problem
• Represent the information that you want to know about URIs in OWL.
• Build an ontology to represent consensus/schema.
• Take advantage of consistency checking, inheritance
Goals
• Transparent/explicit. Contract based.
• Adjustable
• Extendable
• Ontologically sound
Different things
• The temperature of a patient (not an information resource)
• An instrument that measures and reports temperature (not an information resource)
• The record retrieved when you query the instrument (an information resource)
• The record that you retrieved at a certain time and you copied and saved (an information resource)
These are different things, but related.
A sketch
• Some useful distinctions
• Top level classes
• Some properties
• Only about instances (not properties or classes)
InformationResource, NotAnInformationResource
• Information resources are conceptually “Gettable”
• They might not be retrievable at a particular time
• They might change
• Ask yourself: “Would it be possible to get the thing itself over a network?”
• The two classes are disjoint
UnchangingInformationResource, EvolveableInformationResource
• UnchangingInformationResource is like LSID “data”. A promise is made that the content will never change.
• EvolveableInformationResources are resources that might change (even if we don’t want them to, e.g. NCBI gene records)
• The two classes are disjoint
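The disjointness declared for these class pairs is what lets a consistency check catch a URI typed both ways. The sketch below is a toy stand-in for what an OWL reasoner would do, not a reasoner; the `DISJOINT` table and `disjointness_violations` are assumptions made here, using the class names from the slides.

```python
from typing import Iterable, List, Tuple

# Disjoint pairs taken from the slides.
DISJOINT: List[Tuple[str, str]] = [
    ("InformationResource", "NotAnInformationResource"),
    ("UnchangingInformationResource", "EvolveableInformationResource"),
]


def disjointness_violations(types: Iterable[str]) -> List[Tuple[str, str]]:
    """Return every disjoint pair for which one individual is asserted to be both."""
    asserted = set(types)
    return [pair for pair in DISJOINT
            if pair[0] in asserted and pair[1] in asserted]


# An individual typed consistently, and one typed inconsistently.
print(disjointness_violations(["InformationResource", "UnchangingInformationResource"]))
print(disjointness_violations(["UnchangingInformationResource", "EvolveableInformationResource"]))
```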
RetrievalMethod
• A way to get an information resource.
• Some examples:
• StandardURIRetrieval
• TransformUriRetrieval: http://genesdbs.org/entrez/7157 => http://cache.ibm.com/?generetrieve&id=http://genesdbs.org/entrez/
• SPARQLMethod: http://genesdbs.org/entrez/7157 => select ?data FROM http://sparql.ibm.com/lifesci WHERE { http://genesdbs.org/entrez/7157 :data ?data }
• WebServiceMethod: supply a WSDL, name of parameter
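Two of these methods amount to rewriting the URI before doing anything on the network, which can be sketched as plain string functions. The function names are hypothetical, and two assumptions are made: the transform simply appends the whole original URI to the cache prefix (the slide's own target stops at `entrez/`, so the exact rule is not known), and the query writes the URIs in angle brackets while leaving the `:data` prefix undeclared, as on the slide.

```python
def transform_uri(uri: str,
                  prefix: str = "http://cache.ibm.com/?generetrieve&id=") -> str:
    """TransformUriRetrieval sketch: assumes the original URI is appended whole."""
    return prefix + uri


def sparql_query(uri: str, graph: str = "http://sparql.ibm.com/lifesci") -> str:
    """SPARQLMethod sketch: build the query text that asks for the resource's data."""
    # The ':data' prefix is not declared on the slide and is left as-is here.
    return f"SELECT ?data FROM <{graph}> WHERE {{ <{uri}> :data ?data }}"


gene = "http://genesdbs.org/entrez/7157"
print(transform_uri(gene))
print(sparql_query(gene))
```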
RetrievalMethod (notes)
• There may be more than one.
• When there is more than one, try them all in random order, or explicitly represent preference.
• For company-specific retrieval, add another RetrievalMethod to an appropriate upper class (one more triple)
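The "try them all" rule might be sketched as follows, with each retrieval method modelled as a plain callable. Everything here is illustrative: `retrieve`, the `ordered` flag, and the two stub methods are assumptions, and a failed method is assumed to raise `OSError` or return `None`.

```python
import random
from typing import Callable, Optional, Sequence

Method = Callable[[str], Optional[bytes]]


def retrieve(uri: str, methods: Sequence[Method],
             ordered: bool = False) -> Optional[bytes]:
    """Try each method until one yields content.

    ordered=False: random order; ordered=True: the given order expresses preference.
    """
    candidates = list(methods)
    if not ordered:
        random.shuffle(candidates)
    for method in candidates:
        try:
            content = method(uri)
        except OSError:
            continue  # this method failed, move on to the next one
        if content is not None:
            return content
    return None


# Stub methods standing in for real retrieval.
def broken(uri: str) -> Optional[bytes]:
    raise OSError("server disappeared")


def working(uri: str) -> Optional[bytes]:
    return b"record"


print(retrieve("http://genesdbs.org/entrez/7157", [broken, working]))
```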
InformationResourceFormat
• Explicitly give enough information to know what you will have to parse should you retrieve the resource (so you can choose whether or not to retrieve)
• Like a MIME type, BUT only the format, not the type (the type is defined by class)
• RDFXML
• RDFTurtle
• JPEG
• TIFF
• HTML
• …
Note: Different formats of the “same” digital thing should be given different names (but they may be related by some property, e.g. hasOtherRepresentation).
Some classes of InformationResource
• XrayImage
• Triples (RDF statements)
• MedicalRecord
• VersionInformation
• ProtocolDocumentation
• ProvenanceDescription
• SPARQLEndpoint
• …
• (ask me about metadata!)
Properties
• Relating a NotAnInformationResource to an InformationResource: seeAlso, subjectOfMedicalRecord, foaf:homepage, biozen:d-data (‘described by data’)
Properties
• Relating an InformationResource to “metadata”
• previousVersion (UnchangingInformationResource)
• hasVersionDescription
• hasChangeDescription
• generatedBy (not an information resource)
• cachedDate
• hasMD5 (UnchangingInformationResource)
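A recorded hasMD5 value gives a way to check the promise that an UnchangingInformationResource never changes: hash what was retrieved and compare. A minimal sketch; `matches_md5` and the sample content are assumptions made here.

```python
import hashlib


def matches_md5(content: bytes, expected_hex: str) -> bool:
    """True if the retrieved content still has the recorded MD5 digest."""
    return hashlib.md5(content).hexdigest() == expected_hex.lower()


original = b"example record"
recorded = hashlib.md5(original).hexdigest()  # stands in for a stored hasMD5 value

print(matches_md5(original, recorded))
print(matches_md5(b"example record, edited", recorded))
```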
An example
Class PartnersDigitalXray
    subclassOf NeverChangingInformationResource
    subclassOf DigitalXray
    mediaType hasValue jpegType
    retrievalmethod hasValue webServiceMethod1002
http://partners.org/radiology/817277366
    rdf:type PartnersDigitalXray
    mediaType jpegType (inherited)
    retrievalmethod webServiceMethod1002 (inherited)
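The inheritance in this example can be mimicked in a few lines: values fixed on a class by hasValue apply to every instance of that class. This is a toy model rather than OWL semantics; the `CLASSES` and `INSTANCES` tables and both function names are assumptions, with the class, property, and value names copied from the slide.

```python
from typing import Dict

# Class definitions: superclasses plus the values fixed by hasValue.
CLASSES = {
    "NeverChangingInformationResource": {"subclass_of": [], "has_value": {}},
    "DigitalXray": {"subclass_of": [], "has_value": {}},
    "PartnersDigitalXray": {
        "subclass_of": ["NeverChangingInformationResource", "DigitalXray"],
        "has_value": {
            "mediaType": "jpegType",
            "retrievalmethod": "webServiceMethod1002",
        },
    },
}

# rdf:type assertions for individual URIs.
INSTANCES = {"http://partners.org/radiology/817277366": "PartnersDigitalXray"}


def inherited_values(cls: str) -> Dict[str, str]:
    """Collect hasValue settings from a class and all of its superclasses."""
    values: Dict[str, str] = {}
    for parent in CLASSES[cls]["subclass_of"]:
        values.update(inherited_values(parent))
    values.update(CLASSES[cls]["has_value"])
    return values


def describe(uri: str) -> Dict[str, str]:
    """What is known about a URI purely from its class membership."""
    return inherited_values(INSTANCES[uri])


print(describe("http://partners.org/radiology/817277366"))
```

The point of the design is that stating the type of a URI is enough: the format and the retrieval method follow from the class, with no per-URI triples.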
Sharing bare URIs
• Don’t, unless you have to. Generally, messages should be a set of triples giving adequate information about type and resolution.
• If you do, use existing best practices to make them last, e.g. PURL
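One way to read "adequate information" is that a message about a URI should at least state its type and how to retrieve it. The check below is a sketch under that assumption; the `REQUIRED` list and `missing_predicates` are hypothetical, and the predicate names are borrowed from the example slide.

```python
from typing import Iterable, List, Tuple

Triple = Tuple[str, str, str]

# Assumption: type and retrieval method are the minimum worth sending.
REQUIRED = ("rdf:type", "retrievalmethod")


def missing_predicates(triples: Iterable[Triple], uri: str) -> List[str]:
    """List the required predicates that the message does not state for this URI."""
    present = {p for s, p, _o in triples if s == uri}
    return [p for p in REQUIRED if p not in present]


xray = "http://partners.org/radiology/817277366"
message = [
    (xray, "rdf:type", "PartnersDigitalXray"),
    (xray, "retrievalmethod", "webServiceMethod1002"),
]
print(missing_predicates(message, xray))  # complete message
print(missing_predicates([], xray))       # a bare URI carries none of this
```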