
Pete Johnston, UKOLN, University of Bath, Bath, BA2 7AY

RDF, XML and interoperability. Managing networks: understanding new technologies, Birmingham, 13 September 2001. Email: p.johnston@ukoln.ac.uk. URL: http://www.ukoln.ac.uk/.


Presentation Transcript


  1. RDF, XML and interoperability
     Managing networks: understanding new technologies, Birmingham, 13 September 2001
     Pete Johnston, UKOLN, University of Bath, Bath, BA2 7AY
     Email: p.johnston@ukoln.ac.uk
     URL: http://www.ukoln.ac.uk/

  2. RDF, XML & interoperability
     • Metadata: a reprise
     • Communities, communication & XML
     • An introduction to RDF
     • RDF, XML and interoperability

  3. What is metadata?
     • “Data associated with objects which relieves their potential users of having to have full advance knowledge of their existence or characteristics. A user might be a program or a person.”
         • Dempsey and Heery, 1998
     • “Machine understandable information about web resources or other things.”
         • Berners-Lee, 1997
     • Structured data about resources that can be used to help support a wide range of operations

  4. What resources, objects, things?
     • HTML documents
     • digital images
     • databases
     • books
     • museum objects
     • archival records
     • metadata records
     • collections
     • services
     • physical places
     • people
     • abstract “works”
     • concepts
     • events

  5. What operations?
     • User wants to
         • find, identify, select, obtain / use
     • Owner / manager / provider wants to
         • describe
         • enable and control access/use
         • administer
     • Different “flavours” of metadata serve different purposes
         • Simple, generic vs. rich, specific

  6. Communities & communication
     • Effective transmission of information requires agreement on
         • semantics
             • what terms mean
             • e.g. “cat”, “to sit”, “mat”
         • structure
             • significance of arrangement of terms
             • e.g. sentence: subject -> verb -> object (in English…)
         • syntax
             • rules of expression
             • “The cat sat on the mat.”
     • A resource description community is defined by consensus on conventions

  7. Communication using XML (1)
     • An example
         • I prepare a music catalogue using the (imaginary!) AlbumCat XML schema
         • I publish my XML document on the Web
         • someone else prepares a catalogue using the same XML schema and publishes their XML document
         • I can read their XML document and locate tracks created by Don Van Vliet in their catalogue
     • But more importantly…

  10. Communication using XML (2)
      … my software can search their document because I have programmed it to map:
      • User request: Find identifiers of all tracks with creator “Don Van Vliet”
      • Program action: Find values of dc:identifier attributes of track elements which have a dc:creator child element with content “Don Van Vliet”

  11. Communication using XML (3)
      • Program action:
          • Find
          • values of dc:identifier attributes
          • of track elements
          • which have a dc:creator child element
          • with content “Don Van Vliet”
      <catalogue>
        <album dc:identifier="http://pj.org/album/245">
          <dc:title>The Spotlight Kid</dc:title>
          <dc:creator>Van Vliet, Don</dc:creator>
          <track dc:identifier="http://pj.org/track/723">
            <dc:title>Grow fins</dc:title>
            <dc:creator>Van Vliet, Don</dc:creator>
          </track>
        </album>
      </catalogue>
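
The program action on this slide can be sketched in a few lines of Python using the standard library's `xml.etree.ElementTree`. This is only an illustration, not the software the talk refers to. Two assumptions are made: the slide omits a namespace declaration, so the Dublin Core namespace URI is declared on the root element here, and the function name `find_track_ids` is invented for the example.

```python
import xml.etree.ElementTree as ET

# Assumption: the slide's sample has no xmlns:dc declaration; the Dublin Core
# element set namespace is declared here so that the document parses.
DC = "http://purl.org/dc/elements/1.1/"

ALBUMCAT_DOC = f"""
<catalogue xmlns:dc="{DC}">
  <album dc:identifier="http://pj.org/album/245">
    <dc:title>The Spotlight Kid</dc:title>
    <dc:creator>Van Vliet, Don</dc:creator>
    <track dc:identifier="http://pj.org/track/723">
      <dc:title>Grow fins</dc:title>
      <dc:creator>Van Vliet, Don</dc:creator>
    </track>
  </album>
</catalogue>
"""

def find_track_ids(xml_text, creator):
    """Hypothetical helper: dc:identifier values of track elements that
    have a dc:creator child element whose content equals `creator`."""
    root = ET.fromstring(xml_text)
    ids = []
    for track in root.iter("track"):
        creators = [c.text for c in track.findall(f"{{{DC}}}creator")]
        if creator in creators:
            ids.append(track.get(f"{{{DC}}}identifier"))
    return ids

# The sample stores the name as "Van Vliet, Don", so that form is searched for.
print(find_track_ids(ALBUMCAT_DOC, "Van Vliet, Don"))
```

Note that the knowledge "identifiers are attributes of track elements, creators are child elements" lives entirely in the function. Nothing in the XML document itself tells a program that this is how a statement about a track is expressed, and a search for the literal string "Don Van Vliet" would find nothing in this sample because the name is stored in inverted form.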

  12. Metadata use
      • Resource users wish to
          • search across the boundaries of communities
          • combine resources from different communities
      • Resource providers wish to
          • exchange descriptions with members of other communities
      • Third parties wish to
          • describe resources owned/described by others
      • Metadata is
          • used beyond its creator community
          • combined with metadata from other communities

  13. Communication using XML (4)
      • Continuing the example
          • a museum describes its holdings using the (imaginary…) ArtCat XML schema and publishes its XML document
          • I can read their XML document and locate pictures created by Don Van Vliet listed in their catalogue
          • requires my guesswork and/or reference to semantics of ArtCat schema
      • But…

  15. Communication using XML (5)
      … to search across both catalogues, my software now has to be programmed with two mappings:
      • User request: Find identifiers of all “works” with creator “Don Van Vliet”
      • Program action (AlbumCat): Find values of dc:identifier attributes of track elements which have a dc:creator child element with content “Don Van Vliet”
      • Program action (ArtCat): Find content of dc:identifier elements which have a picture parent element with a details child element which has a dc:creator child element with content “Don Van Vliet”
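
The two mappings can be sketched as two separate functions selected by the schema in use. This is a rough illustration under several assumptions: the ArtCat document is not reproduced in the transcript, so the `<gallery>` sample, its root element name and the `museum.example` identifier are all hypothetical, built only from the structure described in the ArtCat program action. The function names and the dispatch on root element name are likewise invented for the example.

```python
import xml.etree.ElementTree as ET

DC = "http://purl.org/dc/elements/1.1/"
NS = {"dc": DC}

# Cut-down AlbumCat sample (namespace declaration added as an assumption).
ALBUMCAT = f"""<catalogue xmlns:dc="{DC}">
  <album dc:identifier="http://pj.org/album/245">
    <track dc:identifier="http://pj.org/track/723">
      <dc:creator>Van Vliet, Don</dc:creator>
    </track>
  </album>
</catalogue>"""

# Hypothetical ArtCat sample: only the picture/details/dc:creator and
# picture/dc:identifier structure comes from the slide text.
ARTCAT = f"""<gallery xmlns:dc="{DC}">
  <picture>
    <dc:identifier>http://museum.example/picture/17</dc:identifier>
    <details>
      <dc:creator>Van Vliet, Don</dc:creator>
    </details>
  </picture>
</gallery>"""

def albumcat_ids(root, creator):
    """Mapping 1: identifier is an attribute of track, creator is a child."""
    return [t.get(f"{{{DC}}}identifier")
            for t in root.iter("track")
            if any(c.text == creator for c in t.findall("dc:creator", NS))]

def artcat_ids(root, creator):
    """Mapping 2: identifier is a child of picture, creator sits under details."""
    ids = []
    for pic in root.iter("picture"):
        if any(c.text == creator for c in pic.findall("details/dc:creator", NS)):
            ids.extend(i.text for i in pic.findall("dc:identifier", NS))
    return ids

# One hand-written mapping per schema, keyed here by root element name.
MAPPINGS = {"catalogue": albumcat_ids, "gallery": artcat_ids}

def find_work_ids(documents, creator):
    ids = []
    for text in documents:
        root = ET.fromstring(text)
        mapping = MAPPINGS.get(root.tag)
        if mapping is None:
            raise ValueError(f"no mapping for schema with root <{root.tag}>")
        ids.extend(mapping(root, creator))
    return ids

print(find_work_ids([ALBUMCAT, ARTCAT], "Van Vliet, Don"))
```

Every new schema needs another entry in `MAPPINGS`, and a document in an unknown schema cannot be searched at all, which is the scaling difficulty the next slides describe.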

  16. The problem
      • Statement
          • this resource (track, picture… etc!) has dc:creator “Don Van Vliet”
      • Multiple expressions in XML
          • different XML schemas make different choices
          • all “good” (and valid)
          • human reader of document can interpret (maybe)
          • program needs prior “knowledge” of structural conventions in each XML schema
      • Not scalable in an “open” environment
          • how to manage ever increasing set of conventions
          • always encountering unknown schemas

  17. The problem (2)
      • “XML allows users to add arbitrary structure to their documents but says nothing about what the structures mean.”
          • Berners-Lee, 2001
      • Consensus on syntax
          • use of XML
      • Consensus on semantics of terms
          • meaning of (uniquely named through XML namespace) elements/attributes
      • No consensus on meaning of structure
          • e.g. parent-child element relations

  18. Introducing RDF
      • Resource Description Framework Model & Syntax
          • Recommendation of W3C, 1999
      • Generic “architecture” for metadata
          • set of conventions for applications exchanging metadata
          • allow semantics to be defined by different resource description communities
          • accommodate mixing of metadata from diverse sources

  19. Introducing RDF (2)
      • Defines
          • model for making statements about resources
          • conventions for encoding statements using XML syntax
      • Object types
          • Resource: any object identified by URI
              • not necessarily accessible via Web
          • Property: “attribute” to describe resource
              • properties also uniquely identified by URI
          • Statement: “triple” of specific resource, named property, and value

  20. The RDF model
      • A resource has some property whose value is either (i) a simple string value (literal)…
          [http://pj.org/doc/1] --author--> "Pete"
      • The resource identified by the URI http://pj.org/doc/1 has a property “author” whose value is “Pete”
      • Or, “Pete” is the “author” of the resource identified by http://pj.org/doc/1

  21. The RDF model (2)
      • … or (ii) another resource…
          [http://pj.org/doc/1] --author--> [unnamed resource]
          [unnamed resource] --name--> "Pete"
          [unnamed resource] --email--> "pete@pj.org"
      • The value of property “author” is another resource which has a property “name” with value “Pete” and a property “email” with value “pete@pj.org”

  22. The RDF model (3)
      • … which may itself have a URI
          [http://pj.org/doc/1] --author--> [http://pj.org/person/pete]
          [http://pj.org/person/pete] --name--> "Pete"
          [http://pj.org/person/pete] --email--> "pete@pj.org"
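
The model on the last three slides (a statement is a triple whose value is either a literal or another resource) can be written down as a small data structure. The sketch below is not an RDF library. The class names `Resource`, `Literal` and `Statement` and the helper `values` are invented for illustration, and the property URIs assume the `http://www.ukoln.ac.uk/core/` namespace that the deck's RDF/XML examples use.

```python
from dataclasses import dataclass
from typing import Union

@dataclass(frozen=True)
class Resource:
    """Anything identified by a URI (documents, people, properties)."""
    uri: str

@dataclass(frozen=True)
class Literal:
    """A simple string value."""
    text: str

@dataclass(frozen=True)
class Statement:
    """One triple: a specific resource, a named property, and a value."""
    subject: Resource
    prop: Resource
    value: Union[Resource, Literal]

# Assumed vocabulary namespace, taken from the deck's RDF/XML slides.
UC = "http://www.ukoln.ac.uk/core/"
AUTHOR, NAME, EMAIL = (Resource(UC + n) for n in ("author", "name", "email"))

doc = Resource("http://pj.org/doc/1")
pete = Resource("http://pj.org/person/pete")

graph = {
    Statement(doc, AUTHOR, pete),                    # value is a resource
    Statement(pete, NAME, Literal("Pete")),          # value is a literal
    Statement(pete, EMAIL, Literal("pete@pj.org")),  # value is a literal
}

def values(graph, subject, prop):
    """All values of `prop` for `subject` in the set of statements."""
    return {s.value for s in graph if s.subject == subject and s.prop == prop}

# Follow author -> email: two hops through the graph.
for person in values(graph, doc, AUTHOR):
    for email in values(graph, person, EMAIL):
        print(email.text)  # pete@pj.org
```

Because properties are themselves identified by URIs, `AUTHOR` is just another `Resource`, and the same lookup works for any vocabulary.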

  23. The power of RDF
      • Extensible model
          • supports any vocabularies
      • Supports arbitrary complexity of description
      • URIs as unique fixed points to identify
          • resources
          • properties
      • Descriptions created independently can be “merged” using URIs as “anchors”

  24. First source
          [http://pj.org/doc/1] --author--> [http://pj.org/person/pete]
          [http://pj.org/person/pete] --name--> "Pete"
          [http://pj.org/person/pete] --email--> "pete@pj.org"

  25. Second source
          [http://pj.org/doc/1] --subject--> "XML"

  26. Third source
          [http://pj.org/person/pete] --organisation--> "UKOLN"

  27. Three descriptions merged
          [http://pj.org/doc/1] --author--> [http://pj.org/person/pete]
          [http://pj.org/doc/1] --subject--> "XML"
          [http://pj.org/person/pete] --name--> "Pete"
          [http://pj.org/person/pete] --email--> "pete@pj.org"
          [http://pj.org/person/pete] --organisation--> "UKOLN"
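
Merging on URIs amounts to taking the union of the statements from each source. A minimal sketch follows, with two simplifications that are assumptions of the example and not part of the talk: statements are plain `(subject, property, value)` tuples of strings, so literals and resources are not distinguished, and the property URIs use the `http://www.ukoln.ac.uk/core/` namespace from the deck's RDF/XML slides. The helper `describe` is invented.

```python
UC = "http://www.ukoln.ac.uk/core/"
DOC = "http://pj.org/doc/1"
PETE = "http://pj.org/person/pete"

# Each source is an independently created set of (subject, property, value).
first = {
    (DOC, UC + "author", PETE),
    (PETE, UC + "name", "Pete"),
    (PETE, UC + "email", "pete@pj.org"),
}
second = {(DOC, UC + "subject", "XML")}
third = {(PETE, UC + "organisation", "UKOLN")}

# Statements about the same URI line up automatically: merging is set union.
merged = first | second | third

def describe(graph, uri):
    """Hypothetical helper: (property local name, value) pairs for one resource."""
    return {(p.rsplit("/", 1)[-1], v) for s, p, v in graph if s == uri}

print(len(merged))                    # 5
print(sorted(describe(merged, PETE)))
```

No source needed to know about the others. The shared URIs `http://pj.org/doc/1` and `http://pj.org/person/pete` are the only “anchors” required.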

  28. The RDF XML syntax
      • XML representation of model
          • to store/exchange descriptions
      • Property names made unique through use of XML namespaces
      • Variant XML syntaxes for RDF
      <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
               xmlns:uc="http://www.ukoln.ac.uk/core/">
        <rdf:Description about="http://pj.org/doc/1">
          <uc:author>Pete</uc:author>
        </rdf:Description>
      </rdf:RDF>

  29. The RDF XML syntax (2)
      • Using RDF/XML syntax means accepting conventions for the meaning of structures in XML document
      • So, an RDF/XML processor can “know in advance” the meaning of structures
          • even if the description uses unanticipated vocabularies
          • “partial understanding”
      • Can read multiple descriptions into store and “merge” on URIs
      • Will be generated/consumed by software!
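
To make “knowing the meaning of structures in advance” concrete, here is a deliberately tiny reader for one subset of RDF/XML: `rdf:Description` elements with an `about` attribute, whose child elements are properties holding either text or a nested `rdf:Description`. It is a sketch under those assumptions and not a conforming RDF parser; the many other forms the syntax allows are ignored. The function names are invented, and the second sample document uses a hypothetical `http://example.org/staff/` vocabulary with a made-up value.

```python
import itertools
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DESCRIPTION = f"{{{RDF}}}Description"
_blank_ids = itertools.count(1)

def expand(tag):
    """Turn ElementTree's '{namespace}local' tag into a property URI."""
    if tag.startswith("{"):
        ns, local = tag[1:].split("}", 1)
        return ns + local
    return tag

def read_description(node, triples):
    """Record the statements made by one rdf:Description; return its subject."""
    subject = node.get("about") or node.get(f"{{{RDF}}}about")
    if subject is None:
        subject = f"_:b{next(_blank_ids)}"  # resource without a URI
    for prop in node:
        nested = prop.find(DESCRIPTION)
        if nested is not None:
            value = read_description(nested, triples)  # value is a resource
        else:
            value = (prop.text or "").strip()          # value is a literal
        triples.add((subject, expand(prop.tag), value))
    return subject

def parse_rdfxml(text):
    triples = set()
    for node in ET.fromstring(text).findall(DESCRIPTION):
        read_description(node, triples)
    return triples

FIRST_SOURCE = f"""<rdf:RDF xmlns:rdf="{RDF}" xmlns:uc="http://www.ukoln.ac.uk/core/">
  <rdf:Description about="http://pj.org/doc/1">
    <uc:author>
      <rdf:Description about="http://pj.org/person/pete">
        <uc:name>Pete</uc:name>
        <uc:email>pete@pj.org</uc:email>
      </rdf:Description>
    </uc:author>
  </rdf:Description>
</rdf:RDF>"""

# Hypothetical vocabulary the reader has never seen before.
OTHER_SOURCE = f"""<rdf:RDF xmlns:rdf="{RDF}" xmlns:ex="http://example.org/staff/">
  <rdf:Description about="http://pj.org/person/pete">
    <ex:note>hypothetical value</ex:note>
  </rdf:Description>
</rdf:RDF>"""

store = parse_rdfxml(FIRST_SOURCE) | parse_rdfxml(OTHER_SOURCE)
for triple in sorted(store):
    print(triple)
```

The reader contains no mapping for `uc:` or `ex:` terms. It relies only on the structural convention (description, property, value), so the unfamiliar `ex:note` statement is still extracted and merged with the others on the shared URI, which is the “partial understanding” the slide mentions.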

  30. First source
          [http://pj.org/doc/1] --author--> [http://pj.org/person/pete]
          [http://pj.org/person/pete] --name--> "Pete"
          [http://pj.org/person/pete] --email--> "pete@pj.org"
      <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
               xmlns:uc="http://www.ukoln.ac.uk/core/">
        <rdf:Description about="http://pj.org/doc/1">
          <uc:author>
            <rdf:Description about="http://pj.org/person/pete">
              <uc:name>Pete</uc:name>
              <uc:email>pete@pj.org</uc:email>
            </rdf:Description>
          </uc:author>
        </rdf:Description>
      </rdf:RDF>

  31. Second source
          [http://pj.org/doc/1] --subject--> "XML"
      <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
               xmlns:uc="http://www.ukoln.ac.uk/core/">
        <rdf:Description about="http://pj.org/doc/1">
          <uc:subject>XML</uc:subject>
        </rdf:Description>
      </rdf:RDF>

  32. Third source
          [http://pj.org/person/pete] --organisation--> "UKOLN"
      <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
               xmlns:uc="http://www.ukoln.ac.uk/core/">
        <rdf:Description about="http://pj.org/person/pete">
          <uc:organisation>UKOLN</uc:organisation>
        </rdf:Description>
      </rdf:RDF>

  33. Three descriptions merged
      <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
               xmlns:uc="http://www.ukoln.ac.uk/core/">
        <rdf:Description about="http://pj.org/doc/1">
          <uc:author>
            <rdf:Description about="http://pj.org/person/pete">
              <uc:name>Pete</uc:name>
              <uc:email>pete@pj.org</uc:email>
              <uc:organisation>UKOLN</uc:organisation>
            </rdf:Description>
          </uc:author>
          <uc:subject>XML</uc:subject>
        </rdf:Description>
      </rdf:RDF>

  34. A Dublin Core description
      <?xml version="1.0"?>
      <rdf:RDF
          xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
          xmlns:dc="http://purl.org/dc/elements/1.1/">
        <rdf:Description about="http://www.ukoln.ac.uk/">
          <dc:title>UKOLN home page</dc:title>
          <dc:creator>Web-support Team, UKOLN</dc:creator>
          <dc:subject>digital information management; metadata</dc:subject>
          <dc:description>The home page of the UKOLN web site. UKOLN is a
          national focus of expertise in digital information management. It
          provides policy, research and awareness services to the UK library,
          information and cultural heritage communities. UKOLN is based at the
          University of Bath.</dc:description>
          <dc:publisher>UKOLN</dc:publisher>
          <dc:date>2001-09-06</dc:date>
          <dc:type>Text</dc:type>
          <dc:format>text/html</dc:format>
          <dc:format>12809 bytes</dc:format>
        </rdf:Description>
      </rdf:RDF>

  35. RDF, XML & interoperability
      • Why isn’t XML enough?
          • simple statement could be expressed in XML in many different ways
          • human reader makes interpretation/guess
          • application program requires prior knowledge of schema/DTD design
      • RDF/XML
          • imposes extra syntactic constraints on how statement expressed
          • both human and program can interpret description consistently
      • Less flexibility, greater interoperability

  36. RDF, XML & interoperability
      • Tentatively…
      • Use XML for exchange when
          • partners (humans, applications) both “know” semantics conveyed by structure of (meta)data
      • Use RDF/XML for exchange when
          • (meta)data potentially used by applications without prior “knowledge” of specific schema
          • (meta)data incorporates overlapping structures from different domains
      • N.B. raises issues of trust
          • who made statements?

  37. A note of caution
      • RDF not (yet?) a widely adopted technology
          • Addresses cross-organisation/domain problems
      • Some scepticism?
          • perceived as theoretical, “academic”?
          • also considerable enthusiasm!
      • Some revisions to Model & Syntax in progress at W3C
          • XML 1.0 is stable
          • RDF less so
      • Limited tools available (at present!)
          • But also growing number of applications

  38. Exercise (optional)
      • DC-dot
          • http://www.ukoln.ac.uk/metadata/dcdot/
          • Web-based tool
          • generates DC metadata for Web pages, based on existing <meta> tags, heading content etc.
      • Experiment with DC-dot to generate DC metadata for pages of your choice
      • View the RDF/XML representations

  39. Acknowledgements
      • UKOLN is funded by Resource: the Council for Museums, Archives and Libraries, the Joint Information Systems Committee (JISC) of the UK higher and further education funding councils, as well as by project funding from the JISC and the European Union. UKOLN also receives support from the University of Bath where it is based.
      • http://www.ukoln.ac.uk/
