
An Introduction to Dublin Core. Making Sense of Metadata, Society of Archivists EAD/Data Exchange SIG, London, Thursday 17 November 2005. Pete Johnston, Research Officer, UKOLN, University of Bath. www.bath.ac.uk


Presentation Transcript


  1. An Introduction to Dublin Core • Making Sense of Metadata, Society of Archivists EAD/Data Exchange SIG, London, Thursday 17 November 2005 • Pete Johnston, Research Officer, UKOLN, University of Bath • www.bath.ac.uk

  2. An Introduction to Dublin Core • A brief history • What is Dublin Core, really? • The DCMI Abstract Model • Encoding Dublin Core metadata • DC Application Profiles • DC in practice

  3. A Brief History

  4. A brief history (1) • Mid 1990s: rapid growth of World Wide Web • Challenge of resource discovery • search engines providing many hits, but little precision • recognition that library approach to cataloguing could not scale to Web resources • 1995 OCLC/NCSA Workshop in Dublin, Ohio • interdisciplinary consensus on 13 "metadata elements" • for discovery of "document-like objects" • relatively simple, usable by non-cataloguers • 1996 OCLC/CNI Workshop in Dublin, Ohio • expand to 15 elements • explicitly cross-domain • for discovery of broad range of "resources"

  5. The Dublin Core Metadata Element Set • Title • Subject • Description • Creator • Publisher • Contributor • Date • Type • Format • Identifier • Source • Language • Relation • Coverage • Rights

  6. A brief history (2) • 1997-2000 Development of notion of "qualification" • tension between simplicity and complexity • element refinement • Narrow the meaning of a DC element • e.g. "date modified" v "date" • encoding scheme • Provide additional information about a value • e.g. that a subject is a Library of Congress Subject Heading • the "Dumb-Down" principle • Rules for transforming "qualified" description into "simple" description • the "One-to-One" rule • A DC description describes exactly one resource
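The "Dumb-Down" principle above can be illustrated with a minimal Python sketch. It assumes a small hand-written table of element refinements; the `REFINES` mapping, the `dumb_down` function and the dictionary layout of a statement are illustrative names, not part of any DCMI specification. Each refined property is replaced by the element it refines, and the encoding scheme is dropped.

```python
# Illustrative (hypothetical) table: element refinement -> the DC element it refines.
REFINES = {
    "dcterms:modified": "dc:date",
    "dcterms:created": "dc:date",
    "dcterms:abstract": "dc:description",
}


def dumb_down(statements):
    """Reduce "qualified" statements to "simple" ones.

    Refinements fall back to the element they refine; encoding schemes
    are discarded, leaving only a property and a value string.
    """
    simple = []
    for stmt in statements:
        prop = REFINES.get(stmt["property"], stmt["property"])
        simple.append({"property": prop, "value": stmt["value"]})
    return simple


qualified = [
    {"property": "dcterms:modified", "value": "2005-11-17", "scheme": "dcterms:W3CDTF"},
    {"property": "dc:subject", "value": "Metadata", "scheme": "dcterms:LCSH"},
]
simple = dumb_down(qualified)
```

The simple description loses precision ("date modified" becomes just "date") but remains usable by an application that only understands the 15 elements.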

  7. A brief history (3) • 1997-2000 What is a "resource"? • e.g. Can the DCMES be applied to people? • DCMI Type Vocabulary • Collection, Dataset, Event, Image (Still or Moving), Interactive Resource, Service, Software, Sound, Text, Physical Object • But still fairly non-prescriptive • 1998- Emergence of Resource Description Framework (RDF) • 2000-2001 "Grammatical Principles" as informal data model

  8. A brief history (4) • 2000-2005 Development of notion of DC "Application Profile" • tailoring metadata standards for context • providing local guidelines, constraints • combining components from different sources • 2003-2005 Formalisation of DCMI Abstract Model • concepts used in DC metadata • different types of terms used in DC metadata • how those terms are used in combination to construct descriptions

  9. What is Dublin Core, really?

  10. Dublin Core is... • a conceptual framework/set of rules... • DCMI Abstract Model • describes how to use certain types of terms • ... to make statements... • ... that form descriptions (of resources) • a "core" vocabulary/set of terms... • managed by DCMI (Usage Board) • growing (relatively) slowly as new requirements arise • each identified by a Uniform Resource Identifier (URI) • a set of specifications for representing or encoding DC metadata descriptions in various formats

  11. DCMI Abstract Model (a slightly simplified view)

  12. DCMI Abstract Model • A description • describes exactly one resource • may specify a resource URI • consists of a set of statements

  13. DCMI Abstract Model: Descriptions [Diagram: a description, with its resource URI, containing statements]

  14. DCMI Abstract Model • A statement must contain • a reference to a property • property URI • all DC "elements" are properties • properties may be defined by agencies other than DCMI • a reference to a second resource (value) • value URI, and/or • one or more value representations • value string • rich representation

  15. DCMI Abstract Model: Statements [Diagram: a description with its resource URI and three statements, each with a property URI plus a value URI, a value string, or a rich representation]

  16. DCMI Abstract Model • A statement may contain • a reference to a vocabulary encoding scheme • vocabulary encoding scheme URI • type of value • a reference to a syntax encoding scheme • syntax encoding scheme URI • how value string is interpreted

  17. DCMI Abstract Model: Statements [Diagram: the statements of slide 15, with a vocabulary encoding scheme URI accompanying the value URI and a syntax encoding scheme URI accompanying the value string]

  18. DCMI Abstract Model • A description describes one resource • Applications typically based on description sets • groups of descriptions • where the described resources may be related in some way • Description sets encoded or serialised as records • according to rules of binding

  19. [Diagram: a description set containing two descriptions, each with a resource URI and statements carrying property URIs, value URIs, value strings, encoding scheme URIs and rich representations]
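The constructs introduced on slides 12 to 19 can be sketched as plain Python data classes. This is only an illustration of the simplified view given in the slides: the class and field names are assumptions chosen for readability, and the "rich representation" is left out for brevity.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Statement:
    property_uri: str                            # required: reference to a property
    value_uri: Optional[str] = None              # reference to the value resource
    value_string: Optional[str] = None           # a value representation
    vocab_enc_scheme_uri: Optional[str] = None   # type of the value
    syntax_enc_scheme_uri: Optional[str] = None  # how the value string is interpreted


@dataclass
class Description:
    resource_uri: Optional[str] = None           # a description may specify a resource URI
    statements: List[Statement] = field(default_factory=list)


@dataclass
class DescriptionSet:
    descriptions: List[Description] = field(default_factory=list)


# Example values are made up for illustration.
doc = Description(
    resource_uri="http://example.org/doc/1234/",
    statements=[
        Statement("http://purl.org/dc/elements/1.1/title",
                  value_string="A Guide to DC Metadata"),
        Statement("http://purl.org/dc/elements/1.1/language",
                  value_string="eng",
                  syntax_enc_scheme_uri="http://purl.org/dc/terms/ISO639-2"),
    ],
)
description_set = DescriptionSet(descriptions=[doc])
```

Each `Description` describes exactly one resource, and a `DescriptionSet` is what a binding would serialise as a record.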

  20. Encoding Dublin Core metadata (a very brief introduction!)

  21. DCMI Abstract Model and Bindings • For transfer between applications, descriptions must be represented as digital objects • Binding maps between constructs in conceptual model and components in a digital format • Two way • encoding application: description set -> record • decoding application: record -> description set • DCMI currently provides three "encoding guidelines" specifications • Other agencies may also provide bindings

  22. Using X/HTML meta & link elements • The set of meta/link elements represents a single DC description. • The resource described is the X/HTML document in which the metadata is embedded. • Each meta/link element represents a single statement • Property and Encoding Scheme URIs encoded as prefixed names
      <link rel="schema.DC" href="http://purl.org/dc/elements/1.1/" />
      <link rel="schema.DCTERMS" href="http://purl.org/dc/terms/" />
      <meta name="DC.title" content="A guide to DC metadata" />
      <meta name="DCTERMS.audience" content="information managers" />
      <meta name="DC.language" scheme="DCTERMS.ISO639-2" content="eng" />
      <link rel="DCTERMS.references" href="http://dublincore.org/documents/dcq-html" />
      [Slide labels mark the property URI, value string, encoding scheme URI and value URI in this markup]
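A decoding application for this binding could be sketched with Python's standard `html.parser` module. The class name `DCMetaParser`, the tuple layout of a statement and the sample page are assumptions made for illustration; the sketch only shows how prefixed names such as `DC.title` map back to property URIs via the `schema.*` links.

```python
from html.parser import HTMLParser


class DCMetaParser(HTMLParser):
    """Collect DC statements from X/HTML meta and link elements."""

    def __init__(self):
        super().__init__()
        self.schemas = {}     # prefix -> namespace URI, from rel="schema.PREFIX"
        self.statements = []  # (prefixed property name, value, prefixed scheme or None)

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        rel = a.get("rel") or ""
        name = a.get("name") or ""
        if tag == "link" and rel.startswith("schema."):
            self.schemas[rel[len("schema."):]] = a.get("href")
        elif tag == "meta" and "." in name:
            self.statements.append((name, a.get("content"), a.get("scheme")))
        elif tag == "link" and "." in rel:
            self.statements.append((rel, a.get("href"), None))

    def expand(self, prefixed):
        """Turn a prefixed name like 'DC.title' into a full URI."""
        prefix, local = prefixed.split(".", 1)
        return self.schemas[prefix] + local


PAGE = """
<link rel="schema.DC" href="http://purl.org/dc/elements/1.1/" />
<link rel="schema.DCTERMS" href="http://purl.org/dc/terms/" />
<meta name="DC.title" content="A guide to DC metadata" />
<meta name="DCTERMS.audience" content="information managers" />
<meta name="DC.language" scheme="DCTERMS.ISO639-2" content="eng" />
<link rel="DCTERMS.references" href="http://dublincore.org/documents/dcq-html" />
"""

parser = DCMetaParser()
parser.feed(PAGE)
title_uri = parser.expand("DC.title")
```

A real harvester would also need to handle missing `schema.*` declarations, which this sketch does not attempt.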

  23. Using the DC-XML format • Supports only limited subset of Abstract Model (revision forthcoming) • The container element, here <meta>, represents a single DC description. • Each child element represents a single statement • Property URIs and Encoding Scheme URIs encoded as XML QNames
      <?xml version="1.0"?>
      <meta xmlns="http://www.ukoln.ac.uk/metadata/dcdot/"
            xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
            xmlns:dc="http://purl.org/dc/elements/1.1/"
            xmlns:dcterms="http://purl.org/dc/terms/">
        <dc:identifier>http://example.org/doc/1234/</dc:identifier>
        <dc:title>A Guide to DC Metadata</dc:title>
        <dc:language xsi:type="dcterms:ISO639-2">eng</dc:language>
        <dcterms:references>http://dublincore.org/documents/dcq-html</dcterms:references>
      </meta>
      [Slide labels mark the property URI, value string and encoding scheme URI in this markup]
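The decoding direction (record -> description) for a DC-XML record of this shape might look as follows, using Python's standard `xml.etree.ElementTree`. The function name `decode` and the `PREFIXES` table are assumptions for illustration; the table repeats the namespace declarations by hand because ElementTree does not resolve QNames that appear inside attribute values such as `xsi:type`.

```python
import xml.etree.ElementTree as ET

XSI_TYPE = "{http://www.w3.org/2001/XMLSchema-instance}type"

# Assumption: prefixes used in xsi:type values, repeated here by hand.
PREFIXES = {
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
}

RECORD = """<?xml version="1.0"?>
<meta xmlns="http://www.ukoln.ac.uk/metadata/dcdot/"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xmlns:dc="http://purl.org/dc/elements/1.1/"
      xmlns:dcterms="http://purl.org/dc/terms/">
  <dc:identifier>http://example.org/doc/1234/</dc:identifier>
  <dc:title>A Guide to DC Metadata</dc:title>
  <dc:language xsi:type="dcterms:ISO639-2">eng</dc:language>
  <dcterms:references>http://dublincore.org/documents/dcq-html</dcterms:references>
</meta>"""


def decode(record_xml):
    """Return (property URI, value string, encoding scheme URI or None) tuples."""
    root = ET.fromstring(record_xml)
    statements = []
    for child in root:
        # ElementTree exposes a namespaced tag as "{namespace}localname".
        namespace, local = child.tag[1:].split("}", 1)
        scheme = child.get(XSI_TYPE)
        if scheme:
            prefix, name = scheme.split(":", 1)
            scheme = PREFIXES[prefix] + name
        statements.append((namespace + local, child.text, scheme))
    return statements


statements = decode(RECORD)
```

Each child element of the container yields one statement, matching the rule stated on the slide.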

  24. Using the Resource Description Framework (RDF) • Specifications for DC in RDF do exist… • … but currently work in progress to • resolve ambiguities • revise in light of DCAM

  25. Dublin Core Application Profiles

  26. DC Application Profile • Implementers adapt metadata standards to the context of their application • Tension between localisation and interoperability • A DC Application Profile • specifies the terms (properties, vocabulary/syntax encoding schemes) used in a class of description sets • describes how those terms are used • supplementary information on how properties applied/interpreted in context • constraints on occurrence of properties • constraints on values and value representations (encoding schemes)

  27. DC Application Profiles: Examples • "Simple Dublin Core" • use of the 15 properties of the DCMES • all optional and repeatable • values represented by value strings • no vocabulary or syntax encoding schemes • UK eGMS • use of selected properties from DCMI vocabularies, additional properties • guidelines on use of properties • some properties mandated/recommended • some vocabulary encoding schemes mandated/recommended • guidance on content of value strings
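The "Simple Dublin Core" profile described above is small enough to express as a check. The sketch below is one possible reading of the slide, not an official validator: the function name `simple_dc_problems` and the dictionary keys of a statement are assumptions made for illustration.

```python
DCMES_NS = "http://purl.org/dc/elements/1.1/"
DCMES_ELEMENTS = {
    "title", "subject", "description", "creator", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
}


def simple_dc_problems(statements):
    """List the ways a description departs from "Simple DC".

    All 15 properties are optional and repeatable, so only three things
    are checked: the property, the value string, and the absence of
    encoding schemes.
    """
    problems = []
    for i, stmt in enumerate(statements):
        prop = stmt.get("property_uri", "")
        if not (prop.startswith(DCMES_NS) and prop[len(DCMES_NS):] in DCMES_ELEMENTS):
            problems.append(f"statement {i}: property is not one of the 15 DCMES properties")
        if not stmt.get("value_string"):
            problems.append(f"statement {i}: value is not represented by a value string")
        if stmt.get("vocab_enc_scheme_uri") or stmt.get("syntax_enc_scheme_uri"):
            problems.append(f"statement {i}: encoding schemes are not used in Simple DC")
    return problems


good = [
    {"property_uri": DCMES_NS + "title", "value_string": "A Guide to DC Metadata"},
    {"property_uri": DCMES_NS + "title", "value_string": "Guide to DC"},  # repeatable
]
bad = [
    {"property_uri": "http://purl.org/dc/terms/audience",
     "value_string": "information managers"},
    {"property_uri": DCMES_NS + "language", "value_string": "eng",
     "syntax_enc_scheme_uri": "http://purl.org/dc/terms/ISO639-2"},
]
good_problems = simple_dc_problems(good)
bad_problems = simple_dc_problems(bad)
```

A richer profile such as eGMS would add checks for mandatory properties and required encoding schemes on top of this kind of structure.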

  28. DC Application Profiles: Examples • JISC Information Environment Service Registry (IESR) Metadata Schema • supports description of several related resources (Collection, Service, Agents) • use of selected properties from DCMI vocabularies, selected properties from RSLP CD vocabularies, some properties created for IESR • for each subject resource type, guidelines on use of properties • some properties mandated/recommended • many vocabulary encoding schemes mandated/recommended

  29. [Diagram: two DC Application Profiles, A ("Simple DC") and B (IESR), drawing on DCMI properties, DCMI vocabulary encoding schemes, IESR properties and IESR vocabulary encoding schemes]

  30. DC in Practice

  31. Dublin Core in X/HTML • Initial implementation focused on DC-in-HTML • Robot crawls individual HTML pages to extract metadata • But today little/no use by large Web search engines • Problems of spamming/trust • Lack of take-up by authors/publishers • Success of full-text crawling/indexing, esp. Google! • However, some use in controlled domains • Intranets • Trusted groups of providers (e.g. eGMS) • Embedding DC in XHTML useful if you know a search engine exploits it

  32. [Diagram: a harvester retrieving pages from Web sites via HTTP GET]

  33. Picture Australia • images "related to all things Australian" from 40+ cultural agencies • central search service based (initially at least) on crawling HTML-embedded DC metadata • providers migrating to OAI-PMH • currently hybrid approach? http://www.pictureaustralia.org/

  34. Dublin Core and OAI-PMH • Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) • Fairly simple mechanism for sharing metadata records between applications • Has origins in “e-prints” community • Built on HTTP, XML • Allows a harvester to ask a repository for all or some of its metadata records (in a specified metadata format) • i.e. supports "incremental harvesting" • "Give me all your records updated since yyyy-mm-dd" • "OAI-DC" (Simple DC) is mandatory format • But no limitation on format that can be transferred (as long as can be described by XML Schema)
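An incremental harvesting request of the kind quoted above ("all your records updated since yyyy-mm-dd") is an ordinary HTTP GET with a few query parameters. The sketch below only builds the URL and makes no network call; the base URL `http://repository.example.org/oai` and the function name `list_records_url` are hypothetical, while `verb`, `metadataPrefix` and `from` are the request arguments defined by OAI-PMH.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="oai_dc", from_date=None):
    """Build an OAI-PMH ListRecords request URL.

    metadata_prefix defaults to "oai_dc", the mandatory Simple DC format.
    from_date (yyyy-mm-dd) restricts the harvest to records updated since
    that date, which is what makes the harvest incremental.
    """
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date:
        params["from"] = from_date
    return base_url + "?" + urlencode(params)


# Hypothetical repository address, used only for illustration.
url = list_records_url("http://repository.example.org/oai", from_date="2005-11-01")
```

A harvester would fetch this URL and parse the XML response, repeating the request with a later date on its next run.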

  35. [Diagram: a harvester collecting metadata records from repositories via OAI-PMH]

  36. OAIster (University of Michigan) • "academically-oriented digital resources" • "5,947,627 records from 557 institutions" (2005-11-15) http://oaister.umdl.umich.edu/

  37. Summary • DCMES/"Simple DC" as a "core" for discovery of wide range of resources • "Simple DC" is, by definition, simple! • Limitations in terms of functions/services that can be offered • DCMI Abstract Model provides a framework for extensibility and modularity • A DC Application Profile describes a real-world usage of that model

