Dublin Core Qualifiers and A Grammar for Dublin Core Thomas Baker DC-8, National Library of Canada, Ottawa 4 October 2000
A pidgin for digital tourists • Metadata is language • DC: small language -- pidgin -- for searching across domains using a few familiar attributes • "Pidginization": tourists learning simple phrases to order beer in an unfamiliar language • We are all "tourists" on the global Internet. • A pidgin for using the Web to find resources across multiple domains.
A grammar of Dublin Core • http://www.gmd.de/People/Thomas.Baker/DC-Grammar.html • By design not as subtle as mother tongues, but easy to learn and extremely useful in practice • Pidgins: small vocabularies (Dublin Core: fifteen special nouns, lots of optional adjectives) • Simple grammars: sentences (statements) follow a simple fixed pattern...
[Diagram: statement pattern] Resource (implied subject) has (implied verb) property (one of 15 properties: DC:Creator, DC:Title, DC:Subject, DC:Date...) X (property value, an appropriate literal) [optional qualifier] [optional qualifier] (qualifiers act as adjectives)
Resource has Subject "Languages -- Grammar" LCSH • Resource has Date "2000-06-13" ISO8601 Revised
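The fixed sentence pattern can be sketched as a small data structure. This is a minimal illustration, not part of the Dublin Core specification: the class name `Statement`, its fields, and the `render` method are hypothetical names chosen for this example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Statement:
    """One statement of the form: Resource has <element> "<value>" [qualifiers]."""
    element: str   # one of the fifteen elements, e.g. "Subject"
    value: str     # an appropriate literal
    qualifiers: List[str] = field(default_factory=list)  # optional "adjectives"

    def render(self) -> str:
        # The subject ("Resource") and the verb ("has") are implied,
        # so they are fixed strings rather than fields.
        parts = ["Resource has", self.element, f'"{self.value}"', *self.qualifiers]
        return " ".join(parts)


# The two example statements from the slide.
subject = Statement("Subject", "Languages -- Grammar", ["LCSH"])
date = Statement("Date", "2000-06-13", ["ISO8601", "Revised"])
print(subject.render())
print(date.render())
```

Because the subject and verb are constant, only the element, the value, and the optional qualifiers vary from one statement to the next.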
Element Refinements • Make the meaning of an element narrower or more specific. • a Date Created versus a Date Modified • an IsReplacedBy Relation versus a Replaces Relation • A refined element shares the meaning of the unqualified element, but with a more restricted scope. • A client that does not understand a specific element refinement term should be able to ignore the qualifier and fall back on the broader meaning of the element.
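The fallback behaviour for element refinements might look like the following sketch. The lookup table `REFINES` and the function `interpret` are hypothetical names for illustration; the table holds only the four refinements mentioned above.

```python
# Hypothetical lookup table: refinement term -> the element it refines.
REFINES = {
    "Created": "Date",
    "Modified": "Date",
    "IsReplacedBy": "Relation",
    "Replaces": "Relation",
}


def interpret(term, understood):
    """Return the term if the client understands it, else the broader element.

    `understood` is the set of terms this particular client knows.
    A term with no entry in REFINES is returned unchanged.
    """
    if term in understood:
        return term
    return REFINES.get(term, term)


# A client that knows only the plain elements treats "Modified" as a Date.
print(interpret("Modified", {"Date", "Relation"}))
# A client that knows the refinement keeps the narrower meaning.
print(interpret("Created", {"Date", "Created"}))
```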
Value Encoding Schemes • Pointers to standard encoding schemes that help interpret or parse an element value • Says that the value is • a term selected from a controlled vocabulary (e.g., Library of Congress Subject Headings) • a string formatted in a standard way (e.g., "2000-01-01" as an ISO8601 expression of a date) • If an encoding scheme is not understood by a client or agent, the value should still be "appropriate" and usable for discovery. • Even if its scheme is unknown, a value should not be misleading.
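One way a client could treat encoding schemes is sketched below: parse the value when the scheme is understood, and otherwise keep the literal string, which should still be usable for discovery. The function name `interpret_value` is hypothetical, and the sketch assumes Python's `date.fromisoformat`, which handles plain calendar dates such as "2000-01-01" rather than all of ISO 8601.

```python
from datetime import date


def interpret_value(value, scheme=None):
    """Parse a value if its encoding scheme is understood, else return it as-is."""
    if scheme == "ISO8601":
        try:
            # Only simple YYYY-MM-DD dates are attempted in this sketch.
            return date.fromisoformat(value)
        except ValueError:
            # The value did not parse; fall back to the raw literal.
            return value
    # Unknown or absent scheme: the literal itself remains the value.
    return value


print(interpret_value("2000-01-01", "ISO8601"))
print(interpret_value("Languages -- Grammar", "LCSH"))
```

A client that does not know LCSH still gets a searchable string, which is the behaviour the last two bullets ask for.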
Dumb-Down Principle for qualifiers • The fifteen elements should be usable and understandable with or without the qualifiers • Like saying that nouns can stand on their own without adjectives • If your search engine encounters an unfamiliar qualifier, look it up somewhere -- or just ignore it! • To test whether qualifiers are "good", cover the qualifiers with your hand and ask: • Does the statement still make sense? • Is it correct?
Resource has Subject "Languages -- Grammar" LCSH • Resource has Date "2000-06-13" ISO8601 Revised
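The "cover the qualifiers with your hand" test can be approximated in code by dropping every qualifier a client does not recognise. This is a sketch under stated assumptions: the function name `dumb_down` and the tuple layout are hypothetical.

```python
def dumb_down(element, value, qualifiers, known_qualifiers):
    """Keep the qualifiers the client recognises and ignore the rest.

    The element and value are returned untouched: they must make sense
    on their own, with or without the qualifiers.
    """
    kept = [q for q in qualifiers if q in known_qualifiers]
    return (element, value, kept)


# A client that knows no qualifiers at all still gets a usable statement.
print(dumb_down("Date", "2000-06-13", ["ISO8601", "Revised"], set()))
# A client that knows ISO8601 keeps that qualifier and ignores "Revised".
print(dumb_down("Date", "2000-06-13", ["ISO8601", "Revised"], {"ISO8601"}))
```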
Review and approval status • DCMI Usage Committee reviews proposals for qualifiers • Evaluates proposals in light of grammatical principles (are the qualifiers ignorable?) • Tiered model of approval status (tentative): proposed, conforming, recommended, obsolete
History of a new process • 1998, Nov: DC-6, element-specific working groups form to propose useful qualifiers • 1999, Oct: DC-7, breakout groups review proposals against principles; the Usage Committee forms • 2000, Jan: intended deadline passes • 2000, Feb: formal principles reformulated in Usage Committee • 2000, Jul: first qualifiers are published
A not-so-good example Resource has Creator "Last.name: Smith First.name: John Type: Person Affiliation: IBM"
Value Components: proposed as a third type of qualifier -- and rejected • [Diagram] Information resource HASA four Properties: Creator, Title, Type, Date • The Creator value "J.Smith" HASA two value components (VC): Type "Person" and Affiliation "IBM"
Perhaps we should move them to... • [Diagram] Information resource HASA four Elements: Creator, Title, Type, Date • The Creator value "J.Smith" HASA two value components (VC): Type "Person" and Affiliation "IBM"
...a Core Element Set for Agents? • [Diagram] Person (or Corporation) HASA four Properties, each with a Name and a Value: Name "J.Smith", Affiliation "IBM", Type "Person", Birthdate "1947-03-30"
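The idea of describing the agent as an entity in its own right, rather than packing components into the Creator value, could be sketched as follows. The classes `Agent` and `Resource` and the method `creator_literal` are hypothetical illustrations, not a published element set; the field values come from the diagram.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Agent:
    """A person or corporation, described by its own properties."""
    name: str
    type: str
    affiliation: Optional[str] = None
    birthdate: Optional[str] = None


@dataclass
class Resource:
    """An information resource whose creator is a separately described agent."""
    creator: Agent

    def creator_literal(self) -> str:
        # The resource-level Creator statement stays a simple literal,
        # so it remains usable without the agent description.
        return self.creator.name


smith = Agent(name="J.Smith", type="Person", affiliation="IBM", birthdate="1947-03-30")
resource = Resource(creator=smith)
print(f'Resource has Creator "{resource.creator_literal()}"')
```

Keeping the agent's details on the agent leaves the Creator value a plain literal, unlike the packed string in the not-so-good example.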
Richer metadata languages • Complexification compromises the pidgin • Richer vocabularies and grammatical structures needed for describing multiple related entities (resources, the people who created them, their life-cycle events...) • Plenary II: Modular management of complexity • Application profiles • Structured metadata for richer description • DCMI Architecture Working Group