A Registry for Dublin Core. Thomas Baker, GMD. IuK 2000: "Information, Knowledge and Knowledge Management", Darmstadt, 27 March 2000. Metadata is a language. A metadata "sentence" might say: "This watercolor has Painter Joseph Beuys, Title Ohne Titel, and Date Painted 1959."
A Registry for Dublin Core
Thomas Baker, GMD
IuK 2000: "Information, Knowledge and Knowledge Management", Darmstadt, 27 March 2000
Metadata is a language
• A metadata "sentence" might say:
  • "This watercolor has Painter Joseph Beuys, Title Ohne Titel, and Date Painted 1959."
• Dublin Core was designed as a simple metadata language -- a "pidgin" of general concepts for coarse-grained resource discovery.
• In unqualified Dublin Core, the sentence above would say:
  • "This resource has Creator Joseph Beuys, Title Ohne Titel, and Date 1959."
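The restatement of the watercolor sentence in unqualified Dublin Core can be sketched in a few lines of Python. This is only an illustration: the `LOCAL_TO_DC` table and the `to_unqualified_dc` function are hypothetical names, and the mapping simply encodes the example from the slide (Painter to Creator, Date Painted to Date).

```python
# Hypothetical mapping from specific local terms to the broader
# Dublin Core elements used in the slide's example (an assumption,
# not an official DCMI table).
LOCAL_TO_DC = {
    "Painter": "Creator",
    "Title": "Title",
    "Date Painted": "Date",
}


def to_unqualified_dc(statement):
    """Restate a local metadata 'sentence' using Dublin Core elements only."""
    return {LOCAL_TO_DC[term]: value for term, value in statement.items()}


# The watercolor sentence from the slide, as term/value pairs.
watercolor = {
    "Painter": "Joseph Beuys",
    "Title": "Ohne Titel",
    "Date Painted": "1959",
}

dc_record = to_unqualified_dc(watercolor)
print(dc_record)
```

The specific meaning of "Painter" is lost in the translation, which is the price of the coarse-grained "pidgin".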
Like languages, schemas evolve
• Like words in languages, metadata terms may be coined, adopted, approved by official bodies, change meaning, or fall from use.
• As in languages, the need for simplicity is inevitably in tension with the need for complexity.
• The Dublin Core Element Set is almost always too simple to use "as is", so it is extended locally.
• If not managed, the proliferation of local extensions threatens interoperability in a broader context.
Registries as dictionaries
• Metadata systems, like languages, need dictionaries for tracking usage and managing change
• Like language dictionaries, registries can:
  • Prescribe good grammar and good usage guidelines
  • Describe how implementors are actually using metadata
  • Translate between natural languages
  • Define the "parts of speech" of metadata grammar -- building blocks of sentences
Requirements for the DCMI registry
• Users and implementors need
  • a dictionary of terms
  • a place to publish project- or discipline-specific adaptations to share with colleagues and partners
• Dublin Core Metadata Initiative needs
  • to manage its namespace (as a standards agency)
  • to provide machine-readable schemas for loading into editors and search engines
  • to provide crosswalks to related schemas
  • to link the (English) standard to translations in other languages (25 to date)
DCMI Registry Working Group
• An RDF schema registry has already been deployed to support review processes within DCMI
• Working Group: propose policy guidelines for managing the DCMI namespace with this registry
• Later: encourage implementors of DC-related schemas (adaptations, profiles, translations) to put RDF schemas on the Web and link to them
RDF as a publication format for schemas
• How standards are currently defined:
  • in HTML pages and paper documents
  • no explicit hyperlinks between related elements in different standards
• RDF Schema format (based on XML)
  • URIs provide cross-references to related schemas and documentation
  • on the Web, browse related namespaces (and profiles)
  • richer thesaurus relations will support crosswalks between elements that are not exactly equivalent
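To make the idea of URI cross-references concrete, here is a minimal Python sketch that treats schema statements as subject/predicate/object triples and follows `subPropertyOf` links from a local term to a broader one. The `example.org` namespace, the term names `painter` and `datePainted`, and the `broader_terms` helper are assumptions made for illustration; only the Dublin Core and RDF Schema namespace URIs are real. A registry would read such statements from published RDF rather than from a hard-coded list.

```python
DC = "http://purl.org/dc/elements/1.1/"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
LOCAL = "http://example.org/museum-schema#"  # hypothetical local namespace

# Each schema statement is a (subject, predicate, object) triple.
# The local terms below are invented for this sketch.
SCHEMA_TRIPLES = [
    (LOCAL + "painter", RDFS + "subPropertyOf", DC + "creator"),
    (LOCAL + "painter", RDFS + "label", "Painter"),
    (LOCAL + "datePainted", RDFS + "subPropertyOf", DC + "date"),
    (LOCAL + "datePainted", RDFS + "label", "Date Painted"),
]


def broader_terms(term_uri, triples):
    """Follow subPropertyOf links from a term to every broader term."""
    found = []
    frontier = [term_uri]
    while frontier:
        current = frontier.pop()
        for subject, predicate, obj in triples:
            is_link = subject == current and predicate == RDFS + "subPropertyOf"
            if is_link and obj not in found:
                found.append(obj)
                frontier.append(obj)  # keep walking up the chain
    return found


print(broader_terms(LOCAL + "painter", SCHEMA_TRIPLES))
```

Because every term is a URI, the same walk works across namespaces maintained by different organizations.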
Current prototype DCMI registry
• Based on MyRDF Toolkit
  • Eric Miller, Online Computer Library Center, Dublin (Ohio) and RDF Working Group of W3C
  • To define, search, and navigate among distributed collections of RDF schemas
• Works like the Web itself: to add a schema, make it available in RDF on the Web and create hyperlinks to and from that schema
• Points towards a scalable ecology of metadata registries on the Web
Linking multiple translations of a standard
• Model: Multilingual Dublin Core Project
  • University of Library and Information Science, Tsukuba, Japan
• Uses RDF schemas to share machine-readable tokens for translations of DC terms in Japanese, Arabic, Punjabi (26 languages to date)
• Java applets for displaying fonts
• Raises policy questions:
  • How can we manage the evolution of Dublin Core as a multilingual standard?
  • How can other language communities help shape the global (English) standard?
Tracking linguistic variation and equivalences
• Model: MetaForm Project
  • State and University Library, Goettingen
• Local "manifestations" of Dublin Core for specific projects introduce variations -- like "dialects"
• "Crosscuts" -- how are elements used in different implementations?
• Provides "mappings" and "crosswalks" between Dublin Core and other schemas of similar scope
• Demonstrates the sort of output one would want from queries to a distributed registry
Mapping between schemas via an interlingua
• Model: DESIRE Project Metadata Registry
  • UK Office of Library and Information Networking, Bath
• Maps various schemas to a core of shared concepts (interlingua)
• In DESIRE Registry, based on ISO Basic Semantic Registry
• Suggests how interoperability might be achieved among multiple schemas in a scalable manner (n-to-1 instead of n-to-n)
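The scaling argument (n-to-1 instead of n-to-n) can be illustrated with a small Python sketch. The schema names, element names, and the `translate` and `mappings_needed` helpers are all hypothetical; the sketch does not reproduce the DESIRE Registry's actual data model. Each schema maintains one mapping to the shared core, and any schema-to-schema translation is derived by going through that core.

```python
# Hypothetical schemas, each mapped once to a core of shared concepts.
TO_CORE = {
    "schema_a": {"author": "creator", "name": "title"},
    "schema_b": {"painter": "creator", "caption": "title"},
    "schema_c": {"maker": "creator", "heading": "title"},
}


def translate(element, source, target):
    """Map an element from one schema to another via the shared core."""
    concept = TO_CORE[source].get(element)
    if concept is None:
        return None  # the source element has no core equivalent
    for candidate, core in TO_CORE[target].items():
        if core == concept:
            return candidate
    return None  # the target schema has no element for this concept


def mappings_needed(n):
    """Return (pairwise, via_core) mapping counts for n schemas.

    Pairwise counts one crosswalk per unordered pair of schemas;
    via_core counts one mapping per schema to the interlingua.
    """
    return n * (n - 1) // 2, n


print(translate("painter", "schema_b", "schema_a"))
print(mappings_needed(10))
```

With ten schemas, direct crosswalks would require 45 pairwise mappings, while the interlingua approach requires 10.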
Namespaces versus Profiles
• Implementors usually need to "mix and match"
  • use parts of one standard with parts of another
  • coin some local terms to fill in gaps
• Application profiles (DESIRE Project)
  • schemas are defined in namespaces
  • namespace semantics reused in application profiles
• Registries should include application profiles
  • expressible using RDF schema format
  • will help implementors learn from peers
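One way to picture the distinction: a namespace declares terms, while an application profile only selects terms that namespaces have already declared. The Python sketch below assumes this reading. The `museum` and `local` namespaces, their terms, and the `undeclared_terms` helper are invented for illustration; the three `dc` terms are real Dublin Core elements.

```python
# Terms declared by each namespace. "museum" and "local" are hypothetical.
NAMESPACES = {
    "dc": {"creator", "title", "date"},
    "museum": {"medium", "dimensions"},
    "local": {"galleryRoom"},
}

# An application profile mixes and matches terms from several namespaces
# without declaring any terms of its own.
PROFILE = [
    ("dc", "creator"),
    ("dc", "title"),
    ("dc", "date"),
    ("museum", "medium"),
    ("local", "galleryRoom"),
]


def undeclared_terms(profile, namespaces):
    """List profile entries whose term is not declared in its namespace."""
    return [
        (prefix, term)
        for prefix, term in profile
        if term not in namespaces.get(prefix, set())
    ]


print(undeclared_terms(PROFILE, NAMESPACES))
```

A registry that holds both namespaces and profiles could run this kind of check when a profile is submitted.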
Annotation vocabularies
• Registration authorities or third parties layer annotations on metadata schemas or elements
• For example, DCMI could "recommend" an element or qualifier -- whether it is in DCMI's own namespace or elsewhere
• RDF schemas support this
• Supports notion of "publish first, filter later" (Wilensky talk this morning)
DCMI Usage Committee
• Reviews elements and qualifiers in light of grammatical principles
• Levels of annotation (under discussion)
  • local terms in use by projects
  • terms proposed for review by Usage Committee
  • terms found conforming to grammar principles
  • terms recommended by Usage Committee
  • terms that have become obsolete
• Versioning and life-cycle of terms (under discussion)
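Since the levels of annotation are still under discussion, the following Python sketch is only one possible encoding of the five levels listed on the slide, not a DCMI design. The `Status` enum, the `ex:` terms and their statuses, and the `filter_terms` helper are hypothetical. It shows how a consumer might apply "publish first, filter later": every term is published with an annotation, and each application filters by the levels it trusts.

```python
from enum import Enum


class Status(Enum):
    """The five annotation levels from the slide (encoding is an assumption)."""
    LOCAL = "local term in use by a project"
    PROPOSED = "proposed for review by Usage Committee"
    CONFORMING = "found conforming to grammar principles"
    RECOMMENDED = "recommended by Usage Committee"
    OBSOLETE = "obsolete"


# Hypothetical terms with made-up statuses, for illustration only.
ANNOTATIONS = {
    "ex:painter": Status.LOCAL,
    "ex:datePainted": Status.PROPOSED,
    "ex:illustrator": Status.CONFORMING,
    "ex:created": Status.RECOMMENDED,
    "ex:oldDate": Status.OBSOLETE,
}


def filter_terms(annotations, accepted):
    """Return the terms whose annotation is among the accepted levels."""
    return sorted(
        term for term, status in annotations.items() if status in accepted
    )


print(filter_terms(ANNOTATIONS, {Status.CONFORMING, Status.RECOMMENDED}))
```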
Metadata grammar
• DCMI grammar: Elements and Qualifiers
  • Resources (Web pages, books, museum objects) have things like Creators and Titles -- Elements
  • Elements are modified by Qualifiers (adjectives)
• RDF grammar: Resources, Relations, Classes
• RDF and DC grammars are close conceptually, though terminologies differ
• Using RDF schemas for the DCMI registry is helping clarify differences and harmonize grammars