CIMI: Consortium for the Interchange of Museum Information. Dublin Core (DC) Metadata Testbed. Lynn Ann Underwood, Museum Records Manager, Solomon R. Guggenheim Museum. July 1999.
What is CIMI? • “A group of institutions and organizations that encourages an open standards-based approach to the management and delivery of digital museum information.” • Formed 1990. • Recent Projects: • Z39.50 • IIM (Integrated Information Management) • Dublin Core (DC) Metadata Testbed
Metadata? What Are We Talking About? • Metadata is a fashionable term. • Used to describe People, Places, & Objects (Resources). • Structured data about data. • Cataloguing, indexing, and documentation are each a type of Metadata. • Commonly associated with electronic and networked information. • Databases & Web Pages • CIMI’s definition acknowledges that museums document objects/items, collections, programs, staff, etc. • Purpose for CIMI is information retrieval.
How is Metadata Used? • Information Retrieval • Fielded searching facilitates resource discovery. • Document Administration • Rights Management • Sales & Service • Security & Authentication • Archival Status
Metadata as part of a Resource Description Community • A resource description community is characterized by common semantic, structural and syntactic conventions used for the exchange of information. • The library community promotes interoperability through the use of detailed standards (MARC & AACR2). • The art community developed the Art & Architecture Thesaurus (AAT) and the Categories for the Description of Works of Art (CDWA); the art museum community in particular can use these alongside metadata to share resources.
Why Use Dublin Core? • A useful tool to refine web searching. • Repurpose information that already exists. • It is easier to adopt an interdisciplinary standard already in use. • Interoperability: Allows different communities (libraries, archives, businesses, museums, etc.) to search for data using a common basis. • Establishes a basis for next-generation projects.
Interoperability • Semantics • The meaning of the elements • Structure • human-readable • machine-parseable • Syntax • grammars to convey semantics and structure • [Diagram labels: Resource Description Communities (e.g. DC, AACR2); HTML; MARC; RDF (XML)]
The Dublin Core • Title • Creator • Subject • Description • Publisher • Contributor • Date • Type • Format • Identifier • Source • Language • Relation • Coverage • Rights
DC “Simple” • “Simple” or unqualified DC comprises the 15 elements with no further content definition. • Current “simple” definitions are based on the IETF (Internet Engineering Task Force) RFC 2413 document. • The CIMI working group resisted the temptation to move directly to qualified DC. • Instead CIMI rigorously tested DC “Simple”, and its testbed is considered the primary application testing “Simple”. • This process heightened the group’s awareness of the need for qualifiers (element & value).
DC Qualified • Qualified DC adds descriptive precision when retrieving a resource. This is achieved by developing a substructure. For instance, “Role” is a desired term to further describe, or “qualify”, the CREATOR element. • Creator=Name.Creator Role=Artist • Qualified DC also allows terms to be drawn from controlled vocabularies (LCSH, AAT) or classification schemes (DDC). The use of hierarchies provides further definition (semantic specificity). • Guggenheim family -- art patronage • Caution: when using DC Qualified, elements must degrade gracefully to preserve interoperability.
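One way to picture "degrading gracefully" is a routine that drops every qualifier and keeps only the bare element and its value. The sketch below is illustrative only: the `(element, qualifier, value)` triple layout, the `role=` and `scheme=` labels, and the function name `dumb_down` are assumptions made for this example, not syntax defined by the DC working groups. The sample values are borrowed from the Guggenheim records shown later in the deck.

```python
def dumb_down(qualified_statements):
    """Collapse qualified statements into unqualified ("Simple") DC.

    Each statement is an (element, qualifier, value) triple. The qualifier
    is discarded and the value is filed under the bare element name, so a
    consumer that only understands the 15 elements can still use it.
    """
    simple = {}
    for element, _qualifier, value in qualified_statements:
        simple.setdefault(element.lower(), []).append(value)
    return simple


# Hypothetical qualified statements; the qualifier notation is an assumption.
qualified = [
    ("Creator", "role=Artist", "Kandinsky, Wassily"),
    ("Subject", "scheme=AAT", "nonobjective art"),
    ("Date", None, "1920"),
]

simple = dumb_down(qualified)
print(simple)
```

The design point is that precision is lost (the role and the vocabulary scheme disappear) while the statement itself remains searchable.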
DC Qualified • DC Qualified is currently under development by DC Working Groups. • Working Groups: • DC-Agents (Creator, Contributor, Publisher) • DC-Coverage • DC-Date • DC-Format • DC-Relation (Source, Relation) • DC-Subdesc (Subject, Description, Language) • DC-Title (Title, Identifier) • DC-Type • *no working group for rights
DC Requirements • All 15 DC elements are optional. • All 15 DC elements may be repeated. • Proposed changes to the 15 core elements must be made through the framework of the DC working group.
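The "optional" and "repeatable" rules suggest a simple in-memory shape for a DC record: a mapping from element name to a list of values, where any element may be absent and any list may hold more than one value. The following is a minimal sketch under that assumption; the helper name `make_record` and the lowercase element names are choices made for this example.

```python
# The 15 Dublin Core elements, lowercased for use as dictionary keys.
DC_ELEMENTS = (
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
)


def make_record(**fields):
    """Build a record where every element is optional and repeatable."""
    record = {}
    for name, values in fields.items():
        if name not in DC_ELEMENTS:
            raise ValueError(f"not a Dublin Core element: {name}")
        if isinstance(values, str):
            values = [values]  # a single value becomes a one-item list
        record[name] = list(values)
    return record


# Only two of the 15 elements are supplied; "format" is repeated.
record = make_record(title="GUGGENHEIM MUSEUMS",
                     format=["text/html", "550 bytes"])
print(record)
```

Storing every element as a list, even when it has one value, keeps later code free of special cases for repeated elements.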
DC Requirements: 1:1 Principle • “...one object (or collection), resource, or instantiation can only be described within a single metadata record.” • 1:1 is not formally adopted. • This principle, along with the DC Type field, assists with description of the resource. • RDF (Resource Description Framework) reinforces the 1:1 rule.
XML: eXtensible Markup Language • Based on SGML. • Encoding syntax. • Tools under development.
RDF: Resource Description Framework • A scalable or “extensible” data model. • It provides a framework for exchanging different types of metadata. • Types of Metadata (GILS, INDECS, IMS) • Intended to be machine generated and machine understandable. • The Request for Comment (RFC) was announced in March 1999
The Dublin Core Serves as a Filter • [Diagram: A User; Dublin Core ‘filter’ (DC.title, DC.creator, DC.subject, DC...); mapping/crosswalk; A Resource]
Using DC “Simple”, we can map data from detailed records directly to the Dublin Core. • [Diagram: detailed record fields (Artist’s Name, with Surname, Forename, Title, ...; Type of Work; Period depicted; Place depicted; ...) mapped to DC elements (Creator, Subject, Coverage, ...)]
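A mapping of this kind can be sketched as a lookup table (a crosswalk) applied to each source record. The field names, the sample values, and the specific pairings below (for example, type of work to Subject) are assumptions for illustration; the slide's diagram does not spell out which source field lands in which element.

```python
# Hypothetical crosswalk from local collection-database fields to DC elements.
CROSSWALK = {
    "artist_name": "creator",
    "type_of_work": "subject",
    "period_depicted": "coverage",
    "place_depicted": "coverage",
}


def to_dublin_core(source_record, crosswalk=CROSSWALK):
    """Map a detailed source record onto unqualified DC elements."""
    dc = {}
    for field, value in source_record.items():
        element = crosswalk.get(field)
        if element is None or not value:
            continue  # unmapped or empty fields do not reach the DC record
        dc.setdefault(element, []).append(value)
    return dc


# Made-up sample record; the values are placeholders, not testbed data.
source = {
    "artist_name": "Surname, Forename",
    "type_of_work": "painting",
    "period_depicted": "20th century",
    "place_depicted": "Example City",
    "accession_number": "0000.00",  # has no DC target in this crosswalk
}

dc_record = to_dublin_core(source)
print(dc_record)
```

Two source fields collapse into the single Coverage element here, which shows how a simple crosswalk trades detail for a common basis of search.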
Why DC for Museums • Museum community requires a method to access databases with different underlying schemas because the community historically lacks content standards. • Web provides museums with an opportunity to share with other museums, libraries, archives, individuals, through the use of commonly understood semantics.
What is Museum Specific? • Emphasis on attributes of physical objects. • Associate physical object with persons, places, and events. • Need to describe items, collections, institutions, people, and events. • Need to account for surrogates such as photographs.
CIMI Assumptions for Museums • DC is appropriate for use in describing both physical and digital resources. • DC is easy to learn and simple to use: Is it usable by non-cataloguers? • Information can be meaningfully and efficiently extracted from existing museum systems in order to populate DC records. • The creation of a DC record to describe a museum is cost-effective. • DC aids the discovery of resources more than access to the underlying Collection Management System might.
CIMI Identifies DC Challenges for Museums • Tension: functionality and simplicity. • Tension: extensibility and interoperability. • Human and machine creation and use. • Community-specific functionality, creation, administration, access.
Testbed Participants • Involvement of over 18 participants in both 1998 and 1999. • Access Providers • Software Vendors • Technical Support Personnel • Content Providers • Cultural Heritage • Art • Natural History
Guggenheim Records • The Guggenheim has approximately 5,600 records in an Access database. • Of the 15 DC Elements only a handful could be mapped.
Guggenheim Records • Because Guggenheim records scarcely populated the 15 DC elements, my methodology for testing the DC elements was to build 134 records from scratch. • This process of creating more robust records helped identify documentation projects, such as the addition of subject terms. • It also helped address information integration issues within the museum.
Guggenheim Records • Creating Object, Collection, Institution, & Event records required information to be brought together from different departments. • For object records I combined information from the database with data from the curatorial and registrar files. • Data for collection records was drawn from electronic and paper files in addition to our web site. • Institution records were created using our web site and print catalogue information. • For event records I used exhibition publications, brochures, and our web site.
Guggenheim Contribution • The 134 full or “rich” records describe individual artworks, collections, the museum, and events. • Also contributed were more than 5,600 collection records exported from the collection database. • Creating DC records is intended to be an export routine, but most museums may find, as we did, that their exported DC records are not very robust. • Providing the testbed with both rich and sparse records will benefit further user testing.
Testbed Products • Guide to Best Practice: Dublin Core • http://www.cimi.org/documents/meta_bestprac>VO31.html • Drafted Winter 1998 • Peer Review Spring 1999 • Published Summer 1999 • Repository of over 300,000 records • Contains museums, collections, artifacts • DC “Simple” records either created by hand or exported from legacy systems.
Outcomes • DC is (sort of) easy to use. • DC works for museum information. • DC is a machete, not a scalpel. • Further evaluation is necessary. • Need to express more complexity. • Can be mapped to other standards. • Community will require guidance. • 15 “simple” elements will work for museum data. • Lose ability to express complexities (dates). • Non-intuitive fielding of information (materials, methods, techniques, and creators of surrogates).
Outcomes: CIMI Institute • Responses included: • Need for more concrete examples of DC, XML, RDF. • Would like guidance on how to implement, including storage strategies for archiving, retrievability and architecture. • Fuller description of tools. • More discussion on cost. • Practical examples from the end user’s perspective: what does this look like to the user who is searching for the resource (delivery mechanism)?
Summary • DC is useful for museum information needs. • Qualification of DC is developing. • Web Infrastructure is developing (HTML, XML, RDF). • Tools are beginning to appear and evolve. • Interoperability testbeds are underway.
WWW Infrastructure Evolving • Resource Description Framework (RDF) • will allow rich metadata semantics for documents • http://www.w3.org/RDF/ • Extensible Markup Language (XML) • will allow highly structured documents and rich linking (relationship) capabilities • http://www.w3.org/XML/ • Uniform Resource Names (URNs) • will allow for persistent, globally unique identifiers
Resources • DC Home Page • http://purl.org/dc • Metadata Matters • http://www.nla.gov.au/meta • IFLA Metadata Resources page • http://www.ifla.org/II/metadata.ht. • Dlib Magazine (all DC workshop reports)
Resources • Dublin Core Homepage • http://purl.org/dc • Proposed Recommendation of the DC Metadata Initiative • http://purl.org/dc/elements/1:1 • Modifications to this document will replace RFC 2413 • RFC 2413 • http://www.ietf.org/rfc/rfc2413.txt
Resources: Metadata Tools • DC Dot (UKOLN) • http://www.ukoln.ac.uk/metadata/dcdot • Reggie (DSTC) • http://metadata.net • The aim of the Reggie Metadata Editor is to enable the easy creation of various forms of metadata with the one flexible program. As it stands, the Reggie applet can create metadata using the HTML 3.2 standard, the HTML 4.0 standard, the RDF (Resource Description Framework) format and the RDF Abbreviated format.
Resources: Metadata Tools • Nordic DC Metadata Template • http://www.lub.lu.se/cgi-bin/nmdc.pl • CORC (OCLC) • http://purl.oclc.org/corc
Resources: Metadata Tools • SEED (Search Engine Evaluation & Development), University of Wolverhampton • Researched the automatic classification of web pages, initial work focused on Dewey Decimal Classification • http://scitsd.wlv.ac.uk:8080/metadata.html
DC Dot Dublin Core Generator
<link rel="schema.DC" href="http://purl.org/dc">
<meta name="DC.Title" content="GUGGENHEIM MUSEUMS">
<meta name="DC.Publisher" content="CERFnet">
<meta name="DC.Type" content="Text">
<meta name="DC.Format" content="text/html">
<meta name="DC.Format" content="550 bytes">
<meta name="DC.Identifier" content="http://www.guggenheim.org">
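Output in this HTML form could be produced from an element-to-values mapping with a few lines of code. The sketch below is not how DC Dot itself works; the function name `dc_meta_tags` and the dictionary layout are assumptions for illustration, and the sample values are the ones on the slide.

```python
from html import escape


def dc_meta_tags(record):
    """Render a DC record as HTML <link>/<meta> tags, one tag per value."""
    lines = ['<link rel="schema.DC" href="http://purl.org/dc">']
    for element, values in record.items():
        for value in values:  # repeated elements yield repeated tags
            name = "DC." + element.capitalize()
            content = escape(value, quote=True)  # protect quotes and ampersands
            lines.append(f'<meta name="{name}" content="{content}">')
    return "\n".join(lines)


record = {
    "title": ["GUGGENHEIM MUSEUMS"],
    "publisher": ["CERFnet"],
    "type": ["Text"],
    "format": ["text/html", "550 bytes"],
    "identifier": ["http://www.guggenheim.org"],
}

html_head = dc_meta_tags(record)
print(html_head)
```

Escaping each value matters in practice, since museum titles and descriptions often contain quotation marks or ampersands that would otherwise break the attribute.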
DC Dot Dublin Core Generator: RDF
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:dc="http://purl.org/dc/elements/1.0/">
<rdf:Description about="http://www.guggenheim.org"
 dc:title="GUGGENHEIM MUSEUMS"
 dc:publisher="CERFnet"
 dc:type="Text" >
<dc:format> <rdf:Bag rdf:_1="text/html" rdf:_2="550 bytes" /> </dc:format>
</rdf:Description>
</rdf:RDF>
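To show that this abbreviated RDF is machine-parseable, the sketch below reads the same record back with Python's standard `xml.etree.ElementTree` module. It is a minimal reader for this one shape of record (DC values as attributes, plus an `rdf:Bag` for repeated values), not a general RDF parser; the function name `read_dc` is an assumption for this example.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.0/"

SAMPLE = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.0/">
  <rdf:Description about="http://www.guggenheim.org"
      dc:title="GUGGENHEIM MUSEUMS" dc:publisher="CERFnet" dc:type="Text">
    <dc:format><rdf:Bag rdf:_1="text/html" rdf:_2="550 bytes"/></dc:format>
  </rdf:Description>
</rdf:RDF>"""


def read_dc(rdf_xml):
    """Collect DC values from attribute-style RDF descriptions."""
    record = {}
    root = ET.fromstring(rdf_xml)
    for desc in root.findall(f"{{{RDF_NS}}}Description"):
        # DC values written as attributes, e.g. dc:title="..."
        for key, value in desc.attrib.items():
            if key.startswith(f"{{{DC_NS}}}"):
                record.setdefault(key.split("}", 1)[1], []).append(value)
        # DC values written as child elements holding an rdf:Bag
        for child in desc:
            if not child.tag.startswith(f"{{{DC_NS}}}"):
                continue
            name = child.tag.split("}", 1)[1]
            for bag in child.findall(f"{{{RDF_NS}}}Bag"):
                # Lexical sort puts rdf:_1 before rdf:_2 (fine below ten members).
                for key in sorted(bag.attrib):
                    record.setdefault(name, []).append(bag.attrib[key])
    return record


parsed = read_dc(SAMPLE)
print(parsed)
```

ElementTree expands prefixes such as `dc:` into the full namespace URI, which is why the code matches on `{namespace}name` keys rather than on the prefix text.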
DC Dot Guggenheim Enhanced (1 of 2)
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:dc="http://purl.org/dc/elements/1.0/">
<rdf:Description about="http://www.guggenheim.org Solomon R. Guggenheim Museum"
 dc:title="Solomon R. Guggenheim Museum"
 dc:creator="Guggenheim, Solomon R."
 dc:subject="Bilbao, Spain Berlin, Germany New York, New York, USA Venice, Italy Guggenheim, Solomon R. artworks Krens, Thomas Kandinsky, Wassily Brancusi, Constantin Calder, Alexander Chagall, Marc Delaunay, Robert Klee, Paul Miro, Joan Picasso, Pablo Hilla von Rebay Foundation Museum of Nonobjective Painting Thannhauser, Justin K. Thannhauser, Hilde Guggenheim, Peggy Peggy Guggenheim Collection Panza di Biumo, Giuseppe Robert Mapplethorpe Foundation Mapplethorpe, Robert Conceptual art Twentieth Century post-1945 fine arts styles and movements nonobjective art organizations, nonprofit Art Museums Solomon R. Guggenheim Foundation Messer, Thomas M. Thannhauser collection"
DC Dot Guggenheim Enhanced (2 of 2)
 dc:description="The Solomon R. Guggenheim Museum is comprised of five related museums. In addition to the New York City Fifth Avenue location, there is also Guggenheim SoHo, NYC, Guggenheim Bilbao, Spain, Deutsche Guggenheim, Berlin, and the Peggy Guggenheim Collection, Italy"
 dc:publisher="Solomon R. Guggenheim Museum"
 dc:contributor="Thannhauser, Justin K. Thannhauser, Hilde Guggenheim, Peggy Panza di Biumo, Giuseppe Messer, Thomas M. Krens, Thomas Rebay, Hilla Von Sweeney, James Johnson"
 dc:date="1920"
 dc:type="Text Image Sound Place Physical Object Original Collection Cultural"
 dc:relation="IsPartOf Solomon R. Guggenheim Foundation References http://www.guggenheim.org"
 dc:rights="Solomon R. Guggenheim Museum" >
<dc:format> <rdf:Bag rdf:_1="text/html" rdf:_2="550 bytes"/> </dc:format>
</rdf:Description>
</rdf:RDF>
Thank You! Lynn Ann Underwood Museum Records Manager Documentation & Records Solomon R. Guggenheim Museum 575 Broadway, 3rd floor New York, NY 10012-4233 lunderwood@guggenheim.org Telephone: (212) 423-3871 Telefax: (212) 360-4340