Shareable Metadata for Non-bibliographic Materials: Implications for Libraries and Archives Murtha Baca WebWise 2007 Pre-conference #2 February 28, 2007
Typology of Data Standards
• Data structure standards (metadata element sets): MARC, EAD, Dublin Core, CDWA, VRA Core, TEI
• Data value standards (vocabularies): LCSH, LCNAF, TGM, AAT, ULAN
• Data content standards (cataloging rules): AACR (RDA), ISBD, CCO, DACS
• Data format/technical interchange standards (metadata standards expressed in machine-readable form): MARC, MARCXML, MODS, EAD, CDWA Lite XML, Dublin Core Simple XML schema, VRA Core 4.0 XML schema, TEI XML DTD
It’s not all, and not only, about MARC (or EAD, etc.)
• For non-bibliographic materials such as art objects, architecture, and other cultural works, metadata schemas other than MARC may be more appropriate.
• Metadata for these types of materials in libraries and archives, when not expressed in MARC, has often made use of idiosyncratic, non-standard “local” schemas.
Some Emerging Trends in Metadata Creation
• “Schema-agnostic” metadata
• Metadata that is both shareable and re-purposeable
• Harvestable metadata (OAI-PMH)
• “Non-exclusive”/“cross-cultural” metadata: it’s okay to combine standards from different metadata communities, e.g. MARC and CCO, DACS and AACR, DACS and CCO, EAD and CDWA Lite
• Importance of authorities, and difficulties in “bringing along” the power of authorities with shared metadata records
• The need for practical, economically feasible approaches to metadata creation
EAD (data structure) & DACS (data content) used at the collection level for an archival collection with a common provenance, used in combination with…
… CDWA Lite (data structure/data format) & CCO (data content) at the item level for an individual work within the same collection
Class: Contemporary Art
Work Type: multimedia
Creator: Claes Oldenburg (American sculptor, draftsman, and printmaker, born 1929 in Sweden)
Title: False Food Selection
Creation Date: ca. 1965
Materials: plastic box containing artificial food made of plastic
Measurements: 13.5 x 18 x 5 cm (5 3/8 x 7 x 2 inches)
Style: Fluxus
Subject: box; food; biscuits; petit fours; kaiser roll; eggs; bacon
Current Location: Special Collections, Research Library, Getty Research Institute (Los Angeles, California) (890164 bx.205)
Description: The box of the repository’s copy is blue and contains 3 different biscuits, 3 different petit fours in paper baking cups, a pear, a kaiser roll, and 2 sunny-side up eggs and a strip of bacon glued to the inside of the lid.
Related Work:
Relationship Type: part of
[link to Related Work:] Brown, Jean (American, 1911-1994). Jean Brown Papers, 1916-1995.
MARC (data structure/data format) and AACR (data content) used for a “parent” item (18th-century book with engravings) in OPAC, used in combination with…
… CDWA Lite (data structure/data format) & CCO (data content) used at the item level for an individual engraving from the “parent” work represented in the preceding MARC record.
Class: Prints
Work Type: engraving
Creator: Unknown Spanish
Title: Table Setting for Sixty Covers
Creation Date: ca. 1747
Materials/Techniques: engraving on laid paper
Measurements: plate mark 14.6 x 20 cm (5 3/4 x 7 3/4 inches), on sheet 16 x 21.1 cm (6 3/8 x 8 3/8 inches)
Subject: table setting; food; decoration; centerpieces; confectionery; garnishes; cookery; desserts; tablecloths; tabletop fountains; food presentation; courts; courtiers
Description: Table setting for sixty covers described under the entry “Mesa de sesenta cubiertos, larga, y sus esquinas redondas.” The sculptural decoration represents a rampart and its fortified towers (no. 1). The table with rounded corners is adorned with platters of glass (no. 2), and vessels for holding sweets, sugar, and caramel figures, compotes, cakes, cheese, and fruit.
Current Location: Special Collections, Research Library, Getty Research Institute (Los Angeles, California) (1405.324_pl6)
Related Work:
Relationship Type: part of
[link to Related Work:] Juan de la Mata, (Spanish, 18th century); Arte de reposteria. Madrid: 1747.
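The “part of” link between an item-level record and its parent record can be sketched in a few lines of Python. This is only an illustration: the record keys ("book", "engraving") and the field names are hypothetical, not MARC tags or CDWA Lite element names, and the values are taken from the two records shown in these slides.

```python
# Hypothetical in-memory store. Keys and field names are illustrative only;
# they are not MARC tags or CDWA Lite element names.
records = {
    "book": {
        "standards": "MARC / AACR",
        "creator": "Juan de la Mata",
        "title": "Arte de reposteria",
    },
    "engraving": {
        "standards": "CDWA Lite / CCO",
        "work_type": "engraving",
        "title": "Table Setting for Sixty Covers",
        # The related-work link points at another record in the same store.
        "related_work": {"relationship_type": "part of", "target": "book"},
    },
}


def display_with_parent(record_id, store):
    """Return a one-line display that follows the related-work link, if any."""
    record = store[record_id]
    line = f"{record['title']} [{record['standards']}]"
    link = record.get("related_work")
    if link is not None:
        parent = store[link["target"]]
        line += (
            f" ({link['relationship_type']}: "
            f"{parent['title']} [{parent['standards']}])"
        )
    return line


print(display_with_parent("engraving", records))
```

The point of the sketch is that the two records need not share a schema or a cataloging code; only the relationship has to be resolvable.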
Cross-cultural Standards in Use at the Morgan Library & Museum
AACR-based cataloging codes used by the Morgan:
• Anglo-American Cataloging Rules
AACR “satellites”:
• Descriptive Cataloging of Rare Books 040 $e dcrb
• Elisabeth W. Betz. Graphic Materials : Rules for Describing Original Items and Historical Collections 040 $e gihc
• Steven L. Hensen. Archives, Personal Papers and Manuscripts (Washington : Library of Congress) 040 $e appm
• [AMREMM]
Courtesy of Elizabeth O’Keefe, Director of Collection Information Systems, Morgan Library
Non-AACR-based cataloging codes used by the Morgan:
• Cataloging Cultural Objects [040 $e cco] (for art objects & visual works)
• Describing Archives : A Content Standard (Chicago : Society of American Archivists) 040 $e dacs (for finding aids)
• Local guidelines for seals and seal impressions
Courtesy of Elizabeth O’Keefe, Director of Collection Information Systems, Morgan Library
Why so many different cataloging codes?
• The Morgan discovered that there were crucial bits of data that could not be brought out by strict application of AACR.
• The Morgan discovered that strict application of AACR would lead to dysfunctional displays in the OPAC.
• The Morgan chose in some instances to part company with AACR … thereby anticipating both CCO and some impending changes to AACR in RDA.
Courtesy of Elizabeth O’Keefe, Director of Collection Information Systems, Morgan Library
Morgan MARC record for a cultural object Courtesy of Elizabeth O’Keefe, Director of Collection Information Systems, Morgan Library
CDWA Lite for Materials from a Library’s Photo Archive: The Getty Research Institute’s Tapestries Collection
Decisions
• The Getty Research Institute decided to contribute much fuller records to ARTstor than the Getty Museum did (Getty Museum records and images are “Visible Web,” while the GRI Tapestries records & images are “Deep Web”).
• GRI contributed more than one “resource” (image) per metadata record, when available (Getty Museum did not).
• Data providers have considerable leeway in how basic or how full they want their harvestable records to be, and which resources they want to contribute.
Issues
• GRI’s Photo Study Collection uses a variety of non-standard (but mappable), locally developed metadata schemas.
• Metadata records are “hybrid”: they describe both the work and the image.
• Some cataloging decisions that work locally don’t translate well in a union environment.
Research Library, Getty Research Institute Photo Study Collection Study Images of Tapestries collection Web page
Data structure/mapping/dumbing down, “flattening” issues for shared metadata: Lost in Translation?
• Choose the most appropriate schema to express your data.
• When mapping to another schema, be aware that some loss of granularity &/or context is inevitable.
• An institution’s intention to contribute to union catalogs may (should?) affect local cataloging practices: think about your metadata outside of its local context.
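A minimal Python sketch of what “flattening” looks like when a richer record is mapped to simple Dublin Core. The source field names and the choice of target elements are assumptions made for illustration; this is not the published CDWA Lite to Dublin Core crosswalk. Only the target names (type, creator, title, date, format, subject, description) are actual simple Dublin Core elements.

```python
# Illustrative crosswalk only: the source field names and the chosen targets
# are assumptions, not a published CDWA Lite-to-Dublin Core mapping.
CROSSWALK = {
    "work_type": "type",
    "creator": "creator",
    "title": "title",
    "creation_date": "date",
    "materials": "format",      # two distinct source fields ...
    "measurements": "format",   # ... collapse into one target element
    "subject": "subject",
    "description": "description",
}


def to_simple_dc(record):
    """Map a record to simple DC; return the result and the unmapped fields."""
    dc = {}
    dropped = []
    for field, value in record.items():
        target = CROSSWALK.get(field)
        if target is None:
            dropped.append(field)  # no home in the target schema
            continue
        values = value if isinstance(value, list) else [value]
        dc.setdefault(target, []).extend(values)
    return dc, dropped


engraving = {
    "class": "Prints",
    "work_type": "engraving",
    "creator": "Unknown Spanish",
    "title": "Table Setting for Sixty Covers",
    "creation_date": "ca. 1747",
    "materials": "engraving on laid paper",
    "measurements": "on sheet 16 x 21.1 cm",
    "subject": ["table setting", "food", "desserts"],
    "relationship_type": "part of",
}

dc, dropped = to_simple_dc(engraving)
print(dc["format"])  # materials and measurements are no longer distinguishable
print(dropped)       # fields with no target in this mapping
```

Two kinds of loss show up here: fields that merge into one element (granularity) and fields that disappear altogether (context).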
Roy Tennant, “Bitter Harvest: Problems & Suggested Solutions for OAI-PMH Data & Service Providers,” http://www.cdlib.org/inside/projects/harvesting/bitter_harvest.html “... this low barrier [for contributing metadata via the OAI/PMH] does not preclude a much higher ceiling [than simple Dublin Core], and the OAI-PMH specifically allows the use of much richer metadata schemes.” Also see http://oai-best.comm.nsdl.org/cgi-bin/wiki.pl?MultipleMetadataFormats
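In protocol terms, the “higher ceiling” is the metadataPrefix argument of an OAI-PMH request. The Python sketch below builds two ListRecords request URLs. The base URL, the "cdwalite" prefix, and the "tapestries" set name are hypothetical; only "oai_dc" is required of every repository, and the prefixes a given repository actually supports are reported by its ListMetadataFormats response.

```python
from urllib.parse import urlencode

# Hypothetical repository endpoint, for illustration only.
BASE_URL = "https://repository.example.org/oai"


def list_records_url(metadata_prefix, set_spec=None):
    """Build an OAI-PMH ListRecords request URL for the given format."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec is not None:
        params["set"] = set_spec  # optional selective harvesting by set
    return f"{BASE_URL}?{urlencode(params)}"


# Simple Dublin Core: the baseline every OAI-PMH repository must offer.
print(list_records_url("oai_dc"))

# A richer format, if the repository exposes one. The prefix and set name
# here are assumed values, not ones defined by the protocol.
print(list_records_url("cdwalite", set_spec="tapestries"))
```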
Lack of Vocabulary Control/No Mapping of Different Vocabularies: More Translation Problems!
Example from RLG Cultural Materials: US & UK forms of the same term are not linked, so users must do 2 separate searches
Example from Los Angeles County Museum of Art (LACMA) integrated search of museum & library collections
LACMA museum & library use different name forms for the same artist—requiring user to know those forms, and search under all of them.
LACMA library holdings include yet another name form for the same artist.
Controlled vocabulary (ULAN) links all variants for this artist
Getty Research Institute Special Collections also “forces” users to use the single form that happens to occur in their EAD finding aids.
A search on a popular variant name for the artist results in no hits, even though this artist *is* there, under another name form.
Getty site-wide search (of HTML pages) uses the power of the variants in ULAN, so that users can search on any variant and still get results.
Same situation at the National Gallery of Art—only the museum’s name forms are available in searching collections.
Controlled vocabulary (ULAN) links all variants for this artist
Solutions to the Vocabulary Problem?
• Include variants, broader & narrower terms in metadata record for works (labor-intensive, redundant)
• Service providers/aggregators employ the appropriate controlled vocabularies & thesauri as “search assistants” (promising, but not now a reality)
• Can OCLC’s Terminologies Service help? http://www.oclc.org/terminologies/
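The “search assistant” idea can be sketched as query expansion in Python: the user’s term is looked up in a vocabulary, and the search runs against every linked variant. The vocabulary below is a hand-built toy, not data exported from ULAN or the AAT, and the function names and record fields are hypothetical.

```python
# Toy vocabulary: each set groups forms treated as equivalent.
# Hand-built for illustration; not exported from ULAN or the AAT.
VOCABULARY = [
    {"color", "colour"},
    {"Le Corbusier", "Jeanneret, Charles-Édouard"},
]


def expand_query(term, vocabulary):
    """Return all known variants of a term, or just the term if unknown."""
    wanted = term.casefold()
    for variants in vocabulary:
        if any(wanted == v.casefold() for v in variants):
            return set(variants)
    return {term}


def search(term, records, vocabulary):
    """Match records whose creator equals any variant of the search term."""
    forms = {f.casefold() for f in expand_query(term, vocabulary)}
    return [r for r in records if r["creator"].casefold() in forms]


RECORDS = [
    {"id": 1, "creator": "Jeanneret, Charles-Édouard"},
    {"id": 2, "creator": "Le Corbusier"},
    {"id": 3, "creator": "Oldenburg, Claes"},
]

hits = search("le corbusier", RECORDS, VOCABULARY)
print([r["id"] for r in hits])
```

With the expansion done at search time by the service provider, the individual records stay as they were cataloged; nobody has to copy every variant into every record.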
Sharing Metadata Created according to Different Data Content (=cataloging) standards, or no standards at all: How does this help (or hinder) end-users?
Examples of a core metadata element (Title), encoded according to different data content (=cataloging, syntax) rules:
Title (AACR): Aquila intermedios leones vinum album et rubrum fundentes
Title (CCO): Arch Decorated with Double-headed Imperial Eagle and Gilt Lions Spouting White and Red Wine
The Way Forward?
• Service providers should be more demanding (i.e. require that data providers adhere to certain standards and use certain vocabularies, and require “pre-washed” metadata).
• Data providers should consistently use appropriate standard schemas in their local systems.
• Service providers should consider “adding value” via services like vocabulary mapping, query expansion, vocabulary-assisted searching, user-added metadata, post-harvest subsetting, metadata enhancement, etc. (See Tennant article.)
Lessons Learned
• Metadata (descriptive, technical, rights, administrative, preservation) is one of your biggest investments.
• Do it once, do it right (consistent schemas, controlled vocabularies), and you can re-purpose metadata in a wide variety of ways.
• Good descriptive metadata records can be core: records don’t need to be “full” to be “good.”
• Creation of consistent, standards-based descriptive metadata (a.k.a. cataloging!) is time- and labor-intensive, but it’s worth it.
mbaca@getty.edu http://www.getty.edu/research/conducting_research/vocabularies/ http://www.getty.edu/research/conducting_research/standards/