
Using Metadata in CONTENTdm

Diana Brooking and Allen Maberry, Metadata Implementation Group, Univ. of Washington. Crossing Organizational Boundaries, Oct. 29, 2002.

Presentation Transcript


  1. Using Metadata in CONTENTdm Diana Brooking and Allen Maberry Metadata Implementation Group, Univ. of Washington Crossing Organizational Boundaries Oct. 29, 2002

  2. Outline • The metadata “environment”: factors that influence basic decisions • Structure of metadata: Dublin Core, field structure in CONTENTdm • Content standards: what goes into the fields, formatting, controlled vocabularies • The data dictionary: bringing it all together

  3. Metadata: what is it? • Data about data • “Metadata are data that describe the attributes of a resource; characterize its relationships; support its discovery, management, and effective use; and exist in an electronic environment.” (Sherry Vellucci, LRTS 44 (1), 1999) • Commonly known as cataloging

  4. Metadata: how is it used? • For description: information for display with the image • For searching: users search for images by searching for text attached to the image

  5. Basic Decisions: Description • How much information do you have? • How much information do your users need/want? • What is depicted in the image? • Who created it? • Why is it important? Why did you select it? • How much detail do you need to go into?

  6. Basic Decisions: Searching • How will users find the images? What will they be looking for? What aspects are they interested in? • How will you find the images? What are your staff’s needs? • At what level do you need to distinguish images from one another? • At what level do you need to bring like resources together?

  7. Decision Factors • Size of file • 50 images (small enough to browse) • 10,000 images (need for more precise searching) • 10,000 images of many different things vs. 10,000 images of trains

  8. Decision Factors • Audience • General public vs. specialists (e.g., railroad enthusiasts) • Institutional mission • Say you are a railroad museum (audience expectations)

  9. Decision Factors • Legacy data • Starting from scratch • Years of good cataloging • Years of inconsistent cataloging • Software issues • What kind of data can the system handle? • What are its search capabilities? • Short-term vs. long-term view

  10. Basic Dublin Core Metadata • What is the Dublin Core Metadata Element Set (DCMES)? • Why was it developed, and how has it been developed? • A short history of the DC Initiative is available at http://www.dublincore.org/about/overview/

  11. Dublin Core Metadata Element Set • There are 15 basic elements • See Dublin Core Element Set, Version 1.1 - Reference Description • But it is adaptable and expandable to fit the needs of different users through the use of “application profiles”

  12. Dublin Core and CONTENTdm • CONTENTdm is designed around the Dublin Core • (Very) basic overview of how CONTENTdm works • CONTENTdm uses DC element names as file names • Because each database has constant file names, it is easy to combine them and search one or more collections at once

  13. Dublin Core mapping • An example: • Collection A has a field “Photographer” mapped to DC:Creator, and Collection B has a field “Artist” mapped to DC:Creator. Searching across both databases searches the CONTENTdm index “Creat*” and retrieves data from the index for both “Photographers” and “Artists” for collections A + B or A+B+n…
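
The mapping example on this slide can be sketched in a few lines of Python. This is only an illustration of the idea, not CONTENTdm's actual API or storage format: the dictionaries, function names, and sample records below are all invented for the example.

```python
# Hypothetical illustration only: none of these names come from CONTENTdm.
from collections import defaultdict

# Each collection maps its own field labels to Dublin Core elements.
FIELD_MAPPINGS = {
    "Collection A": {"Photographer": "creator", "Title": "title"},
    "Collection B": {"Artist": "creator", "Title": "title"},
}

# Invented sample records, keyed by local field label.
RECORDS = {
    "Collection A": [{"Photographer": "Example, Alice", "Title": "Harbor view"}],
    "Collection B": [{"Artist": "Example, Bob", "Title": "Harbor sketch"}],
}


def build_dc_index(mappings, records):
    """Group every value under the DC element its local field maps to."""
    index = defaultdict(list)
    for collection, items in records.items():
        for item in items:
            for label, value in item.items():
                element = mappings[collection].get(label)
                if element is not None:
                    index[element].append((collection, label, value))
    return index


def search(index, element, term):
    """Case-insensitive substring search within one DC element."""
    term = term.lower()
    return [hit for hit in index.get(element, []) if term in hit[2].lower()]


index = build_dc_index(FIELD_MAPPINGS, RECORDS)
creator_hits = search(index, "creator", "example")
```

A single search on the shared "creator" element returns values entered under both "Photographer" and "Artist", which is the behavior the slide describes for collections A + B.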

  14. Dublin Core and searching • What are the practical consequences of this? • In cross-database searching, one can search on specific fields. However, the names of these fields will not be Photographer or Artist, but “Creator”, because that is the common name of the index in each collection. • However, you can do a keyword search on all “searchable” fields in the database, whether they are mapped to a Dublin Core field or not.

  15. Modern Book Arts field labels • bibliographic description = descr0 • text production = descr1 • image production = descr2, etc. • Cross-database search index: Description = descr*
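
The relationship on this slide between the per-field internal names (descr0, descr1, descr2) and the shared index pattern (descr*) can be sketched with a wildcard match. The labels and internal names are taken from the slide; the lookup function is a hypothetical illustration and says nothing about how CONTENTdm implements its indexes.

```python
from fnmatch import fnmatchcase

# Field labels and internal names as given on the slide.
MODERN_BOOK_ARTS_FIELDS = {
    "bibliographic description": "descr0",
    "text production": "descr1",
    "image production": "descr2",
}


def fields_in_index(fields, pattern):
    """Return the local labels whose internal name matches the index pattern."""
    return sorted(
        label for label, name in fields.items() if fnmatchcase(name, pattern)
    )


# All three local fields fall under the cross-database "Description" index.
description_fields = fields_in_index(MODERN_BOOK_ARTS_FIELDS, "descr*")
```

A cross-database search on "Description" therefore cannot distinguish text production from image production; that distinction exists only in the collection's own labels.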

  16. Dublin Core tips • Be careful about what information you put in searchable fields, even if they are not mapped to a DC element. • If you have multiple collections, it is very important to map the same type of data to the same DC elements consistently.

  17. Content Standards • Used for choosing and formatting the data that goes into the fields • Increase coherence and intelligibility of description • Enhance reliability of retrieval • Enable compatibility with other collections (cross-database searching) • Make maintenance and possible migration of data to other software easier

  18. Standards = Consistency • “Date” field: dates should always be formatted the same way • “Photographer” field: same person’s name should always appear in the same form • “Subject” field: same topic should have the same term used to describe it across images • If different terms or formats are used, the user may not even realize that more than one search is necessary
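
A consistency check of the kind this slide implies might look like the sketch below. The date convention (YYYY, YYYY-MM, or YYYY-MM-DD) is an assumption made for the example, not a format the presentation prescribes, and the records are invented.

```python
import re

# Assumed convention for this example: YYYY, YYYY-MM, or YYYY-MM-DD.
DATE_PATTERN = re.compile(r"\d{4}(-\d{2}(-\d{2})?)?")

# Invented sample records.
RECORDS = [
    {"Date": "1905-06-01", "Photographer": "Example, Alice"},
    {"Date": "June 1, 1905", "Photographer": "Example, Alice."},
    {"Date": "1905", "Photographer": "example, alice"},
]


def badly_formatted_dates(records):
    """Return date strings that do not follow the assumed convention."""
    return [r["Date"] for r in records if not DATE_PATTERN.fullmatch(r["Date"])]


def name_variants(records, field="Photographer"):
    """Group name strings that differ only by case, punctuation, or spacing."""
    groups = {}
    for record in records:
        key = re.sub(r"[^a-z]", "", record[field].lower())
        groups.setdefault(key, set()).add(record[field])
    return [sorted(forms) for forms in groups.values() if len(forms) > 1]


bad_dates = badly_formatted_dates(RECORDS)
variants = name_variants(RECORDS)
```

The name check is deliberately crude: it catches differences in case and punctuation but not inverted or abbreviated forms, which is one reason an authority file is more dependable than after-the-fact cleanup.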

  19. Examples of Content Standards For description: • Anglo-American Cataloguing Rules, 2nd ed., 2002 revision (libraries) • Graphic Materials: Rules for Describing Original Items and Historical Collections, 1982; revisions available electronically (libraries, also museums, historical societies, LC Prints & Photo., CORBIS)

  20. Content Standards: Controlled Vocabularies “Any subset of the lexicon of a natural language. A list of preferred and nonpreferred terms produced by the process of vocabulary control. Types of controlled vocabularies include subject heading lists and thesauri.” (NISO)
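
The "preferred and nonpreferred terms" in this definition can be modeled as a simple lookup. The vocabulary below is a hypothetical local list made up for illustration; it is not drawn from any published thesaurus or subject heading list.

```python
# Hypothetical local list: preferred term -> non-preferred entry terms.
VOCABULARY = {
    "Railroads": ["Railways", "Rail roads"],
    "Locomotives": ["Train engines"],
}


def build_lookup(vocabulary):
    """Map every preferred and non-preferred term to its preferred form."""
    lookup = {}
    for preferred, variants in vocabulary.items():
        for term in [preferred, *variants]:
            lookup[term.casefold()] = preferred
    return lookup


def normalize_term(term, lookup):
    """Return the preferred term, or None if the term is not in the vocabulary."""
    return lookup.get(term.strip().casefold())


LOOKUP = build_lookup(VOCABULARY)
```

Returning None for an unknown term, rather than passing it through, makes it easy to flag entries that need a cataloger's decision.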

  21. Controlled vocabs for which fields? • When you need consistency across images, user searches to find all … • Proper names for things (people, places, etc.) • Subjects depicted in the images • Not necessary when you have… • Fields that contain data more likely to be unique to the particular image (title, notes, other free text fields)

  22. Remember… You can have fields that don’t use controlled vocabularies, but where you still need consistency in format: • Dates • Image numbers • Physical description • You could create your own controlled vocab lists (if you really had to)

  23. Controlled Vocabularies For names: • Library of Congress/National Authority File: http://authorities.loc.gov • Union List of Artist Names (Getty): http://www.getty.edu/research/tools/vocabulary/ulan • USGS Geographic Names Information System: http://geonames.usgs.gov/gnishome.html

  24. Controlled Vocabularies For subjects: • Library of Congress Subject Headings: http://authorities.loc.gov • LC Thesaurus for Graphic Materials: http://www.loc.gov/rr/print/tgm1 • Art & Architecture Thesaurus (Getty): http://www.getty.edu/research/tools/vocabulary/aat • Chenhall’s Nomenclature (The Revised Nomenclature for Museum Cataloging. Walnut Creek: AltaMira Press, 1995)

  25. Vocabulary conflicts? • DC Subject: LCSH vs. AAT • Church buildings vs. Churches • DC Coverage: LC vs. U.S. Board on Geographic Names • Moscow vs. Moskva • Challenge of meeting needs of diverse collections and users, while maintaining consistency within and between databases
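
One possible way to soften such conflicts at search time, offered here only as a sketch and not as something the presentation proposes, is to expand a query to every equivalent form. The term pairs come from the slide; the equivalence table and function are hypothetical.

```python
# Term pairs from the slide, treated here as equivalent for searching.
EQUIVALENTS = [
    {"Church buildings", "Churches"},
    {"Moscow", "Moskva"},
]


def expand_query(term, equivalents=EQUIVALENTS):
    """Return all equivalent forms of a term, or the term alone if none are known."""
    for group in equivalents:
        if any(term.casefold() == member.casefold() for member in group):
            return sorted(group)
    return [term]
```

For example, `expand_query("Moskva")` yields both "Moscow" and "Moskva", so records cataloged under either vocabulary are retrieved. The trade-off is that someone has to maintain the equivalence table, which is itself a small controlled vocabulary.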

  26. Data Dictionaries For each project a data dictionary documents: • Database-specific field labels • Mapping of fields to DC elements • Data formatting instructions for each field • Recommended controlled vocabularies • UW data dictionaries: http://www.lib.washington.edu/msd/mig/datadicts/default.html • MOHAI
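
The items a data dictionary documents can be captured in a small record structure. The sketch below is a hypothetical layout, not the format of the UW data dictionaries; the sample entries and formatting notes are invented, and only the 15 Dublin Core element names are taken from the standard.

```python
from dataclasses import dataclass
from typing import Optional

# The 15 elements of the Dublin Core Metadata Element Set, Version 1.1.
DC_ELEMENTS = frozenset({
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
})


@dataclass(frozen=True)
class FieldDefinition:
    label: str                        # database-specific field label
    dc_element: Optional[str]         # mapped DC element, or None if unmapped
    formatting: str                   # data formatting instructions
    vocabulary: Optional[str] = None  # recommended controlled vocabulary


def invalid_mappings(dictionary):
    """Return labels of fields mapped to something that is not a DC element."""
    return [
        field.label
        for field in dictionary
        if field.dc_element is not None and field.dc_element not in DC_ELEMENTS
    ]


# Invented sample entries for illustration.
EXAMPLE_DICTIONARY = [
    FieldDefinition("Photographer", "creator", "Surname, Forename", "Name authority file"),
    FieldDefinition("Date", "date", "YYYY-MM-DD"),
    FieldDefinition("Image number", "identifier", "Local numbering scheme"),
    FieldDefinition("Notes", None, "Free text"),
]
```

Keeping the dictionary in a structured form like this makes it possible to check mappings automatically before a collection is added to a cross-database search.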
