Introduction to Metadata
Jenn Riley, Metadata Librarian, IU Digital Library Program
Many definitions of metadata • “Data about data” • “Structured information about an information resource of any media type or format.” (Caplan) • “Structured information that describes, explains, locates, or otherwise makes it easier to retrieve, use, or manage an information resource.” (NISO) • “Metadata is constructed, constructive, and actionable.” (Coyle) • … S504 Spring 2010
Metadata and cataloging • Depends on what you mean by: • metadata, and • cataloging! • But, in general: • Metadata is broader in scope than cataloging • Much metadata creation takes place outside of libraries • Good metadata practitioners use key cataloging principles in non-MARC environments • Metadata created for many different types of materials • Metadata is NOT only for Internet resources!
Some types of metadata
How metadata is used
Creating descriptive metadata • Digital library content management systems • CONTENTdm • ExLibris Digitool • Greenstone • DSpace • Library catalogs • Spreadsheets & databases • Directly in XML (generally not recommended)
Creating other types of metadata • Technical • Generated by and stored in content management system • Stored in separate Excel spreadsheet • Structural • Created and stored in content management system • METS XML • GIS • Using specialized software • Content markup • In XML
Levels of control • Data structure standards (e.g., MARC, MODS) • Data content standards (e.g., AACR2r, RDA) • Encoding schemes • Vocabulary (a.k.a. controlled vocabularies) • Syntax • High-level models (e.g., FRBR, DCAM)
Descriptive metadata • Purpose • Discovery • Description to support use and interpretation • Some common general schemas • MARC • MARCXML • MODS • Dublin Core • LOTS of domain-specific schemas
MODS • “Metadata Object Description Schema” • Developed and maintained by the Library of Congress Network Development and MARC Standards Office • For encoding bibliographic information • Influenced by MARC, but not equivalent • Quickly gaining adoption
Dublin Core (1) • “Core” across all knowledge domains • National and international standard • 2001: Released as ANSI/NISO Z39.85 • 2003: Released as ISO 15836 • No element required • All elements repeatable • 1:1 principle
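The "no element required" and "all elements repeatable" rules can be illustrated with a short Python sketch. This is an illustration under stated assumptions, not an official DCMI tool: the function name `build_dc_record`, the wrapper element `record`, and the sample values are all hypothetical. Only the 15 element names and the Dublin Core elements namespace URI come from the published element set.

```python
import xml.etree.ElementTree as ET

# Namespace of the 15-element Dublin Core Metadata Element Set.
DC_NS = "http://purl.org/dc/elements/1.1/"

# The 15 unqualified Dublin Core elements.
DC_ELEMENTS = {
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
}


def build_dc_record(fields):
    """Build a simple DC record from a dict of element name -> list of values.

    No element is required (an empty dict is accepted) and every element
    may repeat (each value in a list becomes its own XML element).
    The wrapper element name "record" is a hypothetical choice.
    """
    unknown = set(fields) - DC_ELEMENTS
    if unknown:
        raise ValueError(f"not Dublin Core elements: {sorted(unknown)}")
    root = ET.Element("record")
    for name, values in fields.items():
        for value in values:
            child = ET.SubElement(root, f"{{{DC_NS}}}{name}")
            child.text = value
    return root


# Hypothetical sample data: one title, two subjects, no other elements.
record = build_dc_record({
    "title": ["Sample photograph"],
    "subject": ["Bridges", "Rivers"],
})
empty_record = build_dc_record({})
```

The sketch checks only element names. It does not enforce the 1:1 principle, which is a modeling decision about what a single record describes rather than something a serializer can verify.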
Dublin Core (2) • Two “flavors” • Unqualified – 15 elements • Qualified • Additional elements • Element refinements • Encoding schemes (vocabulary and syntax) • All qualifiers must follow “dumb-down” principle • Unqualified DC required for sharing metadata via the Open Archives Initiative Protocol for Metadata Harvesting
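The "dumb-down" principle says a client that ignores the qualifiers should still be left with a usable unqualified value. A minimal Python sketch of that idea follows. The function name `dumb_down`, the tuple layout, and the sample statements are hypothetical; the refinement-to-element mapping lists only a few well-known DCMI refinements and is not complete.

```python
# A partial map of element refinements to the unqualified element they refine.
REFINES = {
    "alternative": "title",
    "abstract": "description",
    "tableOfContents": "description",
    "created": "date",
    "issued": "date",
    "modified": "date",
    "extent": "format",
    "medium": "format",
    "isPartOf": "relation",
    "spatial": "coverage",
    "temporal": "coverage",
}


def dumb_down(statements):
    """Reduce qualified statements to unqualified Dublin Core.

    `statements` is a list of (term, value, encoding_scheme) tuples.
    Each refinement is replaced by the element it refines and the
    encoding scheme is dropped; the value itself is kept unchanged.
    """
    simple = {}
    for term, value, _scheme in statements:
        element = REFINES.get(term, term)
        simple.setdefault(element, []).append(value)
    return simple


# Hypothetical qualified description.
qualified = [
    ("title", "Sample title", None),
    ("alternative", "Sample alternative title", None),
    ("issued", "1964", "W3CDTF"),
    ("spatial", "Sample place name", "TGN"),
]
simple = dumb_down(qualified)
```

After dumbing down, the record has two `title` values, one `date`, and one `coverage`, which is the form a harvester expecting unqualified DC could consume.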
DCMI Abstract Model • New direction for the Dublin Core Metadata Initiative • An “information model which is independent of any particular encoding syntax” • RDF-inspired, but not RDF • DCMI resource model • DCMI description set model • DCMI vocabulary model • Full abstract model recommendation • Still too early to really know where this is going
Comparing descriptive metadata formats
Data content standards • Anglo-American Cataloging Rules, 2nd edition (AACR2) • Resource Description and Access (RDA) • Actually in some sense also a set of “properties” (which are not quite elements) • Intention is “principles” rather than “rules” • Describing Archives: A Content Standard (DACS) • Cataloging Cultural Objects (CCO) • Also many format-specific guidelines • Descriptive Cataloging of Rare Materials (DCRM) series • Archival Moving Image Materials: A Cataloging Manual • Betz: Graphic Materials • …
Vocabulary encoding schemes (a.k.a. controlled vocabularies) • TGM I • TGM II • TGN • GeoNet • AAT • LCSH • LCNAF • DCMI Type • MIME Types • …etc.
Syntax encoding schemes • Essentially, “data types” • ISO8601 • W3CDTF • URI • …etc.
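Because a syntax encoding scheme works like a data type, a value can be checked against it mechanically. The Python sketch below tests whether a string has the shape of a W3CDTF date. The function name `is_w3cdtf` is hypothetical, and the check covers syntax only: it does not verify that the month, day, or time values are in range.

```python
import re

# W3CDTF allows increasing granularity: YYYY, YYYY-MM, YYYY-MM-DD, and
# date plus time, where a time must carry a zone designator (Z or +hh:mm / -hh:mm).
W3CDTF_PATTERN = re.compile(
    r"\d{4}"                              # year
    r"(-\d{2}"                            # optional month
    r"(-\d{2}"                            # optional day
    r"(T\d{2}:\d{2}(:\d{2}(\.\d+)?)?"     # optional time, seconds, fraction
    r"(Z|[+-]\d{2}:\d{2}))?"              # zone designator, required with a time
    r")?)?"
)


def is_w3cdtf(value):
    """Return True if `value` has the syntactic shape of a W3CDTF date."""
    return W3CDTF_PATTERN.fullmatch(value) is not None


# Values with the right shape, then values that fail the check.
well_formed = ["2010", "2010-01", "2010-01-15", "2010-01-15T09:30Z",
               "2010-01-15T09:30:00-05:00"]
malformed = ["01/15/2010", "Spring 2010", "2010-01-15T09:30"]
```

A value such as "Spring 2010" is perfectly readable to a person but fails the check, which is the point of declaring a syntax encoding scheme: software consuming the metadata can rely on the format.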
Technical metadata • For recording technical aspects of digital objects • Of use for long-term maintenance of data • Some examples: • NISO Z39.87: Data Dictionary – Technical Metadata for Digital Still Images & MIX • Schema for Technical Metadata for Text
Structural metadata • For creating a logical structure between digital objects • Locating the same intellectual content on multiple representations • Noting points of interest within a single resource • Grouping and sequencing multiple files that make up a logical whole • METS is the current primary schema
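"Grouping and sequencing multiple files that make up a logical whole" can be sketched as a METS structural map. The Python below is a rough illustration only: the function name `build_struct_map`, the label, and the file IDs are hypothetical, the output is a fragment rather than a complete METS document, and it is not validated against the METS schema. It assumes each page of a scanned book corresponds to one file.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"


def build_struct_map(label, file_ids):
    """Build a METS structMap fragment that sequences page files in a book.

    Each file ID becomes one page-level <div>, numbered in reading order,
    holding an <fptr> that points at the file by its ID.
    """
    struct_map = ET.Element(f"{{{METS_NS}}}structMap", TYPE="physical")
    book = ET.SubElement(struct_map, f"{{{METS_NS}}}div", TYPE="book", LABEL=label)
    for order, file_id in enumerate(file_ids, start=1):
        page = ET.SubElement(book, f"{{{METS_NS}}}div", TYPE="page", ORDER=str(order))
        ET.SubElement(page, f"{{{METS_NS}}}fptr", FILEID=file_id)
    return struct_map


# Hypothetical three-page object; the IDs would normally match <file> entries
# in the same METS document's file section.
struct_map = build_struct_map("Sample book", ["FILE001", "FILE002", "FILE003"])
pages = struct_map.findall(f"{{{METS_NS}}}div/{{{METS_NS}}}div")
```

The ORDER attribute carries the sequence, so the files can be stored and named in any way while the logical page order stays explicit.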
What is FRBR? • Functional Requirements for Bibliographic Records • 1998 report from IFLA • Conceptual model describing the entities and relationships underlying bibliographic information • Not a data model • Not an element set • Not a record format • The basis of concepts and terminology for RDA
FRBR user tasks • Bibliographic records exist to help users: • Find • Identify • Select • Obtain • …various entities
The core of FRBR: Group 1 Entities
• WORK: “a distinct intellectual or artistic creation”
• EXPRESSION: “the intellectual or artistic realization of a work”
• MANIFESTATION: “the physical embodiment of an expression of a work”
• ITEM: “a single exemplar of a manifestation”
• Relationships: a WORK is realized through an EXPRESSION, which is embodied in a MANIFESTATION, which is exemplified by an ITEM
• Example: one work, several expressions
w1 Franz Schubert's Trout quintet
-e1 the composer's score
-e2 a performance by the Amadeus Quartet and Hephzibah Menuhin on piano
-e3 a performance by the Cleveland Quartet and Yo-Yo Ma on the cello
-. . . .
• Example: work, expression, manifestation, item
w1 Ronald Hayman's Playback
-e1 the author's text edited for publication
-m1 the book published in 1973 by Davis-Poynter
-i1 copy autographed by the author
• Example: two expressions, each with its own manifestation
w1 Harry Lindgren's Geometric dissections
-e1 original text entitled Geometric dissections
-m1 the book published in 1964 by Van Nostrand
-e2 revised text entitled Recreational problems in geometric dissections ....
-m1 the book published in 1972 by Dover
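The Group 1 hierarchy can be sketched as nested Python dataclasses. FRBR is a conceptual model, not a data model, so this is only one possible way to picture it. The class and field names are hypothetical, and the sample data is the Lindgren example from the slide.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Item:
    """A single exemplar of a manifestation."""
    description: str


@dataclass
class Manifestation:
    """The physical embodiment of an expression; exemplified by items."""
    description: str
    items: List[Item] = field(default_factory=list)


@dataclass
class Expression:
    """The realization of a work; embodied in manifestations."""
    description: str
    manifestations: List[Manifestation] = field(default_factory=list)


@dataclass
class Work:
    """A distinct intellectual or artistic creation; realized through expressions."""
    title: str
    expressions: List[Expression] = field(default_factory=list)

    def all_manifestations(self):
        """Collect every manifestation across all expressions of this work."""
        return [m for e in self.expressions for m in e.manifestations]


# The Lindgren example: two expressions, each with one manifestation.
work = Work(
    title="Harry Lindgren's Geometric dissections",
    expressions=[
        Expression(
            "original text entitled Geometric dissections",
            [Manifestation("the book published in 1964 by Van Nostrand")],
        ),
        Expression(
            "revised text entitled Recreational problems in geometric dissections",
            [Manifestation("the book published in 1972 by Dover")],
        ),
    ],
)
```

Walking from the work down to its manifestations is what lets a catalog group the 1964 and 1972 books under one work while still distinguishing the original text from the revised one.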
Other FRBR entities • Group 2 (those responsible for Group 1 entities): Person, Corporate body • Group 3 (subjects of Works): Concept, Object, Event, Place
WorldCat’s FRBRization – it’s a start.
But that’s not all! • Functional Requirements for Authority Data (FRAD) • Published in 2009 • Adds some new attributes to existing FRBR entities • Adds Group 2 entity Family • Adds entities Name and Identifier • Adds entities Controlled Access Point, Agency, and Rules • Functional Requirements for Subject Authority Data (FRSAD) • Draft issued in 2009 • Gets rid of Concept/Object/Event/Place • In favor of Thema and Nomen • Unclear if these will have the same status as the first FRBR report
FRBR in RDA • Uses FRBR entities strictly but not relationships and attributes • Fairly loose interpretation of many features of FRBR • Structured according to FRBR principles • e.g., “Section 1 - Recording attributes of manifestation and item” • RDA “elements” for each of the FRBR entities being registered as RDF properties • Entirely new purpose for RDA, separate from its role as cataloging rules • Idea is to promote re-use of RDA metadata outside of libraries • Especially in Semantic Web applications • If it works, might show if FRBR principles are useful to non-library communities • We have no idea yet if this is going to work
Further information • jenlrile@indiana.edu • These presentation slides: <http://www.dlib.indiana.edu/~jenlrile/presentations/slis/10spring/s504/s504.ppt> • Metadata librarians listserv: <http://metadatalibrarians.monarchos.com>