2004 Texas Library Association Annual Conference, March 18, 2004, San Antonio, TX. A Complex Standard and Its Use: Results from an empirical analysis of MARC.
2004 Texas Library Association Annual Conference, March 18, 2004, San Antonio, TX
A Complex Standard and Its Use: Results from an empirical analysis of MARC
William E. Moen <wemoen@unt.edu>
School of Library and Information Sciences
Texas Center for Digital Knowledge
University of North Texas
Denton, TX 72603
Overview
• Context for the analysis -- interoperability
• Findings from the analysis
• Indexing and MARC
• More questions …
TLA Annual Conference -- March 18, 2004 -- San Antonio, TX
Context for the analysis
• Interoperability across library online catalogs
• Indexing of MARC records to support searching
• Richness of MARC content designation available
• Indexing guidelines prepared for the Z39.50 Interoperability Testbed (Z-Interop)
• Implications for indexing guidelines and policies
Interoperability testbed project
Realizing the Vision of Networked Access to Library Resources: An Applied Research and Demonstration Project to Establish and Operate a Z39.50 Interoperability Testbed
• An Institute of Museum and Library Services National Leadership Grant
• Goal: Improve Z39.50 semantic interoperability among libraries for information access and resource sharing
For more information, visit the project website: http://www.unt.edu/zinterop/
Components of the testbed
• Test dataset
  • 400,000+ MARC 21 records from OCLC’s WorldCat
• Z39.50 reference implementations
  • Z-client (Bookwhere), Z-server & information retrieval system (Sirsi Unicorn)
• Test scenarios & searches
  • Searches with known result records from dataset
• Benchmarks
  • Results of test searches using reference implementations
Z-Interop test dataset
• Approximately 1% sample of MARC records from OCLC’s WorldCat database
• Weighted sampling based on number of libraries “holding” the object represented by the record
• 419,657 total MARC records
• 89% of records “full level” cataloging
• Formats represented in test dataset
  • Books: 91%
  • Cartographic Materials: < 1%
  • Electronic resources: < 1%
  • Archival/Mixed Materials: < 1%
  • Sound recordings: 4%
  • Visual Materials: 1%
  • Serials: 3%
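The slide does not describe how the weighted sample was drawn. As an illustration only, the Python sketch below shows one common technique for weighted sampling without replacement (the Efraimidis-Spirakis method), assuming that records held by more libraries should be more likely to be selected. The record ids, holdings counts, and the `weighted_sample` function are hypothetical and are not taken from the Z-Interop project.

```python
import heapq
import random

def weighted_sample(holdings, k, seed=None):
    """Draw k distinct record ids, favoring records held by more libraries.

    Efraimidis-Spirakis method: each record gets the key u ** (1 / weight),
    with u uniform in [0, 1); the k records with the largest keys are kept.
    Records with a non-positive weight are never selected.
    """
    rng = random.Random(seed)
    keyed = (
        (rng.random() ** (1.0 / weight), rec_id)
        for rec_id, weight in holdings.items()
        if weight > 0
    )
    return [rec_id for _, rec_id in heapq.nlargest(k, keyed)]

# Hypothetical holdings counts (number of libraries holding each record).
holdings = {"rec-a": 1200, "rec-b": 15, "rec-c": 300, "rec-d": 0}
sample = weighted_sample(holdings, k=2, seed=42)
print(sample)
```

With this approach a record such as `rec-a` is far more likely to appear in the sample than `rec-b`, while every record with at least one holding keeps a nonzero chance of selection.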
MARC 21 content designation
Content designation in dataset
Summary frequency results
• Total number of field/subfield occurrences in dataset = 13,849,499
• Only 4% of all fields/subfields account for 80% of all occurrences; equivalently, the remaining 96% of fields/subfields account for only 20% of all occurrences
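A concentration figure of this kind comes from tallying occurrences and then asking how many of the most frequent fields/subfields are needed to reach a given share of the total. The Python sketch below is a minimal illustration, not the project's actual tooling. It assumes each record has already been parsed into a list of (tag, subfield code) pairs; the toy records and the helper names `count_occurrences` and `coverage_set` are hypothetical.

```python
from collections import Counter

def count_occurrences(records):
    """Tally how often each (tag, subfield code) pair occurs across all records."""
    counts = Counter()
    for record in records:
        counts.update(record)
    return counts

def coverage_set(counts, threshold=0.80):
    """Return the most frequent pairs that together reach `threshold`
    of all occurrences, plus the share they actually cover."""
    total = sum(counts.values())
    selected, running = [], 0
    for pair, n in counts.most_common():
        if running >= threshold * total:
            break
        selected.append(pair)
        running += n
    return selected, running / total

# Hypothetical toy records, each a list of (tag, subfield code) pairs.
records = [
    [("245", "a"), ("260", "a"), ("260", "b"), ("650", "a"), ("650", "a")],
    [("245", "a"), ("650", "a"), ("650", "a"), ("650", "x")],
    [("245", "a"), ("260", "a"), ("650", "a"), ("700", "a")],
]

counts = count_occurrences(records)
top, share = coverage_set(counts)
print(f"{len(top)} of {len(counts)} fields/subfields cover {share:.0%} of occurrences")
```

Run over a full dataset, the same tally would yield the ranked list from which a "top N" set can be read off.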
Characteristics of top 36
• Most frequently occurring: 650 $a [Subject data]
• 2nd most frequently occurring: 040 $d [Cataloging source]
• 3rd & 4th most frequently occurring: 260 $a & $b [Publication information]
• 5th most frequently occurring: 245 $a [Title]
• Contain data useful to end users: 28
• Contain control numbers, etc.: 5
• Contain data useful to catalogers: 3
Indexing & MARC
• Indexing Guidelines to Support Z39.50 Profile Searches
  • Identified all MARC 21 fields/subfields that may contain author, title, or subject data
    • Author-related fields/subfields: 119
    • AuthorTitle-related fields/subfields: 21
    • Title-related fields/subfields: 253
    • Subject-related fields/subfields: 144
  • 537 fields/subfields contain author, title, or subject data
• Usefulness of indexing all possible fields?
  • How often are these fields/subfields used?
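One way to picture such guidelines is as a mapping from each access point (author, title, subject) to the fields/subfields that feed its index. The Python sketch below is hypothetical: the few mappings shown are a small illustrative subset chosen for the example, not the guidelines' actual lists, and the record structure and the `build_indexes` function are assumptions.

```python
import re
from collections import defaultdict

# Illustrative subset only; the guidelines identify 537 fields/subfields.
INDEX_MAP = {
    "author": {("100", "a"), ("110", "a"), ("700", "a")},
    "title": {("245", "a"), ("245", "b"), ("246", "a")},
    "subject": {("650", "a"), ("651", "a")},
}

def build_indexes(records, index_map=INDEX_MAP):
    """Build one word -> record-id index per access point.

    `records` maps a record id to a list of (tag, subfield code, value) triples.
    Only subfields listed for an access point contribute words to its index.
    """
    indexes = {name: defaultdict(set) for name in index_map}
    for rec_id, fields in records.items():
        for tag, code, value in fields:
            for name, pairs in index_map.items():
                if (tag, code) in pairs:
                    for word in re.findall(r"\w+", value.lower()):
                        indexes[name][word].add(rec_id)
    return indexes

# Hypothetical toy records.
records = {
    "rec1": [("100", "a", "Twain, Mark"),
             ("245", "a", "Adventures of Huckleberry Finn"),
             ("650", "a", "Boys")],
    "rec2": [("245", "a", "Life on the Mississippi"),
             ("700", "a", "Twain, Mark"),
             ("651", "a", "Mississippi River")],
}

indexes = build_indexes(records)
print(sorted(indexes["author"]["twain"]))
```

Adding or removing a pair in `INDEX_MAP` changes which records a search can reach, which is why the choice of fields/subfields to index matters for retrieval.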
Occurrences in test dataset
• 381 of the 537 author, title, or subject fields/subfields occur one or more times in the Z-Interop dataset
  • Author-related fields/subfields: 86
  • AuthorTitle-related fields/subfields: 16
  • Title-related fields/subfields: 178
  • Subject-related fields/subfields: 101
• 19 of the 381 (5%) account for 80% of all occurrences
  • 9 of 19 are subject-related
  • 5 of 19 are author-related
  • 5 of 19 are title-related
• The 19 fields/subfields
Implications for indexing
• What difference do indexing decisions make?
• Preliminary testing using the 19 fields/subfields:
  • 95% - 100% of correct records retrieved!
• Is there a systematic method to identify the “best” fields/subfields to index?
  • Per format of materials?
  • Per user (librarians and end users) needs?
  • Good enough search results?
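A test of this kind compares the records retrieved under a reduced index against a benchmark set of known correct records. The Python sketch below illustrates the idea under stated assumptions: the toy records, the two field/subfield sets, and the `search` and `recall` helpers are all hypothetical and do not reproduce the project's test scenarios or results.

```python
def search(records, term, allowed_pairs):
    """Return ids of records where `term` appears in any indexed field/subfield."""
    term = term.lower()
    return {
        rec_id
        for rec_id, fields in records.items()
        if any(
            (tag, code) in allowed_pairs and term in value.lower()
            for tag, code, value in fields
        )
    }

def recall(retrieved, expected):
    """Fraction of the expected (known correct) records that were retrieved."""
    if not expected:
        return 1.0
    return len(retrieved & expected) / len(expected)

# Hypothetical toy records: (tag, subfield code, value) triples.
records = {
    "r1": [("650", "a", "Whales")],
    "r2": [("650", "a", "Whales"), ("653", "a", "Marine mammals")],
    "r3": [("650", "a", "Whales"), ("650", "a", "Dolphins")],
    "r4": [("650", "a", "Dolphins"), ("653", "a", "Whales")],
}

full_index = {("650", "a"), ("653", "a")}   # index every subject subfield present
reduced_index = {("650", "a")}              # index only the most frequent one

# Benchmark: the result set under the full index is treated as correct.
expected = search(records, "whales", full_index)
retrieved = search(records, "whales", reduced_index)
print(f"recall with reduced index: {recall(retrieved, expected):.0%}")
```

In this toy case the reduced index misses the one record that carries the term only in a rarely used subfield, the kind of loss that the 95% - 100% figure above suggests is small in practice.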
Inquiring minds want to know…
• What is the extent of catalogers’ use of MARC 21 content designation as indicated by analyses of large random samples of MARC records?
• What does the empirical evidence of MARC 21 content designation use suggest about a set of common or core elements in bibliographic records per format or type of material?
• What is the relationship between the availability of new MARC content designation and its subsequent adoption and use?
• What methodology is appropriate to identify and understand factors contributing to catalogers’ utilization of available content designation and the interplay between MARC and the entire cataloging enterprise?
To the future and beyond
• Given solid empirical data on use of MARC content designation…
  • The records are artifacts of the cataloging enterprise – what can we learn about cataloger practices?
  • Are records complete enough to support FRBR applications?
  • What are the implications for standards developers for the evolution of metadata and encoding schemes?
  • Will we XML’ize MARC content designation whether it is used or not?
References
• Assessing Metadata Utilization: An Analysis of MARC Content Designation Use
  http://www.unt.edu/wmoen/publications/MARCPaper_Final2003pdf.pdf
• Z39.50 Interoperability Testbed
  http://www.unt.edu/zinterop/
• Indexing Guidelines to Support Z39.50 Profile Searches
  http://www.unt.edu/zinterop/Documents/IndexingGuidelines1Feb2002.pdf