
MARC Content Designation and Utilization

Inquiry and Analysis. MARC Content Designation and Utilization. Future of MARC: Challenges and Opportunities of 21st Century Cataloging. William E. Moen <wemoen@unt.edu>, School of Library and Information Sciences, Texas Center for Digital Knowledge, University of North Texas.

Presentation Transcript


  1. Inquiry and Analysis. MARC Content Designation and Utilization. Future of MARC: Challenges and Opportunities of 21st Century Cataloging. William E. Moen <wemoen@unt.edu>, School of Library and Information Sciences, Texas Center for Digital Knowledge, University of North Texas. Research funded by a National Leadership Grant from the Institute for Museum and Library Services. Additional support provided by the University of North Texas School of Library and Information Sciences and the Texas Center for Digital Knowledge.

  2. To start… • Discussion of the future of MARC is only partially about MARC • The broader digital information landscape • Technologies • Cataloging practices • The possible diminishing market share of: • Libraries in the information marketplace • Library catalogs as a resource discovery tool Massachusetts Library Association Conference -- May 2007 -- Sturbridge, MA

  3. Calhoun’s report Today, a large and growing number of students and scholars routinely bypass library catalogs in favor of other discovery tools, and the catalog represents a shrinking proportion of the universe of scholarly information. The catalog is in decline, its processes and structures are unsustainable, and change needs to be swift. Today’s research library catalogs—even those that include records for thousands of scholarly e-journals and databases—reflect only a small portion of the expanding universe of scholarly information. Library catalogs manage description and access for mostly published resources—tangible materials such as books, serials, and audiovisual media, plus licensed materials such as abstracting and indexing services, full text databases, and electronic journals and books… In contrast, the stuff of cultural heritage collections, digital assets, pre-print services and the open Web, research labs, and learning management systems remain for the most part outside the scope of the catalog.

  4. When we say MARC? • Record format • Defined by ISO 2709/ANSI Z39.2 • Structural elements of the format • Metadata scheme • Defined by MARC 21 • Fields, subfields, indicators and their semantics
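The split between record format and metadata scheme can be made concrete with a small sketch. The Python below is not from the presentation: it hand-builds a tiny record in the ISO 2709 layout (a 24-character leader, a directory of 12-character entries, then the variable fields) and parses it back. The helper names `build_record` and `parse_record` and the sample 001/245 data are hypothetical, and the sketch ignores most leader semantics and character-set handling that production records require.

```python
FT = b"\x1e"  # field terminator
RT = b"\x1d"  # record terminator
SF = b"\x1f"  # subfield delimiter


def build_record(fields):
    """Serialize (tag, data_bytes) pairs into a minimal ISO 2709 record."""
    directory = b""
    body = b""
    for tag, data in fields:
        data = data + FT
        # Directory entry: 3-char tag, 4-digit field length, 5-digit start offset.
        directory += f"{tag}{len(data):04d}{len(body):05d}".encode("ascii")
        body += data
    base = 24 + len(directory) + 1   # leader + directory + directory terminator
    total = base + len(body) + 1     # plus the record terminator
    leader = f"{total:05d}nam a22{base:05d} a 4500".encode("ascii")
    return leader + directory + FT + body + RT


def parse_record(raw):
    """Split a record into its leader and a list of parsed fields."""
    leader = raw[:24].decode("ascii")
    base = int(leader[12:17])            # base address of data
    directory = raw[24:base - 1]         # drop the directory's terminator
    fields = []
    for i in range(0, len(directory), 12):
        entry = directory[i:i + 12].decode("ascii")
        tag, length, start = entry[:3], int(entry[3:7]), int(entry[7:12])
        data = raw[base + start:base + start + length - 1]  # strip terminator
        if tag < "010":
            # Control fields (00X) carry no indicators or subfields.
            fields.append((tag, data.decode("utf-8")))
        else:
            indicators = data[:2].decode("ascii")
            subfields = [(chunk[:1].decode("ascii"), chunk[1:].decode("utf-8"))
                         for chunk in data[2:].split(SF)[1:]]
            fields.append((tag, indicators, subfields))
    return leader, fields


record = build_record([
    ("001", b"demo0001"),
    ("245", b"10" + SF + b"aAn example title /" + SF + b"cA. Cataloger."),
])
leader, fields = parse_record(record)
```

Everything the parser does here (leader, directory, terminators) belongs to the record format; deciding that 245 $a means "title" is the job of the metadata scheme, MARC 21.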

  5. Approaching MARC’s future • Requirements for a record format / metadata scheme • Responding to recent developments • Looking at empirical data

  6. Thinking about requirements • Goldsmith & Knudson’s requirements for LANL’s DL repository • Granularity • lossless data mapping without losing the finer shades of meaning intrinsic to the original data • Transparency • necessary for seamless data interchange, requiring a standard widely known throughout the digital library community • Extensibility • in order to permit changes to the general structure without breaking the whole or requiring reprocessing of already ingested materials • Tennant’s Requirements for Bibliographic Infrastructure • XML-based format • Modularity • Hierarchy support • Community-supported tool sets • And others…

  7. Thinking about requirements • McCallum’s 10 format attributes for MARC Forward • XML • Granularity • Versatility • Extensibility • Modularity • Hierarchy support • Crosswalks • Tools • Cooperative management • Pervasive

  8. Recent developments • Functional Requirements for Bibliographic Records • IFLA Study Group on Functional Requirements for Bibliographic Records, 1992-1995 • “A conceptual model for the bibliographic universe” (B. Tillett, 2003). The aim of the study was to produce a framework that would provide a clear, precisely stated, and commonly shared understanding of what it is that the bibliographic record aims to provide information about, and what it is that we expect the record to achieve in terms of answering user needs.

  9. The FRBR model • Based on Entity-Relationship modeling • Entity – something that can be described • Attributes – the features of the entity that characterize it • Relationships between entities • Three groups of entities in model • Group 1: Products of intellectual or artistic endeavor • Group 2: Entities responsible for the intellectual or artistic content, the physical production, etc. • Group 3: Entities that serve as the subjects of intellectual or artistic endeavor • Remember: what it is that the bibliographic record aims to provide information about
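As an illustration of the entity-relationship idea, the Group 1 entities can be sketched as nested Python dataclasses. The entity names (Work, Expression, Manifestation, Item) follow the published FRBR report rather than this transcript, whose Group 1 diagram did not survive; the attribute names and the `count_items` helper are hypothetical and chosen only to show entities, attributes, and relationships side by side.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Item:
    barcode: str  # hypothetical attribute of a single physical copy


@dataclass
class Manifestation:
    publisher: str
    year: int
    items: List[Item] = field(default_factory=list)  # "is exemplified by"


@dataclass
class Expression:
    language: str
    manifestations: List[Manifestation] = field(default_factory=list)  # "is embodied in"


@dataclass
class Work:
    title: str
    expressions: List[Expression] = field(default_factory=list)  # "is realized through"


def count_items(work: Work) -> int:
    """Walk the relationships from a work down to its individual items."""
    return sum(len(m.items) for e in work.expressions for m in e.manifestations)


example_work = Work(
    title="An Example Work",
    expressions=[
        Expression("eng", [Manifestation("Example Press", 2001, [Item("0001"), Item("0002")])]),
        Expression("fre", [Manifestation("Editions Exemple", 2003, [Item("0003")])]),
    ],
)
```

A catalog built on this shape can collocate every edition and translation under one work, which is the linking improvement the later slides refer to.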

  10. FRBR – Group 1 Entities

  11. FRBR – Group 2 Entities

  12. FRBR – Group 3 Entities

  13. FRBR user tasks • Remember: what it is that we expect the record to achieve in terms of answering user needs • Four user tasks: • Find: Discovering if something exists by searching one or more attributes • Identify: Examining retrieved records to determine the items that meet the user’s search request • Select: Examining retrieved records for those that meet other user needs/requirements • Obtain: Using data in retrieved records to gain physical access to the described object

  14. Impact on cataloging and catalogs • Introduces new terminology and conceptual model incorporated in: • RDA • Statement on cataloging principles • Assisting in understanding better the range of relationships in the bibliographic universe • Collocation function of the catalog • Improve linking mechanisms • Implementation in catalogs to improve user experience

  15. Recent developments • Revision of the Anglo-American Cataloguing Rules • No AACR3 • Resource Description and Access (RDA) • Focus on guidelines for content creation • Separation from syntax or record format • Designing the future -- Library Systems and Data Formats (wiki) • Grassroots effort to address next generation library catalog and data format

  16. Metadata • Essential in library applications • Variety of metadata schemes • Variety of functions and services supported • Increasing use of machine-generated metadata • Role of handcrafted metadata needs continuing review and assessment • Research on use of metadata schemes can provide empirical data for decisions

  17. Metadata record as artifact • Metadata creation as process • Resulting metadata records as artifacts of the process • Artifact reflects decisions, policies… • Artifact can be investigated to understand metadata utilization decisions • Decisions to use or not use available metadata elements

  18. Metadata – rules & practice • Library catalogers create metadata – bibliographic records • Follow cataloging rules and other standards to create the bibliographic data • Encode the bibliographic data into MARC records • MARC – communications format and metadata scheme • Approximately 2,000 structures for encoding data

  19. Richness of MARC

  20. What do catalogers use? • Given the cataloging rules… • Given the detailed structuring of bibliographic data in MARC records… • Given training of the catalogers… • Given local policies and practices… • What can we learn by examining a large set of MARC bibliographic records?

  21. Why study MARC utilization? • Standard record structure for exchange of descriptive and other types of metadata • Evolved since late 1960s as key mechanism for sharing metadata among libraries • Metadata record with approximately 2,000 elements available • Approximately 200 fields • Approximately 1,800 subfields or other structures • To what extent is the richness/complexity exploited and to what purpose? • See Goldsmith and Knudson regarding Los Alamos Research Library choice of a metadata scheme: “Although often disparaged or dismissed in the library community, the MARC standard, notably the MARCXML standard, provides surprising flexibility and robustness for mapping disparate metadata to a vendor-neutral format for storage, exchange, and downstream use.”

  22. Occurrence summary • Only 4% of all fields/subfields account for 80% of all occurrences • 96% of all fields/subfields account for only 20% of all occurrences

  23. The MCDU Project • MARC Content Designation Utilization • Provide empirical evidence of catalogers’ use of MARC content designation • Identify commonly used elements of bibliographic records • Contribute to community discussion about core elements in MARC bibliographic records • Explore the evolution of MARC content designation • Develop research approach to understand the factors influencing levels of MARC content designation use

  24. Project deliverables • Reports containing results of analysis of utilization • Reports addressing commonly used elements • Across formats • In context of national recommendations (e.g., BIBCO) • In context of FRBR user tasks • HistoriMARC • Database of MARC historical information about evolution of fields/subfields, etc. • Enable analysis of patterns of adoption and utilization • A methodology to understand factors influencing catalogers’ use of MARC • Software tools and methods for others to use

  25. Dataset and preparation • 56,177,383 MARC 21 Bibliographic Records from OCLC WorldCat • Decomposed the records to store in MySQL • Parsing Tool • 82 hours to process and load records • 295 GB final database size (with indexing) • Structuring of decomposed records aligns with analytical questions
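What "decomposing" records into a relational store might look like can be sketched in a few lines. The presentation does not give the project's schema, so the single `occurrence` table, its column names, and the functions `load_occurrences` and `count_by_element` below are all assumptions. Python's built-in `sqlite3` stands in for MySQL so the example runs without a server.

```python
import sqlite3


def load_occurrences(conn, records):
    """Store one row per subfield occurrence; returns the number of rows added.

    `records` is an iterable of (record_id, fields), where fields is a list of
    (tag, [(subfield_code, value), ...]) pairs. This layout is hypothetical.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS occurrence ("
        "record_id TEXT, tag TEXT, subfield TEXT, value TEXT)"
    )
    rows = [
        (record_id, tag, code, value)
        for record_id, fields in records
        for tag, subfields in fields
        for code, value in subfields
    ]
    conn.executemany("INSERT INTO occurrence VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return len(rows)


def count_by_element(conn):
    """Total occurrences of each field/subfield combination, most frequent first."""
    cursor = conn.execute(
        "SELECT tag, subfield, COUNT(*) AS n FROM occurrence "
        "GROUP BY tag, subfield ORDER BY n DESC, tag, subfield"
    )
    return cursor.fetchall()


sample_records = [
    ("r1", [("245", [("a", "Title one"), ("c", "Author A")]),
            ("650", [("a", "Cataloging")])]),
    ("r2", [("245", [("a", "Title two")])]),
]
connection = sqlite3.connect(":memory:")
loaded = load_occurrences(connection, sample_records)
totals = count_by_element(connection)
```

Storing one row per occurrence trades space for simplicity: frequency questions become plain `GROUP BY` queries, which is one way a structure could "align with analytical questions".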

  26. Additional data preparation • Analysis required determining frequency counts by format of material (ten) • Concern about significant differences in patterns of utilization between Library of Congress and OCLC member cataloging • Partitioned decomposed data into 20 databases • Based on source of cataloging • Based on format of material

  28. Categories of questions • General profile of the dataset (e.g.): • What is the distribution of records by Type of Record? • What is the distribution of records by Encoding Level? • Occurrences of content designation structures: • What is the number of total occurrences of all control and data fields and how many unique field tags are used? • In how many and in what percentage of records is each unique field/subfield combination used at least once?
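The last question on this slide, in how many and in what percentage of records each field/subfield combination appears at least once, reduces to a per-record set count. The following Python is a minimal sketch under that reading; the function name `record_coverage` and the sample data are hypothetical and are not the project's actual tooling.

```python
from collections import Counter


def record_coverage(records):
    """Map each (tag, subfield) pair to (record count, percentage of records).

    `records` holds one list of (tag, subfield_code) pairs per record.
    """
    seen_in = Counter()
    for pairs in records:
        # A set counts each combination once per record, however often it repeats.
        seen_in.update(set(pairs))
    total = len(records)
    return {pair: (n, 100.0 * n / total) for pair, n in seen_in.items()}


sample = [
    [("245", "a"), ("245", "c"), ("650", "a"), ("650", "a")],
    [("245", "a")],
    [("245", "a"), ("260", "b")],
    [("245", "a")],
]
coverage = record_coverage(sample)
```

Note the difference from a total-occurrence count: the repeated 650 $a in the first sample record contributes one record here, but two occurrences in a raw frequency tally.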

  29. Example results • 7,595,887 LC-created records in dataset • Type of Record: Books, Pamphlets, and Printed Sheets • Total number of unique fields occurring: 167 • Number of fields accounting for 80% of occurrences: 14 fields (8.3%) • Number of fields accounting for 90% of occurrences: 21 fields (12.6%) • Approximately 110 fields (66%) occur in less than 1% of all records [Note: Fields are cataloger-supplied, not system-supplied]
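Figures such as "14 fields account for 80% of occurrences" come from ranking fields by frequency and accumulating until the target share is reached. The sketch below shows that calculation in Python; the function name `fields_covering` and the toy counts are hypothetical and are not MCDU data.

```python
def fields_covering(counts, percent):
    """Smallest number of top-ranked fields whose occurrences reach `percent`.

    `counts` maps a field tag to its total occurrences; `percent` is an integer
    from 0 to 100. Integer arithmetic avoids floating-point edge cases.
    """
    total = sum(counts.values())
    running = 0
    ranked = sorted(counts.values(), reverse=True)
    for n, occurrences in enumerate(ranked, start=1):
        running += occurrences
        if running * 100 >= percent * total:
            return n
    return len(ranked)


toy_counts = {"245": 50, "100": 30, "650": 10, "500": 6, "730": 4}
needed_for_80 = fields_covering(toy_counts, 80)
needed_for_90 = fields_covering(toy_counts, 90)
```

In the toy data, two of five fields reach the 80% mark and three reach 90%, the same long-tail shape the slide reports at much larger scale.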

  31. Making sense of numbers • Frequency counts provide raw but informative data • Threshold – concept to delineate a change in trend in utilization • Determining commonly occurring elements • Comparing to recommended core records • Comparing to recommendations for national level records • Comparing to the FRBR user tasks data

  32. Element use and FRBR tasks • FRBR describes four user tasks • Find • Identify • Select • Obtain • Are library catalogers providing data to support FRBR tasks? • Delsey mapped these tasks to MARC content designation structures (CDS) for FRBR entities

  33. FRBR user task: Find (search) • MARC 21 fields/subfields that can contain author, title, or subject data • Author-related fields/subfields: 119 • Author/title-related fields/subfields: 21 • Title-related fields/subfields: 253 • Subject-related fields/subfields: 144 • In FRBR context, Delsey identified: • Approximately 460 fields/subfields can support this task for the FRBR entities • In MCDU dataset, only 59 (13%) of these occur at or above the threshold of use in OCLC book records

  34. Questions for consideration? • What is needed in a bibliographic record? • Support for the four user tasks? • In context of FRBR, what does it mean to support a user task? • Management of information resources? • How do your systems use the infrequently used data? • What about the 62% of all fields used in less than 1% of the records?

  35. Questions for consideration? • Can you argue persuasively for the cost/benefit of your existing practice? • Should the focus be on high-value, high-impact, high-quality data in a few fields/subfields? • Can you identify these few fields/subfields? • What would it mean for costs of cataloging? • What would this mean for training? • Can MCDU results inform your local practices?

  36. New cataloging practices? • Select the appropriate metadata scheme. • Use level of description and schema (DC, LOM, VRA Core, etc.) appropriate to the bibliographic resource. Don’t apply MARC, AACR2, and LCSH to everything. • Consider …abandoning the use of controlled vocabularies [LCSH, MeSH, etc.] for topical subjects in bibliographic records. • Manually enrich metadata in important areas • Enhance name, main title, series titles, and uniform titles for prolific authors in music, literature, and special collections. • Automate metadata creation • Encourage the creation of metadata by vendors, and its ingestion into our catalog as early as possible in the process. • Import enhanced metadata whenever, wherever it is available from vendors and other sources. (Source: Rethinking How We Provide Bibliographic Services for the University of California, December 2005)

  37. Confluence for change • Within library community… • Influence of FRBR concepts and model for metadata • Resource Description and Access (RDA) • Re-examination of library catalog and its position within the landscape of resource discovery tools • Development of a bibliographic metadata element set • Next generation “MARC”

  38. References • MARC Content Designation Utilization Project. http://www.mcdu.unt.edu/ • Moen and Benardino. (2003). Assessing Metadata Utilization: An Analysis of MARC Content Designation Use. http://www.unt.edu/wmoen/publications/MARCPaper_Final2003pdf.pdf • Goldsmith and Knudson. (2006). Repository Librarian and the Next Crusade: The Search for a Common Standard for Digital Repository Metadata. http://www.dlib.org/dlib/september06/goldsmith/09goldsmith.html • Roy Tennant. (2004). A Bibliographic Metadata Infrastructure for the Twenty-first Century. • Sally H. McCallum. (2006). MARC Forward. http://www.rlg.org/en/pdfs/Forum.8-06.McCallum.pdf

  39. References • Designing the future -- Library Systems and Data Formats. http://futurelib.pbwiki.com/ • Barbara Tillett. (2003). What is FRBR? A Conceptual Model for the Bibliographic Universe. http://www.loc.gov/cds/downloads/FRBR.PDF • Karen Calhoun. (2006). The Changing Nature of the Catalog and its Integration with Other Discovery Tools. http://www.loc.gov/catdir/calhoun-report-final.pdf • Bibliographic Services Task Force. (2005). Rethinking How We Provide Bibliographic Services for the University of California. http://libraries.universityofcalifornia.edu/sopag/BSTF/Final.pdf
