An Analysis of the Use of MODS in Digital Repositories
By Carrie Moran
Project goal
To examine the Metadata Object Description Schema (MODS) metadata scheme to determine its utility based on structure, interoperability, and metadata quality.
MODS <history>
• Developed by the Library of Congress’ Network Development and MARC Standards Office (Guenther, 2003)
• Purpose: to provide a schema and guidelines for encoding a resource description
MODS <goals>
Goals (from http://www.loc.gov/standards/mods/design-principles-mods-mads.html):
• Support localization and customization needs
• Accommodate widely adopted descriptive practices
• Maintain a relatively small number of elements and attributes to reduce training, application, and implementation costs
• Support the communication of resource and authority descriptions
• Support validation of the encoding
• Allow use of MODS/MADS elements by other standards and in application profiles
• Maintain continuity of structure and content
• Maintain a single way to encode a piece of information
• Accommodate indexing of data in the description
• Accommodate presentation of data in the description
• Make element and attribute names as intelligible as possible to a general audience
• Allow for extensibility to include data from richer element sets
• Accommodate information about the metadata and record itself
• Accommodate conversion to and from other commonly used resource and authority description encodings (such as Dublin Core, MARC, VRA Core)
• Accommodate controlled vocabularies that are commonly used in resource and authority description
• Allow full description of whole-to-part and similar types of relationships
• Support encoding a description for any type of resource
• Support encoding the relationship of an agent to a resource
MODS <community>
• Implementation Registry: projects using MODS that are in planning, in progress, and completed
• There are currently 34 projects in the Implementation Registry
• MODS is currently being used for a variety of purposes and formats
• Example: used by UC Berkeley for Computer Science Technical Reports; Archival, Rare and Fragile Collections; and Digitized Tables of Content
MODS <structure>
• Expressed in XML format
• Composed of 20 top-level elements and 56 sub-elements
• Each element can be combined with attributes to allow for more precise records
• Each element can be used multiple times throughout a single record, with the exception of <recordInfo>
• There are no mandatory or standard elements
• Elements can be presented in any order
MODS <elements>
Top-level elements:
• <abstract>
• <accessCondition>
• <classification>
• <extension>
• <genre>
• <identifier>
• <language>
• <location>
• <name>
• <note>
• <originInfo>
• <part>
• <physicalDescription>
• <recordInfo>
• <relatedItem>
• <subject>
• <tableOfContents>
• <targetAudience>
• <titleInfo>
• <typeOfResource>
MODS <Example record>
<mods version="3.3">
  <titleInfo>
    <title>Learning XML [electronic resource]</title>
  </titleInfo>
  <name type="personal">
    <namePart>Ray, Erik T.</namePart>
  </name>
  <typeOfResource>text</typeOfResource>
  <originInfo>
    <place>
      <placeTerm type="text">Beijing</placeTerm>
    </place>
    <place>
      <placeTerm type="text">Cambridge, Mass.</placeTerm>
    </place>
    <publisher>O'Reilly</publisher>
    <dateIssued>2001</dateIssued>
  </originInfo>
  <language>
    <languageTerm authority="iso639-2b" type="code">eng</languageTerm>
  </language>
  <physicalDescription>
    <form authority="marccategory">electronic resource</form>
    <extent>1 online resource (xii, 354 p.): ill.</extent>
  </physicalDescription>
  <note type="statement of responsibility">Erik T. Ray</note>
  <note type="source of description note">Description based on print version record.</note>
  <subject authority="lcsh">
    <topic>XML (Document markup language)</topic>
  </subject>
  <identifier type="uri">http://proquest.safaribooksonline.com/0596000464</identifier>
  <recordInfo>
    <recordCreationDate>2011-04-14</recordCreationDate>
  </recordInfo>
</mods>
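Because the record is plain XML, it can be read with any XML parser. The following is a minimal sketch, assuming Python's standard xml.etree.ElementTree module and an abbreviated copy of the example record with no XML namespace declared, as on the slide. The summarize function and its field names are hypothetical and chosen only for illustration.

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of the example record (no namespace, as shown on the slide).
RECORD = """<mods version="3.3">
  <titleInfo><title>Learning XML [electronic resource]</title></titleInfo>
  <name type="personal"><namePart>Ray, Erik T.</namePart></name>
  <typeOfResource>text</typeOfResource>
  <originInfo><publisher>O'Reilly</publisher><dateIssued>2001</dateIssued></originInfo>
  <subject authority="lcsh"><topic>XML (Document markup language)</topic></subject>
</mods>"""


def summarize(mods_xml):
    """Pull a few descriptive fields out of one MODS record (hypothetical helper)."""
    root = ET.fromstring(mods_xml)
    return {
        "title": root.findtext("titleInfo/title"),
        "names": [part.text for part in root.findall("name/namePart")],
        "date_issued": root.findtext("originInfo/dateIssued"),
        # Top-level elements in document order; MODS does not fix this order.
        "top_level": [child.tag for child in root],
    }


summary = summarize(RECORD)
print(summary["title"])
print(summary["top_level"])
```

Records published by a repository usually declare the MODS namespace (http://www.loc.gov/mods/v3), in which case the paths would need namespace-qualified names.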
MODS <guidelines>
The MODS Guidance Page (http://www.loc.gov/standards/mods/mods-guidance.html) contains links to:
• MODS User Guidelines
• MODS Note Types
• Sample MODS Version 3 XML Documents
• MARC Code Lists Available as Linked Data
• Sources
• Value lists
MODS <Guidelines>
MODS User Guidelines (Version 3), available at http://www.loc.gov/standards/mods/userguide/
Contents:
• Introduction and Implementation
  - XML Structures
  - Implementation Notes
• MODS Elements and Attributes
  - Top Level Elements in MODS
  - Attributes Used Throughout the MODS Schema
• MODS "Lite"
• MODS Full Record Examples
• Alphabetical Index of MODS Elements by Element Name
MODS <guidelines>
• Each top-level element has its own page listing its definition, attributes, and sub-elements
• The top-level element pages also provide guidelines, a description, examples, and mappings
• Extensive guidelines enhance metadata creators’ ability to create complete, accurate, and consistent records
MODS <conversions>
• One of the goals of MODS is “Accommodate conversion to and from other commonly used resource and authority description encodings”
• This goal is achieved through the provision of mappings, stylesheets, and conversion tools
• The Conversions page on the MODS website links to websites, Excel files, and XML files for the following schemes: MARC, RDA, Dublin Core, and MARCXML
• http://www.loc.gov/standards/mods/mods-conversions.html
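To illustrate what a mapping of this kind does, here is a minimal sketch of a crosswalk from MODS to simple Dublin Core, assuming Python's standard xml.etree.ElementTree module and a record without a namespace. The CROSSWALK table is a hypothetical, partial set of pairings chosen for illustration; it is not the Library of Congress mapping or stylesheet, which should be consulted for real conversions.

```python
import xml.etree.ElementTree as ET

# Hypothetical, partial crosswalk: MODS path -> simple Dublin Core element name.
CROSSWALK = [
    ("titleInfo/title", "title"),
    ("name/namePart", "creator"),
    ("subject/topic", "subject"),
    ("originInfo/publisher", "publisher"),
    ("originInfo/dateIssued", "date"),
    ("typeOfResource", "type"),
    ("identifier", "identifier"),
    ("language/languageTerm", "language"),
]

# Abbreviated record based on the example record in this presentation.
RECORD = """<mods version="3.3">
  <titleInfo><title>Learning XML [electronic resource]</title></titleInfo>
  <name type="personal"><namePart>Ray, Erik T.</namePart></name>
  <typeOfResource>text</typeOfResource>
  <originInfo><publisher>O'Reilly</publisher><dateIssued>2001</dateIssued></originInfo>
  <language><languageTerm authority="iso639-2b" type="code">eng</languageTerm></language>
</mods>"""


def mods_to_dc(mods_xml):
    """Return (dc_element, value) pairs for every non-empty mapped MODS element."""
    root = ET.fromstring(mods_xml)
    pairs = []
    for path, dc_name in CROSSWALK:
        for node in root.findall(path):
            if node.text and node.text.strip():
                pairs.append((dc_name, node.text.strip()))
    return pairs


dc = mods_to_dc(RECORD)
for dc_name, value in dc:
    print(f"dc:{dc_name} = {value}")
```

A flat table like this loses detail that MODS carries in attributes and sub-elements (for example, the name type), which is one reason conversions from a richer scheme to a simpler one are lossy.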
MODS <METS>
• The Metadata Encoding & Transmission Standard (METS) was also developed by the Library of Congress
• METS is “a standard for encoding descriptive, administrative, and structural metadata regarding objects within a digital library” (Library of Congress, 2011)
• METS was designed to facilitate the management and exchange of digital objects across repositories
• MODS is frequently used within the Descriptive Metadata section of a METS record
• Nesting MODS information within a METS record enhances the interoperability of MODS records across repositories
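The nesting described above can be sketched as follows, assuming Python's standard xml.etree.ElementTree module. The wrap_in_mets function and the identifier "DMD1" are hypothetical. The output is only a fragment showing where MODS sits inside the descriptive metadata section; it is not a complete, valid METS document (it omits the structural map, for instance).

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
MODS_NS = "http://www.loc.gov/mods/v3"
ET.register_namespace("mets", METS_NS)
ET.register_namespace("mods", MODS_NS)


def wrap_in_mets(mods_element, dmd_id="DMD1"):
    """Place a MODS element inside mets/dmdSec/mdWrap/xmlData (hypothetical helper)."""
    mets = ET.Element(f"{{{METS_NS}}}mets")
    dmd_sec = ET.SubElement(mets, f"{{{METS_NS}}}dmdSec", {"ID": dmd_id})
    # MDTYPE declares which descriptive scheme the wrapped metadata uses.
    md_wrap = ET.SubElement(dmd_sec, f"{{{METS_NS}}}mdWrap", {"MDTYPE": "MODS"})
    xml_data = ET.SubElement(md_wrap, f"{{{METS_NS}}}xmlData")
    xml_data.append(mods_element)
    return mets


# A one-element MODS record, built here only to have something to wrap.
mods = ET.Element(f"{{{MODS_NS}}}mods")
title_info = ET.SubElement(mods, f"{{{MODS_NS}}}titleInfo")
ET.SubElement(title_info, f"{{{MODS_NS}}}title").text = "Learning XML"

mets_doc = wrap_in_mets(mods)
print(ET.tostring(mets_doc, encoding="unicode"))
```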
Controlled vocabularies
• The MODS scheme allows for the use of any controlled vocabulary
• Controlled vocabularies enhance the specificity of item records and the interoperability between records using the same vocabularies
• The “authority” attribute can be used with six of the top-level elements to designate which controlled vocabulary is being used for that particular element
• Example:
  <subject authority="lcsh">
    <topic>History</topic>
    <geographic>United States</geographic>
  </subject>
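One way to see which vocabulary each subject term comes from is to read the authority attribute directly. This is a minimal sketch, assuming Python's standard xml.etree.ElementTree module and a record without a namespace. The subject_terms function is hypothetical, and the second subject in SAMPLE (with no authority) is invented to show how an uncontrolled term would appear.

```python
import xml.etree.ElementTree as ET

SAMPLE = """<mods>
  <subject authority="lcsh">
    <topic>History</topic>
    <geographic>United States</geographic>
  </subject>
  <subject>
    <topic>local keyword</topic>
  </subject>
</mods>"""


def subject_terms(mods_xml):
    """Return (authority, sub-element, text) for every term under <subject>."""
    root = ET.fromstring(mods_xml)
    terms = []
    for subject in root.findall("subject"):
        authority = subject.get("authority")  # None when no vocabulary is declared
        for term in subject:
            terms.append((authority, term.tag, term.text))
    return terms


terms = subject_terms(SAMPLE)
for authority, tag, text in terms:
    print(authority or "uncontrolled", tag, text)
```

Note that the attribute only declares a vocabulary; confirming that a term actually belongs to that vocabulary would require checking it against the vocabulary itself.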
Analysis
• To test the effectiveness of MODS in a real-world setting, three repositories were chosen from the MODS Implementation Registry
• Repositories were chosen based on the availability of MODS records for public view
• Twenty-five records from each repository were analyzed for controlled vocabulary usage, completeness, accuracy, and consistency
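A completeness tally of this kind (which top-level elements appear in every, some, or no sample records) could be computed along the following lines. This is a sketch only, assuming Python's standard library; it is not the procedure used in this analysis. The element_coverage function is hypothetical, and the two sample strings are invented, not records from the three repositories.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# The 20 top-level MODS elements listed earlier in this presentation.
TOP_LEVEL = [
    "abstract", "accessCondition", "classification", "extension", "genre",
    "identifier", "language", "location", "name", "note", "originInfo",
    "part", "physicalDescription", "recordInfo", "relatedItem", "subject",
    "tableOfContents", "targetAudience", "titleInfo", "typeOfResource",
]


def element_coverage(records):
    """Group top-level elements by whether every, some, or no records use them."""
    counts = Counter()
    for xml_text in records:
        root = ET.fromstring(xml_text)
        # Count each element once per record; drop any "{namespace}" prefix.
        counts.update({child.tag.rsplit("}", 1)[-1] for child in root})
    total = len(records)
    return {
        "every": [n for n in TOP_LEVEL if counts[n] == total],
        "some": [n for n in TOP_LEVEL if 0 < counts[n] < total],
        "none": [n for n in TOP_LEVEL if counts[n] == 0],
    }


# Invented sample records for illustration.
records = [
    "<mods><titleInfo><title>A</title></titleInfo><genre>map</genre></mods>",
    "<mods><titleInfo><title>B</title></titleInfo>"
    "<subject><topic>X</topic></subject></mods>",
]
coverage = element_coverage(records)
print(coverage["every"], coverage["some"], len(coverage["none"]))
```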
Repositories
Copac
• http://copac.ac.uk/
• Catalog containing records from 71 libraries
• No guidelines for metadata usage provided on their website
University of Florida Digital Collections
• http://ufdc.ufl.edu/
• Over 300 distinct digital collections
• All metadata built using SobekCM open source software
• Website contains extensive guidelines for the use of MODS and METS in their collections
Library of Congress Web Archives
• http://lcweb2.loc.gov/diglib/lcwa/html/lcwa-home.html
• 15 collections of archived websites
• Website provides a short but detailed Technical Information page outlining metadata usage and application
Copac <Controlled Vocabulary Usage>
• 80% of records used the MARC Genre Term list for the <genre> element
• 12% of records used the <subject> element; of these, 2 used Library of Congress Subject Headings (LCSH) and 1 used uncontrolled vocabulary terms
• When controlled vocabularies were used, they were implemented properly
Copac <metadata quality>
• 5 of 20 top-level elements were used in every sample record
• 5 of 20 top-level elements were not used in any sample records
• Many elements were used only in records to which they apply, e.g. <language> was used for written materials but not photographs
• Only 12% of sample records made use of the <subject> element. This is problematic because subject searching is often a first step in the search process
• None of the records used the <typeOfResource> element, which means that users cannot sort or browse by type
UFDC <Controlled Vocabulary Usage>
• 72% of sample records used the <subject> element, and each of these elements used LCSH
• 8% of records used the MARC Genre Term list for the <genre> element
• When controlled vocabularies were used, they were implemented properly
• A majority of records using the <subject> element used exactly the same terms
• This makes it difficult to distinguish between collection items based on subject alone
UFDC <metadata quality>
• 3 of 20 top-level elements were used in every sample record
• 6 of 20 top-level elements were used in no sample records
• Of the remaining top-level elements, 5 were used in a majority of records
• As mentioned previously, much of the inconsistency in usage can be attributed to the fact that not all elements apply to every record
• UFDC sample records made extensive use of sub-elements and attributes
LOC <Controlled Vocabulary Usage>
• 100% of sample records used the <subject> element with LCSH subject terms
• Many of the records also used the Thesaurus for Graphic Materials (TGM) and uncontrolled subject terms
• Several records used the Library of Congress Name Authority File for the <name> element
• The use of controlled vocabulary terms was implemented correctly in all records examined
LOC <metadata quality>
• 14 of 20 top-level elements were used in every sample record
• 4 of 20 top-level elements were not used in any sample records
• <name> and <targetAudience> were the only top-level elements used in only some records
• <targetAudience> is not frequently determined on websites, and is an element that is likely to be used only for certain items
• The inconsistent use of the <name> element (in only 5 records) is troubling because one would expect some type of personal or corporate name to be associated with a majority of websites
Comparison
• All three collections contained metadata of relatively good quality
• Elements were applied accurately and consistently throughout the collections
• The LOC repository is clearly the most complete and consistent. The limited scope of its collections, combined with the fact that the LOC developed both the MODS scheme and the repository, is the likely cause of this completeness
• The UFDC and Copac repositories are less complete and less consistent in which elements they use; however, the UFDC’s use of sub-elements and attributes gives it an edge over Copac
• The UFDC and Copac collections contain a much wider variety of materials, which is evident in their application of metadata
Conclusion
• Each repository examined used the MODS scheme correctly and consistently across sample records
• This speaks to the effectiveness of the MODS scheme and the availability of guidelines and mapping information
• The MODS element set is designed to enhance quality while allowing for flexibility
• The MODS guidelines are thorough, and the number of elements, sub-elements, and attributes works to limit semantic challenges in the application of elements
• This examination has shown MODS to be a well-structured, interoperable scheme that can be used to create high-quality metadata records
References
Guenther, R. S. (2003). MODS: The Metadata Object Description Schema. portal: Libraries and the Academy, 3(1), 137-150.
Library of Congress. (2009). Design Principles for Enhancements to MODS and MADS. Retrieved from http://www.loc.gov/standards/mods/design-principles-mods-mads.html
Library of Congress. (2011). Metadata Encoding and Transmission Standard. Retrieved from http://www.loc.gov/standards/mets/
I certify that:
· This paper/project/exam is entirely my own work.
· I have not quoted the words of any other person from a printed source or a website without indicating what has been quoted and providing an appropriate citation.
· I have not submitted this paper/project to satisfy the requirements of any other course.
Signature: Carrie E. Moran
Date: May 28, 2011