MAINTAINING QUALITY METADATA: TOWARD EFFECTIVE DIGITAL RESOURCE LIFECYCLE MANAGEMENT • Daniel Gelaw Alemneh • University of North Texas
University of North Texas (UNT) Libraries Digital Initiatives • Collaborative Initiatives • CyberCemetery • GPO • NARA – Affiliated Archive • Texas Register Archive • Secretary of State’s Office • Texas Laws and Resolutions Archive • Secretary of State’s Office • The Portal to Texas History • 45 Libraries & Museums • Web-at-Risk Project • California Digital Library • New York University • National Digital Newspaper Program (NDNP) • Between 1836 and 1922. ICKM 2008
University of North Texas (UNT) Libraries Digital Initiatives • Library Digital Collections: • Congressional Research Service Archive • 10,000+ CRS Reports • World War Poster Collection • 500 WWI and WWII Posters • Advisory Commission on Intergovernmental Relations • 408 reports = 47,874 pages • Federal Communications Commission (FCC) Record • 136 issues = 43,115 pages (6 of 21 volumes completed) • Electronic Theses and Dissertations (ETDs) • 3000+ more in queue • Jean-Baptiste Lully (Music) Collection • 27 scores = 10,000 pages • Other digitization projects • http://www.library.unt.edu/libraries-and-collections/digital-collections
Metadata Environment • Metadata-based digital resource management activities • UNT Libraries metadata: locally qualified, Dublin Core-based descriptive metadata • Detailed technical and preservation metadata elements • Web-based metadata creation and editing • Interoperability • Metadata crosswalks • MODS • MARC • oai_dc • PREMIS
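A minimal sketch of what a crosswalk from locally qualified Dublin Core to oai_dc could look like. The record layout (element, qualifier, value triples), the qualifier names, and the sample values are assumptions made for illustration; they are not taken from the UNT system. The only fixed fact used is that oai_dc carries the 15 unqualified Dublin Core elements, so qualifiers are dropped on the way out.

```python
# The 15 unqualified Dublin Core elements carried by oai_dc.
OAI_DC_ELEMENTS = {
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
}


def to_oai_dc(triples):
    """Drop qualifiers and group values by unqualified element.

    `triples` is a hypothetical layout: (element, qualifier, value).
    Elements that oai_dc does not define are skipped.
    """
    out = {}
    for element, _qualifier, value in triples:
        if element in OAI_DC_ELEMENTS:
            out.setdefault(element, []).append(value)
    return out


# Made-up sample record, for illustration only.
record = [
    ("title", "officialtitle", "Sample Report"),
    ("date", "creation", "1976-01-06"),
    ("date", "digitized", "2007-05-14"),
    ("meta", "modifier", "staff01"),  # local administrative element
]

simple = to_oai_dc(record)
```

Dropping qualifiers loses information (here, the two dates can no longer be told apart), which is one reason crosswalked records need their own quality checks.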
Metadata Quality • The two aspects of digital library data quality: • The quality of the data in the objects themselves • The quality of the metadata associated with the objects • Poor metadata quality: • Ambiguities • Poor recall • Poor precision • Inconsistency of search results
Metadata Quality … • Most common errors: • Incorrect data: • Letter transposition • Letter omission • Letter insertion • Letter substitution or misstrokes • Missing data • Elements and values not present at all (null) • Insufficient or incomplete data • Ambiguous data • Confusing or inconsistent data, e.g. multiple spellings, multiple possible meanings, mixed cases, initials, etc.
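The four "incorrect data" error types above can be told apart mechanically when the intended value is known, for example from a controlled vocabulary. The sketch below is an illustration only, not a tool described in the slides; the function name `classify_typo` is hypothetical, and it handles only values that differ from the expected term by a single error.

```python
def classify_typo(observed, expected):
    """Classify a single-error difference between two strings."""
    if observed == expected:
        return "match"
    lo, le = len(observed), len(expected)
    if lo == le:
        diffs = [i for i in range(lo) if observed[i] != expected[i]]
        if len(diffs) == 1:
            return "substitution"
        # Two adjacent positions whose letters are swapped.
        if (len(diffs) == 2 and diffs[1] == diffs[0] + 1
                and observed[diffs[0]] == expected[diffs[1]]
                and observed[diffs[1]] == expected[diffs[0]]):
            return "transposition"
        return "other"
    if lo + 1 == le:
        # Deleting one letter from the expected term gives the observed value.
        if any(expected[:i] + expected[i + 1:] == observed for i in range(le)):
            return "omission"
    if lo == le + 1:
        # Deleting one letter from the observed value gives the expected term.
        if any(observed[:i] + observed[i + 1:] == expected for i in range(lo)):
            return "insertion"
    return "other"


examples = {
    value: classify_typo(value, "Texas")
    for value in ["Texsa", "Texs", "Texass", "Texaz", "Texas"]
}
```

Values that come back as "other" differ by more than one error and are better left for human review.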
Factors Influencing Metadata Quality • Local requirements: • Object heterogeneity • What types of objects will the repository contain? • Granularity • How will they be described? • Functionality • What functionality is required? • How will it be interfaced?
Factors Influencing Metadata Quality … • Collaborative requirements: • Diversity of users • How can diverse information-seeking behaviors best be met? • Interoperability • Will metadata be meaningful within aggregations of various kinds? • What is required for interoperability? (Structure, semantics, and syntax) • Digital rights issues • Will access restrictions be imposed? • Are requirements formal or informal? • Are there other access and associated digital rights issues?
Factors Influencing Metadata Quality … • Training issues • Necessary expertise to create and manage rigorous metadata • Metadata quality can be determined to a great extent by: • knowledge of the source, and • knowledge of the methodology used to create the statement • Cost • Rigorous metadata is resource-intensive and costly
UNT Metadata Quality Assurance Mechanisms & Tools • The two main stages of metadata quality assurance: • Pre-ingest • 1. Metadata Creation Tools (Templates) • Post-ingest • 2. Metadata Analysis Tools (Web-based tools)
Quality Assurance Mechanisms and Tools: Templates • 1. Metadata Creation Tools (Templates) • Validate mandatory elements • Metadata Template Creator • Template Reader • Controlled vocabularies (UNTLBS)
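As a rough illustration of validating mandatory elements before ingest, the sketch below flags elements that are absent or hold only empty values. The list of mandatory elements and the record layout (element name mapped to a list of values) are assumptions for the example; the slides do not say which elements the UNT templates require.

```python
# Hypothetical mandatory set; the real template requirements are not
# given in the slides.
MANDATORY = ("title", "creator", "date", "identifier")


def missing_mandatory(record, mandatory=MANDATORY):
    """Return mandatory elements that are absent or have no non-blank value."""
    missing = []
    for element in mandatory:
        values = record.get(element) or []
        if not any(str(v).strip() for v in values):
            missing.append(element)
    return missing


# Made-up record: creator is present but blank, identifier is absent.
draft = {"title": ["Sample Report"], "creator": ["  "], "date": ["2008"]}
problems = missing_mandatory(draft)
```

A creation template could refuse to save a record while `problems` is non-empty, which keeps null values out of the repository rather than finding them after ingest.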
UNT Metadata Quality Assurance Mechanisms & Tools … • 2. Metadata Analysis Tools • NULL values • List/browse all values (by each qualifier and element) • List authority values • Graphical reports and other fun stuff • Clickable maps by institution and collection • Word clouds by element • Records added over time and other graphical reports
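The NULL-value and list-all-values reports above can be approximated with a few lines of counting. This is a sketch under assumptions, not the UNT web tools: the record layout (element name mapped to a list of string values), the function names, and the sample records are all hypothetical.

```python
from collections import Counter, defaultdict


def null_report(records, elements):
    """Count, per element, the records with no non-blank value."""
    counts = Counter()
    for rec in records:
        for el in elements:
            values = rec.get(el) or []
            if not any(v and v.strip() for v in values):
                counts[el] += 1
    return dict(counts)


def value_report(records):
    """List every distinct value per element with its frequency."""
    report = defaultdict(Counter)
    for rec in records:
        for el, values in rec.items():
            for v in values:
                if v and v.strip():
                    report[el][v.strip()] += 1
    return report


def case_variants(counter):
    """Group values that differ only by letter case (a sign of ambiguity)."""
    groups = defaultdict(set)
    for value in counter:
        groups[value.casefold()].add(value)
    return [sorted(g) for g in groups.values() if len(g) > 1]


# Made-up records for illustration.
records = [
    {"creator": ["Doe, Jane"], "subject": ["Maps"]},
    {"creator": ["doe, jane"], "subject": []},
    {"creator": [""], "subject": ["Maps"]},
]
nulls = null_report(records, ["creator", "subject"])
values = value_report(records)
variants = case_variants(values["creator"])
```

Browsing the per-element value list is what makes inconsistent spellings and mixed cases visible; the case-variant grouping only automates the most obvious of those checks.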
Summary • Determine the level of quality required • Partners may have much in common, but they have diverse and sometimes conflicting metadata requirements. • Determine the nature of the gap and how to close it • Effectiveness, efficiency, practicability, scalability • Machine versus human error handling • How much of the process can be automated? • Human review of results is still essential (e.g. highlighted items) • Compromise • One size does not fit all! • Prioritize • Resources are very unlikely to be available to meet all requirements • Test the workflow • Test, retest, and evaluate the quality cycle continuously
Questions? • Daniel.alemneh@unt.edu • Thank You!