Exploring Mutual Complementarity of Free-Text and Controlled-Vocabulary Collection-Level Subject Metadata in Large-Scale Digital Libraries A Comparative Analysis Oksana L. Zavalina, Ph.D. , USA Presented at IFLA satellite postconference “Beyond Libraries – Subject Metadata in the Digital Environment and Semantic Web” August 18, 2012, Tallinn, Estonia
Metadata
• Metadata: “structured data about an object that supports functions associated with the designated object” (Greenberg, 2005)
• Used in digital libraries to organize information
• Functions supported (IFLA, 1998; 2008):
  • Find
  • Identify
  • Select
  • Obtain
O.Zavalina'2012
Free-text metadata
• Draws data values from natural language:
  • Estonian capital Tallinn is located in Northern Europe
  • IFLA’s Classification and Indexing Section
Controlled-vocabulary metadata
• Draws data values from a formally maintained list of terms
Collection-level metadata
• “Metadata providing a high-level description of an aggregation of individual items” (Macgregor, 2003)
• Describes a collection as a whole
• Has long been used in the archival community
• Is now used in digital libraries
[Figures: example collection-level metadata records from NSDL and The European Library, with free-text and controlled-vocabulary fields marked]
Subject metadata
• “information concerning what the resource is about and what it is relevant for” (Soergel, 2009)
• Crucial for subject access to information objects and collections in digital libraries
• The Dublin Core Collections Application Profile suggests 5 collection-level subject metadata elements:
  • free-text: Description
  • controlled-vocabulary: Subject, Type [of Object], Temporal Coverage, Geographic Coverage
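The five subject elements listed above could be held in a simple record structure. The sketch below is only an illustration: the class name, attribute names, and the sample values are assumptions made for this example, not the property names defined by the Dublin Core Collections Application Profile and not a real record from any of the libraries discussed.

```python
from dataclasses import dataclass, field


@dataclass
class CollectionRecord:
    """Hypothetical collection-level record holding the five subject elements."""

    description: str = ""  # free-text element
    subject: list[str] = field(default_factory=list)  # controlled vocabulary
    object_type: list[str] = field(default_factory=list)  # controlled vocabulary
    temporal_coverage: list[str] = field(default_factory=list)  # controlled vocabulary
    geographic_coverage: list[str] = field(default_factory=list)  # controlled vocabulary

    def controlled_fields(self) -> dict[str, list[str]]:
        """Return only the controlled-vocabulary elements, keyed by name."""
        return {
            "subject": self.subject,
            "object_type": self.object_type,
            "temporal_coverage": self.temporal_coverage,
            "geographic_coverage": self.geographic_coverage,
        }


# Illustrative values only; this is not a real record.
record = CollectionRecord(
    description="Materials from the towns of Coal City, Braidwood, and Wilmington.",
    geographic_coverage=["Illinois (state)", "Grundy (county)"],
)

# List the controlled-vocabulary elements that actually carry values.
populated = [name for name, values in record.controlled_fields().items() if values]
print(populated)  # ['geographic_coverage']
```

Keeping the free-text element as a single string and the controlled-vocabulary elements as lists of terms mirrors the distinction drawn on the slide: one field is written in natural language, the others are filled from term lists.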
Large-scale digital libraries
Portals that
• bring together hundreds of digital collections
• provide a single point of entry to them
• (sometimes) provide collection-level metadata to:
  • give the user important contextual information for harvested items (Miller, 2000)
  • help narrow search scope to increase precision and ease of use
  • help clarify the information need (Lee, 2003; 2005)
Free-text only?
• Most digital libraries create only free-text collection-level metadata
  • it is the default in content management systems (e.g., DSpace)
  • it saves the metadata creator’s time
• Even when available, controlled-vocabulary collection-level metadata is often NOT displayed to the end user
[Figures: collection-level metadata record displays for the same collection in American Memory and Opening History]
Metadata quality
• Fitness for the purpose of supporting user tasks:
  • find
  • identify
  • select
  • obtain
• Quality criteria:
  • Accuracy
  • Consistency
  • Completeness
  • …
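Of the criteria above, completeness is the easiest to express in code. The following is a minimal sketch under stated assumptions: it treats completeness as the share of the five collection-level subject elements that are populated, uses hypothetical field names, and is not the measure used in the study.

```python
# Hypothetical names for the five collection-level subject elements.
SUBJECT_ELEMENTS = (
    "description",
    "subject",
    "object_type",
    "temporal_coverage",
    "geographic_coverage",
)


def completeness(record: dict[str, object]) -> float:
    """Share of the five subject elements that hold a non-empty value."""
    filled = sum(1 for name in SUBJECT_ELEMENTS if record.get(name))
    return filled / len(SUBJECT_ELEMENTS)


# Illustrative record: two elements populated, one present but empty.
example = {
    "description": "Materials from the towns of Coal City, Braidwood, and Wilmington.",
    "geographic_coverage": ["Illinois (state)", "Grundy (county)"],
    "subject": [],
}

print(completeness(example))  # 0.4
```

A count like this says nothing about accuracy or consistency, which require comparing values against the collection itself and against other records.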
Problem statement
• Evaluation of metadata in digital libraries
  • is increasingly important for ensuring metadata quality
  • has not yet become common practice (Hillmann, 2008)
• Research evaluating collection-level metadata is in its infancy
  • Zavalina, Palmer, Jackson, and Han (2008): a single digital library
  • Zavalina (2011): compared free-text collection-level subject metadata in 3 large-scale digital libraries
Scope of this study
• Comparative analysis of collection-level subject metadata
  • between free-text (Description) and controlled-vocabulary (Subject, Object Type, Temporal Coverage, and Geographic Coverage) metadata fields
    • One-way complementarity
    • Two-way complementarity
    • Redundancy
  • across 3 large-scale digital libraries in the EU and the USA
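The three relationships named above can be illustrated with set comparisons. This is a sketch, not the study's coding procedure: it assumes the subject concepts in each field have already been extracted and normalized by a human coder, and it reads "one-way complementarity" as one field adding information absent from the other, "two-way" as each adding something, and "redundancy" as both carrying the same information. The function name and labels are hypothetical.

```python
def classify(description_concepts: set[str], controlled_concepts: set[str]) -> str:
    """Label the relationship between Description and one controlled-vocabulary field."""
    if not description_concepts and not controlled_concepts:
        return "no subject information"

    only_in_description = description_concepts - controlled_concepts
    only_in_controlled = controlled_concepts - description_concepts

    if only_in_description and only_in_controlled:
        return "two-way complementarity"
    if only_in_description:
        return "one-way: free-text complements controlled-vocabulary"
    if only_in_controlled:
        return "one-way: controlled-vocabulary complements free-text"
    return "redundancy"


# Description names towns; Geographic Coverage names the state and county.
print(classify({"Coal City", "Braidwood", "Wilmington"}, {"Illinois", "Grundy"}))
# two-way complementarity

# Both fields carry exactly the same concept.
print(classify({"Florida"}, {"Florida"}))
# redundancy
```

Exact set matching is a simplification. Judging whether a free-text phrase specifies, broadens, or merely repeats a controlled term usually needs human interpretation.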
Findings: overall complementarity
Free-text to controlled-vocabulary complementarity
Controlled-vocabulary to free-text complementarity
Multiple and two-way complementarity
• ↑↑↑ In 22% of records, free-text Description complemented 2+ controlled-vocabulary subject metadata fields
• ↔ Two-way complementarity occurred in 40% of records
  • mostly between Description and Subject
• Description complements Subject:
  • topical information (“foodways, religious traditions, Native American culture, maritime traditions, ethnic folk culture, material culture”)
  • occupational subject information (“musicians, craftpersons, storytellers, folklife interpreters”)
• Description specifies dates in Temporal Coverage:
  • “1930 through 2011”
• Description complements Object Type:
  • genre information (“children’s lore,” “occupational lore,” “performances,” “interviews,” “surveys”)
• Subject lists additional topics not covered by Description
  • (e.g., “Architecture”)
• Geographic Coverage provides spatial information absent in Description
  • United States (nation)
  • Southern U.S. (general region)
  • Florida (state)
More examples of two-way complementarity
• “towns of Coal City, Braidwood, and Wilmington” in Description ↔ “Illinois (state), Grundy (county)” in Geographic Coverage
• “dance instruction manuals, anti-dance manuals, histories, treatises on etiquette” in Description ↔ “Ballroom dancing--United States” in Subject
More examples of two-way complementarity
• “newspaper photographs” in Description ↔ “photographs; archival finding aids” in Object Type
• “contemporary, … European age of chivalry, … prior to 1900” in Description ↔ “1200-1900” in Temporal Coverage
Redundancy
• Only in The European Library (19%)
• Examples:
Conclusions
• High complementarity and little redundancy in collection-level subject metadata
• More detailed collection-level metadata records
  • include BOTH free-text and controlled-vocabulary subject metadata, which are mutually complementary
  • allow fuller representation of the intellectual content of information objects
  • ultimately improve subject access for users
Conclusions
Guidelines for creating high-quality collection-level subject metadata
• are not currently available
• can be incorporated in
  • national standards
  • Framework of Guidance for Building Good Digital Collections (US NISO)