IMLS NLG Collection Registry & Item-Level Metadata Repository at the University of Illinois. Timothy W. Cole (t-cole3@uiuc.edu) Mathematics Librarian & Professor of Library Administration University of Illinois at Urbana-Champaign (USA)
IMLS NLG Collection Registry & Item-Level Metadata Repository at the University of Illinois
Timothy W. Cole (t-cole3@uiuc.edu)
Mathematics Librarian & Professor of Library Administration
University of Illinois at Urbana-Champaign (USA)
Open Archives Forum Workshop
University of Bath
4 September 2003
http://dli.grainger.uiuc.edu/Publications/TWCole/OAForumWkshpBath/
IMLS NLG Program
• Institute of Museum and Library Services (IMLS)
  • U.S. Federal grant-making agency, est. 1996
  • Goal to foster leadership, innovation, lifetime learning
  • $244 million annual budget
• IMLS National Leadership Grant Program
  • Currently about $20 million per year
  • Library, Museum, & Library-Museum Collaborations
  • Funds research & demonstration, digitization, preservation, model programs, new technology
t-cole3@uiuc.edu, University of Illinois at UC
IMLS Digital Collections Framework
“IMLS Framework of Guidance for Building Good Digital Collections,” published November 2001
http://www.imls.gov/pubs/forumframework.htm
• Product of 8-member IMLS Digital Library Forum, with participation from National Science Digital Library (NSF)
• Differentiates digital collections & digital libraries
• Articulates principles & frames discussion of best practices
• Links to resources, models, & exemplary projects
• Will be sustained by National Information Standards Organization (NISO)
Recommendations from the IMLS Forum
Four General Recommendations to IMLS:
• Digital collections built with support of public funds can and should be held to standards that support interoperability, reusability, and persistence.
• IMLS should maintain its own registry of funded digital collections.
• Because so much of the IMLS constituency consists of small and medium-sized organizations without sophisticated in-house technical support, the IMLS should also consider projects to develop infrastructure services that lower barriers to NSDL contribution by smaller organizations.
• IMLS should encourage the integration of an archiving component into every project plan by requiring a description of how data will be preserved.
Collection description and registry for National Leadership grant projects with digital content
• Enhance discoverability; all registry fields searchable
• Item-level metadata repository via OAI-PMH
• Demonstrate potential of metadata for interoperability
• Facilitate reuse of information resources
• Research question: How can resource developers best represent collections and items to meet the needs of service providers and end users?
Project Website: http://imlsdcc.grainger.uiuc.edu/
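An item-level repository "via OAI-PMH" means a harvester requests records over HTTP using the protocol's verbs. The following is a minimal sketch of building a ListRecords request; the base URL is a placeholder (an assumption, not the project's actual endpoint), and the helper name is hypothetical.

```python
from urllib.parse import urlencode

# Placeholder endpoint (assumption); substitute a data provider's real base URL.
BASE_URL = "http://example.org/oai"


def list_records_url(base_url, metadata_prefix="oai_dc", from_date=None, set_spec=None):
    """Build an OAI-PMH ListRecords request URL."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date:
        params["from"] = from_date  # selective harvesting by datestamp
    if set_spec:
        params["set"] = set_spec    # selective harvesting by set
    return base_url + "?" + urlencode(params)


print(list_records_url(BASE_URL, from_date="2003-01-01"))
```

Every OAI-PMH data provider must support the oai_dc (unqualified Dublin Core) prefix, which is why it is the default here.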
Project Scope
• 95 NLG projects with associated digital collections
• 51 of these are/were collaborative projects
• Altogether, 237 institutions involved
A Diverse Community
• Wide variation in technical skills and technology infrastructure & policy
• Mix of library, museum, and archive traditions
• Diverse perspectives on IP policy, use and presentation of metadata and primary resources
• Diverse embedded knowledge structures
• Wide range of vocabularies and descriptive practices
• Metadata created for diverse purposes
• Local vocabularies for type, subject, coverage, audience
• Wide range of granularity
Prior Work – Mellon OAI Grants
• July 2001, Andrew W. Mellon Foundation awarded 7 grants for OAI-related research ($1.5 mil. total)
• Primary focus: demonstrate utility of OAI metadata harvesting in context of scholarly inquiry
  Research Libraries Group (RLG)
  University of Michigan
  University of Illinois at Urbana-Champaign
  Emory University / Southeastern Library Network
  Woodrow Wilson International Center
  University of Virginia
See: http://www.arl.org/newsltr/217/waters.html
University of Illinois Mellon OAI Project
• July 2001 – May 2003
• Primary Objectives:
  • Create & demonstrate OAI tools
  • Build portal to aggregated metadata describing cultural heritage resources
    • Initially – For OAI testing & research
    • Long-term – As a sustained resource
  • Investigate using EAD metadata in OAI context
  • Research utility of aggregated metadata
University of Illinois Cultural Heritage Portal
• Harvests 25 OAI Providers
  • Academic libraries & archives
  • Digital library projects
  • Historical societies
• Aggregates 479,000 metadata items
  • 55% text / sheet music
  • 40% image / multimedia
  • 5% archival / museum
http://oai.grainger.uiuc.edu
Current Projects Addressing Similar Issues
• NSDL: Digital library of resource collections and services, organized in support of science education at all levels.
• NOF-Digitize / EnrichUK: Description and aggregation of digitized collections funded by the New Opportunities Fund
• Minerva Project: Creating an agreed European common platform, recommendations and guidelines about digitization, metadata, long-term accessibility and preservation
Technical Challenges
• NLG Awardees have diverse technical resources
  • Limited knowledge of / tools for working with XML
  • Limited knowledge of community metadata schemas
  • Limited knowledge of / access to CGI capabilities
• Early NLG projects have no resources earmarked for sharing metadata
• Technical implementations not always built with reuse and interoperability in mind
OAI Readiness Among NLG Projects
OAI for Static Repositories
• Lower barrier option for exposing relatively static and small collections of metadata
• Designed to scale well to about 5,000 metadata records
• Provider serves static XML file (no CGI required)
• 3rd party gateway generates valid OAI responses
• Supports only a subset of OAI options
  • No sets, deleted records, resumptionTokens
  • DateStamp granularity limited to YYYY-MM-DD
Preliminary alpha version of OAI-SR guidelines available: http://www.openarchives.org/OAI/2.0/guidelines-static-repository.htm
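The constraints listed above can be checked before a project commits to the static repository route. This is a rough sketch only: the record structure (dicts with 'identifier' and 'datestamp' keys) and the function names are assumptions for illustration, and the 5,000 figure is the approximate scaling target from the slide rather than a hard protocol limit.

```python
import re
from datetime import date

MAX_RECORDS = 5000  # approximate scaling target stated on the slide
DAY_STAMP = re.compile(r"(\d{4})-(\d{2})-(\d{2})")


def is_day_granularity(stamp):
    """True if stamp is a valid calendar date written as YYYY-MM-DD."""
    m = DAY_STAMP.fullmatch(stamp)
    if not m:
        return False
    try:
        date(int(m.group(1)), int(m.group(2)), int(m.group(3)))
    except ValueError:
        return False
    return True


def static_repository_problems(records):
    """List reasons a record set may not suit a static repository.

    records: list of dicts with hypothetical keys 'identifier' and 'datestamp'.
    """
    problems = []
    if len(records) > MAX_RECORDS:
        problems.append(f"{len(records)} records exceeds the ~{MAX_RECORDS} guideline")
    for rec in records:
        if not is_day_granularity(rec["datestamp"]):
            problems.append(
                f"{rec['identifier']}: datestamp {rec['datestamp']!r} is not YYYY-MM-DD"
            )
    return problems
```

An empty list means none of the checked constraints were violated; it does not validate the static repository XML file itself.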
OAI Static Repository Gateways
• SR Gateways support CGI extended path
• SR Gateways typically cache static repository XML files
• SR Gateway lists all SRs available through gateway in <friends> element (dynamic discovery of SRs)
• SR Gateways assumed to support automated self-registration of SRs
• SRs should make themselves available through a single SR Gateway
• SR Gateway applications available on SourceForge.net
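The "CGI extended path" bullet can be illustrated with a small sketch. It assumes the convention described in the Static Repository guidelines, in which the base URL a harvester uses is the gateway URL, a slash, and the static repository's own URL with the leading "http://" removed; the alpha guidelines cited in this talk may differ in detail. Host names and the function name below are hypothetical.

```python
def gateway_base_url(gateway_url, static_repo_url):
    """Derive the OAI-PMH base URL for a static repository behind a gateway.

    Assumes the extended-path convention: gateway URL + "/" + static
    repository URL without its "http://" prefix.
    """
    prefix = "http://"
    if not static_repo_url.startswith(prefix):
        raise ValueError("expected an http:// static repository URL")
    return gateway_url.rstrip("/") + "/" + static_repo_url[len(prefix):]


# Hypothetical gateway and static repository locations.
base = gateway_base_url(
    "http://gateway.example.org/oai",
    "http://museum.example.org/metadata/sr.xml",
)
print(base + "?verb=Identify")
```

Because the static repository URL is embedded in the path, one gateway can serve many small providers without per-provider configuration on the harvester side.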
Working with Turnkey Solutions
• OAI provider service now built into many popular digital library applications
  • ContentDM, Encompass, DLXS, DSpace, EPrints.org
• Facilitates participation in OAI-PMH metadata sharing
• Some implementations may be limited
  • Many support oai_dc metadata schema only
  • May have limited feature set (e.g., no resumptionToken)
  • Metadata mappings may not be configurable
• Community needs to advocate requirements strongly
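Since some providers issue resumptionTokens and others return everything in one response, a harvester has to handle both cases. The sketch below shows one way to do that; the fetch argument is a caller-supplied function (an assumption of this sketch) that takes a URL and returns the response body as a string, which keeps the example free of network code.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"


def harvest_identifiers(base_url, fetch, metadata_prefix="oai_dc"):
    """Yield record identifiers, following resumptionTokens when present."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    while True:
        root = ET.fromstring(fetch(base_url + "?" + urlencode(params)))
        for header in root.iter(OAI_NS + "header"):
            ident = header.find(OAI_NS + "identifier")
            if ident is not None and ident.text:
                yield ident.text.strip()
        token = root.find(f".//{OAI_NS}resumptionToken")
        if token is None or not (token.text or "").strip():
            break  # no token, or an empty token marking the final chunk
        # Follow-up requests carry only the verb and the token.
        params = {"verb": "ListRecords", "resumptionToken": token.text.strip()}
```

A provider with no resumptionToken support simply never emits the element, so the loop ends after the first response.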
Metadata Issues
• Wide range of metadata schemas in use
• Variations in descriptive practices & traditions
  • Use of Dublin Core fields
  • Granularity
  • What is being described
• Different approaches to IP rights issues
Metadata Schemas Used By NLG Projects
• MARC and Dublin Core most common schemas
  • Includes qualified DC & DC with extra fields
• 24 projects use multiple schemas
  • 14 of these use Dublin Core in combination with another schema
DC element usage (from Mellon)
• Records containing subject & description elements
• Many different controlled and local vocabularies in use
• Granularity: a record may describe a collection of coins, or one coin
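Element usage of the kind summarized above can be measured by counting, for each Dublin Core element, the share of harvested records that contain it at least once. This is a minimal sketch, assuming each record has already been parsed into a dict mapping element names to lists of values; the sample records are invented for illustration and are not Mellon project data.

```python
from collections import Counter


def element_usage(records):
    """Return the percentage of records containing each element at least once.

    records: list of dicts mapping a DC element name to a list of string values.
    """
    counts = Counter()
    for rec in records:
        for element, values in rec.items():
            if any(v.strip() for v in values):  # ignore empty values
                counts[element] += 1
    total = len(records)
    if total == 0:
        return {}
    return {element: 100.0 * n / total for element, n in counts.items()}


# Invented sample records (illustration only).
sample = [
    {"title": ["Coverlet"], "subject": ["Textiles"], "description": ["Woven coverlet"]},
    {"title": ["Coin hoard"], "subject": ["Numismatics"]},
    {"title": ["Letter"], "description": [""]},
    {"title": ["Map"], "subject": ["Cartography"], "description": ["Hand-drawn map"]},
]
usage = element_usage(sample)
print(usage["subject"], usage["description"])
```

Counting presence rather than total occurrences keeps records with many repeated subject values from skewing the figures.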
Describe the digital object?
Excerpt of record describing a cotton coverlet
Description: Digital image of a single-sized cotton coverlet for a bed with embroidered butterfly design. Handmade by Anna F. Ginsberg Hayutin.
Source: Materials: cotton and embroidery floss. Dimensions: 71 in. x 86 in. Markings: top right hand corner has 1 1/2 in. x 1/2 in. label cut outs at upper left and right hand side for head board; fabric is woven in a variation of a rib weave; color each of yellow and gray; hand-embroidered cotton butterflies and flowers from two shades of each color of embroidery floss - blue, pink, green and purple and single top 20 in. bordered with blue and black cotton embroidery thread; stitches used for embroidery: running stitch, chain stitch, French knot and back stitches; selvage edges left unfinished; lower edges turned under and finished with large gray running stitches made with embroidery floss.
Format: Epson Expression 836 XL Scanner with Adobe Photoshop version 5.5; 300 dpi; 21-53K bytes. Available via the World Wide Web.
Coverage: (none)
Date Created: 2001-09-19 09:45:18; Updated: 20011107162451; Created: 2001-04-05; Created: 1912-1920?
Type: Image
Or describe the analog object?
Excerpt of record describing Am. woven coverlet
Description: Materials: Textile--Multi, Pigment--Dye; Manufacturing Process: Weaving--Hand, Spinning, Dyeing, Hand-loomed blue wool and white linen coverlet, worked in overshot weave in plain geometric variant of a checkerboard pattern. Coverlet is constructed from finely spun, indigo-dyed wool and undyed linen, woven with considerable skill. Although the pattern is simpler, the overall craftsmanship is higher than 1934.01.0094A. - D. Schrishuhn, 11/19/99 This coverlet is an example of early "overshot" weaving construction, probably dating to the 1820's and is not attributable to any particular weaver. -- Georgette Meredith, 10/9/1973
Source: (none)
Format: 228 x 169 x 1.2 cm (1,629 g)
Coverage: Euro-American; America, North; United States; Indiana? Illinois?
Date: Early 19th c. CE
Type: cultural; physical object; original
Various Concerns About IP Rights
• Overcoming reluctance to share metadata because of IP rights issues
• Concern that sharing metadata is giving away most valuable asset
• Uncertain whether license limits metadata sharing
• Uncertain whether to share metadata describing licensed information resources
• Machine-readable IP rights attributes needed to facilitate reuse
Portal Design Issues
• How best to organize aggregated metadata for browse
  • Need scalable ways to build / implement classifications
  • Need better methods for clustering and grouping
  • Utilize relationships & ties to collection descriptions
• How best to implement basic & advanced searching
  • Precise searching hard due to metadata usage variations
  • Limited normalization possible; more work needed
  • Robust search & ranking across large aggregations hard
• Need more audience-specific designs
• Need more dynamic & interactive designs
• Need better support of educational & instructional uses
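As one illustration of the "limited normalization" mentioned above, the date values in the two coverlet records (a timestamp, a compact timestamp, a year range, a century phrase) could be reduced to a searchable year range. The sketch below is a hypothetical heuristic, not the portal's actual method, and it deliberately returns None for anything it does not recognize.

```python
import re


def year_range(value):
    """Return (start_year, end_year) for a free-text date, or None if unrecognized."""
    text = value.strip()
    # Compact 14-digit timestamp such as 20011107162451.
    m = re.fullmatch(r"(\d{4})\d{10}", text)
    if m:
        year = int(m.group(1))
        return (year, year)
    # Year range such as 1912-1920 or 1912-1920? (uncertainty mark is dropped).
    m = re.fullmatch(r"(\d{4})\s*-\s*(\d{4})\??", text)
    if m:
        return (int(m.group(1)), int(m.group(2)))
    # ISO-like date, optionally followed by a time: 2001-04-05, 2001-09-19 09:45:18.
    m = re.match(r"(\d{4})-\d{2}-\d{2}\b", text)
    if m:
        year = int(m.group(1))
        return (year, year)
    # Century phrase such as "Early 19th c. CE"; qualifiers like "Early" are
    # ignored and the whole century is returned.
    m = re.search(r"(\d{1,2})(?:st|nd|rd|th)\s+c\.", text)
    if m:
        century = int(m.group(1))
        return ((century - 1) * 100 + 1, century * 100)
    return None


for raw in ["2001-09-19 09:45:18", "20011107162451", "1912-1920?", "Early 19th c. CE"]:
    print(raw, "->", year_range(raw))
```

Note that the first coverlet record mixes dates for the digital file and for the physical object, so normalizing the format alone does not tell a search service which object a date describes.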
Portal Design – Mellon Project Experience
• Limited focus group testing
  • 23 student teachers in honors-level C & I class
  • Assignment to students: Use the site in preparing a lesson plan for high school social studies class
• Process
  • Introduced site & “aggregated metadata” concept
  • Focus group interviews conducted
  • Students’ papers examined
  • Transaction logs analyzed
A Few Observations from Test
1. Users expected all links to point to digital objects
  • Some records pointed to finding aids
  • Some records pointed to collection’s web site
  • Some records pointed to Library books on the shelf
2. Users unable to make use of search results
  • Simple searches produced 1000s of unranked results
  • Advanced search (with limits) rarely used
3. Distinction between portal and data providers unimportant to users
Rethinking what “online access” means
• To librarian & curator
• To student teacher
Closing Thought – Considering OAI in Context
• Descriptive, item-level metadata alone insufficient
  • Must be used in combination with collection descriptions, user annotations, machine-generated clustering, …
• Distinction between collection & item blurs in DLs
  • Complex objects – TEI, EAD, METS
  • Granularity – should a museum describe every arrowhead in an end-user search & discovery system?
• Relationships between items provide context
• Need to tie collection registry to item-level repository
• OAI-PMH not limited to item-level descriptive metadata
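Tying the collection registry to the item-level repository could be as simple as carrying a collection identifier on each harvested item and looking up the matching registry entry at display time. The following sketch assumes that design; the class names, fields, and sample values are all hypothetical and do not describe the project's actual data model.

```python
from dataclasses import dataclass


@dataclass
class Collection:
    collection_id: str
    title: str
    institution: str


@dataclass
class Item:
    identifier: str
    title: str
    collection_id: str  # link back to the registry entry


def with_context(items, registry):
    """Pair each item with its collection description, or None if unregistered."""
    by_id = {c.collection_id: c for c in registry}
    return [(item, by_id.get(item.collection_id)) for item in items]


# Invented sample data (illustration only).
registry = [Collection("c1", "Textile Collection", "Example Museum")]
items = [
    Item("oai:example:1", "Woven coverlet", "c1"),
    Item("oai:example:2", "Arrowhead", "c9"),
]
for item, collection in with_context(items, registry):
    print(item.title, "|", collection.title if collection else "no registry entry")
```

Showing the collection alongside each item gives end users some of the context that an isolated item-level record lacks.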