Rethinking Assumptions with the Our Americas Archive Partnership (OAAP). Geneva Henry Rice University 6 April 2009 CNI Spring 2009 Task Force Meeting, Minneapolis, MN. Presentation overview. About the Our Americas Archive Partnership project Vision and goals
Rethinking Assumptions with the Our Americas Archive Partnership (OAAP) Geneva Henry Rice University 6 April 2009 CNI Spring 2009 Task Force Meeting, Minneapolis, MN
Presentation overview • About the Our Americas Archive Partnership project • Vision and goals • Approach we’re taking with the development • Building the collections • Assumptions regarding growth • Assumptions regarding metadata • Challenges
Background • Our Americas Archive Partnership (OAAP) grant awarded to Rice by the Institute of Museum and Library Services (IMLS) • National Leadership Grant for digitization • Digitize selected items in Woodson’s Americas collection • Add Web 2.0 technologies to enable use of Rice collection and University of Maryland’s complementary Early Americas Digital Archive collection
The Partnership • Rice University • Fondren Library • Humanities Research Center (HRC) • Digitization, transcriptions, translations, metadata, markup, research modules, scholarly introductions • University of Maryland • Maryland Institute for Technology in the Humanities (MITH) • Integration of collections, development of web 2.0 features including social tagging and a geospatial interface • Addition of Instituto Mora, Mexico City • rich collection of materials relating to the socioeconomic and historical conditions of Mexico • Not part of IMLS grant • Collaborative relationship with Rice’s HRC
Description of Collections • Early Americas Digital Archive (EADA, 1492-1820) • a collection of electronic texts of transcribed literary-historical narratives written in or about the Americas • Rice Americas Digital Archive (1597-1920) • includes approximately 25,000 pages of original letters, broadsides, pamphlets, printed materials and books documenting the political and cultural relationships between the United States, Mexico, Central and South America, Cuba, Spain, and Portugal • Instituto Mora Collection • 7,000 pages of additional archival items scanned, digitized, marked up, and fully integrated into the search tools • Scanning started June 2008 and will continue through summer 2009
Our Vision • Focus on the Americas from a hemispheric perspective rather than the nation state, driven by scholars’ needs • OAAP covers the five-hundred-year period that saw the making of modern and colonial cultures in the Americas, capturing the cultural transformation across it • OAAP will impact the study of American literary and cultural history by making it easier for scholars to understand cross-cultural influence
Goals • Create unique new research and teaching opportunities • Make unique archival collection digitally available • Build common interface between partners’ repositories, enabling additional digital archives to be added • Address issues associated with the complexity of multilingual documents
Ubiquitous discovery opens new horizons • OAAP supports new scholarly inquiry into understanding the development of the Americas • Unrestricted access to scholarly resources that were previously only in nation-specific collections at a variety of institutions • Collaboration that crosses institutions, crosses countries, and will grow as scholars need it to grow • Power to the scholars
Federation Model • Provide a common interface to multiple repositories with different content management approaches • search page allowing for multifaceted browsing • MySQL database built from harvested content • Federated digital environment allowing institutional partners to share holdings while retaining individual identity • Extensible to allow for folksonomic tagging
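A minimal sketch of the federation idea on this slide: records harvested from each partner are loaded into one relational index that can drive faceted browsing and folksonomic tags. The slide names MySQL; the standard-library `sqlite3` module stands in for it here so the example runs anywhere. The table layout, the sample records, and the function names (`build_index`, `facet_counts`, `records_with_tag`) are all hypothetical and are not taken from the OAAP code.

```python
import sqlite3

# Hypothetical harvested records. The keys loosely follow Dublin Core
# element names; the values are invented for illustration only.
HARVESTED = [
    {"id": "rice:1", "repository": "Rice", "title": "Sample letter",
     "language": "es", "tags": ["cuba", "trade"]},
    {"id": "eada:1", "repository": "EADA", "title": "Sample narrative",
     "language": "en", "tags": ["cuba"]},
    {"id": "mora:1", "repository": "Mora", "title": "Sample pamphlet",
     "language": "es", "tags": []},
]

# Only these columns may be used as facets (also guards the SQL below).
ALLOWED_FACETS = {"repository", "language"}


def build_index(records):
    """Load harvested records into an in-memory index (stand-in for MySQL)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE record (id TEXT PRIMARY KEY, repository TEXT, "
        "title TEXT, language TEXT)"
    )
    conn.execute("CREATE TABLE tag (record_id TEXT, tag TEXT)")
    for rec in records:
        conn.execute(
            "INSERT INTO record VALUES (?, ?, ?, ?)",
            (rec["id"], rec["repository"], rec["title"], rec["language"]),
        )
        conn.executemany(
            "INSERT INTO tag VALUES (?, ?)",
            [(rec["id"], tag) for tag in rec.get("tags", [])],
        )
    conn.commit()
    return conn


def facet_counts(conn, facet):
    """Return {facet value: number of records} for one facet column."""
    if facet not in ALLOWED_FACETS:
        raise ValueError(f"unknown facet: {facet}")
    rows = conn.execute(
        f"SELECT {facet}, COUNT(*) FROM record GROUP BY {facet}"
    ).fetchall()
    return dict(rows)


def records_with_tag(conn, tag):
    """Return ids of records carrying a scholar-supplied tag, sorted by id."""
    rows = conn.execute(
        "SELECT r.id FROM record r JOIN tag t ON t.record_id = r.id "
        "WHERE t.tag = ? ORDER BY r.id",
        (tag,),
    ).fetchall()
    return [row[0] for row in rows]


if __name__ == "__main__":
    index = build_index(HARVESTED)
    print(facet_counts(index, "repository"))
    print(records_with_tag(index, "cuba"))
```

Each record keeps its `repository` value, which is one simple way to let partners share holdings in a common search page while retaining their individual identity.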
Technical approach • Technically diverse digital collections • Digital assets stored in separate repositories (a custom repository and DSpace) • Capture metadata as Dublin Core • Convert TEI-marked documents in EADA to Dublin Core and harvest repositories • Texts encoded in TEI Lite • Social tagging by scholars using their vocabulary • Develop common descriptors • Metadata harvesting • Common text display
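The TEI-to-Dublin-Core step could look roughly like the sketch below, which pulls a few header fields out of a TEI document and maps them to Dublin Core element names. The sample document, the choice of header paths, and the mapping table are assumptions made for illustration (the sample is a stripped-down header, not a complete valid TEI file); the slides do not describe the actual crosswalk used for EADA.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed TEI document used only to exercise the code.
SAMPLE_TEI = """<TEI.2>
  <teiHeader>
    <fileDesc>
      <titleStmt>
        <title>Sample Narrative</title>
        <author>Example Author</author>
      </titleStmt>
      <publicationStmt>
        <date>1650</date>
      </publicationStmt>
    </fileDesc>
    <profileDesc>
      <langUsage><language id="en">English</language></langUsage>
    </profileDesc>
  </teiHeader>
  <text><body><p>Body text omitted.</p></body></text>
</TEI.2>"""

# Assumed crosswalk: Dublin Core element name -> path inside the TEI header.
TEI_TO_DC = {
    "title": "teiHeader/fileDesc/titleStmt/title",
    "creator": "teiHeader/fileDesc/titleStmt/author",
    "date": "teiHeader/fileDesc/publicationStmt/date",
}


def tei_to_dublin_core(tei_xml):
    """Return a dict of Dublin Core element name -> list of values."""
    root = ET.fromstring(tei_xml)
    record = {}
    for dc_name, path in TEI_TO_DC.items():
        values = [
            el.text.strip()
            for el in root.findall(path)
            if el.text and el.text.strip()
        ]
        if values:
            record[dc_name] = values
    # Language codes are read from the id attribute of <language>.
    languages = [
        el.get("id")
        for el in root.findall("teiHeader/profileDesc/langUsage/language")
        if el.get("id")
    ]
    if languages:
        record["language"] = languages
    return record


if __name__ == "__main__":
    print(tei_to_dublin_core(SAMPLE_TEI))
```

Values are kept as lists because Dublin Core elements are repeatable, which matters for multilingual items that carry more than one title or language.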
DSpace Platform • DSpace is one of the leading open source software platforms for an institutional repository • Rice’s Digital Scholarship Archive uses DSpace, with some significant customizations • Provides permanent digital archive for materials • Fine-grained access controls • Metadata separate from actual objects allows for scalability of digital assets
Overview of DSpace Architecture • Web-based user interface • Runs on Unix-based OS; Rice’s is running on Apple Xserves • Production server for final collections • Development and Test servers for preparation • Uses PostgreSQL database for managing content • Includes Lucene search engine • Support for full-text search • Supports Dublin Core metadata standard • Metadata harvested by OAI harvesters • Storage demands are VERY high • Using Isilon clustered storage solution to facilitate multimedia
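To illustrate what an OAI harvester receives, the sketch below parses an OAI-PMH `ListRecords` response carrying `oai_dc` metadata. The namespaces are the standard OAI-PMH and Dublin Core ones; the sample response, its identifiers, and the `parse_list_records` function are hypothetical. No network call is made: a real harvester would request `verb=ListRecords&metadataPrefix=oai_dc` from the repository's OAI endpoint and keep requesting while a resumption token is returned.

```python
import xml.etree.ElementTree as ET

# Standard namespaces for OAI-PMH 2.0 and unqualified Dublin Core.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Hypothetical response body; identifiers and values are invented.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:example.org:123/456</identifier>
        <datestamp>2009-01-15</datestamp>
      </header>
      <metadata>
        <oai_dc:dc
            xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
            xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Sample Letter</dc:title>
          <dc:language>es</dc:language>
        </oai_dc:dc>
      </metadata>
    </record>
    <record>
      <header status="deleted">
        <identifier>oai:example.org:123/457</identifier>
        <datestamp>2009-01-16</datestamp>
      </header>
    </record>
    <resumptionToken>token-2</resumptionToken>
  </ListRecords>
</OAI-PMH>"""


def parse_list_records(xml_text):
    """Return (records, resumption_token) from a ListRecords response."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.findall("oai:ListRecords/oai:record", NS):
        header = rec.find("oai:header", NS)
        # Deleted records carry a header but no metadata; skip them.
        if header.get("status") == "deleted":
            continue
        fields = {}
        for el in rec.findall("oai:metadata/oai_dc:dc/*", NS):
            name = el.tag.split("}", 1)[1]  # drop the "{namespace}" prefix
            fields.setdefault(name, []).append((el.text or "").strip())
        records.append({
            "identifier": header.findtext("oai:identifier", namespaces=NS),
            "fields": fields,
        })
    token = root.findtext(
        "oai:ListRecords/oai:resumptionToken", default="", namespaces=NS
    ).strip()
    return records, (token or None)


if __name__ == "__main__":
    harvested, next_token = parse_list_records(SAMPLE_RESPONSE)
    print(harvested, next_token)
```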
Connexions • Provide scholarly analysis of the archival documents or demonstrate their pedagogical uses in an on-line environment • Connexions is a set of tools for developing and freely distributing educational material
Using archival materials Scanned images immediately provide visual cues as to the type of document • a letter versus a governmental document
Multilingual documents Translations expand access to intellectual content of texts • By providing the content in the language of the reader • And in a format that facilitates visual scanning of content and full-text searching
Enhancing Multilingual documents Digital Image > Transcription > Translation
Example Item Record • TEI file • Digital Image • Metadata
Outcomes • Allow scholarly examination of American literature from a hemispheric perspective • Develop a collection of texts, curricular models and teaching materials that embody a hemispheric approach to the study of the early Americas • Generate professional and intellectual exchanges among scholars from various fields • Support scholars from outside the US and their contributions • Create digitized versions of primary sources not previously available to a wide-ranging and physically dispersed audience • Support addition of other digital archives with minimal barrier to entry
Growth Assumptions • Architectural approach assumed new partners would host their own digital collections • Assumed familiarity with digitization practices • Sustainability of collection assumed to be responsibility of each partner • Assumed at least some level of processing (minimal) to be a contributing partner
Assumptions regarding metadata • Dublin Core was assumed acceptable for partnering • Following metadata best practices viewed as a good thing when project started • Markup of text documents seen as valuable enhancement • Geospatial information to support geospatial visualization of resources thought to be valuable to scholars
Collection Challenges • Latin American institutions have rich collections but limited experience and resources with digitization • Hosting collections presents issues of sustainability • Should hosted collections follow practices of collection at hosting institution?
Metadata challenges • New scholarly approach to understanding historic documents relies on new descriptions • Cataloging/metadata best practices impose a previous organizational bias • Deciding what geographic information is relevant is not so straightforward • Scholars interested in shifting borders; geospatial presentation of little value to them • Should minimal metadata with full-text search be the new model for supporting digital scholarship?
Project Website • Website: http://oaap.rice.edu • Updates on project developments • Share team presentations to communities • Share scripts and code for future participants • Rice and Mora Americas collection at http://scholarship.rice.edu/handle/1911/9219 • EADA at http://www.mith2.umd.edu/eada/
Thank You and come visit us on the web Contacts: Geneva Henry, PI (Rice) ghenry@rice.edu Caroline Levander, Co-PI (Rice) clevande@rice.edu Neil Fraistat, Co-PI (MITH) nfraistat@gmail.com