Raising the digital barn: cooperation and autonomy at Harvard. Robin Wendler Metadata Analyst, Harvard University Five Colleges Retreat July 12, 2005. “Once a large technology is made from sufficiently intertwined parts, there is no way to order an exposition
Raising the digital barn: cooperation and autonomy at Harvard Robin Wendler Metadata Analyst, Harvard University Five Colleges Retreat July 12, 2005
“Once a large technology is made from sufficiently intertwined parts, there is no way to order an exposition of it such that strongly-connected ideas are always close together. Spaghetti doesn’t want to be free.” --Rick Jelliffe
Call for action • Universities have vast information assets • Significant proportion of new assets will be digital • Managing digital assets appropriately needs to become routine, and… • Requires organizational response plus new infrastructure
Digital is different • Need not be tied to one location • Can be consulted by many at once • Multimedia • Additional capabilities (e.g. searching, manipulation) • But… • Unit price can be higher • Often sold in large, complex, costly packages • Requires complex technical infrastructure • Requires active maintenance and preservation
Impetus for cooperation • Consistent, coherent services for users • Sharing expertise • Sharing common infrastructure • Reduces redundant effort • Reduces system integration tasks • Economies of scale in hardware, software • No exclusivity clause: • No compulsion to participate • Participation does not preclude local development Tug of war team, photograph, 1888 Harvard University Archives, HUPSF Tug o' War (6)
Library Digital Initiative • Not “Digital Library Initiative” • 1997: University-level working group • Librarians, provost’s office, deans’ staff • Recommendation to Harvard Corporation • http://hul.harvard.edu/ldi/html/ldi_origins.html • 1998: LDI launched with 5-year funding for • Infrastructure development • Specialist expertise • Collection-building • Staff education and experience
Infrastructure Development • Catalogs • Digital repository • Delivery services • Access management • Name resolution service • Library and user services • E.g., E-reserves, Portal, MetaLib, SFX
Catalogs: Bringing underserved materials to light • Images • VIA: Union catalog for material culture images and objects • OLIVIA: shared “backroom” cataloging system • Harvard Geospatial Library • OASIS: Finding aid union catalog for archival and manuscript collections • “Templated dbs” – roll your own within a common basic framework • Biomedical images • Milman Parry • Biological specimen illustration (“fish watercolors”) • In process: Course catalog archive, Iranian Oral History
Digital Repositories in general • A repository stores and manages digital content and provides related functions, but which functions it provides (e.g. discovery, versioning, subsetting, rendering, rights management) varies from repository to repository • Digital object modeling: • What’s the “object”? A file? A set of files that make up a book? Multiple resolutions of an image? • Preservation/service levels • Bit-level preservation • Migration • At point of obsolescence vs. on demand • Emulation
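The object-modeling question above can be made concrete with a small sketch. It is illustrative only and does not describe how Harvard's repository models objects: the class names (`FileVersion`, `DigitalObject`), the role labels, and the file names are all hypothetical. It shows one way to treat a book as an object made of page objects, each of which holds several resolutions of the same image.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FileVersion:
    """One stored file; 'role' is a hypothetical label for its purpose."""
    path: str
    role: str        # e.g. "archival master" or "delivery"
    mime_type: str


@dataclass
class DigitalObject:
    """An 'object' may own files directly, contain child objects, or both."""
    identifier: str
    label: str
    files: List[FileVersion] = field(default_factory=list)
    parts: List["DigitalObject"] = field(default_factory=list)

    def all_files(self) -> List[FileVersion]:
        """Collect this object's files plus those of every nested part."""
        found = list(self.files)
        for part in self.parts:
            found.extend(part.all_files())
        return found


def make_page(number: int) -> DigitalObject:
    """Build a page object with two resolutions of the same image."""
    return DigitalObject(
        identifier=f"page-{number}",
        label=f"Page {number}",
        files=[
            FileVersion(f"page-{number}.tif", "archival master", "image/tiff"),
            FileVersion(f"page-{number}.jpg", "delivery", "image/jpeg"),
        ],
    )


# A two-page "book" whose only content is its pages.
book = DigitalObject("book-1", "Example book", parts=[make_page(1), make_page(2)])
masters = [f.path for f in book.all_files() if f.role == "archival master"]
```

Whether the "object" is the book, the page, or the single file changes what gets an identifier, what gets described, and what gets preserved together, which is why the modeling choice comes before the preservation choice.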
Harvard’s Digital Repository Service • DRS stores digital content • Discovery happens in external catalogs • Current policies: • Any Harvard unit may use the repository, but… • Content must be • “library-like”: intellectual assets held by a unit of the institution • In an approved format • Deemed to have long-term (decades or centuries) value • Developing submission agreements to formalize rights and responsibilities of individual depositors and repository • July 2005: As a result of content grants, Digital Repository contains • 2.8 million objects • 10.1 terabytes of data • Lots more in the pipeline
System architecture (diagram; grouping reconstructed from the diagram labels) • Catalog data sources: ILS; image cataloging system; SGML/XML editors; ArcCatalog; other internal sources (Embark, etc.); outside sources (OCLC, RLIN, ICPSR) • Access portal: multi-catalog access; collection web sites; OPAC; Visual Resources Catalog; Geospatial Catalog; Biomed Image Catalog; SocSci Data Catalog; Finding Aids Catalog; A & I databases; other catalogs; full-text search • Common services: access management; naming • Delivery services: collection; single image; page turner; audio streaming; GIS interface; statistical data interface; multimedia • Repositories: HUL Repository; Geospatial Repository; Statistical Data Repository; other local and external • Content sources: image conversion services (+ OCR); audio conversion services; “born-digital”; outside services
Specialist advisors • Reformatting • Metadata • Licensing / Legal issues • Digital archiving and preservation* • Project management • Included in content grants and for-hire * 1998 was too soon; not filled
Reformatting concerns • Appropriate outcomes for source materials • Quality of reproduction • Optimization for preservation • Optimization for network delivery • Cost control • Workflow
Metadata concerns • Supporting specified functionality • Meeting community requirements for sharing and reuse (Standards!) • Designing for longevity, portability (Standards!) • Cost control • Workflow
Collection Building • Internal grants • Content sources • Conversion from current collections • Materials created digitally within the university • Commercial sources • Proposed content projects had to • Contribute to building critical mass of content of persistent intellectual value in limited topical areas • Utilize or further develop technical infrastructure • Have library sponsorship, regardless of source of proposal or source of materials • 7 grant rounds (27 content projects, 9 access projects)
Committees • LDI Steering Committee • LDI Grant Committee • Oversight committees cover • Visual Systems (VIA/OLIVIA) • OASIS (finding aid catalog) • Harvard Geospatial Library • Infrastructure/delivery systems (DRS, Naming, etc.) • Portal/federated searching • Digital Acquisitions • LDI Architecture Review • others… • Development priorities are set by University Library Council voting, with some infrastructure pieces deemed required and not subject to vote • Where appropriate, committees include faculty and/or staff from museums and archives, central IT, provost’s office, and academic computing.
A word about preservation • Whether you think you are in the preservation business or not, you are! • Digital stuff goes bad • How will you know? • What will you do? • Requires both organizational and technical response
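One common technical answer to "How will you know?" is a fixity audit: record a checksum for each file at deposit and recompute it later. The sketch below is a minimal illustration using Python's standard library, not a description of the DRS. The function names (`sha256_of`, `record_manifest`, `audit`) and the sample file are hypothetical.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def record_manifest(paths):
    """Map each file path to the checksum it had at deposit time."""
    return {str(p): sha256_of(p) for p in paths}


def audit(manifest):
    """Return {path: problem} for files that are missing or have changed."""
    problems = {}
    for name, expected in manifest.items():
        path = Path(name)
        if not path.exists():
            problems[name] = "missing"
        elif sha256_of(path) != expected:
            problems[name] = "changed"
    return problems


# Demonstration on a throwaway file: intact, then altered, then deleted.
with tempfile.TemporaryDirectory() as workdir:
    sample = Path(workdir) / "page-1.tif"
    sample.write_bytes(b"original bits")
    manifest = record_manifest([sample])
    clean_report = audit(manifest)      # nothing has changed yet

    sample.write_bytes(b"original bitz")
    damaged_report = audit(manifest)    # content no longer matches

    sample.unlink()
    missing_report = audit(manifest)    # file is gone
```

An audit like this only answers "How will you know?". "What will you do?" still depends on having a second copy to restore from and on someone being responsible for acting on the report, which is the organizational half of the response.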
Challenges • How to carve out resources to take on new tasks • Filling the infrastructure • Systems integration • Policy development and organizational response to digital content • Selection (restricting on format, control, source, rights, etc.) • Licensing/acquisition • Cataloging • Preservation • Retention decisions • Who pays, and for what? • Not a one-time fix; need to “institutionalize flexibility”
Academic Computing • I-Commons • Provost initiative to provide courseware components • Instructor’s Toolkit • Video presentation tool • Polling tool • I-Sites • President’s Information Technology Fellows • work one-on-one with faculty to develop digital course materials
Not everything is a nail • Different organizational imperatives • Coordination with like units at other institutions is more important than coordination with unlike units at Harvard • Functional requirements differ • Granularity of what is described varies • Metadata varies from community to community • descriptive practices vary • terminology varies • technical formats for metadata exchange vary • Community has specific technical infrastructure, data format and metadata standards of its own • E.g. • Astrophysics: Astrophysics Data System, FITS data format • Botany
LDI 2 • 2004: External review of LDI • Extremely positive • Recommendation to Corporation: • Fund it again • Recommendations to LDI: • Ensure that Harvard students and faculty value the digital resources LDI has created • Integrate LDI infrastructure and content with other digital environments at Harvard, particularly course management • Do better measurement and assessment
LDI 2 • 2004: New 5-year internal grant • Two-part program • Infrastructure creation and maintenance • Initiative areas: • Integration projects with academic computing initiatives • Targeted digital resource development • Archiving born-digital materials • Digital preservation • Assessment and measurement
Links • Library Digital Initiative • http://hul.harvard.edu/ldi/ • Harvard Libraries portal • http://lib.harvard.edu/ • VIA (Visual Information Access) • http://via.harvard.edu/ • OASIS (Finding aid union catalog) • http://oasis.harvard.edu • Harvard Geospatial Library • http://nrs.harvard.edu/urn-3:hul.eresource:hgeodesy • Harvard-MIT Data Center • http://vdc.hmdc.harvard.edu/VDC/
Links, continued • Open Collections Program • http://ocp.hul.harvard.edu/ • Women Working • http://ocp.hul.harvard.edu/ww/ • TED (templated) databases: • Milman Parry Collection of Oral Literature • http://nrs.harvard.edu/urn-3:hul.eresource:milparco • Biomedical Image Library (almost no content) • http://nrs.harvard.edu/urn-3:hul.eresource:bioimlib • Museum of Comparative Zoology artwork (has content, but not “in production” yet) • http://ted.hul.harvard.edu:8080/ted/deliver/home?_collection=mcz
Links, take 3 • JHOVE (JSTOR/Harvard Object Validation Environment) • http://hul.harvard.edu/jhove/ • Global Digital Format Registry planning site • http://hul.harvard.edu/gdfr/ • PREMIS (Preservation metadata) • http://www.oclc.org/research/projects/pmwg/default.htm • METS (Metadata Encoding & Transmission Standard) • http://www.loc.gov/standards/mets/
r_wendler@harvard.edu Thank you! Computation lab at night [photograph SC207], ca. 1947 Harvard University Archives, UAV 605 (SC207)