
Digital Archiving at Elsevier

Digital Archiving at Elsevier. Joep Verheggen, ScienceDirect. ICSTI Conference, London, 17 May 2004. Agenda: short introduction about Elsevier; archiving: why is this so important and what is our position; “YOAS” project; “technical aspects”. Note: this presentation focusses on journal content.


Presentation Transcript


  1. Digital Archiving at Elsevier Joep Verheggen, ScienceDirect ICSTI Conference, London, 17 May 2004

  2. Agenda • Short introduction about Elsevier • Archiving; why is this so important and what is our position • “YOAS” project • “Technical aspects” Note: this presentation focusses on journal content

  3. Elsevier vision... …to deliver superior information products and services that provide solutions for scientists, medical professionals and librarians ...

  4. Archiving terminology • there can be confusion when talking of archives between: • (1) ongoing access to current services and • (2) long-term storage and preservation of the intellectual content • we provide for both in our licenses • this presentation relates primarily to (2)

  5. Long-term preservation • significance of going “e-only” • many university and corporate libraries have cancelled paper and use electronic only -- and this is increasing weekly • e-only puts greater pressure on archival preservation -- and archiving of both the print and the electronic versions • archiving high on the agenda of individual libraries and library groups

  6. Responsibility for archiving • Elsevier takes digital archiving seriously • responsibility to authors • responsibility for maintaining “the minutes of science” • importance to the library community • interest in maintaining an asset

  7. Broad range of actions • have participated in discussions, projects and committees related to digital archiving since 1995 • among the first (after AIP) to make a public archiving commitment and perhaps the first to incorporate it in our license • currently making a multi-million dollar investment in internal back-up systems

  8. Current license language • since 1999, all ScienceDirect licenses for online service contain an annex specifying: • we will maintain a permanent archive of the SD journals we own • we will migrate the archive as the technology used for storage or access changes • we will transfer the archive to an independent, librarian-approved depository if we cannot maintain it

  9. Sizing the problem • there are more than 1800 Elsevier journals on ScienceDirect • we are retrodigitizing: creating digital backfiles from v. 1, n. 1 on all titles • expect to have more than 6 million articles on ScienceDirect by the end of this year • original size estimate of total file: 50 million pages, 6.5 to 7 terabytes • Project started in 2001, completed in 2004
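As a rough check on the size estimate in slide 9, the quoted totals can be turned into an average storage figure per page. This is only a sketch: it assumes decimal terabytes (10^12 bytes), which the slide does not specify, and the function name is illustrative.

```python
# Back-of-envelope check of the slide 9 estimate:
# 50 million pages in 6.5 to 7 terabytes.
PAGES = 50_000_000
BYTES_PER_TB = 10**12  # assumption: decimal terabytes; the slide does not say


def bytes_per_page(total_tb: float, pages: int = PAGES) -> float:
    """Average storage per page for a given total archive size."""
    return total_tb * BYTES_PER_TB / pages


low = bytes_per_page(6.5)
high = bytes_per_page(7.0)
print(f"about {low / 1000:.0f} to {high / 1000:.0f} kB per page")
```

Under that assumption the estimate works out to roughly 130 to 140 kB per page on average.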

  10. Types of archives • internal production “archive”: Electronic Warehouse, not ScienceDirect • “de facto archives”: about 10 regular ScienceDirect OnSite (SDOS) customers worldwide who get everything or nearly everything for local loading (but make no archiving commitment beyond their constituency)

  11. Types of archives -- continued • self-designated “national” archives: libraries or other institutions that choose to maintain an archival copy locally as a national security measure; variation on SDOS license • “official Elsevier archive”: formal, contractual relationship between Elsevier and a trusted archival institution to provide permanent retention and access to the digital files for future generations

  12. Official Elsevier archives • we did an investigative project with Yale University Library (with funding from the Mellon Foundation) which was completed in early 2002 • signed the first formal agreement for an official archive with the Koninklijke Bibliotheek (KB) in August, 2002 • likely to do 3-4 additional agreements (in North America, Asia and Europe)

  13. Koninklijke Bibliotheek • a recognized international leader in digital archiving investigations • fortunately, also our national library • Elsevier was already sending electronic files for its 351 Dutch imprint journals • now expanding to the entire 1,800 title journal list, which the KB will archive “forever”

  14. Official archive contract terms • contract is different from a normal license for SD • perpetual nature of an archive • service level agreement • trigger events -- public access • financial terms • format for submission • comprehensiveness of archive (e.g., handling of “withdrawn” material) • as standards for archival repositories develop, KB must meet these

  15. Use of the official archives • available for walk-in users now • available remotely to anyone in the event we exit the business and no one else takes over • in the event of a disaster that would result in ScienceDirect being down for a prolonged period, all libraries holding the journals (archives or SDOS) would be invited to open access to all (no access controls)

  16. “Technical aspects”; LOCKSS principle Hardware • Dayton hosting system is located in a bunker that is tornado-, earthquake-, and aircraft-impact-proof • Daily incremental backups, weekly complete backups • Off-site copies of backups, extensive recovery procedures in place • Migration to new types of hardware formats on every new version release
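The backup rhythm on slide 16 (daily incremental, weekly complete) implies that restoring any given day needs the most recent complete backup plus every incremental taken since. The sketch below illustrates that logic only. It is not Elsevier's actual procedure, and the choice of Sunday for the weekly complete backup, like the function names, is an assumption made for the example.

```python
from datetime import date, timedelta

# Assumption: the weekly complete backup runs on Sunday (weekday() == 6).
# The slide does not say which day is used.
FULL_BACKUP_WEEKDAY = 6


def backup_kind(day: date) -> str:
    """Return which kind of backup the schedule takes on a given day."""
    return "full" if day.weekday() == FULL_BACKUP_WEEKDAY else "incremental"


def restore_chain(target: date) -> list[date]:
    """Dates of the backups needed to restore the state as of `target`:
    the latest full backup, then each daily incremental up to `target`."""
    days_since_full = (target.weekday() - FULL_BACKUP_WEEKDAY) % 7
    last_full = target - timedelta(days=days_since_full)
    return [last_full + timedelta(days=i) for i in range(days_since_full + 1)]


# Example: restoring the state as of Wednesday 19 May 2004
for day in restore_chain(date(2004, 5, 19)):
    print(day.isoformat(), backup_kind(day))
```

With this kind of schedule a restore never needs more than one complete backup and six incrementals, which is the usual trade-off between daily backup volume and restore effort.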

  17. “Technical aspects” – continued Software: all formats are generally accepted standards/formats (developed to last and/or easy to migrate) • Text: full SGML, migrating to XML this year • Older content: “Head & tail” in SGML/XML • Text: PDF (derived from PostScript file) • Older content: laser printer quality (300 dpi scanning) • Images: TIFF, JPEG, GIF (for web applications) • Multi-media files: we support a small number of formats that will be usable in coming decades

  18. Thank you!
