Digital Libraries and e-Archiving at CERN: Challenges and Solutions for the Scientific Community

Explore the challenges and solutions for managing scientific data in the digital age at CERN. Discover the significance of digital libraries, open access publishing, archival preservation, and more in scholarly communication. Learn about data retrieval, indexing, and the transition from print to digital formats. Join the discussion on the future of information dissemination and collaboration in the scientific community.


Presentation Transcript


  1. Digital Libraries and e-Archiving at CERN: Challenges and Solutions for the Scientific Community “First” 28th September 2006 Tim Smith CERN/IT

  2. Why Such A Hot Topic? • Software: ... • National repositories: ... • National strategies: ... • International initiatives: The European Library ... • Conferences: ECDL, iPres, ... • Industry: Google Scholar / Book • WWW + Google + Internet archive • Not enough? • Data ≠ Information ≠ Knowledge

  3. Scholarly Communication • Publisher: copy editing, consistency, conventions, refereeing, publication, dissemination • Library: subscription, collection management, classification, cataloguing, indexing, reference retrieval, archival, search, access • Reader: library/journal subscription, communities, find • Author: manuscript preparation • Digital Library • WWW

  4. Digital Library Services • Aggregation • Collection • Conversion • > 100 sources • Expose CERN authored material • Organisation • Enrichment • Stamping • Watermarking • Indexing • Ranking • Clustering • Classifying

  5. Open Access • Scholarly publication ≠ trade publication • Signatory of Berlin Declaration • Author grants • free, irrevocable, worldwide, perpetual right of access, … • Store in repository • Unrestricted distribution, interoperability, long-term archiving, …

  6. Digital Age Services • Thus far, changed form not function • Reproduced paper chain • Take advantage of native digital services • Collaboration • Comments, reviews, baskets • Immediacy • Email alerts, RSS feeds • Intensive tasks • Keyword & citation extraction • Full text indexing & ranking • Conversion services: multiple download formats • Flexible formats • Remove constraints of print versions • Internationalisation
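The "full text indexing & ranking" item above can be illustrated with a minimal sketch. This is not the CERN implementation: it assumes a plain inverted index with a simple TF-IDF score, and the record identifiers and texts (`rec1`, `rec2`, `rec3`) are invented for illustration.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Build an inverted index: term -> {doc_id: term count}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for term, count in Counter(tokenize(text)).items():
            index[term][doc_id] = count
    return index


def search(index, n_docs, query):
    """Rank documents by summed TF-IDF over the query terms."""
    scores = defaultdict(float)
    for term in tokenize(query):
        postings = index.get(term, {})
        if not postings:
            continue  # term appears in no document
        idf = math.log(1 + n_docs / len(postings))
        for doc_id, tf in postings.items():
            scores[doc_id] += tf * idf
    # Highest score first; ties broken by document id for stable output.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical records, not real library content.
docs = {
    "rec1": "Open access publishing at CERN",
    "rec2": "Digital preservation of multimedia archives",
    "rec3": "Open archives and digital library preservation strategies for digital records",
}
index = build_index(docs)
results = search(index, len(docs), "digital preservation")
print(results)
```

A production service would add stemming, field weighting and citation-based ranking; the sketch only shows why indexing is an "intensive task" that pays off at query time.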

  7. Internationalisation

  8. Connections and Statistics

  9. Reviews and Comments

  10. Key Word Extraction

  11. Digital Age Processes • Thus far, same actors and processes • Print medium was difficult to produce, distribute, archive, duplicate • Not so for electronic media! • Publisher's role: certification and dissemination • How to get in (digital world) • Authority, Authenticity, Quality • Exploring new forms of peer review • Open Access publishing: CERN initiative • Author-pay model • Break the vicious circle: Tenure / grant allocation

  12. Advocacy and Coverage • Legal deposit • Natural focal point: everything passed through publisher/printer • Encouraging / promoting deposit • CERN publishing policy – deposit in eArchive • Harvesting • CDS missing submissions • Theoretical papers: close to 100% • Experimental papers: average, about 70% • Instrumentation papers: only 30%
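The slide says only "Harvesting" and does not name a protocol. As one possible illustration, the sketch below assumes the widely used OAI-PMH protocol with Dublin Core metadata: it builds a `ListRecords` request URL and parses a response. The base URL, the sample record and the helper names are hypothetical, and no network call is made.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# XML namespaces defined by OAI-PMH 2.0 and unqualified Dublin Core.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url, metadata_prefix="oai_dc", from_date=None):
    """Build a ListRecords request URL for a repository endpoint."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date:
        params["from"] = from_date  # harvest only records changed since this date
    return base_url + "?" + urlencode(params)


def parse_records(xml_text):
    """Return (identifier, title) pairs from a ListRecords response."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.findall(".//oai:record", NS):
        identifier = rec.findtext("oai:header/oai:identifier", namespaces=NS)
        title = rec.findtext(".//dc:title", namespaces=NS)
        records.append((identifier, title))
    return records


# Hypothetical endpoint and response, for illustration only.
url = list_records_url("https://repository.example.org/oai2d", from_date="2006-01-01")

SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>A hypothetical instrumentation paper</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

records = parse_records(SAMPLE)
print(url)
print(records)
```

Harvested identifiers could then be compared against the local catalogue to find the missing submissions the slide refers to.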

  13. Digital Age Content • Multimedia • CPU intensive services: web download format preparation from masters • Data behind the publication • Experimental data sets • Log books • Institutional information • Multimedia records of the experiment life-cycle • Financial, social etc • Dissemination of unfinished, unrefereed work

  14. Video Archives • EGEE Interview: Bob Jones • 120 kbps (2439 kb), 480 kbps (9814 kb), 1000 kbps (20702 kb), 2000 kbps (40092 kb), Multirate120 1000kbps (32977 kb)
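The listed bitrates and file sizes can be sanity-checked with a little arithmetic. The sketch assumes that "kbps" means kilobits per second and that the sizes in parentheses are kilobytes; neither unit is confirmed by the slide. Under those assumptions, duration is roughly size × 8 / bitrate. The multirate entry is left out because it mixes several streams.

```python
# (bitrate in kbit/s, file size in kilobytes) as listed on the slide;
# the unit interpretation is an assumption.
streams = [(120, 2439), (480, 9814), (1000, 20702), (2000, 40092)]


def duration_seconds(kbps, kilobytes):
    """Estimate playback time: kilobytes * 8 bits per byte / (kilobits per second)."""
    return kilobytes * 8 / kbps


durations = [duration_seconds(kbps, size) for kbps, size in streams]
for (kbps, size), seconds in zip(streams, durations):
    print(f"{kbps:>5} kbps, {size:>6} kB -> about {seconds:.0f} s")
```

All four single-rate encodings come out between about 160 and 166 seconds, consistent with one short interview prepared in several download formats from the same master.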

  15. CDS Content and Usage 70% non-CERN

  16. Not “born-digital” • Multimedia archive project • Metadata: key to retrieval • Photo-caption project (retirees) • Formats: VHS (1980s), Beta SP (1980s), open-reel audio (1950s), U-matic (1970s)

  17. Digitisation for Preservation • Deposit in Digital Library • Improve access • Halt deterioration of objects • Archiving of knowledge to preserve perennial access • Institutional archives • Subject Archives • Digital preservation needs • Strategies • Certification • Networks of backups • Storage model

  18. Perpetual Access • Active curation • Used to be largely passive until conservation work required • Technology obsolescence • Not always possible to create exact digital copy or replicate appearance • Changing media or file format • Need to verify integrity, authenticity, reliability • Audit trails and checksums: to eliminate transcription errors (accidental or deliberate) • Associated metadata • Digital object and metadata encapsulation: ISO 14721 OAIS • Multiple copies for security • Across different administrations: Los Alamos declass reps • LOCKSS and CLOCKSS
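The checksum and multiple-copies points can be combined in a small fixity check. This is a sketch under assumptions, not the procedure used at CERN or by LOCKSS: it records a SHA-256 digest for a master file and flags any replica whose digest differs. The file names and contents are invented, and the "replicas" are temporary local files rather than copies held by different administrations.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_replicas(expected, replicas):
    """Return the replica paths whose checksum differs from the recorded one."""
    return [path for path in replicas if sha256_of(path) != expected]


with tempfile.TemporaryDirectory() as tmp:
    master = Path(tmp) / "master.bin"
    copy_a = Path(tmp) / "copy_a.bin"
    copy_b = Path(tmp) / "copy_b.bin"
    master.write_bytes(b"digitised tape, frame data")
    copy_a.write_bytes(b"digitised tape, frame data")
    copy_b.write_bytes(b"digitised tape, frame dat4")  # simulated corruption

    recorded = sha256_of(master)  # stored at ingest, e.g. in an audit trail
    damaged = verify_replicas(recorded, [copy_a, copy_b])
    damaged_names = [path.name for path in damaged]

print(damaged_names)
```

A checksum detects change but cannot repair it, which is why the slide pairs it with multiple copies: a damaged replica is replaced from one that still matches the recorded digest.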

  19. Outlook • CERN is implementing solutions to manage hundreds of PB of LHC data • CERN’s knowledge is being amassed in a Digital Library which is “safe on a 10yr timescale” • DB migration, redundancy, backups • Long-term preservation (100-year timescale) is an unsolved problem, but there are many initiatives • Bringing together IT specialists, librarians, archivists, museum curators, (authors) ...
