Danish Legal Deposit on the Internet:

Current Solutions and Approaches for the Future. ECDL, September 2001. By Birgit N. Henriksen, Head of Digitization and Web Department, The Royal Library, Denmark (bnh@kb.dk).


  1. Danish Legal Deposit on the Internet: Current Solutions and Approaches for the Future. ECDL, September 2001. Birgit N. Henriksen, Head of Digitization and Web Department, The Royal Library, Denmark (bnh@kb.dk)

  2. Presentation outline. Three different initiatives: • Selection-based archiving (in production since 1998) • netarchive.dk (new project, multiple archiving strategies, 2001-2002) • Nordic Web Archive (project 2000-2001, access to web archives)

  3. The Danish Legal Deposit Law • 1697: All printers in royal and ducal lands must deposit • 1703: Only printers in Copenhagen have to deposit • 1781: All printers in royal and ducal lands must deposit • 1902: All printed materials to be deposited • 1927: Posters and some types of ephemera excluded • 1997: All published works to be deposited

  4. The law from 1997 covers any work published in Denmark regardless of medium. • “work”: a delimited quantity of information which must be considered a final and independent unit • “published”: when … copies of the work have been placed on sale or otherwise distributed to the public

  5. Types of Net Publications. Static (only periodically updated), included: • monographs • periodicals. Dynamic (continuously updated), excluded: • databases • homepages

  6. www.pligtaflevering.dk

  7. How do we get the material? • Download based on notification. NOT by: • Harvesting the Danish domain • Delivery of works (a collection of files) from the individual publishers

  8. Registration • WHO: the person in charge of the technical completion of the digital copy • HOW: by filling out a form at http://www.pligtaflevering.dk

  9. Registration Form - Monographs

  10. Download - workflow. The staff at the Danish Department, The Royal Library: • determine whether a publication is covered by the law • if yes, download all files belonging to the work • check the downloaded work • catalogue and classify the work in the OPAC (periodicals only) • transfer the work to the archival server (server mirrored nightly to State and University Library, Århus)
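
The workflow on slide 10 can be sketched in code. This is only an illustration under stated assumptions: the `Notification` class, the `is_covered` rule and the `fetch` callback are hypothetical names, not part of The Royal Library's actual system, and the coverage rule is reduced to the static/dynamic split from slide 5.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Notification:
    """A publisher's registration of a net publication (hypothetical model)."""
    title: str
    kind: str                       # "monograph", "periodical", "database", ...
    file_urls: List[str] = field(default_factory=list)


def is_covered(n: Notification) -> bool:
    # Assumption: only static types (monographs, periodicals) are covered.
    return n.kind in {"monograph", "periodical"}


def process_notification(n: Notification,
                         fetch: Callable[[str], bytes]) -> List[str]:
    """Return the list of workflow steps that were carried out."""
    if not is_covered(n):
        return ["rejected: not covered by the law"]
    steps: List[str] = []
    # Download every file that belongs to the work.
    files: Dict[str, bytes] = {url: fetch(url) for url in n.file_urls}
    steps.append("downloaded")
    # Minimal stand-in for the manual check: no files or an empty file fails.
    if not files or any(len(data) == 0 for data in files.values()):
        return steps + ["check failed"]
    steps.append("checked")
    # Per the slide, only periodicals are catalogued in the OPAC.
    if n.kind == "periodical":
        steps.append("catalogued in OPAC")
    steps.append("transferred to archival server")
    return steps


if __name__ == "__main__":
    def fake_fetch(url: str) -> bytes:
        # Stand-in for a real HTTP download.
        return b"<html>example</html>"

    journal = Notification("Example Journal", "periodical",
                           ["http://example.org/issue1.html"])
    print(process_notification(journal, fake_fetch))
```

Passing `fetch` in as a parameter keeps the sketch independent of any particular download tool; a real system would plug in its own downloader there.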

  11. Plug-ins

  12. System Environment

  13. Domain names in .dk domain

  14. Volume in archived material

  15. Monographs vs Periodicals

  16. Public vs. Private Publishers

  17. Staff resources

  18. MimeType Statistics – % of collected files

  19. Three generations using the internet

  20. • Brochures • Product databases/portals • Organisation websites • Newsletters/minutes on websites • Online services like krak.dk • Net Art. The modifications from 1902: • Brochures and advertisements • Catalogues • Election campaign material • Club/organisation magazines • Songs • Scouting magazines, church newsletters • Maps • Portraits • Art prints

  21. Problems related to the notification concept • Lack of notification of multiple representations of a publication • Lack of notification of new versions

  22. Problems related to technical issues • Errors or inconsistencies in the published files • Java applets – no solution at the moment • Solutions found for earlier problems: • Documents with JavaScript • Data behind forms • Data behind username/password logins • Cookie-based session handling • SSL encryption

  23. Gains if harvesting is used • Better coverage of Denmark outside the public sphere • Updated versions – also for static publications • New trends on the net as soon as they appear

  24. Why not only harvesting? • Programs and plug-ins are difficult to keep track of • Harvesting is not always possible (e.g. streamed and webcast material) • Harvesting may not give a useful result: - technical problems (Java, interactive sites) - personalised sites • Harvesting may produce a collection of documents that have never existed on the net • Harvesting may not always give the best format for long-term preservation

  25. Net Art

  26. Home banking

  27. Searching the catalogue

  28. Collections made by harvesting • Are not complete (see previous slides) • No robot will ever be able to make a ’true’ snapshot – the snapshot contains a mix of documents that have never been published together at the same time – a ’fake’
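
The ’fake’ snapshot effect on slide 28 can be shown with a toy model. This is a minimal sketch under invented assumptions: the two-page site, its version histories and the fetch times are made up for illustration and do not describe any real crawl.

```python
def version_at(history, t):
    """Return the version live at time t.

    history is a list of (start_time, version) pairs sorted by start_time.
    """
    current = None
    for start, version in history:
        if start <= t:
            current = version
    return current


# Hypothetical site: both pages are republished together at t = 5.
site = {
    "/index": [(0, "index-v1"), (5, "index-v2")],
    "/news": [(0, "news-v1"), (5, "news-v2")],
}

# A robot cannot fetch everything at once: here it reaches /index at t = 3
# and /news at t = 7, i.e. on either side of the update.
fetch_times = {"/index": 3, "/news": 7}
snapshot = {path: version_at(site[path], t) for path, t in fetch_times.items()}


def state_at(t):
    """The complete state of the site as it was actually published at time t."""
    return {path: version_at(history, t) for path, history in site.items()}


# Compare the harvested snapshot with every state the site really had.
real_states = [state_at(t) for t in range(0, 10)]
existed_together = snapshot in real_states

print(snapshot)
print("Snapshot matches a real published state:", existed_together)
```

In this toy case the robot stores the old index next to the new news page, a combination that was never online at the same moment.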

  29. Archive for Danish Literature • www.adl.dk from 1 October 2001 • All full texts are structured in XML at work level • The XML is loaded into a database • The database performs the web publishing in well-formed HTML at page level. What do we prefer to archive and for what purpose?
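
The gap between work-level XML and page-level HTML on slide 29 can be illustrated as follows. This is a sketch only: the `<work>`, `<title>`, `<page>` and `<p>` element names are a hypothetical schema invented for the example, not the actual structure used by www.adl.dk, and the database step is left out.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical work-level source document.
WORK_XML = """<work>
  <title>Example Work</title>
  <page n="1"><p>First page text.</p></page>
  <page n="2"><p>Second page text &amp; more.</p></page>
</work>"""


def render_pages(xml_text):
    """Turn one work-level XML document into a dict of page-level HTML strings."""
    root = ET.fromstring(xml_text)
    title = root.findtext("title", default="")
    pages = {}
    for page in root.findall("page"):
        # Re-escape the text so each generated page is well-formed on its own.
        body = "".join(
            "<p>{}</p>".format(escape(p.text or "")) for p in page.findall("p")
        )
        pages[page.get("n")] = (
            "<html><head><title>{}</title></head><body>{}</body></html>"
            .format(escape(title), body)
        )
    return pages


if __name__ == "__main__":
    for number, html_page in render_pages(WORK_XML).items():
        print(number, html_page)
```

A harvesting robot would only ever see the generated pages, while the work-level structure stays behind in the source XML, which is one way to read the question the slide ends on.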

  30. Birte Christensen-Dalsgaard: Archive Experience, not Data

  31. Web Archiving Conference, CPH June 2001 • Focus: User expectations of web archiving in DK • Brought together: • members of the user community, scholars as well as scientists • members from the organisations traditionally in charge of preserving oral and written material • members with technical knowledge • Proceedings (UK) – netarchive.dk

  32. Web Archiving Conference, CPH June 2001 • Scholars & scientists: • Archive the dynamic part of the web • Focus on archiving • the content • the context • the evidence of use • Archivists: • Use different archiving approaches • New methods for archiving dynamic material • Budgets for making snapshots and making selective collections are comparable

  33. Birte Christensen-Dalsgaard: 3 dimensions - duration. [Diagram axis: signal lifetime (published, static; hourly update; real time dialog)] • Book-like publications • Scientific journals • News sites • Chat

  34. Birte Christensen-Dalsgaard: 3 dimensions - Permanent value. [Diagram axis: permanent value (persistent; transient)] • What is worth preserving? • Quality vs. representative

  35. Birte Christensen-Dalsgaard: Background - Nature of Information. [Diagram with three axes: signal lifetime (published, static; hourly update; real time dialog), permanent value (persistent; transient), interactivity (static; dynamic)]

  36. Birte Christensen-Dalsgaard: Domain of different harvesting methods. [Diagram with three axes: signal lifetime (published, static; hourly update; real time dialog), permanent value (persistent; transient), interactivity (static; dynamic). Regions marked: Legal Deposit, DK; Accumulative harvesting; Snapshot]

  37. Birte Christensen-Dalsgaard: What is missing? [Diagram with three axes: signal lifetime (published, static; hourly update; real time dialog), permanent value (persistent; transient), interactivity (static; dynamic). Regions marked: Legal Deposit, DK; Accumulative harvesting; Snapshot]

  38. netarchive.dk (1) Test different archival approaches and the subsequent usability of the archived material for research. [Diagram with three axes: signal lifetime (published, static; real time dialog), permanent value (persistent; transient), interactivity (static; dynamic). Regions marked: Snapshot; Accumulative; Process]

  39. netarchive.dk (2) • Pilot project testing different archival approaches and the subsequent usability of the archived material for research • Project partners: • State and University Library, Aarhus • Centre for Internet Research • The Royal Library • With economic support from the Danish Electronic Research Library (DEF) • Period: August 2001 – July 2002 • Case: Danish municipal elections November 2001

  40. netarchive.dk (3) • Which materials, and with: • what frequency? • what collection method? • which software? • How should the collection of materials be organized and how should it be stored? • How should obsolescence of data formats be dealt with? • How should access be given? • Budgets for collecting and storing

  41. netarchive.dk (4) Net material covered by netarchive.dk: • net activities from existing news media (newspapers, radio, TV; national, regional and local media) • political parties’ official pages, national and local • individual politicians’ personal pages • official (county) municipal pages • voters’ personal pages • »local themes« pages • special interest organisations • portals in the broadest sense • opinion polling firms • public emails / press releases • news groups / usenet • net-conferences and chat

  42. How do we catch the missing part? Process rather than material – ‘Filming’ the net through a browser. Goal: Catch chronological series of displayed web pages. Tools to take into consideration: • Business intelligence tools • Tools used in usability laboratories …

  43. Nordic Web Archive (NWA) • Establish a Danish test archive in order to participate in NWA • Software: NEDLIB robot • Status as of 1 September 2001: • Archiving started 20 August 2001 • 1.9 million documents • 43 GB uncompressed data
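
The status figures on slide 43 allow a rough back-of-envelope calculation. The sketch below makes two assumptions that the slide does not state: that "GB" may mean either 10^9 or 2^30 bytes (both are computed), and that harvesting ran evenly over the whole period, so the daily rate is only an average.

```python
from datetime import date

documents = 1.9e6                  # 1.9 million documents (from the slide)
size_gb = 43                       # 43 GB uncompressed (from the slide)
start = date(2001, 8, 20)          # archiving started
status = date(2001, 9, 1)          # date of the status report

# Average document size under both readings of "GB".
avg_bytes_decimal = size_gb * 10**9 / documents
avg_bytes_binary = size_gb * 2**30 / documents

# Average harvesting rate, assuming an even crawl over the whole period.
days = (status - start).days
docs_per_day = documents / days

print("Average size: {:.1f} kB (decimal GB) or {:.1f} KiB (binary GB)".format(
    avg_bytes_decimal / 1000, avg_bytes_binary / 1024))
print("Days of archiving:", days)
print("Average documents per day: {:,.0f}".format(docs_per_day))
```

Either reading puts the average document somewhere between 22 and 25 thousand bytes, harvested over 12 days.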

  44. Questions?