
Issues in Human Rights Web Archiving

Robert Wolven, Columbia University Libraries. Libraries have a mission to build, organize, and preserve coherent collections for research. There’s a great deal of human rights-related content on the web. Much of it is not currently collected by libraries.



  1. Issues in Human Rights Web Archiving. Robert Wolven, Columbia University Libraries. Human Rights Archives and Documentation, CHRDR Conference, 4-6 October 2007

  2. Libraries have a mission to build, organize, and preserve coherent collections for research • There’s a great deal of human rights-related content on the web • Much of it is not currently collected by libraries • Something should be done about that

  3. A great deal of content exists only online • There’s a high risk that some will disappear • Libraries and archives are custodians of our cultural heritage • Libraries and archives should lead in preserving “at risk” content

  4. Web Archiving as Preservation • Small footprint in organization • The Hoover effect • Haphazard library collections • Ineffective access

  5. A Lot of Content

  6. Much of it is not … collected • Refugees International • 40 documents on web site • 0 in Columbia collections • 10 listed in OCLC • 1 held by more than 2 libraries • No library holds more than 3
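The counts on this slide can be read as coverage rates. The following is a minimal sketch that only restates the slide's figures as percentages of the 40 documents on the web site; the dictionary keys and the `share` helper are illustrative names, not part of the presentation.

```python
# Counts taken from the slide; the key names are illustrative assumptions.
counts = {
    "on_web_site": 40,
    "in_columbia_collections": 0,
    "listed_in_oclc": 10,
    "held_by_more_than_2_libraries": 1,
}


def share(part: int, whole: int) -> float:
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


total = counts["on_web_site"]
for label, value in counts.items():
    if label != "on_web_site":
        print(f"{label}: {share(value, total):.1f}% of documents on the web site")
```

Read this way, a quarter of the documents are listed in OCLC at all, and only one of the forty is held by more than two libraries.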

  7. Web Archiving Issues • Ways and Means • Selection policies • Permissions – and Obligations • Organization and Integration • Presentation and Uses • Sharing the Costs and Benefits • Organizational Transformation

  8. Center for Research Libraries’ Political Communications Web Archive Project • Project website: http://www.crl.edu/content/PolitWeb.htm • Final report: http://www.crl.edu/PDF/PCWAFinalReport.pdf

  9. Web Archiving Tools • Archive-It (Internet Archive) • http://www.archive-it.org • PANDAS (National Library of Australia) • http://pandora.nla.gov.au/pandas.html • OCLC Digital Archive • http://www.oclc.org/digitalarchive/default.htm

  10. “I only want to download text/html and nothing else. Can I do it?” • You can … add a filter that excludes all filters that end in other than 'html|htm', etc., or, if you want to instead look at document mimetypes, you can Add a ContentTypeRegExpFilter filter as a midfetch filter to the http fetcher. • [From Heritrix FAQ http://crawler.archive.org/faq.html#user-heritrix]
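The two approaches in the FAQ answer, filtering on the URL ending versus filtering on the document mimetype, can be illustrated outside of any crawler. The sketch below is plain Python, not Heritrix configuration; the function names and regular expressions are assumptions chosen for illustration.

```python
import re
from urllib.parse import urlsplit

# Illustrative patterns only; these are not Heritrix settings.
HTML_URL = re.compile(r"\.html?$", re.IGNORECASE)
HTML_MIME = re.compile(r"^text/html\b", re.IGNORECASE)


def keep_by_url(url: str) -> bool:
    """Keep a URL only if its path ends in .html or .htm."""
    return bool(HTML_URL.search(urlsplit(url).path))


def keep_by_content_type(content_type: str) -> bool:
    """Keep a response only if its Content-Type header starts with text/html."""
    return bool(HTML_MIME.match(content_type.strip()))


if __name__ == "__main__":
    print(keep_by_url("http://example.org/report.html"))     # path ends in .html
    print(keep_by_url("http://example.org/report.pdf"))      # path ends in .pdf
    print(keep_by_content_type("text/html; charset=UTF-8"))  # HTML mimetype
```

The URL test can be applied before anything is fetched, but it drops pages served from paths with no extension (for example a site's home page at `/`). The mimetype test catches those, at the cost of starting the request first, which is presumably why the FAQ describes it as a midfetch filter.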

  11. Policy   Technology • Full web site or selected content? • Preserve relationships, “look and feel”? • All file types? • How often?

  12. “Corrigendum Changes have been made to the report on "Muslims in the European Union – Discrimination and Islamophobia" after it was printed. Following pages are replacing the Annex page in the EN and FR version of the report. PDF” From the European Union Agency for Fundamental Rights website

  13. Selection by Type of Agency • Governmental • International • Academic • Educational

  14. Selection by Focus • Global, regional, local • Ethnicity, religion, gender, age • Legal, medical, economic • Crisis-driven

  15. Selection by Content • Fixed documents: • Case studies • Position papers • Topical reports • Press releases • Bulletins, newsletters • Activity reports

  16. Selecting by Content • Non-textual (image, sound, video) • Ephemeral, dynamic content • Redundant (?) content: • Languages, formats • Republished or unique?

  17. Rights and Obligations • Permissions: ask or assume • Rights: • Dark archive • Closed archive • Conditional exposure • Obligations: • Parallel (mirror) access • Free, reliable access • Perpetual access

  18. Organization and Integration … or, now that we have it, how do we know what we’ve got? How do other agencies know what’s been done? How do researchers find it?

  19. “From 1 March the European Monitoring Centre on Racism and Xenophobia (EUMC) became the EU Agency for Fundamental Rights (FRA). The content on the website is being gradually transformed to reflect the scope, activities and products of the new Agency.”

  20. Integrating Access • Through Authority Control • Through controlled vocabulary • Through series

  21. Integrating Collections • With print – in the catalog • With archives – in finding aids • With digital collections – in …

  22. Use • Internal organization and navigation • Indexing • Analytical tools • Citation: pedigree and persistent links

  23. Sharing Costs and Benefits • Centralized Collaboration • Distributed Collaboration • Disclosure (at what level of detail) • Exposure (to the web; OAI-PMH)
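Exposure through OAI-PMH means that other agencies can harvest an archive's metadata with simple HTTP requests. As a rough sketch, the function below builds a `ListRecords` request for Dublin Core records; the base URL and set name are hypothetical, and a real harvester would also need to handle resumption tokens and error responses.

```python
from typing import Optional
from urllib.parse import urlencode


def list_records_url(base_url: str,
                     metadata_prefix: str = "oai_dc",
                     set_spec: Optional[str] = None) -> str:
    """Build an OAI-PMH ListRecords request URL for a repository."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        # Optional: restrict the harvest to one set defined by the repository.
        params["set"] = set_spec
    return f"{base_url}?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical repository endpoint and set name, for illustration only.
    print(list_records_url("https://archive.example.org/oai", set_spec="humanrights"))
```

How much an archive discloses then becomes a question of which records and sets it chooses to publish at that endpoint.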

  24. Transformative Action • Concept of “collecting” • Modes of selection • Bridging communities of practice

  25. “Where do you stop?”
