Archival Perspectives on ‘Web Archiving’ Maureen Pennock Digital Curation Centre UKOLN, University of Bath
Overview • Introduction to archival records and web archiving • Authenticity in archived web resources • Sample policies and guidelines • Fundamentals • Questions
Introduction • Web Archiving or Archiving the Web is now a major and global area of activity • Archiving and managing digital records is also a major area of activity • Records often posted on web sites and intranets: these may also need archiving • BUT … there is a difference in approaches between archiving websites and archiving web-based records
What is a Record? • BS ISO 15489 definition: “any information that is created, received and maintained as evidence and information by an organisation or person in pursuance of legal obligations or in the transaction of business” • Evidence of a transaction • Anything that: • documents a working transaction between two or more parties • documents the mission and goals of an organisation • was created or received in the course of carrying out the mission and goals of an organisation
What is an Archive? • Archives: • “…documents, irrespective of form, medium or age, intended for long-term preservation because of their continuing value.” (BS 5454 - 2000) • Web Archiving: • Collections of websites or website content that may or may not be intended for long-term preservation • Commonly (but not always) carried out by libraries • ‘Archive’ and ‘Archiving’ can mean different things to different people
Archival Records & Web Resources • Web sites can contain uniquely available informative records • Users may act or take decisions based on this information, with important consequences • Records of business transactions • Accountability & transparency • To funding bodies • To stakeholders • For legal reasons • Historically and culturally valuable • Not all web site content is a unique record • Records must be identified and selected: a collaborative task
Examples of web site records • Types of records available: • Policies • Advice • Guidelines • Procedures • Organisational information • Publications? • … • Records can also be created over the internet, using transactional systems such as web-based forms • Can be in a wide array of formats, e.g. • Static web pages • Uploaded files • Databases • Accompanying metadata • Documenting resource • Documenting actions • Documenting changes • Documenting publication details…
Authenticity (1) • That a record is what it purports to be and is complete in all essential respects • Not necessarily a given; proof may be required • Can be difficult to ascertain in digital records • Concept of ‘original’ record has lost meaning • Context and provenance not easy to identify • Records often freely created and managed • Preservation activity leads to changes in record • Is this acceptable? To what extent?
Authenticity (2)* • What this means in practice: • Must be demonstrably reliable as proof • Creation and capture • Metadata and context • Ownership/responsibility • Version control • Cataloguing standards • Records management approach goes some way to addressing these concerns * Lizzie Richmond, IWMW 2006
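One way the "creation and capture", "metadata and context", "ownership/responsibility" and "version control" points above might be supported in practice is to store a checksum and basic context at the moment of capture. The following is a minimal sketch only, not taken from the slides: the CaptureRecord class, its field names and the choice of SHA-256 are all assumptions made for illustration.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class CaptureRecord:
    """Hypothetical capture metadata; field names are illustrative only."""
    source_url: str
    captured_at: str   # ISO 8601 timestamp in UTC
    responsible: str   # person or unit responsible for the record
    version: int
    sha256: str        # checksum of the content exactly as captured


def capture(content: bytes, source_url: str, responsible: str,
            version: int = 1) -> CaptureRecord:
    """Record context and a checksum at the moment of capture."""
    return CaptureRecord(
        source_url=source_url,
        captured_at=datetime.now(timezone.utc).isoformat(),
        responsible=responsible,
        version=version,
        sha256=hashlib.sha256(content).hexdigest(),
    )


def is_unchanged(content: bytes, record: CaptureRecord) -> bool:
    """True if the content still matches the checksum taken at capture."""
    return hashlib.sha256(content).hexdigest() == record.sha256
```

A checksum of this kind only shows that the stored bytes have not changed. It does not by itself settle the question raised on the previous slide, namely how far changes introduced by preservation activity (for example, format migration) are acceptable.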
Archiving web resources/sites • Two main models • Harvesting model • Used by national and research libraries, university special collections (e.g., DACHS) and the Internet Archive • Records management model • Addresses the issues raised earlier • May be more appropriate for specific institutional records • Sample guidelines and policies available
The National Archives (UK) • Managing web resources (December 2001) • ERM toolkit for government agencies • Practical steps for active records management and sustainability • Useful identification of web-based records • Scenarios • How websites differ from other records • Management control mechanisms • Model action plan (incl. risk assessment) • Sustainability • Website: http://www.nationalarchives.gov.uk/
U.S. National Archives (NARA) • Guidance on Managing Web Records (Jan 2005) • Provides an initial, high-level framework to manage both content records on agency web sites and records documenting web site operations • Four main sections • General background, responsibilities and requirements • Managing web records – step by step guide and risks • Scheduling web records • Appendices • Website: http://www.archives.gov/
National Archives of Australia • A Policy for keeping records of web-based activity (January 2001) • Provides clear directions to Commonwealth agencies to implement mechanisms for creating, managing and retaining web-based records of value • Guidelines (March 2001) • Challenges and responsibilities • Types of web-based resources • Fundamentals of good record-keeping • Assessing risk – factors to consider • Strategic & technical options • Storage & preservation - issues & strategies • Determining the best option
Managing web-based records • Fundamentals: • Information Audit and Risk Assessment • A systematic approach • Develop policy • Formulate plan for capture, maintenance, and preservation • Implement appropriate website maintenance procedures • Assign and document responsibilities • Identify records • Determine retention requirements • Capture records into recordkeeping system • Add metadata • Transfer content and metadata into archive as appropriate * Based on NAA Guidelines for Archiving Web Resources
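The "determine retention requirements" step above could be expressed as a simple lookup against a retention schedule. This is a hedged sketch under stated assumptions: the record types, the retention periods and the names RETENTION_YEARS, WebRecord and disposal_date are hypothetical and do not come from the NAA guidelines.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

# Hypothetical retention schedule: years to keep each record type.
# None means "keep permanently". All values are illustrative assumptions.
RETENTION_YEARS = {
    "policy": None,
    "guideline": 10,
    "news_item": 2,
}


@dataclass
class WebRecord:
    url: str
    record_type: str
    published: date


def disposal_date(record: WebRecord) -> Optional[date]:
    """Return the date after which the record may be disposed of,
    or None if it is to be kept permanently."""
    # A KeyError here flags a record type missing from the schedule.
    years = RETENTION_YEARS[record.record_type]
    if years is None:
        return None
    target_year = record.published.year + years
    try:
        return record.published.replace(year=target_year)
    except ValueError:
        # 29 February has no counterpart in a non-leap year; use 28 February.
        return record.published.replace(year=target_year, day=28)
```

Raising an error for an unscheduled record type, rather than silently applying a default, reflects the slide's point that records must first be identified before retention is decided.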
Thank You • Questions? • Maureen Pennock • m.pennock@dcc.ac.uk • Join the DCC Associates Network at http://www.dcc.ac.uk (it’s free!)