Collection Development and Web Publications at the British Library
John Tuck, Head of British Collections
Digital Memory, Session 2, Tallinn, 24th November 2005
British Library Web Archiving Programme
Three strands to the Programme:
- An underpinning collection development policy
- UK collaborative approach through the UK Web Archiving Consortium (UKWAC)
- International collaboration through the International Internet Preservation Consortium (IIPC)
British Library Web Archiving Programme: Why?
- Short life-span
- Looking ahead to extension of legal deposit to non-print
- Pilot project Domain.uk as proof of concept
British Library Web Archiving Programme: Resources
- Team is divided across the Scholarship and Collections and IT directorates
- Web Archiving Programme Manager
- Curator, Web Archiving, whose responsibility includes definition of the collection development policy
- Other posts focussing on technical aspects/developments and on permissions, rights clearance and administration
Collection Development Policy
Web Archiving High Level Collection Development Policy
Given the huge scale and dynamic nature of the web (estimated at approx 5 million UK-based web sites), the British Library does not consider it practicable or affordable to aim at truly comprehensive coverage of the UK web presence. The Library's strategy is based on:
a) taking a complete snapshot of the entire UK web presence at regular intervals (possibly annually or twice a year)
b) achieving a more intensive and selective harvesting of a limited number and well-defined range of sites, building up over time to perhaps 10,000. These would be sites judged to be of research value now and in the future, reflecting the national and cultural heritage, and including a number of sites which are exemplars of web innovation.
Also included is an events-based, thematic collection strand.
Web Archiving Collection Development Progress
- Through its Curator, Web Archiving, the British Library has defined a more detailed development policy statement for UK web sites (see www.bl.uk/collections/britirish/modbritcdpwebsites.doc)
- A framework of curators within the British Library assists the Curator, Web Archiving
- Work is also carried out with partners, e.g. within the UKWAC consortium
- Longer-term aim is to consider web sites as just another format to collect within an overall collection development policy
UK Web Archiving Consortium (UKWAC)
- Officially launched in June 2004
- Comprises six institutions: British Library (lead partner), Joint Information Systems Committee, National Library of Scotland, National Library of Wales, the National Archives, and the Wellcome Trust
- Two-year pilot project with the aims of putting in place a common framework and common approaches to rights-cleared web archiving, and of establishing an archive of websites (see www.webarchive.org.uk)
- To date has archived over 700 sites. British Library input has been 700 instances of 282 sites
- The first successful selective archive of UK web space which imposes no charge for including material or for access
- Based on the National Library of Australia's web archiving application 'PANDORA'
UKWAC: Permissions-based approach (1)
- From the outset it has been the intention to seek explicit rights clearance from website owners, pending secondary legislation for the deposit of UK websites
- Common licence/template devised by UKWAC
- Sites only mounted once explicit permission has been agreed
- Some exceptions in the case of events-based collection, e.g. Asian tsunami, UK general election 2005, and London bombings, July 2005: notice and takedown policy put in place
UKWAC: Permissions-based approach (2)
- British Library has sent out more than 1,500 permission requests and has received only approximately 400 positive replies: roughly a 25% success rate
- Very few outright rejections (10), but many queries (200) and many requests that received no reply
- Not sustainable: impact on both collection size and collection balance
- Secondary legislation through the Legal Deposit Libraries Act will address this. Web sites may be moved up the agenda, with a swifter schedule for implementation than originally thought
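The permission figures above can be turned into a rough outcome breakdown. The sketch below is illustrative only: it treats the rounded slide figures (1,500 requests, 400 positive replies, 10 rejections, 200 queries) as exact, and assumes that every request not falling into one of those three groups received no reply. The names `requests_sent`, `responses` and `breakdown` are made up for this example.

```python
# Rounded figures quoted on the slide; the real totals are only approximate
# ("more than 1,500" requests, "approximately 400" positive replies).
requests_sent = 1500
responses = {"positive": 400, "rejected": 10, "queries": 200}


def breakdown(total, counted):
    """Return each outcome as a percentage of all requests sent.

    Assumption: any request not in `counted` received no reply.
    """
    no_reply = total - sum(counted.values())
    outcome = dict(counted, no_reply=no_reply)
    return {name: round(100 * n / total, 1) for name, n in outcome.items()}


shares = breakdown(requests_sent, responses)
for name, pct in shares.items():
    print(f"{name}: {pct}%")
```

With these rounded inputs the positive share comes out slightly above the quoted 25%, which is consistent with the actual number of requests being somewhat more than 1,500. On the same assumption, well over half of all requests went unanswered.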
International Internet Preservation Consortium (IIPC)
Website: http://netpreserve.org
Mission: To acquire, preserve and make accessible knowledge and information from the Internet for future generations everywhere, promoting global exchange and international relations
Goals:
- To enable the collection of a rich body of Internet content from around the world to be preserved in a way that it can be archived, secured and accessed over time
- To foster the development and use of common tools, techniques and standards that enable the creation of international archives
- To encourage and support national libraries everywhere to address Internet archiving and preservation
Working with IIPC
- Aim of IIPC is to put in place a range of tools and common standards for those tasked with web archiving
- We see IIPC and the development of tools and standards as the means of achieving a whole-domain crawl of the UK
- Recently took part in a smart crawler project and procurement with the Bibliothèque nationale de France (BnF) to put in place a prototype to enable large-scale web archiving: automatically locating content, frequency of capture and thematic linking
- Complexities of the technology have led to a new approach, now to involve the British Library, BnF and Library of Congress
- National Library of New Zealand, Library of Congress and British Library also to work on improved curator tools to facilitate the interface and work of curators dealing with websites
e-Content: Future Collection Development for other e-Formats
Offline digital publications
The British Library will seek to collect offline resources (e.g. CD-ROMs, disks, DVDs [not films]) comprehensively to the level of approximately 80% - 90% of estimated published output. Collection will be within a scope generally defined as appropriate for current research or research in the future.
e-Content: Future Collection Development for e-Formats
Online e-journals
The British Library will seek to collect e-journals with a UK imprint comprehensively to the level of approximately 80% of published output, and within a scope generally defined as appropriate for current research or for research in the future. The 20% of material not collected will reflect out-of-scope material considered to be of non-research level, together with a small element of inevitable non-compliance.
e-Content: Future Collection Development for e-Formats
Online e-books
The same collection criteria as for e-journals apply to e-books, but we believe that the build-up to 80% will be slower than for e-journals, as to a large degree e-books currently replicate printed materials and very few are at research level. E-books are not prioritised by the legal deposit libraries in the UK as an area of early Regulation under the Legal Deposit Libraries Act 2003.
e-Content: Future Collection Development for e-Formats
Databases
In the case of databases, many may not be defined as publications under the Act and thus would not be eligible for legal deposit. For formally published databases, the British Library will seek to acquire comprehensively and within the same scope and proportions as for e-journals. Note is taken, however, of the dynamic and ephemeral nature of databases and the technical challenges they will present. From the perspective of the national published archive, databases can probably only meaningfully be collected on a snapshot or last edition basis. At present online databases are being accorded a lowish priority for the Library from the perspective of both voluntary and statutory deposit. Many are more likely to be relevant to the web archiving programme.
Voluntary Deposit of Electronic Publications: Practice
- Handheld (CD-ROMs etc.): declining in number; delivered to our Legal Deposit Office; processed as other physical materials. Fully catalogued and accessible in reading rooms
- On-line materials received through voluntary deposit: new set of procedures and workflows put in place; clear collection development policy defined, enabling selection; multitude of file extensions; on-line material stored as e-mails in the first instance, then burned on DVD for storage (using Ex Libris DigiTool)
- Long-term objective is incorporation in the Digital Object Management Programme, as part of the overall digital preservation strategy