Working collaboratively in digital preservation Welcome! We’re just waiting for everyone to join and then we’ll get started...
Working collaboratively in digital preservation
Paul Wheatley
SPRUCE Project Manager, University of Leeds
Twitter: @prwheatley
http://openplanetsfoundation.org/blogs/paul
Housekeeping
• Welcome!
• http://bit.ly/OPFcolab
• Please mute your microphone
• Please open your chat box
• If you can’t hear me, please let me know!
Overview
• Community fail: what has gone wrong in the digital preservation community?
• The benefits of collaboration: how working together is good for us all, but most importantly you
• Get involved: a sample of the best collaborative initiatives and how you can be part of them
• The end result: what we can achieve when we collaborate
Community fail: what has gone wrong in the digital preservation community?
• Community fail
• Preservation costing example
• Tools and user needs example
• Finding tools example
Community fail
• Insufficient communication, awareness, sharing and user-driven development
• Duplication / reinvention / insufficient reuse of existing tools and approaches
• Impact:
  • Real challenges on the ground fail to be solved
  • Tools not fit for purpose
  • Best practice? Virtually non-existent
  • Practitioners struggle
Example: Digital preservation costing initiatives
• LIFE 1, 2 and 3: Projects to explore digital preservation costing and develop costing models
• Cost Model for Digital Preservation (CMDP): Project at the Royal Danish Library and the Danish National Archives to develop a new cost model. Currently covers Planning, Migrations and Ingest
• Keeping Research Data Safe 1 and 2 (KRDS): Cost model and benefits analysis for preserving research data
• Presto Prime cost model for digital storage
• Cost Estimation Toolkit (CET): Data centre costing model and toolkit, from NASA Goddard
• Cost Model for Small Scale Automated Digital Preservation Archives (Strodl and Rauber)
• APARSEN Project activity focused on digital preservation costing
• EPSRC and JISC study on cost analysis of cloud computing for research
• Cost forecasting model for new digitization projects (Excel and web tool under development) (Karim Boughida, Martha Whittaker, Linda Colet, Dan Chudnov)
• DP4lib business and cost model for a digital preservation service
• DANS Costs of Digital Archiving Volume 2 Project, focusing on preservation and dissemination of research datasets
• Blue Ribbon Task Force on Sustainable Digital Preservation and Access
• Economic Sustainability Reference Model
• ENSURE Project: Enabling kNowledge Sustainability Usability and Recovery for Economic value
• Cost Model for Electronic Health Records (Bote, Fernandez-Feijoo and Ruiz)
• 4C: EU-funded project on costing. Due to commence in 2012. Led by JISC
• http://wiki.opf-labs.org/display/CDP/Home
• An extended blog-rant on why this typifies a big #fail for our community
It gets worse...
• Screening the Future: Managing the cost of archiving, master class, 22nd May, California
• Nordbib: The Costs and Benefits of Keeping Knowledge, 11th June, Copenhagen
• JCDL2012: Models for Digital Cost Analysis, 14th June, Washington DC
Blue Screen of Death image courtesy of Bill Lefurgy, from the Atlas of Digital Damages
The tools don’t work....
“10 years on we are still pretty much talking about the same things... ...Tools like DROID and PRONOM etc. didn’t work properly then, and they still don’t work properly now.”
Steve Knight, New Zealand National Library, iPRES2012 (blogged by Inge Angevaare: How are we doing as a community?)
Mismatch of solutions to user needs
• Tools and services that solve digital preservation problems we don’t have
• Tools and services that are difficult to use, or difficult to integrate with a user’s setup/workflow/technology
• Focus on preservation planning, migration, emulation, file format obsolescence...
• In the first instance practitioners need better characterisation:
  • Appraisal / assessment
  • Risk identification
  • Quality assurance
Put the users in the driving seat
• Give the users and practitioners more of a voice
• Capture and articulate the challenges more effectively
• Share, discuss and refine our requirements
Example: Finding digital preservation tools
• Where do you look when you need a preservation tool to solve a particular problem?
Too many lists, not enough collaboration
• One tool registry
• Utilised (and supported) by the big organisations
• Anyone can edit it
• Links to the code, links to user experiences
The benefits of collaboration: how working together is good for us all, but most importantly you
• Community
• Individual
• Note of caution
Benefits of collaboration to the community
• Understanding the problem
  • Capture the challenges
  • Share requirements
  • Focused solutions to the problems we have
• Understanding approaches to solving the problem
  • What works well
  • What doesn’t work so well
  • Best practice guidance
• Developing a solution
  • Pool development resource
  • Solve a shared problem, you have a group of users
  • Bigger impact, better solutions
Benefits of collaboration (individually)
• Learning opportunity
• New ways of working
• More efficient
• Raises your profile
• Reassuring and supporting
• Fun
Note of caution: the flip side
• Overheads, e.g. comms and coordination
• Removal of all redundancy is probably a bad thing in digital preservation!
• Dependence on others is a risk
• Choose your partnerships carefully
Get involved: a sample of the best collaborative initiatives and how you can be part of them
• Atlas of Digital Damages
• OPF Format Corpus
• Stack Exchange
• Mashups
The SPRUCE Mashup: identify and solve concrete preservation problems
• 3-day workshops for ~30 people
• Practitioners bring along digital collections
• We identify preservation challenges
• Pair up practitioners with technical experts
• Apply existing open source tools to solve the problems
• In doing so, we exchange knowledge about digital preservation
• Begin to develop a supportive community
Photo: Glasgow Mashup, April 2012
Mashups: some observations
• Almost every Mashup solution from 5 different events utilised existing (non-DP) tools
• The challenge is finding the right tool to apply
• The group’s collective knowledge is very useful here
• Most valuable aspect is the conversation and knowledge sharing
• Practitioners and developers *can* understand each other!
• Agile development offers a number of advantages
The end result: what we can achieve when we collaborate
• Jpylyzer
• Datasets, Issues and Solutions
• File Format November
• Golden rules
Jpylyzer example
• New characterisation + validation tool for JPEG2000
• Produced by SCAPE and OPF
• Development and operation driven by use cases
  • E.g. JP2 is used at scale in mass digitisation efforts and truncation is a common potential problem, yet existing tools didn’t check the (quite complex) end-of-file conditions
  • Flawed creation tools omit critical metadata in created files
  • Validation use case: check JP2 files from 3rd party digitisation suppliers against a profile
• A number of organisations had the same needs
• Shared example files enabled testing of solution
• Focused tool: easy to use, easy to embed
• Now part of Goobi!
• http://openplanetsfoundation.org/software/jpylyzer
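To make the truncation use case concrete, here is a minimal Python sketch of the kind of end-of-file check the slide describes. It is not jpylyzer's implementation, which validates far more of the file structure. The function names `quick_jp2_check` and `looks_truncated` are hypothetical, and the sketch assumes the codestream is the last thing in the file, which is common but not required for JP2.

```python
# Illustrative sketch only; this is not how jpylyzer itself is implemented.

# The 12-byte JP2 signature box that opens every JP2 file.
JP2_SIGNATURE = bytes.fromhex("0000000C6A5020200D0A870A")
# The JPEG 2000 end-of-codestream (EOC) marker.
EOC_MARKER = bytes.fromhex("FFD9")


def quick_jp2_check(data: bytes) -> dict:
    """Run two cheap sanity checks on the raw bytes of a JP2 file."""
    return {
        "has_signature": data.startswith(JP2_SIGNATURE),
        # Assumption: the codestream is the final item in the file,
        # so a complete file should end with the EOC marker.
        "ends_with_eoc": data.endswith(EOC_MARKER),
    }


def looks_truncated(data: bytes) -> bool:
    """Hypothetical helper: starts like a JP2 but lacks the EOC marker at the end."""
    result = quick_jp2_check(data)
    return result["has_signature"] and not result["ends_with_eoc"]
```

To try it on a supplier file, read the file in binary mode and pass the bytes to `looks_truncated`. A check this shallow can only flag obvious damage; a pass says nothing about the validity of the rest of the file, which is the gap a full validator is meant to close.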
Golden rules for collaboration in digital preservation
• Share, share and share again
• Open licensing FTW
• Think about the best location
• Utilise existing infrastructure
  • Flickr, GitHub, Stack Exchange, etc...
• Before you start a new initiative/development/whatever...
  • Check there isn’t something you can build on
• A document dies when it’s published...
  • A wiki has the chance to live on if it’s of interest to a community
• If you’re not on Twitter, you’re missing out
In summary
Collaboration is:
• good for you and for those you collaborate with
• quick and easy to get involved in
• actually quite good fun!
So please get involved:
• http://bit.ly/spruce-collaborate
Thanks for listening! Any questions?
http://bit.ly/spruce-collaborate
Paul Wheatley
SPRUCE Project Manager, University of Leeds
Twitter: @prwheatley
Email: p.r.wheatley@leeds.ac.uk
http://openplanetsfoundation.org/blogs/paul
Cartoon images courtesy of digitalbevaring.dk