Digital Continuity: Tips for Managing e-Legacy Records
Stephen Clarke, Senior Advisor, Digital Sustainability Programme
“Vindictive” Data Loss
• Gareth Pert, 23, nearly crippled Hamilton business Progressive Hydraulics while acting out of "pure vindictiveness" by deliberately wiping data.
• Files containing information about international patents, crucial project data and five years' worth of engineering drawings were affected.
• Police said the deleted data was worth more than $150,000, but the true cost is incalculable because of delayed or lost projects and time spent on recovery.
• Computer forensics specialists could recover only 40% of the lost data.
The Company CIO Said…
"Electronic data is actually worth a lot more money than you think. It's not until you lose it that you realise what a key component it is in your business."
That was deliberate, but benign data loss is actually a greater threat
• Media Failure. All storage media must be expected to degrade with time, causing irrecoverable bit errors, and to suffer sudden, catastrophic, irrecoverable loss of data.
• Hardware Failure. All hardware components must be expected to suffer transient recoverable failures, such as power loss, and catastrophic irrecoverable failures.
• Software Failure. All software components must be expected to suffer from bugs that pose a risk to the stored data.
• Communication Errors. Network transfers may deliver data with undetected checksum errors.
• Failure of Network Services. Domain names and persistent URLs will suffer both transient and irrecoverable failures.
Data corruption (bit-rot)
Only one bit of one byte is corrupted in this image!
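A single flipped bit is easy to miss by eye but easy to catch with a checksum. The following is a minimal sketch, assuming a SHA-256 checksum was recorded when the file was first stored; the record content and the function names `sha256_hex` and `flip_bit` are hypothetical and only illustrate the idea.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hex string."""
    return hashlib.sha256(data).hexdigest()


def flip_bit(data: bytes, byte_index: int, bit_index: int) -> bytes:
    """Return a copy of data with exactly one bit inverted (simulated bit-rot)."""
    corrupted = bytearray(data)
    corrupted[byte_index] ^= 1 << bit_index
    return bytes(corrupted)


# Hypothetical content standing in for a stored file.
original = b"Engineering drawing, revision 5"
stored_checksum = sha256_hex(original)  # recorded when the file was stored

# Simulate one corrupted bit in one byte.
damaged = flip_bit(original, byte_index=3, bit_index=0)

print(sha256_hex(original) == stored_checksum)  # True: file is intact
print(sha256_hex(damaged) == stored_checksum)   # False: the flipped bit is detected
```

Re-computing checksums on a schedule and comparing them with the stored values is one common way to notice silent corruption while a good copy still exists elsewhere.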
That was deliberate, but benign data loss is actually a greater threat (cont.)
• Media & Hardware Obsolescence. All media and hardware components will eventually fail.
• Software Obsolescence. Similarly, software components will become obsolete. This will often be manifested as format obsolescence, when information can no longer be decoded from the storage format into a legible form.
• Operator Error. Operator actions must be expected to include both recoverable and irrecoverable errors.
• Natural Disaster. Natural disasters, such as flood, fire and earthquake, must be anticipated.
Hardware Obsolescence
• Hardware has a limited life span
Software Platform Obsolescence
• Assuming you have all the right hardware and storage, you then need the right software and operating system to interpret the data and render it as it is supposed to look.
• Application software, Operating System, Display
Storage Formats Obsolescence
• Storage media has a limited life span
That was deliberate, but benign data loss is actually a greater threat (cont.)
• External Attack. All systems connected to public networks are vulnerable to viruses and worms.
• Internal Attack. Much abuse of computer systems involves insiders, those who have or used to have authorized access to the system.
• Economic Failure. Information in digital form is much more vulnerable to interruptions in the money supply than information on paper.
• Organizational Failure. Organisations may not plan and provide sufficient resources for ensuring their digital assets are protected.
Source: http://www.dlib.org/dlib/november05/rosenthal/11rosenthal.html
There is a wide variety of e-legacy records
• Email
• SMS/Text messages
• Databases
• GIS
• Textual records
• Audiovisual recordings
• Pictures and images (scanned docs)
• Intranets and shared workspaces
As well as…
Sheer volume
• The volume of digital information being created is increasing exponentially.
• In 2008 the digital content created exceeded storage capacity for the first time.
• By 2011, the volume of digital content will be 10 times the size it was in 2006.
• By 2011, almost half of all information created will not have a permanent home.
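To put the tenfold figure in yearly terms, here is a small worked calculation. It assumes smooth compound growth between 2006 and 2011, which is a simplification and not something the slide states.

```python
# Tenfold growth from 2006 to 2011, as quoted on the slide.
growth_factor = 10
years = 2011 - 2006

# Assumption: the growth compounds evenly each year.
annual_factor = growth_factor ** (1 / years)
annual_growth_percent = (annual_factor - 1) * 100

print(f"{annual_growth_percent:.0f}% per year")  # prints "58% per year"
```

Under that assumption the volume grows by a little over half again every year.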
e-Legacy Records Issues - Technological
• Hardware / media obsolescence
• Operating system obsolescence
• Software application obsolescence
• Storage media obsolescence
e-Legacy Records Issues - Organisational
• Proprietary formats and DRM can affect your ability to access information
• New IT implementations often don’t take account of existing systems, so information gets orphaned
• Benign neglect is commonplace
• Lack of controlling indexes or context
• Idiosyncratic titling and folder structures
• Lack of organisational awareness and willingness
AAArgggghhhh!!!! Where do I start?
Where do I start?
• Identify what you have
• Make an inventory of formats or software environments you use
• Prioritise ‘at risk’ information
• Migrate where there are ‘quick wins’, e.g. from older versions of Microsoft Office products (PowerPoint, Word, Excel, etc.)
• Raise awareness and get senior management support
• Draft organisational or departmental policies
• Does the material need to be retained, or can I dispose of it?
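A first-pass inventory can be as simple as counting files by extension on a shared drive. The sketch below assumes the records sit in an ordinary file system; the function name `inventory_by_extension` is hypothetical, and extensions are only a rough hint at the real format.

```python
from collections import Counter
from pathlib import Path


def inventory_by_extension(root: str) -> Counter:
    """Count the files under root, grouped by lower-cased file extension."""
    counts = Counter()
    for path in Path(root).rglob("*"):
        if path.is_file():
            extension = path.suffix.lower() or "(no extension)"
            counts[extension] += 1
    return counts


# Example usage (the folder path is a placeholder):
# for extension, count in inventory_by_extension("/path/to/shared/drive").most_common():
#     print(f"{extension}\t{count}")
```

Sorting by count shows where the bulk of the holdings are; rare or unfamiliar extensions are often the ones worth checking first for risk.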
Make friends with your IT people
Courtesy National Archives of Australia
Make friends with dept. secretaries and PAs
They know where everything is!
Steps to managing e-legacy records
• Identify the creators of the records contained in the legacy system
• Identify the physical format
• Determine the software format
• Identify the context of the records’ use where possible
• Decide on the appraisal to apply: disposal and sentencing, migration strategies and risk analysis
• Convert to open formats
Identifying creators
Implement an institutional knowledge management programme to find out about:
• Organisational administrative history
• Individuals' names, roles and positions
• Project working groups
• Previous mergers or amalgamations
• New functions or functions no longer carried out
• What all those %$#@#+# acronyms mean!
Tools that are available to help with identifying file formats include:
• PRONOM
• DROID
• JHOVE
• National Software Reference Library
• Wotsit
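The idea behind signature-based format identification can be illustrated with a short sketch. This is not how any of the listed tools is implemented; it only checks a handful of well-known leading byte sequences ("magic numbers"), and the names `SIGNATURES`, `identify_bytes` and `identify_file` are hypothetical. Dedicated tools use far larger signature registries and should be preferred for real work.

```python
# A few well-known file signatures; real registries hold many more,
# along with offsets and version detail.
SIGNATURES = [
    (b"%PDF-", "PDF document"),
    (b"\x89PNG\r\n\x1a\n", "PNG image"),
    (b"\xff\xd8\xff", "JPEG image"),
    (b"GIF87a", "GIF image"),
    (b"GIF89a", "GIF image"),
    (b"PK\x03\x04", "ZIP container (also used by ODF and OOXML)"),
    (b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1", "OLE2 compound file (legacy Microsoft Office)"),
    (b"fLaC", "FLAC audio"),
]


def identify_bytes(header: bytes) -> str:
    """Return a format label based on the leading bytes, or 'unknown'."""
    for magic, label in SIGNATURES:
        if header.startswith(magic):
            return label
    return "unknown"


def identify_file(path: str) -> str:
    """Read the first few bytes of a file and identify its format."""
    with open(path, "rb") as handle:
        return identify_bytes(handle.read(16))
```

Checking content rather than the file name matters for legacy material, where extensions are often missing, wrong or reused by unrelated software.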
Open a hardware museum?
• Find out what hardware you have in-house: 8” drives, 5 ¼” drives, cartridge players, etc.
• Find out what software you have in-house: earlier versions of Windows, Photoshop, in-house developed software, proprietary systems, etc.
Risk evaluation (and tools)
• Risk associated with records’ formats, with context and authenticity
• The AS/NZS 4360:1999 Standard on Risk Management
• DRAMBORA - Digital Repository Audit Method Based on Risk Assessment
• Trustworthy Repositories Audit & Certification (TRAC): Criteria and Check-list
• NESTOR - Network of Expertise in Long-Term Storage of Digital Resources
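A simple way to rank holdings is to score each one by likelihood of loss multiplied by consequence of loss, a convention widely used in risk management. The sketch below is an illustration only: the 1 to 5 scales, the `RecordSet` class and the example holdings are assumptions, not values taken from any of the standards or tools listed above.

```python
from dataclasses import dataclass


@dataclass
class RecordSet:
    name: str
    likelihood: int   # 1 (rare) to 5 (almost certain) chance of loss
    consequence: int  # 1 (insignificant) to 5 (catastrophic) impact of loss

    @property
    def risk_score(self) -> int:
        """Likelihood multiplied by consequence; higher means more urgent."""
        return self.likelihood * self.consequence


def prioritise(record_sets):
    """Return the record sets ordered from highest to lowest risk score."""
    return sorted(record_sets, key=lambda r: r.risk_score, reverse=True)


# Hypothetical holdings, scored by judgement.
holdings = [
    RecordSet("Email archive on current server", likelihood=2, consequence=4),
    RecordSet("Engineering drawings on 5.25-inch disks", likelihood=5, consequence=5),
    RecordSet("Old intranet pages", likelihood=3, consequence=2),
]

for record_set in prioritise(holdings):
    print(record_set.risk_score, record_set.name)
```

The scores are only as good as the judgement behind them, but even a rough ranking helps decide which migration to fund first.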
Undertake a review (records audit / survey)
• What is the business value?
• Are there compliance or legal hold considerations?
• Financial implications: litigation, unnecessary storage costs, fraud, loss of contracts or agreements, accounts payable/receivable errors and/or omissions
Digital Preservation Tactics
• Normalisation
• Migration (AKA conversion or technology refresh)
• Emulation
• Encapsulation
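Normalisation means converting many source formats into a small set of open target formats. The sketch below shows the simplest possible case, plain text in a legacy encoding rewritten as UTF-8. The function name `normalise_text_to_utf8` is hypothetical, and the default source encoding of Windows-1252 is an assumption that must be checked for each collection; office documents, images and databases need dedicated tools.

```python
from pathlib import Path


def normalise_text_to_utf8(source: str, target: str,
                           source_encoding: str = "cp1252") -> None:
    """Write a UTF-8 copy of a text file, leaving the original untouched.

    Working on bytes keeps the original line endings exactly as they were.
    A wrong source_encoding may raise UnicodeDecodeError or, worse, decode
    silently to the wrong characters, so spot-check the results.
    """
    text = Path(source).read_bytes().decode(source_encoding)
    Path(target).write_bytes(text.encode("utf-8"))


# Example usage (file names are placeholders):
# normalise_text_to_utf8("minutes_1997.txt", "minutes_1997.utf8.txt")
```

Keeping the original alongside the normalised copy means a better conversion can be run later if the first one turns out to be flawed.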
Open Source Tools
• Fedora – digital archive
• DSpace – digital archive
• DROID – format recognition
• JHOVE – format identification and validation
• SIARD – database archiving
• Xena – normalisation
www.sourceforge.net
Open Format Examples
• ODF - OpenDocument Format
• XML – eXtensible Markup Language
• HTML – Hypertext Markup Language
• PNG – Portable Network Graphics
• FLAC – Free Lossless Audio Codec
There are emerging and de facto standards, e.g. PDF/A, OOXML, ODF, JPEG 2000, TIFF, etc.
It’s not just a technology issue
• Survey staff on what older e-records they have and encourage them to self-migrate
• Use institutional knowledge and find out what systems have been used and where old equipment is
• Engagement is higher when staff feel involved (de-mystify)
• Implement policies and procedures so that obsolescence will be managed in future
Is addressing digital continuity difficult or expensive?
• The digital continuity actions required are incremental and needn’t cost a lot of money (e.g. setting policies, procedures and migration plans).
• Recognise that digital continuity is a risk, carry out a risk assessment and prioritise mitigating actions (focus resources on the important stuff).
• Managing the information you need to keep more effectively, and disposing of what you no longer need, should cut costs and could also deliver efficiency benefits and savings.
• Tackling risks now, as part of a planned response, is often more cost-effective than waiting until technology risks occur (data recovery is expensive and not always possible).
Any Good News?
• File formats are becoming obsolete more slowly than previously thought (the market is stabilising)
• Storage costs are coming down
• Established practice is emerging (you can reuse)
• Trusted standards are being adopted (XML, ODF, PDF/A, JPEG 2000, etc.)
• You’re not alone! There is a growing international community of practice
• We’re providing guidance and support to help you
• The Digital Continuity Action Plan provides a public sector wide framework to facilitate collaborative approaches
Any Questions?
Courtesy National Archives of Australia