Government Information Preservation Working Group
Highlights of Digital Preservation Survey
December 16th, 2003
Oliver Slattery
Information Access Division
National Institute of Standards and Technology
Need for Digital Preservation:
• …crucial…critical…essential…important…
• Legally required.
• Principal role of agency/central to agency mission.
• 30 to 100s of years.
• Archive distribution and central requirements of data assets.
• Important for department to provide secure, accessible, archival information on QC testing and other technical work.
• Continuity of operations.
• The need to stay current.
• Records are ‘permanent’.
Challenges in the next 5 years:
• Specific challenges/tasks
  • Websites (archiving of)
  • Preservation with online/on-demand access
  • Coordinating/integrating preservation procedures
  • Migration of current archive
  • Ensuring authenticity
• Other concerns
  • Management/record keeping
  • Defining digital preservation
  • Test capabilities/equipment (procurement – cost and time)
  • Uniformity among suppliers of digital documents
  • Same document through every phase of life cycle
Obstacles:
• Large/increasing volumes of data
• Multiple formats/format compatibility
• Quality/capacity of media
• Storage space
• Getting customers to use latest media
• Upgrading infrastructure/equipment – procurement (cost and time)
• Ensuring authenticity
Current strategy and its limitations:
• Control formats – Limit: must be done at creation.
• Use tapes to store and distribute data – Limit: tapes are expensive, will soon no longer be made, and are susceptible to errors.
• DLT, CD/DVD-ROM, PDF/TIFF – Limit: size, cost, compatibility.
• Networked computer disk drives and backup magnetic media. Systems include Access databases and a laboratory test database called Testream (SQL) – Limit: the Access portion is not secure or traceable. Backup may be insufficient. No assurance of data accessibility if formats change.
• Coordinate the preservation of born-digital items – Limit: resources.
• Currently migrating from analog to digital. Still acquire in analog, but send out to customers in digital. Moving towards full digital acquisition – Limit: storage space and budget. Process is slow.
• From archive to CD/DVD for distribution. ‘Deep archive’ facilities for long-term storage – Limit: large data sets are too big for current archive media capacities.
• HD media (tapes) such as DLT and SDLT, etc., servers/LAN, some web-based access – Limit: network throughput is small and nearing its limit. Automation is not available for HD preservation work.
Research we want to see:
• Reliability
  • Media durability
  • Physical testing and artificial aging of digital media to predict durability.
  • Preservation media.
  • Testing and evaluation of media. Important to share results.
  • Large capacity, reliable archive media.
  • Development of media analysis tools.
  • Detect changes in media error rates.
  • Classical issues such as video archiving, microfilm preservation issues, environmental studies.
• Procedures/Best Practices
  • Methods for migration of legacy information.
  • Safeguards to ensure authenticity and version control of archived docs.
  • Practices and procedures. Digital is easy to change but hard to detect changes!
• Information Quality and Access
  • Authentication
  • Accuracy of rendering.
  • Universal media.
  • One size fits all.
  • Safeguards to ensure authenticity and version control of archived docs.
  • PDF for archiving
  • Universal access tool.
  • Practices and procedures. Digital is easy to change but hard to detect changes!
  • Standards analysis and development.
• New/Alternative Technologies
  • Fibre Channel hard drives
  • Blu-ray discs
  • Solid state storage
  • Universal media.
  • Keeping an eye on future technology…hardware, software, formats.
  • Large capacity, reliable archive media.
• Formats
  • PDF for archiving
  • Preservation media.
  • Universal access tool.
  • Preservation format.
  • Format interconversion.
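The point that digital content is "easy to change but hard to detect changes" can be illustrated with a fixity check. The sketch below is only an illustration, not a method proposed by the survey respondents: it assumes a SHA-256 digest is recorded when an item is archived and compared again later. The function names and the sample document text are hypothetical.

```python
import hashlib


def sha256_digest(data: bytes) -> str:
    """Return the SHA-256 digest of the given bytes as a hex string."""
    return hashlib.sha256(data).hexdigest()


def is_unchanged(data: bytes, recorded_digest: str) -> bool:
    """Compare the current digest with the one recorded at archive time."""
    return sha256_digest(data) == recorded_digest


# Hypothetical archived document and a silently altered copy.
original = b"QC test report, revision 1"
altered = b"QC test report, revision 2"

# Digest stored alongside the archived item (assumed workflow).
recorded = sha256_digest(original)

print(is_unchanged(original, recorded))  # True
print(is_unchanged(altered, recorded))   # False
```

A digest only shows that the bytes differ from what was recorded. It does not say who changed them or which version is authoritative, so it complements version control rather than replacing it.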
Types of data:
• Laboratory results (from equipment)
• Records
• Graphics/Drawings
• Support data
• Binary
• Binary – seismic
• Binary – well logs
• Text
• Audio
• Data files
• Microfilm
• Multimedia/web
• Imagery (scanned, digital)
• Documents (mixed/compound, digital)
• Software
• Video
Bold = multiple hits
Capture and Collection
Absolute maximum = 50
Very important = 5 points
Quite important = 4 points
Somewhat important = 3 points
Not especially important = 2 points
Not at all important = 1 point
Capture and Collection The maximum number of hits per level of importance is 10. The minimum number of hits per importance level is 0.
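The scoring scheme can be sketched as a weighted sum. This assumes an item's score is the number of hits at each importance level multiplied by that level's points, which is consistent with the stated absolute maximum of 50 (10 hits × 5 points). The slides do not spell out this rule, and the tallies below are hypothetical, not survey results.

```python
# Points per importance level, as listed in the scoring legend.
POINTS = {
    "very important": 5,
    "quite important": 4,
    "somewhat important": 3,
    "not especially important": 2,
    "not at all important": 1,
}

MAX_HITS = 10  # stated maximum number of hits per importance level


def weighted_score(hits: dict) -> int:
    """Sum hits * points over all levels (assumed scoring rule)."""
    for level, count in hits.items():
        if level not in POINTS:
            raise ValueError(f"unknown importance level: {level}")
        if not 0 <= count <= MAX_HITS:
            raise ValueError(f"hits for {level!r} must be between 0 and {MAX_HITS}")
    return sum(POINTS[level] * count for level, count in hits.items())


# Hypothetical tallies for one survey item (not taken from the survey).
example_hits = {
    "very important": 4,
    "quite important": 3,
    "somewhat important": 2,
    "not at all important": 1,
}

print(weighted_score(example_hits))            # 4*5 + 3*4 + 2*3 + 1*1 = 39
print(weighted_score({"very important": 10}))  # 10*5 = 50, the absolute maximum
```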
Storage Media
Absolute maximum = 50
Very important = 5 points
Quite important = 4 points
Somewhat important = 3 points
Not especially important = 2 points
Not at all important = 1 point
Storage Media The maximum number of hits per level of importance is 10. The minimum number of hits per importance level is 0.
Data and Storage Management
Absolute maximum = 50
Very important = 5 points
Quite important = 4 points
Somewhat important = 3 points
Not especially important = 2 points
Not at all important = 1 point
Data and Storage Management The maximum number of hits per level of importance is 10. The minimum number of hits per importance level is 0.
Access and Distribution
Absolute maximum = 50
Very important = 5 points
Quite important = 4 points
Somewhat important = 3 points
Not especially important = 2 points
Not at all important = 1 point
Access and Distribution The maximum number of hits per level of importance is 10. The minimum number of hits per importance level is 0.
Thanks
Thanks to all who replied.
Survey creation: Jerry McFaul, Ollie Slattery, Victor McCrary, Fred Byers, Xiao Tang, Rich Vining.