What Agencies Should Know About PDF/A-1
April 6, 2006
Mark Giguere
mark.giguere@nara.gov
Introduction
Agenda
• Why long-term preservation of PDF is an issue
• Overview of PDF/A-1 and the ISO process
• Discussion of the PDF/A-1 standard and NARA’s Transfer Guidance for permanent PDF records
• Roles of both PDF/A-1 and NARA’s PDF Transfer Guidance in Federal recordkeeping
• Conclusion and questions
Long-term preservation of PDF is an issue
Wide use of PDF
• PDF is a ubiquitous open format for electronic documents
  • Proprietary, but with publicly available specification
• Much important information maintained in PDF
  • Permanent archival records, in some cases
• The feature-rich nature of PDF can complicate preservation efforts
PDF Not a Suitable Archival Format
• PDF itself is not suitable as an archival format
  • Some features not compatible with current archival requirements
    • Not necessarily self-contained
    • Encryption
  • Not all PDFs are created equal
• Long-term solution needed
  • Permanent archival records, in some cases
• Administrative Office of U.S. Courts initiated idea for an ISO Standard based on PDF (PDF/A)
Overview of PDF/A-1 and the ISO Process
• Multi-part ISO International Standard
  • ISO 19005-1:2005, Document management – Electronic document file format for long-term preservation – Part 1: Use of PDF 1.4 (PDF/A-1)
  • Part 2 (19005-2) intended to bring PDF/A into conformance with PDF 1.6
  • Part 3 (19005-3) intended to address dynamic content (e.g., JavaScript)
  • Additional future parts, as necessary
PDF/A-1 Approach
[Diagram: PDF/A, drawn against the PDF 1.4 Reference, specifies required features, restricted features, and prohibited features]
• PDF/A-1 specifies:
  • The subset of PDF components from the PDF 1.4 Reference that are either required, restricted, or prohibited, and
  • How these components may be used by software
PDF/A-1 Requirements
• Disallows or limits features that could complicate long-term preservation, and
• Maximizes:
  • Device independence
    • Can be reliably and consistently rendered without regard to the hardware/software platform
  • Self-containment
    • Contains all resources necessary for rendering
  • Self-documentation
    • Contains its own description
  • Transparency
    • Amenable to direct analysis with basic tools
PDF/A-1 Table of Contents
1 Scope
2 Normative References
3 Terms and Definitions
4 Notation
5 Conformance Levels
6 Technical Requirements
  6.1 File Structure
  6.2 Graphics
  6.3 Fonts
  6.4 Transparency
  6.5 Annotations
  6.6 Actions
  6.7 Metadata
  6.8 Logical Structure
  6.9 Interactive Forms
Informative annexes
  Annex A - PDF/A-1 Conformance Summary
  Annex B - Best Practices for PDF/A
Bibliography
Two Conformance Levels
• Level A - Promotes the creation of PDF/A files with rich semantic and structural information
  • Uses “Tagged PDF” and Unicode character maps
• Level B - Allows less complex files such as scanned images
  • Includes all requirements of 19005-1 minimally necessary to preserve the visual appearance
  • Does not require users to define structure or other descriptive information
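A file's claimed conformance level can be read from its XMP metadata. The sketch below assumes the XMP packet has already been extracted from the PDF as an XML string, and it relies on the PDF/A identification schema (namespace `http://www.aiim.org/pdfa/ns/id/`, properties `pdfaid:part` and `pdfaid:conformance`), which the slides do not mention. The function name `read_pdfa_claim` is hypothetical.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
# Assumption: PDF/A identification schema namespace (not named in the slides).
PDFAID_NS = "http://www.aiim.org/pdfa/ns/id/"


def read_pdfa_claim(xmp_packet: str):
    """Return (part, conformance) as claimed in the XMP packet, or None.

    This reads a claim only; it does not verify that the file conforms.
    """
    root = ET.fromstring(xmp_packet)
    for desc in root.iter(f"{{{RDF_NS}}}Description"):
        # XMP allows simple properties as attributes or as child elements.
        part = desc.get(f"{{{PDFAID_NS}}}part") or desc.findtext(
            f"{{{PDFAID_NS}}}part"
        )
        level = desc.get(f"{{{PDFAID_NS}}}conformance") or desc.findtext(
            f"{{{PDFAID_NS}}}conformance"
        )
        if part:
            return part.strip(), (level or "").strip().upper()
    return None
```

A result of `("1", "A")` or `("1", "B")` corresponds to a file declaring itself PDF/A-1 at Level A or Level B. Confirming the claim still requires a real validator.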
Annexes of the Draft PDF/A Standard
• Informative Annexes provide supplemental information including:
  • Summary of the PDF structures and components disallowed, required, or limited
  • Best Practices for PDF/A-1
    • Guidelines for capturing or converting electronic documents to PDF/A-1
    • To replicate the exact quality and content of source documents
    • Required for compliance with NARA’s PDF Transfer Guidance
PDF/A-1 Dos and Don’ts
PDF/A-1 Dos:
• Embed fonts
• Device-independent color
• XMP metadata
• Tagging
PDF/A-1 Don’ts:
• Encryption
• LZW Compression
• Embedded files
• External content references
• Transparency
• Multi-media
• JavaScript
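Some of the "Don’ts" leave recognizable names in a PDF's bytes, so a rough first-pass screen is possible. The sketch below is a heuristic under stated assumptions, not a conformance validator: it searches raw bytes for PDF name tokens (`/Encrypt`, `/LZWDecode`, `/EmbeddedFiles`, `/JavaScript`, `/JS`) that commonly accompany four of the prohibited features. It can miss features and can raise false positives, and it does not cover transparency, multimedia, or external references. The function name is hypothetical.

```python
# Heuristic only: byte patterns that often accompany features PDF/A-1 prohibits.
# The marker list is an illustrative assumption, not taken from the standard.
PROHIBITED_MARKERS = {
    "encryption": (b"/Encrypt",),
    "LZW compression": (b"/LZWDecode",),
    "embedded files": (b"/EmbeddedFiles",),
    "JavaScript": (b"/JavaScript", b"/JS"),
}


def flag_prohibited_features(pdf_bytes: bytes) -> list[str]:
    """Return the names of prohibited features whose markers appear in the file."""
    found = []
    for feature, markers in PROHIBITED_MARKERS.items():
        if any(marker in pdf_bytes for marker in markers):
            found.append(feature)
    return found
```

Typical use would be `flag_prohibited_features(open(path, "rb").read())` over a batch of files, to decide which ones deserve a closer look with a full PDF/A validation tool.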
NARA’s Expectations for PDF/A
• PDF/A-1 should address some of the PDF archival issues and enable PDF records to be maintained longer as PDF
• Standard maintained by ISO, not just vendors
• Agencies should implement PDF/A-1 along with records management policies and procedures, such as:
  • NARA’s PDF Transfer Guidance
  • AOUSC’s document management program
How NARA is Addressing PDF
• In March of 2003, NARA issued PDF Transfer Guidance
  • Allowing agencies to transfer permanent records to NARA in PDF
• Participating in PDF/A ISO Standard development
  • To influence the process
  • To gain knowledge
Transfer Format versus File Format
NARA’s transfer guidance and PDF/A-1 have a similar goal: to ensure that valuable electronic information in PDF is not lost. But they have different purposes:
• Transfer Format - NARA’s PDF Transfer Guidance
  • Specifies NARA transfer requirements
  • Applies to existing and future records in PDF
• File Format - The PDF/A ISO Standard (PDF/A-1)
  • Specifies a subset of the PDF file format
  • More format reliability, fewer “bells & whistles”
  • PDF should be maintained longer as PDF (e.g., within agencies)
Scope and Usage
NARA’s PDF Transfer Guidance
• Usage: Instructions on what is required to transfer existing permanent PDF records to NARA
• Scope
  • Applies to permanent records
  • PDF 1.0 - 1.4
  • Addresses quality criteria, laws and regulations, transfer documentation, NARA contact information
PDF/A-1 ISO Standard
• Usage: Programming specification to create and process the file format
• Scope
  • Applies to one aspect of long-term preservation (i.e., file format)
  • PDF 1.4
  • Addresses how to use the PDF 1.4 Reference to create and process a flavor of PDF that is more amenable to long-term preservation
  • Should be used as one piece of the archival puzzle
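The version scope listed above (PDF 1.0 - 1.4) can be screened from a file's header line, which begins with `%PDF-` followed by the version. This is a minimal sketch with hypothetical function names. It assumes the header appears within the first 1024 bytes and treats the header as authoritative, although a PDF 1.4 file may also declare its version in the document catalog.

```python
import re

# The PDF header looks like "%PDF-1.4"; capture the major and minor digits.
HEADER_RE = re.compile(rb"%PDF-(\d)\.(\d)")


def pdf_header_version(pdf_bytes: bytes):
    """Return (major, minor) from the PDF header, or None if no header is found."""
    match = HEADER_RE.search(pdf_bytes[:1024])
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def within_transfer_scope(pdf_bytes: bytes) -> bool:
    """True if the header version falls in the PDF 1.0 - 1.4 range."""
    version = pdf_header_version(pdf_bytes)
    return version is not None and (1, 0) <= version <= (1, 4)
```

A file that passes this check is only in scope by version; the quality, documentation, and other transfer requirements still apply.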
Requirements - PDF/A and NARA’s PDF Transfer Guidance
Embedded fonts
• PDF/A-1 and NARA’s PDF Transfer Guidance both require that fonts be embedded
• NARA guidance phases in requirements for workstation-resident fonts
Encryption
• PDF/A-1 and NARA’s PDF Transfer Guidance both prohibit encryption
• NARA guidance phases in requirement as long as we can open, view and print
Requirements - PDF/A and NARA’s PDF Transfer Guidance
Special Features
• PDF/A-1 restricts special features
  • Embedded files, external links, JavaScript
• PDF/A-1 promotes tagged PDF as a higher level of conformance
• NARA evaluates special features on a case-by-case basis at the time of scheduling
Metadata/Documentation
• PDF/A requires that embedded metadata be in Adobe XMP
• NARA requires transfer documentation (e.g., SF-258), and would evaluate embedded metadata at the time of scheduling
Requirements - PDF/A and NARA’s PDF Transfer Guidance
Quality Requirements
• PDF/A-1 as a file format does not address quality/creation requirements such as exact replication of source material
  • Informative Annex B identifies recommended creation guidelines
  • Agencies must implement these guidelines to comply with NARA’s PDF Transfer Guidance
• NARA’s PDF Transfer Guidance includes quality requirements regarding:
  • Scanning quality
  • Lossy compression
  • Substitution of characters with OCR’d text
Take Away
• For records in PDF, agencies need to understand that:
  • PDF/A-1 is one option for long-term preservation of electronic documents
  • PDF/A-1, by itself, does not guarantee exact replication of source material
  • Agencies must implement PDF/A-1 in conjunction with additional requirements to meet NARA standards for transferring permanent records to NARA (i.e., NARA’s PDF Transfer Guidance)
More Information is Available
• More information on NARA’s PDF Transfer Guidance on NARA’s Web Site
  • http://www.archives.gov/records-mgmt/initiatives/pdf-records.html
• More information on PDF/A on AIIM Web Site
  • http://www.aiim.org/standards.asp?ID=25013
• Contact Susan Sullivan at susan.sullivan@nara.gov