
PASIG, May 24, 2013 Carl Fleischhauer cfle@loc.gov Steve Puglia spug@loc.gov Library of Congress

Federal Digitization: Moving to Common Guidelines. The U.S. Federal Agencies Digitization Guidelines Initiative (FADGI), http://www.digitizationguidelines.gov/. PASIG, May 24, 2013. Carl Fleischhauer cfle@loc.gov, Steve Puglia spug@loc.gov, Library of Congress, Washington, DC.


Presentation Transcript


  1. Federal Digitization: Moving to Common Guidelines. The U.S. Federal Agencies Digitization Guidelines Initiative (FADGI) http://www.digitizationguidelines.gov/ PASIG, May 24, 2013 Carl Fleischhauer cfle@loc.gov Steve Puglia spug@loc.gov Library of Congress Washington, DC

  2. http://www.digitizationguidelines.gov/

  3. 18 Participating Agencies http://www.digitizationguidelines.gov/participants/ 2 Often participating, not “official”: NASA, NOAA, National Museum of Health and Medicine (U.S. Army), U.S. Supreme Court

  4. http://www.digitizationguidelines.gov/stillimages/

  5. http://www.digitizationguidelines.gov/audio-visual/

  6. Guidelines • Conceptual framework documents • Content Categories & Digitization Objectives (still image reproduction; September 3, 2009) • Digitization Activities – Project Planning (November 4, 2009) • Capture device performance • Digital Imaging Framework (high level about scanner performance metrics; April 2, 2009) • Audio Analog-to-Digital Converter Performance (August 20, 2012) • Audio Interstitial Errors (about unwanted dropouts or sample distortion; work in progress, 2012-13) • Broad practices guidelines • Technical Guidelines for the Still Image Digitization of Cultural Heritage Materials (Many segments from 2004 NARA document; FADGI update, August 24, 2010)

  7. Guidelines • Metadata including embedded data and file headers • TIFF Image Header Metadata (February 10, 2009) • Minimal Descriptive Embedded Metadata in Digital Still Images (Smithsonian document embraced by group; March 23, 2012) • Embedding Metadata in Broadcast WAVE Files, Version 2 (April 23, 2012) • Associated tool on SourceForge: BWF MetaEdit • NARA reVTMD video technical metadata (February 2012; FADGI supporting role) • Associated tool on GitHub: AVI MetaEdit • Format analysis and guidelines • File Format Comparisons (comparing still image and video formats; under development in 2013) • MXF Preservation Video Formatting Application Specification (under development during 2013 in cooperation with AMWA trade group; versions posted in 2010 and 2012)

  8. Still Image Illustrative Example Odds and ends about still images

  9. Still image specifications – this is what we all “used to do” • color/monochromatic • pixel density (good old “dpi”) • bit depth • . . . usually output-referred We want to move toward more, um, “scientific” specifications

  10. From this document: http://www.digitizationguidelines.gov/guidelines/DIFfinal.pdf

  11. Resolution rethink: new terms, scanner performance • SAMPLING RATE • SPATIAL RESOLUTION • Spatial Frequency Response (SFR) • SAMPLING EFFICIENCY Thanks to Barry Wheeler for his very helpful Signal blogs: http://blogs.loc.gov/digitalpreservation/2012/12/what-resolution-should-i-use-part-1/ http://blogs.loc.gov/digitalpreservation/2013/01/what-resolution-should-i-use-part-2/ http://blogs.loc.gov/digitalpreservation/2013/03/what-resolution-should-i-use-part-3/

  12. Resolution rethink: new terms, scanner performance • SAMPLING RATE. Usually, the scanner’s ppi number is the sampling rate. • Sensors can only attempt to measure (sample) the brightness at each point. • Some light may scatter and miss the sensor, the scanner’s motor step may not be sufficiently precise, or the collected value may be inaccurate. Inside every scanner or camera, between the sensor and the screen, is a small, highly specialized computer called a digital signal processor. This processor must work very hard to link a dot on the page to a dot on the screen. • RESOLUTION. ISO standards (e.g., ISO 12233) define resolution in terms of Spatial Frequency Response (SFR) -- the actual result on the screen. • SAMPLING EFFICIENCY. . . . the difference between the pixel count and actually resolving each point, expressed as a percentage.
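
A rough worked example of the sampling efficiency idea on this slide. The slide does not give a formula, so the sketch assumes efficiency is the ratio of the resolution a device actually delivers (as derived from an SFR measurement) to its nominal sampling rate. The function name and the 320/400 ppi figures are hypothetical.

```python
def sampling_efficiency(measured_ppi: float, nominal_ppi: float) -> float:
    """Return sampling efficiency as a percentage.

    Assumption (not stated on the slide): efficiency is the resolution
    actually delivered divided by the nominal sampling rate.
    """
    if nominal_ppi <= 0:
        raise ValueError("nominal_ppi must be positive")
    return 100.0 * measured_ppi / nominal_ppi


# Hypothetical values: a scanner set to 400 ppi that resolves about 320 ppi.
efficiency = sampling_efficiency(measured_ppi=320, nominal_ppi=400)
print(f"{efficiency:.0f}%")  # prints "80%"
```

Read this way, two scanners with the same ppi setting can differ widely in what they actually resolve, which is why the ppi number alone is a weak specification.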

  13. From the revised guideline http://www.digitizationguidelines.gov/guidelines/FADGI_Still_Image-Tech_Guidelines_2010-08-24.pdf

  14. Tools to Support Image Performance Measurement • Digital Image Conformance Evaluation (DICE) System • Device Target – Imaging Device Performance • Object Target – Actual Image Quality • Software for Evaluation/Validation • Based in LabVIEW • Data export for use in SQC/SPC
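
One way exported measurements might feed statistical process control, sketched under stated assumptions. This is not DICE code and does not reflect its export format. The readings are invented, and the limits are a simplified mean plus or minus three standard deviations rather than a full control-chart method.

```python
import statistics


def control_limits(baseline, sigmas=3.0):
    """Return (lower, center, upper) limits from baseline measurements."""
    center = statistics.mean(baseline)
    spread = statistics.stdev(baseline)  # sample standard deviation
    return center - sigmas * spread, center, center + sigmas * spread


def out_of_control(readings, baseline):
    """Return the readings that fall outside the baseline's limits."""
    lower, _, upper = control_limits(baseline)
    return [value for value in readings if not lower <= value <= upper]


# Hypothetical sampling-efficiency readings (percent) from one scanner.
baseline = [82.0, 81.5, 83.0, 82.4, 81.8]
new_readings = [82.2, 79.0, 83.1]
print(out_of_control(new_readings, baseline))  # flags the 79.0 reading
```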

  15. Device and Object Targets Object target as positioned for use

  16. DICE Software – Main Panel

  17. DICE – QC Summary Panel Slide from old version of software

  18. DICE – OECF detail page

  19. DICE – SFR detail page

  20. Audio-Visual Illustrative Example MXF format specification for reformatted video

  21. Library of Congress Packard Campus, Culpeper National Archives, College Park Smithsonian Institution Archives

  22. SAMMA from Front Porch Digital

  23. Implementations • SAMMA at LC: Lossless compressed • Each frame is a JPEG 2000 image • Lossless (reversible) transform • Emergent variants • NARA and other archives prefer uncompressed video • Other devices come on the market, e.g., from OpenCube (Belgium), Amberfin (UK), Cube-Tec (Germany), and others in process (e.g., Archimedia)
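
To illustrate what “reversible” means here, the sketch below uses the reversible color transform from JPEG 2000 Part 1, one of the integer steps available in lossless coding. It is an assumption that this is the step the slide has in mind, since lossless JPEG 2000 also relies on a reversible wavelet transform that is not shown. The function names are illustrative only.

```python
def rct_forward(r: int, g: int, b: int):
    """Integer RGB -> (Y, Cb, Cr) using the reversible color transform."""
    y = (r + 2 * g + b) // 4  # floor division keeps everything in integers
    cb = b - g
    cr = r - g
    return y, cb, cr


def rct_inverse(y: int, cb: int, cr: int):
    """Exact inverse of rct_forward: recovers the original integers."""
    g = y - (cb + cr) // 4
    r = cr + g
    b = cb + g
    return r, g, b


pixel = (200, 17, 255)  # arbitrary 8-bit sample values
round_trip = rct_inverse(*rct_forward(*pixel))
print(round_trip == pixel)  # prints "True"
```

Because every step stays in integer arithmetic, decoding returns the original sample values bit for bit, which is the property a preservation master depends on.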

  24. Standards-based format elements from SMPTE and ISO/IEC • MXF (SMPTE ST 377 and many more) • Standard definition uncompressed covered in ST 377 and also SMPTE ST 384 • JPEG 2000 encoding (ISO/IEC 15444-1) • JPEG 2000 mapped to MXF (SMPTE ST 422) • Other standards also play a role, most from SMPTE, some from EBU

  25. Loose Ends • MXF, JPEG 2000, and even “uncompressed” video are complex standards • Entities that “conform” to the standards can be formatted in various ways • We have some elements that we want to include in order to produce an “authentic copy” • MXF “carriage” can be tricky to sort out

  26. MXF Application Specification • An MXF AS is what some would call a profile • Pin down preferred options, reduce the variables • Support greater interoperability • Increase the comfort level for users • Increase vendor competition • More adoption means better sustainability

  27. Timecode • Source recordings may have multiple timecodes (VITC, LTC, etc.), some on purpose, some by accident, all may provide forensic help for future researchers. • Specify preferred practice for retaining and tagging multiple timecodes in the file
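
A minimal sketch of keeping several timecodes side by side with a tag for each. It does not show how MXF or AS-07 actually stores timecode. The record layout, field names, and sample values are hypothetical, and the conversion assumes non-drop-frame timecode at an integer frame rate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TimecodeTrack:
    kind: str       # tag for the source, e.g. "VITC" or "LTC"
    start: str      # "HH:MM:SS:FF", non-drop-frame (assumption)
    note: str = ""  # free-text remark for future researchers


def frames_from_timecode(timecode: str, fps: int) -> int:
    """Convert non-drop-frame HH:MM:SS:FF to a frame count."""
    hours, minutes, seconds, frames = (int(p) for p in timecode.split(":"))
    if frames >= fps:
        raise ValueError("frame field must be below the frame rate")
    return ((hours * 60 + minutes) * 60 + seconds) * fps + frames


# Hypothetical source tape carrying two unrelated timecodes.
tracks = [
    TimecodeTrack("LTC", "01:00:00:00", "program timecode"),
    TimecodeTrack("VITC", "00:58:30:12", "left over from an earlier recording"),
]
for track in tracks:
    print(track.kind, frames_from_timecode(track.start, fps=30))
```

Retaining both entries, rather than choosing one, preserves the accidental timecode that may later help date or trace the recording.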

  28. Audio tracks • Source may have multiple tracks • MXF audio track specifications cover “listing” or “allocation” (tagging) and other matters of terminology, need to pin these down

  29. Metadata • Basic tech metadata is not an issue • Needed: specified options for embedding additional technical metadata: • process (like METS digiprov), • about the source item • about quality review outcomes • preservation (like PREMIS), • And some descriptive metadata • Schools of thought: some prefer minimal data (“just an identifier”), others would dump everything they have, specification should permit range of actions – “archivist’s choice”

  30. Closed captioning, subtitles, ancillary data • US broadcast standards embed CC as binary data • “In the image raster” on line 21 • For digital TV, CC also in packets in MPEG stream • Awkward for future extraction, depends upon availability of decoding tools • Desiderata • Put CC/subtitles in the file for easier access and extraction • XML rather than binary • Alas, MXF offers “too many” options for this, we seek to pin down the best ones • By extension, this also applies to other ancillary data.
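
A small sketch of the “XML rather than binary” desideratum: once captions are decoded, they can be written as text that any XML parser can read back. The element and attribute names are invented for illustration and are not taken from MXF, AS-07, or any caption standard. The cue text is also made up.

```python
import xml.etree.ElementTree as ET


def captions_to_xml(cues):
    """Serialize (start, end, text) cues to an XML string.

    The "captions"/"cue" vocabulary is hypothetical, chosen only to
    show the idea of human-readable, easily extracted caption data.
    """
    root = ET.Element("captions")
    for start, end, text in cues:
        cue = ET.SubElement(root, "cue", {"start": start, "end": end})
        cue.text = text
    return ET.tostring(root, encoding="unicode")


cues = [
    ("00:00:01.000", "00:00:03.500", "Good evening."),
    ("00:00:04.000", "00:00:06.250", "Here is tonight's news."),
]
xml_text = captions_to_xml(cues)

# Extraction needs no caption-specific decoder, only an XML parser.
parsed = ET.fromstring(xml_text)
print([cue.text for cue in parsed.findall("cue")])
```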

  31. An MXF Application Specification is . . . • A formal industry statement • Not a “standard” • Accompanied by a reference implementation and validation tools

  32. MXF Application Specifications come from . . . • Advanced Media Workflow Association (AMWA) • Broadcast-industry group • AMWA Application Specifications include: • AS-10 for production – version for end-to-end digital production workflow (forthcoming) • AS-11 for contribution – the high end version contributed by a producer to a television network (published) • AS-03 for delivery – the reduced-data version “sent to the tower for broadcast” (published) • AS-07 for archiving and preservation will be a sibling to those • http://www.amwa.tv/projects/AMWA_AS_overview 04-2013 web.pdf

  33. Role of AMWA • Key roles played by Turner Broadcasting veterans and engineering staff • Members include AVID, BBC, Front Porch Digital (SAMMA), NARA, PBS, SONY, Discovery Communications, Fox, NBC Universal, and more • http://www.amwa.tv/ • Break into technical committees to push draft specifications

  34. FADGI’s AMWA status • March 2012 • AMWA business committee approval to move ahead • Designate as AS-07 • September 2012 • Technical committee approval • November 2012 • Team meetings began • Early 2013 • Churning along • End of 2013 • Dream of a first draft or better

  35. http://www.digitizationguidelines.gov/ Carl Fleischhauer cfle@loc.gov
