Getting to Disk-based Lossless Digital Video Preservation – An Introduction. Paul Theerman, Walter Cybulski, Glenn Pearson National Library of Medicine NIH/HHS. Historical Audiovisuals at the National Library of Medicine. Paul Theerman, Ph.D. Head, Images and Archives
Getting to Disk-based Lossless Digital Video Preservation – An Introduction Paul Theerman, Walter Cybulski, Glenn Pearson National Library of Medicine NIH/HHS
Historical Audiovisuals at the National Library of Medicine Paul Theerman, Ph.D. Head, Images and Archives History of Medicine Division, NLM
Historical Audiovisuals at NLM • Origins as the National Medical Audiovisual Collection • A clearinghouse for these materials • Variously held here and at CDC, Atlanta • Only relatively recently transferred to the History of Medicine Division
Historical Audiovisuals at NLM • Current definition of the collection • All audiovisuals before 1970 • Films and videos of historical interest dating after 1970—that is, of interest for historical value, not informational value
Historical Audiovisuals at NLM • The collection ranges from the first decade of the 20th century through the 1990s • Content: • Early films on “how to go to the doctor” and other public service and public information films • Films on the U.S. Public Health Service
Historical Audiovisuals at NLM • Content • Dental films due to an ADA donation • Training films for surgical procedures • Military: battlefield surgical films • Large recent donations from NIMH and FDA • “home movies” • Research footage • Films promoting usage of films in medicine
Historical Audiovisuals at NLM • Size: the largest such collection in the U.S. • Total number of titles: ~9650 • Number cataloged: 4300 • Number inventoried: 3550 • Number to be inventoried: ~1800 • Number preserved: 2250+
Historical Audiovisuals at NLM • The ability to collect is dependent on the ability to preserve and to catalog, and, in the short run, to stabilize in order to preserve and to catalog in the future • Controlled environments • On-site cool vault for new accessions, masters • Off-site cool and cold vaults for new accessions, originals
Historical Audiovisuals at NLM • The decision to preserve and to catalog is not made lightly, because of the investment of resources • Based on condition and content assessments
Historical Audiovisuals at NLM • Condition assessment • Age of medium • Obsolescence of format • Possible or actual deterioration of medium • Nitrate • Acetate • Generation
Historical Audiovisuals at NLM • Content assessment • Ownership and restrictions • Uniqueness • Age, especially pre-1950 • Then a sliding scale, based on collection development guidelines
Historical Audiovisuals at NLM • When both condition and content indicate, then: • Preservation copying, to three copies (in some cases two) • Cataloging, either to full or core records
Historical Audiovisuals at NLM • Currently we are on the cusp of moving to digital formats, but our originals are chiefly analog, and our duplication and viewing copies are as well • Betacam SP for duplication copies • VHS for use copies • This also matches patron needs for Interlibrary Loan and production
Historical Audiovisuals at NLM • The Preservation and Collection Management Section enters the picture: • Determining formats • Technical specifications • Managing vendor copying • Managing on-site and off-site cool and cold vaults • Managing shelving for use copies
Historical Audiovisuals at NLM • New Ventures with Center for Information Technology (CIT) at NIH • Videocasting service of “history in the making” • Possible collaboration with NLM • Interlocking systems for preservation and cataloging • New venture for NLM in a large cache of digital materials
Historical Audiovisuals at NLM • New Library Research at NLM • NLM’s Lister Hill Center is looking at means of digital preservation • The origin of this conference: we are excited to see what it will bring
Analog Motion Picture and Tape Preservation at NLM – Duplication & Offsite Storage Walter Cybulski, Preservation Librarian Preservation & Collection Management Section, Public Services Division, NLM
Examples of Film and Tape Media in the NLM Collections 8mm 16mm 35mm 2” Quadruplex 1” Type C ¾” U-Matic ½” Beta
Nitrate added spice to the idea of deterioration – unfortunately, nothing but hot pepper (There are no nitrate film materials at NLM)
“250 TEASPOONFULS OF VINEGAR FOR A 1,000 FOOT CAN OF 35mm FILM”
Main Objectives of Preservation • Identify content that merits preserving • Mitigate known risks • Extend useful life of content
Extend useful life: copy onto new media
For libraries and archives, obtaining new copies may not be possible, and copying content on deteriorated media to the same media (e.g. 35mm to 35mm film transfer) can be prohibitively expensive
At this point, the most widely used AV preservation media are Betacam SP and Digital Betacam. But the clock is ticking even as we copy content onto these formats…
with each technological advance, the storage picture changes …
WE ARE TRANSITIONING FROM FILMS AND TAPES TO DATA, BUT THE QUESTION REMAINS: HOW TO EXTEND THE USEFUL LIFE OF THE CONTENT?
Getting to Disk-based Lossless Digital Video Preservation – Which Way Forward? Communications Engineering Branch, Lister Hill National Center for Biomedical Communications NLM Glenn Pearson, Ph.D. Senior Software Developer
Generational Loss Once Digital • Migration as preservation strategy • To cope with obsolescence of digital formats, gear • If using lossy image compression algorithms • No degradation when making exact copy: Master → Master • Degradation when migrating (or editing): Master → uncompress → recompress → Master • Examples: M-JPEG, DV, MPEG-1, -2, most -4 • Mathematically lossless algorithms • Avoid this problem • Don’t compress as well (2x – 4x) as “virtually lossless” (5x – 9x) or obviously lossy (web streaming)
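The generational-loss argument above can be illustrated with a minimal sketch. It is a toy model, not any real video codec: the "lossy codec" here simply snaps 8-bit samples to a multiple of a step size, and each migration is assumed to use a different step (the values in `MIGRATION_STEPS` are hypothetical). The lossless path uses the standard-library `zlib` as a stand-in for a mathematically lossless codec.

```python
import zlib

def lossy_encode_decode(samples, step):
    """Toy lossy codec: snap each 8-bit sample to the nearest multiple of `step`."""
    return [min(255, (v + step // 2) // step * step) for v in samples]

def lossless_encode_decode(samples):
    """Lossless round trip: compress and decompress the raw bytes with zlib."""
    packed = zlib.compress(bytes(samples))
    return list(zlib.decompress(packed))

def max_error(a, b):
    """Largest absolute difference between two equal-length sample lists."""
    return max(abs(x - y) for x, y in zip(a, b))

# Synthetic "master": 1024 samples covering all 8-bit values (illustrative only).
original = [(i * 37) % 256 for i in range(1024)]

# Hypothetical quantizer steps, one per format migration.
MIGRATION_STEPS = [4, 6, 5, 7]

lossy = original
lossless = original
lossy_errors = []
for step in MIGRATION_STEPS:
    lossy = lossy_encode_decode(lossy, step)      # uncompress + recompress, lossy
    lossless = lossless_encode_decode(lossless)   # uncompress + recompress, lossless
    lossy_errors.append(max_error(original, lossy))

print("lossy max error per generation:", lossy_errors)
print("lossless identical to original:", lossless == original)
```

In this toy setup the lossy chain drifts further from the master as migrations accumulate, while the lossless chain stays bit-identical no matter how many times it is migrated.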
Lossless Video Storage • Uncompressed video • Can be stored with general binary file compressors (RLE, LZW [zip]), typically 1.6:1 - 2:1 compression • Lossless video codecs • Standardized, open (but may be covered by patents) • HuffYUV – original, uses Huffman “entropy” encoding • Apple QuickTime “None” codec [documented, not standard] • JPEG 2000 Lossless (within, say, Motion JPEG 2000) • MPEG-4/AVC Lossless • Proprietary • Matrox DigiSuite: Lossless = entropy-only portion of M-JPEG • New - MatrixView’s “Adaptive Binary Optimization”, from patented “Repetition Coded Compression” (boolean grids + Huffman)
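As a rough illustration of why entropy-coding codecs can beat general-purpose compressors on video, the sketch below combines a simple left-neighbor predictor with Huffman code lengths. It is only in the spirit of prediction-plus-Huffman codecs such as HuffYUV and does not reproduce any real bitstream; the function names and the synthetic scan line are assumptions made for the example.

```python
import heapq
from collections import Counter

def left_predict(samples):
    """Replace each 8-bit sample by its difference from the previous one (mod 256)."""
    prev = 0
    residuals = []
    for v in samples:
        residuals.append((v - prev) % 256)
        prev = v
    return residuals

def undo_left_predict(residuals):
    """Invert left_predict exactly, recovering the original samples."""
    prev = 0
    out = []
    for r in residuals:
        prev = (prev + r) % 256
        out.append(prev)
    return out

def huffman_code_lengths(symbols):
    """Return {symbol: code length in bits} for a Huffman code over `symbols`."""
    counts = Counter(symbols)
    if len(counts) == 1:
        return {next(iter(counts)): 1}
    # Heap entries: (weight, unique tiebreak, symbols in this subtree).
    heap = [(n, i, [s]) for i, (s, n) in enumerate(counts.items())]
    heapq.heapify(heap)
    lengths = dict.fromkeys(counts, 0)
    tiebreak = len(heap)
    while len(heap) > 1:
        n1, _, s1 = heapq.heappop(heap)
        n2, _, s2 = heapq.heappop(heap)
        for s in s1 + s2:
            lengths[s] += 1  # every merge adds one bit to each symbol beneath it
        heapq.heappush(heap, (n1 + n2, tiebreak, s1 + s2))
        tiebreak += 1
    return lengths

def coded_bits(symbols):
    """Total bits needed to Huffman-code the sequence (code table not included)."""
    lengths = huffman_code_lengths(symbols)
    return sum(lengths[s] for s in symbols)

# Synthetic smooth scan line: a slow ramp, as in a gentle gradient (illustrative only).
samples = [(i // 4) % 256 for i in range(1024)]
residuals = left_predict(samples)

print("raw bits:            ", 8 * len(samples))
print("Huffman on samples:  ", coded_bits(samples))
print("Huffman on residuals:", coded_bits(residuals))
print("round trip exact:    ", undo_left_predict(residuals) == samples)
```

The smooth ramp is a best case chosen to make the effect visible; real footage with grain or noise compresses far less, which is consistent with the modest ratios quoted on the slides.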
Economics of Digital Storage • Chart axis: $ per gigabyte • Tape data is for computer tape, but digital video tape uses the same technology, which drives media price • Sources: E. Grochowski & R. Halem, IBM Sys J, 42(2), 2003 (Disk, Flash); R. Harada, Comp Tech Rev, June 2004 (Tape)
The Twilight of Tape • Hierarchical storage yesterday: Hard Disks → Tapes • Hierarchical storage tomorrow: Flash Disks → Hard Disks* • *Powered on demand
Economics of Subsampling and Lossless Compression • Gold Standard for digital video: 4:4:4 uncompressed • Not so affordable today for archives • In YUV colorspace: Y is luma (B&W intensity); U, V are the blue, red color differences, respectively • 4:4:4 = full sample per pixel for Y, U, and V • 4:2:2 = sample for Y at full pixel resolution, for U, V at half horizontal resolution • 4:2:2 lossless • will be affordable 2 years before 4:4:4 uncompressed • will stay ¼ the cost • When is 4:2:2 good enough for preservation?
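The "¼ the cost" figure can be sanity-checked with a little arithmetic. The sketch below assumes storage cost is simply proportional to the number of stored bits, and takes the 2x to 4x lossless compression range quoted earlier in the talk; `relative_cost` is a hypothetical helper written for this example.

```python
from fractions import Fraction

# Average samples stored per pixel: Y + U + V.
# 4:4:4 keeps all three at full resolution; 4:2:2 halves U and V horizontally.
SAMPLES_PER_PIXEL = {"4:4:4": Fraction(3), "4:2:2": Fraction(2)}

def relative_cost(scheme, compression_ratio=1):
    """Storage relative to uncompressed 4:4:4, assuming cost scales with bits stored."""
    subsampling = SAMPLES_PER_PIXEL[scheme] / SAMPLES_PER_PIXEL["4:4:4"]
    return subsampling / Fraction(compression_ratio)

print("4:2:2 uncompressed:     ", relative_cost("4:2:2"))
print("4:2:2 lossless at 2:1:  ", relative_cost("4:2:2", 2))
print("4:2:2 lossless at 4:1:  ", relative_cost("4:2:2", 4))
print("4:2:2 lossless at 8/3:1:", relative_cost("4:2:2", Fraction(8, 3)))
```

Under these assumptions, 4:2:2 alone brings storage to two thirds of 4:4:4, and a lossless ratio of roughly 2.7:1, inside the 2x to 4x range, lands at one quarter.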
Film Master → Digital Master • Traditional good advice: Film → Film • Can Film → Digital be • as good as/better than Film → Film • as affordable? • Quality of source • 8mm, 16mm, 35mm, 65/70mm • B/W vs color • camera original, intermediate print, distribution print • Versus quality of target • HD video has 1920x1080 (“e-Cinema”) • Variety matching film best: progressive-scan 24 fps (1080p24) • But video has only 8-10 bits linear/component, less than film’s range • Good enough for archiving some 16mm B&W distribution prints? • HD 16:9 aspect matches some sources, not others
Film Master → Digital Master: Hollywood Style • Better than HD but $$ • 12 bit linear/component (36 bits/pixel) • Or 10-bit log/component • No subsampling • 2K @ 24 fps = most practical res. & rate • 2K = 2048 x 1080 • That’s outer bounds for various aspect ratios
3 Steps, 3 Types of File Formats • Sources (Production) • Digital Intermediate • Package for Theatrical Release
Sources • Computer Graphics • New cinema digital cameras • Viper, Dalsa Origin 4K, Arri D-20, Kinetta • Film Scanners • Kodak Genesis, Northlight, Arriscan, Imagina • “Datacines” (data telecines) • Thomson Spirit, Cintel DSX, Millennium • Raw, Unwrapped Frame-per-file Formats • Flexible resolution, aspect ratio • But sound, most metadata in separate files • Awkward: per-shot info • Examples • Kodak Cineon scanner .CIN (10-bit log rgb) • SMPTE std DPX (derived from Cineon) • Others: TIFF, SGI, EXR, JP2 • “Digital Negative” from 1-CCD camera with Bayer-pattern color filters atop pixels Magazine has 12 40 GB iPod Drives
Digital Intermediate Process • Creates Digital Masters • May include “Digital Source Master” from which multiple masters come: DVD master, TV master, DCDM • Typical Steps • Color grading, compositing, editing, finishing • Projects moved along in vendor formats or AAF • End products archived in vendor formats or MXF • Such unencrypted masters closely held by studios, but archivists could make their own
Theatrical Distribution • DCI Distribution Master (DCDM) • MXF wrapper + JPEG2000 frames • But lossy due to real-time bandwidth constraints (250 Mb/s peak) • Something Similar for Archivists? • a lossless variety of this • or MJ2 instead of MXF
Roadblocks in Getting to a Disk-Based Lossless Archive Master • Rapid digital-technology change • High current costs • Top quality needs massive storage, high-speed pipelines • An uncompressed color movie (2K @ 24 fps, 12-bit) • Would consume ~2 Gigabits per second bandwidth if realtime • Needs 0.8 TB storage per hour of length • Plus $$$ for color grading/restoration services & software • Analog tape → SD digital is more affordable now • A proliferation of standards • File Formats • Essence representation/codecs/color spaces • Wrappers • Metadata & Rights Management • Can we help find a way forward?
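The bandwidth and storage figures for uncompressed 2K can be reproduced from the numbers given earlier in the talk (2048 x 1080, 12 bits per component, three components, 24 fps). The sketch below is back-of-the-envelope arithmetic only; it ignores audio, metadata, and file-format overhead, and `uncompressed_rate` is a hypothetical helper written for this example.

```python
def uncompressed_rate(width, height, bits_per_component, components, fps):
    """Return (bits per second, bytes per hour) for uncompressed video."""
    bits_per_frame = width * height * bits_per_component * components
    bits_per_second = bits_per_frame * fps
    bytes_per_hour = bits_per_second * 3600 / 8
    return bits_per_second, bytes_per_hour

# 2K frame, 12-bit linear RGB, no subsampling, 24 frames per second.
bps, bph = uncompressed_rate(2048, 1080, 12, 3, 24)

print(f"{bps / 1e9:.2f} Gb/s")       # prints 1.91 Gb/s
print(f"{bph / 1e12:.2f} TB/hour")   # prints 0.86 TB/hour
print(f"{bph / 2**40:.2f} TiB/hour") # prints 0.78 TiB/hour
```

The result matches the slide's "~2 Gigabits per second"; the "0.8 TB per hour" figure falls between the decimal (0.86 TB) and binary (0.78 TiB) readings of the same byte count.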