Digitization of Visual Resources Jenn Riley Metadata Librarian IU Digital Library Program
Technical overview • Analog to digital conversion • Resolution • Bit depth • Color representation • Reflectivity and polarity • Compression L597: Humanities Computing
Analog to digital conversion • Image is converted to a series of pixels laid out in a grid • Each pixel has a specific color, represented by a sequence of 1s and 0s • Pixel-based images are called “raster” images or “bitmaps”
Resolution (1) • Often referred to as “dpi” or “ppi” • The ratio of pixels captured to inches of the original’s physical size • 8x10 inch print scanned at 300ppi = 2400 x 3000 pixels • 35mm slide (24x36mm!) scanned at 300ppi ≈ 283 x 425 pixels
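The arithmetic on this slide can be checked with a short sketch. The function names `pixel_dimensions` and `mm_to_inches` are hypothetical, and rounding to the nearest whole pixel is an assumption; actual scanning software may round differently.

```python
MM_PER_INCH = 25.4


def mm_to_inches(mm):
    """Convert millimetres to inches."""
    return mm / MM_PER_INCH


def pixel_dimensions(width_in, height_in, ppi):
    """Return (width_px, height_px) for an original measured in inches."""
    return round(width_in * ppi), round(height_in * ppi)


# An 8x10 inch print and a 35mm frame (24x36 mm), both scanned at 300 ppi.
print(pixel_dimensions(8, 10, 300))
print(pixel_dimensions(mm_to_inches(24), mm_to_inches(36), 300))
```

The same ppi setting yields very different pixel dimensions depending on the physical size of the original, which is why a small slide needs a far higher ppi than a print.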
Resolution (2) • “Spatial resolution” refers to pixel dimensions of image, e.g., 3000 x 2400 pixels • Flatbed and film scanners have a fixed focus, so they know how big the original is; digital cameras don’t
Bit depth (1) • Refers to number of bits (binary digits, places for zeroes and ones) devoted to storing color information about each pixel • 1 bit (1) = 2^1 = 2 shades (“bitonal”) • 2 bit (01) = 2^2 = 4 shades • 4 bit (0010) = 2^4 = 16 shades • 8 bit (11010001) = 2^8 = 256 shades (“grayscale”)
Bit depth (2) • [Image comparison: 1 bit (black & white), 2 bit (4 colors), 4 bit (16 colors), 8 bit (256 colors)]
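The relationship between bit depth and number of shades can be sketched as follows. The helper names `shades` and `quantize` are hypothetical, and the simple equal-width binning used here is an assumption chosen for illustration, not a description of how any particular scanner reduces bit depth.

```python
def shades(bit_depth):
    """Number of distinct values representable with the given bit depth."""
    return 2 ** bit_depth


def quantize(value, bit_depth):
    """Reduce an 8-bit sample (0-255) to `bit_depth` bits, then rescale
    the result to 0-255 so it can be compared with the original."""
    levels = shades(bit_depth)
    index = value * levels // 256        # which of the `levels` bins the sample falls in
    return index * 255 // (levels - 1)   # spread the bins back across 0..255


for depth in (1, 2, 4, 8):
    print(depth, "bit:", shades(depth), "shades; sample 100 becomes", quantize(100, depth))
```

At 8 bits the sample comes back unchanged; at lower depths it snaps to the nearest available shade, which is what produces the banding visible in low bit depth images.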
Color representation • RGB • Scanners generally have sensors for Red, Green, and Blue • Each of these “channels” is stored separately in the digital file • 8 bits for each of 3 channels = 24 bit color • CMYK (Cyan, Magenta, Yellow and Black) is used for high-end “pre-press” printing purposes
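Channels and bit depth together determine how much raw data an image holds. A minimal sketch, assuming an uncompressed image with no file header or row padding (real file formats add some overhead); the function name `uncompressed_size_bytes` is hypothetical.

```python
def uncompressed_size_bytes(width_px, height_px, channels=3, bits_per_channel=8):
    """Raw pixel data size in bytes, ignoring headers and padding."""
    bits = width_px * height_px * channels * bits_per_channel
    return (bits + 7) // 8  # round up to a whole byte


# An 8x10 inch print scanned at 300 ppi is 2400 x 3000 pixels.
rgb = uncompressed_size_bytes(2400, 3000)               # 24-bit RGB
gray = uncompressed_size_bytes(2400, 3000, channels=1)  # 8-bit grayscale
print(rgb / 1_000_000, "MB RGB;", gray / 1_000_000, "MB grayscale")
```

Three 8-bit channels triple the storage needed relative to a single 8-bit grayscale channel at the same pixel dimensions.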
Reflectivity and polarity
Compression • Makes files smaller for storage • Files must be decompressed for viewing • Lossless • Lossy • “visually lossless”
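The difference between lossless and lossy compression can be illustrated with Python's standard `zlib` module. This is a sketch on synthetic data: the "lossy" step here simply discards the low 4 bits of each sample, which is an assumption made for illustration and is not how JPEG or any other image codec actually works.

```python
import zlib

# Synthetic 8-bit grayscale data: a repeating gradient, 25,600 samples.
pixels = bytes(range(256)) * 100

# Lossless: decompressing gives back exactly the original bytes.
packed = zlib.compress(pixels)
lossless_ok = zlib.decompress(packed) == pixels

# Crude lossy step: throw away the 4 low bits of every sample, then compress.
coarse = bytes(p & 0xF0 for p in pixels)
packed_lossy = zlib.compress(coarse)
lossy_ok = zlib.decompress(packed_lossy) == pixels

print(len(pixels), "bytes raw,", len(packed), "bytes compressed")
print("lossless round trip identical:", lossless_ok)
print("lossy round trip identical:", lossy_ok)
```

In the lossy case the discarded detail cannot be recovered after decompression, which is the property that matters when choosing a format for master images.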
Technical questions? • Analog to digital conversion • Resolution • Bit depth • Color representation • Reflectivity and polarity • Compression
Setting specifications • Standards & best practices • Capture once, use many • Determine purpose • Resolution • Bit depth & color • File formats • Quality control
Standards & best practices • NARA Technical Guidelines for Digitizing Archival Materials for Electronic Access • Moving Theory into Practice • Book • Online tutorial • California Digital Library Digital Image Format Standards • Western States Digital Imaging Best Practices
Capture once, use many • Create master image when scanning • Capture all “important” information • Meets all foreseeable needs • For long-term storage and later use • Create derivatives for specific uses later • Web delivery • Printing • Publication
Determine purpose • Nature of materials • Artifactual • For content only • Capture all important information • But what’s “important”? • Not always “what people can see”
Resolution (1) • Higher is not always better • Scan at highest resolution necessary to achieve your stated purpose, no higher • [Chart from Cornell’s online digital imaging tutorial: <http://www.library.cornell.edu/preservation/tutorial/conversion/conversion-03.html>]
Resolution (2) • Color photographic materials • 3000-6000 pixels on the long side • 24-bit RGB • B/W photographic materials • 3000-6000 pixels on the long side • 8-bit grayscale (unless sepia tone is “important”)
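A pixel target on the long side translates into a different ppi setting for each original size. A minimal sketch; the function name `ppi_for_long_side` is hypothetical, and rounding up to the next whole ppi is an assumption (scanners typically offer only certain settings, so in practice the next available step would be chosen).

```python
import math

MM_PER_INCH = 25.4


def ppi_for_long_side(long_side_inches, target_pixels):
    """Smallest whole ppi that yields at least `target_pixels` on the long side."""
    return math.ceil(target_pixels / long_side_inches)


# 8x10 inch print: long side is 10 inches.
print(ppi_for_long_side(10, 3000), "to", ppi_for_long_side(10, 6000), "ppi")

# 35mm frame: long side is 36 mm.
frame_long_side = 36 / MM_PER_INCH
print(ppi_for_long_side(frame_long_side, 3000), "to",
      ppi_for_long_side(frame_long_side, 6000), "ppi")
```

Specifying the target in pixels rather than ppi keeps the specification independent of the size of the original.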
Resolution comparison (1)
Resolution comparison (2) • [Images: 600 dpi, 300 dpi]
Bit depth & color • Match current photo or match original scene • Final master images should be 8 bits per channel (8-bit grayscale, 24-bit RGB); some specialized projects use higher bit depths • Any color adjustments & other processing should be done in scanning software before final scan is done • Use almost the full tonal range; avoid “clipping”
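Clipping can be checked for by counting how many samples sit at the extremes of the tonal range. A minimal sketch for 8-bit samples; the function names are hypothetical, and what fraction counts as "too much" clipping is a project decision that this sketch does not make.

```python
def clipped_fraction(samples, low=0, high=255):
    """Return (shadow_fraction, highlight_fraction) of samples at or beyond
    the ends of the tonal range."""
    if not samples:
        return 0.0, 0.0
    n = len(samples)
    shadows = sum(1 for s in samples if s <= low) / n
    highlights = sum(1 for s in samples if s >= high) / n
    return shadows, highlights


def tonal_range(samples):
    """Return (darkest, lightest) sample value actually used."""
    return min(samples), max(samples)


# Hypothetical samples from one channel of a scan.
scan = [0, 0, 10, 128, 200, 255, 255, 255]
print(clipped_fraction(scan))
print(tonal_range(scan))
```

A scan that uses almost the full range without piling up samples at 0 or 255 leaves room for later adjustment in derivatives.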
Master file formats • TIFF (uncompressed) • Virtually unanimously recommended by digital imaging best practices • “De facto” standard • JPEG2000 • ISO/IEC IS 15444-1 | ITU-T T.800 • Not patent-free • Up-and-coming but not quite there yet • Supports embedded metadata • Uses wavelet-based compression
Why not JPEG? • JPEG files are lossy-compressed every time they are saved • [Images: low compression, high quality; high compression, low quality]
Delivery file formats • Photographic materials: JPEG • Text, line drawings: GIF • PNG?
Quality control • Essential part of every digitization project • Objective criteria • Can be automated • Can check all items • Subjective criteria • Require human checks • Must sample
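The split between automated objective checks on every item and sampled human review can be sketched as follows. The record fields (`width`, `height`, `bit_depth`), the thresholds, and the 10% sampling rate are all assumptions chosen for illustration; a real project would take them from its own specifications.

```python
import random


def objective_check(record, min_long_side=3000, allowed_bit_depths=(8, 24)):
    """Run automated checks on one image record; return a list of problems."""
    problems = []
    if max(record["width"], record["height"]) < min_long_side:
        problems.append("long side too short")
    if record["bit_depth"] not in allowed_bit_depths:
        problems.append("unexpected bit depth")
    return problems


def sample_for_review(records, fraction=0.1, seed=0):
    """Pick a random subset of records for subjective (human) review."""
    if not records:
        return []
    k = max(1, round(len(records) * fraction))
    return random.Random(seed).sample(records, k)


# Hypothetical batch of 50 scans, all meeting the objective criteria.
batch = [{"id": i, "width": 4000, "height": 3000, "bit_depth": 24} for i in range(50)]
failures = {r["id"]: objective_check(r) for r in batch if objective_check(r)}
print("objective failures:", failures)
print("ids sampled for human review:", [r["id"] for r in sample_for_review(batch)])
```

Objective checks are cheap enough to run on every file, while the sample size for human review is a trade-off between staff time and confidence in the batch.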
Specifications questions? • Standards & best practices • Capture once, use many • Determine purpose • Resolution • Bit depth & color • File formats • Quality control
More information • NARA Technical Guidelines for Digitizing Archival Materials for Electronic Access • Moving Theory into Practice • Book • Online tutorial • IU DLP Use of Digital Imaging Standards & Best Practices
Visual Resources Metadata in Libraries Jenn Riley Metadata Librarian IU Digital Library Program
What is metadata? “Data about data”
A better definition • Other characteristics • Structure • Control • Origin • Machine-generated • Human-generated • In practice, often grows to cover data and meta-metadata
Levels of control • Data structure standards • Data content standards
Creating metadata • HTML <meta> tags • Spreadsheets • Databases • XML • Digital library content management systems • ContentDM • Greenstone
Types of metadata • Descriptive metadata • Administrative metadata • Technical metadata • Preservation metadata • Rights metadata • Structural metadata
Purposes of descriptive metadata • Description • Discovery
General descriptive metadata standards • Dublin Core • Unqualified • Qualified • MARC/AACR2 • MARCXML • MODS
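As an illustration of what an unqualified Dublin Core description can look like in XML, here is a minimal sketch using Python's standard library. The Dublin Core element namespace is the published one, but the `record` wrapper element, the function name, and all field values are hypothetical placeholders.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def dublin_core_record(fields):
    """Build an XML element holding Dublin Core elements.

    `fields` is a list of (element_name, value) pairs so that elements
    can be repeated, as Dublin Core allows.
    """
    root = ET.Element("record")  # wrapper name is a hypothetical choice
    for name, value in fields:
        child = ET.SubElement(root, f"{{{DC_NS}}}{name}")
        child.text = value
    return root


# Placeholder values for a digitized photograph.
record = dublin_core_record([
    ("title", "Untitled photograph"),
    ("type", "StillImage"),
    ("format", "image/tiff"),
])
xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```

Passing fields as a list of pairs rather than a dictionary keeps repeated elements, such as several subjects, easy to express.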
Visual resources metadata standards • CDWA • VRA Core • CCO
Technical metadata • For recording technical aspects of digital objects • Of use for long-term maintenance of data • Standards for still images • NISO Z39.87: Data Dictionary – Technical Metadata for Digital Still Images • MIX
Structural metadata • For creating a logical structure between digital objects • Multiple copies of same bibliographic item • Multiple pages within item • Grouping of pages into sections • Multiple sizes of each page • METS is the current primary schema
METS • metsHdr • dmdSec • amdSec • techMD • rightsMD • sourceMD • digiprovMD • fileSec • structMap • structLink • behaviorSec
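The nesting of the METS sections listed on this slide can be sketched as an empty skeleton. This only illustrates which sections sit where; it is not a schema-valid METS document (required child elements and attributes are omitted), and the `ID` values and function names are hypothetical.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
ET.register_namespace("mets", METS_NS)


def q(tag):
    """Qualify a tag name with the METS namespace."""
    return f"{{{METS_NS}}}{tag}"


def mets_skeleton():
    """Build an empty METS outline showing the top-level sections and the
    four kinds of administrative metadata nested inside amdSec."""
    mets = ET.Element(q("mets"))
    ET.SubElement(mets, q("metsHdr"))
    ET.SubElement(mets, q("dmdSec"), ID="DMD1")
    amd = ET.SubElement(mets, q("amdSec"))
    for name in ("techMD", "rightsMD", "sourceMD", "digiprovMD"):
        ET.SubElement(amd, q(name), ID=name.upper() + "1")
    ET.SubElement(mets, q("fileSec"))
    ET.SubElement(mets, q("structMap"))
    ET.SubElement(mets, q("structLink"))
    ET.SubElement(mets, q("behaviorSec"))
    return mets


skeleton = mets_skeleton()
print(ET.tostring(skeleton, encoding="unicode"))
```

In a real document the descriptive and technical sections would wrap or point to records in other schemas, such as MODS or MIX.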
Some other specialized metadata formats • TEI • EAD • GILS • CSDGM • GEM • CIDOC
Vocabularies • TGM I • TGM II • TGN • GeoNet • AAT • LCSH • LCNAF • DCMI Type • MIME Types
Other considerations • Standard formatting • Repeatability of elements • Describing original vs. digitized item • Relationships between records • Interoperability
Crosswalks • For transforming between metadata formats • Usually refers to transforming between structure standards rather than content standards, but not always • Good practice to create and store the most robust metadata format possible, then create other views for specific needs
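At its simplest, a crosswalk is a field-to-field mapping applied to each record. The sketch below maps a hypothetical local record format to Dublin Core element names; the local field names and the mapping itself are invented for illustration and are not a published crosswalk.

```python
# Hypothetical mapping: local field name -> Dublin Core element name.
CROSSWALK = {
    "caption": "title",
    "photographer": "creator",
    "date_taken": "date",
    "subject_terms": "subject",
}


def crosswalk(record, mapping):
    """Transform one record; return (mapped_pairs, unmapped_field_names).

    List values become repeated target elements. Fields with no entry in
    the mapping are reported rather than silently dropped.
    """
    mapped, unmapped = [], []
    for field, value in record.items():
        target = mapping.get(field)
        if target is None:
            unmapped.append(field)
            continue
        values = value if isinstance(value, list) else [value]
        mapped.extend((target, v) for v in values)
    return mapped, unmapped


# Placeholder source record.
source = {
    "caption": "Main Street",
    "photographer": "Unknown",
    "subject_terms": ["Streets", "Storefronts"],
    "negative_number": "N-12",
}
mapped, unmapped = crosswalk(source, CROSSWALK)
print(mapped)
print("not carried over:", unmapped)
```

The unmapped list shows why storing the richest format and deriving simpler views works better than the reverse: information lost in a crosswalk cannot be recovered from the simpler record.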
The bottom line • Many concepts from a century and a half of library cataloging inform good metadata practices • But some re-examination needed for new environment • Need for automation • Need for smart people contributing!
More information • Individual schema documentation • Caplan: Metadata Fundamentals for all Librarians, 2003 • NISO Press: Understanding Metadata • IFLA Functional Requirements for Bibliographic Records
Thank you! • jenlrile@indiana.edu • These presentation slides: <http://www.dlib.indiana.edu/~jenlrile/presentations/slis/04fall/l597walsh/L597.ppt>