1 / 45

Digitization of Visual Resources

Digitization of Visual Resources. Jenn Riley Metadata Librarian IU Digital Library Program. Technical overview. Analog to digital conversion Resolution Bit depth Color representation Reflectivity and polarity Compression. Analog to digital conversion.

gottlieb
Download Presentation

Digitization of Visual Resources

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Digitization of Visual Resources Jenn Riley Metadata Librarian IU Digital Library Program

  2. Technical overview • Analog to digital conversion • Resolution • Bit depth • Color representation • Reflectivity and polarity • Compression L597: Humanities Computing

  3. Analog to digital conversion • Image is converted to a series of pixels laid out in a grid • Each pixel has a specific color, represented by a sequence of 1s and 0s • Pixel-based images are called “raster” images or “bitmaps” L597: Humanities Computing

  4. Resolution (1) • Often referred to as “dpi” or “ppi” • RATIO of number of pixels captured per inch of original photo size • 8x10 print scanned at 300ppi = 2400 x 3000 pixels • 35mm slide (24x36mm!) scanned at 300ppi ≈ 212 x 318 pixels L597: Humanities Computing

  5. Resolution (2) • “Spatial resolution” refers to pixel dimensions of image, e.g., 3000 x 2400 pixels • Flatbed and film scanners have a fixed focus, so they know how big the original is; digital cameras don’t L597: Humanities Computing

  6. Bit depth (1) • Refers to number of bits (binary digits, places for zeroes and ones) devoted to storing color information about each pixel • 1 bit (1) = 21 = 2 shades (“bitonal”) • 2 bit (01) = 22 = 4 shades • 4 bit (0010) = 24 = 16 shades • 8 bit (11010001) = 28 = 256 shades (“grayscale”) L597: Humanities Computing

  7. Bit depth (2) 1 bit (black & white) 2 bit (4 colors) 4 bit (16 colors) 8 bit (256 colors) L597: Humanities Computing

  8. Color representation • RGB • Scanners generally have sensors for Red, Green, and Blue • Each of these “channels” is stored separately in the digital file • 8 bits for each of 3 channels = 24 bit color • CMYK (Cyan, Magenta, Yellow and Black) is used for high-end “pre-press” printing purposes L597: Humanities Computing

  9. Reflectivity and polarity L597: Humanities Computing

  10. Compression • Makes files smaller for storage • Files must be decompressed for viewing • Lossless • Lossy • “visually lossless” L597: Humanities Computing

  11. Technical questions? • Analog to digital conversion • Resolution • Bit depth • Color representation • Reflectivity and polarity • Compression L597: Humanities Computing

  12. Setting specifications • Standards & best practices • Capture once, use many • Determine purpose • Resolution • Bit depth & color • File formats • Quality control L597: Humanities Computing

  13. Standards & best practices • NARA Technical Guidelines for Digitizing Archival Materials for Electronic Access • Moving Theory into Practice • Book • Online tutorial • California Digital Library Digital Image Format Standards • Western States Digital Imaging Best Practices L597: Humanities Computing

  14. Capture once, use many • Create master image when scanning • Capture all “important” information • Meets all foreseeable needs • For long-term storage and later use • Create derivatives for specific uses later • Web delivery • Printing • Publication L597: Humanities Computing

  15. Determine purpose • Nature of materials • Artifactual • For content only • Capture all important information • But what’s “important”? • Not always “what people can see” L597: Humanities Computing

  16. Resolution (1) • Higher is not always better • Scan at highest resolution necessary to achieve your stated purpose, no higher chart from Cornell’s online digital imaging tutorial: <http://www.library.cornell.edu/preservation/tutorial/conversion/conversion-03.html> L597: Humanities Computing

  17. Resolution (2) • Color photographic materials • 3000-6000 pixels on the long side • 24-bit RGB • B/W photographic materials • 3000-6000 pixels on the long side • 8-bit grayscale (unless sepia tone is “important”) L597: Humanities Computing

  18. Resolution comparison (1) L597: Humanities Computing

  19. Resolution comparison (2) 600 dpi 300 dpi L597: Humanities Computing

  20. Bit depth & color • Match current photo or match original scene • Final master images should be 8 bits per channel (8-bit grayscale, 24-bit RGB); some specialized projects using higher bit depths • Any color adjustments & other processing should be done in scanning software before final scan is done • Use almost the full tonal range; avoid “clipping” L597: Humanities Computing

  21. Master file formats • TIFF (uncompressed) • Virtually unanimously recommended by digital imaging best practices • “De facto” standard • JPEG2000 • ISO/IEC IS 15444-1 | ITU-T T.800 • Not patent-free • Up-and-coming but not quite there yet • Supports embedded metadata • Uses wavelet-based compression L597: Humanities Computing

  22. Why not JPEG? • Lossy-compressed every time they are saved low compression, high quality high compression, low quality L597: Humanities Computing

  23. Delivery file formats • Photographic materials: JPEG • Text, line drawings: GIF • PNG? L597: Humanities Computing

  24. Quality control • Essential part of every digitization project • Objective criteria • Can be automated • Can check all items • Subjective criteria • Require human checks • Must sample L597: Humanities Computing

  25. Specifications questions? • Standards & best practices • Capture once, use many • Determine purpose • Resolution • Bit depth & color • File formats • Quality control L597: Humanities Computing

  26. More information • NARA Technical Guidelines for Digitizing Archival Materials for Electronic Access • Moving Theory into Practice • Book • Online tutorial • IU DLP Use of Digital Imaging Standards & Best Practices L597: Humanities Computing

  27. Visual Resources Metadata in Libraries Jenn Riley Metadata Librarian IU Digital Library Program

  28. What is metadata? “Data about data” L597: Humanities Computing

  29. A better definition • Other characteristics • Structure • Control • Origin • Machine-generated • Human-generated • In practice, often grows to cover data and meta-metadata L597: Humanities Computing

  30. Levels of control • Data structure standards • Data content standards L597: Humanities Computing

  31. Creating metadata • HTML <meta> tags • Spreadsheets • Databases • XML • Digital library content management systems • ContentDM • Greenstone L597: Humanities Computing

  32. Types of metadata • Descriptive metadata • Administrative metadata • Technical metadata • Preservation metadata • Rights metadata • Structural metadata L597: Humanities Computing

  33. Purposes of descriptive metadata • Description • Discovery L597: Humanities Computing

  34. General descriptive metadata standards • Dublin Core[example] • Unqualified • Qualified • MARC/AACR2[example] • MARCXML • MODS[example] L597: Humanities Computing

  35. Visual resources metadata standards • CDWA[example] • VRA Core[example] • CCO L597: Humanities Computing

  36. Technical metadata • For recording technical aspects of digital objects • Of use for long-term maintenance of data • Standards for still images • NISO Z39.87: Data Dictionary – Technical Metadata for Digital Still Images • MIX L597: Humanities Computing

  37. Structural metadata • For creating a logical structure between digital objects • Multiple copies of same bibliographic item • Multiple pages within item • Grouping of pages into sections • Multiple sizes of each page • METS is the current primary schema L597: Humanities Computing

  38. METS • metsHdr • dmdSec • amdSec • techMD • rightsMD • sourceMD • digiprovMD • fileSec • structMap • structLink • behaviorSec L597: Humanities Computing

  39. Some other specialized metadata formats • TEI • EAD[example] • GILS • CSDGM[example] • GEM • CIDOC L597: Humanities Computing

  40. Vocabularies • TGM I • TGM II • TGN • GeoNet • AAT • LCSH • LCNAF • DCMI Type • MIME Types L597: Humanities Computing

  41. Other considerations • Standard formatting • Repeatability of elements • Describing original vs. digitized item • Relationships between records • Interoperability L597: Humanities Computing

  42. Crosswalks • For transforming between metadata formats • Usually refers to transforming between content standards rather than structure standards, but not always • Good practice to create and store most robust metadata format possible, then create other views for specific needs L597: Humanities Computing

  43. The bottom line • Many concepts from a century and a half of library cataloging inform good metadata practices • But some re-examination needed for new environment • Need for automation • Need for smart people contributing! L597: Humanities Computing

  44. More information • Individual schema documentation • Caplan: Metadata Fundamentals for all Librarians, 2003 • NISO Press: Understanding Metadata • IFLA Functional Requirements for Bibliographic Records L597: Humanities Computing

  45. Thank you! • jenlrile@indiana.edu • These presentation slides: <http://www.dlib.indiana.edu/~jenlrile/presentations/slis/04fall/l597walsh/L597.ppt> L597: Humanities Computing

More Related