
Session 703 Book to Computer: Scanning Basics

Gaeir Dietrich, Director, High Tech Center Training Unit of the California Community Colleges. Overview: scanning and scanners, understanding scanning terminology, scanning workflow.


Presentation Transcript


  1. Session 703 Book to Computer: Scanning Basics • Gaeir Dietrich, Director, High Tech Center Training Unit of the California Community Colleges

  2. Overview • Scanning and scanners • Understanding scanning terminology • Scanning workflow CTEBVI Conference

  3. Scanning • Scanning takes a picture. • The better the picture, the less editing later on. • Similar technology to the copy machine, but outputs to a digital file, not paper.

  4. Stand Alone vs. Multi-use • Stand-alone scanners… • Provide more control over scans • Result in better scans • Multi-use machines are copiers first, scanners second. • Final products require more editing during production • But a multi-use machine is still better than a flatbed scanner

  5. Scanners • When buying a scanner, think about these issues: • Duplex (two sides at once) • Automatic feed (pages per minute) • Color (for color dropout) • We like Canon, followed by Fujitsu. • Canon DR-C125, DR-3010C

  6. No Money? • A $400 20-page-per-minute scanner is a far better deal than four $100 flatbed scanners • If you can only afford a flatbed, look for one with an automatic document feeder (ADF)

  7. Scanning Outputs • Color scanning usually creates a JPEG. • JPEGs are single pages only! • Black and white scanning creates a TIFF. • TIFFs can be multiple pages.

  8. What is a TIFF? • TIFF files are graphics, i.e., pictures of text. • Tagged Image File Format (TIFF) • Robust, stable standard file type • No version issues • Any program that can open multipage graphics can open a TIFF • Good archival graphical format

  9. But I scan to… • If you get anything other than a TIFF or JPEG, you have used software to convert. • If you scan to PDF, you have used software to transform your file. • Scanning hardware does not create PDFs. • Conversion runs the risk of losing data and increasing editing time.

  10. Converting TIFFs • TIFFs can be converted to other formats, including other graphic formats like PDF. • To get to the text, you must run a TIFF file through an optical character recognition (OCR) program.

  11. Scanning Is the First Step • Settings for your scan will be determined by the end format you want to create • For text, you will scan, then run OCR • Optical Character Recognition • See session 901 on Sunday

  12. Scanning Terms • Duplex vs. simplex • Skew/deskew • Margin control • DPI (resolution) • Mode • Brightness • Contrast • Threshold • RGB color • Color dropout

  13. Duplex vs. Simplex • Double-sided vs. single-sided • Duplex = two sides at a time (one pass) • Simplex = one side at a time • Flatbed scanners are simplex scanners • Look for true duplex (one pass) • Not two passes with the program interleaving the scans
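
To make "two passes with the program interleaving the scans" concrete, here is a minimal Python sketch of what that interleaving involves. It assumes the back sides arrive in reverse order because the stack was flipped between passes; that ordering, and the function name `interleave_two_pass`, are illustrative assumptions, not something the slides specify.

```python
def interleave_two_pass(fronts, backs_reversed):
    """Merge a front-side pass and a back-side pass into reading order.

    fronts: front-side images in scan order (pages 1, 3, 5, ...).
    backs_reversed: back-side images from the second pass, assumed to
        arrive last sheet first (..., 6, 4, 2) because the stack was flipped.
    """
    backs = list(reversed(backs_reversed))
    if len(fronts) != len(backs):
        raise ValueError("front and back passes must have the same length")
    merged = []
    for front, back in zip(fronts, backs):
        merged.extend([front, back])
    return merged


print(interleave_two_pass(["p1", "p3", "p5"], ["p6", "p4", "p2"]))
# ['p1', 'p2', 'p3', 'p4', 'p5', 'p6']
```

A single misfed sheet in either pass shifts every later pair, which is one reason a true one-pass duplex scanner is the safer choice.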

  14. Skew • Skew is slant • i.e., the page is not straight • Snug the feed guides! • Use deskew settings. • The computer can correct for some skew, but with too much the text cannot be recognized
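
As a rough illustration of how skew can be measured, the sketch below computes the slant of a line of text from two points on its baseline. The function name and the sample coordinates are hypothetical; real deskew software estimates the angle from the whole image, which is not shown here.

```python
import math


def skew_angle_degrees(x1, y1, x2, y2):
    """Angle of a text baseline relative to horizontal, in degrees.

    (x1, y1) and (x2, y2) are pixel coordinates of two points on the
    same line of text, left to right.
    """
    return math.degrees(math.atan2(y2 - y1, x2 - x1))


# A baseline that drifts 35 pixels over a 2000-pixel-wide line.
angle = skew_angle_degrees(0, 0, 2000, 35)
print(f"skew is about {angle:.2f} degrees")
```

Rotating the image by the opposite angle would straighten the text.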

  15. Margin Control • Scanner determines page size • Avoids large black areas around the edge of the page • On better machines, also removes need for measuring • Better scanners will also have margin adjustment • Note that usually *all* edges are adjusted the same amount.

  16. DPI (Dots per Inch) • “Dots” in scanning are really pixels • Little squares like on graph paper • Imagine drawing by filling in squares on graph paper…the more squares, the smoother the lines • Higher DPI = better resolution • However, more is not always better!

  17. DPI Comparison

  18. Resolution (DPI) • Standard for text is 300 DPI • Small text may require 400 DPI • Thin paper may require 150-200 DPI • Really large text may require 200 DPI • Infty Reader for math requires 600 DPI
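
The DPI values above translate directly into pixel counts: pixels along each edge equal inches times DPI. The short Python sketch below works this out for a US Letter page (8.5 x 11 inches); the page size and the function name are assumptions chosen for illustration.

```python
def pixel_dimensions(width_in, height_in, dpi):
    """Return (width, height) in pixels for a page scanned at the given DPI."""
    return round(width_in * dpi), round(height_in * dpi)


for dpi in (150, 300, 400, 600):
    w, h = pixel_dimensions(8.5, 11, dpi)
    print(f"{dpi} DPI: {w} x {h} pixels ({w * h:,} total)")
```

Doubling the DPI quadruples the number of pixels, which is part of why more is not always better.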

  19. Mode • Black & white • Looks like line art • Only choices for pixels are black or white • Grayscale • Looks like black & white photo • Also called “halftone” • Color • Comes in different “bits” • The more bits, the more color information

  20. Black and White • Image scanned in B/W; file size 474 KB

  21. Black and White ED • Image scanned in B&W ED (Canon DR 5080C); file size 474 KB

  22. Grayscale • Image scanned in grayscale; file size 3,731 KB

  23. Choosing the Mode • Black and white • Best for text; smallest file size • Black and white ED (error diffusion) • Better for graphics; slightly larger files • Usually best to avoid grayscale • Large files that do not OCR as well • Color • Sometimes necessary; large files
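
The file-size differences between modes follow from how many bits each pixel needs. The sketch below estimates uncompressed sizes for one page, assuming 1 bit per pixel for black and white, 8 for grayscale, and 24 for color; these bit depths are common values chosen for illustration, and real TIFF and JPEG files are usually compressed, so actual files will be smaller.

```python
# Assumed bits per pixel for each mode (illustrative, not from the slides).
BITS_PER_PIXEL = {"black_and_white": 1, "grayscale": 8, "color": 24}


def uncompressed_bytes(width_px, height_px, mode):
    """Raw image size in bytes before any compression is applied."""
    bits = width_px * height_px * BITS_PER_PIXEL[mode]
    return (bits + 7) // 8  # round up to whole bytes


# A US Letter page at 300 DPI is 2550 x 3300 pixels.
for mode in BITS_PER_PIXEL:
    size = uncompressed_bytes(2550, 3300, mode)
    print(f"{mode}: {size:,} bytes")
```

The 8-to-1 ratio between grayscale and black and white is close to the ratio between the 3,731 KB and 474 KB sample files on the earlier slides.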

  24. Which Mode to Choose? • It depends on how important the graphics are! • Is it for a student who has some usable vision and needs enlargement? • Grayscale or color may be needed • Is it to create braille? • Black and white will usually give the best OCR results.

  25. Brightness • Overall darkness or lightness of page • Balance • Not too dark, not too light • Scale 1-255 • Lower numbers decrease brightness • Down into darkness • Higher numbers increase brightness • Up to the light

  26. Brightness Example • It’s just like turning on lights over an entire room.

  27. Adjusting Brightness • Default is 128 • Too dark • Letter shapes run together • Too light • Letter shapes are thin or broken • Newsprint-type papers often need increased brightness

  28. Brightness Guidelines • Check the appearance of the scan • If characters are thick and touching (running together) > increase brightness • If characters are thin and broken (lines thin/missing areas) > reduce brightness
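
One simple way to picture a brightness setting is as a uniform shift applied to every pixel. The Python sketch below assumes gray values from 0 (black) to 255 (white) and treats the distance of the setting from the default of 128 as the shift; how a given scanner driver actually maps its slider to pixel values is not stated in the slides, so this is only a model.

```python
def adjust_brightness(pixels, setting, default=128):
    """Shift every gray value by (setting - default), clamped to 0..255."""
    offset = setting - default
    return [min(255, max(0, p + offset)) for p in pixels]


row = [0, 100, 250]
print(adjust_brightness(row, 158))  # brighter: [30, 130, 255]
print(adjust_brightness(row, 98))   # darker:   [0, 70, 220]
```

Every pixel moves in the same direction, like turning the lights up or down over the whole room.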

  29. Sample Scans • Too bright • Just right • Too dark

  30. Contrast • Difference between light and dark on page • Scale is 1-13 • Higher number increases contrast • Darks darker, lights lighter • Lower number decreases contrast • Darks get lighter, lights get darker • Becomes more uniform

  31. Contrast Example

  32. Adjusting Contrast • Default is 7 • Low contrast • Entire page is either “muddy” looking • Or washed-out looking • High contrast • Extremes of light and dark • May lose midrange detail • Newsprint-type paper often needs increased brightness
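
Contrast can be modeled as stretching or squeezing gray values around a midpoint. In the sketch below, the 1-13 scale is mapped to a multiplier of `setting / 7`, so the default of 7 leaves pixels unchanged; that mapping and the midpoint of 128 are assumptions made for illustration, not the behavior of any particular scanner driver.

```python
def adjust_contrast(pixels, setting, default=7, midpoint=128):
    """Scale each gray value's distance from the midpoint, clamped to 0..255."""
    factor = setting / default  # hypothetical mapping of the 1-13 scale
    return [
        min(255, max(0, round(midpoint + factor * (p - midpoint))))
        for p in pixels
    ]


row = [100, 200]
print(adjust_contrast(row, 13))  # higher contrast: [76, 255]
print(adjust_contrast(row, 1))   # lower contrast:  [124, 138]
```

At the high setting the light pixel clips to pure white, which is the loss of midrange detail the slide warns about; at the low setting both values crowd toward the middle and the page becomes more uniform.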

  33. Threshold • In black and white mode • Sometimes you see only a brightness setting (contrast settings disappear) • Sets where gray will be seen • Increased threshold adds more white • More grays seen as white • Decreased threshold adds more black • More grays seen as black
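
Thresholding turns every gray pixel into either black or white by comparing it with a cutoff. The sketch below uses a plain gray-value cutoff, with values at or above it becoming white; the parameter name `cutoff` is an assumption, and the direction in which a scanner driver's threshold slider moves this cutoff varies and is not modeled here.

```python
def to_black_and_white(gray_pixels, cutoff=128):
    """Gray values at or above cutoff become white (255); the rest become black (0)."""
    return [255 if p >= cutoff else 0 for p in gray_pixels]


row = [40, 100, 140, 200]
print(to_black_and_white(row, cutoff=128))  # [0, 0, 255, 255]
print(to_black_and_white(row, cutoff=90))   # more grays seen as white
print(to_black_and_white(row, cutoff=150))  # more grays seen as black
```

Moving the cutoff only changes the fate of the in-between grays; solid black text and white paper are unaffected.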

  34. Despeckle • “Erases” speckles • Helps with small stray black dots • Works really well when having to scan a photocopy or newsprint • Beware of going too far and erasing periods and umlauts
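
A very simple form of despeckling removes any black pixel that has no black neighbors. The Python sketch below does exactly that on a small grid where 1 is black and 0 is white; it is an illustrative stand-in, since the slides do not describe the algorithm a scanner actually uses.

```python
def despeckle(image):
    """Clear black pixels (1) that have no black pixel among their 8 neighbors."""
    height = len(image)
    width = len(image[0]) if height else 0
    cleaned = [row[:] for row in image]
    for y in range(height):
        for x in range(width):
            if image[y][x] != 1:
                continue
            neighbors = 0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    if dy == 0 and dx == 0:
                        continue
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < height and 0 <= nx < width:
                        neighbors += image[ny][nx]
            if neighbors == 0:
                cleaned[y][x] = 0
    return cleaned


page = [
    [0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0],  # a lone speck
    [0, 0, 0, 1, 1],  # part of a letter stroke
    [0, 0, 0, 1, 1],
]
for row in despeckle(page):
    print(row)
```

The lone speck is erased and the 2 x 2 stroke survives. A period that scans as a single pixel would be erased too, which is the risk of going too far.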

  35. Gamma…it’s complicated… • Adjusts the middle tones • Usually more useful for scanning graphics than text • Can be altered to bring out more detail in shadows in photos • Usually only on high-end hardware • Try everything else first!
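
Gamma correction is commonly written as a power curve that leaves pure black and pure white alone and bends only the tones in between. The sketch below uses the usual textbook form, output = 255 * (input / 255) ^ (1 / gamma); whether a given scanner uses this exact curve is an assumption.

```python
def apply_gamma(pixels, gamma):
    """Apply a power curve to gray values in 0..255; gamma > 1 lifts midtones."""
    return [round(255 * (p / 255) ** (1 / gamma)) for p in pixels]


row = [0, 64, 128, 255]
print(apply_gamma(row, 2.0))
```

With gamma 2.0 the dark value 64 rises to 128 while 0 and 255 stay put, which is how gamma brings out shadow detail without washing out the whole page.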

  36. Settings Summary • Brightness = overall tone • Contrast = difference in highs and lows • Gamma = adjustment in midtones • Threshold = on or off switch for grays • Grays seen as white or black • May appear as just the “brightness” bar

  37. RGB Color • RGB = Red, Green, Blue • RGB color system is used by TVs, computers, and scanners!

  38. “Additive” Color System

  39. Color Scanners • Many color scanners for documents allow “color dropout” • The scanner “ignores” a particular color • “Erases” the color • Red, blue, or green

  40. Color Dropout • Drop out colored markings • Orange highlighter (drop out red) • Blue pen (drop out blue and despeckle) • Yellowish pages • Drop out red (improves contrast) • Tinted backgrounds • Watch out for dropping out text • Be aware of colored areas with white text on them
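
One way to see why dropout works is to look at a single RGB channel: anything with a lot of red looks as bright as white paper when only the red channel is read. The Python sketch below models dropout that way; the sample colors and function name are illustrative assumptions, and a real scanner does this in hardware or in its driver.

```python
CHANNEL_INDEX = {"red": 0, "green": 1, "blue": 2}


def drop_out(rgb_pixels, color):
    """Keep only the named channel, so marks in that color read as light gray values."""
    i = CHANNEL_INDEX[color]
    return [pixel[i] for pixel in rgb_pixels]


black_text = (0, 0, 0)
white_paper = (255, 255, 255)
orange_highlighter = (255, 165, 0)
blue_pen = (0, 0, 255)
sample = [black_text, white_paper, orange_highlighter, blue_pen]

print(drop_out(sample, "red"))   # [0, 255, 255, 0]
print(drop_out(sample, "blue"))  # [0, 255, 0, 255]

# White text on a red box: both read as 255, so the text disappears.
print(drop_out([(255, 255, 255), (255, 0, 0)], "red"))  # [255, 255]
```

With red dropped out, the highlighter matches the paper and the black text is untouched; the last line shows how text can be lost when it sits on the dropped color.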

  41. Scanned Page with Orange Highlighter

  42. Same Page with Red Drop-out

  43. Scanning Workflow • Remove spine from book • Separate any pages still glued together • Choose a few representative pages for a test scan

  44. Procedure Continued • Scan representative pages to TIFF • Check image on screen for possible adjustments • Run OCR on sample pages • Error rate should be no higher than one per page • Higher errors mean you need to adjust the scanner settings
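
The one-error-per-page rule is easy to check once the OCR errors on the test pages have been counted by hand. The small sketch below averages the counts and compares them with that limit; the function name and the sample counts are hypothetical.

```python
def needs_adjustment(errors_per_page, max_rate=1.0):
    """Return True when the average OCR errors per test page exceed max_rate."""
    if not errors_per_page:
        raise ValueError("need at least one test page")
    rate = sum(errors_per_page) / len(errors_per_page)
    return rate > max_rate


print(needs_adjustment([0, 1, 2, 0]))  # False: 0.75 errors per page
print(needs_adjustment([3, 2, 1]))     # True: 2.0 errors per page
```

A True result means going back to the scanner settings before scanning the whole book.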

  45. Ready to Scan • With the settings determined, scan the entire book • Now that you have a good picture, your OCR and editing should go quickly!

  46. Advanced Ideas • Be aware of individual pages that may need additional adjustment • A few pages may need to be scanned separately • A few pages may need color • Reassemble in your OCR program • While checking test pages, also create OCR templates as appropriate

  47. Suggestion on Organizing Files • Structure • Label chapters (or chapter folders): 01 Chapter, 02 Chapter • Label front matter to place it first: 00 Front Matter • Label back matter just with its name: Back Matter • This file structure will create a logical order.
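
The naming scheme works because an ordinary alphabetical sort places digits before letters, and zero-padded numbers keep chapter 10 after chapter 9. The Python sketch below generates the names from the slide for a given chapter count and confirms they already sort into book order; the function name is an assumption.

```python
def book_folder_names(chapter_count):
    """Build folder names that sort into book order alphabetically."""
    names = ["00 Front Matter"]
    names += [f"{n:02d} Chapter" for n in range(1, chapter_count + 1)]
    names.append("Back Matter")
    return names


names = book_folder_names(12)
print(names[:3], "...", names[-2:])
print(sorted(names) == names)  # True
```

Without the leading zero, "10 Chapter" would sort ahead of "2 Chapter".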

  48. Example

  49. Timesaver: Create a Template Folder • The template folder can be copied and pasted, and all the inside folders are copied as well! • Putting the zero in front makes the folder easy to find.
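
The copy-and-paste step can also be scripted. The sketch below builds a template folder and copies it, subfolders included, with the standard library's `shutil.copytree`. The template name "0 Book Template" and the book title are hypothetical, and the demonstration runs inside a temporary directory so it leaves nothing behind.

```python
import shutil
import tempfile
from pathlib import Path


def make_template(root, subfolders):
    """Create a template folder (leading zero so it sorts first) with empty subfolders."""
    template = Path(root) / "0 Book Template"  # hypothetical name
    for name in subfolders:
        (template / name).mkdir(parents=True, exist_ok=True)
    return template


def new_book_from_template(template, book_title):
    """Copy the template and everything inside it to a new folder beside it."""
    destination = Path(template).parent / book_title
    shutil.copytree(template, destination)
    return destination


with tempfile.TemporaryDirectory() as root:
    template = make_template(
        root, ["00 Front Matter", "01 Chapter", "02 Chapter", "Back Matter"]
    )
    book = new_book_from_template(template, "Biology Text")
    print(sorted(p.name for p in book.iterdir()))
```

`copytree` raises an error if the destination already exists, which guards against overwriting a book that is already in progress.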

  50. Miscellaneous Tips • Chopping books • Guillotine • X-Acto knife to remove spine, and check with FedEx Office (Kinko’s) about cutting the pages • Spines and flatbeds • If you have to scan a book with a thick spine on a flatbed, get a large dark piece of cloth and cover the scanner; this prevents the darkened area along the spine
