OCR at INIS. Database Production & Imaging Group. Yves Reynaud, Y.Reynaud-Pulido@iaea.org. INIS Training Seminar 14-16 November 2011, Vienna, Austria.

Presentation Transcript


  1. OCR at INIS. Database Production & Imaging Group. Yves Reynaud, Y.Reynaud-Pulido@iaea.org. INIS Training Seminar 14-16 November 2011, Vienna, Austria

  2. Some OCR features: we can find the needle in the haystack
     • OCR enables basic search within an unstructured document.
     • OCR brings your digitized collection to life.
     • OCR adds extra value to your image.

  3. OCR is computer software that:
     • Translates images of handwritten or typewritten text into machine-editable text.
     • Translates pictures of characters into a standard encoding scheme representing them (e.g. ASCII or Unicode).
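What "a standard encoding scheme" means in practice can be sketched in a few lines of Python. This is only an illustration: the sample string is an assumption, put together from the title word "OCR" and the Chinese characters shown on slide 14. It is not output from any OCR engine.

```python
# After recognition, each character is stored as a number (a code point),
# not as a picture made of pixels.
text = "OCR 滤器"  # illustrative sample: Latin letters plus two Chinese characters

code_points = [ord(ch) for ch in text]

# ASCII covers only code points 0-127; the Latin part fits, the Chinese part does not.
fits_in_ascii = [cp < 128 for cp in code_points]

# Unicode text is commonly stored as UTF-8 bytes.
utf8_bytes = text.encode("utf-8")

print(code_points[:4])  # the code points of "O", "C", "R" and the space
print(fits_in_ascii)
print(len(utf8_bytes), "bytes in UTF-8")
```

Once text is stored this way it can be searched, copied and indexed, which is not possible with a raster image of the same page.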

  4. Scanned image (paper or micrographic) • Vector image (created from a native application), shown here as a raster image for the sake of comparison

  5. “Do not see the trees (letters); try to see the forest (sentences).” F0R 488UR1N6 7H3 L0N63V17Y 0F 1NF0RM4710N, P3RH4P8 7H3 M087 1MP0R74N7 R0L3 1N 7H3 0P3R4710N 0F 4 D16174L 4RCH1V3 18 M4N461N6 7H3 1D3N717Y, 1N736R17Y 4ND QU4L17Y 0F 7H3 4RCH1V38 1783LF 48 4 7RU873D 80URC3 0F 7H3 CUL7UR4L R3C0RD.

  6. Verdana: FOR ASSURING THE LONGEVITY OF INFORMATION, PERHAPS THE MOST IMPORTANT ROLE IN THE OPERATION OF A DIGITAL ARCHIVE IS MANAGING THE IDENTITY, INTEGRITY AND QUALITY OF THE ARCHIVES ITSELF AS A TRUSTED SOURCE OF THE CULTURAL RECORD.
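The step from the digit-laden text on slide 5 to the readable text on slide 6 can be sketched as a lookalike substitution. The table below is inferred by comparing the two slides. It is not taken from any OCR product, and the names `LOOKALIKES` and `decode` are hypothetical.

```python
# Digit-to-letter lookalikes inferred from comparing slide 5 with slide 6.
LOOKALIKES = str.maketrans(
    {"0": "O", "1": "I", "3": "E", "4": "A", "6": "G", "7": "T", "8": "S"}
)

def decode(text: str) -> str:
    """Replace every lookalike digit with its letter, with no regard for context."""
    return text.translate(LOOKALIKES)

print(decode("F0R 488UR1N6 7H3 L0N63V17Y 0F 1NF0RM4710N"))
# prints: FOR ASSURING THE LONGEVITY OF INFORMATION

# The same blind rule damages genuine numbers, because it has no notion of context:
print(decode("14-16 November 2011"))
# prints: IA-IG November 2OII
```

The second call shows why a fixed table is not enough. A human reader knows from context that a date contains digits. A program has to be told, for example through a dictionary or a field-specific rule.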

  7. Brush Script MT (Windows font): FOR ASSURING THE LONGEVITY OF INFORMATION, PERHAPS THE MOST IMPORTANT ROLE IN THE OPERATION OF A DIGITAL ARCHIVE IS MANAGING THE IDENTITY, INTEGRITY AND QUALITY OF THE ARCHIVES ITSELF AS A TRUSTED SOURCE OF THE CULTURAL RECORD.

  8. PCs ≠ Humans
     • OCR compares patterns and selects the closest match; it can be constrained to a specific context, but that requires customization.
     • People adapt to circumstances and can read past misspellings if the context is clear.
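"Compares patterns and selects the closest match" can be illustrated with a toy template matcher. This is a minimal sketch under strong assumptions: the 3×5 bitmaps, the three-letter alphabet and the names `TEMPLATES`, `distance` and `classify` are all invented for the example. Real OCR engines use far richer features.

```python
# Toy 3x5 bitmaps ("1" = ink, "0" = background); purely illustrative.
TEMPLATES = {
    "I": ("010", "010", "010", "010", "010"),
    "L": ("100", "100", "100", "100", "111"),
    "T": ("111", "010", "010", "010", "010"),
}

def distance(a, b):
    """Count the pixels that differ between two bitmaps of equal size."""
    return sum(pa != pb for row_a, row_b in zip(a, b) for pa, pb in zip(row_a, row_b))

def classify(glyph, templates=TEMPLATES):
    """Return the label of the template with the fewest differing pixels."""
    return min(templates, key=lambda label: distance(glyph, templates[label]))

noisy_t = ("111", "010", "010", "011", "010")  # a "T" with one stray pixel

print(classify(noisy_t))  # prints: T

# Restricting the candidate set "forces a context": with "T" excluded,
# the matcher settles for the next-closest pattern.
only_i_and_l = {label: TEMPLATES[label] for label in "IL"}
print(classify(noisy_t, only_i_and_l))  # prints: I
```

The matcher always returns some answer, even when the right one is not among the candidates. It has no equivalent of the human ability to notice that a word looks wrong.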

  9. True or false? Usually, an image is adequately sampled if each letter is at least two pixels thick:
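Whatever the answer to the question, the two-pixel rule of thumb is easy to test numerically for a given scan. The sketch below assumes the physical stroke width of the printed letters is known. The 0.2 mm value and the function names are hypothetical, chosen only for illustration.

```python
MM_PER_INCH = 25.4

def stroke_pixels(stroke_mm: float, dpi: float) -> float:
    """Width in pixels of a pen or print stroke scanned at the given resolution."""
    return stroke_mm / MM_PER_INCH * dpi

def adequately_sampled(stroke_mm: float, dpi: float, min_pixels: float = 2.0) -> bool:
    """Apply the rule of thumb from the slide: strokes at least min_pixels thick."""
    return stroke_pixels(stroke_mm, dpi) >= min_pixels

def min_dpi(stroke_mm: float, min_pixels: float = 2.0) -> float:
    """Lowest resolution at which the stroke still spans min_pixels pixels."""
    return min_pixels * MM_PER_INCH / stroke_mm

# Hypothetical 0.2 mm stroke:
print(round(stroke_pixels(0.2, 300), 2))  # about 2.36 pixels at 300 dpi
print(adequately_sampled(0.2, 150))       # False: roughly 1.18 pixels
print(round(min_dpi(0.2)))                # 254 dpi is the break-even resolution
```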

  10. Zoom in

  11. Zoom in

  12. Results from OCR: “It is in this context that I…” “… and an additional protocol on the basis…”

  13. Chinese in pixels

  14. Chinese vector images from OCR: 滤器

  15. Arabic in pixels

  16. Arabic vector images from OCR: هذ ا وشملت

  17. InftyReader, an OCR system for math documents. Sample text (the equations themselves are not preserved in this transcript): “(12) where a . The indices now range from 1 to 5. The bosonic fields obey the commutation rules (13)”

  18. Thank you
