Advanced OCR with OmniPage and FineReader

kuniko

Presentation Transcript


  1. Advanced OCR with OmniPage and FineReader

  2. Overview • Optical character recognition • Structural recognition • Options • Loading • Zoning • OCR • Editing

  3. Optical Character Recognition (OCR) • OCR turns pictures of text into e-text • Does well unless… • The picture is fuzzy • The contrast is poor • The font is unusual • The font is too small or too large • The material has unusual characters

  4. Structural Recognition • Analyzes the layout of the page • Columns • Headings • Graphics • Tables • Usually does fairly well, unless the layout is non-standard

  5. Programs that Run OCR • Programs for consumers • Kurzweil 1000, 3000 • OpenBook • Intel Reader • Many others… • Programs for production • ABBYY FineReader • Nuance OmniPage

  6. Consumer Programs • Highly automated • Designed for individuals who have print disabilities • Are not good production tools • Do not provide flexibility • Do not allow much overriding • Interfaces not designed for editing

  7. Production Programs in General • A good program for production allows you to… • Control the zones (areas or blocks of text and graphics) • Add, delete, change • Edit easily • Improve recognition

  8. Preferred Programs • ABBYY FineReader • Relatively easy to learn • Fairly intuitive • Good structural recognition • Nuance OmniPage • Less intuitive but more accessible • Often does better with technical materials

  9. Both Good Tools • If you can afford to have both, it’s nice, but not absolutely necessary. • If you have both, run a couple of test pages through each to see which is doing better on a particular job.
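
  One way to make the test-page comparison on slide 9 less subjective is to proofread a single page by hand and score each program's exported text against it. The following is only a minimal sketch using Python's standard library: the function names `similarity` and `pick_better` are hypothetical, and the sample strings are invented for illustration rather than real output from either program.

```python
from difflib import SequenceMatcher


def similarity(ocr_text: str, reference: str) -> float:
    """Return a 0..1 similarity score between OCR text and a proofread reference."""
    # Collapse runs of whitespace so line-wrapping differences are not counted as errors.
    a = " ".join(ocr_text.split())
    b = " ".join(reference.split())
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def pick_better(reference: str, outputs: dict[str, str]) -> tuple[str, dict[str, float]]:
    """Score each program's text against the reference and name the closest one."""
    scores = {name: similarity(text, reference) for name, text in outputs.items()}
    best = max(scores, key=scores.get)
    return best, scores


# Invented sample data: in practice, read the text each program exported for the test page.
reference = "The quick brown fox jumps over the lazy dog."
outputs = {
    "FineReader": "The quick brown fox jumps over the lazy dog.",
    "OmniPage": "The qu1ck brovvn fox jumps over the lazy dog.",
}

best, scores = pick_better(reference, outputs)
print(best, scores)
```

  A score like this only measures character-level agreement on the pages you tested, so it complements, rather than replaces, looking at how each program handled the layout.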

  10. Under the Hood • For best results with a program, set up your options before you begin! • Tools > Options

  11. Lots of Languages • FineReader and OmniPage handle multiple languages. • For foreign-language material, turn on every language used in the book. • The program will then recognize the diacritical marks. • Turn on what you need, but only what you need.

  12. Math • If you are running OCR on math, try turning on Greek. • Greek will allow the program to recognize alphas, deltas, sigmas, etc.

  13. Another Decision • Detect page orientation or not? • Detection does not always get it right • Try it if many of your pages are rotated

  14. Considerations • You may or may not want to keep headers and footers. • I generally keep them to pull the page numbers. • You may want to keep the page breaks. • Retaining page breaks helps to maintain one-to-one page correspondence with the book.

  15. Fitting Everything • In some cases, you may need to work with a custom paper size to fit everything onto one page. • This feature can be helpful when you are retaining everything on the page but not the layout.

  16. Loading Files • “Open” • Opens saved program files • “Load” • Loads image files to process • Note that the same distinction between program files and other files comes up with saving!

  17. Wizards Are Evil… • Do not rely on the automation • Load the image file and choose the processes you want

  18. Workspace • The program has three primary areas • Pages Pane • Either thumbnails or details • Allows simple navigation of pages • Image Pane • Your graphic • Text Pane • Area where the text from OCR will show

  19. More Accessible • Both programs have a detail view. • Shows text instead of graphics • Detail view is more accessible for screen readers. • Otherwise, it is personal preference.

  20. Two Ways to Save • To save the program file so you can reopen it later in the OCR program, choose File > Save • This saves your work file. • You save your converted file during the last phase of the processing.

  21. Production Tips • Work with dual monitors • Check your computer and video card • Stretching an OCR program across two monitors is a HUGE time-saver! • Learn to use keyboard shortcuts. • They save tons of time!
