
Document Image Analysis CSE 717 An Introduction


gillum

Presentation Transcript


  1. Document Image Analysis CSE 717 An Introduction

  2. Document Image Analysis
     • DIA is the theory and practice of recovering the symbol structures of digital images scanned from paper or produced by computer
     • DIA is a subfield of digital image processing
     • Digital images of natural objects (X-rays, fingerprints, faces, scenery, etc.) are NOT part of DIA
     • Digital images of symbolic objects (postal addresses, printed articles, forms, music sheets, engineering drawings, topographic maps) belong to DIA
     • Source: scanners, printers, fax machines, hand!
     • Incidental text: license plates, billboards, subtitles, in photos and video
     • WWW ??
     • DIA's grand goal is to take us to the land of the paperless office

  3. Document Image Analysis
     • Textual Processing
       • Optical Character Recognition: text
       • Page Layout Analysis: skew, blocks, paragraphs
     • Graphical Processing
       • Line Processing: lines, curves, corners
       • Region and Symbol Processing: filled regions

  4. Document Image Analysis

  5. Postal Examples
     • Meter Mark
     • Digital Post Mark
     • Sender's Address
     • Endorsement: In Case of Undeliverable as Addressed Return to Sender
     • Linear Code
     • Delivery Address

  6. Forms

  7. Unconstrained Text

  8. Graphics Documents

  9. References
     • Handbook of Character Recognition and Document Image Analysis, H. Bunke and P. S. P. Wang (editors), World Scientific Press
     • Document Image Analysis, L. O'Gorman and R. Kasturi, IEEE Computer Society Press
     • International Conference on Document Analysis and Recognition proceedings
     • International Workshop on Document Analysis Systems proceedings
     • Symposium on Document Image Understanding Technology

  10. • OCR Features and Systems: script ID, Devanagari OCR, Tamil OCR, machine-printed (MP) versus handwritten (HW)
      • Handwriting Recognition: postal applications, Arabic documents
      • Classifiers and Learning: multi-classifier systems
      • Layout Analysis: skew correction, geometric methods, text/graphics separation, logical labeling
      • Tables and Forms: detecting tables in HTML documents, use of graph grammars, semantics
      • Document Engineering: processing of historical documents (palm leaf manuscripts)
      • Camera Based DIA: locating and reading barcodes
      • New Applications: CAPTCHA
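Skew correction, listed under Layout Analysis, is easy to illustrate in a few lines. The slides do not say which method the course uses, so the sketch below swaps in one classical approach, the projection-profile method: try a range of candidate angles, shear the ink pixels by each one, and keep the angle whose row histogram is most sharply peaked. The function names, the list-of-lists binary image format, and the synthetic test page are all assumptions made for illustration.

```python
import math


def estimate_skew(image, max_angle=5.0, step=0.5):
    """Estimate page skew in degrees with a projection-profile search.

    `image` is a list of rows, each a list of 0/1 values (1 = ink).
    This format and the search range are illustrative assumptions.
    """
    # Collect the (x, y) coordinates of every ink pixel once.
    ink = [(x, y) for y, row in enumerate(image) for x, v in enumerate(row) if v]
    best_angle, best_score = 0.0, -1.0
    steps = int(round(max_angle / step))
    for i in range(-steps, steps + 1):
        angle = i * step
        t = math.tan(math.radians(angle))
        profile = {}
        for x, y in ink:
            # Shear each pixel back by the candidate angle and bin it by row.
            r = round(y - x * t)
            profile[r] = profile.get(r, 0) + 1
        # Sum of squared row counts is largest when text lines collapse
        # into few, dense rows, i.e. when the candidate matches the skew.
        score = sum(c * c for c in profile.values())
        if score > best_score:
            best_angle, best_score = angle, score
    return best_angle


def make_skewed_page(theta_deg, width=200, height=80, baselines=(20, 40, 60)):
    """Build a synthetic binary page with three one-pixel text lines
    tilted by theta_deg (hypothetical helper for the demo only)."""
    t = math.tan(math.radians(theta_deg))
    page = [[0] * width for _ in range(height)]
    for y0 in baselines:
        for x in range(width):
            y = y0 + round(x * t)
            if 0 <= y < height:
                page[y][x] = 1
    return page


if __name__ == "__main__":
    page = make_skewed_page(2.0)
    print("estimated skew (degrees):", estimate_skew(page))
```

The sum-of-squares score is one common choice of sharpness measure; variance of the profile or the number of empty rows would serve the same purpose. A real system would follow the estimate with an image rotation by the negative of the returned angle.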
