
Optical Character Recognition: Using the Ullman Algorithm for Graphical Matching Iddo Aviram


  1. Optical Character Recognition: Using the Ullman Algorithm for Graphical Matching. Iddo Aviram

  2. OCR- a Brief Review • Optical character recognition, usually abbreviated to OCR, is the mechanical or electronic translation of scanned images of handwritten, typewritten or printed text into machine-encoded text. • OCR is a task, not a mathematically defined problem.

  3. OCR- a Brief Review • People draw on many disciplines for OCR: decision making, Fourier transforms, expert systems, topology, machine learning, pattern matching, neural networks, optimization problems, differential geometry and computer vision. • We will show just one simple, non-representative approach that deals with part of the OCR task.

  4. OCR- a Brief Review • The task can be very hard, and state-of-the-art algorithms may not be good enough for some practical purposes. In several cases, however, OCR tools perform well and are useful.

  5. OCR- a Brief Review • The human brain does amazingly well at OCR tasks, so computer results are usually evaluated by comparison with manually created ground-truth data. • However, sometimes even humans cannot recognize the text.

  6. OCR- a Brief Review • Can you read these scripts? Real estate in Petah Tikva (top: from yad1.co.il, 2012; bottom: from "Havatzelet", 1912)

  7. OCR- a Brief Review • Can you read this script? An early version of "Shir Ke'ev" + "Shir Mayim Rabim", Meir Ariel, late 1970s

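  8. OCR- a Brief Review • Can you read this script? Inscription on a potsherd (ostracon), Horvat Uza, Iron Age II, 7th century BCE. Ink on pottery. Israel Antiquities Authority. Reading: “אֹמֶר למלך אֱמֹר לְבִלְבֵּל: הֲשָלֹם אַתָּ? והִבְרַכְתִּךָ לְקוֹס. וְעַתָּ תֵּן אֶת הָאֹכֶל אֲשֶר עִמַּד אֲחִאִמֹּה [ ] וְהֵרִם ע[ז]אל עַל מִזְ[בַּח קוֹס פֶּן יֶ]חְמַר הָאֹכֶל.” (Roughly: "The word of lmlk: say to Bilbel: Are you well? I bless you by Qos. And now, give the food that is with Ahi'immoh [ ] and let U[zi]el offer it on the al[tar of Qos, lest] the food spoil.")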

  9. OCR- Motivation for Graphical Matching • Using graphical tools for object recognition. • A possible scheme: (1) binarization, (2) segmentation by connected components, (3) thinning, (4) graphical modeling, (5) graphical matching, (6) rule-based selection.
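
The first two steps of such a scheme can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the fixed threshold of 128, the choice of 8-connectivity, and the tiny `gray` image are all hypothetical and not taken from the slides; real pipelines usually pick the threshold adaptively.

```python
from collections import deque

def binarize(gray, threshold=128):
    """Map a grayscale image (rows of 0-255 ints) to 1 = ink, 0 = background.
    The fixed threshold is an assumption made for this sketch."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]

def connected_components(binary):
    """Group ink pixels into 8-connected components (lists of (row, col))."""
    rows, cols = len(binary), len(binary[0])
    seen = [[False] * cols for _ in range(rows)]
    components = []
    for r in range(rows):
        for c in range(cols):
            if binary[r][c] == 0 or seen[r][c]:
                continue
            # Breadth-first flood fill starting from an unvisited ink pixel.
            queue, pixels = deque([(r, c)]), []
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and binary[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            components.append(pixels)
    return components

# Hypothetical 3x5 image: two dark vertical strokes on a light background.
gray = [
    [250, 20, 240, 245, 250],
    [250, 30, 240, 10, 250],
    [250, 25, 240, 15, 250],
]
binary = binarize(gray)
print(len(connected_components(binary)))  # prints 2
```

Each component would then be thinned and turned into a graph before matching.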

  10. OCR- Motivation for Graphical Matching • Binarization:

  11. OCR- Motivation for Graphical Matching • Segmentation -> Thinning -> Graphical modeling:

  12. OCR- Motivation for Graphical Matching • Given a historical manuscript, a blessing of Brit Milah:

  13. OCR- Motivation for Graphical Matching • We’re interested in finding the occurrences of the letter Mem (not final):

  14. OCR- Motivation for Graphical Matching • By subgraph matching we can find candidates. [Figure: graphical modeling, followed by graphical matching.]

  15. Subgraph Isomorphism Problem • Given two graphs $H$ and $G$ as input, the problem is whether $H$ has a subgraph that is isomorphic to $G$. • In this example the answer is ‘yes’, since there is an isomorphic correspondence: 1G-1H, 2G-3H, 3G-2H. (There are additional isomorphic correspondences.)

  16. Subgraph Isomorphism Problem • Graph isomorphism • Graphs $G(V_G, E_G)$ and $H(V_H, E_H)$ are isomorphic if $|V_G| = |V_H|$ and there is an invertible function $F$ from $V_G$ to $V_H$ such that for all nodes $u$ and $v$ in $V_G$, $(u,v) \in E_G$ if and only if $(F(u), F(v)) \in E_H$. • Such a function $F$ is said to be an isomorphic correspondence.
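
The definition can be checked mechanically for a given candidate $F$. The sketch below assumes simple undirected graphs given as vertex lists and edge lists; the function name and the two small graphs are hypothetical, chosen only for illustration.

```python
def is_isomorphic_correspondence(v_g, e_g, v_h, e_h, f):
    """Return True if the dict f (vertex of G -> vertex of H) is an
    isomorphic correspondence between undirected graphs G and H."""
    # f must be defined on all of V_G and be invertible onto V_H.
    if set(f) != set(v_g) or set(f.values()) != set(v_h):
        return False
    if len(set(v_g)) != len(set(v_h)):
        return False
    edges_g = {frozenset(e) for e in e_g}
    edges_h = {frozenset(e) for e in e_h}
    # (u, v) in E_G  if and only if  (F(u), F(v)) in E_H.
    return all(
        (frozenset((u, v)) in edges_g) == (frozenset((f[u], f[v])) in edges_h)
        for u in v_g for v in v_g
    )

# Hypothetical example: G is the path 1-2-3, H is the path 1-3-2.
v_g, e_g = [1, 2, 3], [(1, 2), (2, 3)]
v_h, e_h = [1, 2, 3], [(1, 3), (3, 2)]
print(is_isomorphic_correspondence(v_g, e_g, v_h, e_h, {1: 1, 2: 3, 3: 2}))  # True
print(is_isomorphic_correspondence(v_g, e_g, v_h, e_h, {1: 1, 2: 2, 3: 3}))  # False
```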

  17. Subgraph Isomorphism Problem • The subgraph isomorphism problem is NP-complete. • There is a very simple reduction: CLIQUE $\le_P$ Subgraph Isomorphism. • However, for many specific types of practical problems (even with ‘big’ inputs), algorithms answer quickly.
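
The reduction amounts to asking whether the complete graph on $k$ vertices is subgraph isomorphic to $H$. A minimal sketch follows, with a brute-force solver standing in for a real subgraph isomorphism algorithm; the function names and the example graph are hypothetical, and vertices are assumed to be numbered from 0.

```python
from itertools import combinations, permutations

def is_subgraph_isomorphic(n_g, edges_g, n_h, edges_h):
    """Brute force: try every injective map from V_G = {0..n_g-1}
    into V_H = {0..n_h-1} and test that every edge of G lands on an edge of H."""
    e_h = {frozenset(e) for e in edges_h}
    for image in permutations(range(n_h), n_g):
        if all(frozenset((image[u], image[v])) in e_h for u, v in edges_g):
            return True
    return False

def has_clique(n_h, edges_h, k):
    """CLIQUE reduced to subgraph isomorphism: the pattern G is K_k."""
    clique_edges = list(combinations(range(k), 2))
    return is_subgraph_isomorphic(k, clique_edges, n_h, edges_h)

# Hypothetical H: a triangle 0-1-2 with a pendant vertex 3.
edges_h = [(0, 1), (1, 2), (0, 2), (2, 3)]
print(has_clique(4, edges_h, 3))  # True
print(has_clique(4, edges_h, 4))  # False
```

Building $K_k$ takes polynomial time, so any polynomial-time algorithm for subgraph isomorphism would also solve CLIQUE.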

  18. The Ullman Algorithm • An Algorithm for Subgraph Isomorphism, J. R. Ullmann, Journal of the ACM, 1976. • Although old, this algorithm is still very popular and gives good results in practice.

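  19. The Ullman Algorithm • There are algebraic formulations of graph isomorphism and subgraph isomorphism, which we will make use of. • The adjacency matrix $A_H$ of a graph $H$ has entry $a^H_{ij} = 1$ if vertices $i$ and $j$ of $H$ are adjacent, and $0$ otherwise. [Figure: an example graph $H$ and its adjacency matrix $A_H$.]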
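
As a small illustration, an adjacency matrix can be built from an edge list as below. The 4-vertex graph is a hypothetical stand-in, not the graph from the original slide, and vertices are assumed to be numbered from 0.

```python
def adjacency_matrix(n, edges):
    """Adjacency matrix of an undirected graph on vertices 0..n-1."""
    a = [[0] * n for _ in range(n)]
    for u, v in edges:
        a[u][v] = 1
        a[v][u] = 1  # undirected, so the matrix is symmetric
    return a

# Hypothetical H: the path 0-2-1-3.
a_h = adjacency_matrix(4, [(0, 2), (2, 1), (1, 3)])
for row in a_h:
    print(row)
```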

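  20. The Ullman Algorithm • We will use the notion of a permutation matrix. • Any permutation matrix $M'$ is equivalent to an isomorphic correspondence $F$, written $F \sim M'$: $m'_{ij} = 1$ exactly when $F$ maps vertex $i$ of $G$ to vertex $j$ of $H$. [Figure: an isomorphic correspondence $F$ and the equivalent permutation matrix $M'$.]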

  21. The Ullman Algorithm • Two graphs $G$ and $H$ are isomorphic with a correspondence $F$ $\iff$ $A_G$ is similar to $A_H$, and the similarity matrix is $M' \sim F$. • Isomorphism criterion: $A_G = M'(M'A_H)^T$ iff $G$ is isomorphic to $H$, with a correspondence $F \sim M'$. [Figure: an isomorphic correspondence $F$ and the equivalent permutation matrix $M'$.]

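  22. The Ullman Algorithm • We can develop this equation that defines an isomorphism: $A_G = M'(M'A_H)^T = M'A_H^T M'^T = M'A_H M'^T$, since $A_H$ is a symmetric matrix. Since $M'$ is an orthonormal matrix, $M'^T M' = I$, thus $M'^T = M'^{-1}$ and $A_G = M'A_H M'^{-1}$, that is, $A_G$ is similar to $A_H$. • Isomorphism criterion: $A_G = M'(M'A_H)^T$ iff $G$ is isomorphic to $H$, with a correspondence $F \sim M'$.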

  23. The Ullman Algorithm • In a similar fashion (without proof) we have an algebraic criterion for a subgraph isomorphism. • Subgraph isomorphism criterion: $\forall i,j:\ (a^G_{ij} = 1) \Rightarrow (c_{ij} = 1)$, where $C = M'(M'A_H)^T$, iff $G$ is subgraph isomorphic to $H$, with a correspondence $F \sim$ rectangular $M'$. [Figure: the correspondence $F$ (1G-1H, 2G-3H, 3G-2H, 4G-φ) and the equivalent rectangular permutation matrix $M'$.]
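
The criterion can be tested numerically for a given rectangular $M'$. The sketch below uses plain nested lists; the helper names and the two graphs (a 3-vertex path $G$ and a 4-vertex path $H$, numbered from 0) are assumptions made for illustration, with $M'$ encoding the correspondence 1G-1H, 2G-3H, 3G-2H in 0-based form.

```python
def matmul(x, y):
    """Product of two matrices given as nested lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]

def transpose(x):
    return [list(row) for row in zip(*x)]

def satisfies_criterion(a_g, a_h, m):
    """Check that a_g[i][j] == 1 implies c[i][j] == 1 for C = M (M A_H)^T."""
    c = matmul(m, transpose(matmul(m, a_h)))
    n = len(a_g)
    return all(c[i][j] == 1
               for i in range(n) for j in range(n) if a_g[i][j] == 1)

# Hypothetical G: path 0-1-2. Hypothetical H: path 0-2-1-3.
a_g = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
a_h = [[0, 0, 1, 0], [0, 0, 1, 1], [1, 1, 0, 0], [0, 1, 0, 0]]

m_good = [[1, 0, 0, 0], [0, 0, 1, 0], [0, 1, 0, 0]]  # 0->0, 1->2, 2->1
m_bad = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0]]   # 0->0, 1->1, 2->2
print(satisfies_criterion(a_g, a_h, m_good))  # True
print(satisfies_criterion(a_g, a_h, m_bad))   # False
```

With one 1 per row and at most one 1 per column, $c_{ij}$ is simply the entry of $A_H$ at the images of $i$ and $j$, so the test asks that every edge of $G$ be mapped onto an edge of $H$.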

  24. The Ullman Algorithm • We have a graph $G$ and a graph $H$, and we want to know whether $G$ is subgraph isomorphic to $H$. • So we will search for a permutation matrix $M^*$ of size $|V_G| \times |V_H|$ that satisfies the subgraph isomorphism criterion. • We will enumerate candidate permutation matrices of the same size, denoting a candidate by $M'$, from a set of candidates that satisfies: (the set of all $M^*$-s) $\subseteq$ (the set of all $M'$-s). During the enumeration, we check the subgraph isomorphism criterion for each candidate. If a candidate satisfies the criterion, we return ‘yes’. If we find no such candidate, we return ‘no’.

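  25. The Ullman Algorithm • Ullmann’s algorithm I • Construction of another matrix $M^{(0)}$ with the same size as the $M'$-s: $m^{(0)}_{ij} = 1$ if the degree of vertex $j$ of $H$ is at least the degree of vertex $i$ of $G$, and $0$ otherwise. • Generation of all $M'$-s by setting to 0 all but one 1 in each row of $M^{(0)}$. • A subgraph isomorphism has been found if some $M'$ satisfies: $\forall i,j:\ (a^G_{ij} = 1) \Rightarrow (c_{ij} = 1)$, where $C = M'(M'A_H)^T$.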
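
A minimal sketch of this plain enumeration is given below, assuming the degree-based $M^{(0)}$ from Ullmann's paper and 0-based adjacency matrices as nested lists. A candidate $M'$ is stored compactly as a list `assign`, where `assign[i]` is the column of the single 1 in row `i`; the function name and example graphs are hypothetical.

```python
def ullmann_enumerate(a_g, a_h):
    """Depth-first enumeration of candidate matrices M' derived from M0.
    Returns the first matching assignment, or None if there is none."""
    p, q = len(a_g), len(a_h)
    deg_g = [sum(row) for row in a_g]
    deg_h = [sum(row) for row in a_h]
    # M0[i][j] = 1 if vertex j of H has at least the degree of vertex i of G.
    m0 = [[1 if deg_h[j] >= deg_g[i] else 0 for j in range(q)]
          for i in range(p)]

    def is_match(assign):
        # Element-wise form of: a_g[i][k] == 1 implies c[i][k] == 1.
        return all(a_h[assign[i]][assign[k]] == 1
                   for i in range(p) for k in range(p) if a_g[i][k] == 1)

    def search(row, assign, used):
        if row == p:  # leaf: every row of M' now holds exactly one 1
            return assign if is_match(assign) else None
        for j in range(q):
            # Keep one 1 in this row; columns may be used at most once.
            if m0[row][j] == 1 and j not in used:
                found = search(row + 1, assign + [j], used | {j})
                if found is not None:
                    return found
        return None

    return search(0, [], frozenset())

# Hypothetical G: path 0-1-2. Hypothetical H: path 0-2-1-3.
a_g = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
a_h = [[0, 0, 1, 0], [0, 0, 1, 1], [1, 1, 0, 0], [0, 1, 0, 0]]
print(ullmann_enumerate(a_g, a_h))  # [0, 2, 1]

triangle = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
print(ullmann_enumerate(triangle, a_h))  # None
```

The criterion is only evaluated at the leaves here, so the search visits every candidate that $M^{(0)}$ allows.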

  26. The Ullman Algorithm • Ullmann’s algorithm I • Example: [Figure: the search tree. Root: $M^{(0)}$; inner nodes: $M$-s; leaves: $M'$-s.]

  27. The Ullman Algorithm • Ullmann’s algorithm II • Construction of another matrix $M^{(0)}$ with the same size as the $M'$-s, as in algorithm I. • Generation of all $M'$-s by setting to 0 all but one 1 in each row of $M^{(0)}$. However, in this version, we also prune every inner node $M$ that has at least one 1 entry that does not comply with the refinement rule (to be defined). We are guaranteed to end up with the right answer since we still have: (the set of all $M^*$-s) $\subseteq$ (the set of all $M'$-s). • A subgraph isomorphism has been found if there is an $M'$ that satisfies: $\forall i,j:\ (a^G_{ij} = 1) \Rightarrow (c_{ij} = 1)$, where $C = M'(M'A_H)^T$.

  28. The Ullman Algorithm • Ullmann’s refinement rule for pruning the search tree: • Observation: • If a vertex $i$ of $G$ corresponds to a vertex $j$ of $H$, then for each vertex $x$ adjacent to $i$ in $G$ there must be a vertex $y$ in $H$ such that: • A. $y$ is adjacent to $j$ in $H$ • B. $y$ corresponds to $x$

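  29. The Ullman Algorithm • Algebraic notation: • For all $m_{i,j} = 1$ (that is already fixed): $\forall x:\ (a^G_{i,x} = 1) \Rightarrow \exists y:\ (m_{x,y} \cdot a^H_{y,j} = 1)$, where $a^G$ and $a^H$ denote entries of $A_G$ and $A_H$. • Any inner node $M$ that does not satisfy this rule is pruned, because none of its descendants is an $M^*$.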
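
The pruned search can be sketched as follows. This is an illustration under assumptions rather than a faithful reproduction of Ullmann's published procedure: it applies the rule only to rows that are already fixed, as the slide states, and prunes the whole node on failure. The function name, the degree-based $M^{(0)}$, and the example graphs are hypothetical choices.

```python
def ullmann_with_refinement(a_g, a_h):
    """Search over matrices M, pruning inner nodes that break the rule.
    Returns [column of the 1 in row i for each i], or None."""
    p, q = len(a_g), len(a_h)
    deg_g = [sum(row) for row in a_g]
    deg_h = [sum(row) for row in a_h]
    m0 = [[1 if deg_h[j] >= deg_g[i] else 0 for j in range(q)]
          for i in range(p)]

    def complies(m, fixed_rows):
        # For every fixed m[i][j] == 1 and every neighbour x of i in G,
        # some y with m[x][y] == 1 must be adjacent to j in H.
        for i in range(fixed_rows):
            j = m[i].index(1)
            for x in range(p):
                if a_g[i][x] == 1 and not any(
                        m[x][y] == 1 and a_h[y][j] == 1 for y in range(q)):
                    return False
        return True

    def search(m, row, used):
        if not complies(m, row):
            return None  # prune: no descendant can be a valid M*
        if row == p:
            # At a leaf the rule coincides with the subgraph criterion.
            return [r.index(1) for r in m]
        for j in range(q):
            if m[row][j] == 1 and j not in used:
                child = [r[:] for r in m]
                child[row] = [1 if y == j else 0 for y in range(q)]
                found = search(child, row + 1, used | {j})
                if found is not None:
                    return found
        return None

    return search(m0, 0, frozenset())

# Hypothetical G: path 0-1-2. Hypothetical H: path 0-2-1-3.
a_g = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
a_h = [[0, 0, 1, 0], [0, 0, 1, 1], [1, 1, 0, 0], [0, 1, 0, 0]]
print(ullmann_with_refinement(a_g, a_h))  # [0, 2, 1]
```

In this example the branch that maps vertex 1 of $G$ to vertex 1 of $H$ is cut as soon as it is fixed, because vertex 0 of $H$ is not adjacent to vertex 1 of $H$.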
