300 likes | 429 Views
Optimization of Line Segmentation Techniques for Thai Handwritten Document. Olarik Surinta Mahasarakham University Thailand. Introduction. In handwritten recognition, the line segmentation is an essential scheme.
E N D
Optimization of Line Segmentation Techniques for Thai Handwritten Document OlarikSurinta Mahasarakham University Thailand
Introduction • In handwritten recognition, the line segmentation is an essential scheme. • The occurrence of an inaccurately line segmentation will cause errors in the character segmentation. • Most of line segmentation techniques have been based on horizontal projection profile technique. SNLP2009
Introduction (cont) • The texts in most document images are aligned along horizontal lines. • Projection profile based techniques may be one of the most successful top-down algorithms SNLP2009
The characteristic of Thai character SNLP2009
The characteristic of Thai character (cont) Thai sentence structure SNLP2009
Based on the horizontal projection profile • The horizontal projection profile is used in dividing the text image into character line. SNLP2009
The line segmentation techniques in this research • 1. The horizontal projection technique • 2. The stripe technique • 3. The comparing Thai character technique • 4. The sorting and distinguishing • (all of techniques based on horizontal projection profile) SNLP2009
1. The horizontal projection profile Horizontal histogram result Image document SNLP2009
2. The stripe technique • Firstly, the stripe technique divides image into stripe (small column). • After that, the horizontal projection profile is used to divided the text image into character lines. SNLP2009
2. The stripe technique (cont) The result of strip technique for horizontal projection profile SNLP2009
3. Technique for comparing Thai character • This technique takes advantage of the differences in size of characters to differentiate Thai characters between consonants and a group of small vowels and tones. Comparing between consonant and a group of small vowel and tone SNLP2009
3. Technique for comparing Thai character (cont) • First step • The groups are divided into two groups (upper and lower zone). The higher group is then used to define the line from the image document SNLP2009
3. Technique for comparing Thai character (cont) The result of first step of comparison Thai character technique SNLP2009
3. Technique for comparing Thai character (cont) • Second step • consider the high value of white pixel between the line markers and choose a new line marker SNLP2009
3. Technique for comparing Thai character (cont) • this technique is complex as there are many steps to be proved. The result of comparing Thai character technique. SNLP2009
4. The new technique for sorting and distinguishing • This technique is not complicated and suitable for Thai character. • Firstly • Use the histogram of horizontal projection profile to sort the group of black pixels by starting with the minimum to maximum of black pixel. SNLP2009
4. The new technique for sorting and distinguishing (cont) Sorting the group of black pixels. SNLP2009
4. The new technique for sorting and distinguishing (cont) • Secondly • Find the maximum difference between two groups of black pixels • The line marker is marked on the middle of the group of black pixels when the maximum difference value is less than value of the group of black pixels SNLP2009
4. The new technique for sorting and distinguishing (cont) The result of second step of sorting and distinguishing technique. SNLP2009
4. The new technique for sorting and distinguishing (cont) • Finally • A new line marker is placed in the middle between every two conjunction line markers SNLP2009
4. The new technique for sorting and distinguishing Click me to play this video. SNLP2009
Experimental result • Thai image documents were generated from different peoples. • Data sets contained varieties of writing styles, and limited to only single-column Thai image documents. SNLP2009
Experimental result (cont) Single-Column Not Single-Column SNLP2009
Experimental result (cont) • The line marker is used to define the character line. • The line makers pass through the image document and do not cross the group of black pixels (line segment is completed). SNLP2009
Experimental result (cont) Complete line segmentation SNLP2009
Experimental result (cont) incomplete line segmentation SNLP2009
Experimental result (cont) T1 is Horizontal projection technique T2 is Stripe technique T3 is Comparing Thai character T4 is Sorting and distinguishing SNLP2009
Conclusion • I have presented four techniques for the line segmentation of Thai language • Horizontal projection profile • Stripe • Comparing Thai character • Sorting and distinguishing • 4 techniques based on horizontal projection profile SNLP2009
Conclusion (cont) • The accuracy of the techniques are • The horizontal projection technique 24.33% • The stripe technique 19.44% (suitable for English character and Oriya text) • The comparing Thai character technique 65.25% • The sorting and distinguishing technique 97.11% (complex, many steps to be proved) SNLP2009
Thank you Question & Answer SNLP2009