Efforts to understand, manipulate, and archive valuable historical handwritten manuscripts through digitization, increasing accessibility and allowing for automatic processing. Provides insights into tangible and intangible cultural aspects from the past.
Motivation • Historical handwritten manuscripts are valuable cultural heritage • They provide insights into both tangible and intangible cultural aspects of the past • Efforts to understand, manipulate and archive historical manuscripts • Digitization increases accessibility and allows automatic processing *Courtesy: wadod.com, Genizah Project
Outline • Background • Challenges • Seam Carving • Text line representation by seams • Energy Map • Seam Generation • Experimental Results • Summary
Image representation N x M (Matrix)
Binarization • Figure: intensity histogram (axes: intensity vs. number of pixels)
Connectivity & Components • We can define 4- or 8-paths depending on the type of connectivity specified • A set of pixels S is a connected component iff for each pixel pair (x1,y1) ∈ S and (x2,y2) ∈ S there is a path between them such that every two successive pixels in the path are in S and are X-neighbors (X = 4 or 8) • Figure: 8-neighborhood and 4-neighborhood
Connected Component • One word, but 3 connected components
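The definition of a connected component can be illustrated with a small flood-fill sketch. The function name `count_components` and the list-of-lists image format (1 = ink, 0 = background) are assumptions made for this example, not part of the slides.

```python
from collections import deque

def count_components(binary, connectivity=8):
    """Count connected components of 1-pixels under 4- or 8-connectivity."""
    if connectivity == 4:
        offsets = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    else:
        offsets = [(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1)
                   if (dr, dc) != (0, 0)]
    rows, cols = len(binary), len(binary[0])
    seen = [[False] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if not binary[r][c] or seen[r][c]:
                continue
            # New component found: flood-fill everything reachable from (r, c).
            count += 1
            seen[r][c] = True
            queue = deque([(r, c)])
            while queue:
                cr, cc = queue.popleft()
                for dr, dc in offsets:
                    nr, nc = cr + dr, cc + dc
                    if (0 <= nr < rows and 0 <= nc < cols
                            and binary[nr][nc] and not seen[nr][nc]):
                        seen[nr][nc] = True
                        queue.append((nr, nc))
    return count

# A diagonal stroke: one component under 8-connectivity, three under 4-connectivity.
diagonal = [[1, 0, 0],
            [0, 1, 0],
            [0, 0, 1]]
print(count_components(diagonal, 8), count_components(diagonal, 4))
```

The diagonal example shows why the choice of connectivity matters: the same pixels form one component or three, depending on whether diagonal neighbors count.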
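Distances • Given 2 points P = (u,v), Q = (x,y) • Euclidean Distance: $D_e(P,Q) = \sqrt{(u-x)^2 + (v-y)^2}$ • City Block Distance: $D_4(P,Q) = |u-x| + |v-y|$ • Chessboard Distance: $D_8(P,Q) = \max(|u-x|, |v-y|)$ • Example: for P = (1,8) and Q = (4,1), $D_e = \sqrt{58} \approx 7.62$, $D_4 = 10$, $D_8 = 7$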
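The three metrics can be checked numerically for the example points. This is a minimal sketch; the function names are chosen here for illustration.

```python
import math

def euclidean(p, q):
    """Straight-line distance between two points."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def city_block(p, q):
    """Sum of absolute coordinate differences (4-neighbor steps)."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])

def chessboard(p, q):
    """Largest absolute coordinate difference (8-neighbor steps)."""
    return max(abs(p[0] - q[0]), abs(p[1] - q[1]))

P, Q = (1, 8), (4, 1)
print(euclidean(P, Q), city_block(P, Q), chessboard(P, Q))
```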
Distance transform • Given a set of pixels S, calculate the distance of other pixels to S • The pixels in the set S will be considered as reference pixels • Let . We scan the image by a pre-defined connectivity : • First pass: Consider Green pixels (N1)
Distance transform • In the reverse scan, consider the blue pixels (N2) • Figure panels: first scan; distance transform
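The two scans can be sketched as follows for the chessboard metric. The exact neighbor masks behind the green (N1) and blue (N2) pixels are not recoverable from the slide text, so the masks below are an assumption: N1 holds the 8-neighbors already visited in a top-left to bottom-right raster scan, and N2 holds the remaining ones, visited first in the reverse scan.

```python
INF = float("inf")

def distance_transform(binary):
    """Two-pass chessboard distance to the nearest reference pixel (value 1)."""
    rows, cols = len(binary), len(binary[0])
    dist = [[0 if binary[r][c] else INF for c in range(cols)]
            for r in range(rows)]
    n1 = [(-1, -1), (-1, 0), (-1, 1), (0, -1)]  # assumed forward mask (N1)
    n2 = [(1, 1), (1, 0), (1, -1), (0, 1)]      # assumed backward mask (N2)

    def relax(r, c, offsets):
        # Keep the smaller of the current value and (neighbor distance + 1).
        for dr, dc in offsets:
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols:
                dist[r][c] = min(dist[r][c], dist[nr][nc] + 1)

    for r in range(rows):                 # first pass: top-left to bottom-right
        for c in range(cols):
            relax(r, c, n1)
    for r in reversed(range(rows)):       # reverse pass: bottom-right to top-left
        for c in reversed(range(cols)):
            relax(r, c, n2)
    return dist

image = [[0, 0, 0, 0, 0],
         [0, 0, 1, 0, 0],
         [0, 0, 0, 0, 0]]
for row in distance_transform(image):
    print(row)
```

Pixels stay at infinity only when the image contains no reference pixel at all.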
Distance transform – (cont’d) • Figure: the Arabic letter Alef, printed and handwritten, shown as a binary representation and as a distance transform (chessboard metric), with the reference pixels marked
Sign Distance transform • Figure: the letter Alef, printed and handwritten, with its sign distance transform (chessboard metric)
Sign Distance transform – (cont’d) • The brighter the color, the larger the distance from the reference pixels • Figure panels: original document image; sign distance transform (SDT)
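Gradient • A gray-scale image I is defined as a two-dimensional function I(x,y) = gray level • The gradient of the image, $\nabla I$, is given by the formula: $\nabla I = \left( \frac{\partial I}{\partial x}, \frac{\partial I}{\partial y} \right)$ where: • $\frac{\partial I}{\partial x}$ is the derivative of the image in the horizontal direction • $\frac{\partial I}{\partial y}$ is the derivative of the image in the vertical direction • The magnitude of the gradient is defined by: $|\nabla I| = \sqrt{\left(\frac{\partial I}{\partial x}\right)^2 + \left(\frac{\partial I}{\partial y}\right)^2}$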
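A discrete version of the gradient magnitude might look like the sketch below. The slides do not say which derivative filter is used; central differences with replicated borders are an assumption made here for illustration.

```python
import math

def gradient_magnitude(img):
    """Gradient magnitude using central differences (an assumed discretization)."""
    rows, cols = len(img), len(img[0])

    def px(r, c):
        # Replicate the border so every pixel has two neighbors per axis.
        r = min(max(r, 0), rows - 1)
        c = min(max(c, 0), cols - 1)
        return img[r][c]

    mag = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            gx = (px(r, c + 1) - px(r, c - 1)) / 2  # horizontal derivative
            gy = (px(r + 1, c) - px(r - 1, c)) / 2  # vertical derivative
            mag[r][c] = math.hypot(gx, gy)
    return mag

# A vertical edge: the response is concentrated around the intensity jump.
edge = [[0, 0, 10, 10] for _ in range(3)]
print(gradient_magnitude(edge)[1])
```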
Background Pre-Processing Segmentation Original *Courtesy: Islamic manuscript, Leipzig University Library, Germany
Text-line Extraction Assigning the same color to each text line ب ت ث يــجـ خـ حـ Original Manuscript Processed Manuscript *Courtesy: Juma Al-majid Center for Culture and Heritage, Dubai.
Challenges • Historical handwritten documents pose different challenges than machine-printed ones: • Looser layout format • Line proximity • Multi-oriented lines • Touching components • Different slope (within the same line) • Delayed strokes • Overlapping components • Figure: a 19th century master thesis – SAAB Medical Library, American University of Beirut
Seam Carving • Content-aware image resizing • An energy function defines an energy value for each pixel • A seam is an optimal 8-connected path of low-energy pixels • Figure panels: original image; calculated seams; gradient image; resized
Seam Carving – (cont’d) • Let I be an n × m image. Define a vertical seam to be: $s^x = \{s_i^x\}_{i=1}^{n} = \{(x(i), i)\}_{i=1}^{n}$, s.t. $\forall i,\ |x(i) - x(i-1)| \le 1$, where x is a mapping $x : [1, \ldots, n] \to [1, \ldots, m]$. • A seam contains one, and only one, pixel in each row of the image; otherwise a distorted image might be obtained. • The pixels of the path of a seam s will therefore be: $I_s = \{I(s_i)\}_{i=1}^{n} = \{I(x(i), i)\}_{i=1}^{n}$ • One can replace the bound 1 in the constraint with a value k, i.e. $|x(i) - x(i-1)| \le k$, and get either a simple column for k = 0, or even a completely disconnected set of pixels for larger k.
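The seam constraint can be made concrete with a small validity check. This is a sketch under stated assumptions: the function name is hypothetical, rows and columns are 0-based (the slide uses 1-based indices), and `xs[i]` holds the column chosen in row i, so the "one pixel per row" property holds by construction.

```python
def is_vertical_seam(xs, width, k=1):
    """Check that consecutive rows shift by at most k columns and stay in bounds."""
    if any(not 0 <= x < width for x in xs):
        return False
    return all(abs(a - b) <= k for a, b in zip(xs, xs[1:]))

print(is_vertical_seam([2, 3, 3, 2], width=5))    # connected seam
print(is_vertical_seam([2, 4, 3], width=5))       # jump of 2 breaks k = 1
print(is_vertical_seam([2, 2, 2], width=5, k=0))  # k = 0 forces a straight column
```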
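Seam Carving – (cont’d) • Given an energy function e, the cost of a seam is: $E(s) = E(I_s) = \sum_{i=1}^{n} e(I(s_i))$ • We look for the optimal seam s* that minimizes this cost: $s^* = \arg\min_{s} E(s) = \arg\min_{s} \sum_{i=1}^{n} e(I(s_i))$ • The optimal seam can be found using dynamic programming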
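The cost definition and the dynamic-programming search can be sketched as below. The function names and the 0-based list-of-lists energy map are assumptions for this example. Each row keeps, per column, the cheapest cost of any connected seam ending there, so the minimum of the last row is the cost of the optimal seam.

```python
def seam_cost(energy, xs):
    """E(s): sum of the energies along the seam, xs[i] being the column in row i."""
    return sum(energy[i][x] for i, x in enumerate(xs))

def min_vertical_seam_cost(energy):
    """Cost of the optimal vertical seam under the constraint |x(i) - x(i-1)| <= 1."""
    cols = len(energy[0])
    prev = list(energy[0])
    for row in energy[1:]:
        cur = []
        for j in range(cols):
            lo, hi = max(j - 1, 0), min(j + 1, cols - 1)
            # Cheapest seam reaching (i, j) extends the cheapest of its
            # three possible predecessors in the previous row.
            cur.append(row[j] + min(prev[lo:hi + 1]))
        prev = cur
    return min(prev)

energy = [[1, 4, 3],
          [5, 2, 6],
          [7, 8, 1]]
print(min_vertical_seam_cost(energy), seam_cost(energy, [0, 1, 2]))
```

In this example the diagonal seam through columns 0, 1, 2 attains the minimum, so both printed values agree.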
Text line representation by seams • Human perception of text lines • Tracks text lines by ink concentration and in-between line spaces • Two types of seams have been defined *Courtesy: Wadod Center for manuscripts.
Text line representation by seams – (cont’d) • The medial seam crosses the text area of a text line. • A separating seam is a path that passes between two consecutive text lines. • Figure labels: original document image; seam seed; medial seam; separating seam; processed *Courtesy: Wadod Center for manuscripts.
Energy Map • We use the sign distance transform (SDT) as an energy map • In the SDT, pixel values are assigned according to their distance from the nearest reference pixel • Recall that distance values are negative inside connected components and positive in between • Intuition: local minima and maxima points determine the medial and separating seams, respectively • Figure panels: original document image; sign distance transform (SDT) *Courtesy: Wadod Center for manuscripts
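A sign distance transform with this sign convention can be sketched with a multi-source breadth-first search, which yields chessboard distances when 8-neighbors are used. The choice of reference pixels is an assumption here: ink pixels get the negated distance to the nearest background pixel, and background pixels get the distance to the nearest ink pixel. The sketch also assumes the image contains both ink and background.

```python
from collections import deque

NEIGHBORS_8 = [(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1)
               if (dr, dc) != (0, 0)]

def chessboard_distance_to(mask, target):
    """Chessboard distance from every pixel to the nearest pixel equal to target."""
    rows, cols = len(mask), len(mask[0])
    dist = [[None] * cols for _ in range(rows)]
    queue = deque()
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] == target:
                dist[r][c] = 0
                queue.append((r, c))
    while queue:
        r, c = queue.popleft()
        for dr, dc in NEIGHBORS_8:
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and dist[nr][nc] is None:
                dist[nr][nc] = dist[r][c] + 1
                queue.append((nr, nc))
    return dist

def sign_distance_transform(ink):
    """Negative inside components (ink = 1), positive in between (assumed convention)."""
    to_ink = chessboard_distance_to(ink, 1)
    to_background = chessboard_distance_to(ink, 0)
    return [[-to_background[r][c] if ink[r][c] else to_ink[r][c]
             for c in range(len(ink[0]))] for r in range(len(ink))]

ink = [[0, 0, 0, 0, 0, 0],
       [0, 1, 1, 1, 0, 0],
       [0, 1, 1, 1, 0, 0],
       [0, 1, 1, 1, 0, 0],
       [0, 0, 0, 0, 0, 0]]
for row in sign_distance_transform(ink):
    print(row)
```

The most negative value sits at the center of the blob, which is where a medial seam would be drawn, while the largest positive values lie farthest from any ink.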
Seam Generation – (cont’d) • The SDT is traversed horizontally to compute a cumulative energy map, the seam map, for all possible connected seams for each entry (i,j): • The SDT is traversed with two passes to enhance text line patterns • Bi-linearly interpolate the resulting two maps • Figure panels: sign distance transform; right-to-left pass; left-to-right pass; interpolated map
Seam Generation – (cont’d) • The minimal entry of the last column is detected. • Backtrack from the minimal entry to find the medial seam. Original Document Image Seam Map – One pass Seam Map – Two passes
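The detection and backtracking steps can be sketched for a single left-to-right pass. This is a simplification under stated assumptions: it uses the standard seam-carving recurrence (entry energy plus the cheapest of the three neighbors in the previous column) rather than the exact recurrence from the slide, it omits the second pass and the interpolation, and the name `medial_seam` is hypothetical.

```python
def medial_seam(energy):
    """Return the row index of the minimal horizontal seam in every column."""
    rows, cols = len(energy), len(energy[0])
    seam_map = [[energy[i][0]] + [0] * (cols - 1) for i in range(rows)]
    back = [[0] * cols for _ in range(rows)]
    for j in range(1, cols):
        for i in range(rows):
            # Cheapest predecessor among rows i-1, i, i+1 of the previous column.
            candidates = range(max(i - 1, 0), min(i + 1, rows - 1) + 1)
            best = min(candidates, key=lambda r: seam_map[r][j - 1])
            seam_map[i][j] = energy[i][j] + seam_map[best][j - 1]
            back[i][j] = best
    # Detect the minimal entry of the last column, then backtrack to column 0.
    i = min(range(rows), key=lambda r: seam_map[r][cols - 1])
    seam = [i]
    for j in range(cols - 1, 0, -1):
        i = back[i][j]
        seam.append(i)
    seam.reverse()
    return seam

# Toy SDT-like map: negative values mark ink that drifts from row 1 to row 2.
sdt = [[ 2,  2,  2,  2],
       [-1, -2,  1,  1],
       [ 1,  1, -2, -1],
       [ 2,  2,  2,  2]]
print(medial_seam(sdt))
```

Because ink pixels carry negative values, the minimal seam is pulled through the ink and follows the text line even where it changes rows.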
Seam Generation – (cont’d) • Iteratively, all text lines will be extracted
Seam Generation – (cont’d) • Why, then, are separating seams needed? • To avoid recalculating the energy and seam maps after each line extraction • To avoid additional stroke classification (post-processing)
Seam Generation – (cont’d) • Separating seams define the boundaries of text lines • Generated with respect to the medial seam of the corresponding text line • Grown from seam seeds toward the two sides of the image guided by the SDT
Seam Generation – (cont’d) • A seam fragment is a connected group of pixels defined as the closest local maxima along the vertical direction • Seam fragments with low priority are discarded • A candidate set of seeds is constructed • The seed that generates the optimal (maximal cost) seam is chosen • Figure panels: medial seam; seam map; sign distance transform
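One possible reading of "closest local maxima along the vertical direction" is sketched below for a single image column: starting from the row of the medial seam, search upward and downward for the nearest local maximum of the SDT. The function name, the local-maximum test, and the handling of plateaus are assumptions for illustration; grouping such pixels into fragments, prioritizing them, and selecting seeds are not covered.

```python
def closest_local_maxima(column, medial_row):
    """Rows of the nearest local maxima above and below medial_row (None if absent)."""
    def is_local_max(r):
        # Interior pixel strictly above its upper neighbor and not below its lower one.
        return 0 < r < len(column) - 1 and column[r - 1] < column[r] >= column[r + 1]

    above = next((r for r in range(medial_row - 1, 0, -1) if is_local_max(r)), None)
    below = next((r for r in range(medial_row + 1, len(column) - 1)
                  if is_local_max(r)), None)
    return above, below

# One SDT column: the text line is centered at row 4 (most negative value).
column = [1, 3, 1, -2, -3, -1, 2, 4, 2, 1]
print(closest_local_maxima(column, medial_row=4))
```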
Seam Generation – (cont’d) • The separating seams may diverge from the medial seam due to the fork of ridges • A spring force anchored at the medial seam guides the separating seams • Figure panels: before; after
Touching/Overlapping Components • Usually, crossing overlapping components is avoided gracefully • Touching components are split too, but not necessarily in the optimal position Processed Processed
Experimental Results- (cont’d) Table 1: correctness of text line extraction Table 2: crossed components
Summary • Language-independent approach • Dynamic programming is used to find text lines • The energy map does not need to be recomputed after each text line extraction • Post-processing steps are avoided • Crossing overlapping components is avoided in most cases • More research is still needed to split touching components optimally