
1. Problem


elvis

Presentation Transcript


  1. 1. Problem
     • Many archived two-sided manuscript documents suffer from bleed-through.
     • Bleed-through can be effectively removed offline using image-processing algorithms.
     • A remotely located researcher may want to access both the original and the corrected version of a document.
     • We want to avoid sending the document twice, since both versions are very similar.
     [Figure: recto and verso of a manuscript page]

  2. 3. Algorithm Details: Registration
     • We assume that the continuous recto and verso image coordinate frames are related by a six-parameter affine transformation.
     • We search for the parameter vector that gives the best match, in the least-squares sense, between the recto and the transformed flipped verso.
     • We identify the registered verso image.
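The registration step can be illustrated with a minimal Python sketch. It is not the authors' implementation: the slide does not say how the parameter vector is searched, so this example evaluates a least-squares cost for any six-parameter vector and then, as a simplifying assumption, searches integer translations only. The function names, the nearest-neighbour sampling, and the list-of-rows image layout are all hypothetical choices made for the illustration.

```python
def flip_horizontal(img):
    """Mirror an image (list of rows) left to right, as when a page is turned over."""
    return [row[::-1] for row in img]


def affine(p, x, y):
    """Map recto coordinates (x, y) to verso coordinates with p = (a, b, c, d, e, f)."""
    a, b, c, d, e, f = p
    return a * x + b * y + c, d * x + e * y + f


def ssd_cost(recto, verso_flipped, p):
    """Mean squared difference between the recto and the warped flipped verso.

    Nearest-neighbour sampling is used; recto pixels that map outside the
    verso are skipped (an assumption, the slide does not discuss borders).
    """
    total, count = 0.0, 0
    for y in range(len(recto)):
        for x in range(len(recto[0])):
            u, v = affine(p, x, y)
            ui, vi = int(round(u)), int(round(v))
            if 0 <= vi < len(verso_flipped) and 0 <= ui < len(verso_flipped[0]):
                diff = recto[y][x] - verso_flipped[vi][ui]
                total += diff * diff
                count += 1
    return total / count if count else float("inf")


def search_translation(recto, verso_flipped, max_shift=3):
    """Brute-force search over integer translations only (a = e = 1, b = d = 0).

    A full solution would optimise all six parameters; this restricted
    search is only a stand-in to show the least-squares criterion at work.
    """
    best = None
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            p = (1.0, 0.0, float(dx), 0.0, 1.0, float(dy))
            cost = ssd_cost(recto, verso_flipped, p)
            if best is None or cost < best[0]:
                best = (cost, p)
    return best
```

Calling `search_translation(recto, flip_horizontal(verso))` returns the lowest cost found and the corresponding parameter vector; resampling the flipped verso with that vector would give the registered verso image.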

  3. 4. Joint Compression
     • Based on existing standards.
     • The original, uncorrected image is compressed with a standard efficient compression scheme such as JPEG or JPEG 2000.
     • The segmentation map is compressed using an efficient bilevel compression scheme, such as JBIG or JBIG2.
     • Additional information for inpainting is transmitted as side information.
     [Figure labels: 4.6 Mbit, 131 kbit]
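To make the receiver side of this scheme concrete, here is a small Python sketch under stated assumptions. No real codec is called: the decoded original and the decoded bilevel map are plain lists of rows that stand in for the outputs of a JPEG/JPEG 2000 and a JBIG/JBIG2 decoder. The `JointPayload` type, its field names, and the use of a single fixed fill value as the inpainting side information are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class JointPayload:
    original: list   # decoded greyscale image (stand-in for a JPEG / JPEG 2000 stream)
    r2_mask: list    # decoded bilevel map, 1 = 'bleed-through only' (stand-in for JBIG / JBIG2)
    fill_value: int  # inpainting side information, assumed here to be one fixed value


def corrected_view(payload):
    """Rebuild the corrected image from the original plus the small extras.

    The original is left untouched, so the user can display either version
    although the image data itself was transmitted only once.
    """
    return [
        [payload.fill_value if masked else value
         for value, masked in zip(image_row, mask_row)]
        for image_row, mask_row in zip(payload.original, payload.r2_mask)
    ]
```

The design point is that the corrected version is derived at the receiver, so only the segmentation map and the side information travel in addition to the original.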

  4. 2. Bleed-through Removal Model
     • We assume the existence of underlying recto and verso images without bleed-through. These consist of the background with the writing superimposed.
     • These ideal recto and verso images are combined in some way to produce the observed recto and verso images, which are corrupted with bleed-through.
     • In general, the scanned recto and verso images (with bleed-through) will not be aligned.
     [Figure: recto and flipped verso images superimposed]

  5. Segmentation
     • We segment each side of the document into the four regions R1-R4. It is most important to identify region R2, ‘bleed-through only’, correctly. If we miss some parts of R2, bleed-through will remain. If the label R2 is incorrectly assigned to some parts of R1, ‘foreground only’, or R4, ‘foreground and bleed-through’, then parts of the desired writing will be erased.
     • We first identify points that can definitely be considered background (R3), because they are lighter than a certain threshold.
     • We then identify points that can be considered foreground (R1), because they are darker than the corresponding points on the other side.
     • Of the remaining points, those whose correlation between the two sides exceeds a correlation threshold are deemed to be bleed-through (R2). The rest are assigned to R4.
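The three decision steps can be sketched in Python as follows. This is only an illustration under assumptions the slide does not state: greyscale values where larger means lighter, a normalised cross-correlation computed over a small square window, and the default threshold values. The names `segment` and `local_correlation` are hypothetical, and `other` is assumed to be the already registered opposite side.

```python
from math import sqrt

R1, R2, R3, R4 = 1, 2, 3, 4  # foreground only, bleed-through only, background, overlap


def local_correlation(a, b, y, x, radius=1):
    """Normalised cross-correlation of images a and b in a window centred on (y, x)."""
    h, w = len(a), len(a[0])
    va, vb = [], []
    for j in range(max(0, y - radius), min(h, y + radius + 1)):
        for i in range(max(0, x - radius), min(w, x + radius + 1)):
            va.append(a[j][i])
            vb.append(b[j][i])
    ma, mb = sum(va) / len(va), sum(vb) / len(vb)
    num = sum((p - ma) * (q - mb) for p, q in zip(va, vb))
    den = sqrt(sum((p - ma) ** 2 for p in va) * sum((q - mb) ** 2 for q in vb))
    return num / den if den > 0 else 0.0  # flat windows count as uncorrelated


def segment(side, other, bg_threshold=200, corr_threshold=0.5, radius=1):
    """Label every pixel of `side` with R1-R4, applying the tests in slide order."""
    labels = []
    for y, row in enumerate(side):
        out = []
        for x, value in enumerate(row):
            if value > bg_threshold:        # lighter than the threshold: background
                out.append(R3)
            elif value < other[y][x]:       # darker than the other side: foreground
                out.append(R1)
            elif local_correlation(side, other, y, x, radius) > corr_threshold:
                out.append(R2)              # strongly correlated: bleed-through only
            else:
                out.append(R4)              # remaining points: overlap
        labels.append(out)
    return labels
```

The order of the tests matters: the correlation is evaluated only for points that are neither clearly background nor clearly foreground, which mirrors the "of the remaining points" wording above.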

  6. [Figure: original with bleed-through; result after bleed-through removal]

  7. Algorithm
     • Registration: alignment of recto and flipped verso
     • Segmentation: four regions
       • R1: Foreground only
       • R2: Bleed-through only
       • R3: Background
       • R4: Foreground and bleed-through overlap
     • Inpainting: region R2 filled in with an estimate of the background
     [Figures: recto and flipped verso images superimposed after registration; illustration of the four types of regions; inpainting applied to the circled region]

  8. Inpainting
     • Points labelled R2, ‘bleed-through only’, are replaced by suitable nearby points from the background region R3. In the initial work, a fixed value was used.
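Both variants mentioned above can be sketched in a few lines of Python. The slide does not define what makes a nearby background point "suitable", so this example assumes the mean of the R3 pixels in the smallest square neighbourhood that contains any; the function name, the label codes 2 and 3, and the `max_radius` limit are hypothetical.

```python
def inpaint(image, labels, fixed_value=None, max_radius=5, r2=2, r3=3):
    """Return a copy of `image` with every R2 pixel replaced by a background estimate.

    With `fixed_value` set, that value is used everywhere (the initial approach).
    Otherwise the mean of nearby R3 pixels is used; a pixel with no R3 pixel
    within `max_radius` is left unchanged.
    """
    h, w = len(image), len(image[0])
    out = [row[:] for row in image]
    for y in range(h):
        for x in range(w):
            if labels[y][x] != r2:
                continue
            if fixed_value is not None:
                out[y][x] = fixed_value
                continue
            for r in range(1, max_radius + 1):  # grow the window until background is found
                near = [image[j][i]
                        for j in range(max(0, y - r), min(h, y + r + 1))
                        for i in range(max(0, x - r), min(w, x + r + 1))
                        if labels[j][i] == r3]
                if near:
                    out[y][x] = sum(near) // len(near)
                    break
    return out
```

Only pixels labelled R2 are modified, so writing in R1 and R4 is never touched; an error in the segmentation is therefore the only way desired writing can be erased.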

  9. 5. Conclusion
     • Bleed-through can be effectively removed by jointly processing the recto and verso sides of a document.
     • More complex bleed-through removal algorithms can be used at the server side, with the result transmitted to the remote user.
     • It is not necessary to transmit the original and corrected versions separately to a user who wishes to see both.
     • All elements can be incorporated into JPEG 2000.
     • More work needs to be done on the segmentation and inpainting aspects of the algorithm.
