Joint Bi-Level Image Experts Group (JBIG)
JBIG
• The Joint Bi-Level Image Experts Group (JBIG) reports both to ISO/IEC JTC 1/SC 29/WG 1 and to ITU-T SG 8.
• It produced the international standards for bi-level image compression: the lossless standard informally known as JBIG or JBIG1, and its successor JBIG2, which supports both lossy and lossless compression.
• ISO - International Organization for Standardization.
• IEC - International Electrotechnical Commission.
• JTC 1 - Joint ISO/IEC Technical Committee on information technology.
• SC 29 - subcommittee responsible for coding of audio, picture, multimedia, and hypermedia information.
• WG 1 - working group that deals with coding of still pictures; it includes both JBIG and JPEG, the Joint Photographic Experts Group.
• ITU-T - Telecommunication Standardization Sector of the International Telecommunication Union.
• SG 8 - study group that deals with characteristics of telematic systems.
JBIG1
• JBIG1 is a lossless image compression standard from the group.
• JBIG1 was designed for compression of binary images, particularly faxes, but can also be used on other images.
• JBIG1 uses an adaptive arithmetic coder, the QM-coder, which derives from IBM's patented Q-coder.
• JBIG1 also supports progressive transmission with a small overhead (around 5%).
JBIG1 Patents
• For many years, doubts about the patent situation have prevented the JBIG1 standard from becoming widely used on the Internet (e.g., not a single web browser has integrated support for it).
• The last JBIG1 patent is believed to expire on 12 February 2011.
• http://www.cl.cam.ac.uk/~mgk25/jbigkit/patents.html
Design Goal of JBIG2
• Lossy compression of bi-level images, plus better lossless compression performance.
• JBIG2 is the first international standard that provides lossy compression of bi-level images; the existing standards are strictly lossless.
Pattern Matching for Text Image Data
For textual images, character-based pattern-matching techniques are used.
• Instead of coding all the pixels each time a character recurs on a page, we code the bitmap of the character at its first occurrence and put it into a “dictionary”.
• Such a bitmap is also known as a ‘mark’ or ‘pixel block’.
Two Encoding Methods
• Pattern Matching and Substitution (PM&Sub)
• Soft Pattern Matching (Soft PM)
The methods differ substantially in how they encode the ‘pixel blocks’.
1) Pattern Matching and Substitution
For each character, we code:
• a pointer to the matching representative bitmap in the dictionary, and
• the position of the character on the page.
If there is no acceptable match, we code the pixel block directly and add it to the dictionary.
Steps in the PM&Sub Method
(i) Segmentation
(ii) Dictionary Search
(iii) Coding of numerical data
(iv) Coding of bitmap
(i) Segmentation
An image is segmented into ‘marks’ or ‘pixel blocks’.
• Each connected component of black pixels is considered a ‘mark’ or ‘pixel block’.
• Features (e.g., height, width, area, position) are then extracted for each pixel block.
‘Marks’ correspond roughly to letters, figures and punctuation symbols. The correspondence is not exact in these cases:
• ligatures,
• multipart letters (i, j),
• multipart punctuation marks (: ; ! ? %),
• accented letters,
• mathematical symbols.
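The segmentation step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the page is a list of rows with 1 for black and 0 for white, components are taken to be 8-connected (the slides do not say which connectivity is used), and `segment_marks` is a hypothetical helper name.

```python
from collections import deque

def segment_marks(image):
    """Return one feature dict per 8-connected component of black (1) pixels."""
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    marks = []
    for r in range(rows):
        for c in range(cols):
            if image[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first flood fill collects the pixels of one mark.
            queue = deque([(r, c)])
            seen[r][c] = True
            pixels = []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and image[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            # Features of the mark: bounding box and black-pixel count.
            top = min(y for y, _ in pixels)
            left = min(x for _, x in pixels)
            bottom = max(y for y, _ in pixels)
            right = max(x for _, x in pixels)
            marks.append({
                "top": top, "left": left,
                "height": bottom - top + 1,
                "width": right - left + 1,
                "black": len(pixels),
            })
    return marks

# Tiny page with two separate marks.
page = [
    [1, 1, 0, 0, 0],
    [1, 0, 0, 0, 1],
    [0, 0, 0, 0, 1],
]
marks = segment_marks(page)
```

Because each connected component becomes its own mark, the dot and the stem of an "i" would come out as two marks, which is why the correspondence with characters is only approximate.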
(ii) Dictionary Search
Searching for a previously coded pixel block that matches the current pixel block can be done in the following steps:
• Prescreen each potential matching pixel block, skipping it if features such as its width, its height, the area of its bounding box, or its number of black pixels are not close to those of the current pixel block.
• Compute a match score, and call the potential matching pixel block the best match if its score is better than that of any other potential matching pixel block tested so far.
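A minimal sketch of the two search steps follows. The tolerances (`size_tol`, `black_tol`, `max_mismatch`), the top-left alignment, and the use of a plain count of differing pixels as the match score are all assumptions made for illustration; the slides do not specify the scoring function or thresholds.

```python
def features(bitmap):
    """Height, width and black-pixel count of a bitmap (list of 0/1 rows)."""
    return {"height": len(bitmap), "width": len(bitmap[0]),
            "black": sum(map(sum, bitmap))}

def passes_prescreen(cand, cur, size_tol=1, black_tol=0.25):
    """Cheap feature test: skip candidates that are clearly too different."""
    fc, fu = features(cand), features(cur)
    if abs(fc["height"] - fu["height"]) > size_tol:
        return False
    if abs(fc["width"] - fu["width"]) > size_tol:
        return False
    return abs(fc["black"] - fu["black"]) <= black_tol * max(fu["black"], 1)

def mismatch_count(cand, cur):
    """Differing pixels with both bitmaps aligned top-left (lower is better)."""
    height = max(len(cand), len(cur))
    width = max(len(cand[0]), len(cur[0]))

    def px(bitmap, y, x):
        # Pixels outside a bitmap count as white.
        return bitmap[y][x] if y < len(bitmap) and x < len(bitmap[0]) else 0

    return sum(px(cand, y, x) != px(cur, y, x)
               for y in range(height) for x in range(width))

def best_match(dictionary, cur, max_mismatch=2):
    """Return (index, score) of the best acceptable match, or (None, score)."""
    best_index, best_score = None, None
    for index, cand in enumerate(dictionary):
        if not passes_prescreen(cand, cur):
            continue
        score = mismatch_count(cand, cur)
        if best_score is None or score < best_score:
            best_index, best_score = index, score
    if best_score is not None and best_score <= max_mismatch:
        return best_index, best_score
    return None, best_score

dictionary = [
    [[1, 1, 1], [1, 0, 1], [1, 1, 1]],  # an "O"-like mark
    [[0, 1, 0], [0, 1, 0], [0, 1, 0]],  # an "I"-like mark
]
current = [[1, 1, 1], [1, 0, 1], [1, 1, 0]]  # noisy "O"
result = best_match(dictionary, current)
```

The prescreen rejects the "I"-like entry on black-pixel count alone, so the more expensive pixel comparison runs only against the "O"-like entry.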
(iii) Coding of numerical data
• If an acceptable match is found, the associated numerical data (dictionary index, position) are encoded either bitwise or with a Huffman-based code.
(iv) Coding of bitmap
(a) Coding a ‘mark’ with no acceptable match
• The bitmap of the current pixel block is encoded using MMR- or JBIG1-based techniques.
(b) Coding a ‘mark’ with an acceptable match
• Substitute it with the matching block from the dictionary. (This can introduce substitution errors.)
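Putting the four steps together, the following sketch shows how substitution makes the method lossy. It is an assumption-laden illustration, not the standard's bitstream: records are kept as plain Python tuples instead of being entropy coded, matching is restricted to equal-sized bitmaps, and `pm_sub_encode`, `pm_sub_decode` and `max_mismatch` are hypothetical names.

```python
def mismatch(a, b):
    """Differing pixels between equal-sized bitmaps, or None if sizes differ."""
    if len(a) != len(b) or len(a[0]) != len(b[0]):
        return None
    return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))

def pm_sub_encode(marks, max_mismatch=1):
    """Turn (bitmap, position) pairs into 'new' and 'ref' records."""
    dictionary, records = [], []
    for bitmap, position in marks:
        best = None
        for index, entry in enumerate(dictionary):
            score = mismatch(entry, bitmap)
            if score is None or score > max_mismatch:
                continue
            if best is None or score < best[1]:
                best = (index, score)
        if best is None:
            # No acceptable match: code the bitmap and add it to the dictionary.
            dictionary.append(bitmap)
            records.append(("new", bitmap, position))
        else:
            # Acceptable match: code only the dictionary index and position.
            records.append(("ref", best[0], position))
    return records

def pm_sub_decode(records):
    """Rebuild the marks; 'ref' records reuse the dictionary bitmap as is."""
    dictionary, marks = [], []
    for kind, payload, position in records:
        if kind == "new":
            dictionary.append(payload)
            marks.append((payload, position))
        else:
            marks.append((dictionary[payload], position))
    return marks

clean = [[1, 1], [1, 1]]
noisy = [[1, 1], [1, 0]]   # same character, one pixel different
other = [[1, 0], [0, 1]]
records = pm_sub_encode([(clean, (0, 0)), (noisy, (0, 5)), (other, (0, 10))])
decoded = pm_sub_decode(records)
```

The second mark is decoded as `clean`, not `noisy`: the one-pixel difference is lost, and with a looser threshold a different character could be substituted in the same way.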
Comparison
• In the PM&Sub method, a matching error can lead directly to a character substitution error. A PM&Sub method can neither guarantee that there will be no mismatches nor detect them when they occur.
• In the Soft PM method, the matching pixel block is used only in the template, to improve the accuracy of the prediction of each pixel’s color.
• Even using a totally mismatched pixel block in the template leads only to reduced compression efficiency, not to any errors in the final reconstructed image.
2) Soft Pattern Matching
The Soft PM method is similar to the lossy PM&Sub method. The difference is that the lossy direct substitution of the matched character is replaced by a lossless encoding that uses the matched character in the coding context. The resulting ‘refined’ pixel block may be identical to the original pixel block.
Steps in the Soft PM Method
The initial steps are the same as in the PM&Sub method. Next, the bitmap of the current pixel block is losslessly encoded as follows:
• Align the geometric center of the current pixel block with the center of the matching pixel block.
• Encode all of the pixels within the bounding box of the current block.
Template for refinement coding: the pixels forming the ‘context’ for coding the pixel marked “P” are numbered 1 to 11.
(a) Pixels taken from the causal part of the current pixel block.
(b) Pixels taken from the matching pixel block.
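The way the two pixel sources combine into one context can be sketched as follows. The exact template positions were in a figure that is not reproduced here, so the offsets below are assumptions chosen only to total 11 pixels (4 from the causal part of the current block, 7 from the matching block); the two bitmaps are assumed to be already aligned, and `context` is a hypothetical helper.

```python
# Assumed offsets (dy, dx) relative to the pixel P being coded.
CURRENT_OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, -1)]  # already coded pixels
MATCH_OFFSETS = [(-1, 0), (0, -1), (0, 0), (0, 1), (1, -1), (1, 0), (1, 1)]

def pixel(bitmap, y, x):
    """Pixel value, with everything outside the bitmap treated as white."""
    if 0 <= y < len(bitmap) and 0 <= x < len(bitmap[0]):
        return bitmap[y][x]
    return 0

def context(current, match, y, x):
    """Pack the 11 template pixels into an integer in the range 0..2047."""
    bits = [pixel(current, y + dy, x + dx) for dy, dx in CURRENT_OFFSETS]
    bits += [pixel(match, y + dy, x + dx) for dy, dx in MATCH_OFFSETS]
    value = 0
    for bit in bits:
        value = (value << 1) | bit
    return value

current = [[0, 1], [1, 1]]
match = [[0, 1], [1, 0]]   # differs from current at P = (1, 1)
ctx = context(current, match, 1, 1)
```

The decoder can compute the same context number, since it has the matching block in full and has already decoded the causal pixels of the current block. An adaptive arithmetic coder would keep one probability estimate per context, so a poor match only makes the estimates less sharp and never changes the decoded pixel.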
Lossy Compression
The compression ratio can be increased by extending the method to lossy compression. (“lossless” and “lossy”?)
Lossy Preprocessing and Post-processing
• Quantization of Offsets
• Eliminating small marks
• Noise Removal and Smoothing
• Bit Flipping
• Selective bit reversal with no matching ‘mark’
(i) Quantization of Offsets
• For English text, character positions can be safely quantized to about 0.015 inch in the horizontal dimension and to about 0.01 inch in the vertical dimension.
• Coarser quantization causes noticeable distortion, but does not seriously affect legibility.
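As a worked example of what these step sizes mean in pixels, the sketch below snaps offsets to the quantization grid. The 300 dpi resolution is an assumption (the slides give the steps only in inches), and `quantize_offset` is a hypothetical helper; `Fraction` is used so the arithmetic is exact.

```python
from fractions import Fraction

DPI = 300  # assumed scan resolution; not stated in the slides

def quantize_offset(offset_px, step_inch, dpi=DPI):
    """Snap a pixel offset to the nearest multiple of step_inch."""
    step_px = Fraction(step_inch) * dpi  # 0.015 inch at 300 dpi is 9/2 pixels
    return round(Fraction(offset_px) / step_px) * step_px

horizontal = quantize_offset(100, "0.015")  # grid spacing 4.5 pixels
vertical = quantize_offset(101, "0.01")     # grid spacing 3 pixels
```

At this assumed resolution the horizontal grid is 4.5 pixels and the vertical grid is 3 pixels, so each coded position needs fewer distinct values than a full-resolution offset would.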
(ii) Eliminating small ‘marks’
• We get some improvement by eliminating very small marks that represent noise on the page.
• We ignore only marks consisting of a single black pixel surrounded by eight white pixels.
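This rule is simple enough to sketch directly. The version below assumes the page is a list of 0/1 rows and treats pixels beyond the page border as white; `remove_isolated_pixels` is a hypothetical name.

```python
def remove_isolated_pixels(image):
    """Clear black pixels whose eight neighbours are all white."""
    rows, cols = len(image), len(image[0])

    def at(y, x):
        # Pixels outside the page count as white.
        return image[y][x] if 0 <= y < rows and 0 <= x < cols else 0

    result = [row[:] for row in image]
    for y in range(rows):
        for x in range(cols):
            if image[y][x] != 1:
                continue
            neighbours = [at(y + dy, x + dx)
                          for dy in (-1, 0, 1) for dx in (-1, 0, 1)
                          if (dy, dx) != (0, 0)]
            if not any(neighbours):
                result[y][x] = 0
    return result

noisy_page = [
    [1, 0, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 0, 0],
]
cleaned_page = remove_isolated_pixels(noisy_page)
```

The lone pixel in the top-left corner is removed, while the two-pixel mark is kept because each of its pixels has a black neighbour.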
(iii) Noise Removal and Smoothing
• Smooth each ‘mark’ before compressing it and before entering it into the list of potential matching marks.
• One simple smoothing operation is to remove single protruding pixels (white or black) along ‘mark’ edges. This standardizes local edge shapes.
(iv) Bit Flipping
• Reverse isolated, poorly predicted pixels.
• This applies when the matching ‘mark’ is available.
Selective bit reversal with no matching ‘mark’
• Since there is no corresponding pixel to serve as an approximation for predicting the current pixel, we choose as candidates those pixels that differ from both the pixel above and the pixel to the left.
• Candidates are actually reversed only if they are ‘isolated’.
• After reversing the colour of the isolated, poorly predicted pixels, we code the mark as in the lossless case, using the 10-pixel template of JBIG1.
Lossless Compression after Lossy Compression
• Even after the image has been preprocessed and coded in a lossy fashion, it is possible to obtain a lossless coding with an acceptable compression ratio.
• The principle is to use the soft pattern matching method to encode the entire image.