An Efficient Classification Approach Based on Grid Code Transformation and Mask-Matching Method. Presenter: Yo-Ping Huang. Outline. Introduction The proposed classification approach The coarse classification scheme The fine classification scheme Experimental results Conclusion.
An Efficient Classification Approach Based on Grid Code Transformation and Mask-Matching Method Presenter: Yo-Ping Huang
Outline • Introduction • The proposed classification approach • The coarse classification scheme • The fine classification scheme • Experimental results • Conclusion
1. Introduction • Paper documents -> Computer codes • OCR(Optical Character Recognition) • The design of classification systems consists of two subproblems: • Feature extraction • Classification
Classification • Classification of objects (or patterns) into a number of predefined classes has been extensively studied in a wide variety of applications, such as • Optical character recognition (OCR) • Speech recognition • Face recognition
Feature extraction • Features are functions of the measurements that enable a class to be distinguished from other classes. • No general solution to feature extraction has been found for most applications. • Our purpose is to design a general classification scheme that is less dependent on domain-specific knowledge. • To do that, reliable and general features are required.
Discrete Cosine Transform (DCT) • It helps separate an image into parts of differing importance with respect to the image's visual quality. • Due to the energy compacting property of DCT, much of the signal energy has a tendency to lie at low frequencies.
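The energy-compacting property can be checked numerically. The following is a minimal sketch, not the presenter's code: it assumes the orthonormal DCT-II, a square image, and a hypothetical smooth test image (a Gaussian blob); the function names are illustrative only.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II basis matrix C, so that F = C @ x @ C.T."""
    k = np.arange(n).reshape(-1, 1)   # frequency index
    i = np.arange(n).reshape(1, -1)   # spatial index
    c = np.sqrt(2.0 / n) * np.cos((2 * i + 1) * k * np.pi / (2 * n))
    c[0, :] /= np.sqrt(2.0)           # rescale the DC row so that C is orthonormal
    return c

def dct2(x):
    """2-D DCT of a square image x."""
    c = dct_matrix(x.shape[0])
    return c @ x @ c.T

def low_freq_energy_share(x, k):
    """Fraction of the signal energy held by the k x k lowest-frequency coefficients."""
    f = dct2(x)
    return float(np.sum(f[:k, :k] ** 2) / np.sum(f ** 2))

# Hypothetical smooth 16 x 16 test image (assumption: a Gaussian blob).
n = 16
ii, jj = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
image = np.exp(-((ii - 8) ** 2 + (jj - 8) ** 2) / 30.0)
share = low_freq_energy_share(image, 4)
```

For a smooth image like this one, `share` is close to 1: the 16 lowest-frequency coefficients out of 256 carry almost all of the energy. Real character images have sharper edges, so the share would be lower.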
Four advantages in applying DCT • The features extracted by DCT are general and reliable. They can be applied to most vision-oriented applications. • The amount of data to be stored can be reduced tremendously. • Multiresolution classification and progressive matching can be achieved by nature. • The DCT is scale-invariant and less sensitive to noise and distortion.
Two philosophies of classification • Statistical • The measurements that describe an object are treated only formally as statistical variables, neglecting their “meaning” • Structural • Regard objects as compositions of structural units, usually called primitives.
Two stages of classification • Coarse classification • DCT • Grid code transformation (GCT) • Fine classification • Spatial domain • Template matching • Mask matching • Matching degree • Statistical matching Statistical mask-matching • Frequency domain
2. The proposed classification approach • The ultimate goal of classification is to assign an unknown pattern x to one of M possible classes (c1, c2,…, cM). • Each pattern is represented by a set of D features, viewed as a D-dimensional feature vector.
Figure 1. The framework of our classification approach. • Training: training pattern → Preprocessing → Feature extraction via DCT → Quantization → Grid code transformation → Sorting codes → Elimination of duplicated codes; Calculate mask probability • Classification: test pattern → Preprocessing → Feature extraction via DCT → Quantization → Grid code transformation → Searching candidates (coarse classification) → candidates → Statistical mask matching (fine classification) → final decision
In the training mode: • GCT • Positive mask • Negative mask • Mask probability • In the classification mode: • GCT (coarse classification) • Statistical mask matching (fine classification)
3. The coarse classification scheme • Feature extraction via DCT • The DCT coefficients F(u, v) of an N×N image represented by x(i, j) can be defined as where
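The formula is missing from this slide. A commonly used orthonormal form of the 2-D DCT of an N×N image is given here as an assumption; the normalization used by the authors may differ:

```latex
F(u,v) = \frac{2}{N}\, C(u)\, C(v) \sum_{i=0}^{N-1} \sum_{j=0}^{N-1} x(i,j)\,
         \cos\frac{(2i+1)u\pi}{2N}\, \cos\frac{(2j+1)v\pi}{2N},
\qquad
C(w) = \begin{cases} 1/\sqrt{2}, & w = 0 \\ 1, & \text{otherwise} \end{cases}
```

Here u, v = 0, 1, …, N−1, and F(0, 0) is the DC (lowest-frequency) coefficient.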
Figure 2. The DCT coefficients of the character image of “為”.
Grid code transformation (GCT) • Quantization • The 2-D DCT coefficient F(u,v) is quantized to F’(u,v) according to the following equation: • Thus, the dimension of the feature vector can be reduced after quantization.
The features of each training sample are first extracted by DCT and quantized. • The D most significant DCT coefficients are quantized and transformed into a code, called the grid code (GC). • Given a sample Oi, it is quantized into a feature vector of the form [qi1, qi2, .., qiD].
The items are sorted in a zigzag order: F(0,0), F(0,1), F(1,0), F(2,0), F(1,1), F(0,2), F(0,3), F(1,2), F(2,1), F(3,0), F(3,1),…, and so on. • This order is derived from the energy compacting property that low-frequency DCT coefficients are often more important than high-frequency ones. • In this way, object Oi can be transformed to a D-digit GC.
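The zigzag ordering and the code construction can be sketched as follows. The index sequence reproduces the order listed on the slide for a 4×4 coefficient block. The quantizer is not recoverable from the slides, so the uniform quantizer below, with its `step` and `levels` parameters, is a hypothetical stand-in, and all function names are illustrative.

```python
import numpy as np

def zigzag_indices(n):
    """(u, v) index pairs of an n x n block in zigzag order, low frequencies first."""
    order = []
    for s in range(2 * n - 1):                     # s = u + v indexes one anti-diagonal
        lo, hi = max(0, s - n + 1), min(s, n - 1)
        if s % 2 == 1:
            rows = range(lo, hi + 1)               # odd diagonals: u ascending
        else:
            rows = range(hi, lo - 1, -1)           # even diagonals: u descending
        order.extend((u, s - u) for u in rows)
    return order

def grid_code(coeffs, d, step=50.0, levels=10):
    """Quantize the first d zigzag-ordered coefficients into a d-digit code.

    Assumption: a uniform quantizer with bin width `step`, shifted so that zero
    falls in the middle bin and clipped to the range 0 .. levels - 1.
    """
    digits = []
    for u, v in zigzag_indices(coeffs.shape[0])[:d]:
        q = int(np.floor(coeffs[u, v] / step)) + levels // 2
        digits.append(min(max(q, 0), levels - 1))
    return tuple(digits)

# Hypothetical 4 x 4 coefficient block.
coeffs = np.zeros((4, 4))
coeffs[0, 0], coeffs[0, 1], coeffs[1, 0] = 120.0, -60.0, 10.0
code = grid_code(coeffs, d=3)   # (7, 3, 5)
```

Representing the code as a tuple keeps it hashable and sortable, which is convenient for the sorting and lookup steps of the coarse classifier.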
Grid code sorting and elimination • All the training samples are transformed into a list of triplets (Ti, Ci, GCi) by GCT • Ti is the ID of a training sample • Ci is the class ID • GCi is the grid code of the training sample. • The list is sorted in ascending order of the GCs. • Redundancy may occur when training samples belonging to the same class have the same GC; such duplicates are eliminated.
In summary, the information about the classes within each GC is gathered in the training phase. • In the test phase, on classifying a test sample, a reduced set of candidate classes can be retrieved from the lookup table according to the GC of the test sample.
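A minimal sketch of the training-time table and the test-time lookup, assuming grid codes are stored as tuples and that candidates are retrieved by exact GC match (the slides do not say whether neighbouring codes are also searched). The names `build_lookup` and `candidates` are illustrative.

```python
from collections import defaultdict

def build_lookup(triplets):
    """Build a GC -> candidate-class table from (sample_id, class_id, grid_code) triplets.

    The triplets are sorted by grid code, and repeated (grid code, class) pairs
    are dropped so that each class appears once per grid code.
    """
    table = defaultdict(list)
    for _, class_id, gc in sorted(triplets, key=lambda t: (t[2], t[1])):
        if class_id not in table[gc]:
            table[gc].append(class_id)
    return dict(table)

def candidates(table, gc):
    """Candidate classes for a test sample whose grid code is gc."""
    return table.get(gc, [])

# Hypothetical training triplets.
triplets = [
    (1, "A", (7, 3, 5)),
    (2, "B", (7, 3, 5)),
    (3, "A", (7, 3, 5)),   # same class and same GC as sample 1: redundant
    (4, "C", (6, 3, 5)),
]
table = build_lookup(triplets)
```

With this table, `candidates(table, (7, 3, 5))` returns `["A", "B"]`, so fine classification only has to decide between two classes instead of three.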
4. The fine classification scheme • Mask Generation • A kind of template-matching method • The border bits are unreliable • Find those bits that are reliably black (or white).
Figure 3. Mask generation: (a) superimposed characters of “佛”, (b) the positive mask of “佛”, and (c) the negative mask of “佛”.
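One way to find the reliably black and reliably white bits is to superimpose the training bitmaps of a class and threshold the per-pixel frequency of black. The sketch below makes that assumption; the `threshold` value and the function name are hypothetical, and the slides do not state the exact rule used.

```python
import numpy as np

def make_masks(samples, threshold=0.9):
    """Derive positive and negative masks from the training bitmaps of one class.

    samples: array of shape (num_samples, height, width) with 1 = black, 0 = white.
    Returns two boolean arrays: positions that are reliably black (positive mask)
    and positions that are reliably white (negative mask).
    """
    freq = np.mean(samples, axis=0)        # fraction of samples that are black at each bit
    positive = freq >= threshold           # black in nearly every sample
    negative = freq <= 1.0 - threshold     # white in nearly every sample
    return positive, negative

# Three hypothetical 2 x 2 training bitmaps of the same class.
samples = np.array([
    [[1, 1], [0, 0]],
    [[1, 0], [0, 0]],
    [[1, 1], [0, 1]],
])
positive, negative = make_masks(samples)
```

Bits that are black in only some of the samples, typically those near stroke borders, end up in neither mask.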
Bayes’ classification • Bayes’ rule: P(ci | x) = P(x | ci) P(ci) / P(x) • P(ci | x): the probability that x belongs to class i given that x is observed. • P(x | ci): the probability of observing feature x given class i. • P(ci): the prior probability of class i. • P(x): the probability of observing feature x.
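A small numeric sketch of how these four quantities combine, assuming the classes are mutually exclusive and exhaustive so that P(x) is the sum of P(x | ci) P(ci) over all classes. The priors and likelihoods are made-up values.

```python
def posteriors(priors, likelihoods):
    """Posterior P(ci | x) for every class, from priors P(ci) and likelihoods P(x | ci)."""
    # Law of total probability: P(x) = sum over classes of P(x | ci) * P(ci).
    evidence = sum(priors[c] * likelihoods[c] for c in priors)
    return {c: priors[c] * likelihoods[c] / evidence for c in priors}

# Hypothetical two-class example.
priors = {"c1": 0.5, "c2": 0.5}
likelihoods = {"c1": 0.8, "c2": 0.2}
post = posteriors(priors, likelihoods)
```

Here the posterior of c1 is 0.4 / (0.4 + 0.1) = 0.8, so x would be assigned to c1.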
Measures for mask matching • The degree of matching between an unknown character x and the positive mask of class i is defined in terms of two counts: • Nb(f): the number of black bits in bitmap f. • Mb(f, g): the number of black bits at the same positions in both f and g. • The degree of matching with the negative mask is defined similarly.
Def. 1. If x matches the positive mask of class i at degree a, x is said to a-match the positive mask of class i. • Def. 2. If x matches the negative mask of class i at degree b, x is said to b-match the negative mask of class i.
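The matching-degree formulas are missing from the slides. The sketch below assumes a natural reading of the Nb and Mb counts: the positive degree is the share of the mask's black bits that are also black in x, and the negative degree is the share of the mask's reliably white bits that are white in x. Both formulas and all function names are assumptions.

```python
import numpy as np

def nb(f):
    """Nb(f): number of black (nonzero) bits in bitmap f."""
    return int(np.count_nonzero(f))

def mb(f, g):
    """Mb(f, g): number of positions that are black in both f and g."""
    return int(np.count_nonzero(np.logical_and(f, g)))

def positive_degree(x, pos_mask):
    """Assumed form: Mb(x, mask) / Nb(mask)."""
    total = nb(pos_mask)
    return mb(x, pos_mask) / total if total else 0.0

def negative_degree(x, neg_mask):
    """Assumed form: share of the mask's reliably white bits that are white in x."""
    total = nb(neg_mask)
    return mb(np.logical_not(x), neg_mask) / total if total else 0.0

# Hypothetical 2 x 2 example: 1 = black in x, 1 = "bit belongs to the mask" in the masks.
x = np.array([[1, 1], [0, 0]])
pos_mask = np.array([[1, 0], [1, 0]])
neg_mask = np.array([[0, 0], [0, 1]])
a = positive_degree(x, pos_mask)   # 1 of 2 mask bits matched -> 0.5
b = negative_degree(x, neg_mask)   # 1 of 1 mask bits matched -> 1.0
```

In the terms of Def. 1 and Def. 2, this x would 0.5-match the positive mask and 1.0-match the negative mask.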
Statistical mask-matching • The probability of x in class i when is observed can be described by • Similarly, we get
Statistical decision rule • Rule AMP (Average Matching Probability)
5. Experimental Results • A famous handwritten rare book, Kin-Guan bible (金剛經) • 18,600 samples. • 640 classes.
Figure 4. Reduction and accuracy rate using our coarse classification scheme. The best value of D is 6.
Figure 5. Accuracy rate using both coarse and fine classification. A good reduction rate does not sacrifice the performance of fine classification.
Figure 6. Accuracy rate using both coarse and fine classification under different values of AMP.
6. Conclusions • This paper presents a two-stage classification approach for vision-based applications. • The first stage is coarse classification, which employs DCT to extract features for each character image. • The grid code transformation (GCT) method is further applied to quantize the most significant DCT coefficients into a finite number of grids.
The second stage is fine classification, which uses a statistical mask-matching method to identify the individual target in the candidate set given by the first stage. • The statistical mask-matching method proved effective in recognizing handwritten Chinese characters.
The experimental results show that: • The good reduction rate provided by coarse classification does not sacrifice the performance of fine classification; • The more confident the decision, the better the accuracy rate; • By selecting features of strong confidence, classification accuracy could be further improved.