Pattern Recognition and Training
• Pattern: a set of values (of known size) that describes things
• The general problem
• Approaches to the decision-making process
  1. Simple comparison
  2. Common property
  3. Clusters (using a distance measurement such as $|X_1 - u_1| + |X_2 - u_2| + \dots + |X_n - u_n|$)
  4. Combination of 1, 2 and 3
240-373 Image Processing
Decision Functions
• Decision function: $w = (w_1, w_2, w_3, \dots, w_n, w_{n+1})$
• If the pattern vector is $x = [x_1, x_2, x_3, \dots, x_n, 1]^T$ (the trailing 1 pairs with the constant weight $w_{n+1}$), then
  • The unknown pattern is in group B if $w^T x > 0$
  • The unknown pattern is in group A if $w^T x \le 0$
• Example: (8,4) is in group B because $[1.5, -1.0, -3.5]\,[8, 4, 1]^T = 8 \times 1.5 - 4 - 3.5 = 4.5$ and $4.5 > 0$
• How about (4,4)?
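The decision rule on this slide can be sketched in a few lines of Python. This is only an illustration: the function name `classify` is made up here, and the weights are the ones from the slide's example. It also evaluates the (4,4) case the slide leaves as a question.

```python
def classify(w, pattern):
    """Apply the linear decision rule: group B if w^T x > 0, otherwise group A.

    `pattern` holds the n feature values; a trailing 1 is appended so that
    the last weight acts as a constant offset.
    """
    x = list(pattern) + [1]
    score = sum(wi * xi for wi, xi in zip(w, x))
    group = "B" if score > 0 else "A"
    return group, score


w = [1.5, -1.0, -3.5]          # weights from the slide's example
print(classify(w, (8, 4)))     # ('B', 4.5)
print(classify(w, (4, 4)))     # ('A', -1.5)
```

For (4,4) the score is 4 × 1.5 - 4 - 3.5 = -1.5, which is not greater than 0, so the pattern falls in group A.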
Decision Functions (Cont’d)
• The number of groups can be more than 2
• Decision table:

  Result of w1   Result of w2   Implication
  < 0            < 0            no group
  < 0            > 0            group A
  > 0            < 0            group C
  > 0            > 0            group B

• A decision function need not be a linear function
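One way to read the decision table is as a lookup keyed on the signs of two decision functions. The sketch below assumes two linear functions with hypothetical weights `w1` and `w2` (the slides do not give any), and it treats a result of exactly 0 as negative, which the table itself does not specify.

```python
# Sign pair (w1 result > 0, w2 result > 0) -> implication, as in the table.
DECISION_TABLE = {
    (False, False): "no group",
    (False, True): "group A",
    (True, False): "group C",
    (True, True): "group B",
}


def evaluate(w, pattern):
    """Compute w^T x with a trailing 1 appended to the pattern."""
    x = list(pattern) + [1]
    return sum(wi * xi for wi, xi in zip(w, x))


def classify_multi(w1, w2, pattern):
    """Look up the group from the signs of the two decision functions."""
    key = (evaluate(w1, pattern) > 0, evaluate(w2, pattern) > 0)
    return DECISION_TABLE[key]


# Hypothetical weights, chosen only for illustration.
w1 = [1.0, 0.0, -5.0]   # positive when the first feature exceeds 5
w2 = [0.0, 1.0, -5.0]   # positive when the second feature exceeds 5

for p in [(8, 8), (8, 2), (2, 8), (2, 2)]:
    print(p, classify_multi(w1, w2, p))
```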
Cluster Means
• If the cluster consists of [3,4,8,2], [2,9,5,1] and [5,7,7,1], then the mean is [3.33, 6.67, 6.67, 1.33]. This represents the center of the four-dimensional cluster.
• The Euclidean distance from the center to a new pattern can be calculated as follows. For the new vector [3, 5, 7, 0], the sum of squared differences is
  $(3-3.33)^2 + (5-6.67)^2 + (7-6.67)^2 + (0-1.33)^2 = 4.78$
  so the Euclidean distance is $\sqrt{4.78} \approx 2.19$. (When only the nearest center is needed, comparing the squared values gives the same ranking.)
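The mean and distance calculation can be checked with a short Python sketch. The helper names `cluster_mean` and `euclidean_distance` are illustrative only. The script prints both the sum of squared differences (4.78) and its square root, which is the Euclidean distance proper.

```python
import math


def cluster_mean(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def euclidean_distance(a, b):
    """Square root of the sum of squared component differences."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


cluster = [[3, 4, 8, 2], [2, 9, 5, 1], [5, 7, 7, 1]]
center = cluster_mean(cluster)
d = euclidean_distance(center, [3, 5, 7, 0])

print([round(c, 2) for c in center])   # [3.33, 6.67, 6.67, 1.33]
print(round(d ** 2, 2), round(d, 2))   # 4.78 2.19
```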
Automatic Clustering
Technique 1: K-means clustering
USE: To automatically find the best groupings and means of K clusters.
OPERATION:
• The pattern vectors of K different items are given to the system, which classifies them as best it can (without knowing which vector belongs to which item)
• Let the pattern vectors be X1, …, Xn
• Take the first K points as the initial estimates of the cluster means: M1 = X1, M2 = X2, …, MK = XK
* Allocate each pattern vector to the nearest group (minimum distance)
• Calculate the new cluster centers
• If they are the same as the old centers, then STOP; otherwise go to step *
K-means clustering example
M1 = (2, 5.0)   M2 = (2, 5.5)
Allocating each pattern vector to the nearest center gives

  1   (2, 5.0)   group 1
  2   (2, 5.5)   group 2
  3   (6, 2.5)   group 1
  4   (7, 2.0)   group 1
  5   (7, 3.0)   group 1
  6   (3, 4.5)   group 1

The group means now become
  group 1: M1 = (5, 3.4)
  group 2: M2 = (2, 5.5)
This gives new groupings as follows:

  1   (2, 5.0)   group 2
  2   (2, 5.5)   group 2
  3   (6, 2.5)   group 1
  4   (7, 2.0)   group 1
  5   (7, 3.0)   group 1
  6   (3, 4.5)   group 2

And the group means become
  group 1: M1 = (6.67, 2.5)
  group 2: M2 = (2.33, 5.0)
Groupings now stay the same and the processing stops.
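The worked example can be reproduced with a minimal K-means sketch in Python. Two assumptions are made here: the slides say only "minimum distance", so squared Euclidean distance is used as the metric, and an empty group (which does not occur in this example) simply keeps its previous mean. The function names are illustrative.

```python
def squared_distance(a, b):
    """Squared Euclidean distance; enough for finding the nearest mean."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def mean(points):
    """Component-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(column) / n for column in zip(*points))


def k_means(points, k):
    """Return (group index per point, final means), groups numbered from 0."""
    means = [tuple(p) for p in points[:k]]        # first K points as initial means
    while True:
        # Step *: allocate every point to its nearest mean.
        groups = [min(range(k), key=lambda j: squared_distance(p, means[j]))
                  for p in points]
        # Recompute each cluster center from its members.
        new_means = []
        for j in range(k):
            members = [p for p, g in zip(points, groups) if g == j]
            new_means.append(mean(members) if members else means[j])
        if new_means == means:                    # centers unchanged: stop
            return groups, means
        means = new_means


points = [(2, 5.0), (2, 5.5), (6, 2.5), (7, 2.0), (7, 3.0), (3, 4.5)]
groups, means = k_means(points, 2)
print([g + 1 for g in groups])                              # [2, 2, 1, 1, 1, 2]
print([tuple(round(c, 2) for c in m) for m in means])       # [(6.67, 2.5), (2.33, 5.0)]
```

Because the first K points seed the means, the result depends on the order of the input vectors; a different ordering may converge to different groupings.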
Optical Character Recognition
Technique: Isolation of a character in an OCR document
USE: To create a window containing only one character onto an array containing a text image
OPERATION:
1. Assume that the image is correctly oriented and that the text is dark on a white background
2. Calculate the row sums of the pixel gray-level values. High row sums indicate a space between the rows of text
3. Calculate the column sums of the pixel gray-level values. High column sums indicate a space between the columns (characters)
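The row-sum and column-sum idea can be sketched as follows. This is a simplified illustration under stated assumptions: the image is a list of rows of gray levels with white equal to 255, a row or column counts as a gap only when it is entirely white, and column sums are taken within each detected text line. The names `ink_segments` and `isolate_characters` are hypothetical; a scanned image with noise would need a lower threshold.

```python
def ink_segments(sums, blank_sum):
    """Return (start, end) ranges, end exclusive, where sums fall below blank_sum."""
    segments, start = [], None
    for i, s in enumerate(sums):
        if s < blank_sum and start is None:
            start = i                              # ink begins
        elif s >= blank_sum and start is not None:
            segments.append((start, i))            # ink ends at a blank line
            start = None
    if start is not None:
        segments.append((start, len(sums)))
    return segments


def isolate_characters(image, white=255):
    """Return (top, bottom, left, right) windows, one per isolated character."""
    width = len(image[0])
    row_sums = [sum(row) for row in image]
    windows = []
    for top, bottom in ink_segments(row_sums, white * width):
        text_line = image[top:bottom]
        col_sums = [sum(column) for column in zip(*text_line)]
        for left, right in ink_segments(col_sums, white * (bottom - top)):
            windows.append((top, bottom, left, right))
    return windows


W = 255
image = [
    [W, W, W, W, W, W, W],
    [W, 0, W, W, 0, 0, W],
    [W, 0, W, W, 0, W, W],
    [W, W, W, W, W, W, W],
]
print(isolate_characters(image))   # [(1, 3, 1, 2), (1, 3, 4, 6)]
```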
Technique: Creating the pattern vector (feature extraction)
USE: To create the pattern vector for a character so that it can be compared with the library
OPERATION:
1. Assume that the character has been isolated
2. Place a 4x4 grid over the image and count the number of “ink” pixels in each grid cell
3. Divide each count by the total number of pixels in that grid cell
4. Compare the resulting numbers with the library
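A minimal sketch of these feature-extraction steps is given below. Several details are assumptions not stated on the slide: a pixel counts as "ink" when its gray level is below 128, each count is divided by the pixel total of its own cell, and the library comparison uses the city-block distance listed earlier in the deck. The names `pattern_vector` and `nearest_match` and the two-entry library are hypothetical.

```python
def pattern_vector(char_image, grid=4, ink_threshold=128):
    """Return grid*grid ink fractions, scanned row by row across the cells."""
    height, width = len(char_image), len(char_image[0])
    features = []
    for gy in range(grid):
        y0, y1 = gy * height // grid, (gy + 1) * height // grid
        for gx in range(grid):
            x0, x1 = gx * width // grid, (gx + 1) * width // grid
            cell = [char_image[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            ink = sum(1 for value in cell if value < ink_threshold)
            features.append(ink / len(cell) if cell else 0.0)
    return features


def nearest_match(features, library):
    """Return the library key whose vector has the smallest city-block distance."""
    def city_block(a, b):
        return sum(abs(ai - bi) for ai, bi in zip(a, b))
    return min(library, key=lambda name: city_block(features, library[name]))


# 8x8 test image whose left half is ink (0) and right half is white (255).
char_image = [[0] * 4 + [255] * 4 for _ in range(8)]
features = pattern_vector(char_image)

# Hypothetical library with two made-up entries.
library = {
    "left-heavy": [1.0, 1.0, 0.0, 0.0] * 4,
    "right-heavy": [0.0, 0.0, 1.0, 1.0] * 4,
}
print(features[:4])                        # [1.0, 1.0, 0.0, 0.0]
print(nearest_match(features, library))    # left-heavy
```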