This 2005 lecture from Nanjing University of Science & Technology covers minimum distance classifiers, similarity measures, and the Perceptron Algorithm. It works through decision rules, boundaries, and hyperplanes with examples, presents classification methods such as minimum distance with multiple prototypes and similarity functions such as the Tanimoto coefficient, and walks through the steps of the Perceptron Algorithm for finding separating hyperplanes.
Nanjing University of Science & Technology
Pattern Recognition: Statistical and Neural
Lonnie C. Ludeman
Lecture 16, Oct 19, 2005
Lecture 16 Topics
1. Minimum Distance Classifiers - Examples
2. Similarity Measures
3. Perceptron Algorithm - Example
Minimum Distance Classifiers
Given a set of prototypes p_i, each associated with class C_i, for i = 1, 2, ..., K.
Decision Rule: decide that x is from C_j if d(x, p_j) is the smallest of the distances d(x, p_1), ..., d(x, p_K).
[Figure: a point x to be classified, with distances d(x, p_1), d(x, p_2), ..., d(x, p_j), ..., d(x, p_K) drawn to the prototypes p_1 (C_1), p_2 (C_2), ..., p_j (C_j), ..., p_K (C_K); p_j is the jth prototype, associated with C_j.]
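A minimal sketch of this rule in Python (the function name, the Euclidean distance, and the data are illustrative assumptions, not from the lecture):

import numpy as np

def min_distance_classify(x, prototypes):
    """prototypes: list of (class_label, prototype_vector) pairs.
    Returns the label of the class whose prototype is nearest to x."""
    x = np.asarray(x, dtype=float)
    dists = [(np.linalg.norm(x - np.asarray(p, dtype=float)), label)
             for label, p in prototypes]
    return min(dists)[1]  # the smallest distance wins

# Example with two hypothetical prototypes:
protos = [("C1", [0.0, 0.0]), ("C2", [2.0, 2.0])]
print(min_distance_classify([0.4, 0.3], protos))  # -> "C1"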
Two Class Case - Single Prototype per Class
Given: C1 with prototype p1 = [p11, p12]T and C2 with prototype p2 = [p21, p22]T.
Boundary: set d(x, p1) = d(x, p2). Squaring and simplifying shows that the boundary is a straight line, as follows.
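The slide's algebra is not reproduced in this extraction; a reconstruction under the Euclidean-distance assumption:

\[
\|x - p_1\|^2 = \|x - p_2\|^2
\;\Longrightarrow\;
x^T x - 2 p_1^T x + p_1^T p_1 = x^T x - 2 p_2^T x + p_2^T p_2
\;\Longrightarrow\;
2 (p_2 - p_1)^T x = \|p_2\|^2 - \|p_1\|^2 ,
\]

which is a line (in general, a hyperplane) perpendicular to the segment joining p1 and p2 and passing through its midpoint.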
Decision Regions and Boundary
[Figure: prototypes p1 and p2 with the boundary d(x,p1) = d(x,p2) between the regions "Decide C1" and "Decide C2"; the boundary is the perpendicular bisector of the segment joining p1 and p2.]
K-Class Case - Single Prototype per Class
The decision boundaries are hyperplanes, and the decision regions are hyperpolyhedra.
Example: Given prototypes p1, p2, and p3 for C1, C2, and C3.
(a) Classify x: x is classified as C1 since d(x, p1) is the smallest of the three distances.
(b) Show the decision regions in pattern space.
(b) Solution: setting d(x,p1) = d(x,p2), d(x,p1) = d(x,p3), and d(x,p2) = d(x,p3) pairwise gives the decision boundaries.
Decision regions and boundaries: the three boundary lines are x1 + x2 = 1, 2x1 - 4x2 = 3, and 4x1 - 2x2 = 5.
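A small Python helper that computes such a boundary line for any prototype pair, using the derivation above (the prototype values in the usage line are hypothetical, since the slide's values are not in this extraction):

import numpy as np

def bisector_line(p, q):
    """Coefficients (a, b, c) of the line a*x1 + b*x2 = c that is the
    Euclidean minimum-distance boundary d(x,p) = d(x,q), i.e. the
    perpendicular bisector of the segment joining prototypes p and q."""
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    a, b = 2.0 * (q - p)          # from 2(q - p)^T x = ||q||^2 - ||p||^2
    c = q @ q - p @ p
    return a, b, c

print(bisector_line([0, 0], [1, -2]))  # -> (2.0, -4.0, 5.0): 2x1 - 4x2 = 5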
Example: Repeat the classification using the dmin distance measure.
(a) x is again classified as C1, since dmin(x, p1) is the smallest (the same result as before in this case).
(b) Find the decision regions and boundaries using the dmin distance function; they turn out to be quite different.
(b) Decision Boundaries and Regions: the decision boundaries follow from the decision regions, as shown below.
Decision Regions and Boundaries using dmin
[Figure: under dmin the pattern space fragments into interleaved regions labeled R1, R2, and R3, rather than three convex cells.]
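For concreteness, a sketch of classification under dmin, assuming dmin(x, y) = min_k |x_k - y_k| (the componentwise minimum absolute difference; the lecture's exact definition is not recoverable from this extraction):

import numpy as np

def d_min(x, y):
    # Assumed definition: smallest coordinatewise absolute difference.
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    return np.min(np.abs(x - y))

def classify_dmin(x, prototypes):
    """prototypes: list of (class_label, prototype_vector) pairs.
    Nearest prototype under d_min, analogous to the Euclidean rule."""
    return min((d_min(x, p), label) for label, p in prototypes)[1]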
Minimum Distance - Multiple Prototypes per Class
Given prototypes p_j^(i), where p_j^(i) is the jth prototype for class Ci.
Classification Rule: decide that x is from Ci if the smallest of the distances d(x, p_j^(i)), taken over all prototypes of all classes, is achieved by a prototype of Ci.
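A minimal Python sketch of this multiple-prototype (nearest neighbor over all prototypes) rule; the names and data are illustrative, not from the slides:

import numpy as np

def classify_multi(x, protos_by_class):
    """protos_by_class: dict mapping class label -> list of prototype
    vectors. Assigns x to the class owning the overall nearest prototype."""
    x = np.asarray(x, dtype=float)
    best = min(
        (np.linalg.norm(x - np.asarray(p, dtype=float)), label)
        for label, plist in protos_by_class.items()
        for p in plist
    )
    return best[1]

protos = {"C1": [[0, 0], [1, 0]], "C2": [[3, 3]], "C3": [[0, 4], [1, 5]]}
print(classify_multi([0.8, 0.2], protos))  # -> "C1"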
Example: Minimum Distance - Multiple Prototypes
Given the following prototypes for the classes C1, C2, and C3, find the decision regions R1, R2, and R3 for the nearest neighbor decision rule.
Solution: using the Euclidean distance d_euc(x, y), the boundaries are described by the perpendicular bisectors of the segments joining prototypes of different classes.
Similarity Functions - Example: the Tanimoto coefficient.
Similarity functions need not satisfy all the properties of distance functions.
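The slide's formula is not shown in this extraction; the standard Tanimoto coefficient for pattern vectors x and y is

\[
S_T(x, y) \;=\; \frac{x^T y}{\,x^T x + y^T y - x^T y\,},
\]

which equals 1 when x = y and grows with similarity, whereas a distance shrinks as patterns become more alike.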
Motivation
[Figure, built up over several slides: training samples from C1 and from C2 in pattern space, with a separating hyperplane drawn between the two clusters.]
Question: How do we find a separating hyperplane - a "needle in the haystack"?
Answer: the Perceptron Algorithm! (Other approaches exist, such as random selection.)
The Perceptron Algorithm
Step (1) Randomly select a starting weight vector w(1). Set k = 1 and go to step (2).
Step (2) Select the kth training pattern x(k), compute wT(k)x(k), and go to step (3).
Step (3) Adjust the weight vector if necessary, as follows.
Make an adjustment if x(k) is incorrectly classified:
if wT(k)x(k) < 0 and x(k) ∈ C1, then w(k+1) = w(k) + c x(k)
if wT(k)x(k) > 0 and x(k) ∈ C2, then w(k+1) = w(k) - c x(k)
Make no adjustment if x(k) is correctly classified:
if wT(k)x(k) > 0 and x(k) ∈ C1, then w(k+1) = w(k)
if wT(k)x(k) < 0 and x(k) ∈ C2, then w(k+1) = w(k)
Go to step (4).
Perceptron Algorithm Continued
Step (4) If w(k) was not changed during one entire pass through the set of pattern vectors, then wT(k)x = 0 is a separating hyperplane and the algorithm stops. If w(k) was changed at any iteration during a full pass through the training set, increment k (k = k+1). If k > Nmax, where Nmax is the maximum number of iterations the user is willing to wait, the process stops and the user is informed that NO separating hyperplane was found** in the allotted time; otherwise, return to step (2).
** If the algorithm stops at Nmax, this does not mean that a separating hyperplane does not exist; it simply means we have not found one in the allotted time.
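A compact Python sketch of steps (1)-(4), assuming augmented pattern vectors and encoding C1 as label +1 and C2 as label -1; treating wT(k)x(k) = 0 as a misclassification is a convention this sketch adopts, and the function and variable names are illustrative:

import numpy as np

def perceptron(patterns, labels, c=1.0, n_max=1000, seed=None):
    """patterns: augmented pattern vectors (one per row).
    labels: +1 for C1, -1 for C2. Returns w if a separating hyperplane
    wT x = 0 is found within n_max passes, else None."""
    rng = np.random.default_rng(seed)
    X = np.asarray(patterns, dtype=float)
    w = rng.standard_normal(X.shape[1])      # step (1): random start
    for _ in range(n_max):
        changed = False
        for x, y in zip(X, labels):          # steps (2)-(3): one full pass
            if y * (w @ x) <= 0:             # misclassified (or on boundary)
                w = w + c * y * x            # +c x for C1, -c x for C2
                changed = True
        if not changed:                      # step (4): clean pass -> done
            return w
    return None                              # no hyperplane found in time

A pass with no weight change triggers the stop in step (4); returning None corresponds to reaching Nmax without finding a separating hyperplane.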
Flow Diagram
[Figure: flow diagram of the Perceptron Algorithm, with branches for the misclassified and correctly classified cases.]
Interpretation of Perceptron Results
1. The result is not unique: different initial conditions can lead to different boundaries.
2. If a weight vector is found, the patterns are linearly separable.
3. If no weight vector is found, i.e., the algorithm stops at Nmax, we do not know whether the patterns are linearly separable; we only know that we did not find a separating hyperplane.
Example: The Perceptron Algorithm
Given the following training samples, known to be from classes C1 and C2 [Figure: training samples plotted in the (x1, x2) plane, with axes running from -1 to 1], find a linearly separating hyperplane using the Perceptron Algorithm.
Solution using the Perceptron Algorithm: define the augmented pattern vectors for the two classes as follows, then select an initial weight vector wT(1) at random.
Through iterations 7-14, the weight vector was unchanged over one full pass through the training samples. Thus w(10) is a weight vector that gives the following separating HYPERPLANE. Answer.
Thus the following g(x) could be used as a discriminant function, with the decision rule: decide C1 if g(x) > 0 and C2 if g(x) < 0.
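As a usage sketch, with w_hat standing in as a hypothetical placeholder for the trained weights on the slide (and assuming the augmenting 1 is appended last; the slide's convention may differ):

import numpy as np

w_hat = np.array([1.0, -2.0, 0.5])           # hypothetical weight values

def decide(x):
    g = w_hat @ np.append(x, 1.0)            # g(x) = wT x^, x^ augmented
    return "C1" if g > 0 else "C2"           # sign of g(x) decides the class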
Trace of the boundaries given by the trial weight vectors as the iterations proceed.
Lecture 16 Summary
1. Minimum Distance Classifiers - Examples
2. Similarity Measures
3. Perceptron Algorithm - Example