This lecture discusses the classifier framework and performance measures in pattern recognition, including the a posteriori probability, probability of error, Bayes average cost, probability of detection, and the likelihood ratio test.
Nanjing University of Science & Technology
Pattern Recognition: Statistical and Neural
Lonnie C. Ludeman
Lecture 7, Sept 23, 2005
Review 1: Classifier Framework (may be chosen to be optimum)
Review 2: Classifier Performance Measures
1. A posteriori probability (maximize)
2. Probability of error (minimize)
3. Bayes average cost (minimize)
4. Probability of detection (maximize, with fixed probability of false alarm; Neyman-Pearson rule)
5. Losses (minimize the maximum)
Review 3: MAP and MPE Classification Rule
Form 1: Decide C1 if p(x | C1) P(C1) > p(x | C2) P(C2); otherwise decide C2.
Form 2: Decide C1 if p(x | C1) / p(x | C2) > P(C2) / P(C1); otherwise decide C2.
(The left side of Form 2 is the likelihood ratio; the right side is the threshold.)
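As a rough illustration of Form 2, a minimal Python sketch (the Gaussian densities, their means, and the priors below are made-up values, not from the lecture):

```python
import math

# Hypothetical unit-variance Gaussian class-conditional densities, for illustration only.
def p_x_given_c1(x):
    return math.exp(-0.5 * x ** 2) / math.sqrt(2 * math.pi)

def p_x_given_c2(x):
    return math.exp(-0.5 * (x - 2.0) ** 2) / math.sqrt(2 * math.pi)

P_C1, P_C2 = 0.4, 0.6  # assumed a priori probabilities

def map_decide(x):
    """Form 2: decide C1 if the likelihood ratio exceeds the threshold P(C2)/P(C1)."""
    likelihood_ratio = p_x_given_c1(x) / p_x_given_c2(x)
    return "C1" if likelihood_ratio > P_C2 / P_C1 else "C2"

print(map_decide(0.5))  # near the C1 mean -> C1
print(map_decide(1.8))  # near the C2 mean -> C2
```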
Topics for Lecture 7
1. Bayes Decision Rule – Introduction (2-Class Case)
2. Bayes Decision Rule – Derivation (2-Class Case)
3. General Calculation of Probability of Error
4. Calculation of Bayes Risk
Motivation: A basket of eggs, some good and some bad. For each egg we must select one of two decisions: good or bad.
Possible decision outcomes:
Decide a good egg is good: no problem, cost = 0
Decide a good egg is bad: throw it away, cost = 1
Decide a bad egg is bad: throw it away, cost = 0.1
Decide a bad egg is good: catastrophe!, cost = 100
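A small sketch of how these costs drive a decision when only the chance of a bad egg is known (the 5% prior below is an assumed value, not from the slide):

```python
# Costs from the slide, keyed by (decision, true state) of an egg.
COST = {("good", "good"): 0.0,    # keep a good egg: no problem
        ("bad",  "good"): 1.0,    # throw away a good egg
        ("bad",  "bad"):  0.1,    # throw away a bad egg
        ("good", "bad"):  100.0}  # keep a bad egg: catastrophe

P_BAD = 0.05  # assumed prior probability that an egg is bad

def expected_cost(decision):
    """Average cost of a decision over the two possible true states."""
    return COST[(decision, "good")] * (1 - P_BAD) + COST[(decision, "bad")] * P_BAD

# Even a small chance of a bad egg makes "throw it away" the cheaper decision here.
print(expected_cost("good"), expected_cost("bad"))  # 5.0, 0.955
```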
1. Bayes Classifier – Statistical Assumptions (Two-Class Case)
Known:
C1: x ~ p(x | C1), P(C1)
C2: x ~ p(x | C2), P(C2)
where C1, C2 are the classes, x is the observed pattern vector, p(x | Ci) are the conditional probability density functions, and P(Ci) are the a priori probabilities.
Bayes Classifier – Cost Definitions
Define the costs associated with the decisions: C11, C12, C21, C22,
where Cij is the cost associated with deciding class Ci when the true class is Cj.
Bayes Classifier – Risk Definition
Risk is defined as the average cost associated with making a decision:
R = Risk = P(decide C1 | C1) P(C1) C11 + P(decide C1 | C2) P(C2) C12
         + P(decide C2 | C1) P(C1) C21 + P(decide C2 | C2) P(C2) C22
Bayes Classifier – Optimum Decision Rule
The Bayes decision rule selects the regions R1 and R2, for deciding C1 and C2 respectively, so as to minimize the risk, i.e. the average cost associated with making a decision.
It can be shown (details in the book) that the Bayes decision rule is a likelihood ratio test (LRT):
Decide C1 if p(x | C1) / p(x | C2) > N_BAYES = (C22 - C12) P(C2) / [ (C11 - C21) P(C1) ]; otherwise decide C2.
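A minimal sketch of this test in Python, assuming the two class-conditional densities have already been evaluated at the observed x (function and variable names are mine, not from the text):

```python
def bayes_threshold(P_C1, P_C2, C11, C12, C21, C22):
    """Bayes threshold N_BAYES = (C22 - C12) P(C2) / [(C11 - C21) P(C1)]."""
    return ((C22 - C12) * P_C2) / ((C11 - C21) * P_C1)

def bayes_decide(p_x_given_c1, p_x_given_c2, threshold):
    """Decide C1 when the likelihood ratio p(x|C1)/p(x|C2) exceeds the threshold."""
    return "C1" if p_x_given_c1 / p_x_given_c2 > threshold else "C2"

# With 0-1 costs (next slide) the threshold reduces to P(C2)/P(C1), i.e. the MPE rule:
print(bayes_threshold(P_C1=1/3, P_C2=2/3, C11=0, C12=1, C21=1, C22=0))  # 2.0
```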
Bayes Classifier – Special Case
C11 = C22 = 0 (cost of 0 for correct classification)
C12 = C21 = 1 (cost of 1 for incorrect classification)
Then the Bayes decision rule is equivalent to the minimum probability of error (MPE) decision rule.
Since the Bayes rule is
Decide C1 if p(x | C1) / p(x | C2) > N_BAYES = (C22 - C12) P(C2) / [ (C11 - C21) P(C1) ]; otherwise decide C2,
with these costs it reduces to
Decide C1 if p(x | C1) / p(x | C2) > N_MPE = (0 - 1) P(C2) / [ (0 - 1) P(C1) ] = P(C2) / P(C1); otherwise decide C2.
Bayes Decision Rule – Example
Given the following statistical information:
p(x | C1) = exp(-x) u(x), P(C1) = 1/3
p(x | C2) = 2 exp(-2x) u(x), P(C2) = 2/3
Given the following cost assignment:
C11 = 0, C22 = 0, C12 = 3, C21 = 2
(a) Determine the Bayes decision rule (minimum risk).
(b) Simplify the test to the observation space.
(c) Calculate the Bayes risk for the Bayes decision rule.
Bayes Example – Solution is an LRT
Decide C1 if p(x | C1) / p(x | C2) > N_BAYES; otherwise decide C2, where
N_BAYES = (C22 - C12) P(C2) / [ (C11 - C21) P(C1) ] = (0 - 3)(2/3) / [ (0 - 2)(1/3) ] = 3
and, for x > 0,
p(x | C1) / p(x | C2) = exp(-x) / [ 2 exp(-2x) ] = (1/2) exp(x).
Bayes Example – Solution in Different Spaces
(a) For x > 0, the Bayes decision rule in the likelihood ratio space is:
Decide C1 if (1/2) exp(x) > 3; otherwise decide C2.
(b) For x > 0, the equivalent decision rule in the observation space is:
Decide C1 if x > ln(6); otherwise decide C2.
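A quick numeric check of (a) and (b) in Python (a sketch; the variable names are mine):

```python
import math

P_C1, P_C2 = 1/3, 2/3
C11, C12, C21, C22 = 0, 3, 2, 0

# Bayes threshold: (C22 - C12) P(C2) / [(C11 - C21) P(C1)] = (-3)(2/3) / [(-2)(1/3)] = 3
N_bayes = ((C22 - C12) * P_C2) / ((C11 - C21) * P_C1)

# Likelihood ratio for x > 0 is exp(-x) / (2 exp(-2x)) = (1/2) exp(x), and
# (1/2) exp(x) > 3  <=>  x > ln(6), the boundary in the observation space.
x_boundary = math.log(2 * N_bayes)

print(N_bayes)      # 3.0
print(x_boundary)   # ln(6) ≈ 1.79
```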
Bayes Example – Calculation of Bayes Risk
(c) Must compute the conditional probabilities of error:
P(error | C1) = P(decide C2 | C1) = ∫_{R2} p(x | C1) dx = ∫_{0}^{ln 6} exp(-x) dx = 5/6
Bayes Example – Calculation of Bayes Risk (cont.)
P(error | C2) = P(decide C1 | C2) = ∫_{R1} p(x | C2) dx = ∫_{ln 6}^{∞} 2 exp(-2x) dx = 1/36
Bayes Example – Calculation of Bayes Risk (cont.)
Risk = 0 + P(decide C2 | C1) P(C1) C21 + P(decide C1 | C2) P(C2) C12 + 0
     = (5/6)(1/3)(2) + (1/36)(2/3)(3)
Risk = 11/18 units/decision
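These conditional error probabilities and the risk can be checked numerically, for example with scipy.integrate.quad (a sketch using the example's densities and the boundary x = ln 6):

```python
import math
from scipy.integrate import quad

x_star = math.log(6)  # decide C1 when x > ln(6), decide C2 otherwise

# P(error | C1) = integral over R2 = [0, ln 6) of exp(-x) dx  -> 5/6
p_err_c1, _ = quad(lambda x: math.exp(-x), 0, x_star)
# P(error | C2) = integral over R1 = (ln 6, inf) of 2 exp(-2x) dx  -> 1/36
p_err_c2, _ = quad(lambda x: 2 * math.exp(-2 * x), x_star, math.inf)

risk = p_err_c1 * (1/3) * 2 + p_err_c2 * (2/3) * 3
print(p_err_c1, p_err_c2, risk)  # ≈ 0.8333, 0.0278, 0.6111 (= 11/18)
```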
2. General Calculation of Probability of Error
The decision rule can be expressed in any of three equivalent spaces:
Pattern (observation) space: the observation x, with region R1 (decide C1) and region R2 (decide C2).
Feature space: y = g(x), with region F1 (decide C1) and region F2 (decide C2).
Likelihood ratio space: L(x) = p(x | C1) / p(x | C2), compared with the threshold N; decide C1 when L(x) > N, decide C2 otherwise.
Probability of Error – Observation Space
P(error) = P(error | C1) P(C1) + P(error | C2) P(C2)
P(error | C1) = ∫_{R2} p(x | C1) dx,  P(error | C2) = ∫_{R1} p(x | C2) dx
Probability of Error – Feature Space
P(error) = P(error | C1) P(C1) + P(error | C2) P(C2)
P(error | C1) = ∫_{F2} p(y | C1) dy,  P(error | C2) = ∫_{F1} p(y | C2) dy
Probability of Error – Likelihood Ratio Space
P(error) = P(error | C1) P(C1) + P(error | C2) P(C2)
P(error | C1) = ∫_{-∞}^{N} p(l | C1) dl,  P(error | C2) = ∫_{N}^{∞} p(l | C2) dl
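Whichever space the rule is written in, the regions correspond and the resulting P(error) is the same. As a rough Monte Carlo check using the exponential densities and the Bayes boundary from the earlier example (the sample size and seed are arbitrary choices of mine):

```python
import math
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
P_C2 = 2/3
x_star = math.log(6)

# Draw the true class from the priors, then x from the matching density.
is_c2 = rng.random(n) < P_C2
x = np.where(is_c2,
             rng.exponential(scale=0.5, size=n),  # p(x|C2) = 2 exp(-2x)
             rng.exponential(scale=1.0, size=n))  # p(x|C1) = exp(-x)

decide_c2 = x <= x_star                          # Bayes rule in the observation space
errors = np.where(is_c2, ~decide_c2, decide_c2)  # decision disagrees with the true class
print(errors.mean())  # ≈ (5/6)(1/3) + (1/36)(2/3) = 8/27 ≈ 0.296
```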