
Pattern Recognition: Statistical and Neural

Comprehensive review of classifier performance measures including a posteriori probability, Bayes decision rules, and error minimization in the context of pattern recognition.


Presentation Transcript


  1. Nanjing University of Science & Technology. Pattern Recognition: Statistical and Neural. Lonnie C. Ludeman. Lecture 9, Sept 28, 2005.

  2. Review 1: Classifier Performance Measures
     1. A posteriori probability (maximize)
     2. Probability of error (minimize)
     3. Bayes average cost (minimize)
     4. Probability of detection (maximize, with fixed probability of false alarm: the Neyman-Pearson rule)
     5. Losses (minimize the maximum loss)

  3. Review 2: MAP, MPE, and Bayes Classification Rules. With the likelihood ratio $l(\mathbf{x}) = p(\mathbf{x} \mid C_1) / p(\mathbf{x} \mid C_2)$ and a threshold $N$, decide $C_1$ if $l(\mathbf{x}) > N$ and $C_2$ if $l(\mathbf{x}) < N$:
     $$l(\mathbf{x}) \ \underset{C_2}{\overset{C_1}{\gtrless}} \ N$$
     where the threshold for each criterion is
     $$N_{\text{MAP}} = N_{\text{MPE}} = \frac{P(C_2)}{P(C_1)}, \qquad N_{\text{BAYES}} = \frac{(C_{22} - C_{12})\, P(C_2)}{(C_{11} - C_{21})\, P(C_1)}$$
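As a quick numeric check (the priors and costs below are assumed for illustration, not from the lecture): with $P(C_1) = 0.7$, $P(C_2) = 0.3$, zero cost for correct decisions ($C_{11} = C_{22} = 0$), $C_{12} = 2$, and $C_{21} = 1$,

$$N_{\text{MAP}} = N_{\text{MPE}} = \frac{0.3}{0.7} \approx 0.43, \qquad N_{\text{BAYES}} = \frac{(0 - 2)(0.3)}{(0 - 1)(0.7)} = \frac{0.6}{0.7} \approx 0.86$$

so the cost-weighted rule demands stronger evidence before deciding $C_1$, since a wrong decision for $C_1$ is the more expensive mistake here.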

  4. Review 3: General Calculation of Probability of Error. [Figure: the decision can be carried out in three equivalent spaces. In pattern space, x falls in region R1 (decide C1) or R2 (decide C2); in feature space, y = g(x) falls in region F1 (decide C1) or F2 (decide C2); in likelihood ratio space, $L(\mathbf{x}) = p(\mathbf{x} \mid C_1) / p(\mathbf{x} \mid C_2)$ is compared with the threshold N, deciding C1 above the threshold and C2 below it.]

  5. Topics for Lecture 9
     1. Neyman-Pearson Decision Rule and the Receiver Operating Characteristic (ROC)
     2. M-Class Case MAP Decision Rule
     3. M-Class Case MPE Decision Rule
     4. M-Class Bayes Decision Rule

  6. Motivation: Falling Rock. There is a small probability of a falling rock, and it is difficult to assign realistic costs to the consequences: the cost of failing to detect a rock is very high, while the cost of a false alarm is low.

  7. Definitions:
     P(decide target | target) = Detection
     P(decide no target | target) = Miss
     P(decide target | no target) = False Alarm
     P(decide no target | no target) = Correct Dismissal

  8. Neyman-Pearson Classifier (2 Classes). A. Assumptions: C1 (target) with known p(x | C1); C2 (no target) with known p(x | C2); no a priori probabilities specified; no cost assignments available; an acceptable false alarm rate is specified.

  9. B. Performance: Probability of Detection and Probability of False Alarm. PD = P(decide target | target is present); PFA = P(decide target | target is NOT present). C. Decision rule: maximize the probability of detection subject to an acceptable false alarm rate.

  10. Neyman-Pearson Decision Rule (rough derivation).
      $$P_D = P(\text{decide target} \mid \text{target}) = \int_{R_1} p(\mathbf{x} \mid C_1)\, d\mathbf{x} = 1 - P_M$$
      $$P_{FA} = P(\text{decide target} \mid \text{no target}) = \int_{R_1} p(\mathbf{x} \mid C_2)\, d\mathbf{x} \le \alpha$$
      where $\alpha$ is the acceptable false alarm rate. Use a Lagrange multiplier $\lambda$ to minimize
      $$J = P_M + \lambda \,(P_{FA} - \alpha)$$

  11. Neyman-Pearson Decision Rule (rough derivation, continued).
      $$J = 1 - \int_{R_1} p(\mathbf{x} \mid C_1)\, d\mathbf{x} + \lambda \left( \int_{R_1} p(\mathbf{x} \mid C_2)\, d\mathbf{x} - \alpha \right) = 1 - \lambda\alpha + \int_{R_1} \left[\, \lambda\, p(\mathbf{x} \mid C_2) - p(\mathbf{x} \mid C_1) \,\right] d\mathbf{x}$$
      To minimize J we select x to be in R1 when the term in brackets is negative; that is, x is assigned to R1 if $-p(\mathbf{x} \mid C_1) + \lambda\, p(\mathbf{x} \mid C_2) < 0$, which can be rearranged as follows.

  12. Neyman-Pearson Decision Rule:
      $$\frac{p(\mathbf{x} \mid C_1)}{p(\mathbf{x} \mid C_2)} \ \underset{C_2}{\overset{C_1}{\gtrless}} \ \lambda = N_{NP}$$
      where $\lambda$ is the solution of the constraining equation
      $$\alpha = \int_{R_1(\lambda)} p(\mathbf{x} \mid C_2)\, d\mathbf{x}$$
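To make the constraint concrete, here is a minimal numerical sketch assuming two hypothetical unit-variance Gaussian classes (C1, the target, with mean 1; C2, no target, with mean 0); the densities and the false alarm rate are illustrative assumptions, not part of the lecture. Because the likelihood ratio $l(x) = e^{x - 0.5}$ is monotone in x here, the region $R_1$ is one-sided and the constraint can be solved in closed form.

```python
# Hedged sketch: Neyman-Pearson threshold for two assumed classes
#   C1 (target):    x ~ N(1, 1)
#   C2 (no target): x ~ N(0, 1)
# l(x) = p(x|C1)/p(x|C2) = exp(x - 0.5) is monotone in x, so
# R1 = {l(x) > lambda} is simply {x > t} for some x-space threshold t.
import math
from scipy.stats import norm

alpha = 0.05                      # acceptable false-alarm rate (assumed)

# Constraint: alpha = integral over R1 of p(x|C2) dx = P(x > t | C2).
t = norm.ppf(1.0 - alpha)         # x-space threshold under C2 ~ N(0, 1)

lam = math.exp(t - 0.5)           # equivalent likelihood-ratio threshold N_NP
p_d = norm.sf(t, loc=1.0)         # resulting detection probability P_D

print(f"N_NP = {lam:.3f}, x threshold = {t:.3f}, P_D = {p_d:.3f}")
```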

  13. Receiver Operating Characteristic (ROC). [Figure: the ROC curve plots $P_D$ against $P_{FA}$. The operating point $(P_{FA}, P_D)$ lies on the curve, and the slope of the curve at that point equals the threshold $N_{NP}$. The corner (1, 1) corresponds to always saying "target"; the corner (0, 0) corresponds to always saying "no target".]
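Continuing the same assumed Gaussian example, a short sketch that traces out the ROC by sweeping the threshold and numerically checks the slide's claim that the slope of the curve at an operating point equals $N_{NP}$:

```python
# Hedged sketch: ROC for the assumed pair C1 ~ N(1, 1), C2 ~ N(0, 1).
import numpy as np
from scipy.stats import norm

ts = np.linspace(-4.0, 5.0, 200)   # swept x-space thresholds
p_fa = norm.sf(ts, loc=0.0)        # P(decide target | no target)
p_d = norm.sf(ts, loc=1.0)         # P(decide target | target)

# Slope dP_D/dP_FA along the curve; it should match the
# likelihood-ratio threshold exp(t - 0.5) at each point.
slopes = np.gradient(p_d, p_fa)
i = len(ts) // 2
print(f"t = {ts[i]:.2f}: slope = {slopes[i]:.3f}, "
      f"exp(t - 0.5) = {np.exp(ts[i] - 0.5):.3f}")
```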

  14. Extension of MAP, MPE, and Bayes to M Classes. Shorthand notation for the M-class case:
      C1 : x ~ p(x | C1), P(C1)
      C2 : x ~ p(x | C2), P(C2)
      ...
      CM : x ~ p(x | CM), P(CM)

  15. Maximum A Posteriori Classification Rule (M-Class Case). A. Basic Assumptions. Known: the conditional probability density functions p(x | C1), p(x | C2), ..., p(x | CM). Known: the a priori probabilities P(C1), P(C2), ..., P(CM). B. Performance Measure: the a posteriori probability P(Ci | x).

  16. Maximum A Posteriori Classification Rule (M-Class Case). C. Decision Rule: for an observed vector x, select the class with the maximum a posteriori probability. If P(Ci | x) > P(Cj | x) for all j = 1, 2, ..., M with j ≠ i, then decide x is from Ci; in case of equality, decide among the boundary classes by random choice.

  17. Derivation of the MAP Decision Rule. Determine, for i = 1, 2, ..., M, the a posteriori probabilities P(Ci | x). By one form of Bayes' theorem, P(Ci | x) = p(x | Ci) P(Ci) / p(x). Substituting this for P(Ci | x) gives p(x | Ci) P(Ci) / p(x), i = 1, 2, ..., M. But p(x) is the same for all terms, so the decision rule simplifies to the following.

  18. MAP Decision Rule: for an observed vector x, select class Ci if p(x | Ci) P(Ci) > p(x | Cj) P(Cj) for all j = 1, 2, ..., M with j ≠ i; in case of equality, decide among the boundary classes by random choice.
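A minimal sketch of this rule, assuming three illustrative one-dimensional Gaussian class densities and priors (none of these numbers come from the lecture); note that it breaks ties deterministically rather than by the random choice the slide prescribes:

```python
# Hedged sketch: M-class MAP rule, decide the i maximizing p(x|Ci) P(Ci).
import numpy as np
from scipy.stats import norm

means = [-2.0, 0.0, 3.0]           # assumed class means (unit variance)
priors = [0.2, 0.5, 0.3]           # assumed a priori probabilities

def map_decide(x):
    """Return the class index i maximizing p(x | Ci) * P(Ci)."""
    scores = [norm.pdf(x, loc=m) * p for m, p in zip(means, priors)]
    return int(np.argmax(scores))  # ties broken by lowest index here

print([map_decide(x) for x in (-2.5, 0.1, 2.0)])   # -> [0, 1, 2]
```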

  19. Minimum Probability of Error Classification Rule (M-Class Case). A. Basic Assumptions: known conditional probability density functions p(x | C1), p(x | C2), ..., p(x | CM); known a priori probabilities P(C1), P(C2), ..., P(CM). B. Performance (total probability of error): P(error) = P(error | C1) P(C1) + P(error | C2) P(C2) + ... + P(error | CM) P(CM). C. Decision Rule: minimizes P(error).

  20. Derivation: Minimum Probability of Error Classification Rule (M-Class Case). Select the decision regions such that P(error) is minimized. [Figure: pattern space X partitioned into regions R1 (decide C1), R2 (decide C2), ..., Ri (decide Ci), ..., RM (decide CM).] But P(error) = 1 - P(correct), where P(correct) = P(correct | C1) P(C1) + P(correct | C2) P(C2) + ... + P(correct | CM) P(CM).

  21. Derivation continued:
      $$P(\text{correct} \mid C_1) = P(\text{decide } C_1 \mid C_1) = \int_{R_1} p(\mathbf{x} \mid C_1)\, d\mathbf{x}$$
      $$\cdots$$
      $$P(\text{correct} \mid C_k) = P(\text{decide } C_k \mid C_k) = \int_{R_k} p(\mathbf{x} \mid C_k)\, d\mathbf{x}$$
      $$\cdots$$
      $$P(\text{correct} \mid C_M) = P(\text{decide } C_M \mid C_M) = \int_{R_M} p(\mathbf{x} \mid C_M)\, d\mathbf{x}$$

  22. Derivation continued:
      $$P(\text{error}) = 1 - \sum_{k=1}^{M} \int_{R_k} p(\mathbf{x} \mid C_k)\, P(C_k)\, d\mathbf{x}$$
      The minimum probability of error decision rule selects the regions Rk, k = 1, 2, ..., M, such that P(error) is minimized. By selecting x to be a member of Rk whenever the term p(x | Ck) P(Ck) is the MAXIMUM over all classes, we minimize P(error).
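As a rough numerical illustration of this derivation, a Monte Carlo sketch that estimates P(error) for the rule picking the maximum p(x | Ck) P(Ck), reusing the assumed means, priors, and map_decide from the MAP sketch above (maximizing p(x | Ck) P(Ck) is exactly that rule):

```python
# Hedged sketch: Monte Carlo estimate of P(error) for the MPE/MAP rule,
# reusing means, priors, and map_decide from the MAP sketch above.
import numpy as np

rng = np.random.default_rng(0)
n = 20_000
labels = rng.choice(3, size=n, p=priors)                  # classes drawn from priors
xs = rng.normal(loc=np.array(means)[labels], scale=1.0)   # x drawn from p(x | Ck)

errors = sum(map_decide(x) != k for x, k in zip(xs, labels))
print(f"estimated P(error) = {errors / n:.4f}")
```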

  23. Thus the MPE Decision Rule: for an observed vector x, select class Ck if p(x | Ck) P(Ck) > p(x | Cj) P(Cj) for all j = 1, 2, ..., M with j ≠ k; in case of equality, decide among the boundary classes by random choice. Note that this rule is identical to the MAP decision rule.
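A small worked instance of the rule (the numbers below are assumed purely for illustration): suppose M = 3 and, at the observed x,

$$p(\mathbf{x} \mid C_1)\, P(C_1) = 0.01, \quad p(\mathbf{x} \mid C_2)\, P(C_2) = 0.04, \quad p(\mathbf{x} \mid C_3)\, P(C_3) = 0.02,$$

then the largest weighted likelihood belongs to $C_2$, so the MPE rule decides x is from $C_2$.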

  24. Bayes Classifier (M-Class Case). A. Statistical Assumptions. Known, for each class, the conditional probability density function of the observed pattern vector and the a priori probability:
      C1 : x ~ p(x | C1), P(C1)
      C2 : x ~ p(x | C2), P(C2)
      ...
      Ck : x ~ p(x | Ck), P(Ck)
      ...
      CM : x ~ p(x | CM), P(CM)

  25. Bayes Classifier: Cost Definitions. Define the costs associated with the decisions:
      C11, C12, ..., C1M
      C21, C22, ..., C2M
      ...
      CM1, CM2, ..., CMM
      where Cij is the cost associated with deciding class Ci when the true class is Cj.

  26. Bayes Classifier: Risk Definition (M-Class Case). Risk is defined as the average cost associated with making a decision:
      $$R = \text{Risk} = \sum_{i=1}^{M} \sum_{j=1}^{M} C_{ij}\, P(\text{decide } C_i \mid C_j)\, P(C_j)$$
      where
      $$P(\text{decide } C_k \mid C_j) = \int_{R_k} p(\mathbf{x} \mid C_j)\, d\mathbf{x}$$

  27. Derivation continued:
      $$\text{Risk} = \int_{R_1} \sum_{j=1}^{M} C_{1j}\, p(\mathbf{x} \mid C_j)\, P(C_j)\, d\mathbf{x} + \int_{R_2} \sum_{j=1}^{M} C_{2j}\, p(\mathbf{x} \mid C_j)\, P(C_j)\, d\mathbf{x} + \cdots + \int_{R_M} \sum_{j=1}^{M} C_{Mj}\, p(\mathbf{x} \mid C_j)\, P(C_j)\, d\mathbf{x}$$

  28. Bayes Decision Rule (M-Class Case). Define
      $$y_i(\mathbf{x}) = \sum_{j=1}^{M} C_{ij}\, p(\mathbf{x} \mid C_j)\, P(C_j)$$
      To MINIMIZE the risk, we assign x to the region Ri if $y_i(\mathbf{x}) < y_j(\mathbf{x})$ for all j ≠ i.

  29. Bayes Decision Rule (M-Class Case), final step of the derivation. With
      $$y_i(\mathbf{x}) = \sum_{j=1}^{M} C_{ij}\, p(\mathbf{x} \mid C_j)\, P(C_j)$$
      if $y_i(\mathbf{x}) < y_j(\mathbf{x})$ for all j ≠ i, then decide x is from Ci.
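A minimal sketch of the full M-class Bayes rule, again with assumed Gaussian densities and priors and an assumed asymmetric cost matrix (none of these numbers come from the lecture): compute $y_i(\mathbf{x})$ for every candidate decision and take the minimizer.

```python
# Hedged sketch: M-class Bayes rule, decide the i minimizing
# y_i(x) = sum_j C_ij p(x|Cj) P(Cj). All numbers below are assumed.
import numpy as np
from scipy.stats import norm

means = np.array([-2.0, 0.0, 3.0])    # assumed class means (unit variance)
priors = np.array([0.2, 0.5, 0.3])    # assumed a priori probabilities
costs = np.array([[0.0, 1.0, 4.0],    # C[i, j]: cost of deciding Ci
                  [1.0, 0.0, 2.0],    # when Cj is true; zero on the
                  [2.0, 1.0, 0.0]])   # diagonal, asymmetric elsewhere

def bayes_decide(x):
    """Return the decision index i minimizing y_i(x)."""
    weighted = norm.pdf(x, loc=means) * priors   # p(x|Cj) P(Cj) for each j
    return int(np.argmin(costs @ weighted))      # row i of costs @ weighted is y_i(x)

print([bayes_decide(x) for x in (-2.5, 0.1, 2.0)])
```

With the high cost assigned to deciding C1 when C3 is true (C[0, 2] = 4.0), the rule shifts its boundaries away from those of the MAP sketch above, which corresponds to the special case of zero-one costs.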

  30. Summary
      1. Neyman-Pearson Decision Rule and the Receiver Operating Characteristic (ROC)
      2. M-Class Case MAP Decision Rule
      3. M-Class Case MPE Decision Rule
      4. M-Class Bayes Decision Rule

  31. End of Lecture 9
