Ch 1. Introduction
Pattern Recognition and Machine Learning, C. M. Bishop, 2006. Summarized by J.W. Ha, Biointelligence Laboratory, Seoul National University, http://bi.snu.ac.kr/
Contents
• 1.4 The Curse of Dimensionality
• 1.5 Decision Theory
• 1.6 Information Theory
1.4 The Curse of Dimensionality
• The High-Dimensionality Problem
  - Ex. mixture of oil, water, and gas flowing in a pipe
  - 3 classes (homogeneous, annular, laminar), 12 input variables
  - Scatter plot of x6 and x7; predict the class of a new point x
  - Simple, naïve approach: divide the input space into cells and take a majority vote within each cell
1.4 The Curse of Dimensionality (Cont'd)
• The Shortcomings of the Naïve Approach (see the sketch below)
  - The number of cells increases exponentially with the dimensionality D.
  - A very large training set is needed to keep the cells from being empty.
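A minimal sketch, not from the slides, of why the grid-based classifier breaks down; it assumes M bins along each input axis, so the grid has M**D cells:

```python
# With M bins per axis, the naive grid classifier needs M**D cells,
# so the cell count explodes as the dimensionality D grows.
M = 10  # hypothetical number of bins along each input dimension
for D in (1, 2, 3, 12):  # D = 12 matches the oil-flow example
    print(f"D = {D:2d}: {M**D:,} cells")
# D = 12 already needs 10**12 cells -- far more than any training set can fill.
```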
1.4 The Curse of Dimensionality (Cont'd)
• Polynomial Curve Fitting (Order M)
  - As D increases, the number of coefficients grows proportionally to D^M, not exponentially.
• The Volume of a High-Dimensional Sphere
  - Concentrated in a thin shell near the surface (see the sketch below).
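A small sketch of the shell argument, assuming only that the volume of a D-dimensional sphere scales as r**D, so the fraction of volume in a shell of thickness eps just inside the surface is 1 - (1 - eps)**D:

```python
# Fraction of a D-sphere's volume lying in the outer shell of relative
# thickness eps: 1 - (1 - eps)**D, which tends to 1 as D grows.
eps = 0.01  # shell thickness as a fraction of the radius
for D in (1, 2, 20, 500):
    print(f"D = {D:3d}: outer-shell volume fraction = {1 - (1 - eps)**D:.3f}")
# At D = 500 over 99% of the volume sits in the outermost 1% of the radius.
```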
1.4 The Curse of Dimensionality (Cont'd)
• Gaussian Distribution
  - In high dimensions, the probability mass of a Gaussian concentrates in a thin shell at a particular radius from the mean.
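A hedged sketch of the point the slide's figure made (the figure itself did not survive extraction): samples from a standard D-dimensional Gaussian concentrate at radius roughly sqrt(D), even though the density peaks at the origin.

```python
import numpy as np

# Draw samples from a standard D-dimensional Gaussian and look at the
# distribution of their radii: the mean radius grows like sqrt(D) while
# the relative spread shrinks -- a thin shell.
rng = np.random.default_rng(0)
for D in (1, 10, 100, 1000):
    radii = np.linalg.norm(rng.standard_normal((10_000, D)), axis=1)
    print(f"D = {D:4d}: mean radius {radii.mean():6.2f} "
          f"(~sqrt(D) = {np.sqrt(D):6.2f}), std {radii.std():.2f}")
```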
1.5 Decision Theory
• Making Optimal Decisions
  - Inference step & decision step
  - Select the class with the higher posterior probability
• Minimizing the Misclassification Rate
  - MAP decision rule → minimizes the colored (error) area between the decision regions
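A minimal sketch of the MAP rule; the posterior values here are made up for illustration:

```python
import numpy as np

# Assigning x to the class with the largest posterior p(C_k|x)
# minimizes the probability of misclassification.
posterior = np.array([0.7, 0.2, 0.1])  # hypothetical p(C_k|x), k = 1..3
k = int(np.argmax(posterior))
print(f"assign x to class C{k + 1}")   # -> assign x to class C1
```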
1.5 Decision Theory (Cont'd)
• Minimizing the Expected Loss
  - The damage caused by misclassification differs from class to class.
  - Introduce a loss function (cost function).
  - MAP → minimizing the expected loss
• The Reject Option
  - Threshold θ
  - Reject if the largest posterior probability is below θ.
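A sketch of both rules on this slide. The loss matrix follows Bishop's cancer-screening example; the posterior and threshold values are made up:

```python
import numpy as np

# Decision j minimizes sum_k L[k, j] * p(C_k|x); we reject when the
# largest posterior falls below theta.
L = np.array([[0.0, 1000.0],   # true cancer: deciding "normal" is very costly
              [1.0,    0.0]])  # true normal: deciding "cancer" costs little
posterior = np.array([0.3, 0.7])  # hypothetical p(cancer|x), p(normal|x)
theta = 0.9                       # hypothetical reject threshold

expected_loss = posterior @ L     # expected loss of deciding "cancer", "normal"
print("expected losses:", expected_loss)          # [0.7, 300.0]
if posterior.max() < theta:
    print("reject: largest posterior is below theta")
else:
    print("decide class", int(np.argmin(expected_loss)))
# Note: expected-loss minimization favors "cancer" (loss 0.7) even though
# "normal" has the higher posterior, because missing a cancer is so costly.
```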
1.5 Decision Theory (Cont'd)
• Inference and Decision
  - Three distinct approaches:
  1. Obtain the posterior probabilities via generative models
  2. Obtain the posterior probabilities via discriminative models
  3. Find a discriminant function directly
1.5 Decision Theory (Cont'd)
• Reasons to Compute the Posterior
  1. Minimizing risk
  2. Reject option
  3. Compensating for class priors
  4. Combining models
1.5 Decision Theory (Cont'd)
• Loss Functions for Regression
  - Also applies to a vector of multiple target variables (see the formulas below).
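The slide's formulas did not survive extraction; for reference, the squared-loss case from PRML Section 1.5.5 is

$$\mathbb{E}[L] = \iint \{y(\mathbf{x}) - t\}^2 \, p(\mathbf{x}, t) \, d\mathbf{x} \, dt, \qquad y^*(\mathbf{x}) = \mathbb{E}_t[t \mid \mathbf{x}],$$

and for a target vector $\mathbf{t}$ the minimizer is likewise the conditional expectation $\mathbf{y}^*(\mathbf{x}) = \mathbb{E}_{\mathbf{t}}[\mathbf{t} \mid \mathbf{x}]$.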
1.5 Decision Theory (Cont'd)
• Minkowski Loss
  - A generalization of the squared loss (see the formula below).
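The Minkowski-loss formula was likewise lost in extraction; from PRML Section 1.5.5 it replaces the squared error with a general power q,

$$\mathbb{E}[L_q] = \iint |y(\mathbf{x}) - t|^q \, p(\mathbf{x}, t) \, d\mathbf{x} \, dt,$$

whose minimizer is the conditional mean for $q = 2$, the conditional median for $q = 1$, and the conditional mode as $q \to 0$.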
1.6 Information Theory
• Entropy
  - The noiseless coding theorem states that the entropy is a lower bound on the number of bits needed to transmit the state of a random variable.
  - Higher entropy, larger uncertainty (see the sketch below).
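A minimal sketch of the entropy definition H[x] = -Σ p(x) log2 p(x); the distributions below are made up:

```python
import numpy as np

# Entropy in bits; by convention 0 * log 0 = 0, so zero-probability
# states are dropped before taking the logarithm.
def entropy(p):
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

print(entropy([0.25] * 4))                # 2.0 bits: uniform, most uncertain
print(entropy([0.97, 0.01, 0.01, 0.01]))  # ~0.24 bits: nearly deterministic
```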
1.6 Information Theory (Cont'd)
• Maximum-Entropy Configuration for a Continuous Variable
  - Constraints: normalization, fixed mean, fixed variance
  - Result: the distribution that maximizes the differential entropy is the Gaussian (see the formulas below)
• Conditional Entropy: H[x, y] = H[y|x] + H[x]
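The constraint and result formulas were lost in extraction; from PRML Section 1.6, maximizing the differential entropy subject to normalization and to fixed mean $\mu$ and variance $\sigma^2$ gives

$$p(x) = \frac{1}{(2\pi\sigma^2)^{1/2}} \exp\left\{ -\frac{(x-\mu)^2}{2\sigma^2} \right\}, \qquad H[x] = \frac{1}{2}\left\{ 1 + \ln(2\pi\sigma^2) \right\}.$$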
1.6 Information Theory (Cont'd)
• Relative Entropy (Kullback-Leibler Divergence)
• Convex Functions (Jensen's Inequality)
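A small sketch of the definition KL(p‖q) = Σ p(x) ln(p(x)/q(x)) on made-up discrete distributions; Jensen's inequality guarantees it is non-negative, with equality iff p = q:

```python
import numpy as np

# Relative entropy between two discrete distributions; terms with
# p(x) = 0 contribute nothing, so they are masked out.
def kl(p, q):
    p, q = np.asarray(p, float), np.asarray(q, float)
    mask = p > 0
    return np.sum(p[mask] * np.log(p[mask] / q[mask]))

p = [0.5, 0.3, 0.2]
q = [0.2, 0.5, 0.3]
print(kl(p, q))              # > 0 whenever p != q
print(kl(p, p))              # 0.0
print(kl(p, q) == kl(q, p))  # False: KL divergence is not symmetric
```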
1.6 Information Theory (Cont'd)
• Mutual Information
  - I[x, y] = H[x] − H[x|y] = H[y] − H[y|x]
  - If x and y are independent, I[x, y] = 0.
  - The reduction in uncertainty about x by virtue of being told the value of y.
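A sketch computing I[x, y] = H[x] + H[y] − H[x, y] from a made-up joint table; replacing the joint with the outer product of its marginals (the independent case) drives the mutual information to zero, up to floating-point rounding:

```python
import numpy as np

# Entropy in bits of any (possibly multidimensional) probability table.
def H(p):
    p = np.asarray(p, float).ravel()
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

joint = np.array([[0.3, 0.1],
                  [0.1, 0.5]])              # hypothetical p(x, y)
px, py = joint.sum(axis=1), joint.sum(axis=0)
print(H(px) + H(py) - H(joint))             # ~0.26 bits: x and y share information
print(H(px) + H(py) - H(np.outer(px, py)))  # ~0: independent case
```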