Ch 1. Introduction
Pattern Recognition and Machine Learning, C. M. Bishop, 2006. Summarized by J.W. Ha, Biointelligence Laboratory, Seoul National University, http://bi.snu.ac.kr/
Contents
• 1.4 The Curse of Dimensionality
• 1.5 Decision Theory
• 1.6 Information Theory
1.4 The Curse of Dimensionality
• The High-Dimensionality Problem
  - Ex. mixture of oil, water, and gas flowing in a pipe
  - 3 classes (homogeneous, annular, laminar), 12 input variables
  - Scatter plot of x6 and x7; predict the class of a new point x
  - Simple, naïve approach: divide the input space into cells and take a majority vote within each cell
1.4 The Curse of Dimensionality (Cont'd)
• The Shortcomings of the Naïve Approach (see the sketch below)
  - The number of cells increases exponentially with the dimensionality D.
  - A very large training set is needed to keep the cells from being empty.
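A minimal sketch, not from the slides, of why the grid-based classifier breaks down; it assumes M bins along each input axis, so the grid has M**D cells:

```python
# With M bins per axis, the naive grid classifier needs M**D cells,
# so the cell count explodes as the dimensionality D grows.
M = 10  # hypothetical number of bins along each input dimension
for D in (1, 2, 3, 12):  # D = 12 matches the oil-flow example
    print(f"D = {D:2d}: {M**D:,} cells")
# D = 12 already needs 10**12 cells -- far more than any training set can fill.
```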
1.4 The Curse of Dimensionality (Cont'd)
• Polynomial Curve Fitting (Order M)
  - As D increases, the number of coefficients grows proportionally to D^M, not exponentially.
• The Volume of a High-Dimensional Sphere
  - Concentrated in a thin shell near the surface (see the sketch below).
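A small sketch of the shell argument, assuming only that the volume of a D-dimensional sphere scales as r**D, so the fraction of volume in a shell of thickness eps just inside the surface is 1 - (1 - eps)**D:

```python
# Fraction of a D-sphere's volume lying in the outer shell of relative
# thickness eps: 1 - (1 - eps)**D, which tends to 1 as D grows.
eps = 0.01  # shell thickness as a fraction of the radius
for D in (1, 2, 20, 500):
    print(f"D = {D:3d}: outer-shell volume fraction = {1 - (1 - eps)**D:.3f}")
# At D = 500 over 99% of the volume sits in the outermost 1% of the radius.
```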
1.4 The Curse of Dimensionality (Cont'd)
• Gaussian Distribution
  - In high dimensions, the probability mass of a Gaussian concentrates in a thin shell at a particular radius from the mean.
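A hedged sketch of the point the slide's figure made (the figure itself did not survive extraction): samples from a standard D-dimensional Gaussian concentrate at radius roughly sqrt(D), even though the density peaks at the origin.

```python
import numpy as np

# Draw samples from a standard D-dimensional Gaussian and look at the
# distribution of their radii: the mean radius grows like sqrt(D) while
# the relative spread shrinks -- a thin shell.
rng = np.random.default_rng(0)
for D in (1, 10, 100, 1000):
    radii = np.linalg.norm(rng.standard_normal((10_000, D)), axis=1)
    print(f"D = {D:4d}: mean radius {radii.mean():6.2f} "
          f"(~sqrt(D) = {np.sqrt(D):6.2f}), std {radii.std():.2f}")
```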
1.5 Decision Theory
• Making Optimal Decisions
  - Inference step & decision step
  - Select the class with the higher posterior probability
• Minimizing the Misclassification Rate
  - MAP decision rule → minimizes the colored (error) area between the decision regions
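A minimal sketch of the MAP rule; the posterior values here are made up for illustration:

```python
import numpy as np

# Assigning x to the class with the largest posterior p(C_k|x)
# minimizes the probability of misclassification.
posterior = np.array([0.7, 0.2, 0.1])  # hypothetical p(C_k|x), k = 1..3
k = int(np.argmax(posterior))
print(f"assign x to class C{k + 1}")   # -> assign x to class C1
```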
1.5 Decision Theory (Cont'd)
• Minimizing the Expected Loss
  - The damage caused by misclassification differs from class to class.
  - Introduce a loss function (cost function).
  - MAP → minimizing the expected loss
• The Reject Option
  - Threshold θ
  - Reject if the largest posterior probability is below θ.
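A sketch of both rules on this slide. The loss matrix follows Bishop's cancer-screening example; the posterior and threshold values are made up:

```python
import numpy as np

# Decision j minimizes sum_k L[k, j] * p(C_k|x); we reject when the
# largest posterior falls below theta.
L = np.array([[0.0, 1000.0],   # true cancer: deciding "normal" is very costly
              [1.0,    0.0]])  # true normal: deciding "cancer" costs little
posterior = np.array([0.3, 0.7])  # hypothetical p(cancer|x), p(normal|x)
theta = 0.9                       # hypothetical reject threshold

expected_loss = posterior @ L     # expected loss of deciding "cancer", "normal"
print("expected losses:", expected_loss)          # [0.7, 300.0]
if posterior.max() < theta:
    print("reject: largest posterior is below theta")
else:
    print("decide class", int(np.argmin(expected_loss)))
# Note: expected-loss minimization favors "cancer" (loss 0.7) even though
# "normal" has the higher posterior, because missing a cancer is so costly.
```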
1.5 Decision Theory (Cont'd)
• Inference and Decision
  - Three distinct approaches:
  1. Obtain the posterior probabilities via generative models
  2. Obtain the posterior probabilities via discriminative models
  3. Find a discriminant function directly
1.5 Decision Theory (Cont'd)
• Reasons to Compute the Posterior
  1. Minimizing risk
  2. Reject option
  3. Compensating for class priors
  4. Combining models
1.5 Decision Theory (Cont'd)
• Loss Functions for Regression
  - Also applies to a vector of multiple target variables (see the formulas below).
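The slide's formulas did not survive extraction; for reference, the squared-loss case from PRML Section 1.5.5 is

$$\mathbb{E}[L] = \iint \{y(\mathbf{x}) - t\}^2 \, p(\mathbf{x}, t) \, d\mathbf{x} \, dt, \qquad y^*(\mathbf{x}) = \mathbb{E}_t[t \mid \mathbf{x}],$$

and for a target vector $\mathbf{t}$ the minimizer is likewise the conditional expectation $\mathbf{y}^*(\mathbf{x}) = \mathbb{E}_{\mathbf{t}}[\mathbf{t} \mid \mathbf{x}]$.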
1.5 Decision Theory (Cont'd)
• Minkowski Loss
  - A generalization of the squared loss (see the formula below).
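The Minkowski-loss formula was likewise lost in extraction; from PRML Section 1.5.5 it replaces the squared error with a general power q,

$$\mathbb{E}[L_q] = \iint |y(\mathbf{x}) - t|^q \, p(\mathbf{x}, t) \, d\mathbf{x} \, dt,$$

whose minimizer is the conditional mean for $q = 2$, the conditional median for $q = 1$, and the conditional mode as $q \to 0$.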
1.6 Information Theory
• Entropy
  - The noiseless coding theorem states that the entropy is a lower bound on the number of bits needed to transmit the state of a random variable.
  - Higher entropy, larger uncertainty (see the sketch below).
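A minimal sketch of the entropy definition H[x] = -Σ p(x) log2 p(x); the distributions below are made up:

```python
import numpy as np

# Entropy in bits; by convention 0 * log 0 = 0, so zero-probability
# states are dropped before taking the logarithm.
def entropy(p):
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

print(entropy([0.25] * 4))                # 2.0 bits: uniform, most uncertain
print(entropy([0.97, 0.01, 0.01, 0.01]))  # ~0.24 bits: nearly deterministic
```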
1.6 Information Theory (Cont'd)
• Maximum-Entropy Configuration for a Continuous Variable
  - Constraints: normalization, fixed mean, fixed variance
  - Result: the distribution that maximizes the differential entropy is the Gaussian (see the formulas below)
• Conditional Entropy: H[x, y] = H[y|x] + H[x]
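The constraint and result formulas were lost in extraction; from PRML Section 1.6, maximizing the differential entropy subject to normalization and to fixed mean $\mu$ and variance $\sigma^2$ gives

$$p(x) = \frac{1}{(2\pi\sigma^2)^{1/2}} \exp\left\{ -\frac{(x-\mu)^2}{2\sigma^2} \right\}, \qquad H[x] = \frac{1}{2}\left\{ 1 + \ln(2\pi\sigma^2) \right\}.$$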
1.6 Information Theory (Cont'd)
• Relative Entropy (Kullback-Leibler Divergence)
• Convex Functions (Jensen's Inequality)
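A small sketch of the definition KL(p‖q) = Σ p(x) ln(p(x)/q(x)) on made-up discrete distributions; Jensen's inequality guarantees it is non-negative, with equality iff p = q:

```python
import numpy as np

# Relative entropy between two discrete distributions; terms with
# p(x) = 0 contribute nothing, so they are masked out.
def kl(p, q):
    p, q = np.asarray(p, float), np.asarray(q, float)
    mask = p > 0
    return np.sum(p[mask] * np.log(p[mask] / q[mask]))

p = [0.5, 0.3, 0.2]
q = [0.2, 0.5, 0.3]
print(kl(p, q))              # > 0 whenever p != q
print(kl(p, p))              # 0.0
print(kl(p, q) == kl(q, p))  # False: KL divergence is not symmetric
```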
1.6 Information Theory (Cont'd)
• Mutual Information
  - I[x, y] = H[x] − H[x|y] = H[y] − H[y|x]
  - If x and y are independent, I[x, y] = 0.
  - The reduction in uncertainty about x by virtue of being told the value of y.
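A sketch computing I[x, y] = H[x] + H[y] − H[x, y] from a made-up joint table; replacing the joint with the outer product of its marginals (the independent case) drives the mutual information to zero, up to floating-point rounding:

```python
import numpy as np

# Entropy in bits of any (possibly multidimensional) probability table.
def H(p):
    p = np.asarray(p, float).ravel()
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

joint = np.array([[0.3, 0.1],
                  [0.1, 0.5]])              # hypothetical p(x, y)
px, py = joint.sum(axis=1), joint.sum(axis=0)
print(H(px) + H(py) - H(joint))             # ~0.26 bits: x and y share information
print(H(px) + H(py) - H(np.outer(px, py)))  # ~0: independent case
```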