Ch 1. Introduction Pattern Recognition and Machine Learning, C. M. Bishop, 2006.

  1. Ch 1. Introduction. Pattern Recognition and Machine Learning, C. M. Bishop, 2006. Summarized by K.I. Kim, Biointelligence Laboratory, Seoul National University, http://bi.snu.ac.kr/

  2. Contents • 1.1 Example: Polynomial Curve Fitting • 1.2 Probability Theory • 1.2.1 Probability densities • 1.2.2 Expectations and covariances • 1.2.3 Bayesian probabilities • 1.2.4 The Gaussian distribution • 1.2.5 Curve fitting re-visited • 1.2.6 Bayesian curve fitting • 1.3 Model Selection (C) 2006, SNU Biointelligence Lab, http://bi.snu.ac.kr/

  3. Pattern Recognition • Training set {x_1, …, x_N} • Target vector t • Training (learning) phase • Determine the function y(x) • Generalization: correct prediction for new, unseen inputs • Test set • Preprocessing • Feature selection (C) 2006, SNU Biointelligence Lab, http://bi.snu.ac.kr/

  4. Supervised, Unsupervised and Reinforcement Learning • Supervised Learning: with target vector • Classification • Regression • Unsupervised learning: w/o target vector • Clustering • Density estimation • Visualization • Reinforcement learning: maximize a reward • Trade-off between exploration & exploitation (C) 2006, SNU Biointelligence Lab, http://bi.snu.ac.kr/

  5. Example: Polynomial Curve Fitting • N observations x = (x_1, …, x_N)^T with targets t = (t_1, …, t_N)^T • Polynomial model y(x, w) = w_0 + w_1 x + … + w_M x^M = Σ_{j=0}^{M} w_j x^j • Minimizing the error function E(w) = (1/2) Σ_{n=1}^{N} {y(x_n, w) − t_n}^2 (C) 2006, SNU Biointelligence Lab, http://bi.snu.ac.kr/
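
A minimal sketch of this least-squares fit, assuming synthetic sin(2πx) data with Gaussian noise in the spirit of the book's running example; the function name, noise level, and choice of M are illustrative assumptions, not part of the slides.

```python
# Least-squares polynomial curve fitting: minimize E(w) = 0.5 * sum_n (y(x_n, w) - t_n)^2.
import numpy as np

def fit_polynomial(x, t, M):
    """Fit an order-M polynomial to targets t by ordinary least squares."""
    Phi = np.vander(x, M + 1, increasing=True)      # columns x^0, x^1, ..., x^M
    w, *_ = np.linalg.lstsq(Phi, t, rcond=None)
    return w

rng = np.random.default_rng(0)
N, M = 10, 3
x = np.linspace(0, 1, N)
t = np.sin(2 * np.pi * x) + rng.normal(scale=0.3, size=N)   # noisy targets (assumed data)
w = fit_polynomial(x, t, M)
print("fitted coefficients:", w)
```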

  6. Model Selection & Over-fitting (1/2) (C) 2006, SNU Biointelligence Lab, http://bi.snu.ac.kr/

  7. Model Selection & Over-fitting (2/2) • RMS (root-mean-square) error: E_RMS = sqrt(2 E(w*) / N) • Too large a polynomial order M → over-fitting • The more data, the better the generalization • Over-fitting is a general property of maximum likelihood (C) 2006, SNU Biointelligence Lab, http://bi.snu.ac.kr/

  8. Regularization • Regularized error function: Ẽ(w) = (1/2) Σ_{n=1}^{N} {y(x_n, w) − t_n}^2 + (λ/2) ‖w‖^2 • Shrinkage methods • Quadratic regularizer → ridge regression • In neural networks: weight decay (C) 2006, SNU Biointelligence Lab, http://bi.snu.ac.kr/
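
A sketch of the closed-form ridge solution for the same polynomial model; the synthetic data, the value of λ, and the helper name are assumptions made for illustration.

```python
# Ridge regression (quadratic "weight decay" regularizer) in closed form:
#   w = (lam*I + Phi^T Phi)^{-1} Phi^T t  minimizes 0.5*||Phi w - t||^2 + 0.5*lam*||w||^2.
import numpy as np

def fit_polynomial_ridge(x, t, M, lam):
    Phi = np.vander(x, M + 1, increasing=True)
    A = lam * np.eye(M + 1) + Phi.T @ Phi
    return np.linalg.solve(A, Phi.T @ t)

rng = np.random.default_rng(0)
x = np.linspace(0, 1, 10)
t = np.sin(2 * np.pi * x) + rng.normal(scale=0.3, size=10)
# A large M with tiny lam tends to over-fit; increasing lam shrinks the weights.
print(fit_polynomial_ridge(x, t, M=9, lam=np.exp(-18)))
```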

  9. Probability Theory • Example: a red box and a blue box, each containing apples and oranges; a box is picked at random and a fruit is drawn from it • “What is the overall probability that the selection procedure will pick an apple?” • “Given that we have chosen an orange, what is the probability that the box we chose was the blue one?” (C) 2006, SNU Biointelligence Lab, http://bi.snu.ac.kr/

  10. Rules of Probability (1/2) • Joint probability p(X, Y): probability that X and Y occur together • Marginal probability p(X): obtained by summing the joint probability over Y • Conditional probability p(Y | X): probability of Y given that X has occurred (C) 2006, SNU Biointelligence Lab, http://bi.snu.ac.kr/

  11. Rules of Probability (2/2) • Sum rule: p(X) = Σ_Y p(X, Y) • Product rule: p(X, Y) = p(Y | X) p(X) • Bayes’ theorem: p(Y | X) = p(X | Y) p(Y) / p(X), i.e. posterior = likelihood × prior / normalizing constant (C) 2006, SNU Biointelligence Lab, http://bi.snu.ac.kr/
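
A short numeric check of the sum and product rules on the box-and-fruit question from slide 9. The counts (red box: 2 apples, 6 oranges; blue box: 3 apples, 1 orange; p(red) = 0.4, p(blue) = 0.6) are the ones used in Bishop's example and are assumptions as far as these slides go.

```python
# Sum rule and Bayes' theorem on the box/fruit example (numbers assumed from Bishop, Sec. 1.2).
p_box = {"red": 0.4, "blue": 0.6}                      # prior p(B)
p_fruit_given_box = {                                  # likelihood p(F | B)
    "red":  {"apple": 2 / 8, "orange": 6 / 8},
    "blue": {"apple": 3 / 4, "orange": 1 / 4},
}

# Sum rule: p(F = orange) = sum_B p(F = orange | B) p(B)
p_orange = sum(p_fruit_given_box[b]["orange"] * p_box[b] for b in p_box)

# Bayes' theorem: p(B = blue | F = orange)
p_blue_given_orange = p_fruit_given_box["blue"]["orange"] * p_box["blue"] / p_orange

print(p_orange)             # 0.45
print(p_blue_given_orange)  # 1/3, so p(B = red | F = orange) = 2/3
```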

  12. Probability densities • Probability of x falling in an interval: p(x ∈ (a, b)) = ∫_a^b p(x) dx • Nonnegativity and normalization: p(x) ≥ 0, ∫ p(x) dx = 1 • Cumulative distribution function: P(z) = ∫_{−∞}^{z} p(x) dx (C) 2006, SNU Biointelligence Lab, http://bi.snu.ac.kr/

  13. Expectations and Covariances • Expectation: E[f] = Σ_x p(x) f(x) (discrete) or ∫ p(x) f(x) dx (continuous) • Variance: var[f] = E[(f(x) − E[f(x)])^2] = E[f(x)^2] − E[f(x)]^2 • Covariance: cov[x, y] = E_{x,y}[(x − E[x])(y − E[y])] = E_{x,y}[xy] − E[x] E[y] (C) 2006, SNU Biointelligence Lab, http://bi.snu.ac.kr/
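
A quick sketch of the finite-sample approximations E[f] ≈ (1/N) Σ_n f(x_n) implied by these definitions; the Gaussian sampling distribution and the linear relation between x and y are purely illustrative assumptions.

```python
# Sample-based estimates of expectation, variance, and covariance (sketch).
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(loc=1.0, scale=2.0, size=100_000)   # samples of x
y = 0.5 * x + rng.normal(size=100_000)             # y correlated with x

print(np.mean(x))          # E[x]      ~ 1.0
print(np.var(x))           # var[x]    ~ 4.0
print(np.cov(x, y)[0, 1])  # cov[x, y] ~ 0.5 * var[x] = 2.0
```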

  14. Bayesian Probabilities - Frequentist vs. Bayesian • Likelihood: p(D | w) • Frequentist view • w: a fixed parameter whose value is determined by an ‘estimator’ • Maximum likelihood: error function = −ln p(D | w) • Error bars: obtained from the distribution of possible data sets • e.g. the bootstrap • Bayesian view • There is only the single observed data set D • The uncertainty in the parameters w is expressed as a probability distribution over w • Allows the inclusion of prior knowledge • Noninformative prior when little is known (C) 2006, SNU Biointelligence Lab, http://bi.snu.ac.kr/

  15. Bayesian Probabilities - Expansion of Bayesian Application • The full Bayesian procedure dates from the 18th century, but its practical application was long limited • It requires marginalization over the whole of parameter space • Markov chain Monte Carlo: sampling methods, practical for small-scale problems • Highly efficient deterministic approximation schemes for large-scale problems • e.g. variational Bayes, expectation propagation (C) 2006, SNU Biointelligence Lab, http://bi.snu.ac.kr/

  16. Gaussian distribution • Univariate: N(x | μ, σ^2) = (2πσ^2)^{−1/2} exp{ −(x − μ)^2 / (2σ^2) } • D-dimensional multivariate: N(x | μ, Σ) = (2π)^{−D/2} |Σ|^{−1/2} exp{ −(1/2)(x − μ)^T Σ^{−1} (x − μ) } (C) 2006, SNU Biointelligence Lab, http://bi.snu.ac.kr/

  17. Gaussian distribution - Example (1/2) • Estimating the unknown parameters μ and σ^2 from observations x = (x_1, …, x_N)^T • Data points are i.i.d., so the likelihood is p(x | μ, σ^2) = Π_{n=1}^{N} N(x_n | μ, σ^2) • Maximizing with respect to μ → sample mean: μ_ML = (1/N) Σ_{n=1}^{N} x_n • Maximizing with respect to the variance → sample variance: σ^2_ML = (1/N) Σ_{n=1}^{N} (x_n − μ_ML)^2 (C) 2006, SNU Biointelligence Lab, http://bi.snu.ac.kr/

  18. Gaussian distribution - Example (2/2) • Bias phenomenon: the ML variance estimate is biased, E[σ^2_ML] = ((N − 1)/N) σ^2, so it systematically underestimates the true variance • This bias is a limitation of the maximum likelihood approach and lies at the root of the over-fitting problem (C) 2006, SNU Biointelligence Lab, http://bi.snu.ac.kr/
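
A quick simulation of the bias result: averaging σ^2_ML over many data sets of size N drawn from a known Gaussian should give roughly ((N − 1)/N) σ^2 rather than σ^2. The true parameters and the number of trials are arbitrary choices for the sketch.

```python
# Simulating the bias of the maximum-likelihood variance estimate (sketch).
import numpy as np

rng = np.random.default_rng(0)
mu_true, sigma2_true = 0.0, 4.0
N, trials = 5, 200_000

samples = rng.normal(mu_true, np.sqrt(sigma2_true), size=(trials, N))
mu_ml = samples.mean(axis=1, keepdims=True)
sigma2_ml = ((samples - mu_ml) ** 2).mean(axis=1)   # divides by N, not N - 1

print(sigma2_ml.mean())             # close to 3.2
print((N - 1) / N * sigma2_true)    # 3.2, the predicted (biased) expectation
```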

  19. Curve Fitting Re-visited (1/2) • Goal in the curve fitting problem: prediction for the target variable t given some new input variable x • Express the uncertainty over t with a Gaussian: p(t | x, w, β) = N(t | y(x, w), β^{−1}), where the precision β is the inverse variance • Determine the unknown w and β by maximum likelihood (C) 2006, SNU Biointelligence Lab, http://bi.snu.ac.kr/

  20. Curve Fitting Re-visited (2/2) • Maximizing the likelihood with respect to w = minimizing the sum-of-squares error function, giving w_ML • Maximizing with respect to β: 1/β_ML = (1/N) Σ_{n=1}^{N} {y(x_n, w_ML) − t_n}^2 • Predictive distribution: p(t | x, w_ML, β_ML) = N(t | y(x, w_ML), β_ML^{−1}) (C) 2006, SNU Biointelligence Lab, http://bi.snu.ac.kr/
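
A sketch joining the pieces: fit w_ML by least squares as on slide 5, estimate β_ML from the residuals, and report the predictive mean and variance at a new input. The data and the query point are illustrative assumptions.

```python
# Maximum-likelihood curve fit with its Gaussian predictive distribution (sketch).
import numpy as np

rng = np.random.default_rng(0)
N, M = 10, 3
x = np.linspace(0, 1, N)
t = np.sin(2 * np.pi * x) + rng.normal(scale=0.3, size=N)

Phi = np.vander(x, M + 1, increasing=True)
w_ml, *_ = np.linalg.lstsq(Phi, t, rcond=None)       # minimizes the sum-of-squares error
residuals = Phi @ w_ml - t
beta_ml = 1.0 / np.mean(residuals ** 2)              # 1/beta_ML = mean squared residual

x_new = np.array([0.35])                             # assumed query point
mean = np.vander(x_new, M + 1, increasing=True) @ w_ml
print(mean[0], 1.0 / beta_ml)                        # predictive mean y(x_new, w_ML) and variance 1/beta_ML
```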

  21. Maximum Posterior (MAP) • Add a prior probability over w: p(w | α) = N(w | 0, α^{−1} I), with hyperparameter α (the prior precision) • Maximizing the posterior p(w | x, t, α, β) is equivalent to minimizing the regularized sum-of-squares error of (1.4) with λ = α/β (C) 2006, SNU Biointelligence Lab, http://bi.snu.ac.kr/

  22. Bayesian Curve Fitting • Marginalization over w: the predictive distribution is p(t | x, x_train, t_train) = ∫ p(t | x, w) p(w | x_train, t_train) dw • With the Gaussian likelihood and prior used here, this integral is itself Gaussian, N(t | m(x), s^2(x)), with input-dependent mean and variance (C) 2006, SNU Biointelligence Lab, http://bi.snu.ac.kr/
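
A compact sketch of the closed-form Bayesian predictive distribution for the polynomial model, assuming a Gaussian prior N(w | 0, α^{−1} I) and known precisions α and β; the specific values of α, β, M and the synthetic data are illustrative assumptions only.

```python
# Bayesian curve fitting for the polynomial model (sketch):
#   S^{-1} = alpha*I + beta * sum_n phi(x_n) phi(x_n)^T
#   m(x)   = beta * phi(x)^T S sum_n phi(x_n) t_n        (predictive mean)
#   s2(x)  = 1/beta + phi(x)^T S phi(x)                  (predictive variance)
import numpy as np

def bayesian_predict(x_train, t_train, x_new, M, alpha, beta):
    Phi = np.vander(x_train, M + 1, increasing=True)              # rows are phi(x_n)^T
    S = np.linalg.inv(alpha * np.eye(M + 1) + beta * Phi.T @ Phi)
    phi_new = np.vander(np.atleast_1d(x_new), M + 1, increasing=True)
    mean = beta * phi_new @ S @ Phi.T @ t_train
    var = 1.0 / beta + np.sum(phi_new @ S * phi_new, axis=1)
    return mean, var

rng = np.random.default_rng(0)
x = np.linspace(0, 1, 10)
t = np.sin(2 * np.pi * x) + rng.normal(scale=0.3, size=10)
m, s2 = bayesian_predict(x, t, x_new=0.5, M=9, alpha=5e-3, beta=11.1)  # alpha, beta assumed
print(m, s2)
```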

  23. Model Selection • Proper model complexity → good generalization and the best model • Measuring generalization performance • If data are plentiful, divide them into training, validation and test sets • Otherwise, cross-validate • Leave-one-out technique as the extreme case (number of folds S = N) • Drawbacks • The number of training runs grows with the number of folds → expensive computation • A single model may have several complexity parameters, so exploring all combinations may require exponentially many runs • Alternative measures of performance that penalize complexity directly • e.g. Akaike information criterion (AIC), Bayesian information criterion (BIC) (C) 2006, SNU Biointelligence Lab, http://bi.snu.ac.kr/
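
A sketch of S-fold cross-validation for choosing the polynomial order M, with leave-one-out as the special case S = N; the mean-squared-error score and the synthetic data are assumptions of the sketch.

```python
# S-fold cross-validation to select the polynomial order M (sketch).
import numpy as np

def cv_error(x, t, M, S):
    """Average held-out squared error over S folds; S = len(x) gives leave-one-out."""
    folds = np.array_split(np.arange(len(x)), S)
    errors = []
    for held_out in folds:
        train = np.setdiff1d(np.arange(len(x)), held_out)
        Phi = np.vander(x[train], M + 1, increasing=True)
        w, *_ = np.linalg.lstsq(Phi, t[train], rcond=None)
        pred = np.vander(x[held_out], M + 1, increasing=True) @ w
        errors.append(np.mean((pred - t[held_out]) ** 2))
    return float(np.mean(errors))

rng = np.random.default_rng(0)
x = rng.uniform(0, 1, 30)
t = np.sin(2 * np.pi * x) + rng.normal(scale=0.3, size=30)
best_M = min(range(10), key=lambda M: cv_error(x, t, M, S=len(x)))  # leave-one-out
print("selected polynomial order:", best_M)
```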
