This course outline covers principles of generalization, a survey of classifiers, the imagery → representation → classifier prediction pipeline, and a discussion of Rosch, along with topics such as bias and variance errors, overfitting, and the need for validation sets to reduce variance in models. It discusses the impact of features on bias and variance, how to measure complexity using VC dimension, and strategies for reducing variance through model parameterization and regularization. The course also covers classification methods, including generative, discriminative, ensemble, instance-based, and unsupervised methods, highlighting their objectives, parameterization, training process, and inference. References include Tom Mitchell's "Machine Learning" and Christopher Bishop's "Neural Networks for Pattern Recognition" for further reading.
Classification Derek Hoiem CS 598, Spring 2009 Jan 27, 2009
Outline • Principles of generalization • Survey of classifiers • Project discussion • Discussion of Rosch
Pipeline for Prediction: Imagery → Representation → Classifier → Predictions
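This pipeline can be sketched in code. Below is a minimal illustration using scikit-learn's Pipeline; the library choice and the toy two-feature extractor are assumptions, not part of the slides.

```python
# Sketch: imagery -> representation -> classifier -> predictions.
# scikit-learn and the toy two-feature representation are illustrative
# choices, not the lecture's implementation.
import numpy as np
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import FunctionTransformer
from sklearn.svm import LinearSVC

def extract_features(images):
    # Hypothetical representation: per-image mean intensity and variance.
    flat = images.reshape(len(images), -1)
    return np.c_[flat.mean(axis=1), flat.var(axis=1)]

pipeline = Pipeline([
    ("representation", FunctionTransformer(extract_features)),
    ("classifier", LinearSVC()),
])

images = np.random.rand(100, 32, 32)        # stand-in imagery
labels = np.random.randint(0, 2, size=100)  # stand-in labels
pipeline.fit(images, labels)
predictions = pipeline.predict(images)
```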
Bias and Variance [Figure: error vs. model complexity; low complexity gives high bias / low variance, high complexity gives low bias / high variance]
Overfitting • Need validation set • Validation set not same as test set
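A minimal sketch of that protocol, assuming scikit-learn and a hypothetical depth grid: choose model complexity on the validation set, then report the test error only once at the end.

```python
# Sketch: train/validation/test protocol. The validation set picks the
# model; the test set is touched only once, at the end.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=600, random_state=0)
X_train, X_rest, y_train, y_rest = train_test_split(X, y, test_size=0.4,
                                                    random_state=0)
X_val, X_test, y_val, y_test = train_test_split(X_rest, y_rest, test_size=0.5,
                                                random_state=0)

best_depth, best_acc = None, -1.0
for depth in [1, 2, 4, 8, 16]:              # hypothetical complexity grid
    model = DecisionTreeClassifier(max_depth=depth).fit(X_train, y_train)
    acc = model.score(X_val, y_val)         # select on the validation set
    if acc > best_acc:
        best_depth, best_acc = depth, acc

final = DecisionTreeClassifier(max_depth=best_depth).fit(X_train, y_train)
print("test accuracy:", final.score(X_test, y_test))  # report once on test
```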
Bias-Variance View of Features • More compact = lower variance, potentially higher bias • More features = higher variance, lower bias • More independence among features = simpler classifier → lower variance
How to reduce variance • Parameterize model (e.g., linear vs. piecewise)
How to measure complexity? • VC dimension gives an upper bound on the generalization error: with probability $1 - \eta$,

$$\text{test error} \leq \text{training error} + \sqrt{\frac{h\left(\ln\frac{2N}{h} + 1\right) - \ln\frac{\eta}{4}}{N}}$$

where $N$ is the size of the training set and $h$ is the VC dimension.
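As a sanity check, the bound can be evaluated numerically; the values of training error, h, N, and η below are hypothetical.

```python
# Sketch: evaluate the VC generalization bound above for hypothetical values.
import math

def vc_bound(train_error, h, N, eta):
    # With probability 1 - eta:
    #   test error <= train error + sqrt((h*(ln(2N/h) + 1) - ln(eta/4)) / N)
    return train_error + math.sqrt(
        (h * (math.log(2 * N / h) + 1) - math.log(eta / 4)) / N)

print(vc_bound(train_error=0.05, h=10, N=10000, eta=0.05))
```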
How to reduce variance • Parameterize model • Regularize • Increase the number of training examples
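A sketch of the regularization knob, assuming scikit-learn's LogisticRegression (not the lecture's code): smaller C means a stronger L2 penalty, which lowers variance at the cost of some bias.

```python
# Sketch: regularization strength as a variance-reduction knob.
# Smaller C = stronger L2 penalty in scikit-learn's LogisticRegression.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=200, n_features=50, n_informative=5,
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

for C in [0.01, 1.0, 100.0]:   # strong -> weak regularization
    clf = LogisticRegression(C=C, max_iter=1000).fit(X_tr, y_tr)
    print(f"C={C}: train={clf.score(X_tr, y_tr):.2f} "
          f"test={clf.score(X_te, y_te):.2f}")
```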
Effect of Training Size [Figure: error vs. number of training examples]
Risk Minimization • Margins [Figure: two classes (x's and o's) in the (x1, x2) plane, separated with a margin]
Classifiers • Generative methods • Naïve Bayes • Bayesian Networks • Discriminative methods • Logistic Regression • Linear SVM • Kernelized SVM • Ensemble methods • Randomized Forests • Boosted Decision Trees • Instance-based • K-nearest neighbor • Unsupervised • K-means
Components of classification methods • Objective function • Parameterization • Regularization • Training • Inference
Classifiers: Naïve Bayes • Objective • Parameterization • Regularization • Training • Inference [Graphical model: label y with arrows to features x1, x2, x3]
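A from-scratch sketch of the method: the conditional independence of x1, x2, x3 given y (the graphical model above) makes the likelihood factor per feature. The Gaussian form of each P(x_j | y) is an assumption for illustration; the slide does not fix the feature distribution.

```python
# Minimal Gaussian Naive Bayes: features assumed conditionally independent
# given the label, each modeled as a per-class Gaussian (an illustrative
# choice, not the lecture's implementation).
import numpy as np

class GaussianNaiveBayes:
    def fit(self, X, y):
        self.classes = np.unique(y)
        self.prior = np.array([(y == c).mean() for c in self.classes])
        self.mean = np.array([X[y == c].mean(axis=0) for c in self.classes])
        self.var = np.array([X[y == c].var(axis=0) + 1e-9 for c in self.classes])
        return self

    def predict(self, X):
        # argmax over classes of log P(y) + sum_j log P(x_j | y)
        log_lik = -0.5 * (np.log(2 * np.pi * self.var[None])
                          + (X[:, None] - self.mean[None]) ** 2
                          / self.var[None]).sum(axis=-1)
        return self.classes[np.argmax(np.log(self.prior) + log_lik, axis=1)]
```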
Classifiers: Logistic Regression • Objective • Parameterization • Regularization • Training • Inference
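A from-scratch sketch: maximize the conditional log-likelihood by gradient descent, with an optional L2 term for the regularization bullet. The learning rate and step count are arbitrary, and the intercept is omitted for brevity.

```python
# Minimal logistic regression trained by gradient descent on the negative
# log-likelihood; lam adds an optional L2 regularization term.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_logreg(X, y, lr=0.1, steps=1000, lam=0.0):
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        p = sigmoid(X @ w)                      # P(y = 1 | x)
        grad = X.T @ (p - y) / len(y) + lam * w
        w -= lr * grad
    return w
```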
Classifiers: Linear SVM • Objective • Parameterization • Regularization • Training • Inference [Figure: linearly separable x's and o's in the (x1, x2) plane]
Classifiers: Linear SVM • Objective • Parameterization • Regularization • Training • Inference • Needs slack [Figure: x's and o's in the (x1, x2) plane with overlapping classes; no linear separator classifies every point correctly, so slack is needed]
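A sketch of the slack trade-off, assuming scikit-learn's LinearSVC (an illustrative choice): C weights the slack (hinge-loss) penalty, so a small C tolerates more margin violations.

```python
# Sketch: soft-margin linear SVM; C trades margin width against slack.
from sklearn.datasets import make_classification
from sklearn.svm import LinearSVC

X, y = make_classification(n_samples=200, n_features=2, n_informative=2,
                           n_redundant=0, random_state=0)
for C in [0.01, 1.0, 100.0]:   # small C: wide margin, more slack allowed
    svm = LinearSVC(C=C, max_iter=10000).fit(X, y)
    print(f"C={C}: train accuracy={svm.score(X, y):.2f}")
```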
Classifiers: Kernelized SVM • Objective • Parameterization • Regularization • Training • Inference [Figure: 1-D data (x's and o's along x1) that is not linearly separable becomes separable after the mapping x1 → (x1, x1²)]
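The figure's mapping can be reproduced directly; below is a sketch on synthetic 1-D data, using scikit-learn as an illustrative choice. The explicit lift x1 → (x1, x1²) makes the classes linearly separable, and a polynomial kernel performs the same lift implicitly.

```python
# Sketch: data not linearly separable in x1 becomes separable after
# mapping x1 -> (x1, x1^2); a polynomial kernel does this implicitly.
import numpy as np
from sklearn.svm import SVC

x1 = np.linspace(-2, 2, 200)
y = (np.abs(x1) > 1).astype(int)   # o's in the middle, x's on the outside

# Explicit lift: a linear SVM separates the classes in (x1, x1^2).
X_lift = np.c_[x1, x1 ** 2]
print(SVC(kernel="linear").fit(X_lift, y).score(X_lift, y))

# Implicit lift: a degree-2 polynomial kernel on the raw 1-D input.
print(SVC(kernel="poly", degree=2).fit(x1[:, None], y).score(x1[:, None], y))
```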
Classifiers: Decision Trees • Objective • Parameterization • Regularization • Training • Inference [Figure: x's and o's in the (x1, x2) plane partitioned by axis-aligned splits]
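A sketch with tree depth as the regularization knob: deeper trees lower bias but raise variance. The library and synthetic data are illustrative choices.

```python
# Sketch: decision tree depth controls the bias/variance trade-off.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=400, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

for depth in [2, 8, None]:   # None: grow until the leaves are pure
    tree = DecisionTreeClassifier(max_depth=depth, random_state=0).fit(X_tr, y_tr)
    print(f"depth={depth}: train={tree.score(X_tr, y_tr):.2f} "
          f"test={tree.score(X_te, y_te):.2f}")
```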
Ensemble Methods: Boosting [Figure from Friedman et al. 2000]
Boosted Decision Trees [Figure: an ensemble of shallow decision trees with node tests such as "High in Image?", "Gray?", "Smooth?", "Green?", "Many Long Lines?", "Blue?", and "Very High Vanishing Point?", outputting P(label | good segment, data) over the labels Ground, Vertical, and Sky] [Collins et al. 2002]
Boosted Decision Trees • How to control bias/variance trade-off • Size of trees • Number of trees
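A sketch of those two knobs, with scikit-learn's GradientBoostingClassifier standing in for the boosting variants discussed (an illustrative substitution): max_depth sets the size of each tree and n_estimators the number of trees.

```python
# Sketch: tree size (max_depth) and number of trees (n_estimators) are the
# bias/variance knobs of a boosted-tree ensemble.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

for depth in [1, 3]:
    for n in [10, 200]:
        gbt = GradientBoostingClassifier(max_depth=depth, n_estimators=n,
                                         random_state=0).fit(X_tr, y_tr)
        print(f"depth={depth}, trees={n}: test={gbt.score(X_te, y_te):.2f}")
```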
K-nearest neighbor • Objective • Parameterization • Regularization • Training • Inference [Figure: x's and o's in the (x1, x2) plane]
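A from-scratch sketch: there is no training beyond storing the data, and inference is a vote among the k closest points. Euclidean distance and binary labels are simplifying assumptions.

```python
# Minimal k-nearest-neighbor classifier: store the training data, then
# classify each query by majority vote among its k nearest points.
import numpy as np

def knn_predict(X_train, y_train, X_query, k=3):
    # Euclidean distance from every query point to every training point.
    d = np.linalg.norm(X_query[:, None] - X_train[None], axis=-1)
    neighbors = np.argsort(d, axis=1)[:, :k]       # indices of the k nearest
    votes = y_train[neighbors]                     # their labels
    return (votes.mean(axis=1) > 0.5).astype(int)  # majority vote (binary)
```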
Clustering [Figure: two panels in the (x1, x2) plane; unlabeled points (+'s) on the left are grouped into two clusters (x's and o's) on the right]
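A from-scratch sketch of Lloyd's algorithm for k-means, the unsupervised method in the classifier list: alternate between assigning points to the nearest center and recomputing each center as the mean of its points. Initialization and iteration count are arbitrary, and the sketch assumes no cluster goes empty.

```python
# Minimal k-means (Lloyd's algorithm): assign points to the nearest center,
# then move each center to the mean of its assigned points, and repeat.
import numpy as np

def kmeans(X, k, iters=100, seed=0):
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        dists = np.linalg.norm(X[:, None] - centers[None], axis=-1)
        labels = np.argmin(dists, axis=1)
        # Recompute centers (assumes every cluster keeps at least one point).
        centers = np.array([X[labels == j].mean(axis=0) for j in range(k)])
    return labels, centers
```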
References • General • Tom Mitchell, Machine Learning, McGraw Hill, 1997 • Christopher Bishop, Neural Networks for Pattern Recognition, Oxford University Press, 1995 • Adaboost • Friedman, Hastie, and Tibshirani, “Additive logistic regression: a statistical view of boosting”, Annals of Statistics, 2000 • SVMs • http://www.support-vector.net/icml-tutorial.pdf