CS26110 AI Toolbox: Evaluation
Today
• Quick recap
• Evaluating classifiers
Training/test data
[Figure: labelled training instances, e.g. x2 with known f(x2) and x6 (Headache), alongside an unlabelled test instance x7 with f(x7) = ?; the class attribute = decision/diagnosis = target function]
Nearest neighbour algorithm
• 1-NN: Given a test instance xm,
  • First locate the nearest training example xn
  • Then f(xm) := f(xn)
• k-NN: Given a test instance xm,
  • First locate the k nearest training examples
  • If the target function is discrete, take a vote among its k nearest neighbours; else take the mean of the f values of the k nearest neighbours (see the sketch below)
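A minimal sketch of this procedure in Python; the Euclidean distance metric, the function name and the toy data are assumptions for illustration, not part of the slides:

```python
# Sketch of 1-NN / k-NN prediction (toy data is made up).
import math
from collections import Counter

def knn_predict(train_X, train_y, x_test, k=1, discrete=True):
    # Rank training examples by distance to the test instance
    nearest = sorted(range(len(train_X)),
                     key=lambda i: math.dist(train_X[i], x_test))
    neighbours = [train_y[i] for i in nearest[:k]]
    if discrete:
        # Discrete target function: majority vote among the k nearest neighbours
        return Counter(neighbours).most_common(1)[0][0]
    # Continuous target function: mean of the k nearest f values
    return sum(neighbours) / k

# Hypothetical usage
train_X = [[1, 1], [2, 1], [5, 6], [6, 5]]
train_y = ["headache", "headache", "healthy", "healthy"]
print(knn_predict(train_X, train_y, [1.5, 1.2], k=3))  # -> "headache"
```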
Evaluation
• How can we evaluate classifiers?
Evaluation
• Training data
  • Labelled data used to build a classifier
• Test data
  • New data, not used in the training process, used to evaluate how well the classifier performs on unseen data
• Memorization versus generalization
  • Better training accuracy: "memorizing" the training data
  • Better test accuracy: "generalizing" to new data
• In general, we would like our classifier to perform well on new test data, not just on the training data, i.e. we would like it to generalize to new data
Considerations
• Why is training accuracy not good enough?
• Training accuracy is optimistic
  • A classifier like k-NN can construct boundaries that separate all training data points perfectly, but that do not separate new points
  • E.g., what is the training accuracy of k-NN with k = 1? (see the sketch below)
• A classifier can "overfit" the training data
  • In effect it just memorizes the training data, and does not learn the general relationship between x and f(x)
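One quick way to see the answer (1-NN scores 100% on its own training data whenever no two identical instances carry different labels) is to score a 1-NN classifier on the data it was trained on. This sketch assumes scikit-learn is available; the toy data is made up:

```python
# Sketch: training accuracy of 1-NN is trivially perfect on distinct training points.
from sklearn.neighbors import KNeighborsClassifier

X_train = [[1, 1], [2, 1], [5, 6], [6, 5]]   # hypothetical feature vectors
y_train = [0, 0, 1, 1]                        # hypothetical class labels

clf = KNeighborsClassifier(n_neighbors=1).fit(X_train, y_train)
print(clf.score(X_train, y_train))  # 1.0 -- each point is its own nearest neighbour
```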
Hold-out set
• The holdout set method (see the sketch below)
  • Randomly choose, say, 30% of the training data and set it aside
  • Train a classifier with the remaining 70%
  • Test the classifier's accuracy on the held-out 30%
• Easy to compute, but has higher variance (not as good an estimator of future performance)
• Data is wasted
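A minimal sketch of the holdout method, assuming scikit-learn and its bundled iris dataset as stand-ins (the 70/30 split follows the slide; the classifier choice is an assumption):

```python
# Sketch: holdout evaluation with a 70/30 split.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)

# Randomly set aside 30% of the data as the holdout (test) set
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

clf = KNeighborsClassifier(n_neighbors=3).fit(X_train, y_train)
print("Holdout accuracy:", clf.score(X_test, y_test))
```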
Cross-validation
• K-fold cross-validation (CV) (see the sketch below)
  • Randomly divide the training data into K chunks (folds)
  • For each fold:
    • Train a classifier on all training data except that fold
    • Test the classifier's accuracy on that fold
  • K-fold CV accuracy = average of all K accuracies
• One of the most popular methods (usually K = 10)
• Doesn't entirely eliminate overfitting
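A minimal sketch of 10-fold cross-validation, again assuming scikit-learn and its iris dataset rather than any data from the module:

```python
# Sketch: 10-fold cross-validation (K = 10 follows the slide's suggestion).
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
clf = KNeighborsClassifier(n_neighbors=3)

# Train on K-1 folds and test on the remaining fold, once per fold
scores = cross_val_score(clf, X, y, cv=10)
print("Per-fold accuracies:", scores)
print("10-fold CV accuracy:", scores.mean())
```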