CS26110 AI Toolbox: Evaluation
Today
• Quick recap
• Evaluating classifiers
Training/test data
[Figure: labelled training instances, e.g. x2 with known f(x2) and x6 (Headache), alongside an unlabelled test instance x7 with f(x7) = ?; the class attribute = decision/diagnosis = target function]
Nearest neighbour algorithm
• 1-NN: Given a test instance xm,
  • First locate the nearest training example xn
  • Then f(xm) := f(xn)
• k-NN: Given a test instance xm,
  • First locate the k nearest training examples
  • If the target function is discrete, take a vote among its k nearest neighbours; else take the mean of the f values of the k nearest neighbours (see the sketch below)
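A minimal sketch of this procedure in Python; the Euclidean distance metric, the function name and the toy data are assumptions for illustration, not part of the slides:

```python
# Sketch of 1-NN / k-NN prediction (toy data is made up).
import math
from collections import Counter

def knn_predict(train_X, train_y, x_test, k=1, discrete=True):
    # Rank training examples by distance to the test instance
    nearest = sorted(range(len(train_X)),
                     key=lambda i: math.dist(train_X[i], x_test))
    neighbours = [train_y[i] for i in nearest[:k]]
    if discrete:
        # Discrete target function: majority vote among the k nearest neighbours
        return Counter(neighbours).most_common(1)[0][0]
    # Continuous target function: mean of the k nearest f values
    return sum(neighbours) / k

# Hypothetical usage
train_X = [[1, 1], [2, 1], [5, 6], [6, 5]]
train_y = ["headache", "headache", "healthy", "healthy"]
print(knn_predict(train_X, train_y, [1.5, 1.2], k=3))  # -> "headache"
```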
Evaluation
• How can we evaluate classifiers?
Evaluation
• Training data
  • Labelled data used to build a classifier
• Test data
  • New data, not used in the training process, used to evaluate how well the classifier performs on unseen data
• Memorization versus generalization
  • Better training accuracy: "memorizing" the training data
  • Better test accuracy: "generalizing" to new data
• In general, we would like our classifier to perform well on new test data, not just on the training data, i.e. we would like it to generalize to new data
Considerations
• Why is training accuracy not good enough?
• Training accuracy is optimistic
  • A classifier like k-NN can construct boundaries that separate all training data points perfectly, but that do not separate new points
  • E.g., what is the training accuracy of k-NN with k = 1? (see the sketch below)
• A classifier can "overfit" the training data
  • In effect it just memorizes the training data, and does not learn the general relationship between x and f(x)
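One quick way to see the answer (1-NN scores 100% on its own training data whenever no two identical instances carry different labels) is to score a 1-NN classifier on the data it was trained on. This sketch assumes scikit-learn is available; the toy data is made up:

```python
# Sketch: training accuracy of 1-NN is trivially perfect on distinct training points.
from sklearn.neighbors import KNeighborsClassifier

X_train = [[1, 1], [2, 1], [5, 6], [6, 5]]   # hypothetical feature vectors
y_train = [0, 0, 1, 1]                        # hypothetical class labels

clf = KNeighborsClassifier(n_neighbors=1).fit(X_train, y_train)
print(clf.score(X_train, y_train))  # 1.0 -- each point is its own nearest neighbour
```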
Hold-out set
• The holdout set method (see the sketch below)
  • Randomly choose, say, 30% of the training data and set it aside
  • Train a classifier with the remaining 70%
  • Test the classifier's accuracy on the held-out 30%
• Easy to compute, but has higher variance (not as good an estimator of future performance)
• Data is wasted
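A minimal sketch of the holdout method, assuming scikit-learn and its bundled iris dataset as stand-ins (the 70/30 split follows the slide; the classifier choice is an assumption):

```python
# Sketch: holdout evaluation with a 70/30 split.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)

# Randomly set aside 30% of the data as the holdout (test) set
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

clf = KNeighborsClassifier(n_neighbors=3).fit(X_train, y_train)
print("Holdout accuracy:", clf.score(X_test, y_test))
```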
Cross-validation
• K-fold cross-validation (CV) (see the sketch below)
  • Randomly divide the training data into K chunks (folds)
  • For each fold:
    • Train a classifier on all training data except that fold
    • Test the classifier's accuracy on that fold
  • K-fold CV accuracy = average of all K accuracies
• One of the most popular methods (usually K = 10)
• Doesn't entirely eliminate overfitting
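A minimal sketch of 10-fold cross-validation, again assuming scikit-learn and its iris dataset rather than any data from the module:

```python
# Sketch: 10-fold cross-validation (K = 10 follows the slide's suggestion).
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
clf = KNeighborsClassifier(n_neighbors=3)

# Train on K-1 folds and test on the remaining fold, once per fold
scores = cross_val_score(clf, X, y, cv=10)
print("Per-fold accuracies:", scores)
print("10-fold CV accuracy:", scores.mean())
```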