CS 236501 Introduction to AI: Tutorial 9 – ID3
Learning • A Learner receives a Training Set of labeled examples (+/-) and produces a Classifier • The Classifier assigns a label (+/-) to a new, unlabeled example • We aim to produce an accurate classifier Intro. to AI – Tutorial 9 – By Nela Gurevich
Example: Play Tennis • We want to learn the concept: “A good day to play tennis” • Examples to be used for learning: each example consists of attribute values and a classification (label)
Decision Trees • A node represents an attribute; its outgoing edges represent the possible attribute values • Leaves contain classifications • An example tree:
Outlook
├─ Sunny → Humidity
│   ├─ High → NO
│   └─ Normal → YES
├─ Overcast → YES
└─ Rain → Wind
    ├─ Strong → NO
    └─ Weak → YES
(Outlook = Sunny, Temperature = High, Humidity = High, Wind = Weak) → PlayTennis = NO
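The tree above can be sketched in code. This is a hypothetical representation, not from the tutorial: internal nodes are dicts mapping an attribute name to its branches, and leaves are the classifications "YES"/"NO".

```python
# Hypothetical encoding of the PlayTennis tree: a node is
# {attribute: {value: subtree}}; a leaf is a plain label string.
tree = {
    "Outlook": {
        "Sunny": {"Humidity": {"High": "NO", "Normal": "YES"}},
        "Overcast": "YES",
        "Rain": {"Wind": {"Strong": "NO", "Weak": "YES"}},
    }
}

def classify(tree, example):
    """Walk from the root, following the branch matching the example's
    value for each node's attribute, until a leaf is reached."""
    while isinstance(tree, dict):
        attribute, branches = next(iter(tree.items()))
        tree = branches[example[attribute]]
    return tree

example = {"Outlook": "Sunny", "Temperature": "High",
           "Humidity": "High", "Wind": "Weak"}
print(classify(tree, example))  # NO
```

Note that Temperature is simply ignored: classification only consults the attributes that appear on the path taken.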
Building a Decision Tree • Building a decision tree, given a group of labeled examples (training set): • Choose an attribute A • Split the examples according to the values of A (one son per value A = a1, A = a2, A = a3, …) • Build the sons' trees recursively; stop splitting when all examples at a node have the same label
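The recursive procedure above can be sketched as follows. The splitting criterion is left as a parameter `choose` (ID3 plugs in information gain here); the names are illustrative, not from the tutorial.

```python
from collections import Counter

def build_tree(examples, attributes, choose):
    """Sketch of recursive tree building.
    examples: list of (attribute_dict, label) pairs;
    choose(examples, attributes): selects the attribute to split on."""
    labels = [label for _, label in examples]
    if len(set(labels)) == 1:          # all examples share a label: leaf
        return labels[0]
    if not attributes:                 # no attributes left: majority leaf
        return Counter(labels).most_common(1)[0][0]
    a = choose(examples, attributes)   # 1. choose an attribute A
    tree = {a: {}}
    for v in {ex[a] for ex, _ in examples}:
        # 2. split the examples according to the values of A
        subset = [(ex, lab) for ex, lab in examples if ex[a] == v]
        rest = [b for b in attributes if b != a]
        tree[a][v] = build_tree(subset, rest, choose)  # 3. recurse
    return tree
```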
ID3 • ID3 is an algorithm for building decision trees • ID3 uses information gain to select the best attribute for splitting
Decision Trees and ID3 • Here the Learner is ID3, and the Classifier it produces from the Training Set is a decision tree
Information Gain • Before a split, a node mixes positive and negative examples: high uncertainty • A good split produces sons that are each dominated by one class: low uncertainty
Information Gain • Uncertainty (entropy) of a node with p positive and n negative examples:
I(p, n) = −(p / (p + n)) · log2(p / (p + n)) − (n / (p + n)) · log2(n / (p + n))
• Gain of splitting on attribute A:
Gain(A) = I(p, n) − Σi=1..Va (Ei / (p + n)) · I(pi, ni)
• Where • p, n: number of positive/negative examples at the node • I(p, n): uncertainty given p and n • Va: number of possible values of attribute A • Ei: number of examples at son i (pi positive, ni negative) • ID3 chooses the attribute with the highest gain for splitting
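These two formulas can be computed directly. A minimal sketch (function names are illustrative):

```python
from math import log2

def uncertainty(p, n):
    """I(p, n): entropy of a node with p positive and n negative
    examples; 0 when the node is pure."""
    total = p + n
    result = 0.0
    for count in (p, n):
        if count:                     # 0 * log2(0) is taken as 0
            q = count / total
            result -= q * log2(q)
    return result

def gain(p, n, sons):
    """Information gain of a split; `sons` is a list of (pi, ni)
    pairs, one per son, so Ei = pi + ni."""
    total = p + n
    remainder = sum((pi + ni) / total * uncertainty(pi, ni)
                    for pi, ni in sons)
    return uncertainty(p, n) - remainder

print(gain(3, 3, [(3, 0), (0, 3)]))   # a perfect split: gain = 1.0
```

A split that leaves each son pure removes all uncertainty, so its gain equals the full entropy of the parent node.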
Attribute Types: An Attribute with Discrete Values • The domain of attribute A is discrete, e.g. DomainA = {blue, green, yellow} • Splitting is simple: create a son for each possible value of A
Attribute Types: An Attribute with Continuous Values • The domain of attribute A is continuous, e.g. DomainA = [1, 100] • How to split? • Suggestion: make the domain discrete, e.g. DomainA = {1–30, 30–40, 40–100} • Problems: • Which discretization is good? • We will not be able to distinguish between examples in the same range • Example: if A represents grades, there will be no difference between students with grades within the range 40–100
An Attribute with Continuous Values • A solution: dynamic split • Sort the examples according to the values of attribute A • For each possible value xi ∈ Domain(A): • Try to split into 2 sons: ≤ xi and > xi • Measure the information gain of the split • An example: a good temperature for playing tennis is between 20 and 28 degrees
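The dynamic-split loop above can be sketched like this. The scoring function is passed in as `gain_of_split` (ID3 would use information gain); names and signatures are illustrative assumptions.

```python
def best_threshold(examples, attribute, gain_of_split):
    """Try every candidate threshold xi of a continuous attribute,
    splitting into <= xi and > xi, and return the (threshold, score)
    pair with the highest score.
    examples: list of (attribute_dict, label) pairs;
    gain_of_split(left, right): scores a two-way partition."""
    values = sorted({ex[attribute] for ex, _ in examples})
    best = (None, float("-inf"))
    for x in values[:-1]:             # splitting above the max is useless
        left = [e for e in examples if e[0][attribute] <= x]
        right = [e for e in examples if e[0][attribute] > x]
        score = gain_of_split(left, right)
        if score > best[1]:
            best = (x, score)
    return best
```

In a full implementation this would be called once per continuous attribute, and the winning (attribute, threshold) pair would compete against the discrete attributes on gain.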
Attribute Types: An Important Note • Let ATTRIB = {A1, A2, …, An} be the group of attributes available for splitting at the current node • Let Ai be the attribute chosen for the split • If the domain of Ai is discrete, we will choose from ATTRIB \ {Ai} for splitting at the sons • If the domain of Ai is continuous, we will choose from ATTRIB for splitting at the sons (the same attribute may usefully be split again at a different threshold)
The Accuracy of a Classifier • We aim to produce an accurate classifier • How can we measure the accuracy of a classifier produced by our algorithm? • We could know the true accuracy of the classifier by testing it on all possible examples, but this is usually impossible • We can get an estimate of the classifier's accuracy by testing it on a subset of all possible examples
Estimating the Accuracy of a Classifier • Assume that we have a labeled set of examples T • We can split T: • Use k% of T as a training set • Use the remaining (100 − k)% of T for testing • The accuracy of the classifier on the test set provides an estimate of the true accuracy • Note: it is important that the training set and the test set do not overlap; otherwise the accuracy estimate can be too optimistic
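The holdout split described above is a few lines of code. A minimal sketch, assuming T is simply a Python list of labeled examples:

```python
import random

def holdout_split(labeled, k):
    """Shuffle the labeled set T, then use k% of it as the training
    set and the remaining (100 - k)% as the test set. The two sets
    are disjoint by construction, as the note above requires."""
    data = labeled[:]                 # copy so the caller's list survives
    random.shuffle(data)
    cut = len(data) * k // 100
    return data[:cut], data[cut:]     # (training set, test set)
```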
Cross Validation • A common method for estimating the accuracy of a classifier by splitting the labeled data into non-overlapping training and test sets • N-fold cross validation: • Split the labeled data into N distinct groups • Run N experiments; in each: • Use N − 1 groups of examples for learning (training set) • Use the remaining group for testing • Average the results of the N experiments: this is the accuracy estimate
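The N experiments can be sketched as follows. The learner and evaluation are abstracted into a single `train_and_test` callback (a hypothetical stand-in for running ID3 and measuring accuracy):

```python
def cross_validate(labeled, n, train_and_test):
    """N-fold cross validation: split the labeled data into N distinct
    groups, run N experiments (train on N - 1 groups, test on the
    held-out group), and average the N accuracies.
    train_and_test(train, test) is assumed to return the accuracy of
    a classifier learned on `train` and tested on `test`."""
    folds = [labeled[i::n] for i in range(n)]     # N disjoint groups
    accuracies = []
    for i in range(n):
        test = folds[i]
        train = [ex for j, fold in enumerate(folds) if j != i
                 for ex in fold]
        accuracies.append(train_and_test(train, test))
    return sum(accuracies) / n
```

Every example is used for testing exactly once and for training N − 1 times, so the estimate uses all the labeled data without ever testing on a training example.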
An Example: 5-Fold Cross Validation (Figure: the labeled data is split into 5 groups; in each of the 5 experiments a different group serves as the test set and the other 4 groups form the training set)
Learning Curves • Show the accuracy of the produced classifier as a function of the training set size • In simple words: show how classification accuracy behaves when learning with more and more examples • Note that the accuracies should be measured on the same test set, which does not overlap with any of the training sets Intro. to AI – Tutorial 9 – By Nela Gurevich
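Producing the points of such a curve can be sketched as below; `learn` and `accuracy` are hypothetical stand-ins for running ID3 and evaluating the resulting tree.

```python
def learning_curve(train_pool, test_set, sizes, learn, accuracy):
    """For each training-set size, learn a classifier from the first
    `size` examples of the pool and measure it on the SAME fixed test
    set, which must not overlap the pool (as the note above requires).
    Returns a list of (size, accuracy) points."""
    curve = []
    for size in sizes:
        classifier = learn(train_pool[:size])
        curve.append((size, accuracy(classifier, test_set)))
    return curve
```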