Computational Learning An intuitive approach
Human Learning • Objects in world • Learning by exploration and who knows what else • Language • Informal training; inputs may be incorrect • Programming • A couple of examples of loops or recursion • Medicine • See one, do one, teach one • People: few complex examples, informal training, complex behavioral output
Computational Learning • Representation provided • Simple inputs: vectors of values • Simple outputs: e.g. yes or no, a number, a disease • Many examples (thousands to millions) • Quantifiable • Plus: useful, e.g. automatic generation of expert systems
Concerns • Generalization accuracy • Performance on unseen data • Evaluation • Noise and overfitting • Biases of representation • You only find what you look for.
Three Learning Problems • Classification: from known examples create decision procedure to guess class • Patient data -> guess disease • Regression: from known examples create decision procedure to guess real numbers • Stock data -> guess price • Clustering: putting data into “meaningful” groups • Patient Data -> new diseases
Simple data: attribute-value representation • <sex: male, age: 50, smoker: true, blood_pressure: low, … disease: emphysema> = 1 example • sex, age, smoker, etc. are the attributes • male, 50, true, etc. are the values • Only data of this form is allowed.
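As a sketch, one such example might be encoded in Python as a plain dictionary (the dict layout here is illustrative, not an actual Weka/ARFF format):

```python
# One attribute-value example as a plain dict; attribute names follow
# the slide. This layout is illustrative, not a real Weka/ARFF format.
example = {
    "sex": "male",
    "age": 50,
    "smoker": True,
    "blood_pressure": "low",
    "disease": "emphysema",  # the class label to be predicted
}

# A dataset is then just a list of such examples.
dataset = [example]
```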
The Data: squares and circles [figure: scatter plot of labeled squares and circles, with unlabeled query points marked "?"]
Learning a (hyper)-line • Given data • Construct a line: the decision boundary • Usually defined by a normal vector n • Data is on one side if the dot product data · n > 0 • Recall <x1,x2> · <y1,y2> = x1*y1 + x2*y2 • This is what a neuron does (see the sketch below)
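A minimal sketch of this decision rule, assuming a boundary through the origin with an illustrative normal vector n = <1, -1> (which separates points with x > y from the rest):

```python
import numpy as np

def side_of_boundary(x, n):
    """Classify a point by which side of the decision boundary it is on.

    The boundary is the line through the origin with normal vector n;
    x is on the positive side iff the dot product x . n > 0.
    """
    return np.dot(x, n) > 0

# <x1,x2> . <y1,y2> = x1*y1 + x2*y2, as on the slide:
assert np.dot([1, 2], [3, 4]) == 1*3 + 2*4  # = 11

n = np.array([1.0, -1.0])           # illustrative normal vector
print(side_of_boundary([3, 1], n))  # True: (3,1) has x > y
print(side_of_boundary([2, 4], n))  # False: (2,4) does not
```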
1-Nearest Neighbor classification • If x is an example, find its nearest neighbor NN in the data using Euclidean distance • Guess that the class of x is the class of NN • K-nearest neighbor: let the k nearest neighbors vote • Called IB-k in Weka
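A minimal 1-NN sketch under those definitions (the toy data points are made up for illustration):

```python
import numpy as np

def nearest_neighbor_classify(x, data, labels):
    """1-NN: guess that the class of x is the class of its nearest
    neighbor in the data under Euclidean distance."""
    dists = np.linalg.norm(data - x, axis=1)
    return labels[np.argmin(dists)]

# Toy labeled data (illustrative only).
data = np.array([[3, 1], [2, 4], [5, 2], [1, 3]])
labels = np.array(["+", "-", "+", "-"])
print(nearest_neighbor_classify(np.array([4, 2]), data, labels))  # "+"
```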
Neural Net • A single perceptron can't learn some simple concepts, like XOR • A multilayered network of perceptrons can represent any Boolean function • The learning rule is not biological; it follows from multivariable calculus (gradient descent)
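To make the XOR point concrete, here is a sketch of a two-layer network that computes XOR with hand-set weights (no single-layer perceptron can do this; the weights shown are one choice of many):

```python
def step(z):
    """Heaviside step activation used by a classic perceptron unit."""
    return 1 if z > 0 else 0

def xor_net(x1, x2):
    """Two-layer perceptron network computing XOR with hand-set weights.

    Hidden units compute OR and AND; the output fires when OR is true
    but AND is false, since XOR = OR AND NOT(AND).
    """
    h_or = step(x1 + x2 - 0.5)   # fires if x1 OR x2
    h_and = step(x1 + x2 - 1.5)  # fires if x1 AND x2
    return step(h_or - h_and - 0.5)

for a in (0, 1):
    for b in (0, 1):
        print(a, b, "->", xor_net(a, b))  # prints the XOR truth table
```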
Gedanken experiments • Try ML algorithms on imagined data • Example concept: x > y, i.e. data looks like (3,1,+), (2,4,-), etc. • Which algorithms do best? And how well? • Consider the decision boundaries. • My guesses: SMO > Perceptron > NearestN > DT
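A sketch of how such imagined data might be generated (the value range is an assumption; 199 examples matches the next slide):

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed for reproducibility

# Labeled examples of the concept x > y, e.g. (3,1,+), (2,4,-).
# The range [0, 10) is an assumption; 199 examples matches the
# experiment on the next slide.
X = rng.uniform(0, 10, size=(199, 2))
y = np.where(X[:, 0] > X[:, 1], "+", "-")
```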
Check Guesses with Weka • 199 examples • DT = 92.9% (called J48 in Weka) • NN = 97.5% (called IB1 in Weka) • SVM = 99.0% (called SMO in Weka)
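For readers without Weka, a rough scikit-learn analogue of this check (an assumption on my part: J48 ≈ a decision tree, IB1 ≈ 1-nearest-neighbor, SMO ≈ a linear-kernel SVM; exact accuracies will differ from the slide's):

```python
# Rough scikit-learn analogue of the Weka experiment; reuses X, y
# from the generation sketch above. Numbers will not match the slide
# exactly -- different implementations and evaluation splits.
from sklearn.tree import DecisionTreeClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

models = {
    "DT (J48-like)": DecisionTreeClassifier(),
    "NN (IB1-like)": KNeighborsClassifier(n_neighbors=1),
    "SVM (SMO-like)": SVC(kernel="linear"),
}
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=10)  # 10-fold cross-validation
    print(f"{name}: {scores.mean():.1%}")
```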