Multivariate linear models for regression and classification Outline: 1) multivariate linear regression 2) linear classification (perceptron) 3) logistic regression
Multivariate Linear Classification The perceptron (lecture 3 on amlbook.com)
Perceptron learning algorithm (PLA) [Figure: decision boundary separating the regions where sign(wTx) is negative and positive] Each iteration pulls the boundary in a direction that tends to correct a misclassification
If the data are linearly separable, iteration terminates when Ein(g) = 0. Otherwise, terminate after a maximum number of iterations. A graphical demo of the Perceptron Learning Algorithm (in Octave) is on AMLbook.com
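The PLA update and both stopping conditions described above can be sketched in Python. This is a minimal illustration, not the course's code; the function name `pla` and the choice of which misclassified point to update on are my own:

```python
import numpy as np

def pla(X, y, max_iter=1000):
    """Perceptron learning algorithm.

    X: (N, d+1) array with x0 = 1 in the first column.
    y: (N,) array of labels in {-1, +1}.
    Returns (w, converged): the weight vector and whether Ein reached 0.
    """
    w = np.zeros(X.shape[1])
    for _ in range(max_iter):
        preds = np.sign(X @ w)
        misclassified = np.where(preds != y)[0]
        if misclassified.size == 0:       # Ein(g) = 0: data separated
            return w, True
        i = misclassified[0]              # pick one misclassified example
        w = w + y[i] * X[i]               # pull boundary toward correcting it
    return w, False                       # hit the maximum iteration count
```

For linearly separable data the convergence theorem guarantees the first branch is eventually taken; otherwise the iteration cap is the only exit.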
Classification for digit recognition Examples of hand-written digits from zip codes
Given a 16x16 pixel image, we could use a model with 257 weights (256 pixels plus x0). Better to develop a small number of attributes (features) from the raw data. Regardless of the attributes chosen, x0 = 1 is included
2-attribute model: intensity and symmetry. Intensity: how much black is in the image. Symmetry: how similar the image is to its mirror image
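One plausible way to compute these two attributes from a 16x16 image. The exact definitions below (mean pixel value for intensity; negative mean absolute difference from the left-right mirror for symmetry) are assumptions for illustration, not taken from the slides:

```python
import numpy as np

def features(img):
    """Map a 16x16 grayscale digit image (values in [0, 1], 1 = black)
    to (intensity, symmetry). Definitions are illustrative choices."""
    intensity = img.mean()                            # how much black overall
    symmetry = -np.abs(img - np.fliplr(img)).mean()   # 0 = perfectly symmetric
    return intensity, symmetry
```

With these definitions, a perfectly symmetric image scores 0 on symmetry and asymmetric images score negative, so "more symmetric" means "larger".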
PLA is unstable: there is no obvious trend toward minimum Ein. Correcting the misclassification of one example may cause others to become misclassified
Pocket algorithm (text p80) keeps the best PLA result obtained thus far. Must evaluate Ein (the number of misclassified examples) after each update to determine whether it is better than the previous best
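A minimal sketch of the Pocket idea: run ordinary PLA updates, but keep ("pocket") the best weight vector seen so far, judged by the number of misclassified examples. The function name and the random choice of misclassified point are my own; the text's version may differ in detail:

```python
import numpy as np

def pocket(X, y, max_iter=1000):
    """Pocket algorithm: PLA updates, but remember the best weights so far."""
    w = np.zeros(X.shape[1])
    best_w = w.copy()
    best_err = np.sum(np.sign(X @ w) != y)
    rng = np.random.default_rng(0)
    for _ in range(max_iter):
        mis = np.where(np.sign(X @ w) != y)[0]
        if mis.size == 0:                    # Ein = 0, nothing left to fix
            return w, 0
        i = rng.choice(mis)                  # a random misclassified example
        w = w + y[i] * X[i]                  # ordinary PLA update
        err = np.sum(np.sign(X @ w) != y)    # evaluate Ein after each update
        if err < best_err:                   # better than previous best?
            best_err, best_w = err, w.copy()
    return best_w, best_err
```

The extra cost relative to plain PLA is exactly the Ein evaluation after every update, which is what the slide warns about.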
PLA & Pocket final results of 1 vs 5 classification • Just like in Assignment 4, multivariate linear regression can be used as an alternative to the PLA and Pocket algorithms. • PLA and the Pocket algorithm require class labels to be ±1. • Multivariate linear regression can use any 2 distinct real numbers r1 and r2.
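The regression-as-classifier alternative can be sketched as follows: fit a least-squares line to targets r1 and r2, then classify by thresholding at the midpoint (r1 + r2)/2. The function name and the defaults r1 = -1, r2 = +1 are illustrative choices:

```python
import numpy as np

def regress_classify(X, y, r1=-1.0, r2=1.0):
    """Fit multivariate linear regression with targets r1/r2 for the two
    classes, then classify by thresholding the fit at the midpoint."""
    targets = np.where(y > 0, r2, r1)             # map classes to r1, r2
    w, *_ = np.linalg.lstsq(X, targets, rcond=None)
    preds = np.where(X @ w > (r1 + r2) / 2.0, 1, -1)
    return w, preds
```

Unlike PLA, this is a one-shot computation: no iteration, and it works for any two distinct real targets, as the slide notes.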
Classification boundaries • yfit(x) = w0 + w1x1 + w2x2 • yfit(x) = (r1 + r2)/2 defines a function of x1 and x2 that is the boundary • Solve this equation for x2 as a function of x1
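Solving w0 + w1 x1 + w2 x2 = (r1 + r2)/2 for x2 gives x2 = ((r1 + r2)/2 - w0 - w1 x1) / w2, valid when w2 is nonzero. A tiny helper for plotting the boundary (the function name and default targets are my own):

```python
def boundary_x2(x1, w, r1=-1.0, r2=1.0):
    """Solve w0 + w1*x1 + w2*x2 = (r1 + r2)/2 for x2 (assumes w2 != 0)."""
    w0, w1, w2 = w
    return ((r1 + r2) / 2.0 - w0 - w1 * x1) / w2
```

Evaluating this at the left and right edges of the scatter plot gives the two endpoints of the boundary line.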
Assignment 5: due 10-21-14 Part 1: Apply the Pocket algorithm (text p80) to 1 vs 5 classification. Two attribute files (intensity, symmetry) are on the class webpage. Use the smaller (424 examples) for training. Use the larger (1561 examples) to get the following results: 1) Etest as percent misclassified 2) Confusion matrix as percentages 3) Scatter plot with boundary (as in slide 12) Part 2: Compare Pocket algorithm results with those obtained by multivariate linear regression. Again, use the smaller set for training and the larger for testing. Report the same results as above. Calculate Etest as the mean of squared residuals by leave-one-out (424 runs). Compare to Etest as the mean of squared residuals calculated from the test set.