Machine Learning

Presentation Transcript


  1. Machine Learning Kenton McHenry, Ph.D. Research Scientist

  2. Raster Images image(234, 452) = 0.58 [Hoiem, 2012]

  3. Neighborhoods of Pixels • For nearby surface points most factors do not change much • Local differences in brightness [Hoiem, 2012]

  4. Features

  5. Feature Descriptors • Shapes • Curves • Color • Mean • Distribution • Texture • Filter banks • Size • Statistics • Neighbors
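
As a rough, hypothetical sketch of a few of the descriptor types listed above, the MATLAB below computes a mean color, a color distribution width, and a crude texture statistic for one image region (the file name region.png and all variable names are made up for illustration):

      I = double(imread('region.png'));             % hypothetical image of a single region
      P = reshape(I, [], 3);                        % one row per pixel, columns = R, G, B
      meanColor = mean(P, 1);                       % 1x3 mean color
      stdColor  = std(P, 0, 1);                     % 1x3 spread of the color distribution
      G = mean(I, 3);                               % grayscale copy for a crude texture measure
      [gx, gy] = gradient(G);                       % local brightness differences
      texture = mean(sqrt(gx(:).^2 + gy(:).^2));    % average gradient magnitude
      descriptor = [meanColor stdColor texture];    % a single 7-element feature descriptor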

  6. Descriptors, Why? • Matching: Match an object in two images based on similar features • 3D Reconstruction: Stereopsis • Tracking: Follow an object in a video by following its features • Object Recognition: Find objects based on known features they possess • Segmentation: Break an image up into more meaningful regions based on the observed features

  7. Object Recognition • Use a collection of features specific to an object to identify it in new images

  8. Object Recognition • Examples! • We have example features from the object we wish to find • We also have example features of stuff that isn’t the object

  9. Object Recognition

  10. Supervised Learning • Labeled data sets • FYI, this is a lot of work! • [GIMP Demo]

  11. Supervised Learning • Labeled data sets • This is a lot of work! • The more images the better (100-1000) • Extract features • Features from labeled portions are positive examples • Features outside labeled portions are negative examples

  12. Machine Learning • Matlab Statistics Toolbox

  13. Decision Trees • Decide whether to wait for a table at a restaurant? • Input is a situation described by a set of properties • Feature Descriptor • Output is a decision • e.g. Yes or No [Lesser, 2010]

  14. Decision Trees • Construct a root node containing all examples • Split nodes that have examples from more than one thing • Choose the attribute that best splits the data • Entropy(S) = -p+ log(p+) - p- log(p-) • Information gain • If a node has examples from only one thing, have it output that thing's label • Greedy algorithm • Not necessarily optimal • But faster [Lesser, 2010]
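
To make the entropy and information-gain bullets concrete, here is a minimal MATLAB sketch (the helper names H and gain are mine, and the labels are assumed to be +1/-1):

      % Entropy(S) = -p+ log2(p+) - p- log2(p-), with eps guarding against log2(0)
      H = @(y) -sum([mean(y==1) mean(y==-1)] .* ...
                    log2(max([mean(y==1) mean(y==-1)], eps)));

      % Information gain of splitting the labels y with a logical index "left"
      gain = @(y, left) H(y) - mean(left)*H(y(left)) - mean(~left)*H(y(~left));

      y  = [1 1 1 -1 -1]';           % toy labels
      x1 = [0.9 0.8 0.7 0.2 0.1]';   % toy values of one attribute
      gain(y, x1 > 0.5)              % gain of the candidate split x1 > 0.5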

  15. Decision Trees [Lesser, 2010]

  16. Decision Trees • [Figure: candidate splits on the Height, Eyes, and Hair attributes] [Lesser, 2010]

  17. Decision Trees • [Figure: tree split first on Hair (Blond / Dark / Red), with Height and Eyes as remaining attributes] [Lesser, 2010]

  18. Decision Trees • [Figure: final tree splitting on Hair (Blond / Red / Dark), then on Eyes (Brown / Blue)] [Lesser, 2010]

  19. Decision Trees • We'll use binary decision trees on continuous valued features • Instead of entropy, select thresholds on each feature and count how many examples each threshold classifies correctly • Pick the best feature/threshold pair (see the sketch below)
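
A sketch of that threshold search (variable names are mine; X is an n-by-d feature matrix and y holds labels +1/-1):

      bestScore = -inf;
      for f = 1:size(X,2)                           % try every feature
          for t = unique(X(:,f))'                   % try every observed value as a threshold
              pred  = 2*(X(:,f) > t) - 1;           % +1 above the threshold, -1 below
              score = max(mean(pred == y), mean(pred ~= y));   % either side may be the positive one
              if score > bestScore
                  bestScore = score; bestFeature = f; bestThreshold = t;
              end
          end
      end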

  20. Supervised Learning
      I = imread('scar1.jpg');                      % example image
      mask = imread('scar1_mask.png');              % hand-labeled mask for the object
      mask = sum(mask, 3);
      mask = mask > 0;                              % convert the mask to logical
      Ir = I(:,:,1);
      Ig = I(:,:,2);
      Ib = I(:,:,3);
      Xp = [Ir(mask) Ig(mask) Ib(mask)];            % positive examples (inside the labeled region)
      Xn = [Ir(~mask) Ig(~mask) Ib(~mask)];         % negative examples (everything else)
      plot3(Xp(:,1), Xp(:,2), Xp(:,3), 'r.');       % positives in red
      axis vis3d;
      hold on;
      plot3(Xn(:,1), Xn(:,2), Xn(:,3), 'b.');       % negatives in blue

  21. Supervised Learning
      X = double([Xp; Xn]);                         % all feature descriptors
      Y = [repmat({'1'},  size(Xp,1), 1);           % labels: '1' for positives,
           repmat({'-1'}, size(Xn,1), 1)];          % '-1' for negatives
      tree = classregtree(X, Y);                    % fit a classification tree
      view(tree);
      Y1 = tree.eval(X);                            % classify the training data itself
      sum(strcmp(Y, Y1)) / length(Y)                % fraction classified correctly

  22. Supervised Learning
      indices = rand(size(Xp,1),1) > 0.5;           % randomly split the positives in half
      Xp1 = Xp(indices,:);
      Xp2 = Xp(~indices,:);
      indices = rand(size(Xn,1),1) > 0.5;           % randomly split the negatives in half
      Xn1 = Xn(indices,:);
      Xn2 = Xn(~indices,:);
      Xtrain = double([Xp1; Xn1]);                  % training set
      Ytrain = [repmat({'1'},  size(Xp1,1), 1);
                repmat({'-1'}, size(Xn1,1), 1)];
      Xtest = double([Xp2; Xn2]);                   % test set
      Ytest = [repmat({'1'},  size(Xp2,1), 1);
               repmat({'-1'}, size(Xn2,1), 1)];

  23. Overfitting • Constructing a classifier too specifically to the training data • Not generalizable • Split your labeled data into two sets • Training set • Test set • Use the test set to verify how well the constructed classifier generalizes to new, unseen data

  24. Supervised Learning
      tree = classregtree(Xtrain, Ytrain);          % fit on the training set only
      view(tree);
      Y = tree.eval(Xtrain);
      sum(strcmp(Y, Ytrain)) / length(Ytrain)       % accuracy on the training set
      Y = tree.eval(Xtest);
      sum(strcmp(Y, Ytest)) / length(Ytest)         % accuracy on unseen test data

  25. Supervised Learning
      tree = classregtree(Xtrain, Ytrain, 'splitmin', 100);   % require 100 examples before splitting a node
      view(tree);
      Y = tree.eval(Xtrain);
      sum(strcmp(Y, Ytrain)) / length(Ytrain)
      Y = tree.eval(Xtest);
      sum(strcmp(Y, Ytest)) / length(Ytest)

  26. Supervised Learning
      tp = sum(and(strcmp(Ytest,'1'),  strcmp(Y,'1')))    % true positives
      fp = sum(and(strcmp(Ytest,'-1'), strcmp(Y,'1')))    % false positives
      tn = sum(and(strcmp(Ytest,'-1'), strcmp(Y,'-1')))   % true negatives
      fn = sum(and(strcmp(Ytest,'1'),  strcmp(Y,'-1')))   % false negatives
      precision = tp / (tp + fp)
      recall = tp / (tp + fn)
      accuracy = (tp + tn) / (tp + tn + fp + fn)

  27. Precision Recall Curves • Plot precision on the y-axis and recall on the x-axis as you alter the classifier's parameters during training (see the sketch below) • Determine ideal parameters • Upper right corner • Compare classifiers • Area under the curve
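
One way to trace such a curve with the tree from the earlier slides, sweeping the 'splitmin' parameter over the training/test split of slide 22 (the particular parameter values are arbitrary):

      splitmins = [5 10 50 100 500 1000];
      precision = zeros(size(splitmins));
      recall    = zeros(size(splitmins));
      for i = 1:length(splitmins)
          tree = classregtree(Xtrain, Ytrain, 'splitmin', splitmins(i));
          Y    = tree.eval(Xtest);
          tp   = sum(and(strcmp(Ytest,'1'),  strcmp(Y,'1')));
          fp   = sum(and(strcmp(Ytest,'-1'), strcmp(Y,'1')));
          fn   = sum(and(strcmp(Ytest,'1'),  strcmp(Y,'-1')));
          precision(i) = tp / (tp + fp);
          recall(i)    = tp / (tp + fn);
      end
      plot(recall, precision, 'o-');
      xlabel('Recall'); ylabel('Precision');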

  28. Well this did pretty good, right? • What must we keep in mind?

  29. Supervised Learning • A wide variety of methods: • Naïve Bayes • Neural Nets • Nearest Neighbors • Support Vector Machines • …

  30. Supervised Learning • A wide variety of methods: • Naïve Bayes • Probability of a class given a feature descriptor • Treat features as independent of one another to make this tractable • Neural Nets • Nearest Neighbors • Support Vector Machines • … http://en.wikipedia.org/wiki/Naive_bayes
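
A hand-rolled sketch of that idea (not the toolbox implementation): a Gaussian naïve Bayes classifier where X is an n-by-d feature matrix and y holds integer class labels 1..K (all names here are assumptions for illustration):

      K = max(y);
      mu = zeros(K, size(X,2)); sigma = zeros(K, size(X,2)); prior = zeros(K,1);
      for k = 1:K
          mu(k,:)    = mean(X(y==k,:), 1);          % per-class, per-feature mean
          sigma(k,:) = std(X(y==k,:), 0, 1);        % per-class, per-feature spread
          prior(k)   = mean(y == k);                % how common the class is
      end
      x = X(1,:);                                   % example query descriptor
      score = zeros(1, K);
      for k = 1:K                                   % sum of independent per-feature log-likelihoods
          score(k) = log(prior(k)) + ...
                     sum(-0.5*((x - mu(k,:))./sigma(k,:)).^2 - log(sigma(k,:)));
      end
      [~, predictedClass] = max(score)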

  31. Supervised Learning • A wide variety of methods: • Naïve Bayes • Neural Nets • Modeled after the human brain • Network of neurons • Perceptron • Binary classifier • Learning • Iteratively adapt weights based on error • Backpropagation for multilayer networks • Iteratively adapt weights based on error • Nearest Neighbors • Support Vector Machines • … http://en.wikipedia.org/wiki/Perceptron
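
A minimal sketch of the perceptron rule mentioned above, iteratively adapting the weights whenever an example is misclassified (X is n-by-d, y holds +1/-1 labels; the pass count is arbitrary):

      w = zeros(1, size(X,2));  b = 0;              % weights and bias
      for pass = 1:10                               % a few passes over the training data
          for i = 1:size(X,1)
              if y(i) * (X(i,:)*w' + b) <= 0        % misclassified (or on the boundary)
                  w = w + y(i) * X(i,:);            % nudge the weights toward this example
                  b = b + y(i);
              end
          end
      end
      predictions = sign(X*w' + b);                 % +1 / -1 predictions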

  32. Supervised Learning • A wide variety of methods: • Naïve Bayes • Neural Nets • Nearest Neighbors • Look at nearby examples in feature space • Support Vector Machines • …
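
Nearest neighbors written out by hand (a sketch; X and numeric labels y are the training data, x is a 1-by-d query descriptor, and k is arbitrary):

      k = 5;
      d = sum((X - repmat(x, size(X,1), 1)).^2, 2); % squared distance to every training example
      [~, order] = sort(d);                         % closest first
      predictedLabel = mode(y(order(1:k)))          % majority vote among the k nearest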

  33. Supervised Learning • A wide variety of methods: • Naïve Bayes • Neural Nets • Nearest Neighbors • Support Vector Machines • Find a plane that divides the data so as to maximize the margin between the different things • … http://en.wikipedia.org/wiki/Support_Vector_Machines
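
If the svmtrain/svmclassify functions shipped with the toolboxes of this era are available (treat the exact call and behavior as an assumption), a linear SVM on the earlier training/test split might look like the sketch below; on whole-image pixel data it can be slow, so subsampling helps:

      svmStruct = svmtrain(Xtrain, Ytrain);         % fit a maximum-margin separating plane
      Ypred     = svmclassify(svmStruct, Xtest);    % classify the held-out examples
      sum(strcmp(Ypred, Ytest)) / length(Ytest)     % test accuracy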

  34. Unsupervised Learning • No labeled data

  35. Unsupervised Learning • No labeled data • Instead find hidden structure in the feature space • No error to evaluate by

  36. Kmeans • Set the number of groups to look for, k • Assume groups are circular • Assume things can only belong to one group • Find center positions for each group so as to minimize the sum of squared distances between each point and its group's center http://en.wikipedia.org/wiki/Kmeans

  37. Kmeans • Solved iteratively • Randomly set the positions (i.e. means) of the k groups • Assign each feature descriptor (i.e. point) to the nearest group • Calculate the mean of the points assigned to each group and use it as the new center position • Repeat
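
Written out as a minimal MATLAB sketch (X is an n-by-d feature matrix; k and the fixed iteration count are arbitrary choices):

      k = 2;
      p = randperm(size(X,1));
      centers = X(p(1:k), :);                       % randomly pick k points as the initial means
      for iteration = 1:20
          for j = 1:k                               % assign each point to its nearest group
              d(:,j) = sum((X - repmat(centers(j,:), size(X,1), 1)).^2, 2);
          end
          [~, assignment] = min(d, [], 2);
          for j = 1:k                               % recompute each center as the mean of its points
              centers(j,:) = mean(X(assignment == j, :), 1);
          end
      end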

  38. Kmeans

  39. Kmeans

  40. Kmeans

  41. Kmeans

  42. Kmeans

  43. Kmeans

  44. Kmeans

  45. Now what? • We can use this to group together similar stuff • Things that may belong together according to the feature space • We can assign new data to the nearest group • We don't know what that group is, but we do know the new data is similar to the group's other members • Groups can be labeled manually
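
Assigning a new feature descriptor to the nearest of the groups found by kmeans (a sketch using the centers matrix returned in the code on the next slide; the query values are made up):

      x = [120 85 60];                              % a hypothetical new RGB feature descriptor
      d = sum((centers - repmat(x, size(centers,1), 1)).^2, 2);   % squared distance to each group center
      [~, nearestGroup] = min(d)                    % index of the most similar group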

  46. Unsupervised Learning
      I = imread('scar1.png');
      Ir = I(:,:,1);
      Ig = I(:,:,2);
      Ib = I(:,:,3);
      X = double([Ir(:) Ig(:) Ib(:)]);              % one RGB feature descriptor per pixel
      [indices, centers] = kmeans(X, 2);            % find 2 groups
      image(uint8(reshape(centers(1,:), 1, 1, 3))); % color of the first group's center
      image(uint8(reshape(centers(2,:), 1, 1, 3))); % color of the second group's center
      indices
      imagesc(reshape(indices==1, size(I,1), size(I,2)));   % pixels assigned to group 1
      imagesc(reshape(indices==2, size(I,1), size(I,2)));   % pixels assigned to group 2

  47. Unsupervised Learning • A wide variety of methods: • Gaussian Mixture Models • Hierarchical Agglomerative Clustering • Principal Component Analysis • …

  48. Unsupervised Learning • A wide variety of methods: • Gaussian Mixture Models • Kmeans is a restricted version of this • Gaussian distributions • No circular group assumption • Data can belong to more than one group • Weighted • Similar algorithm • Assign weights to each point for each group • Calculate each group's mean and standard deviation from its weighted points • Tends to suffer from numerical issues in practice • Hierarchical Agglomerative Clustering • Principal Component Analysis • …
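
With the Statistics Toolbox of this period, gmdistribution.fit provides such a model (a sketch on the same X as the kmeans example; the component count is arbitrary):

      gmm = gmdistribution.fit(X, 2);               % fit a 2-component Gaussian mixture
      p   = posterior(gmm, X);                      % soft, weighted membership in each group
      idx = cluster(gmm, X);                        % hard assignment to the most likely group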

  49. Unsupervised Learning • A wide variety of methods: • Gaussian Mixture Models • Hierarchical Agglomerative Clustering • Create groups as nodes within a tree over the data • Construction • Find the two nearest points and merge them • Create a new point as the average of those points • Place the original points as children of this new node • Remove the original points from future consideration • Repeat • Costly to build • But allows efficient search over large amounts of data • Principal Component Analysis • … http://en.wikipedia.org/wiki/Hierarchical_clustering
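
A sketch with the toolbox's pdist/linkage/cluster functions ('average' linkage is one of several choices; the subsample size is arbitrary, since pdist over every pixel would be enormous):

      p  = randperm(size(X,1));
      Xs = X(p(1:min(1000, size(X,1))), :);         % cluster a random subsample of the descriptors
      Z  = linkage(pdist(Xs), 'average');           % repeatedly merge the nearest points/groups
      dendrogram(Z);                                % the tree built over the data
      groups = cluster(Z, 'maxclust', 2);           % cut the tree into 2 groups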

  50. Unsupervised Learning • A wide variety of methods: • Gaussian Mixture Models • Hierarchical Agglomerative Clustering • Principal Component Analysis • Identify the most significant dimensions of the data • Most likely not lined up with the coordinate axes • Use them as a vocabulary to describe the data • The data may not require as many dimensions • Compression • … http://en.wikipedia.org/wiki/Principal_component_analysis
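
A sketch of PCA by hand, without any particular toolbox call: center the data, then take the singular vectors of its covariance (keeping the top 2 directions is an arbitrary choice):

      Xc = X - repmat(mean(X,1), size(X,1), 1);     % center the data
      [~, S, V] = svd(cov(Xc));                     % columns of V: the most significant directions
      variances = diag(S)                           % how much variance each direction explains
      scores = Xc * V(:, 1:2);                      % re-express the data using only the top 2 directions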
