790-133 Recognizing People, Objects, & Actions Tamara Berg Machine Learning
Announcements • Topic presentation groups posted. Anyone not have a group yet? • Last day of background material • For Monday: object recognition papers will be posted online. Please read!
What is machine learning? • Computer programs that can learn from data • Two key components • Representation: how should we represent the data? • Generalization: the system should generalize from its past experience (observed data items) to perform well on unseen data items.
Types of ML algorithms • Unsupervised • Algorithms operate on unlabeled examples • Supervised • Algorithms operate on labeled examples • Semi/Partially-supervised • Algorithms combine both labeled and unlabeled examples
K-means clustering • Want to minimize the sum of squared Euclidean distances between points x_i and their nearest cluster centers m_k • Algorithm: • Randomly initialize K cluster centers • Iterate until convergence: • Assign each data point to the nearest center • Recompute each cluster center as the mean of all points assigned to it Source: Svetlana Lazebnik
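The two-step iteration above can be sketched in a few lines of Python. This is a minimal illustration, not the course's implementation: the function name `kmeans`, the `max_iters` and `seed` parameters, the choice to initialize centers from randomly sampled data points, and the toy points are all assumptions made for the example.

```python
import random

def squared_distance(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def mean(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))

def kmeans(points, k, max_iters=100, seed=0):
    rng = random.Random(seed)
    # Assumed initialization: use k randomly chosen data points as centers.
    centers = rng.sample(points, k)
    assignments = None
    for _ in range(max_iters):
        # Step 1: assign each data point to the nearest center.
        new_assignments = [
            min(range(k), key=lambda j: squared_distance(p, centers[j]))
            for p in points
        ]
        if new_assignments == assignments:
            break  # converged: no assignment changed
        assignments = new_assignments
        # Step 2: recompute each center as the mean of its assigned points.
        for j in range(k):
            members = [p for p, a in zip(points, assignments) if a == j]
            if members:  # an empty cluster keeps its previous center
                centers[j] = mean(members)
    return centers, assignments

# Toy data (made up): two well-separated groups of three points each.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
centers, assignments = kmeans(points, k=2)
```

Each step can only lower (or keep) the sum of squared distances, so the loop stops once assignments no longer change, though the result may be a local minimum that depends on the initialization.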
Different clustering strategies • Agglomerative clustering • Start with each point in a separate cluster • At each iteration, merge two of the “closest” clusters • Divisive clustering • Start with all points grouped into a single cluster • At each iteration, split the “largest” cluster • K-means clustering • Iterate: assign points to clusters, compute means • K-medoids • Same as k-means, only cluster center cannot be computed by averaging • The “medoid” of each cluster is the most centrally located point in that cluster (i.e., point with lowest average distance to the other points) source: Svetlana Lazebnik
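To make the medoid definition concrete, here is a small sketch that picks the most centrally located point of a cluster. The function name `medoid` and the sample points are assumptions for illustration; Euclidean distance is used here, although k-medoids works with any pairwise distance.

```python
import math

def medoid(cluster):
    """Return the cluster member with the lowest total distance to the others.

    Minimizing the total is equivalent to minimizing the average, because
    every candidate is compared against the same number of other points
    (its distance to itself is zero).
    """
    def total_distance(p):
        return sum(math.dist(p, q) for q in cluster)
    return min(cluster, key=total_distance)

# Made-up cluster with one outlier at x = 10.
cluster = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0), (10.0, 0.0)]
center = medoid(cluster)
```

Unlike the mean, which the outlier pulls to (3.2, 0.0), the medoid is always one of the actual data points, which is why it remains usable when averaging is not defined for the data.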
Example: Image classification • Input: images • Desired output: class labels (apple, pear, tomato, cow, dog, horse) Slide credit: Svetlana Lazebnik
http://yann.lecun.com/exdb/mnist/index.html Slide from Dan Klein
Example: Seismic data [Figure: scatter plot with axes body wave magnitude and surface wave magnitude; classes: earthquakes, nuclear explosions] Slide credit: Svetlana Lazebnik
The basic classification framework y = f(x), where y is the output, f is the classification function, and x is the input • Learning: given a training set of labeled examples {(x_1, y_1), …, (x_N, y_N)}, estimate the parameters of the prediction function f • Inference: apply f to a never-before-seen test example x and output the predicted value y = f(x) Slide credit: Svetlana Lazebnik
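The learning and inference steps can be illustrated with one of the simplest possible prediction functions. The sketch below assumes a nearest-class-mean classifier, chosen only for brevity: the "parameters" are one mean vector per class. The names `learn` and `infer`, the labels, and the data are hypothetical.

```python
from collections import defaultdict

def learn(training_set):
    """Learning: estimate the parameters of f from labeled pairs (x, y).

    Here the parameters are one mean vector per class label.
    """
    by_label = defaultdict(list)
    for x, y in training_set:
        by_label[y].append(x)
    return {
        y: tuple(sum(coords) / len(xs) for coords in zip(*xs))
        for y, xs in by_label.items()
    }

def infer(class_means, x):
    """Inference: apply f to an unseen example x and return the predicted y."""
    def squared_distance(m):
        return sum((a - b) ** 2 for a, b in zip(x, m))
    return min(class_means, key=lambda y: squared_distance(class_means[y]))

# Made-up training set of labeled 2-D examples.
training_set = [
    ((1.0, 1.0), "a"), ((1.0, 2.0), "a"),
    ((5.0, 5.0), "b"), ((6.0, 5.0), "b"),
]
params = learn(training_set)
prediction = infer(params, (5.5, 4.0))  # test example not in the training set
```

The split into two functions mirrors the framework: `learn` sees labels, while `infer` sees only the input and the learned parameters.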
Some ML classification methods • Nearest neighbor (10^6 examples): Shakhnarovich, Viola, Darrell 2003; Berg, Berg, Malik 2005; … • Neural networks: LeCun, Bottou, Bengio, Haffner 1998; Rowley, Baluja, Kanade 1998; … • Support Vector Machines and Kernels: Guyon, Vapnik; Heisele, Serre, Poggio 2001; … • Conditional Random Fields: McCallum, Freitag, Pereira 2000; Kumar, Hebert 2003; … Slide credit: Antonio Torralba
Example: Training and testing • Key challenge: generalization to unseen examples • Training set (labels known) • Test set (labels unknown) Slide credit: Svetlana Lazebnik
Classification by Nearest Neighbor Word vector document classification – here the vector space is illustrated as having 2 dimensions. How many dimensions would the data actually live in? Slide from Min-Yen Kan
Classification by Nearest Neighbor Slide from Min-Yen Kan
Classification by Nearest Neighbor Classify the test document as the class of the document “nearest” to the query document (use vector similarity to find most similar doc) Slide from Min-Yen Kan
Classification by kNN Classify the test document as the majority class of the k documents “nearest” to the query document. Slide from Min-Yen Kan
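A minimal sketch of the kNN rule for documents, assuming each document is already a word-count vector and that "nearest" means highest cosine similarity. The function names, the four-word vocabulary, and the training data are invented for illustration.

```python
import math
from collections import Counter

def cosine_similarity(u, v):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def knn_classify(training_docs, query, k=3):
    """Return the majority class among the k training docs most similar to query."""
    ranked = sorted(
        training_docs,
        key=lambda doc: cosine_similarity(doc[0], query),
        reverse=True,
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

# Hypothetical vocabulary: ("goal", "match", "election", "vote").
training_docs = [
    ((3, 2, 0, 0), "sports"),
    ((2, 3, 0, 1), "sports"),
    ((0, 0, 3, 2), "politics"),
    ((0, 1, 2, 3), "politics"),
    ((1, 0, 2, 2), "politics"),
]
label = knn_classify(training_docs, query=(2, 2, 0, 1), k=3)
```

With k = 1 this reduces to plain nearest-neighbor classification. There is no training step beyond storing the labeled documents; k and the similarity measure are the parameters to choose.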
Classification by kNN What are the features? What’s the training data? Testing data? Parameters? Slide from Min-Yen Kan
NN for vision Fast Pose Estimation with Parameter Sensitive Hashing Shakhnarovich, Viola, Darrell
NN for vision J. Hays and A. Efros, Scene Completion using Millions of Photographs, SIGGRAPH 2007
NN for vision J. Hays and A. Efros, IM2GPS: estimating geographic information from a single image, CVPR 2008
Decision tree classifier Example problem: decide whether to wait for a table at a restaurant, based on the following attributes: • Alternate: is there an alternative restaurant nearby? • Bar: is there a comfortable bar area to wait in? • Fri/Sat: is today Friday or Saturday? • Hungry: are we hungry? • Patrons: number of people in the restaurant (None, Some, Full) • Price: price range ($, $$, $$$) • Raining: is it raining outside? • Reservation: have we made a reservation? • Type: kind of restaurant (French, Italian, Thai, Burger) • WaitEstimate: estimated waiting time (0-10, 10-30, 30-60, >60) Slide credit: Svetlana Lazebnik
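To show how a decision tree consumes these attributes, the sketch below encodes one example as a dictionary and classifies it with a small hand-written tree. The tree is hypothetical, written only to illustrate the mechanics; it is not learned from data. The function name `will_wait` and the example values are likewise assumptions.

```python
def will_wait(example):
    """Walk a hand-written (hypothetical) tree: test one attribute per node."""
    if example["Patrons"] == "None":
        return False
    if example["Patrons"] == "Some":
        return True
    # Patrons == "Full": decide based on the estimated wait.
    if example["WaitEstimate"] == ">60":
        return False
    if example["WaitEstimate"] == "0-10":
        return True
    # Intermediate waits: only worth it if we are hungry.
    return example["Hungry"]

# One made-up example using the attribute names from the list above.
example = {
    "Alternate": True, "Bar": False, "Fri/Sat": False, "Hungry": True,
    "Patrons": "Full", "Price": "$", "Raining": False, "Reservation": False,
    "Type": "Thai", "WaitEstimate": "10-30",
}
decision = will_wait(example)
```

Note that this tree ignores most attributes. A learned tree typically does the same, testing only the attributes that best separate the training examples.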
Decision tree classifier Slide credit: Svetlana Lazebnik
Linear classifier • Find a linear function to separate the classes: f(x) = sgn(w_1 x_1 + w_2 x_2 + … + w_D x_D) = sgn(w · x) Slide credit: Svetlana Lazebnik
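The prediction rule f(x) = sgn(w · x) is short enough to write out directly. In this sketch the weight vector is fixed by hand rather than learned, the names `sgn` and `linear_classify` are illustrative, and mapping a score of exactly zero to +1 is an assumed convention.

```python
def sgn(z):
    """Sign function; a score of exactly 0 is mapped to +1 (assumed convention)."""
    return 1 if z >= 0 else -1

def linear_classify(w, x):
    """Compute f(x) = sgn(w_1 x_1 + ... + w_D x_D) for equal-length w and x."""
    return sgn(sum(w_i * x_i for w_i, x_i in zip(w, x)))

# Hand-picked weights: the decision boundary is the line x_1 = x_2.
w = (1.0, -1.0)
positive = linear_classify(w, (3.0, 1.0))  # w · x = 2.0
negative = linear_classify(w, (1.0, 3.0))  # w · x = -2.0
```

As written, the separating boundary passes through the origin. A common way to allow an offset is to append a constant 1 to every x and a bias term to w.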