
Lazy learning vs. eager learning




1. Lazy learning vs. eager learning
   • Lazy learning delays processing until a new instance must be classified; eager learning builds its model at training time
   • Pros:
     • The classification hypothesis is developed locally for each instance to be classified
   • Cons:
     • Running time (no model is built, so each classification actually builds a local model from scratch)

2. K-Nearest Neighbors
   • Classification of new instances is based on the classifications of (one or more) known instances nearest to them
   • K = 1 → 1-NN (using a single nearest neighbor)
   • Frequently, K > 1
   • Assumption: all instances correspond to points in the n-dimensional space Rⁿ
   • Dimensions = features (aka attributes)

3. Metrics
   • Nearest neighbors are identified using a metric defined for this high-dimensional space
   • Let x be an arbitrary instance with feature vector <f1(x), f2(x), …, fn(x)>
   • The Euclidean metric is frequently used for real-valued features:
     d(x, x') = sqrt( Σ i=1..n ( fi(x) − fi(x') )² )
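For concreteness, here is a minimal Python sketch of the Euclidean metric over such feature vectors; the function name and the plain-list representation of <f1(x), …, fn(x)> are illustrative assumptions, not from the slides.

```python
import math

def euclidean_distance(x, y):
    """Euclidean metric over two real-valued feature vectors of equal length."""
    return math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))

# Example: distance between two 3-dimensional instances
print(euclidean_distance([1.0, 2.0, 3.0], [4.0, 6.0, 3.0]))  # 5.0
```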

4. Pseudo-code for KNN
   • Training algorithm
     • For each training example <x, class(x)>, add the example to the list Training
   • Classification algorithm (Rⁿ → V)
     • Let V = {v1, …, vl} be a set of classes
     • Given a query instance xq to be classified
     • Let X = {x1, …, xk} denote the k instances from Training that are nearest to xq
     • Return the class vi for which |votei| is largest, where votei = {x ∈ X : class(x) = vi}
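The pseudo-code above maps onto a short Python sketch; the function names, the (feature_vector, class) pair representation of training examples, and the default k are assumptions for illustration, not part of the slides.

```python
from collections import Counter
import math

def euclidean_distance(x, y):
    return math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))

def knn_classify(training, xq, k=3):
    """training: list of (feature_vector, class) pairs; xq: query vector.
    Returns the majority class among the k nearest neighbors of xq."""
    # Find the k training examples nearest to the query instance
    neighbors = sorted(training, key=lambda ex: euclidean_distance(ex[0], xq))[:k]
    # Majority vote over the classes of those neighbors
    votes = Counter(cls for _, cls in neighbors)
    return votes.most_common(1)[0][0]

# Usage: three labeled points in R^2, one query point
training = [([0.0, 0.0], "A"), ([0.1, 0.2], "A"), ([5.0, 5.0], "B")]
print(knn_classify(training, [0.2, 0.1], k=3))  # "A"
```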

5. Distance-weighted KNN
   • Weight the contribution of each of the k neighbors according to its distance to the query point xq
     • Give greater weight to closer neighbors, e.g. let each neighbor xj vote with weight 1 / d(xq, xj)²
   • Return the class vi for which wi is largest, where wi is the sum of the weights of the neighbors labeled vi (see the sketch after slide 6)

6. Distance-weighted KNN (cont’d)
   • If xq exactly matches one of the training instances xi, i.e. d(xq, xi) = 0,
     then we simply take class(xi) to be the classification of xq
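A possible sketch covering slides 5 and 6 together, using the common 1/d² weighting and returning class(xi) immediately on an exact match; the weighting choice, names, and data layout are again illustrative assumptions.

```python
from collections import defaultdict
import math

def euclidean_distance(x, y):
    return math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))

def weighted_knn_classify(training, xq, k=3):
    """Distance-weighted k-NN: each neighbor votes with weight 1 / d(xq, xj)^2."""
    neighbors = sorted(training, key=lambda ex: euclidean_distance(ex[0], xq))[:k]
    weights = defaultdict(float)
    for features, cls in neighbors:
        d = euclidean_distance(features, xq)
        if d == 0:
            # Exact match (slide 6): take the class of the matching instance
            return cls
        weights[cls] += 1.0 / d ** 2
    # Return the class vi whose accumulated weight wi is largest
    return max(weights, key=weights.get)

training = [([0.0, 0.0], "A"), ([0.1, 0.2], "A"), ([5.0, 5.0], "B")]
print(weighted_knn_classify(training, [4.0, 4.5], k=3))  # "B"
```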

7. Remarks on KNN
   • Highly effective learning algorithm
   • The distance between instances is calculated based on all features
   • If some features are irrelevant, or redundant, or noisy, then KNN suffers from the curse of dimensionality
   • In such a case, feature selection must be performed prior to invoking KNN

8. Home assignment #4: Feature selection
   • Compare the following algorithms:
     • ID3 – regular ID3 with internal feature selection
     • KNN.all – KNN that uses all the features available
     • KNN.FS – KNN with a priori feature selection by information gain (IG)
   • Two datasets:
     • Spam email
     • Handwritten digits
   • You don’t have to understand the physical meaning of all the coefficients involved!
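One way to read “a priori feature selection (IG)” is to rank features by their information gain against the class labels and keep only the top-ranked ones before running KNN. The sketch below follows that assumption; it also assumes discrete feature values (real-valued features such as pixel intensities would first need to be discretized), and all helper names are mine.

```python
from collections import Counter
import math

def entropy(labels):
    """Shannon entropy of a list of class labels."""
    counts = Counter(labels)
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def information_gain(feature_values, labels):
    """IG of one discrete feature: entropy(labels) minus the weighted entropy
    of the labels after splitting on the feature's values."""
    total = len(labels)
    split_entropy = 0.0
    for value in set(feature_values):
        subset = [lbl for fv, lbl in zip(feature_values, labels) if fv == value]
        split_entropy += (len(subset) / total) * entropy(subset)
    return entropy(labels) - split_entropy

def select_top_features(feature_columns, labels, m):
    """feature_columns: dict mapping feature name -> list of values (one per example).
    Returns the names of the m features with the highest information gain."""
    ranked = sorted(feature_columns.items(),
                    key=lambda kv: information_gain(kv[1], labels),
                    reverse=True)
    return [name for name, _ in ranked[:m]]
```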

9. Cross-validation
   • Averaging the accuracy of a learning algorithm over a number of experiments
   • N-fold cross-validation:
     • Partition the available data D into N disjoint subsets T1, …, TN of equal size (|D| / N)
     • For i from 1 to N do:
       • Training = D \ Ti , Testing = Ti
       • Induce a classifier using Training, test it on Testing, and measure the accuracy Ai
     • Return (Σ Ai) / N (the cross-validated accuracy)
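A minimal sketch of N-fold cross-validation, assuming the learner is wrapped in two user-supplied callables; train_fn and accuracy_fn are illustrative names, not part of the assignment.

```python
import random

def n_fold_cross_validation(data, n_folds, train_fn, accuracy_fn):
    """data: list of labeled examples; train_fn(training) -> classifier;
    accuracy_fn(classifier, testing) -> accuracy in [0, 1].
    Partitions data into n_folds disjoint subsets and averages test accuracy."""
    data = data[:]                 # copy so shuffling does not affect the caller
    random.shuffle(data)
    fold_size = len(data) // n_folds
    accuracies = []
    for i in range(n_folds):
        testing = data[i * fold_size:(i + 1) * fold_size]               # T_i
        training = data[:i * fold_size] + data[(i + 1) * fold_size:]    # D \ T_i
        classifier = train_fn(training)
        accuracies.append(accuracy_fn(classifier, testing))
    return sum(accuracies) / n_folds   # cross-validated accuracy
```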
