
Presentation Transcript


  1. Non Parametric Methods Pattern Recognition and Machine Learning Debrup Chakraborty

  2. Nearest Neighbor classification Given: A labeled sample of n feature vectors (call it X) and a distance measure (say the Euclidean distance). To find: The class label of a given feature vector x which is not in X.

  3. Nearest Neighbor classification (contd.) The NN rule: Find the point y in X which is nearest to x, and assign the label of y to x.
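
A minimal sketch of the NN rule in Python/NumPy, assuming X is an (n, d) array of training vectors with labels y; the function name is illustrative:

```python
import numpy as np

def nn_classify(X, y, x):
    """1-NN rule: assign to x the label of its nearest training point.

    X : (n, d) array of labeled training vectors
    y : (n,)   array of class labels
    x : (d,)   query vector not in X
    """
    # Euclidean distance from x to every training point
    dists = np.linalg.norm(X - x, axis=1)
    # index of the nearest neighbor
    nearest = np.argmin(dists)
    return y[nearest]
```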

  4. Nearest Neighbor classification (contd.) This rule partitions the feature space into cells, each consisting of all points closer to a given training point than to any other training point. All points in such a cell are labeled by the class of that training point. This partitioning is called a Voronoi tessellation.

  5. Nearest Neighbor classification (contd.) [Figure: Voronoi cells in 2-D]

  6. Nearest Neighbor classification Complexity of the NN rule: computing the distance from x to each of the n training points (O(nd) for d-dimensional vectors), and then finding the minimum of these n distances (O(n)).

  7. Nearest Neighbor classification Nearest Neighbor Editing: X = data set, n = number of training points, j = 0. Construct the full Voronoi diagram for X. Do: j = j + 1; find the Voronoi neighbors of x_j; if any neighbor is not from the same class as x_j, then mark x_j. Until j == n. Discard all points that are not marked.
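
One way to realize this editing step in low dimensions is via the Delaunay triangulation, whose edges connect exactly the Voronoi neighbors. The sketch below uses SciPy and assumes a low-dimensional data set X with labels y; the function name is an assumption:

```python
import numpy as np
from scipy.spatial import Delaunay

def voronoi_edit(X, y):
    """Keep only training points that have at least one Voronoi neighbor
    from a different class; the unmarked points are discarded."""
    tri = Delaunay(X)
    keep = np.zeros(len(X), dtype=bool)
    # points sharing a Delaunay simplex are Voronoi neighbors
    for simplex in tri.simplices:
        for i in simplex:
            for j in simplex:
                if y[i] != y[j]:
                    keep[i] = True   # mark x_i: it borders another class
    return X[keep], y[keep]
```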

  8. k nearest neighbor classification Given: A labeled sample of N feature vectors (call it X), a distance measure (say the Euclidean distance), and an integer k (generally odd). To find: The class label of a given feature vector x which is not in X.

  9. k-NN classification (contd.) Algorithm: Find the k nearest neighbors of x in X; call them x_1, ..., x_k. Out of these k samples, let k_i of them belong to class c_i. Choose as the class of x that c_i for which k_i is maximum.
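
A short sketch of this majority-vote rule, with the same X, y, distance setup assumed as before:

```python
import numpy as np
from collections import Counter

def knn_classify(X, y, x, k=3):
    """k-NN rule: majority vote among the k nearest training points."""
    dists = np.linalg.norm(X - x, axis=1)
    # indices of the k nearest neighbors of x in X
    nearest = np.argsort(dists)[:k]
    # choose the class c_i with the largest count k_i
    votes = Counter(y[i] for i in nearest)
    return votes.most_common(1)[0][0]
```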

  10. k-NN Classification [Figure: three classes (Class 1, Class 2, Class 3) and a query point z]

  11. k-NN classification (contd.) Distance weighted nearest neighbor. Training set: pairs (x_i, f(x_i)). Given an instance x to be classified, let x_1, ..., x_k be the k nearest neighbors of x. Return f̂(x) = (Σ_{i=1}^{k} w_i f(x_i)) / (Σ_{i=1}^{k} w_i), where w_i = 1 / d(x, x_i)^2. In case x = x_i, return f(x_i).
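
A hedged sketch of this weighting scheme, assuming real-valued targets f(x_i) stored in an array f; the function name is illustrative:

```python
import numpy as np

def dw_knn(X, f, x, k=3):
    """Distance-weighted k-NN: weight each neighbor by 1 / d(x, x_i)^2.

    X : (n, d) training inputs, f : (n,) target values f(x_i).
    """
    dists = np.linalg.norm(X - x, axis=1)
    nearest = np.argsort(dists)[:k]
    # if x coincides with a training point, return f(x_i) directly
    if dists[nearest[0]] == 0.0:
        return f[nearest[0]]
    w = 1.0 / dists[nearest] ** 2
    return np.dot(w, f[nearest]) / np.sum(w)
```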

  12. Remarks on k-NN classification • The distance-weighted kNN is robust to noisy training data and is quite effective when it is provided a sufficiently large set of training examples. • One drawback of the kNN method is that it defers all computation until a new query point is presented. Various methods have been developed to index the training examples so that the nearest neighbor can be found with less search time. One such indexing method is the kd-tree, developed by Bentley (1975). • kNN is a lazy learner.
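
For instance, SciPy's KDTree builds such an index so that queries avoid a full linear scan; the data, labels, and query point below are made up purely for illustration:

```python
import numpy as np
from scipy.spatial import KDTree

rng = np.random.default_rng(0)
X = rng.random((1000, 2))            # training vectors (illustrative)
y = rng.integers(0, 3, size=1000)    # labels for three classes

tree = KDTree(X)                          # build the index once
dists, idx = tree.query([0.5, 0.5], k=5)  # 5 nearest neighbors of a query
print(y[idx])                             # their class labels, ready for a vote
```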

  13. Locally Weighted Regression • In the linear regression problem, to find h(x) at a point x we would do the following: • Minimize E(w) = (1/2) Σ_j (y_j - w^T x_j)^2 over the training set • Output h(x) = w^T x
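
A minimal sketch of this least-squares fit using NumPy's lstsq; the names (linear_regression, h) and the appended intercept term are assumptions, not the slide's notation:

```python
import numpy as np

def linear_regression(X, y):
    """Fit w minimizing sum_j (y_j - w.x_j)^2 via least squares."""
    # append a constant 1 to each input so w includes an intercept term
    Xb = np.hstack([X, np.ones((len(X), 1))])
    w, *_ = np.linalg.lstsq(Xb, y, rcond=None)
    return w

def h(w, x):
    """Output h(x) = w^T [x, 1]."""
    return np.dot(w, np.append(x, 1.0))
```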

  14. Locally Weighted Regression • In the locally weighted regression problem we would do the following: • Minimize E(w) = (1/2) Σ_j θ_j (y_j - w^T x_j)^2 • Output h(x) = w^T x • A standard choice of weights is θ_j = exp(-||x_j - x||^2 / (2τ^2)) • τ is called the bandwidth parameter
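
A hedged sketch of locally weighted regression under the Gaussian-weight choice above; the names (lwr_predict, tau) and the intercept handling are assumptions:

```python
import numpy as np

def lwr_predict(X, y, x, tau=0.5):
    """Locally weighted regression at query x with Gaussian weights."""
    Xb = np.hstack([X, np.ones((len(X), 1))])
    xb = np.append(x, 1.0)
    # weight each training point by its proximity to the query x
    theta = np.exp(-np.sum((X - x) ** 2, axis=1) / (2.0 * tau ** 2))
    # solve the weighted least-squares problem (normal equations)
    W = np.diag(theta)
    A = Xb.T @ W @ Xb
    b = Xb.T @ W @ y
    w, *_ = np.linalg.lstsq(A, b, rcond=None)
    return np.dot(w, xb)
```

The bandwidth tau controls how quickly the weights decay: a small tau fits only the closest neighbors, a large tau approaches ordinary linear regression.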

  15. Clustering is different from Classification. Classification partitions the feature space, whereas Clustering partitions the data into "homogeneous groups". Clustering is unsupervised!

  16. K-means Clustering Given: A data set X. Fix the number of clusters K. Let v_i^(k) represent the i-th cluster center (prototype) at the k-th iteration, and let S_j^(k) represent the j-th cluster at the k-th iteration.

  17. K-means Clustering Steps • Choose the initial cluster centers v_1^(0), ..., v_K^(0) • At the k-th iterative step, distribute the points of X into the K clusters using: x ∈ S_j^(k) if ||x - v_j^(k)|| ≤ ||x - v_i^(k)|| for all i = 1, ..., K • Compute the new centers v_j^(k+1) as the mean of the points in S_j^(k) • If v_j^(k+1) = v_j^(k) for all j, then the procedure has converged; else repeat from step 2.
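
A compact sketch of these steps in Python; initializing the centers by sampling K training points is an assumption (the slide leaves the initial choice open):

```python
import numpy as np

def kmeans(X, K, max_iter=100, seed=0):
    """Plain K-means: alternate assignment and center update until
    the cluster centers stop changing."""
    rng = np.random.default_rng(seed)
    # step 1: choose initial centers (here: K distinct training points)
    centers = X[rng.choice(len(X), size=K, replace=False)]
    for _ in range(max_iter):
        # step 2: assign each point to its nearest center
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = np.argmin(dists, axis=1)
        # step 3: recompute each center as the mean of its cluster
        # (keep the old center if a cluster happens to be empty)
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(K)
        ])
        # step 4: stop when the centers no longer change
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return centers, labels
```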
