
Pattern Recognition: Statistical and Neural

This lecture gives general comments on the clustering problem and presents small programs that can be used to perform clustering. It reviews the K-Means, hierarchical, and fuzzy C-means clustering algorithms.


Presentation Transcript


  1. Nanjing University of Science & Technology. Pattern Recognition: Statistical and Neural. Lonnie C. Ludeman. Lecture 30, Nov 11, 2005

  2. Lecture 30 Topics: 1. General Comments about the Clustering Problem 2. Present my small programs that can be used for performing clustering 3. Demonstrate the programs 4. Closing Comments

  3. Clustering is the art of grouping together pattern vectors that in some sense belong together because they have similar characteristics and are different from other pattern vectors. In the most general problem the number of clusters or subgroups is unknown, as are the properties that make them similar. Review

  4. Question: How do we start the process of finding clusters and identifying similarities? Answer: First, realize that clustering is an art and that there is no correct answer, only feasible alternatives. Second, explore structures of data, similarity measures, and the limitations of various clustering procedures. Review

  5. Problems in performing meaningful clustering: (1) scaling; (2) the nonuniqueness of results; (3) programs always give clusters, even when there are no clusters. Review

  6. There are no correct answers. The clusters provide us with different interpretations of the data, in which the closeness of patterns is measured with different definitions of similarity. The results may produce ways of looking at the data that we have not considered or noticed. These structural insights may prove useful in the pattern recognition process. Review

  7. Methods for Clustering Quantitative Data 1. K-Means Clustering Algorithm 2. Hierarchical Clustering Algorithm 3. ISODATA Clustering Algorithm 4. Fuzzy Clustering Algorithm Review

  8. K-Means Clustering Algorithm: (1) Randomly select K cluster centers from the pattern space. (2) Distribute the set of patterns to the cluster centers using minimum distance. (3) Compute new cluster centers for each cluster. (4) Continue this process until the cluster centers do not change. Review
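The four steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the LCLKmean.exe program: the function name `k_means` and its parameters are hypothetical, the initial centers are drawn from the patterns themselves (the slide only says "from the pattern space"), and distance is squared Euclidean.

```python
import random

def k_means(patterns, k, max_iter=100, seed=0):
    """Cluster a list of equal-length tuples into k groups (illustrative sketch)."""
    rng = random.Random(seed)
    # Step 1: pick K initial centers. Assumption: K distinct patterns are used.
    centers = rng.sample(patterns, k)
    for _ in range(max_iter):
        # Step 2: assign each pattern to the nearest center (squared Euclidean).
        labels = [
            min(range(k),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(x, centers[j])))
            for x in patterns
        ]
        # Step 3: recompute each center as the mean of its members.
        new_centers = []
        for j in range(k):
            members = [x for x, lab in zip(patterns, labels) if lab == j]
            if members:
                new_centers.append(
                    tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centers.append(centers[j])  # empty cluster keeps its center
        # Step 4: stop when the centers no longer change.
        if new_centers == centers:
            break
        centers = new_centers
    return centers, labels

points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centers, labels = k_means(points, 2)
```

Different seeds can give different initial centers and, on less well separated data, different final clusters, which is the nonuniqueness noted earlier in the review.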

  9. Agglomerative Hierarchical Clustering. Consider a set S of patterns to be clustered: S = { x1, x2, ... , xk, ... , xN }. Define Level N by S1(N) = { x1 }, S2(N) = { x2 }, ... , SN(N) = { xN }. The clusters at Level N are the individual pattern vectors. Review

  10. Define Level N-1 to be the N-1 clusters formed by merging two of the Level N clusters by the following process: compute the distances between all the clusters at Level N and merge the two with the smallest distance (resolve ties randomly). The clusters at Level N-1 that result from this merging are S1(N-1), S2(N-1), ... , SN-1(N-1). Review

  11. The process of merging two clusters at each step is performed sequentially until Level 1 is reached. Level 1 is a single cluster containing all samples: S1(1) = { x1, x2, ... , xk, ... , xN }. Thus hierarchical clustering provides cluster assignments for all numbers of clusters from N to 1. Review
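The level-by-level merging can be sketched as follows. This is a hedged illustration, not the LCLHier.exe program. The slides do not say how the distance between two clusters is measured, so single linkage (distance between the closest pair of members) is assumed here, and ties go to the first pair found rather than being resolved randomly. The name `agglomerative_levels` is hypothetical.

```python
import math

def agglomerative_levels(patterns):
    """Return {level: clusters} for levels N down to 1.

    Each cluster is a list of indices into `patterns`.
    """
    clusters = [[i] for i in range(len(patterns))]  # Level N: singletons
    levels = {len(clusters): [c[:] for c in clusters]}

    def cluster_distance(a, b):
        # Assumption: single linkage, the distance of the closest member pair.
        return min(math.dist(patterns[i], patterns[j]) for i in a for j in b)

    while len(clusters) > 1:
        # Find the pair of clusters with the smallest distance
        # (ties: first pair found, not random as on the slide).
        p, q = min(
            ((p, q) for p in range(len(clusters))
             for q in range(p + 1, len(clusters))),
            key=lambda pq: cluster_distance(clusters[pq[0]], clusters[pq[1]]),
        )
        merged = clusters[p] + clusters[q]
        clusters = [c for idx, c in enumerate(clusters)
                    if idx not in (p, q)] + [merged]
        levels[len(clusters)] = [c[:] for c in clusters]
    return levels

data = [(0, 0), (0, 1), (5, 5), (5, 6), (20, 20)]
levels = agglomerative_levels(data)
# levels[3] holds the three-cluster assignment, levels[1] the single cluster.
```

Swapping `min` for `max` inside `cluster_distance` would give complete linkage instead; the choice changes which clusters merge and is one source of the differing interpretations discussed earlier.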

  12. Fuzzy C-Means Clustering: Preliminary. Given a set S composed of pattern vectors which we wish to cluster, S = { x1, x2, ... , xN }, define C cluster membership functions [membership function definitions not recovered from the slide]. Review

  13. Define C cluster centroids as follows: let Vi be the cluster centroid for fuzzy cluster Cli, i = 1, 2, …, C. Define a performance objective Jm [equation not recovered from the slide]. Review

  14. Definitions: A is a symmetric positive definite matrix; Ns is the total number of pattern vectors; m is the fuzziness index (m > 1), with higher numbers being more fuzzy. The Fuzzy C-Means Algorithm minimizes Jm by selecting the centroids Vi and the cluster membership functions, i = 1, 2, … , C, by an alternating iterative procedure as described in the algorithm's details. Review
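The equations themselves did not survive in this transcript. For orientation, the widely used Bezdek formulation of fuzzy C-means is written out here; it uses the same quantities (A, Ns, m, Vi, C), but whether it matches the slides term for term is an assumption, and the symbol \mu_i for the membership function of cluster Cli is notation chosen for this sketch.

```latex
% Objective (assumed standard form), with the A-weighted norm
J_m = \sum_{i=1}^{C} \sum_{k=1}^{N_s} \left[\mu_i(x_k)\right]^{m} \, \lVert x_k - V_i \rVert_A^{2},
\qquad \lVert x \rVert_A^{2} = x^{T} A \, x,
\qquad \sum_{i=1}^{C} \mu_i(x_k) = 1 \ \text{for every } k

% Alternating updates: centroids for fixed memberships ...
V_i = \frac{\sum_{k=1}^{N_s} \left[\mu_i(x_k)\right]^{m} x_k}{\sum_{k=1}^{N_s} \left[\mu_i(x_k)\right]^{m}}

% ... then memberships for fixed centroids (when no distance is zero)
\mu_i(x_k) = \left[ \sum_{j=1}^{C} \left( \frac{\lVert x_k - V_i \rVert_A}{\lVert x_k - V_j \rVert_A} \right)^{\frac{2}{m-1}} \right]^{-1}
```

As m approaches 1 the memberships approach 0 or 1 (crisp clustering); larger m spreads membership more evenly, which agrees with "higher numbers being more fuzzy".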

  15. Fuzzy C-Means Clustering Algorithm (a) Flow Diagram [flow diagram not reproduced in transcript; its decision branches are labeled Yes / No] Review
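Since the flow diagram is not reproduced, the alternating procedure can be sketched in Python instead. This is a minimal sketch under assumptions, not the LCLFuzz.exe program: it takes A to be the identity matrix (plain Euclidean distance), uses the standard centroid and membership updates, starts from an arbitrary deterministic membership assignment, and stops when no membership changes by more than `tol`. The name `fuzzy_c_means` and its parameters are hypothetical.

```python
import math

def fuzzy_c_means(patterns, c, m=2.0, tol=1e-6, max_iter=200):
    """Alternate centroid and membership updates; return (centroids, memberships).

    memberships[i][k] is the membership of pattern k in fuzzy cluster i.
    Assumes c >= 2, m > 1, and A = identity (Euclidean distance).
    """
    n, dim = len(patterns), len(patterns[0])
    # Arbitrary non-uniform start; each pattern's memberships sum to 1.
    u = [[0.9 if i == k % c else 0.1 / (c - 1) for k in range(n)]
         for i in range(c)]
    centroids = []
    for _ in range(max_iter):
        # Centroid update: mean of the patterns weighted by membership**m.
        centroids = []
        for i in range(c):
            w = [u[i][k] ** m for k in range(n)]
            total = sum(w)
            centroids.append([
                sum(w[k] * patterns[k][d] for k in range(n)) / total
                for d in range(dim)
            ])
        # Membership update from relative distances to all centroids.
        new_u = [[0.0] * n for _ in range(c)]
        for k in range(n):
            dists = [math.dist(patterns[k], v) for v in centroids]
            on_centroid = [i for i in range(c) if dists[i] == 0.0]
            for i in range(c):
                if on_centroid:
                    # Pattern coincides with a centroid: full membership there.
                    new_u[i][k] = 1.0 / len(on_centroid) if i in on_centroid else 0.0
                else:
                    new_u[i][k] = 1.0 / sum(
                        (dists[i] / dists[j]) ** (2.0 / (m - 1.0))
                        for j in range(c))
        change = max(abs(new_u[i][k] - u[i][k])
                     for i in range(c) for k in range(n))
        u = new_u
        if change < tol:  # the "Yes" branch of a convergence test
            break
    return centroids, u

points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centroids, memberships = fuzzy_c_means(points, 2)
# A crisp assignment takes, for each pattern, the cluster of largest membership.
crisp = [max(range(2), key=lambda i: memberships[i][k]) for k in range(len(points))]
```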

  16. General Programs for Performing Clustering 1. Available commercial packages: SPSS, SAS, GPSS 2. Small programs for classroom use: LCLKmean.exe, LCLHier.exe, LCLFuzz.exe

  17. 2. Small programs for classroom use: LCLKmean.exe uses the K-Means algorithm to cluster small data sets; LCLHier.exe performs hierarchical clustering of small data sets; LCLFuzz.exe performs fuzzy and crisp clustering of small data sets.

  18. Data File Format for the LCL Programs (text file): NS = number of data samples; VS = data vector size; DATA = row vectors with a space between components. Example: NS = 5, VS = 3, DATA rows: 1 6 3 / 2 0 5 / 7 1 4 / 6 6 8 / 2 2 3
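A reader for a file in this layout might look like the sketch below. It is not taken from the LCL programs. Whether NS and VS sit on separate lines or share one line is not recoverable from the slide, so the parser splits on any whitespace and accepts either; the name `parse_lcl_data` and the sample text layout are assumptions.

```python
def parse_lcl_data(text):
    """Parse NS, VS, then NS*VS whitespace-separated numbers into row vectors."""
    tokens = text.split()
    ns, vs = int(tokens[0]), int(tokens[1])
    values = [float(t) for t in tokens[2:]]
    if len(values) != ns * vs:
        raise ValueError(f"expected {ns * vs} data values, found {len(values)}")
    # Cut the flat list of values into NS rows of VS components each.
    return [values[r * vs:(r + 1) * vs] for r in range(ns)]

# The example from the slide: 5 samples, each a vector of size 3.
sample = """5
3
1 6 3
2 0 5
7 1 4
6 6 8
2 2 3
"""
rows = parse_lcl_data(sample)
```

The length check catches a file whose NS or VS header disagrees with the amount of data that follows.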

  19. Food for Thought. All the clustering techniques presented so far use a measure of distance or similarity. Many of these give equal-distance contours that are hyperspheres or hyperellipses. If these techniques are used directly on patterns that are not describable by those types of regions, we can expect to obtain poor results.

  20. In some cases each cluster occupies a limited region (a subspace of the total pattern space) described by a nonlinear functional relation between components. [Figure not reproduced in transcript: example of existing pattern vectors.] Standard K-Means, Hierarchical, or Fuzzy cluster analysis applied directly to the data will produce unsatisfactory results.

  21. For this type of problem the patterns should first be preprocessed before a clustering procedure is performed. Two almost contradictory approaches can be used for this processing. 1. Extend the pattern space by techniques comparable to functional link nets so that the clusters can be separated by spherical and elliptical regions. 2. Reduce the dimension of the space by a nonlinear form of processing involving principal-component-like processing before clustering.
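A small synthetic illustration of the first approach: two concentric rings cannot be separated by spherical regions in the original two-dimensional space (both rings have the same mean), but appending the squared radius as an extra component separates them with a single threshold. The ring data, the choice of x^2 + y^2 as the added feature, and the function names are all assumptions made for this sketch; picking that feature already uses knowledge of the data's structure.

```python
import math

def ring(radius, count):
    """Evenly spaced points on a circle centred at the origin (synthetic data)."""
    return [(radius * math.cos(2 * math.pi * i / count),
             radius * math.sin(2 * math.pi * i / count))
            for i in range(count)]

def extend(pattern):
    """Append the squared radius x^2 + y^2 as a third component."""
    x, y = pattern
    return (x, y, x * x + y * y)

inner, outer = ring(1.0, 12), ring(5.0, 12)
extended = [extend(p) for p in inner + outer]

# In the added component the rings sit near 1 and 25, so one threshold
# (an interval, the 1-D analogue of a spherical region) separates them.
threshold = (1.0 + 25.0) / 2
labels = [0 if p[2] < threshold else 1 for p in extended]
```

Running a distance-based clustering method on the added component alone, or on suitably scaled extended vectors, would now recover the two rings.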

  22. Both methods imply that we know additional information about the structure of the data. This additional information may be known to us or it may need to be determined. The process of finding structure within data has been put in the large category of “Data Mining”. So get a shovel and start looking. Good luck in your search for gold in the mounds of practical data.

  23. Several very important topics in Pattern Recognition were not covered in this course because of time limitations. The following topics deserve your special attention to make your educational experience complete: 1. Feature Selection and Extraction 2. Hopfield and feedback neural nets 3. Syntactical Pattern Recognition 4. Special Learning Theory

  24. I would like to thank Nanjing University of Science & Technology and Lu Jian Feng, Yang Jing-yu, and Wang Han for inviting me to present this course on Statistical and Neural Pattern Recognition.

  25. A Very Special Thanks to my new friends Lu Jian Feng, Wang Qiong, and Wang Huan for looking after me. Their kindness and gentle assistance have made my stay in Nanjing a very enjoyable and unforgettable experience.

  26. Last but not least, I would like to thank all you students for your kind attention throughout this course. Without your interest and cheerful faces it would have been difficult for me to teach. My apologies for teaching in English, which I am sure made your work a lot harder. Best of luck to all of you in your studies and life.

  27. “As you travel through life may all your trails be downhill and the wind always be at your back.” Bye for now, and I hope our paths cross again in the future. I will have pleasant thoughts about NUST students and faculty, Nanjing, and China as I head back to New Mexico!

  28. New Mexico Land of Enchantment

  29. End of Lecture 30
