
Cluster Analysis


goldbergm

Presentation Transcript


  1. Cluster Analysis: Measuring latent groups

  2. Cluster Analysis - Discussion • Definition • Vocabulary • Simple Procedure • SPSS example • ICPSR and hands on

  3. Definition • Cluster analysis is a process by which we take a large number of cases (that is, observations, such as individual respondents) and reduce them to a smaller number of mutually exclusive “groups” by “clustering” respondents whose values across the variables are similar. The result is a single group assignment for each case, based on all of the variables.

  4. Vocabulary and Procedure There are essentially two steps in cluster analysis: 1. Create a table of relative similarities or differences between all objects. This table is called a proximities matrix. 2. Use this information to combine the objects into groups. The method of combining objects into groups is called a clustering algorithm. The idea is to put objects that are similar to one another into the same group.
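
As a rough illustration of the first step, the sketch below builds a proximities matrix in Python. It assumes Euclidean distance as the measure of difference, which is only one of several possible choices; the function name `proximity_matrix` and the three sample cases are hypothetical and do not come from the slides.

```python
import math

def proximity_matrix(cases):
    """Return a symmetric matrix of Euclidean distances between all cases."""
    n = len(cases)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(cases[i], cases[j])
            matrix[i][j] = d  # distance from case i to case j
            matrix[j][i] = d  # the matrix is symmetric
    return matrix

# Hypothetical data: three respondents measured on two variables.
cases = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
matrix = proximity_matrix(cases)
for row in matrix:
    print(row)
```

Smaller entries mean more similar cases, and the diagonal is zero because each case is identical to itself. A clustering algorithm then works from this table rather than from the raw data.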

  5. Vocabulary and Procedure (cont.) • In this respect, cluster analysis is the obverse of factor analysis. Whereas factor analysis reduces the number of variables by grouping them into a smaller set of factors, cluster analysis reduces the number of observations or cases by grouping them into a smaller set of clusters. • The obvious challenge is to determine which variables to include across observations and how to combine such variables, once they are chosen.

  6. Proximities Matrix

  7. Proximities Matrix (cont.)

  8. Clustering – Flat Method There are two types of clustering methods: flat and hierarchical. If the number of groups is known beforehand, the "flat" method is appropriate. In SPSS, this is called K-means clustering. Using this method, the objects are assigned to a given group at the first step based on some initial criterion. The mean of each group is then calculated. The next step reshuffles the objects, assigning each object to a group based on its similarity to the current mean of that group. The group means are recalculated at the end of this step. This process repeats iteratively until no object changes groups.
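
The loop described on this slide can be sketched in a few lines of Python. This is a minimal illustration, not SPSS's implementation: it assumes Euclidean distance, takes caller-supplied starting means as the "initial criterion", and uses a hypothetical function name (`k_means`) and made-up data.

```python
import math

def k_means(cases, initial_means, max_iter=100):
    """Assign each case to its nearest mean, recompute means, repeat until stable."""
    means = [tuple(m) for m in initial_means]
    assignment = None
    for _ in range(max_iter):
        # Assign each case to the group whose current mean is closest.
        new_assignment = [
            min(range(len(means)), key=lambda g: math.dist(case, means[g]))
            for case in cases
        ]
        if new_assignment == assignment:
            break  # no case changed groups, so the process has converged
        assignment = new_assignment
        # Recalculate each group's mean from its current members.
        for g in range(len(means)):
            members = [c for c, a in zip(cases, assignment) if a == g]
            if members:  # an empty group keeps its previous mean
                means[g] = tuple(sum(v) / len(members) for v in zip(*members))
    return assignment, means

# Hypothetical data: four cases on two variables, two groups requested.
cases = [(1.0, 1.0), (1.0, 2.0), (8.0, 8.0), (9.0, 8.0)]
labels, means = k_means(cases, initial_means=[(0.0, 0.0), (10.0, 10.0)])
print(labels)
print(means)
```

The number of groups is fixed by the number of starting means, which is why this method presumes that the number of groups is known in advance.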

  9. Clustering – Hierarchical Method If the groups are not known a priori, hierarchical clustering works better. There are two kinds: • Divisive: starts with all observations in one group and continues to divide them into subgroups until no further distinction can be made. • Agglomerative: starts with each observation as a separate group and continues to merge the most similar observations and groups until all observations have been combined.
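
The agglomerative variant might be sketched as follows. The slide does not say how the similarity between two groups is measured, so this example assumes single linkage (the distance between the two closest members) with Euclidean distance; the function name `agglomerate`, the `n_groups` stopping parameter, and the data are all hypothetical.

```python
import math

def agglomerate(cases, n_groups=1):
    """Merge the two closest groups repeatedly until n_groups remain."""
    groups = [[i] for i in range(len(cases))]  # every case starts alone
    merges = []  # record of (group_a, group_b, distance) for each merge
    while len(groups) > n_groups:
        best = None
        for a in range(len(groups)):
            for b in range(a + 1, len(groups)):
                # Single linkage: distance between the closest pair of members.
                d = min(math.dist(cases[i], cases[j])
                        for i in groups[a] for j in groups[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        merges.append((groups[a], groups[b], d))
        merged = groups[a] + groups[b]
        # Drop the two merged groups and append their union.
        groups = [g for k, g in enumerate(groups) if k not in (a, b)] + [merged]
    return groups, merges

# Hypothetical data: four cases on two variables, stopping at two groups.
cases = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 7.0)]
groups, merges = agglomerate(cases, n_groups=2)
print(groups)
print(merges)
```

The `merges` list records the order and distance of each merge, which is the information a dendrogram displays. Leaving `n_groups` at 1 runs the procedure until every case is in a single group.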

  10. Steps in the Analysis • Input the data • Choose the method for grouping • Generate the Output • Interpret the results

  11. Input the data

  12. Generate the Procedure

  13. Produce the Output

  14. Produce the Output (cont.)
