More on Choosing #Clusters in General
• References
  • Breckenridge, James N. (2000), “Validating Cluster Analysis: Consistent Replication and Symmetry,” Multivariate Behavioral Research, 35 (2), 261-285.
  • Calinski, T. and J. Harabasz (1974), “A Dendrite Method for Cluster Analysis,” Communications in Statistics, 3, 1-27.
  • Krolak-Schwerdt, Sabine and Thomas Eckes (1992), “A Graph Theoretic Criterion for Determining the Number of Clusters in a Data Set,” Multivariate Behavioral Research, 27 (4), 541-565.
  • Milligan, Glenn W. and Martha C. Cooper (1985), “An Examination of Procedures for Determining the Number of Clusters in a Data Set,” Psychometrika, 50, 159-179.
  • Steinley, Douglas and Michael J. Brusco (2011), “Choosing the Number of Clusters in K-Means Clustering,” Psychological Methods, 16 (3), 285-297.
References: Articles
• Goodman, Leo A. and William H. Kruskal (1954), “Measures of Association for Cross Classifications,” Journal of the American Statistical Association, 49, 732-764.
  • Measures like correlations (r’s) but for categorical data
• Hartigan, John A. and M. A. Wong (1979), “A K-Means Clustering Algorithm,” Applied Statistics, 28, 100-108.
  • K-means and the Fortran code (hehehe, how cool & nerdy is that?!)
• Johnson, Stephen C. (1967), “Hierarchical Clustering Schemes,” Psychometrika, 32 (3), 241-254.
  • “Hierarchy” is defined, single-link & complete-link are introduced
• Lance, G. N. and W. T. Williams (1967), “A General Theory of Classificatory Sorting Strategies, I. Hierarchical Systems,” Computer Journal, 9, 373-380.
  • The equation that subsumes single, complete, average, Ward’s, etc.
• Milligan, Glenn W. (1979), “Ultrametric Hierarchical Clustering Algorithms,” Psychometrika, 44 (3), 343-346.
  • Extends ultrametric distances
• Ward, Joe H., Jr. (1963), “Hierarchical Grouping to Optimize an Objective Function,” Journal of the American Statistical Association, 58 (301, March), 236-244.
  • The Ward of Ward’s method
References: Books
• Aldenderfer, Mark S., and Roger K. Blashfield (1984), Cluster Analysis, Newbury Park, CA: Sage.
  • Great succinct intro
• Hartigan, John (1975), Clustering Algorithms, NY: Wiley.
  • Has the Fortran code for a bunch of algorithms
• Sneath, Peter H. A. and Robert R. Sokal (1973), Principles of Numerical Taxonomy, San Francisco: Freeman.
  • Solid; the examples are from a different field (biology), which is refreshing
Cluster analysis also appears as a chapter in most multivariate stats books, such as:
• Seber, G. A. F. (1984), Multivariate Observations, NY: Wiley, Ch. 7, pp. 347-394.
References: Articles
• Arabie, Phipps, J. Douglas Carroll, Wayne DeSarbo, and Jerry Wind (1981), “Overlapping Clustering: A New Method for Product Positioning,” Journal of Marketing Research, 18 (Aug.), 310-317.
  • Cool model for non-hierarchical clustering
• Punj, Girish, and David W. Stewart (1983), “Cluster Analysis in Marketing Research: Review and Suggestions for Application,” Journal of Marketing Research, 20 (May), 134-148.
  • Illustrates a wide variety of applications of clustering
Recommendation Engines & Clustering
• Iacobucci, Dawn, Phipps Arabie, and Anand Bodapati (2000), “Recommendation Agents on the Internet,” Journal of Interactive Marketing, 14 (3), 2-11.
• Bodapati, Anand V. (2008), “Recommendation Systems with Purchase Data,” Journal of Marketing Research, 45 (Feb.), 77-93.
Other Clustering Applications
• Parkman, Margaret A. and Jack Sawyer (1967), “Dimensions of Ethnic Intermarriage in Hawaii,” American Sociological Review, 32 (4), 593-607.
Clustering Related
• McCutcheon, Allan L. (1987), Latent Class Analysis, Newbury Park, CA: Sage.
• Smithson, Michael and Jay Verkuilen (2006), Fuzzy Set Theory: Applications in the Social Sciences, Thousand Oaks, CA: Sage.