
Database Marketing




  1. Database Marketing Cluster Analysis N. Kumar, Asst. Professor of Marketing

  2. Agenda • Discussion of the first assignment • Motivation for conducting Cluster Analysis • Benefit Segmentation • Cluster Analysis • Basic Concepts • Hierarchical/Non-Hierarchical Clustering • Implementation in SAS and interpreting the output

  3. Voter Profiling • What are the different voting segments out there? • What do they want to hear, i.e., which issues do they care about? • What should I say?

  4. Ad Campaign • How many customer segments are there? • How many do I want to target? • How should I target – what message should I communicate to each segment?

  5. Promotional Strategies • Coupon Drops – who should they be targeted at? • Catalog Example – should the catalog be accompanied by a $5 coupon, a $10 coupon, or no coupon?

  6. What is Cluster Analysis? • Cluster Analysis is a technique for combining observations into groups or clusters such that: • Each group is homogeneous with respect to certain characteristics (that you specify) • Each group is different from the other groups with respect to the same characteristics

  7. Data

  8. Geometrical View of Cluster Analysis [Figure: plot of the consumers; axes: Education, Income]

  9. Similarity Measures • Why are consumers 1 and 2 similar? • $\text{Distance}(1,2) = (5-6)^2 + (5-6)^2 = 2$ • More generally, if there are $p$ variables: • $\text{Distance}(i,j) = \sum_{k=1}^{p} (x_{ik} - x_{jk})^2$
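The distance on this slide is a squared Euclidean distance. A minimal Python sketch of it follows; the function name `squared_distance` and the (income, education) ordering of the tuples are assumptions made for illustration, while the values 5 and 6 come from the slide.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two observations with p variables."""
    if len(a) != len(b):
        raise ValueError("observations must have the same number of variables")
    return sum((x - y) ** 2 for x, y in zip(a, b))


# Consumers 1 and 2 as used on the slide; (income, education) order is assumed.
consumer_1 = (5, 5)
consumer_2 = (6, 6)

d12 = squared_distance(consumer_1, consumer_2)
print(d12)  # 2
```

A small distance means the two consumers have similar profiles, so they are natural candidates for the same cluster.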

  10. Similarity Matrix

  11. Clustering Techniques • Hierarchical Clustering • Non-Hierarchical Clustering

  12. Hierarchical Clustering • Distance(1,2) = 2 = Distance(3,4) • Say, we group 1 and 2 together and leave the others as is • How do we compute the distance between a group that has two (or more) members and the others?

  13. Hierarchical Clustering Algorithms • Centroid Method • Nearest-Neighbor or Single-Linkage • Farthest-Neighbor or Complete-Linkage • Average-Linkage • Ward’s Method

  14. Centroid Method • Each group is replaced by an average consumer • Cluster 1 – average income = 5.5 and average education = 5.5
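As a rough sketch of the centroid method, the snippet below replaces Cluster 1 (consumers 1 and 2) by its "average consumer" and then measures the distance from that centroid to another consumer. The helper names are hypothetical, and the coordinates of consumer 3 are an assumption: the data table is not part of this transcript, so (15, 14) was chosen only because it reproduces Distance(1,3) = 181 and Distance(2,3) = 145 quoted on later slides.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two observations."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def centroid(points):
    """Variable-by-variable mean of the observations in a cluster."""
    n = len(points)
    return tuple(sum(values) / n for values in zip(*points))


# Cluster 1 holds consumers 1 and 2; (income, education) order is assumed.
cluster_1 = [(5, 5), (6, 6)]
c1 = centroid(cluster_1)
print(c1)  # (5.5, 5.5)

# ASSUMPTION: illustrative coordinates for consumer 3, not taken from the slides.
consumer_3 = (15, 14)
print(squared_distance(c1, consumer_3))  # 162.5
```

After each merge the distance matrix shrinks by one row and column, because the merged consumers are represented by a single centroid.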

  15. Data for Five Clusters

  16. Similarity Matrix

  17. Data for Four Clusters

  18. Similarity Matrix

  19. Data for Three Clusters

  20. Similarity Matrix

  21. Dendrogram for the Data [Figure: dendrogram with leaves C1, C2, C3, C4, C5, C6]

  22. Single Linkage • The first cluster is formed in the same fashion as before • The distance between Cluster 1 (customers 1 and 2) and customer 3 is the minimum of Distance(1,3) = 181 and Distance(2,3) = 145, i.e., 145

  23. Similarity Matrix

  24. Complete Linkage • The distance between Cluster 1 (customers 1 and 2) and customer 3 is the maximum of Distance(1,3) = 181 and Distance(2,3) = 145, i.e., 181

  25. Similarity Matrix

  26. Average Linkage • The distance between Cluster 1 (customers 1 and 2) and customer 3 is the average of Distance(1,3) = 181 and Distance(2,3) = 145, i.e., (181 + 145)/2 = 163
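The single, complete and average linkage rules differ only in how they summarize the pairwise distances between the members of two clusters. The sketch below illustrates this with the two distances quoted on the slides; the function name `linkage_distance` and its `method` argument are hypothetical and are not SAS syntax.

```python
def linkage_distance(pairwise_distances, method):
    """Summarize member-to-member distances between two clusters.

    pairwise_distances: distances between every member of one cluster
    and every member of the other.
    """
    if method == "single":      # nearest neighbor
        return min(pairwise_distances)
    if method == "complete":    # farthest neighbor
        return max(pairwise_distances)
    if method == "average":
        return sum(pairwise_distances) / len(pairwise_distances)
    raise ValueError(f"unknown linkage method: {method}")


# Distance(1,3) and Distance(2,3) as given on the slides.
cluster1_to_customer3 = [181, 145]

single = linkage_distance(cluster1_to_customer3, "single")      # 145
complete = linkage_distance(cluster1_to_customer3, "complete")  # 181
average = linkage_distance(cluster1_to_customer3, "average")    # 163.0
print(single, complete, average)
```

Because each rule produces a different updated similarity matrix, the same data can merge in a different order under each method.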

  27. Similarity Matrix

  28. Ward’s Method • Does not compute distance between clusters • Forms clusters by maximizing within-cluster homogeneity or minimizing error sum of squares (ESS) • ESS for a cluster with two observations (say, C1 and C2): $\text{ESS} = (5-5.5)^2 + (6-5.5)^2 + (5-5.5)^2 + (6-5.5)^2 = 1$
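The ESS on this slide can be checked with a few lines of Python. This is only a sketch of the within-cluster error sum of squares for a single candidate cluster, not a full implementation of Ward's method; the function name is an assumption.

```python
def error_sum_of_squares(points):
    """Sum of squared deviations of every variable from its cluster mean."""
    n = len(points)
    means = [sum(values) / n for values in zip(*points)]
    return sum(
        (value - mean) ** 2
        for point in points
        for value, mean in zip(point, means)
    )


# Cluster made of C1 = (5, 5) and C2 = (6, 6), the values used on the slide.
ess = error_sum_of_squares([(5, 5), (6, 6)])
print(ess)  # 1.0
```

At each step, Ward's method would evaluate this quantity for every possible merge and pick the merge that increases the total ESS the least.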

  29. Ward’s Method

  30. Non-Hierarchical Clustering • Data are grouped into K clusters • Requires a priori knowledge of K

  31. Basic Steps in Non-Hierarchical Clustering • Step 1: Select K initial cluster centroids • Step 2: Assign each observation to the cluster to which it is closest • Step 3: Reassign or reallocate each observation to one of the K clusters according to a pre-determined stopping rule • Step 4: Stop if there is no reallocation • Approaches differ in Step 1 and/or Step 3
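The basic steps above can be sketched as a short loop. This is a minimal illustration under stated assumptions, not the SAS procedure: it seeds with the first K observations, recomputes centroids after each pass, and stops when no observation changes cluster. The six data points are illustrative assumptions (the data table is not in this transcript), and all function names are hypothetical.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def centroid(points):
    n = len(points)
    return tuple(sum(values) / n for values in zip(*points))


def nearest(point, centers):
    """Index of the center closest to the point (first one wins on ties)."""
    return min(range(len(centers)), key=lambda j: squared_distance(point, centers[j]))


def cluster(data, k, max_iter=100):
    centers = [tuple(p) for p in data[:k]]         # select K initial centroids
    labels = [nearest(p, centers) for p in data]   # initial assignment
    for _ in range(max_iter):
        # Recompute each centroid from its current members.
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(data, labels) if lab == j]
            new_centers.append(centroid(members) if members else centers[j])
        centers = new_centers
        new_labels = [nearest(p, centers) for p in data]  # reallocation pass
        if new_labels == labels:                   # no reallocation: stop
            break
        labels = new_labels
    return labels, centers


# ASSUMPTION: illustrative (income, education) data, not taken from the slides.
data = [(5, 5), (6, 6), (15, 14), (16, 15), (25, 20), (30, 19)]
labels, centers = cluster(data, k=3)
print(labels)   # [0, 1, 2, 2, 2, 2]
print(centers)  # [(5.0, 5.0), (6.0, 6.0), (21.5, 17.0)]
```

With these assumed data the two nearly identical first observations each keep their own cluster, which shows how sensitive the result is to the choice of initial seeds.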

  32. Algorithm I • Selects first K observations as cluster centers

  33. Initial Cluster Centroids

  34. Initial Assignment

  35. New Cluster Centroids

  36. Distance Matrix

  37. Algorithm II • Differs from Algorithm I in how the initial seeds are modified • As before, the first K observations are selected as the initial cluster seeds • The candidates for replacement are the two seeds that are closest to each other • An observation qualifies to replace one of the two candidates if the distance between the seeds is less than the distance between the observation and the closest seed

  38. Algorithm II …contd. • C1, C2 and C3 are the initial seeds • The smallest distance between the seeds is between C1 and C2 • Observation C4 does not qualify as a replacement as Distance(C1,C2) > Distance(C4, nearest seed C3) • Observation C5 does qualify as a replacement as Distance(C1,C2) < Distance(C5, nearest seed C3): replace C2 with C5
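The qualification test for seed replacement can be sketched as below. It implements only the rule as worded on slide 37 (the closest pair of seeds must be nearer to each other than the observation is to its nearest seed) and does not decide which of the two candidate seeds is replaced. The coordinates are illustrative assumptions, since the data table is not in this transcript, and the function names are hypothetical.

```python
from itertools import combinations


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def qualifies_as_replacement(observation, seeds):
    """True if the closest two seeds are nearer to each other than the
    observation is to its nearest seed (rule as worded on slide 37)."""
    closest_seed_gap = min(
        squared_distance(a, b) for a, b in combinations(seeds, 2)
    )
    distance_to_nearest_seed = min(
        squared_distance(observation, seed) for seed in seeds
    )
    return closest_seed_gap < distance_to_nearest_seed


# ASSUMPTION: illustrative coordinates for seeds C1, C2, C3 and for C4, C5.
seeds = [(5, 5), (6, 6), (15, 14)]
c4 = (16, 15)
c5 = (25, 20)

print(qualifies_as_replacement(c4, seeds))  # False
print(qualifies_as_replacement(c5, seeds))  # True
```

Under these assumed coordinates C4 sits too close to seed C3 to add a usefully different seed, whereas C5 is far from every seed and therefore qualifies.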

  39. Initial Assignment

  40. New Cluster Centroids

  41. Distance Matrix

  42. Hierarchical vs. Non-Hierarchical Clustering • Hierarchical clustering does not require a priori knowledge of the number of clusters • Assignments are static: once an observation joins a cluster, it is never reassigned • Use hierarchical clustering for exploratory purposes • Non-hierarchical methods can be viewed as complementary rather than competing methods

  43. Voter Profiling • A survey of voters’ concerns may help us group voters with similar concerns – perhaps they all live in a certain area? • Target ads/mailings with customized messages

  44. Ad Campaign • Use attitudinal data to segment customers • Target message appropriately

  45. Promotional Strategies • Use transaction data to identify the customers who are more prone to purchasing the product on deal • Give a stronger incentive to the price-sensitive segment
