1 / 16

Some slide material taken from: Marakas 2003, SAS Education

DSCI 4520/5240 (DATA MINING). DSCI 4520/5240 Lecture 10 Association Analysis. Some slide material taken from: Marakas 2003, SAS Education. Objectives. Overview of Market-Basket (Association) Analysis. Market Basket Analysis.

maura
Download Presentation

Some slide material taken from: Marakas 2003, SAS Education

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. DSCI 4520/5240 (DATA MINING) DSCI 4520/5240 Lecture 10 Association Analysis Some slide material taken from: Marakas 2003, SAS Education

  2. Objectives • Overview of Market-Basket (Association) Analysis

  3. Market Basket Analysis • This is the most widely used and, in many ways, most successful data mining algorithm. • It essentially determines what products people purchase together. • Stores can use this information to place these products in the same area. • Direct marketers can use this information to determine which new products to offer to their current customers. • Inventory policies can be improved if reorder points reflect the demand for the complementary products.

  4. Association Rules for Market Basket Analysis • Rules are written in the form “left-hand side implies right-hand side” and an example is: • Yellow Peppers IMPLIES Red Peppers, Bananas, Bakery • To make effective use of a rule, three numeric measures about that rule must be considered: (1) support, (2) confidence and (3) lift

  5. Measures of Predictive Ability Support refers to the percentage of baskets where the rule was true (both left and right-side products were present). Confidence measures what percentage of baskets that contained the left-hand product also contained the right-hand. Lift measures how many times Confidence is larger than the expected (baseline) Confidence. A lift value that is greater than 1 is desirable.

  6. A A B A B B C C C D C D D E E Support and Confidence: An Illustration Support 2/5 2/5 2/5 1/5 Rule A  D C  A A  C B & C  D Lift 2 1 2 0.50 Confidence 2/3 2/4 2/3 1/3

  7. Market Basket Analysis Methodology • We first need a list of transactions and what was purchased. This is pretty easily obtained these days from scanning cash registers. • Next, we choose a list of products to analyze, and tabulate how many times each was purchased with the others. • The diagonals of the table shows how often a product is purchased in any combination, and the off-diagonals show which combinations were bought.

  8. A Convenience Store Example (5 transactions) • Consider the following simple example about five transactions at a convenience store: Transaction 1: Frozen pizza, cola, milk Transaction 2: Milk, potato chips Transaction 3: Cola, frozen pizza Transaction 4: Milk, pretzels Transaction 5: Cola, pretzels • Theseneed to be cross tabulated and displayed in a table.

  9. A Convenience Store Example (5 transactions) • Pizza and Cola sell together more often than any other combo; a cross-marketing opportunity? • Milk sells well with everything – people probably come here specifically to buy it.

  10. Using the Results • The tabulations can immediately be translated into association rules and the numerical measures computed. • Comparing this week’s table to last week’s table can immediately show the effect of this week’s promotional activities. • Some rules are going to be trivial (hot dogs and buns sell together) or inexplicable (toilet rings sell only when a new hardware store is opened).

  11. Performing Analysis with Virtual Items • The sales data can be augmented with the addition of virtual items. For example, we could record that the customer was new to us, or had children. • The transaction record might look like: • Item 1: Sweater Item 2: Jacket Item 3: New • This might allow us to see what patterns new customers have versus old customers.

  12. Limitations to Market Basket Analysis • A large number of real transactions are needed to do an effective basket analysis, but the data’s accuracy is compromised if all the products do not occur with similar frequency. • The analysis can sometimes capture results that were due to the success of previous marketing campaigns (and not natural tendencies of customers).

  13. Association Analysis inSAS Enterprise Miner (EM)

  14. The Scenario • A store wants to examine its customer base and to understand which of its products tend to be purchased together. It has chosen to conduct a market-basket analysis of a sample of its customer base. • The ASSOCS data set lists the grocery products that are purchased by 1,001 customers. Twenty possible items are included.

  15. Data Structure • Seven items were purchased by each of 1,001 customers, which yields 7,007 rows in the data set. • Each row of the data set represents a customer-product combination. • In most data sets, not all customers have the same number of products.

  16. ASSOCS data set

More Related