Market Basket Analysis with Association Rules: Understanding Customer Buying Habits

Market Basket Analysis identifies buying patterns by analyzing items purchased together. Association Rules reveal how products relate, aiding in layout, assortment decisions and promotions.

barrym

Presentation Transcript


  1. Association Rules
     presented by Zbigniew W. Ras (University of North Carolina, Charlotte; Warsaw University of Technology)

  2. Market Basket Analysis (MBA)
     Analyzes customer buying habits by finding associations and correlations between the different items that customers place in their "shopping basket".
     Customer1: Milk, eggs, sugar, bread
     Customer2: Milk, eggs, cereal, bread
     Customer3: Eggs, sugar

  3. Market Basket Analysis
     Given: a database of customer transactions, where each transaction is a set of items.
     Find: groups of items which are frequently purchased together.

  4. Goal of MBA
     • Extract information on purchasing behavior
     • Actionable information: can suggest
       • new store layouts
       • new product assortments
       • which products to put on promotion
     MBA is applicable whenever a customer purchases multiple things in proximity.

  5. Association Rules
     • Express how products/services relate to each other and tend to group together
     • "If a customer purchases three-way calling, then they will also purchase call-waiting"
     • Simple to understand
     • Actionable information: bundle three-way calling and call-waiting in a single package

  6. Basic Concepts
     Transactions:
       Relational format <Tid, item>:    <1, item1>  <1, item2>  <2, item3>
       Compact format <Tid, itemset>:    <1, {item1, item2}>  <2, {item3}>
     Item: single element. Itemset: set of items.
     Support of an itemset I, denoted by sup(I): the number of transactions that contain I.
     Threshold for minimum support: σ
     Itemset I is frequent if sup(I) ≥ σ.
     A frequent itemset represents a set of items which are positively correlated.
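
The definitions on this slide can be sketched in a few lines of Python. This is only an illustration, assuming support is counted as the number of transactions that contain the itemset; the helper names `to_compact`, `sup`, `is_frequent` and the parameter `min_sup` are hypothetical and do not come from the slides.

```python
from collections import defaultdict

def to_compact(rows):
    """Group relational <Tid, item> rows into compact <Tid, itemset> form."""
    baskets = defaultdict(set)
    for tid, item in rows:
        baskets[tid].add(item)
    return dict(baskets)

def sup(itemset, baskets):
    """Absolute support: number of transactions containing every item of itemset."""
    items = set(itemset)
    return sum(1 for basket in baskets.values() if items <= basket)

def is_frequent(itemset, baskets, min_sup):
    """An itemset is frequent if its support reaches the minimum-support threshold."""
    return sup(itemset, baskets) >= min_sup

# The three relational rows shown on the slide.
rows = [(1, "item1"), (1, "item2"), (2, "item3")]
baskets = to_compact(rows)
```

With these rows, `baskets` holds one itemset per transaction id, and `sup` can be queried for any itemset.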

  7. Frequent Itemsets
     [Figure: shopping baskets, labeled Customer 1 and Customer 2]
     sup({dairy}) = 3
     sup({fruit}) = 3
     sup({dairy, fruit}) = 2
     If σ = 3, then {dairy} and {fruit} are frequent while {dairy, fruit} is not.

  8. Association Rules: AR(s,c)
     • {A,B} - partition of a set of items
     • r = [A → B]
       Support of r: sup(r) = sup(A ∪ B)
       Confidence of r: conf(r) = sup(A ∪ B)/sup(A)
     • Thresholds:
       • minimum support - s
       • minimum confidence - c
     r ∈ AR(s, c) if sup(r) ≥ s and conf(r) ≥ c
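
As a rough illustration of these formulas, the sketch below computes the support and confidence of a rule over the three baskets from slide 2. The function names `rule_stats` and `in_AR` are made up for this example, and support is assumed to be an absolute transaction count.

```python
def sup(itemset, transactions):
    """Number of transactions that contain every item of itemset."""
    items = set(itemset)
    return sum(1 for t in transactions if items <= t)

def rule_stats(antecedent, consequent, transactions):
    """Return (support, confidence) of the rule antecedent -> consequent."""
    a, b = set(antecedent), set(consequent)
    support = sup(a | b, transactions)        # sup(A ∪ B)
    sup_a = sup(a, transactions)              # sup(A)
    confidence = support / sup_a if sup_a else 0.0
    return support, confidence

def in_AR(antecedent, consequent, transactions, s, c):
    """True if the rule meets both the minimum support s and minimum confidence c."""
    support, confidence = rule_stats(antecedent, consequent, transactions)
    return support >= s and confidence >= c

# Baskets of Customer1..Customer3 from slide 2.
baskets = [
    {"milk", "eggs", "sugar", "bread"},
    {"milk", "eggs", "cereal", "bread"},
    {"eggs", "sugar"},
]
milk_bread = rule_stats({"milk"}, {"bread"}, baskets)
eggs_sugar = rule_stats({"eggs"}, {"sugar"}, baskets)
```

Here the rule milk → bread holds in both baskets that contain milk, while eggs → sugar holds in two of the three baskets that contain eggs.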

  9. Association Rules - Example
     Min. support: 2 [50%]
     Min. confidence: 50%
     • For rule A → C:
       • sup(A → C) = 2
       • conf(A → C) = sup({A,C})/sup({A}) = 2/3
     • The Apriori principle:
       • Any subset of a frequent itemset must be frequent

  10. The Apriori algorithm [Agrawal]
      • Fk : Set of frequent itemsets of size k
      • Ck : Set of candidate itemsets of size k

      F1 := {frequent items}; k := 1;
      while card(Fk) ≥ 1 do begin
          Ck+1 := new candidates generated from Fk;
          for each transaction t in the database do
              increment the count of all candidates in Ck+1 that are contained in t;
          Fk+1 := candidates in Ck+1 with minimum support;
          k := k+1
      end
      Answer := ∪{ Fk : k ≥ 1 & card(Fk) ≥ 1 }
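
The pseudocode above could be turned into runnable Python roughly as follows. This is a minimal sketch rather than the implementation from the cited work: it assumes absolute support counts, generates candidates by joining pairs of frequent k-itemsets, and uses the hypothetical names `apriori` and `min_sup`. The sample data is the four-transaction database D from slide 13.

```python
from itertools import combinations

def apriori(transactions, min_sup):
    """Return {itemset: support} for every frequent itemset (level-wise search)."""
    transactions = [frozenset(t) for t in transactions]

    # F1: count single items and keep those with minimum support.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    frequent = {i: n for i, n in counts.items() if n >= min_sup}

    answer = dict(frequent)
    k = 1
    while frequent:  # while card(Fk) >= 1
        # Ck+1: join frequent k-itemsets whose union has k+1 items ...
        candidates = {a | b for a in frequent for b in frequent
                      if len(a | b) == k + 1}
        # ... and prune candidates that have an infrequent k-subset.
        candidates = {c for c in candidates
                      if all(frozenset(s) in frequent for s in combinations(c, k))}
        # Scan the database once to count the surviving candidates.
        counts = {c: 0 for c in candidates}
        for t in transactions:
            for c in candidates:
                if c <= t:
                    counts[c] += 1
        # Fk+1: candidates with minimum support.
        frequent = {c: n for c, n in counts.items() if n >= min_sup}
        answer.update(frequent)
        k += 1
    return answer

# Database D from slide 13, minimum support 2.
D = [{"A", "C", "D"}, {"B", "C", "E"}, {"A", "B", "C", "E"}, {"B", "E"}]
result = apriori(D, 2)
```

The join used here compares all pairs of frequent itemsets, which is simple but quadratic; it is adequate for slide-sized examples.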

  11. Apriori - Example
      Itemset lattice over {a, b, c, d}:
      4-itemset:  a,b,c,d
      3-itemsets: a,b,c   a,b,d   a,c,d   b,c,d
      2-itemsets: a,b   a,c   a,d   b,c   b,d   c,d
      1-itemsets: a   b   c   d
      {a,d} is not frequent, so the 3-itemsets {a,b,d}, {a,c,d} and the 4-itemset {a,b,c,d} are not generated.
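
The pruning in this example can be reproduced with a short candidate-generation sketch. It assumes, for illustration only, that every 2-itemset other than {a,d} is frequent (the slide states only that {a,d} is not); `generate_candidates` is a hypothetical helper name.

```python
from itertools import combinations

def generate_candidates(frequent_k, k):
    """Build (k+1)-item candidates whose k-item subsets are all frequent."""
    frequent_k = {frozenset(s) for s in frequent_k}
    # Join step: unions of two frequent k-itemsets that contain exactly k+1 items.
    joined = {a | b for a in frequent_k for b in frequent_k if len(a | b) == k + 1}
    # Prune step (Apriori principle): drop candidates with an infrequent k-subset.
    return {c for c in joined
            if all(frozenset(s) in frequent_k for s in combinations(c, k))}

# Assumed frequent 2-itemsets: every pair over {a, b, c, d} except {a, d}.
F2 = ["ab", "ac", "bc", "bd", "cd"]
C3 = generate_candidates(F2, 2)
# Even if both surviving 3-itemsets turn out frequent, no 4-itemset is generated.
C4 = generate_candidates(C3, 3)
```

Under that assumption only {a,b,c} and {b,c,d} survive as 3-item candidates, and {a,b,c,d} is never counted against the database.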

  12. Algorithm Apriori: Illustration
      • The task of mining association rules is mainly to discover strong association rules (high confidence and strong support) in large databases.
      • Mining association rules is composed of two steps:
        1. Discover the large itemsets, i.e., the itemsets whose transaction support is at least a predetermined minimum support s.
        2. Use the large itemsets to generate the association rules.

      TID    Items
      1000   A, B, C
      2000   A, C
      3000   A, D
      4000   B, E, F

      MinSup = 2

      Large itemsets   Support
      {A}              3
      {B}              2
      {C}              2
      {A,C}            2
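
Step 2 can be sketched as follows, taking the large itemsets listed on this slide as input. The names `generate_rules` and `min_conf` are illustrative, and the 50% confidence threshold is borrowed from slide 9 as an assumption.

```python
from itertools import combinations

def sup(itemset, transactions):
    """Number of transactions that contain every item of itemset."""
    return sum(1 for t in transactions if itemset <= t)

def generate_rules(large_itemsets, transactions, min_conf):
    """Return (antecedent, consequent, support, confidence) for each strong rule."""
    rules = []
    for itemset in large_itemsets:
        itemset = frozenset(itemset)
        if len(itemset) < 2:
            continue  # a rule needs a non-empty antecedent and consequent
        support = sup(itemset, transactions)
        # Try every non-empty proper subset as the antecedent.
        for r in range(1, len(itemset)):
            for combo in combinations(sorted(itemset), r):
                antecedent = frozenset(combo)
                sup_a = sup(antecedent, transactions)
                if sup_a and support / sup_a >= min_conf:
                    rules.append((antecedent, itemset - antecedent,
                                  support, support / sup_a))
    return rules

# Database and large itemsets from this slide.
transactions = [frozenset("ABC"), frozenset("AC"), frozenset("AD"), frozenset("BEF")]
large = [frozenset("A"), frozenset("B"), frozenset("C"), frozenset("AC")]
rules = generate_rules(large, transactions, min_conf=0.5)
```

Only {A,C} has more than one item, so the only rules to check are A → C and C → A.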

  13. Algorithm Apriori: Illustration (S = 2)

      Database D
      TID   Items
      100   A, C, D
      200   B, C, E
      300   A, B, C, E
      400   B, E

      C1 (after Scan D)
      Itemset   Count
      {A}       2
      {B}       3
      {C}       3
      {D}       1
      {E}       3

      F1
      Itemset   Count
      {A}       2
      {B}       3
      {C}       3
      {E}       3

      C2 (after Scan D)
      Itemset   Count
      {A, B}    1
      {A, C}    2
      {A, E}    1
      {B, C}    2
      {B, E}    3
      {C, E}    2

      F2
      Itemset   Count
      {A, C}    2
      {B, C}    2
      {B, E}    3
      {C, E}    2

      C3 (after Scan D)
      Itemset     Count
      {B, C, E}   2

      F3
      Itemset     Count
      {B, C, E}   2

  14. Representative Association Rules
      • Definition 1. Cover C of a rule X → Y is denoted by C(X → Y) and defined as follows:
        C(X → Y) = { [X ∪ Z] → V : Z, V are disjoint subsets of Y }.
      • Definition 2. Set RR(s, c) of Representative Association Rules is defined as follows:
        RR(s, c) = { r ∈ AR(s, c) : ~∃(r' ∈ AR(s, c)) [r' ≠ r & r ∈ C(r')] }
        s - threshold for minimum support
        c - threshold for minimum confidence
      • Representative Rules (informal description):
        [as short as possible] → [as long as possible]
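
Definition 1 can be made concrete by enumerating a cover directly. The sketch below assumes that the consequent V must be non-empty, which the slide does not state explicitly; `cover` and `subsets` are hypothetical helper names.

```python
from itertools import combinations

def subsets(items):
    """Yield every subset of items (including the empty set) as a frozenset."""
    items = sorted(items)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)

def cover(x, y):
    """Cover of X -> Y: all rules (X ∪ Z) -> V with Z, V disjoint subsets of Y."""
    x, y = frozenset(x), frozenset(y)
    rules = set()
    for v in subsets(y):
        if not v:
            continue  # assumption: a rule needs a non-empty consequent
        for z in subsets(y - v):  # Z is drawn from what V leaves over, so Z ∩ V = ∅
            rules.add((x | z, v))
    return rules

# Cover of the rule {A} -> {B, C}.
c = cover("A", "BC")
```

For {A} → {B, C} the cover contains the rule itself together with A → B, A → C, AC → B and AB → C, which illustrates why a single representative rule stands in for several shorter ones.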

  15. Representative Association Rules
      Transactions:
        {A,B,C,D,E}
        {A,B,C,D,E,F}
        {A,B,C,D,E,H,I}
        {A,B,E}
        {B,C,D,E,H,I}
      Find RR(2, 80%)
      Representative Rules
        From (BCDEHI):
          {H} → {B,C,D,E,I}
          {I} → {B,C,D,E,H}
        From (ABCDE):
          {A,C} → {B,D,E}
          {A,D} → {B,C,E}
      Last set: (BCDEHI, 2)
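
A brute-force check of this example is sketched below. It assumes that rules have non-empty antecedents and consequents and that support is an absolute count; the function names are illustrative. Because the sketch enumerates every frequent itemset, its result can contain rules derived from other itemsets in addition to the four listed on the slide.

```python
from itertools import combinations

def sup(itemset, transactions):
    """Number of transactions that contain every item of itemset."""
    return sum(1 for t in transactions if itemset <= t)

def nonempty_subsets(items):
    """Yield every non-empty subset of items as a frozenset."""
    items = sorted(items)
    for r in range(1, len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)

def association_rules(transactions, s, c):
    """AR(s, c) as a set of (antecedent, consequent) pairs, found by brute force."""
    universe = frozenset().union(*transactions)
    rules = set()
    for itemset in nonempty_subsets(universe):
        support = sup(itemset, transactions)
        if len(itemset) < 2 or support < s:
            continue
        for x in nonempty_subsets(itemset):
            if x != itemset and support / sup(x, transactions) >= c:
                rules.add((x, itemset - x))
    return rules

def in_cover(rule, other):
    """True if rule belongs to the cover of other (same or longer antecedent,
    and all of its items drawn from other's items)."""
    (x, y), (x2, y2) = rule, other
    return x2 <= x and (x | y) <= (x2 | y2)

def representative_rules(transactions, s, c):
    """Rules of AR(s, c) that are not in the cover of any other rule of AR(s, c)."""
    ar = association_rules(transactions, s, c)
    return {r for r in ar if not any(o != r and in_cover(r, o) for o in ar)}

transactions = [frozenset("ABCDE"), frozenset("ABCDEF"), frozenset("ABCDEHI"),
                frozenset("ABE"), frozenset("BCDEHI")]
rr = representative_rules(transactions, 2, 0.8)
```

The exhaustive enumeration is exponential in the number of items, so this is suitable only for checking small examples such as this one.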

  16. Questions? Thank You
