
Association Rule

Association Rule. By Kenneth Leung. Data Mining: the process of extracting valid, previously unknown, comprehensible, and actionable information from large databases, and using it to make crucial business decisions. Decisions are made based on previous experience or observation.

Presentation Transcript


  1. Association Rule By Kenneth Leung

  2. Data Mining • The process of extracting valid, previously unknown, comprehensible, and actionable information from large databases, and using it to make crucial business decisions. • Decisions are made based on previous experience or observation.

  3. Association Rule Mining • Formal: To find interesting associations and/or correlation relationships among a large set of data items. Association rules show attribute-value conditions that frequently occur together in a given dataset. • Informal: An “If – Then” relationship. If this happens, what is most likely to happen next? Example: Obesity => Diabetes

  4. Market Basket Analysis A typical and widely-used example of association rule mining. Example: • Data are collected using bar-code scanners in supermarkets. • Each record will consist of all items in a single purchase transaction. • Managers would be interested to know if certain groups of items are consistently purchased together. • They could use this data for adjusting store layouts (placing items optimally with respect to each other), for cross-selling, for promotions, for catalog design and to identify customer segments based on buying patterns. 
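As a rough illustration of finding items that are purchased together, the sketch below counts how many transactions contain each pair of items. The transactions and the helper name `count_pairs` are hypothetical and invented for this example; real bar-code data would be far larger.

```python
from collections import Counter
from itertools import combinations

# Hypothetical transactions: each set holds the items in one purchase.
transactions = [
    {"beer", "diapers", "chips"},
    {"beer", "diapers"},
    {"milk", "bread"},
    {"beer", "chips"},
    {"milk", "diapers", "bread"},
]

def count_pairs(transactions):
    """Count how many transactions contain each unordered pair of items."""
    counts = Counter()
    for basket in transactions:
        # Sort so that each pair is always stored in the same order.
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    return counts

pair_counts = count_pairs(transactions)
for pair, n in pair_counts.most_common(3):
    print(pair, n)
```

Pairs with high counts are candidates for the layout, cross-selling, and promotion decisions described above.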

  5. Famous & Interesting Finding • Beer & Diapers: “A number of convenience store clerks noticed that men often bought beer at the same time they bought diapers. The store mined its receipts and proved the clerks' observations correct. So, the store began stocking diapers next to the beer coolers, and sales skyrocketed.”

  6. Why Beer and Diapers? • Moms are stressed out by their naughty babies and need some beer for relief? • Diaper boxes are handy for holding old beer bottles: very environmentally friendly and easy to handle.

  7. Two Certainty Indices • Used to determine whether a rule is good. • Support of an association rule (AR) X => Y: the percentage of all transactions that contain both X and Y (X and Y are two items). • Confidence of an AR X => Y: the ratio of the number of transactions that contain both X and Y to the number that contain X. • The higher these values, the more reliable the rule.
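The two indices can be sketched in a few lines of Python. This is a minimal illustration under assumed inputs: the function names `support` and `confidence` and the toy `baskets` data are made up for this example, and each transaction is represented as a set of items.

```python
def support(transactions, x, y):
    """Fraction of all transactions that contain every item in x and in y."""
    both = set(x) | set(y)
    hits = sum(1 for t in transactions if both <= set(t))
    return hits / len(transactions)

def confidence(transactions, x, y):
    """Among transactions containing x, the fraction that also contain y."""
    x = set(x)
    both = x | set(y)
    n_x = sum(1 for t in transactions if x <= set(t))
    n_xy = sum(1 for t in transactions if both <= set(t))
    # Returning 0.0 when x never occurs is a convention chosen for this sketch.
    return n_xy / n_x if n_x else 0.0

# Hypothetical toy data: five purchases.
baskets = [
    {"beer", "diapers"},
    {"beer"},
    {"beer", "diapers", "milk"},
    {"milk"},
    {"diapers"},
]

s = support(baskets, {"beer"}, {"diapers"})     # 2 of 5 baskets
c = confidence(baskets, {"beer"}, {"diapers"})  # 2 of the 3 beer baskets
print(f"support = {s:.2f}, confidence = {c:.2f}")
```

Note that support is symmetric in X and Y, while confidence is not: swapping X and Y changes the denominator.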

  8. Example: Support • A supermarket has 100,000 transactions. • 2,000 of the 100,000 transactions include beer. • 800 of those 2,000 beer transactions also contain diapers. • Support for the rule “beer->diapers” is 800 transactions, or 800/100,000 = 0.008, i.e. 0.8%

  9. Example: Confidence • A supermarket has 100,000 transactions. • 2,000 of the 100,000 transactions include beer. • 800 of those 2,000 beer transactions also contain diapers. • Confidence for the rule “beer->diapers” is 800/2,000 = 0.4, or 40%
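The arithmetic for the beer and diapers example can be checked directly from the three counts given on the slides. The variable names below are chosen for this sketch only.

```python
# Counts taken from the supermarket example.
total_transactions = 100_000
beer_transactions = 2_000
beer_and_diapers = 800

# Support: share of ALL transactions containing both beer and diapers.
support = beer_and_diapers / total_transactions

# Confidence: share of BEER transactions that also contain diapers.
confidence = beer_and_diapers / beer_transactions

print(f"support    = {support:.3f} ({support:.1%})")        # 0.008 (0.8%)
print(f"confidence = {confidence:.1f} ({confidence:.0%})")  # 0.4 (40%)
```

Both indices share the same numerator; only the denominator differs, which is why confidence is always at least as large as support.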

  10. Full example from Wiki
      • {Cold, Raining} => No: Support 2/5 = 40%, Confidence 2/2 = 100% => Good
      • {Calm, Dry} => Yes: Support 2/5 = 40%, Confidence 2/2 = 100% => Good
      • {Dry} => No: Support 1/5 = 20%, Confidence 1/3 = 33.3% => Bad
      • {Windy} => No: Support 0/5 = 0%, Confidence 1/1 = 100% => Bad

  11. References • http://www.resample.com/xlminer/help/Assocrules/associationrules_intro.htm • http://en.wikipedia.org/wiki/Association_rule_learning • Dr Sin-Min Lee’s lecture 30
