Elective-I
Examination Scheme
• In-semester assessment: 30
• End-semester assessment: 70
Data Mining Techniques and Applications
Text Books:
• Data Mining: Concepts and Techniques, Jiawei Han and Micheline Kamber
• Introduction to Data Mining with Case Studies, G. K. Gupta
Reference Books:
• Mining the Web: Discovering Knowledge from Hypertext Data, Soumen Chakrabarti
• Reinforcement and Systemic Machine Learning for Decision Making, Parag Kulkarni
Unit 2: Concepts of Frequent Patterns, Associations, and Correlations
• Market basket analysis
• Frequent itemset, closed itemset, association rules
• Mining multilevel association rules
• Constraint-based association rule mining
• Apriori algorithm
• FP-growth algorithm
Some Definitions
• Itemset: a set of items. Each transaction is itself an itemset.
• Confidence: a measure of the trustworthiness of a discovered pattern. For a rule X → Y, it is the fraction of transactions containing X that also contain Y.
• Support: a measure of how often the items in an association occur together, as a percentage of all transactions.
• Frequent itemset: an itemset that satisfies the minimum support threshold.
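The support and confidence definitions can be written compactly as formulas. This is the usual textbook formalization rather than notation taken from the slides: N (the total number of transactions) and σ(·) (the number of transactions containing an itemset) are symbols assumed here for illustration.

```latex
% Assumed notation: N = number of transactions,
% \sigma(Z) = number of transactions that contain every item of Z.
s(X \rightarrow Y) = \frac{\sigma(X \cup Y)}{N},
\qquad
c(X \rightarrow Y) = \frac{\sigma(X \cup Y)}{\sigma(X)}
```

An itemset Z is then frequent when σ(Z)/N is at least the chosen minimum support.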
Market Basket Analysis
• Definition: Market basket analysis (association analysis) is a mathematical modeling technique based on the theory that customers who buy a certain group of items are likely to buy another group of items.
• It analyzes customer purchasing behavior using point-of-sale transaction data, which helps increase sales and maintain inventory.
Market Basket Analysis
• Identify purchase patterns
• Which items tend to be purchased together?
  • Obvious: steak and potatoes; diapers and baby lotion
• Which items are purchased sequentially?
  • Obvious: house, then furniture; car, then tires
• Which items tend to be purchased by season?
Continued
• Categorize customer purchase behavior
  • Purchase profiles
  • Profitability of each purchase profile
• Use it for marketing
  • Layout or catalogs
  • Select products for promotion
  • Space allocation
Possible Market Baskets
• Customer 1: beer, pretzels, potato chips, aspirin
• Customer 2: diapers, baby lotion, grapefruit juice, baby food, milk
• Customer 3: soda, potato chips, milk
• Customer 4: soup, beer, milk, ice cream
• Customer 5: soda, coffee, milk, bread
• Customer 6: beer, potato chips
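One simple way to surface items that tend to be purchased together in the six baskets above is to count how many baskets contain each pair of items. The sketch below is only an illustration: the function name `pair_counts` and the cut-off of two baskets are assumptions, not part of the slides.

```python
from collections import Counter
from itertools import combinations

# The six example baskets from the slide, one set per customer.
baskets = [
    {"beer", "pretzels", "potato chips", "aspirin"},
    {"diapers", "baby lotion", "grapefruit juice", "baby food", "milk"},
    {"soda", "potato chips", "milk"},
    {"soup", "beer", "milk", "ice cream"},
    {"soda", "coffee", "milk", "bread"},
    {"beer", "potato chips"},
]


def pair_counts(baskets):
    """Count, for every pair of items, the number of baskets containing both."""
    counts = Counter()
    for basket in baskets:
        # Sorting gives each pair one canonical (alphabetical) ordering.
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    return counts


counts = pair_counts(baskets)

# Hypothetical cut-off: keep pairs that appear in at least two baskets.
repeated = {pair: n for pair, n in counts.items() if n >= 2}
for pair, n in repeated.items():
    print(pair, n)
```

With only six baskets the counts are tiny, but the same counting idea underlies the support measure used for frequent itemsets.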
Purchase Profiles
beauty conscious, kids' play, convenience food, health conscious, pet lover, women's fashion, sports conscious, gardener, kids' fashion, smoker, automotive, hobbyist, casual drinker, photographer, student/home office, new family, tv/stereo enthusiast, illness (prescription), illness (over-the-counter), seasonal/traditional, personal care, casual reader, homemaker, home handyman, home comfort, men's image conscious, fashion footwear, sentimental, men's fashion
Purchase Profiles
• Beauty conscious
  • cotton balls
  • hair dye
  • cologne
  • nail polish
Market Basket Analysis
• Benefits:
  • Simple computations
  • Can be undirected (no hypotheses are needed before the analysis)
  • Different data forms can be analyzed
Definition: Frequent Itemset
• Itemset
  • A collection of one or more items
  • Example: {Milk, Bread, Diaper}
• Support count (σ)
  • Frequency of occurrence of an itemset
  • E.g. σ({Milk, Bread, Diaper}) = 2
• Support (s)
  • Fraction of transactions that contain an itemset
  • E.g. s({Milk, Bread, Diaper}) = 2/5
• Frequent itemset
  • An itemset whose support is greater than or equal to a minsup threshold
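The definitions above can be checked with a few lines of Python. The transaction table behind the slide's numbers is not included in the text, so the five transactions below are an assumption, chosen so that they reproduce the stated values (support count 2 and support 2/5 for {Milk, Bread, Diaper}). The helper names are hypothetical, and `frequent_itemsets` is a deliberately naive brute-force enumeration, not Apriori or FP-growth.

```python
from itertools import combinations

# ASSUMPTION: illustrative transactions, picked to match the slide's
# example values; the original table is not part of the text.
transactions = [
    {"Bread", "Milk"},
    {"Bread", "Diaper", "Beer", "Eggs"},
    {"Milk", "Diaper", "Beer", "Coke"},
    {"Bread", "Milk", "Diaper", "Beer"},
    {"Bread", "Milk", "Diaper", "Coke"},
]


def support_count(itemset, transactions):
    """Number of transactions that contain every item of the itemset."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= t)


def support(itemset, transactions):
    """Fraction of transactions that contain the itemset."""
    return support_count(itemset, transactions) / len(transactions)


def frequent_itemsets(transactions, minsup):
    """Brute force: test every non-empty itemset against minsup."""
    items = sorted(set().union(*transactions))
    result = {}
    for k in range(1, len(items) + 1):
        for candidate in combinations(items, k):
            s = support(candidate, transactions)
            if s >= minsup:
                result[candidate] = s
    return result


print(support_count({"Milk", "Bread", "Diaper"}, transactions))
print(support({"Milk", "Bread", "Diaper"}, transactions))
print(frequent_itemsets(transactions, minsup=0.6))
```

Enumerating every itemset is exponential in the number of distinct items, which is why practical miners prune the search space instead of testing all candidates.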
Association Rule Mining
• Given a set of transactions, find rules that predict the occurrence of an item based on the occurrences of other items in the transaction.
[Table: market-basket transactions]
Examples of association rules:
• {Diaper} → {Beer}
• {Milk, Bread} → {Eggs, Coke}
• {Beer, Bread} → {Milk}
Implication means co-occurrence, not causality.
Mining Association Rules
Examples of rules:
• {Milk, Diaper} → {Beer} (s=0.4, c=0.67)
• {Milk, Beer} → {Diaper} (s=0.4, c=1.0)
• {Diaper, Beer} → {Milk} (s=0.4, c=0.67)
• {Beer} → {Milk, Diaper} (s=0.4, c=0.67)
• {Diaper} → {Milk, Beer} (s=0.4, c=0.5)
• {Milk} → {Diaper, Beer} (s=0.4, c=0.5)
Observations:
• All the above rules are binary partitions of the same itemset: {Milk, Diaper, Beer}
• Rules originating from the same itemset have identical support but can have different confidence
• Thus, we may decouple the support and confidence requirements
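The observations above can be reproduced by enumerating every binary partition of {Milk, Diaper, Beer} and computing support and confidence for each resulting rule. As before, the five transactions are an assumed table (it is not included in the text) that happens to yield the slide's values, and `rules_from_itemset` is a hypothetical helper name.

```python
from itertools import combinations

# ASSUMPTION: illustrative transactions consistent with the slide's
# (s, c) values; the original table is not part of the text.
transactions = [
    {"Bread", "Milk"},
    {"Bread", "Diaper", "Beer", "Eggs"},
    {"Milk", "Diaper", "Beer", "Coke"},
    {"Bread", "Milk", "Diaper", "Beer"},
    {"Bread", "Milk", "Diaper", "Coke"},
]


def support_count(itemset, transactions):
    """Number of transactions that contain every item of the itemset."""
    return sum(1 for t in transactions if set(itemset) <= t)


def rules_from_itemset(itemset, transactions):
    """Return (lhs, rhs, support, confidence) for each binary partition."""
    itemset = frozenset(itemset)
    n = len(transactions)
    both = support_count(itemset, transactions)  # shared by every rule
    rules = []
    for k in range(1, len(itemset)):
        for lhs in combinations(sorted(itemset), k):
            lhs = frozenset(lhs)
            lhs_count = support_count(lhs, transactions)
            if lhs_count == 0:
                continue  # confidence is undefined if lhs never occurs
            rules.append((lhs, itemset - lhs, both / n, both / lhs_count))
    return rules


rules = rules_from_itemset({"Milk", "Diaper", "Beer"}, transactions)
for lhs, rhs, s, c in rules:
    print(f"{sorted(lhs)} -> {sorted(rhs)} (s={s:.1f}, c={c:.2f})")
```

Because the numerator σ(X ∪ Y) is the same for every partition, support is computed once per itemset; only the confidence denominator changes from rule to rule. This is what makes it reasonable to find frequent itemsets first and generate rules from them afterwards.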