Market Basket & Advanced Analytics at Dunkin Brands Mahesh Jagannath, Prasanna Palanisamy Oct 1, 2014
Agenda
• About Dunkin Brands Inc.
• BI Program at Dunkin Brands
• BI Architecture at Dunkin Brands
• Advanced Analytics Architecture & Methodology
• Advanced Analytics Use Cases at Dunkin
  • Market Basket
  • Customer Analytics
• Q & A
Disclaimer: All data used is sample data for presentation purposes only and is not actual corporate sales or consumer data.
BI Program at Dunkin Brands
• First launched at DBI in 2007
• 1350 BI users today with role-based access to 504 dashboard pages
• Mature governance process
• Domestic POS sales analysis to increase comparable store sales and profitability of DD and BR in the U.S.
• Store development dashboards to identify opportunities to continue DD U.S. contiguous store expansion
• International reported sales analysis to drive accelerated international growth across both brands
BI/DW Architecture at Dunkin Brands
[Architecture diagram; component labels: Radiant Sales Data, Loyalty / CRM, Steton, SMG, PAR, RPS, Bluecube, PAR Terminals, RPS Archive, Intl. POS, Social Media, Oracle EBS, Hyperion, Other DBI Data, Enterprise Data Warehouse, Exadata, Exalytics, R, Oracle BI, Hyperion Users, DBI Corporate Users, Franchisees (above store)]
Advanced Analytics Platform
• Products considered
  • Oracle Advanced Analytics / Oracle R Enterprise (ORE)
  • Open Source R
  • IBM SPSS
• Chose Oracle Advanced Analytics
  • Excellent fit with existing analytics infrastructure
  • All the benefits of open source R
  • Scalability of Oracle 11g on engineered systems
R: Widely Popular
R is a statistics language similar to Base SAS or SPSS Statistics.
R environment
• Strengths
  • Powerful & extensible
  • Graphical & extensive statistics
  • Free (open source)
• Challenges
  • Memory constrained
  • Single threaded
  • Outer loop slows down processing
  • Not industrial strength
Oracle Advanced Analytics: Oracle R Enterprise Component Architecture
Oracle Advanced Analytics: Oracle R Enterprise Compute Engines
Market Basket Analysis
• Understand role of category and purchase behavior
• Identify category marketing opportunities
• Get richer insight into behavioral changes from promotions
• Apply data validation rules
• Transform POS data into MB input format
• Output to star schema suitable for OBIEE consumption
• Pairwise association model similar to Apriori, implemented in custom SQL
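To make the pairwise association idea concrete, here is a minimal Python sketch. It is not the custom SQL used at Dunkin Brands, which the slides do not show; the sample baskets and the function name `pairwise_rules` are assumptions made for illustration. It counts how often each pair of items shares a transaction and derives the usual support, confidence and lift measures.

```python
from collections import Counter
from itertools import combinations

# Hypothetical sample baskets: one set of item names per transaction.
baskets = [
    {"coffee", "donut"},
    {"coffee", "bagel"},
    {"coffee", "donut", "muffin"},
    {"donut"},
    {"coffee"},
]

def pairwise_rules(baskets):
    """Return {(a, b): (support, confidence, lift)} for every ordered pair a -> b."""
    n = len(baskets)
    item_counts = Counter()
    pair_counts = Counter()
    for basket in baskets:
        item_counts.update(basket)
        # Sorting makes each unordered pair appear under a single key.
        pair_counts.update(combinations(sorted(basket), 2))

    rules = {}
    for (a, b), both in pair_counts.items():
        for x, y in ((a, b), (b, a)):
            support = both / n                        # share of all transactions with both items
            confidence = both / item_counts[x]        # share of x-transactions that also hold y
            lift = confidence / (item_counts[y] / n)  # confidence relative to y's base rate
            rules[(x, y)] = (support, confidence, lift)
    return rules

rules = pairwise_rules(baskets)
support, confidence, lift = rules[("donut", "coffee")]
print(f"donut -> coffee: support={support:.2f}, confidence={confidence:.2f}, lift={lift:.2f}")
```

Restricting the model to pairs keeps the counting step a single pass over the transactions, which is one plausible reason a pairwise model suits a nightly batch.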
Market Basket Business Questions
Choose a category (sub-category level) and answer the following questions for that item in a particular region last week:
• What % of all transactions include [Product]?
• What related items are sold most frequently with [Product]?
• What is the average ticket $ amount when [Product] is present?
• On average, how many [Product] are sold in each transaction?
• What beverages are consumers buying most with [Product]?
• In what % of [Product] transactions is [Product] the only product purchased?
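Most of the questions above reduce to simple aggregates over the transactions that contain the product. The sketch below shows one way to compute them in Python; the sample data, the field layout and the function name `product_metrics` are hypothetical, and region and date filters are left out for brevity.

```python
from collections import Counter

# Hypothetical sample data: transaction id -> list of (product, quantity, line_amount).
transactions = {
    "t1": [("donut", 2, 2.00), ("coffee", 1, 2.50)],
    "t2": [("donut", 1, 1.00)],
    "t3": [("coffee", 1, 2.50), ("bagel", 1, 1.50)],
    "t4": [("donut", 3, 3.00), ("coffee", 2, 5.00), ("bagel", 1, 1.50)],
}

def product_metrics(transactions, product):
    """Answer the basket questions for one product of interest."""
    with_product = [
        lines for lines in transactions.values()
        if any(p == product for p, _, _ in lines)
    ]
    n = len(with_product)
    if n == 0:
        return None

    related = Counter()
    solo = 0
    for lines in with_product:
        others = {p for p, _, _ in lines if p != product}
        related.update(others)   # count each related item once per transaction
        solo += not others       # True counts as 1 when the product was bought alone

    return {
        "pct_of_transactions": 100 * n / len(transactions),
        "avg_ticket": sum(amt for lines in with_product for _, _, amt in lines) / n,
        "avg_units": sum(q for lines in with_product for p, q, _ in lines if p == product) / n,
        "pct_single_item": 100 * solo / n,
        "top_related": related.most_common(3),
    }

metrics = product_metrics(transactions, "donut")
print(metrics)
```

The beverage question would use the same `related` counter filtered by a product-category lookup, which this sketch does not model.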
Data Analysis & Design Considerations
• 8M daily transactions, ~25M transaction detail lines
• 20 TB data warehouse size, sales data about 10 TB
• Hierarchies: 5-level Product, 2x4-level Org, 4-level regional
• ~1000 SKUs at Item Group/Size level
• Exponential growth in combinations with each hierarchy
• 2 years of pre-computed Market Baskets and associated sales measures for reporting
• Nightly compute within ETL window; data with 1-day latency
• Measures are non-additive along the Product hierarchy
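A quick back-of-the-envelope calculation shows why the number of combinations matters at this scale. The Python sketch below uses the approximate SKU count from the slide. The level count is only one reading of the hierarchy bullet: it assumes two Org hierarchies of four levels each and one reporting level chosen from each of Product, Org and regional.

```python
from math import comb

sku_count = 1000  # approximate SKU count quoted above

# Unordered item combinations that could in principle co-occur in a basket.
pairs = comb(sku_count, 2)
triples = comb(sku_count, 3)

# Assumed reading of the hierarchies: 5 Product levels, 2 x 4 Org levels,
# 4 regional levels, with one level picked from each dimension.
level_combinations = 5 * (2 * 4) * 4

print(f"item pairs: {pairs:,}")
print(f"item triples: {triples:,}")
print(f"reporting-level combinations: {level_combinations}")
```

Moving from pairs to triples multiplies the candidate count by several hundred, which helps explain why a pairwise model is attractive when results are pre-computed for two years of history.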
Design: Approaches Considered
• Use Oracle Data Mining / Oracle R Enterprise Association Rules
• Use Frequent Itemset table function in Oracle 11g to compute item-sets
• Custom SQL development
• Approach chosen
  • Oracle Advanced Analytics for exploration / ad hoc
  • Custom SQL for repeatable basket computation
  • OBIEE for reporting
High-level Design
[Diagram; stage labels: Transaction Data, Rule Development, Data Model / Pre-processing, Measure Calculation, UI / Reports]
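The pre-processing stage can be pictured as a validation filter followed by a reshape from POS detail lines into one basket per transaction. The Python sketch below illustrates that shape only. The validation rule (non-empty item, positive quantity) and the names `is_valid` and `to_baskets` are assumptions, since the deck does not list the actual rules.

```python
from collections import defaultdict

# Hypothetical POS detail lines: (transaction_id, item, quantity).
pos_lines = [
    ("t1", "coffee", 1),
    ("t1", "donut", 2),
    ("t2", "coffee", 0),   # rejected by the illustrative rule: zero quantity
    ("t2", "bagel", 1),
    ("t3", "", 1),         # rejected by the illustrative rule: missing item
    ("t1", "donut", 1),    # same item on a second line of the same ticket
]

def is_valid(line):
    """Illustrative validation rule: an item name is present and quantity is positive."""
    _, item, qty = line
    return bool(item) and qty > 0

def to_baskets(lines):
    """Group valid detail lines into {transaction_id: {item: total_quantity}}."""
    baskets = defaultdict(lambda: defaultdict(int))
    for tid, item, qty in filter(is_valid, lines):
        baskets[tid][item] += qty
    return {tid: dict(items) for tid, items in baskets.items()}

baskets = to_baskets(pos_lines)
print(baskets)
```

Measure calculation would then run on these baskets, with the results written to fact tables for reporting.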
4 Key Reports
• % of transactions containing related items
• Single Item Transactions: % of transactions when products are purchased alone
• Transaction Detail: Product of Interest
• Related Product Pairings
Related Item: What beverages are sold most often with PM Flats?
POI Transaction Detail: Product of Interest
Related Purchases: Related Product Pairings
Related Transactions: Non-additive measures
The child-level transaction counts of 5, 3 and 3 do not roll up to 11 at the parent level in this case, because some medium and small coffees might be sold in the same transaction.
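The non-additivity is easy to reproduce with distinct counts. In the Python sketch below, the line items are invented so that the per-size transaction counts come out as 5, 3 and 3. Two transactions contain more than one size, so the distinct count at the parent level is smaller than the sum of the children.

```python
from collections import defaultdict

# Hypothetical coffee line items: (transaction_id, size).
lines = [
    ("t1", "large"), ("t2", "large"), ("t3", "large"), ("t4", "large"), ("t5", "large"),
    ("t5", "medium"), ("t6", "medium"), ("t7", "medium"),
    ("t7", "small"), ("t8", "small"), ("t9", "small"),
]

# Distinct transactions per child member (size).
tx_by_size = defaultdict(set)
for tid, size in lines:
    tx_by_size[size].add(tid)

child_counts = {size: len(tids) for size, tids in tx_by_size.items()}
sum_of_children = sum(child_counts.values())

# Distinct transactions at the parent level (any coffee size).
parent_count = len(set().union(*tx_by_size.values()))

print(child_counts)
print("sum of children:", sum_of_children)
print("parent distinct count:", parent_count)
```

This is why such measures have to be computed separately at each level of the Product hierarchy rather than summed up from the level below.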
Single Item Transactions: Click to drill down for more detail
Current Areas of Interest
• Customer Profiling
• Clustering / Segmentation
• Customer Churn Prediction
• Targeted Promotions
Customer Profiling
• Compute behavioral variables
• Create Customer record
• Data Exploration in R
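As an illustration of the first two steps, the Python sketch below derives a few behavioral variables per customer from a transaction list and assembles them into one record each. The variable names echo the cluster properties shown later in the deck, but the sample data, the two-week window and the 11:00 cutoff for a morning visit are assumptions.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical transactions: (customer_id, timestamp, amount, contains_coffee).
transactions = [
    ("c1", "2014-09-01 07:30", 3.50, True),
    ("c1", "2014-09-03 08:10", 5.00, True),
    ("c1", "2014-09-10 15:00", 4.00, False),
    ("c2", "2014-09-02 13:00", 10.00, False),
]
WEEKS = 2          # assumed length of the observation window
MORNING_END = 11   # assumed cutoff hour for a "morning" visit

def build_profiles(transactions, weeks=WEEKS):
    """Return one record of behavioral variables per customer."""
    grouped = defaultdict(list)
    for cid, ts, amount, has_coffee in transactions:
        when = datetime.strptime(ts, "%Y-%m-%d %H:%M")
        grouped[cid].append((when, amount, has_coffee))

    profiles = {}
    for cid, rows in grouped.items():
        visits = len(rows)
        profiles[cid] = {
            "avg_weekly_visits": visits / weeks,
            "avg_weekly_spend": sum(amount for _, amount, _ in rows) / weeks,
            "pct_morning_visits": 100 * sum(w.hour < MORNING_END for w, _, _ in rows) / visits,
            "pct_coffee_transactions": 100 * sum(c for _, _, c in rows) / visits,
        }
    return profiles

profiles = build_profiles(transactions)
for cid, record in profiles.items():
    print(cid, record)
```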
Customer Profiling: Attributes
List of customer attributes used as-is or derived from their transactional history
Customer Segmentation / Clustering
• To understand your customers
• Targeted Marketing
• Design Promotions
• Re-run the model periodically to update the clusters
• Indicates any shift in customer behavior
• Compute behavioral variables
• Create Customer record
• Data Exploration in R
• Model displays cluster means (cluster properties)
• Number of customers in a cluster
• Deployed for targeted marketing and monitoring customer behavior
• Identify variables for clustering
• Normalize data for clustering
• K-Means clustering used to cluster customers and find individual cluster characteristics
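The normalize-then-cluster steps can be sketched in plain Python as follows. This is a toy version of Lloyd's algorithm for K-Means using only the standard library, not the Oracle R Enterprise implementation used in the project. The six customer records and the choice of initial centroids are hypothetical.

```python
from statistics import mean, pstdev

# Hypothetical customer records: (avg_weekly_visits, avg_weekly_spend).
customers = [
    (5.0, 12.0), (5.5, 13.0), (6.0, 11.0),   # frequent visitors, lower spend
    (1.0, 30.0), (1.5, 32.0), (0.5, 31.0),   # infrequent visitors, higher spend
]

def normalize(rows):
    """Z-score each column so that no single variable dominates the distance."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]   # guard against constant columns
    return [tuple((v - m) / s for v, m, s in zip(row, mus, sds)) for row in rows]

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, centroids, max_iter=100):
    """Lloyd's algorithm starting from the given initial centroids."""
    centroids = list(centroids)
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(len(centroids)), key=lambda k: sq_dist(p, centroids[k]))
            for p in points
        ]
        if new_labels == labels:   # assignments stopped changing: converged
            break
        labels = new_labels
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:            # keep the old centroid if a cluster is empty
                centroids[k] = tuple(mean(col) for col in zip(*members))
    return labels, centroids

scaled = normalize(customers)
labels, _ = kmeans(scaled, centroids=[scaled[0], scaled[3]])

# Report cluster sizes and cluster means in the original units.
cluster_sizes = {}
cluster_means = {}
for k in sorted(set(labels)):
    members = [c for c, lab in zip(customers, labels) if lab == k]
    cluster_sizes[k] = len(members)
    cluster_means[k] = tuple(mean(col) for col in zip(*members))

print(cluster_sizes)
print(cluster_means)
```

Reading the cluster means in original units, rather than in z-scores, is what makes it possible to name a cluster and describe its properties.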
Customer Segmentation / Clustering
Analyze cluster means to derive cluster properties
[Diagram labels: Customer Data Profiles, Clustering Algorithm]
• Regulars
  • Avg weekly visits are 5
  • 78.2% of visits in the morning
  • Mostly coffee drinkers, but food buyers 25% of the time
• Coffee Regulars
  • Avg weekly visits are 5.45
  • Avg coffee transactions 80.29%
• High Spenders, Frequent Visitors
  • Avg weekly spend ($35.12)
  • Avg weekly visits (7.44)
  • Coffee and food in basket (avg items per transaction 2.4)
Customer Churn Analysis
• Define Churn & Active Customer
• Identify Churn Customer patterns
  • Is the churn pattern localized or national?
• Monitor the response and re-calibrate by updating training data or model parameters
• Calculate the metrics for model evaluation
• Compute behavioral variables
• Create Customer record
• Data Exploration in R
• Model will calculate the churn score for existing customers
• Flag customers as high risk or low risk based on churn score
• Create training data set
  • Should have equal distribution of churn and usual customers
• Test the model on a test data set for which the outcome is known
• Select threshold for model selection
• Confusion matrix for the best model
• Model to derive churn risk score
  • SVM
  • Logistic regression
  • Naïve Bayes classifier
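The evaluation steps (score, threshold, confusion matrix, metrics) can be illustrated without committing to a particular model. The Python sketch below assumes churn scores have already been produced by some classifier. The scores, outcomes and candidate thresholds are invented for illustration.

```python
# Hypothetical test set: (churn_score from some model, actual outcome where 1 = churned).
scored = [
    (0.92, 1), (0.81, 1), (0.77, 0), (0.64, 1),
    (0.45, 0), (0.38, 1), (0.21, 0), (0.10, 0),
]

def confusion_matrix(scored, threshold):
    """Count true/false positives and negatives when flagging score >= threshold as high risk."""
    cm = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for score, actual in scored:
        predicted = score >= threshold
        if predicted and actual:
            cm["tp"] += 1
        elif predicted:
            cm["fp"] += 1
        elif actual:
            cm["fn"] += 1
        else:
            cm["tn"] += 1
    return cm

def evaluate(cm):
    """Derive precision, recall and accuracy from a confusion matrix."""
    flagged = cm["tp"] + cm["fp"]
    churned = cm["tp"] + cm["fn"]
    total = sum(cm.values())
    return {
        "precision": cm["tp"] / flagged if flagged else 0.0,
        "recall": cm["tp"] / churned if churned else 0.0,
        "accuracy": (cm["tp"] + cm["tn"]) / total,
    }

# Compare a few candidate thresholds before choosing one.
results = {t: evaluate(confusion_matrix(scored, t)) for t in (0.3, 0.5, 0.7)}
for threshold, m in results.items():
    print(threshold, m)
```

Lowering the threshold catches more churners at the cost of more false alarms, so the choice depends on how expensive a retention offer is relative to a lost customer.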
Possible Future Initiatives
• Periodic churn rate modeling: measure churn over time
• Customer segments based on buying pattern: what they buy, when they buy
• Identify customers who are more likely to respond to offers
• Personalized promotions for retention
• Customer lifetime value
• Customer sentiment analysis
• Enrich customer profiles with modeling scores