This research paper presents a method for recommending grocery products based on the current shopping context and user preferences. It introduces a basket-sensitive random walk model that explores higher-order neighbourhood information and uses a personalization vector to incorporate context information. The proposed model is compared with other recommendation models using several performance metrics and data sets.
The 15th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, Paris, 2009
Grocery Shopping Recommendation Based on Basket-Sensitive Random Walk
Ming Li(1), Ben Dias(1), Wael El-Deredy(2), Ian Jarman(3) and Paulo Lisboa(3)
(1) Unilever Discover (Colworth), UK
(2) University of Manchester, UK
(3) Liverpool John Moores University, UK
The Problem Method Performance Metric Experiment Summary
The Problem
• Grocery shopping recommendation:
  • Grocery shopping is considered a real drudgery
  • High repeat purchase rate with low injection of new products
  • Implicit product preference feedback
  • Recommendation based on current context
• Three issues in a recommendation model:
  • A method to derive user-wise or product-wise similarity
  • A method to generate recommendations based on those similarities
  • An evaluation strategy to tune the model on retrospective data for optimal live performance
Prior Art
• Item-item collaborative filtering
  • memory efficient (e.g. 54GB -> 24GB -> 9MB)
  • subject to the sparsity problem
  • only exploits direct (i.e. first-order) neighbourhood information
• Movie/book recommendation via random walk model
  • alleviates the sparsity problem by exploring transitive (i.e. high-order) neighbourhood information
  • Two concerns:
    • definition of the transition probability (column-normalized similarity)
    • ranking is insensitive to the current context
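To make the "column-normalized similarity" concern concrete, here is a minimal sketch of how an item-item similarity matrix can be turned into a transition matrix. The cosine similarity, the toy purchase counts and all function names are assumptions chosen for illustration; the slides do not say which similarity the prior work used.

```python
from math import sqrt

# Hypothetical toy data: rows are consumers, columns are products p0..p2.
PURCHASES = [
    [2, 1, 0],
    [0, 3, 1],
    [1, 0, 0],
]

def cosine_similarity_matrix(purchases):
    """Item-item cosine similarity over purchase-count columns (diagonal left at 0)."""
    n_items = len(purchases[0])
    cols = [[row[j] for row in purchases] for j in range(n_items)]
    norms = [sqrt(sum(v * v for v in col)) for col in cols]
    sim = [[0.0] * n_items for _ in range(n_items)]
    for i in range(n_items):
        for j in range(n_items):
            if i != j and norms[i] > 0 and norms[j] > 0:
                dot = sum(a * b for a, b in zip(cols[i], cols[j]))
                sim[i][j] = dot / (norms[i] * norms[j])
    return sim

def column_normalize(matrix):
    """Divide every column by its sum so that each column becomes a probability distribution."""
    n = len(matrix)
    col_sums = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
    return [
        [matrix[i][j] / col_sums[j] if col_sums[j] > 0 else 0.0 for j in range(n)]
        for i in range(n)
    ]

SIM = cosine_similarity_matrix(PURCHASES)
TRANSITION = column_normalize(SIM)  # TRANSITION[i][j]: probability of stepping from product j to product i
```

In this toy data p0 and p2 are never bought together, so their direct similarity is zero; a random walk over TRANSITION can still connect them through p1, which is the higher-order information that first-order collaborative filtering misses.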
Proposed: Basket-Sensitive Random Walk
[Figure: bipartite network linking consumers c1 to c4 with products p1 to p3; each edge carries a purchase count f(consumer, product), e.g. f(1,1), f(4,2)]
• Define the product transition probability via a bipartite network:
  • i.e. first-order similarity
  • α penalizes consumers (products) with too many transactions
  • Subject to data sparsity
• Enforce similarity from higher-order information, but with reference to a user basket U:
  • Context information is introduced by U_basket (i.e. the personalization vector)
  • 1-d controls the bias between current and past baskets
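The formulas for this slide did not survive in the transcript, so the following is only a sketch of one plausible reading, not the authors' exact definition. It assumes a two-step walk product -> consumer -> product over the purchase counts f, with each step's weights divided by the other side's total purchases raised to α and then normalized, and a restart update of the form R = d·P·R + (1-d)·U_basket. The names `transition_matrix`, `basket_scores` and `recommend`, and the toy matrix `F`, are hypothetical.

```python
def normalize(weights):
    """Scale non-negative weights so they sum to 1 (all zeros stay zeros)."""
    total = sum(weights)
    return [w / total if total > 0 else 0.0 for w in weights]

def transition_matrix(f, alpha):
    """Assumed form: P[j][i] = sum over consumers c of P(p_j | c) * P(c | p_i)."""
    n_consumers, n_products = len(f), len(f[0])
    consumer_total = [sum(row) for row in f]
    product_total = [sum(f[c][p] for c in range(n_consumers)) for p in range(n_products)]
    # Step product i -> consumer c, penalizing consumers with many transactions.
    to_consumer = [
        normalize([f[c][i] / consumer_total[c] ** alpha if consumer_total[c] else 0.0
                   for c in range(n_consumers)])
        for i in range(n_products)
    ]
    # Step consumer c -> product j, penalizing products with many transactions.
    to_product = [
        normalize([f[c][j] / product_total[j] ** alpha if product_total[j] else 0.0
                   for j in range(n_products)])
        for c in range(n_consumers)
    ]
    return [
        [sum(to_product[c][j] * to_consumer[i][c] for c in range(n_consumers))
         for i in range(n_products)]
        for j in range(n_products)
    ]

def basket_scores(P, basket, d, iterations=100):
    """Iterate R = d * P * R + (1 - d) * U, where U is uniform over the current basket."""
    n = len(P)
    U = normalize([1.0 if p in basket else 0.0 for p in range(n)])
    R = U[:]
    for _ in range(iterations):
        R = [d * sum(P[j][i] * R[i] for i in range(n)) + (1 - d) * U[j] for j in range(n)]
    return R

def recommend(P, basket, d):
    """Rank the products that are not already in the basket by their score."""
    R = basket_scores(P, basket, d)
    return sorted((p for p in range(len(P)) if p not in basket), key=lambda p: -R[p])

# Hypothetical purchase counts: 4 consumers (rows) x 3 products (columns).
F = [[2, 1, 0], [0, 3, 1], [1, 0, 0], [0, 2, 2]]
P = transition_matrix(F, alpha=0.5)
```

Under these assumptions a larger 1-d keeps the scores closer to the current basket, while a larger d lets the walk spread further through the purchase network.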
Proposed (continued)
• Basket-Sensitive Random Walk Model
  • Straightforward implementation is infeasible
  • Quick approximation whose main component is calculated offline (also called 'random walk with restart')
  • Leads to the same ordered list of recommendations
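The symbols on this slide are missing from the transcript, so the sketch below shows one way such an offline approximation can work rather than the authors' stated procedure. It assumes the restart update R = d·P·R + (1-d)·U, precomputes one random walk with restart per product offline, and sums the precomputed vectors of the basket items online. Because the update is linear in U, the sum is a rescaled copy of the exact basket vector and therefore gives the same ordering. The matrix `P` and all function names are hypothetical.

```python
# Hypothetical column-stochastic transition matrix over 4 products;
# P[j][i] is the probability of stepping from product i to product j.
P = [
    [0.0, 0.4, 0.3, 0.1],
    [0.5, 0.0, 0.3, 0.6],
    [0.3, 0.4, 0.0, 0.3],
    [0.2, 0.2, 0.4, 0.0],
]

def random_walk_with_restart(P, U, d, iterations=200):
    """Iterate R = d * P * R + (1 - d) * U, starting from R = U."""
    n = len(P)
    R = U[:]
    for _ in range(iterations):
        R = [d * sum(P[j][i] * R[i] for i in range(n)) + (1 - d) * U[j] for j in range(n)]
    return R

def precompute_item_ranks(P, d):
    """Offline step: one restart vector per product, each restarting at that product only."""
    n = len(P)
    return [
        random_walk_with_restart(P, [1.0 if k == i else 0.0 for k in range(n)], d)
        for i in range(n)
    ]

def rank_basket_online(item_ranks, basket):
    """Online step: add up the precomputed vectors of the basket items and sort the rest."""
    n = len(item_ranks)
    scores = [sum(item_ranks[i][j] for i in basket) for j in range(n)]
    return sorted((j for j in range(n) if j not in basket), key=lambda j: -scores[j])

def rank_basket_exact(P, basket, d):
    """Reference: run the walk directly with a uniform restart over the basket."""
    n = len(P)
    U = [1.0 / len(basket) if k in basket else 0.0 for k in range(n)]
    R = random_walk_with_restart(P, U, d)
    return sorted((j for j in range(n) if j not in basket), key=lambda j: -R[j])

ITEM_RANKS = precompute_item_ranks(P, d=0.5)
BASKET = [0, 1]
APPROX_ORDER = rank_basket_online(ITEM_RANKS, BASKET)
EXACT_ORDER = rank_basket_exact(P, BASKET, d=0.5)
```

The online step costs one vector addition per basket item, which is what makes the per-basket ranking cheap once the per-product vectors have been stored.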
Performance Metric
• Metrics:
  • weighted hit rate (wHR) via leave-one-out split
  • binary hit rate (bHR) via popularity-based split
  • micro-averaged hit rate via leave-one-out split
  • macro-averaged hit rate (macroHR) via leave-one-out split
• Characteristics of performance metrics:
  • basket-oriented vs. product-oriented
  • bias toward most popular products vs. bias toward least popular products
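As an illustration of why averaging matters, the sketch below computes micro- and macro-averaged hit rates under a leave-one-out split. The definitions used here are the common ones (micro pools all test cases, macro averages per-product hit rates) and are an assumption; the slides do not spell out their formulas. The popularity recommender, the toy baskets and all names are hypothetical.

```python
from collections import defaultdict

def leave_one_out_cases(baskets):
    """Yield (remaining_items, held_out_item) for every item of every basket."""
    for basket in baskets:
        for held_out in basket:
            remaining = [p for p in basket if p != held_out]
            if remaining:
                yield remaining, held_out

def evaluate(baskets, recommend, top_n):
    """Return (held_out_product, hit) pairs; a hit means the held-out item is in the top N."""
    return [
        (held_out, held_out in recommend(remaining)[:top_n])
        for remaining, held_out in leave_one_out_cases(baskets)
    ]

def micro_hit_rate(outcomes):
    """Pool all test cases, so frequently purchased products dominate the score."""
    return sum(hit for _, hit in outcomes) / len(outcomes)

def macro_hit_rate(outcomes):
    """Average the per-product hit rates, so rare products count as much as popular ones."""
    per_product = defaultdict(list)
    for product, hit in outcomes:
        per_product[product].append(hit)
    return sum(sum(hits) / len(hits) for hits in per_product.values()) / len(per_product)

# Hypothetical baseline: always recommend the globally most popular products first.
POPULAR = ["milk", "bread", "eggs", "tea"]

def popularity_recommender(remaining):
    return [p for p in POPULAR if p not in remaining]

BASKETS = [["milk", "bread"], ["milk", "eggs"], ["milk", "bread", "tea"]]
OUTCOMES = evaluate(BASKETS, popularity_recommender, top_n=1)
MICRO = micro_hit_rate(OUTCOMES)
MACRO = macro_hit_rate(OUTCOMES)
```

On this toy data the popularity baseline scores higher on the micro average than on the macro average, because it never hits the rarely bought products.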
Experiments
• Three real grocery data sets
  • One from the collaborator
    • online grocery store www.Leshop.ch
  • Two from other published research works
    • membership retailer warehouse (Chun-Nan Hsu et al., JML04)
    • anonymized retail store (Tom Brijs et al., KDD99)
• Several performance metrics with different characteristics
• Experiments
  • Impact of model parameters
  • Comparison with other models
  • Data sparsity
  • Personalization
Impact of model parameters: α and d
• Performance metric: left: bHR(pop); right: macroHR
• Model parameters:
  • α: a bigger value indicates stronger penalization of products with too many transactions
  • 1-d: bias toward the current basket
• Observation: bHR and macroHR are inconsistent in some data sets (macroHR has a stronger bias toward the least popular products)
Comparison with other models
• Performance metric: bHR(pop) and bHR(rnd): binary hit rate using popularity-based split and random split
• Observations:
  • Empirical advantage of network-based similarity over other metric-based ones
  • Performance is overestimated by the random basket split
  • An appropriate performance metric needs to be determined by business rules, e.g. bHR(pop) for grocery products and bHR(rnd) for movies
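The slides do not define the two basket splits, so the sketch below rests on a labeled assumption: a popularity-based split holds out the least popular item of each basket as the prediction target, while a random split holds out an item chosen uniformly at random. All names and the toy baskets are hypothetical.

```python
import random
from collections import Counter

def product_popularity(baskets):
    """Count in how many baskets each product appears."""
    return Counter(p for basket in baskets for p in set(basket))

def popularity_split(basket, popularity):
    """Assumed reading: hold out the least popular item (ties broken by name)."""
    held_out = min(basket, key=lambda p: (popularity[p], p))
    return [p for p in basket if p != held_out], held_out

def random_split(basket, rng):
    """Hold out one item chosen uniformly at random."""
    held_out = rng.choice(basket)
    return [p for p in basket if p != held_out], held_out

BASKETS = [["milk", "bread"], ["milk", "eggs"], ["milk", "bread", "tea"]]
POPULARITY = product_popularity(BASKETS)
EVIDENCE, TARGET = popularity_split(BASKETS[2], POPULARITY)
RANDOM_EVIDENCE, RANDOM_TARGET = random_split(BASKETS[2], random.Random(0))
```

Under this reading, a random split often leaves a popular item such as "milk" as the target, which even a trivial recommender can guess; that would be consistent with the overestimation noted on the slide.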
Experiments on data sparsity
• Performance metric: bHR and wHR (weighted hit rate via leave-one-out)
• Models: CF(bn), CF(bn)+BSRW, CF(cp)+BSRW
• Observations:
  • The two proposed models have similar performance in bHR
  • The performance difference in wHR is more pronounced with increased data sparsity
  • This is attributed to the high-order similarities introduced by the BSRW scheme
Personalized Models
Personalisation can better meet the consumer's requirements; one simple way to achieve it is to re-arrange the ordered recommendation list according to personal preference.
• Observations:
  • Empirical advantage of network-based similarity over other metric-based ones
  • Performance is overestimated by the random basket split
  • An appropriate performance metric needs to be determined by business rules, e.g. bHR(pop) for grocery products and bHR(rnd) for movies
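One simple way to re-arrange a recommendation list by personal preference is sketched below. It assumes preference is measured by the consumer's past purchase counts, which the slide does not specify; the function name and the toy data are hypothetical.

```python
def personalize(ranked_products, purchase_history):
    """Move products the consumer bought more often to the front.

    Python's sort is stable, so products with equal purchase counts
    keep the order given by the underlying recommendation model.
    """
    return sorted(ranked_products, key=lambda p: -purchase_history.get(p, 0))

MODEL_RANKING = ["tea", "milk", "eggs", "bread"]   # hypothetical model output
HISTORY = {"bread": 5, "milk": 2}                   # hypothetical past purchase counts
PERSONALIZED = personalize(MODEL_RANKING, HISTORY)
```

Because the re-ranking only reorders an existing list, it can be layered on top of any of the compared models without retraining them.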
Summary
• Grocery shopping recommendations
  • product preferences are implicit
  • repeated purchases are overwhelmingly more frequent than purchases of new products
• Basket-Sensitive Random Walk Model (BSRW)
  • Derives product transition probability via network-based similarity instead of normalizing ad-hoc metric-based ones
  • On-line adaptation of recommendations based on the current basket
• Poster in the Tuesday night poster session
Acknowledgements
• Dominique Locher and his team at LeShop
  • www.LeShop.ch: the No.1 e-grocer of Switzerland since 1998
  • Good friends, collaborators and data provider
• All the other data providers and anonymous reviewers