Customers Value Optimization in Digital Marketing
Georgios Theocharous (Adobe Research), Mohammad Ghavamzadeh (Adobe Research & INRIA Lille), Shie Mannor (Technion)
Adobe's Marketing Cloud
• Plan and execute orchestrated campaigns across all channels
• Organize, manage, and deliver creative assets and other content across digital marketing channels (web site management)
• Manage, forecast, and optimize your media mix to deliver peak return on your investment
• Manage social content in social networks, listen and respond to customer conversations in real time, and create social campaigns
• Automated decision-making and targeting
• Real-time web, social, and mobile analytics
7 out of 10 dollars transacted on the web pass through Adobe products
Outline • Problem • Need for lifetime optimization • Research challenges • Solutions
Problem: Sequential Decision Making under Uncertainty
The marketing agent asks: what is the optimal marketing strategy?
• STATE OBSERVATIONS: user demographics, recency, frequency, monetary
• ACTION: display offers
• REWARD: clicks
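The state/action/reward interface on this slide can be sketched as a minimal interaction loop. Everything below (the RFM-style feature values, the random agent, the toy click model) is an illustrative assumption, not the production system.

```python
import random

# Hypothetical visitor state built from the features named on the slide.
def make_state():
    return {
        "demographics": random.choice(["segment_a", "segment_b"]),
        "recency": random.randint(0, 30),    # days since last visit
        "frequency": random.randint(1, 10),  # visits so far
        "monetary": random.uniform(0, 500),  # spend so far
    }

OFFERS = ["offer_1", "offer_2", "no_offer"]  # illustrative action set

def random_policy(state):
    """Placeholder marketing agent: picks a display offer at random."""
    return random.choice(OFFERS)

def step(state, action):
    """Toy environment: reward is a click (0/1); real dynamics are unknown."""
    click = 1 if random.random() < 0.1 else 0
    next_state = dict(state, recency=0, frequency=state["frequency"] + 1)
    return next_state, click

state = make_state()
total_clicks = 0
for t in range(100):
    action = random_policy(state)
    state, reward = step(state, action)
    total_clicks += reward
print(total_clicks)
```

The agent here is a stub; the rest of the deck is about replacing it with a learned LTV policy.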
Problem: Algorithms
• Myopic (state of the art): offers shown now are agnostic about the future
• LTV (new solution): offers shown now consider the impact on future offers
Timeline: a visitor X (represented by behavioral and contextual features) sees a display offer at each visit t0 (now), t1, t2, ...
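The myopic/LTV distinction amounts to two different objectives over the same click sequence: immediate reward versus discounted return. A toy comparison, with illustrative rewards and an assumed discount factor:

```python
# Myopic vs. LTV objectives on a toy trajectory of click rewards.
GAMMA = 0.9  # assumed discount factor

def myopic_value(rewards):
    """Myopic objective: only the immediate reward matters."""
    return rewards[0]

def ltv_value(rewards):
    """LTV objective: discounted sum of rewards over the visit sequence."""
    return sum(GAMMA ** t * r for t, r in enumerate(rewards))

# Offer A converts now but drives the visitor away; offer B builds engagement.
offer_a = [1.0, 0.0, 0.0, 0.0]
offer_b = [0.0, 0.8, 0.8, 0.8]

print(myopic_value(offer_a), myopic_value(offer_b))  # A wins myopically
print(ltv_value(offer_a), ltv_value(offer_b))        # B wins on lifetime value
```

Offer B loses the first interaction but accumulates more discounted reward, which is exactly the trade-off an LTV policy is meant to capture.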
Need for LTV
• 3,722,329 visits in 90 days
• 1,867,916 visitors
• 28.53% of visitors are recurring
• 49.81% of all visits are recurring visits
Need for LTV
• 41.70% of all conversions happen in a recurring visit
• 3.96% of buying visitors buy again
Challenge 1: Off-policy Evaluation
Given real trajectories from the policy β in production, estimate the value of a new policy π.
• Ideas:
• Importance sampling, but it has high variance
• Simulator, but it is hard to capture the true dynamics of a noisy and non-stationary world
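The importance-sampling idea can be sketched as the standard trajectory-wise IS estimator: weight each logged return by the product of π/β likelihood ratios. The two policies and the logged data below are illustrative assumptions; only the estimator form is standard.

```python
import random

ACTIONS = ["offer_1", "offer_2"]

def behavior_prob(action):   # policy beta running in production (logged)
    return 0.5

def target_prob(action):     # new policy pi we want to evaluate
    return 0.8 if action == "offer_1" else 0.2

def is_estimate(trajectories, gamma=0.9):
    """Trajectory-wise importance sampling: weight each discounted return
    by the product of pi/beta ratios along the trajectory. Unbiased, but
    the weights multiply, which is the source of the high variance."""
    values = []
    for traj in trajectories:
        weight, ret = 1.0, 0.0
        for t, (action, reward) in enumerate(traj):
            weight *= target_prob(action) / behavior_prob(action)
            ret += gamma ** t * reward
        values.append(weight * ret)
    return sum(values) / len(values)

# Toy logged data: each trajectory is a list of (action, reward) pairs.
random.seed(0)
logged = [[(random.choice(ACTIONS), random.random()) for _ in range(5)]
          for _ in range(1000)]
print(is_estimate(logged))
```

Note how a five-step trajectory can already carry a weight of (0.8/0.5)^5 ≈ 10; longer horizons make the variance problem on the slide concrete.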
Challenge 2: Evaluating a Simulator
Given real trajectories, compute a simulator score.
• Ideas:
• Error in predicting the next state
• Performance statistics: similarity of expected rewards in real and simulated data (unknown or random behavior policy)
• Bound the error
• …
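The first idea, error in predicting the next state, could be scored as mean squared error between real and simulated transitions. This is a minimal sketch of one possible metric, not the talk's exact scoring function:

```python
# Score a simulator by next-state prediction error (MSE over transitions).
def next_state_error(real_transitions, simulate):
    """real_transitions: list of (state, action, next_state), with states as
    numeric feature vectors; simulate(state, action) -> predicted next state."""
    total, n = 0.0, 0
    for state, action, next_state in real_transitions:
        pred = simulate(state, action)
        total += sum((p - r) ** 2 for p, r in zip(pred, next_state))
        n += 1
    return total / n

# Toy check: truth adds 1 to every feature; this simulator does the same.
transitions = [([0.0, 1.0], "offer", [1.0, 2.0]),
               ([2.0, 3.0], "offer", [3.0, 4.0])]
perfect = lambda s, a: [x + 1.0 for x in s]
print(next_state_error(transitions, perfect))  # 0.0
```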
Challenge 3: Robust Optimization
Given trajectories from multiple period segments, learn a robust policy.
• Ideas:
• Pessimistic solutions
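One pessimistic reading of "robust over period segments" is a max-min choice: prefer the policy whose worst-segment value is highest. The values below are illustrative assumptions, and this is only one possible instantiation of the idea:

```python
# estimated_value[policy][segment] = estimated value on that segment's data
estimated_value = {
    "policy_a": {"q1": 1.2, "q2": 0.3, "q3": 1.1},
    "policy_b": {"q1": 0.8, "q2": 0.7, "q3": 0.9},
}

def robust_choice(estimates):
    """Pessimistic (max-min) selection: pick the policy whose worst
    segment value is highest."""
    return max(estimates, key=lambda p: min(estimates[p].values()))

print(robust_choice(estimated_value))  # "policy_b"
```

policy_a has the higher average but collapses on one segment; the pessimistic criterion picks policy_b.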
Challenge 4: Scaling Up
Given real trajectories with 100s of features, learn an RL policy at scale.
• Ideas:
• Hadoop
Challenge 5: Online Versions
Update the RL policy one sample at a time.
• Ideas:
• Turn FQI into online or batch updates
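A per-sample temporal-difference update is one way to make a batch method like FQI online. This tabular Q-learning sketch is an illustrative stand-in, not the talk's algorithm; the states, actions, and learning rate are assumptions:

```python
from collections import defaultdict

GAMMA, ALPHA = 0.9, 0.1
ACTIONS = ["offer_1", "offer_2"]
Q = defaultdict(float)  # Q[(state, action)], initialized to 0

def online_update(state, action, reward, next_state):
    """TD(0) update from a single (s, a, r, s') sample as it arrives."""
    best_next = max(Q[(next_state, a)] for a in ACTIONS)
    Q[(state, action)] += ALPHA * (reward + GAMMA * best_next
                                   - Q[(state, action)])

# One sample arrives; the policy's value estimate updates immediately.
online_update("new_visitor", "offer_1", 1.0, "returning_visitor")
print(Q[("new_visitor", "offer_1")])  # 0.1
```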
Challenge 6: Learning the Right Representation
Compress real trajectories with 1000s of features into trajectories with 10s of features.
• Ideas:
• Dimensionality reduction
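As one concrete form of dimensionality reduction (an assumption; the talk does not specify the method), a random projection maps 1000s of features to 10s:

```python
import random

# Random projection from D_IN features to D_OUT features.
random.seed(0)
D_IN, D_OUT = 1000, 10
projection = [[random.gauss(0, 1 / D_OUT ** 0.5) for _ in range(D_IN)]
              for _ in range(D_OUT)]

def compress(features):
    """Project a high-dimensional visit feature vector to D_OUT dimensions."""
    return [sum(w * x for w, x in zip(row, features)) for row in projection]

visit = [random.random() for _ in range(D_IN)]
print(len(compress(visit)))  # 10
```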
Challenge 7: Policy Visualizations
Turn trajectories into visualizations.
• Ideas:
• Graph analysis
Challenge 8: Progressively More Engaging Interactions
Learn hierarchical MDPs from trajectories.
• Ideas:
• Activity learning
Research challenges: Hierarchical Representations
Example site hierarchy (BMW): high-level activities such as FIND A CAR, LEARN ABOUT BMW, TEST DRIVE, and PAYMENT OPTIONS, composed of pages like Home, Innovations, Search, Explore BMW Models, Certified Pre-own, My BMW, Owners, Build Your Own, Accessories, Test Drive, Dealer Locator, Financial Services, and Sales & Programs.
Solutions: Data Sets Explored

                         Data set 1    Data set 2    Data set 3
zeros                    10256 (82%)   18472 (87%)   1568 (65%)
ones                     1575 (13%)    2218 (10%)    446 (18%)
more than one success    636 (5%)      539 (3%)      417 (17%)
at least one success     2211 (18%)    2757 (13%)    863 (35%)
number of episodes       12467         21229         2431
number of interactions   602755        635079        182257
Solutions: Experimental Setup
Time-series data (S, A, R, S') is split into training and testing sets.
• Policy (training): various LTV and myopic strategies
• Simulator (testing): learn to predict each feature for each action, using system identification techniques
Solutions: RL Algorithms Used
• We use various state-of-the-art batch RL algorithms to compute optimal (LTV) and myopic solutions:
• Kernel-based RL on representative states
• K-means RL
• Fitted Q iteration (FQI)
• FQI-sarsa
• Random
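The core of FQI is iterating a regression on bootstrapped targets r + γ max Q(s', ·) over a fixed batch. This sketch uses a tabular "regressor" (per-pair averaging) for clarity, where real FQI fits a function approximator at each iteration; the toy batch is an illustrative assumption:

```python
from collections import defaultdict

GAMMA = 0.9
ACTIONS = ["offer", "no_offer"]

def fqi(batch, n_iters=50):
    """Minimal Fitted Q Iteration over a batch of (s, a, r, s') samples."""
    Q = defaultdict(float)
    for _ in range(n_iters):
        # Build regression targets r + gamma * max_a' Q(s', a').
        targets = defaultdict(list)
        for s, a, r, s2 in batch:
            targets[(s, a)].append(
                r + GAMMA * max(Q[(s2, a2)] for a2 in ACTIONS))
        # "Fit": here, just average the targets per (state, action) pair.
        Q = defaultdict(float, {k: sum(v) / len(v) for k, v in targets.items()})
    return Q

# Toy batch: offering converts the visitor; not offering keeps them idle.
batch = [
    ("visitor", "offer", 1.0, "converted"),
    ("visitor", "no_offer", 0.0, "visitor"),
    ("converted", "no_offer", 0.0, "converted"),
]
Q = fqi(batch)
best = max(ACTIONS, key=lambda a: Q[("visitor", a)])
print(best)  # "offer"
```

FQI-sarsa differs in bootstrapping from the logged next action rather than the max, which keeps the evaluation closer to the behavior policy.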
Solutions: Simulator
Feature dynamics are modeled per type:
• Constant (e.g., a global interest)
• Remains the same (e.g., demographics)
• Constant increment (e.g., cumulative action counter)
• Same as another feature (e.g., interest)
• Increment with a multiple of a future random variable (e.g., cumulative success = cumulative success + reward)
• Counts binary events until reset (e.g., success recency)
• Random variable (visit-time recency: sample empirically)
• Everything else (predict with regression or classification)
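Several of the per-feature rules above can be sketched as a single transition function. The feature names and the empirical recency distribution here are illustrative assumptions, not the paper's exact model:

```python
import random

def simulate_step(state, action, reward):
    """Apply the per-type feature dynamics listed on the slide."""
    next_state = dict(state)
    # Remains the same (e.g., demographics): copied unchanged by dict(state).
    # Constant increment (cumulative action counter):
    next_state["action_count"] = state["action_count"] + 1
    # Increment with a multiple of a random variable (cumulative success):
    next_state["cum_success"] = state["cum_success"] + reward
    # Counts binary events until reset (success recency):
    next_state["success_recency"] = (0 if reward > 0
                                     else state["success_recency"] + 1)
    # Random variable (visit-time recency: sample empirically):
    next_state["visit_recency"] = random.choice([1, 2, 7, 30])
    return next_state

state = {"action_count": 3, "cum_success": 1.0,
         "success_recency": 5, "visit_recency": 2}
print(simulate_step(state, "offer", 1.0))
```

The remaining features ("everything else") would each get a fitted regression or classification model instead of a hand-written rule.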
Results
• 20% improvement
• With LTV, users convert faster
• With LTV, users engage longer
Schedule
08:45 - 09:00 Introduction
09:00 - 10:00 Craig Boutilier - University of Toronto (Invited Talk)
10:00 - 10:20 Andres Munoz Medina - New York University
10:20 - 10:40 Coffee Break (Poster Session)
10:40 - 11:40 John Langford - Microsoft Research (Invited Talk)
11:40 - 12:00 Bruno Scherrer - INRIA Nancy
Lunch
14:00 - 15:00 Shie Mannor - Technion (Invited Talk)
15:00 - 15:20 Mohammad Ghavamzadeh - Adobe Research & INRIA Lille
15:20 - 15:40 Coffee Break (Poster Session)
15:40 - 16:40 Esteban Arcaute - Walmart Labs (Invited Talk)
16:40 - 17:00 Philip Thomas - University of Massachusetts Amherst