Mining the Network Value of Customers. Zhenwei He & Cen Zhe Qiao School of Informatics University of Edinburgh. Outline. Introduction Modeling Markets as Markov random field Mining from Collaborative Filtering System(CFS) Example: - the EachMovie collaborative filtering database
Mining the Network Value of Customers Zhenwei He & Cen Zhe Qiao School of Informatics University of Edinburgh
Outline • Introduction • Modeling Markets as Markov random field • Mining from Collaborative Filtering System(CFS) • Example: - the EachMovie collaborative filtering database • Future work • Conclusion
Introduction • Mass Marketing • Direct Marketing: assumes customers decide independently of each other • Viral Marketing: customers' decisions are strongly dependent on each other • Data mining: plays a key role • General framework • Optimize the choice of which customers to market to • Estimate what customer acquisition cost is justified for each
How to do that? • Modeling markets as Social Network • Mining the network from Collaborative Filtering Databases
Modeling Markets as Social Network • Some mathematical notation: $n$ - the number of customers; $X_i$ - equals 1 if customer $i$ buys the product and 0 otherwise (also used to denote the $i$th customer); $N_i$ - the set of neighbors of $X_i$; $X^k$ ($X^u$) - the customers whose value is known (unknown); $N_i^u$ - the unknown neighbors of $X_i$; $Y = \{Y_1, \dots, Y_m\}$ - the set of attributes of the product; $M_i$ - the marketing action that is taken for customer $i$; $c$ - the cost of marketing to a customer
Modeling Markets as Social Network • $r_0$ - the revenue from selling the product to a customer if no marketing action is performed • $r_1$ - the revenue from selling the product to a customer if a marketing action is performed • $f_i^1(M)$ - the result of setting $M_i$ to 1 and leaving the rest of $M$ unchanged • $f_i^0(M)$ - similarly, the result of setting $M_i$ to 0
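Modeling Markets as Social Network • The customer's network value = {the customer's TOTAL value} - {the customer's INTRINSIC value} • The total value of customer $i$ is measured by the global lift in profit from marketing to that customer, which is $ELP(X^k, Y, f_i^1(M)) - ELP(X^k, Y, f_i^0(M))$ • The intrinsic value of customer $i$ is $ELP_i^1(X^k, Y, M)$, the expected lift in profit from marketing to $i$ in isolation, ignoring the effect on other customers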
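Modeling Markets as Social Network • The global lift in profit: $ELP(X^k, Y, M) = \sum_{i=1}^{n} r_i P(X_i = 1 \mid X^k, Y, M) - r_0 \sum_{i=1}^{n} P(X_i = 1 \mid X^k, Y, M_0) - |M| c$, where $r_i = r_1$ if $M_i = 1$, $r_i = r_0$ otherwise, $|M|$ is the number of 1's in $M$, and $M_0$ is the all-zeros assignment (no marketing) • The expected lift in profit from marketing to customer $i$ in isolation: $ELP_i^1(X^k, Y, M) = r_1 P(X_i = 1 \mid X^k, Y, f_i^1(M)) - r_0 P(X_i = 1 \mid X^k, Y, f_i^0(M)) - c$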
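The lift-in-profit quantities can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it assumes the global lift has the form sum_i r_i P(X_i=1|M) - r0 sum_i P(X_i=1|M0) - |M| c, and `toy_prob_buy` is a made-up stand-in for the real probability model P(X_i = 1 | X^k, Y, M) on a hypothetical three-customer chain.

```python
def flip(M, i, value):
    """Return a copy of M with M[i] set to value (f_i^1 or f_i^0)."""
    out = list(M)
    out[i] = value
    return out


def global_lift(prob_buy, M, r0, r1, c):
    """Global lift in profit of marketing plan M over no marketing at all."""
    p_marketed = prob_buy(M)
    p_baseline = prob_buy([0] * len(M))  # M0: nobody is marketed to
    revenue = sum((r1 if m else r0) * p for m, p in zip(M, p_marketed))
    return revenue - r0 * sum(p_baseline) - sum(M) * c


def intrinsic_value(prob_buy, M, i, r0, r1, c):
    """Lift from marketing to customer i, ignoring the effect on others."""
    return (r1 * prob_buy(flip(M, i, 1))[i]
            - r0 * prob_buy(flip(M, i, 0))[i] - c)


def network_value(prob_buy, M, i, r0, r1, c):
    """Total value of customer i minus that customer's intrinsic value."""
    total = (global_lift(prob_buy, flip(M, i, 1), r0, r1, c)
             - global_lift(prob_buy, flip(M, i, 0), r0, r1, c))
    return total - intrinsic_value(prob_buy, M, i, r0, r1, c)


# Hypothetical toy model: a chain 0 - 1 - 2 where marketing to a customer
# adds 0.3 to that customer's purchase probability and 0.1 to each neighbor's.
NEIGHBORS = {0: [1], 1: [0, 2], 2: [1]}


def toy_prob_buy(M):
    return [0.1 + 0.3 * M[i] + 0.1 * sum(M[j] for j in NEIGHBORS[i])
            for i in range(len(M))]


lift = global_lift(toy_prob_buy, [0, 1, 0], r0=10, r1=10, c=1)
net = network_value(toy_prob_buy, [0, 0, 0], 1, r0=10, r1=10, c=1)
```

In this toy setting the middle customer's network value comes entirely from the two neighbors whose purchase probability rises when that customer is marketed to.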
Modeling Markets as Social Network • Our goal: - to find the assignment of values to M that maximizes ELP • Problem: - an exact solution requires trying all $2^n$ possible assignments! • Solution: - approximate procedures • Single-pass method • Greedy search • Hill-climbing search
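The three approximate procedures can be sketched as follows. This is a hedged illustration under stated assumptions: each procedure starts from the all-zeros plan and only ever switches customers on, and `toy_elp` is a hypothetical objective (2 units of profit per customer "reached", 3 units of cost per customer marketed to, on a made-up five-customer chain) rather than the probabilistic ELP of the slides.

```python
def with_one(M, i):
    """Copy of M with customer i switched on."""
    return M[:i] + [1] + M[i + 1:]


def single_pass(elp, n):
    """Market to i if doing so alone, starting from no marketing, pays off."""
    M0 = [0] * n
    return [1 if elp(with_one(M0, i)) > 0 else 0 for i in range(n)]


def greedy_search(elp, n):
    """Scan customers in order, keeping each switch that improves ELP;
    repeat until a full scan changes nothing."""
    M = [0] * n
    changed = True
    while changed:
        changed = False
        for i in range(n):
            if M[i] == 0 and elp(with_one(M, i)) > elp(M):
                M = with_one(M, i)
                changed = True
    return M


def hill_climbing(elp, n):
    """At each step switch on the single customer with the largest gain."""
    M = [0] * n
    while True:
        best_i, best_val = None, elp(M)
        for i in range(n):
            if M[i] == 0:
                val = elp(with_one(M, i))
                if val > best_val:
                    best_i, best_val = i, val
        if best_i is None:
            return M
        M = with_one(M, best_i)


# Hypothetical objective on a chain 0 - 1 - 2 - 3 - 4.
NEIGHBORS = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}


def toy_elp(M):
    reached = set()
    for i, m in enumerate(M):
        if m:
            reached.add(i)
            reached.update(NEIGHBORS[i])
    return 2 * len(reached) - 3 * sum(M)


plans = {f.__name__: f(toy_elp, 5)
         for f in (single_pass, greedy_search, hill_climbing)}
```

On this toy objective the single-pass plan markets to everyone and loses money, because it ignores overlap between customers' neighborhoods, while the two search procedures find smaller plans with positive lift.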
Modeling Markets as Social Network • There may be another problem. • How do we compute $P(X_i \mid X^k, Y, M)$? • L. Pelkowitz (1990), A continuous relaxation labeling algorithm for Markov random fields • $P(N_i^u \mid X^k, Y, M)$ can be approximated by its maximum entropy estimate given the marginals $P(X_j \mid X^k, Y, M)$
Modeling Markets as Social Network • The resulting equation expresses the probabilities $P(X_i \mid X^k, Y, M)$ as a function of themselves • It can be applied iteratively to find them • Relaxation labeling: - guaranteed to converge to locally consistent values as long as the initial assignment is sufficiently close to them. • Initialization: the network-less probabilities $P(X_i \mid Y, M)$ • Problem: the number of terms is exponential in the number of unknown neighbors of $X_i$ • Solution: Gibbs sampling / k-shortest-path algorithm
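The iterative scheme can be sketched as a fixed-point loop. This is a minimal illustration under assumptions not in the slides: every customer is treated as unknown, all marginals are updated synchronously, neighbor configurations are enumerated exhaustively (so the cost is exponential in the neighborhood size, which is the problem named above), and `local_prob` is a hypothetical stand-in for P(X_i = 1 | N_i, Y, M).

```python
from itertools import product


def relaxation_labeling(neighbors, local_prob, init, tol=1e-9, max_iter=1000):
    """Iterate p_i = sum over neighbor configurations of
    local_prob(i, config) * product of the neighbors' current marginals."""
    p = list(init)
    for _ in range(max_iter):
        new_p = []
        for i, nbrs in enumerate(neighbors):
            total = 0.0
            # 2 ** len(nbrs) configurations of the neighbors of i
            for config in product((0, 1), repeat=len(nbrs)):
                weight = 1.0
                for j, x in zip(nbrs, config):
                    weight *= p[j] if x else 1.0 - p[j]
                total += local_prob(i, config) * weight
            new_p.append(total)
        if max(abs(a - b) for a, b in zip(p, new_p)) < tol:
            return new_p
        p = new_p
    return p


# Hypothetical two-customer network in which each is the other's neighbor.
neighbors = [[1], [0]]


def local_prob(i, config):
    # Made-up local model: 0.2 base probability plus 0.5 times the
    # fraction of neighbors who buy.
    return 0.2 + 0.5 * sum(config) / len(config)


# Start from made-up "network-less" probabilities.
marginals = relaxation_labeling(neighbors, local_prob, init=[0.2, 0.2])
```

For this toy model the update reduces to p = 0.2 + 0.5 p for both customers, so the iteration settles at p = 0.4.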
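Modeling Markets as Social Network • Recall: $P(X_i \mid X^k, Y, M) = \sum_{C(N_i^u)} P(X_i \mid N_i, Y, M) \prod_{X_j \in N_i^u} P(X_j \mid X^k, Y, M)$, where $C(N_i^u)$ is the set of all possible configurations of the unknown neighbors of $X_i$ • $P(X_i \mid N_i, Y, M)$: still don't know! • From naïve Bayes: $P(X_i \mid N_i, Y, M) = \frac{P(X_i \mid N_i)\, P(M_i \mid X_i)}{P(Y, M_i \mid N_i)} \prod_{k=1}^{m} P(Y_k \mid X_i)$, where $P(Y, M_i \mid N_i) = P(Y, M_i \mid X_i = 1) P(X_i = 1 \mid N_i) + P(Y, M_i \mid X_i = 0) P(X_i = 0 \mid N_i)$ • Now $P(X_i \mid X^k, Y, M)$ can be computed by: $P(X_i \mid X^k, Y, M) = \sum_{C(N_i^u)} \frac{P(X_i \mid N_i)\, P(M_i \mid X_i)}{P(Y, M_i \mid N_i)} \prod_{k=1}^{m} P(Y_k \mid X_i) \prod_{X_j \in N_i^u} P(X_j \mid X^k, Y, M)$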
Mining the network from Collaborative Filtering Databases • $P(X_i \mid N_i)$: its form varies from application to application • Collaborative Filtering System: • Users rate a set of items (e.g. on amazon.com) • These ratings are then used to recommend other items the user might be interested in • But…how? • The basic idea (given by GroupLens): • Predict a user's rating of an item as a weighted average of the ratings given by similar users • Then recommend items with high predicted ratings
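Mining the network from Collaborative Filtering Databases • The Pearson correlation coefficient: $W_{ij} = \frac{\sum_k (R_{ik} - \bar{R}_i)(R_{jk} - \bar{R}_j)}{\sqrt{\sum_k (R_{ik} - \bar{R}_i)^2 \sum_k (R_{jk} - \bar{R}_j)^2}}$, where $R_{ik}$ is user $i$'s rating of item $k$, $\bar{R}_i$ is the mean of user $i$'s ratings, likewise for $j$; and the summations and means are computed over the items $k$ that both $i$ and $j$ have rated. • Given an item $k$ that user $i$ has not rated, the rating of $k$ for the user is then predicted as: $\hat{R}_{ik} = \bar{R}_i + \rho \sum_{X_j \in N_i} W_{ji} (R_{jk} - \bar{R}_j)$, where $\rho = 1 / \sum_{X_j \in N_i} |W_{ji}|$ is a normalization factor, and $N_i$ is the set of users most similar to $i$ according to the PCC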
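A small sketch of the GroupLens-style prediction follows. It is an illustration under stated assumptions rather than the presenters' code: the correlation uses means over co-rated items as described above, the prediction step uses each user's mean over all of their own ratings, "most similar" is taken to mean highest correlation, and the `ratings` dictionary and user names are made up.

```python
from math import sqrt


def pearson(ratings_i, ratings_j):
    """Pearson correlation over the items both users have rated."""
    common = ratings_i.keys() & ratings_j.keys()
    if len(common) < 2:
        return 0.0
    mean_i = sum(ratings_i[k] for k in common) / len(common)
    mean_j = sum(ratings_j[k] for k in common) / len(common)
    num = sum((ratings_i[k] - mean_i) * (ratings_j[k] - mean_j) for k in common)
    den = sqrt(sum((ratings_i[k] - mean_i) ** 2 for k in common)
               * sum((ratings_j[k] - mean_j) ** 2 for k in common))
    return num / den if den else 0.0


def mean_rating(user_ratings):
    return sum(user_ratings.values()) / len(user_ratings)


def predict(user, item, ratings, n_neighbors=2):
    """Predicted rating = user's mean + normalized, correlation-weighted
    deviations of the most similar users who rated the item."""
    candidates = [(pearson(ratings[user], r), other)
                  for other, r in ratings.items()
                  if other != user and item in r]
    top = sorted(candidates, reverse=True)[:n_neighbors]
    norm = sum(abs(w) for w, _ in top)
    base = mean_rating(ratings[user])
    if norm == 0:
        return base  # no informative neighbors: fall back to the user's mean
    return base + sum(w * (ratings[o][item] - mean_rating(ratings[o]))
                      for w, o in top) / norm


# Made-up ratings on a 1-5 scale.
ratings = {
    "ann": {"a": 5, "b": 3, "c": 4},
    "bob": {"a": 4, "b": 2, "c": 3, "d": 5},
    "cat": {"a": 1, "b": 5, "c": 2, "d": 1},
}
w_ann_bob = pearson(ratings["ann"], ratings["bob"])
ann_d = predict("ann", "d", ratings, n_neighbors=1)
```

Predictions are not clipped to the rating scale here; a real system would likely clamp or rescale them.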
Mining the network from Collaborative Filtering Databases • Thus we can compute $P(X_i \mid N_i)$: • Piecewise-linear model • Obtained by dividing $\hat{R}_{ik}$'s range into bins • Compute the mean $\hat{R}_{ik}$ and $P(X_i)$ for each bin • Estimate $P(X_i \mid \hat{R}_{ik})$ by interpolating linearly between the two nearest means • Finally for the model:
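The binning-and-interpolation step can be sketched as below. This is a hedged example: equal-width bins, the flat extrapolation outside the outermost bin means, and the sample data are all assumptions made for illustration, not details given in the slides.

```python
from bisect import bisect_left


def fit_bins(predicted, bought, n_bins):
    """Split the predicted-rating range into equal-width bins and return
    (mean predicted rating, purchase frequency) for each non-empty bin."""
    lo, hi = min(predicted), max(predicted)
    width = (hi - lo) / n_bins or 1.0  # guard against a zero-width range
    stats = [[0.0, 0.0, 0] for _ in range(n_bins)]
    for r, x in zip(predicted, bought):
        b = min(int((r - lo) / width), n_bins - 1)
        stats[b][0] += r
        stats[b][1] += x
        stats[b][2] += 1
    return [(s[0] / s[2], s[1] / s[2]) for s in stats if s[2]]


def interpolate(points, r):
    """Piecewise-linear estimate of the purchase probability at rating r."""
    xs = [x for x, _ in points]
    if r <= xs[0]:
        return points[0][1]
    if r >= xs[-1]:
        return points[-1][1]
    hi = bisect_left(xs, r)
    (x0, y0), (x1, y1) = points[hi - 1], points[hi]
    return y0 + (y1 - y0) * (r - x0) / (x1 - x0)


# Made-up data: predicted ratings and whether the user went on to buy.
predicted = [1.0, 2.0, 4.0, 5.0]
bought = [0, 0, 1, 1]
points = fit_bins(predicted, bought, n_bins=2)
p_mid = interpolate(points, 3.0)
```

With two bins the fitted points are the bin means (1.5 and 4.5) paired with purchase frequencies 0 and 1, so a predicted rating of 3.0 falls halfway between them.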
Example: the ‘EachMovie’ collaborative filtering database • ‘EachMovie’---word of mouth ---Rating ---Movie Information • The Data • Model Accuracy • Network Value • Marketing Experiments
The Model • Y={Y1,Y2,…,Y10} p(Y|Xi) • Pearson correlation coefficient for Wij (with penalized value 0.05)
The Data • Training set: all movies before Sep 1 1996 ---$S_{old}$: before Jan 1996 ---$S_{recent}$: Jan-Sep 1996 • Test set: movies Sep-Dec 1996 • Inactive people
Model Accuracy • Set M=M0 • Estimate p(Xi|Xk, Y, M) • No rating from inactive people---p(Xi|Y)=0 • Correlation between p(Xi|Xk, Y, M) and the actual Xi • Not really satisfactory as the genre is the only input
A good customer to market to • Likely to give a high rating • Strong weight to influence neighbors • Has many neighbors who are easily influenced • High probability of purchasing
Marketing Experiments • Traditional direct marketing • Network-based marketing ---single pass ---greedy search ---hill climbing • Scenarios: Free Movie, Discounted Movie, Advertising
Profits and runtimes obtained using different marketing strategies
Related Work • Regarding the Network ---Email logs (Schwartz and Wood) ---ReferralWeb ---MRF classification of Web pages (Chak) • Regarding the Marketing ---impact on the customers' closest friends (Krackhardt)
Future Work • Expect larger network to be mined • Mining a network from multiple sources of relevant information • Mining the unknown networks • Towards more detailed node models and multiple types of relations between nodes
Conclusion • Data mining in viral marketing • Customers as nodes that influence each other • Social network mined from a collaborative filtering database • Optimize marketing decisions