CoBaFi: Collaborative Bayesian Filtering. Alex Beutel. Joint work with Kenton Murray, Christos Faloutsos, Alex Smola. April 9, 2014 – Seoul, South Korea
Online Recommendation [Figure: Users × Movies rating matrix with sample ratings 5, 5, 2, 5, 3, 5]
Online Rating Models [Figure panels: Reality; Normal Collaborative Filtering] • Normal Collaborative Filtering: Fit a Gaussian - Minimize the error • Minimizing error isn’t good enough - Understanding the shape matters!
Online Rating Models [Figure panels: Normal Collaborative Filtering; Our Model] • Normal Collaborative Filtering: Fit a Gaussian - Minimize the error
Our Goals and Challenges • Given: A matrix of user ratings • Find: A model that best fits and predicts user preferences • Goals: • G1. Fit the recommender distribution • G2. Understand users who rate few items • G3. Detect abnormal spam behavior
Outline 1. Background 2. Model Formulation 3. Inference 4. Catching Spam 5. Experiments
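Collaborative Filtering [Background] The Users × Movies rating matrix X is approximated by the product of user factors U (Users × Genres) and movie factors V (Movies × Genres): X ≈ U·Vᵀ [Figure: a single rating of 5 reconstructed as the product of one user's row of U and one movie's row of V]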
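To make the factorization on this slide concrete, here is a minimal Python sketch. It assumes the usual low-rank form in which a predicted rating is the inner product of a user's genre vector and a movie's genre vector; the matrix sizes, the random factors, and the helper name `predict` are illustrative assumptions, not values from the talk.

```python
import numpy as np

rng = np.random.default_rng(0)
n_users, n_movies, n_genres = 4, 5, 2  # illustrative sizes (assumption)

# Hypothetical latent factors: one row per user or movie, one column per genre.
U = rng.normal(size=(n_users, n_genres))
V = rng.normal(size=(n_movies, n_genres))

# Full matrix of predicted ratings: X_hat[i, j] = u_i . v_j
X_hat = U @ V.T


def predict(i, j):
    """Predicted rating of movie j by user i (inner product of their factors)."""
    return float(U[i] @ V[j])
```

In a fitted model, U and V would be learned from the observed entries of X rather than drawn at random.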
Bayesian Probabilistic Matrix Factorization (Salakhutdinov & Mnih, ICML 2008) [Background] μU ~ …
Outline 1. Background 2. Our Model 3. Inference 4. Catching Spam 5. Experiments
Our Model • Cluster users (& items) • Share preferences within clusters • Use user preferences to predict ratings
The Recommender Distribution • First introduced by Tan et al., 2013 [Formula annotations: Linear, Normalization, Quadratic, Normalization] [Figure: θ1 = 0, vary θ2; panels for θ2 = 0.4 and θ2 = -1.0]
The Recommender Distribution • User vector ui encodes: Genre Preferences, General Leaning, How Polarized • Goal 1: Fit the recommender distribution
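The exact formula for the recommender distribution did not survive in this transcript. As a rough illustration of how a linear and a quadratic term can shape a rating distribution, the sketch below assumes a discrete exponential-family form over ratings 1 to 5, p(r) ∝ exp(θ1·c + θ2·c²), with c the rating centered and scaled to [-1, 1]. That form, the centering, and the function name `recommender_pmf` are assumptions for illustration and may differ from the definition in Tan et al., 2013.

```python
import numpy as np

RATINGS = np.arange(1, 6)  # star ratings 1..5


def recommender_pmf(theta1, theta2, ratings=RATINGS):
    """Assumed form: p(r) proportional to exp(theta1*c + theta2*c**2)."""
    c = (ratings - 3.0) / 2.0            # center and scale to [-1, 1] (assumption)
    logits = theta1 * c + theta2 * c**2  # linear term + quadratic term
    logits = logits - logits.max()       # stabilize before exponentiating
    weights = np.exp(logits)
    return weights / weights.sum()       # normalization


polarized = recommender_pmf(0.0, 0.4)   # theta2 > 0 pushes mass toward 1 and 5
peaked = recommender_pmf(0.0, -1.0)     # theta2 < 0 concentrates mass near 3
```

Under this assumed form, θ1 shifts the distribution toward high or low ratings (a general leaning) and the sign of θ2 controls whether it is polarized or bell-shaped.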
Understanding varying preferences [Figure: sample user ratings 5, 2, 5, 3, 1, 1, 5]
Finding User Preferences μU μU’ • Goal 2: Understand users who rate few items
Chinese Restaurant Process [Figure: clusters with means μ1, μ2, μ3]
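For readers unfamiliar with the Chinese Restaurant Process, the following Python sketch draws cluster assignments from the standard CRP prior: each new user joins an existing cluster with probability proportional to its size, or opens a new cluster with probability proportional to a concentration parameter. The name `sample_crp` and the parameter `alpha` are illustrative choices, not taken from the talk.

```python
import numpy as np


def sample_crp(n_customers, alpha, rng):
    """Draw cluster assignments for n_customers from a CRP with concentration alpha."""
    assignments = []  # cluster index chosen by each customer
    counts = []       # current size of each cluster
    for _ in range(n_customers):
        # Existing clusters are weighted by size; a new cluster is weighted by alpha.
        weights = np.array(counts + [alpha], dtype=float)
        probs = weights / weights.sum()
        table = int(rng.choice(len(weights), p=probs))
        if table == len(counts):
            counts.append(1)      # open a new cluster
        else:
            counts[table] += 1    # join an existing cluster
        assignments.append(table)
    return assignments, counts


assignments, counts = sample_crp(50, alpha=1.0, rng=np.random.default_rng(0))
```

Larger clusters attract new members more often, which is what lets users with few ratings borrow preferences from a well-populated cluster.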
Outline 1. Background 2. Our Model 3. Inference 4. Catching Spam 5. Experiments
Gibbs Sampling - Clusters [Details] Probability of picking a cluster = Probability of a cluster based on size (CRP) × Probability ui would come from the cluster
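The product on this slide can be sketched in Python as follows. This is a simplified illustration, not the talk's implementation: it assumes each cluster is an isotropic Gaussian with a known mean and a shared variance, and it scores the option of opening a new cluster with a broad zero-mean Gaussian. The names `cluster_posterior`, `var`, and `base_var` are hypothetical.

```python
import numpy as np


def log_gauss_iso(x, mean, var):
    """Log density of an isotropic Gaussian N(mean, var * I) at x."""
    d = x.shape[0]
    return -0.5 * (d * np.log(2 * np.pi * var) + np.sum((x - mean) ** 2) / var)


def cluster_posterior(u_i, means, sizes, alpha, var=1.0, base_var=10.0):
    """P(cluster | u_i), proportional to CRP weight times Gaussian likelihood.

    sizes should exclude user i. The last entry is the 'new cluster' option.
    """
    log_p = [np.log(n) + log_gauss_iso(u_i, m, var) for m, n in zip(means, sizes)]
    # New cluster: CRP weight alpha, likelihood under a broad base Gaussian (assumption).
    log_p.append(np.log(alpha) + log_gauss_iso(u_i, np.zeros_like(u_i), base_var))
    log_p = np.array(log_p)
    p = np.exp(log_p - log_p.max())  # subtract the max for numerical stability
    return p / p.sum()


means = [np.array([0.0, 0.0]), np.array([5.0, 5.0])]
probs = cluster_posterior(np.array([4.8, 5.1]), means, sizes=[10, 10], alpha=1.0)
new_cluster = int(np.random.default_rng(0).choice(len(probs), p=probs))
```

One Gibbs step for user i removes the user from its current cluster, computes these probabilities, and samples a new assignment from them.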
Sampling user parameters [Details] Probability of user preferences ui = Probability of preferences ui given cluster parameters × Probability of predicting ratings ri,j using new preferences • Recommender distribution is non-conjugate • Can’t sample directly!
Outline 1. Background 2. Our Model 3. Inference 4. Catching Spam 5. Experiments
Review Spam and Fraud [Figure: rating matrix dominated by 1s and 5s] Image from http://sinovera.deviantart.com/art/Cute-Devil-117932337
Clustering Fraudsters [Figure: clusters μ1, μ2, μ3, labeled New Spam Cluster and Previous “Real” Cluster]
Clustering Fraudsters [Figure: clusters μ1, μ2, μ3] • Too much spam: accounts get separated into a “fraud” cluster • Trying to “hide” just means (a) very little spam or (b) camouflage that reinforces realistic reviews
Clustering Fraudsters [Figure: clusters μ1 to μ5, labeled Naïve Spammers, Spam + Noise, Hijacked Accounts] • Goal 3: Detect abnormal spam behavior
Outline 1. Background 2. Our Model 3. Inference 4. Catching Spam 5. Experiments
Does it work? Better Fit
Catching Naïve Spammers Injection 83% are clustered together
Clustered Hijacked Accounts Clustered hijacked accounts Clustered “attacked” movies Injection
Shape of Netflix reviews More Skewed More Gaussian
Shape of Amazon Clothing reviews Nearly all are heavily polarized!
Shape of Amazon Electronics reviews Nearly all are heavily polarized!
Shape of BeerAdvocate reviews Nearly all are Gaussian!
Hypotheses on shape of data vs. • Hard to evaluate beyond binary • Selection bias – Only committed viewers watch Season 4 of a TV series • Hard to compare value across very different items. • Lots of beers and movies to compare • Fewer TV shows • Even fewer jeans or hard drives
Key Points • Modeling: Fit real data with flexible recommender distribution • Prediction: Predict user preferences • Anomaly Detection: When does a user not match the normal model?
Questions? Alex Beutel abeutel@cs.cmu.edu http://alexbeutel.com
Sampling Cluster Parameters • Hyperparameters: μα, λα, Wα, ν • Priors on μα, λα, Wα [Figure: cluster mean μa with member users u5, u6]
Gibbs Sampling - Clusters [Details] • Probability ui would be sampled from cluster a • Probability of a cluster (CRP)
Sampling user parameters [Details] • Probability of ui given cluster parameters • Probability of predicting ratings ri,j • Recommender distribution is non-conjugate • Can’t sample directly! • Use a Laplace approximation and perform Metropolis-Hastings Sampling
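Sampling user parameters [Details] • Use a candidate normal distribution centered at the mode of p(ui), with spread given by the “variance” of p(ui) • Metropolis-Hastings Sampling: sample a candidate from that normal, then keep the new value with the Metropolis-Hastings acceptance probability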
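The sampling scheme on this slide can be sketched in one dimension as follows. This is an illustrative Python sketch under stated assumptions, not the talk's code: `log_target` is a made-up non-conjugate log density standing in for p(ui), the mode and curvature are found with Newton's method on finite differences, and the proposal is an independence proposal drawn from the fitted normal.

```python
import numpy as np


def log_target(x):
    """Hypothetical unnormalized, non-conjugate log density (stand-in for p(u_i))."""
    return -0.25 * x**4 - 0.5 * (x - 1.0) ** 2


def laplace_fit(log_p, x0=0.0, h=1e-4, steps=50):
    """Find the mode with Newton's method; variance is -1 / (second derivative at mode)."""
    x = x0
    for _ in range(steps):
        grad = (log_p(x + h) - log_p(x - h)) / (2 * h)
        hess = (log_p(x + h) - 2 * log_p(x) + log_p(x - h)) / h**2
        x -= grad / hess
    hess = (log_p(x + h) - 2 * log_p(x) + log_p(x - h)) / h**2
    return x, -1.0 / hess


def log_q(x, mode, var):
    """Log density of the normal proposal, up to a constant that cancels in the ratio."""
    return -0.5 * (x - mode) ** 2 / var


def mh_sample(log_p, n_samples, rng):
    """Independence Metropolis-Hastings with a Laplace-approximation proposal."""
    mode, var = laplace_fit(log_p)
    x, samples, accepted = mode, [], 0
    for _ in range(n_samples):
        cand = rng.normal(mode, np.sqrt(var))
        # Acceptance ratio: p(cand) q(x) / (p(x) q(cand)), computed in log space.
        log_ratio = (log_p(cand) - log_p(x)) + (log_q(x, mode, var) - log_q(cand, mode, var))
        if rng.uniform() < np.exp(min(0.0, log_ratio)):
            x, accepted = cand, accepted + 1
        samples.append(x)
    return np.array(samples), accepted / n_samples


samples, accept_rate = mh_sample(log_target, 2000, np.random.default_rng(0))
```

When the normal approximation matches the target closely, most candidates are accepted, so the acceptance rate is a quick check on how well the Laplace approximation fits.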
Sampling Cluster Parameters [Details] [Formula annotations: Users/Items in the cluster; Priors]
Inferring Hyperparameters [Details] • Solved directly, no sampling needed! • Prior hidden as additional cluster
Does Metropolis-Hastings work? • Have to use non-standard sampling procedure: • 99.12% acceptance rate for Amazon Electronics • 77.77% acceptance rate for Netflix 24k
Does it work? Compare on Predictive Probability (PP) to see how well our model fits the data
Handling Spammers Random naïve spammers in Amazon Electronics dataset Random hijacked accounts in Netflix 24k dataset
Clustered Naïve Spammers 83% are clustered together
Clustered Hijacked Accounts Clustered hijacked accounts Clustered “attacked” movies