Collaborative Filtering Rong Jin Dept. of Computer Science and Engineering Michigan State University
Information Filtering • Basic filtering question: will user U like item X? • Two different ways of answering it • Look at what U likes, then characterize X: content-based filtering • Look at who likes X, then characterize U: collaborative filtering
Collaborative Filtering (Resnick et al., 1994) Make recommendation decisions for a specific user based on the judgments of users with similar interests.
A General Strategy (Resnick et al., 1994) • Identify the training users that share interests similar to those of the test user • Predict the ratings of the test user as the average of the ratings by those similar training users
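The two-step strategy above (find similar training users, then average their ratings) can be sketched as a small memory-based predictor. This is an illustrative reconstruction, not the authors' code; the rating matrix, the test user, and the choice of k are made-up example values.

```python
import numpy as np

def pearson_sim(a, b):
    """Pearson correlation over items co-rated by both users (NaN = unrated)."""
    mask = ~np.isnan(a) & ~np.isnan(b)
    if mask.sum() < 2:
        return 0.0
    x, y = a[mask], b[mask]
    sx, sy = x.std(), y.std()
    if sx == 0 or sy == 0:
        return 0.0
    return float(((x - x.mean()) * (y - y.mean())).mean() / (sx * sy))

def predict(ratings, test_user, item, k=2):
    """Predict test_user's rating of `item` as the similarity-weighted
    average of the k most similar training users who rated the item."""
    sims = []
    for u in range(ratings.shape[0]):
        if not np.isnan(ratings[u, item]):
            sims.append((pearson_sim(test_user, ratings[u]), ratings[u, item]))
    sims.sort(reverse=True)
    top = sims[:k]
    num = sum(s * r for s, r in top)
    den = sum(abs(s) for s, r in top)
    return num / den if den > 0 else float(np.nanmean([r for _, r in sims]))

# Toy example (made-up ratings): 3 training users x 4 items.
ratings = np.array([[5, 4, 1, 5],
                    [4, 5, 1, 4],
                    [1, 1, 5, 1]], dtype=float)
alice = np.array([5, 4, 1, np.nan])   # test user; item 3 is unrated
pred = predict(ratings, alice, item=3, k=2)
```

Here `alice` agrees closely with the first two training users, so the prediction for item 3 lands near their high ratings rather than near the dissimilar third user's rating.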
Important Problems in Collaborative Filtering • How to estimate users’ similarity when rating data are sparse? • Most users only rate a few items • How to identify the interests of a test user who only provides ratings for a few items? • Most users are too impatient to rate many items • How to combine collaborative filtering with content filtering? • For movie ratings, both the content information and the user ratings are available
Problem I: How to Estimate Users’ Similarity based on Sparse Rating Data?
Sparse Data Problem(Breese et al., 1998) Most users only rate a small number of items and leave most items unrated
Flexible Mixture Model (FMM) (Si & Jin, 2003) • Cluster training users with similar interests • Cluster items with similar ratings (the slide groups movies into Movie Type I, II, and III) • Introduce rating uncertainty • Unknown ratings are gone! • Cluster both users and items simultaneously
Flexible Mixture Model (FMM) (Si & Jin, 2003) • Graphical model with hidden cluster variables Zu (user class) and Zo (item class), and observed variables U (user), O (item), and R (rating) • An Expectation Maximization (EM) algorithm can be used to identify the clustering structure for both users and items
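Reading off the dependencies stated for the graphical model (user and item each generated from their cluster, rating generated from the pair of clusters), the FMM joint distribution can be sketched as follows; this is a reconstruction from the slide, with m user classes and n item classes:

```latex
P(u, o, r) \;=\; \sum_{z_u = 1}^{m} \sum_{z_o = 1}^{n}
  P(z_u)\, P(z_o)\, P(u \mid z_u)\, P(o \mid z_o)\, P(r \mid z_u, z_o)
```

EM then alternates between computing the posterior over the hidden pair (z_u, z_o) for each observed rating triple and re-estimating the component distributions.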
Rating Variance (Jin et al., 2003a) • The Flexible Mixture Model assumes that users with similar interests give similar ratings to the same items • But different users with similar interests may have different rating habits
Decoupling Model (DM) (Jin et al., 2003b) • Graphical model with hidden variables Zu (user class), Zo (item class), Zpref (whether the user likes the item), and ZR (rating class) • Observed variables: U (user), O (item), and R (rating)
Empirical Studies • EachMovie dataset: • 2000 users and 1682 movie items • Avg. # of rated items per user is 130 • Rating range: 0-5 • Evaluation protocol • 400 training users, and 1600 testing users • Numbers of items rated by a test user: 5, 10, 20 • Evaluation metric: MAE • MAE: mean absolute error between true ratings and predicted ratings • The smaller the MAE, the better the performance
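The MAE metric used throughout the evaluation can be sketched directly:

```python
def mean_absolute_error(true_ratings, predicted_ratings):
    """MAE: mean absolute difference between true and predicted ratings.
    The smaller the MAE, the better the performance."""
    pairs = list(zip(true_ratings, predicted_ratings))
    return sum(abs(t - p) for t, p in pairs) / len(pairs)
```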
Baseline Approaches • Ignore unknown ratings • Vector similarity (Breese et al., 1998) • Fill in unknown ratings for individual users with their average ratings • Personality diagnosis (Pennock et al., 2000) • Pearson correlation coefficient (Resnick et al., 1994) • Only cluster users • Aspect model (Hofmann & Puzicha, 1999)
Summary • The sparse data problem is important to collaborative filtering • Flexible Mixture Model (FMM) is effective • Cluster both users and items simultaneously • Decoupling Model (DM) provides additional improvement for collaborative filtering • Take into account rating variance among users of similar interests
Problem II: How to Identify Users’ Interests Based on a Few Rated Items?
Identify Users’ Interests • To identify the interests of a user, the system needs to ask the user to rate a few items • Given that a user is only willing to rate a few items, which items should be selected to solicit ratings?
Active Learning Approaches (Ross & Zemel, 2002) • Selective sampling • Ask a user to rate the items that are most informative about the user’s interests • A general strategy • Define a loss function that represents the uncertainty in determining the user’s interests • Choose the item whose rating will result in the largest reduction in the loss function
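The general strategy (define a loss, pick the item that most reduces it in expectation) can be sketched with the entropy of the user-class posterior as a hypothetical loss; the class posterior and rating distributions below are toy assumptions for illustration, not the paper's model:

```python
import math

def entropy(p):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(q * math.log(q) for q in p if q > 0)

def select_item(class_post, p_rating, candidates):
    """Pick the candidate item whose rating, in expectation, most
    reduces the entropy of the user-class posterior.
    class_post: p(z | D), a list over m user classes.
    p_rating[x][z][r]: p(r | item x, user class z)."""
    best, best_loss = None, float("inf")
    for x in candidates:
        exp_loss = 0.0
        for r in range(len(p_rating[x][0])):
            # Marginal probability of observing rating r on item x.
            pr = sum(class_post[z] * p_rating[x][z][r]
                     for z in range(len(class_post)))
            if pr == 0:
                continue
            # Bayes update of the class posterior after observing (x, r).
            post = [class_post[z] * p_rating[x][z][r] / pr
                    for z in range(len(class_post))]
            exp_loss += pr * entropy(post)
        if exp_loss < best_loss:
            best, best_loss = x, exp_loss
    return best
```

For example, an item whose rating distribution is identical across user classes tells us nothing, while an item rated very differently by different classes sharply concentrates the posterior, so the latter is selected.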
Active Learning Approach (I)(Jin & Si, 2004) • Select the items that have the largest variance in the ratings by the most similar users
Active Learning Approach (II) (Jin & Si, 2004) • Consider all the training users when selecting items • Weight training users by their similarities when computing the “uncertainty” of items
A Bayesian Approach for Active Learning (Jin & Si, 2004) • Flexible Mixture Model: the key is to determine the user class of a test user • Let D be the ratings already provided by test user y: D = {(x1, r1), …, (xk, rk)} • Let θ be the distribution over user classes for test user y estimated from D: θ = {θz = p(z|y) | 1 ≤ z ≤ m}
A Bayesian Approach for Active Learning (Jin & Si, 2004) • If the true user-class distribution θtrue of the test user were given, we would select the item x* whose rating is expected to bring the estimated distribution closest to θtrue • Let θx,r be the distribution over user classes for test user y estimated from D + (x, r) • Take into account the uncertainty in rating prediction
A Bayesian Approach for Active Learning (Jin & Si, 2004) • But in reality, we never know the true user-class distribution θtrue of the test user • Replace θtrue with the posterior distribution p(θ|D) • This captures two types of uncertainty: uncertainty in the user-class distribution and uncertainty in rating prediction
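One plausible way to write the resulting selection criterion, with θ the user-class distribution for the test user, θ_{x,r} its re-estimate after adding the rating (x, r), and L a loss measuring the remaining uncertainty (this is a reconstruction consistent with the slides; the paper's exact loss may differ):

```latex
x^{*} \;=\; \arg\min_{x} \;
  \mathbb{E}_{\theta \sim p(\theta \mid D)}
  \left[ \sum_{r} p(r \mid x, \theta)\, L\!\left(\theta_{x,r}\right) \right]
```

Averaging over both the posterior p(θ|D) and the predictive rating distribution p(r|x, θ) is what accounts for the two kinds of uncertainty at once.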
Computational Issues • Estimating the posterior p(θ|D) is computationally expensive • Computing the expectation over p(θ|D) is also expensive
Approximate Posterior Distribution (Jin & Si, 2004) • Approximate p(θ|D) by a Laplace approximation • Expand the log-likelihood function around its maximum point θ*
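A standard Laplace approximation, sketched here in generic form since the slide does not give the exact expression: expand the log posterior to second order around its maximum θ*,

```latex
\log p(\theta \mid D) \;\approx\; \log p(\theta^{*} \mid D)
  \;-\; \tfrac{1}{2}\, (\theta - \theta^{*})^{\top} H\, (\theta - \theta^{*}),
\qquad
H \;=\; -\nabla^{2}_{\theta} \log p(\theta \mid D)\,\big|_{\theta = \theta^{*}}
```

which amounts to approximating p(θ|D) by a Gaussian with mean θ* and covariance H⁻¹; expectations under a Gaussian are then tractable.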
Compute Expectation (Jin & Si, 2004) • The expectation can be computed analytically under the approximate posterior distribution p(θ|D)
Empirical Studies • EachMovie dataset • 400 training users, and 1600 test users • For each test user • Initially provides 3 rated items • 5 iterations, and 4 items are selected for each iteration • Evaluation metric • Mean Absolute Error (MAE)
Baseline Approaches • The random selection method • Randomly select 4 items per iteration • The model entropy method • Select items that result in the largest reduction in the entropy of the distribution p(θ|D) • Considers only the uncertainty in the model distribution • The prediction entropy method • Select items that result in the largest reduction in the uncertainty of rating prediction • Considers only the uncertainty in rating prediction
Summary • Active learning is effective for identifying users’ interests • It is important to take into account every bit of uncertainty when applying active learning methods
Problem III: How to Combine Collaborative Filtering with Content Filtering?
Linear Combination (Good et al., 1999) • Build separate prediction models from the content information and the collaborative information • Linearly combine their predictions
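A minimal sketch of the linear combination, assuming a single global mixing weight alpha (a hypothetical parameter, e.g. tuned on held-out data; the actual weighting scheme in Good et al. may differ):

```python
def combined_prediction(content_pred, collab_pred, alpha=0.5):
    """Blend a content-based prediction with a collaborative one.
    alpha = 1.0 trusts content only; alpha = 0.0 trusts collaboration only."""
    return alpha * content_pred + (1.0 - alpha) * collab_pred
```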