Discover how collaborative filtering techniques, such as user-based and item-based nearest neighbor methods, are used to predict user preferences and make personalized recommendations. Gain insights into similarity measures such as Jaccard, Cosine, and the Gaussian Radial Basis Function, and into matrix factorization, which groups items by latent features to improve recommendation accuracy. Learn about algorithms such as Alternating Least Squares for efficient computation.
CSE 482 Lecture 15 (Collaborative Filtering)
Outline • What is a recommender system? • What is collaborative filtering? • What are the collaborative filtering techniques?
Recommender Systems • Automated systems that make recommendations based on the preferences of users • Motivation from the User’s Perspective • Lots of online products, books, movies, etc. • Help me narrow the choices available… • Motivation from the Business’ Perspective • “If I have 3 million customers on the web, I should have 3 million stores on the web.” (CEO of Amazon.com)
Collaborative Filtering • The technology behind most recommender systems • The process of filtering information by soliciting judgments from others to overcome the information overload problem • "Based on the premise that people looking for information should be able to make use of what others have already found and evaluated." (Maltz & Ehrlich, 1995)
Another Application: Netflix $1M Prize • Task: Given customer ratings on some movies, predict customer ratings on other movies • Example: If John rates “Mission Impossible” a 5, “Over the Hedge” a 3, and “Back to the Future” a 4, how would he rate “Harry Potter”, … ?
Collaborative Filtering • Collaborative filtering techniques are used to predict how well a user will like an item that he/she has not rated, given a set of historical preference judgments for a community of users.
Technique: Nearest Neighbor • User-Based Nearest Neighbor • Given a user u, generate a prediction for an item i by using the ratings for i from users in u’s neighborhood • Neighbors = users with similar interests • Need to define a similarity measure and the neighborhood size • Prediction = user u’s average rating plus the similarity-weighted deviation of each neighbor n’s rating for i from that neighbor’s average rating: pred(u, i) = avg(u) + [ Σ_n sim(u, n) * (r(n, i) − avg(n)) ] / Σ_n |sim(u, n)|
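The user-based prediction described above can be sketched in a few lines of Python. The user names, items, and ratings below are purely illustrative, and cosine similarity over co-rated items is one possible choice of similarity measure:

```python
import math

# Toy ratings: user -> {item: rating}. All names and values are illustrative.
ratings = {
    "Alice": {"A": 5, "B": 3, "C": 4},
    "Bob":   {"A": 4, "B": 3, "C": 5, "D": 4},
    "Carol": {"A": 1, "B": 5, "D": 2},
}

def mean_rating(u):
    r = ratings[u]
    return sum(r.values()) / len(r)

def sim(u, n):
    """Cosine similarity over items rated by both users."""
    common = set(ratings[u]) & set(ratings[n])
    if not common:
        return 0.0
    num = sum(ratings[u][i] * ratings[n][i] for i in common)
    du = math.sqrt(sum(ratings[u][i] ** 2 for i in common))
    dn = math.sqrt(sum(ratings[n][i] ** 2 for i in common))
    return num / (du * dn)

def predict(u, item):
    """User u's mean plus the similarity-weighted deviations of neighbors."""
    neighbors = [n for n in ratings if n != u and item in ratings[n]]
    num = sum(sim(u, n) * (ratings[n][item] - mean_rating(n)) for n in neighbors)
    den = sum(abs(sim(u, n)) for n in neighbors)
    return mean_rating(u) if den == 0 else mean_rating(u) + num / den

print(round(predict("Alice", "D"), 2))
```

Here the neighborhood is simply every other user who rated the item; a real system would restrict it to the k most similar users.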
Technique: Nearest Neighbor • Item-Based Nearest Neighbor • Given a user u, generate a prediction for an item i by using a weighted sum of the user u’s ratings for items that are most similar to i.
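A minimal sketch of the item-based variant follows; the data are illustrative, and cosine similarity between item rating columns is again an assumed choice:

```python
import math

# Toy ratings: item -> {user: rating}. All names and values are illustrative.
ratings = {
    "A": {"Alice": 5, "Bob": 4, "Carol": 1},
    "B": {"Alice": 3, "Bob": 3, "Carol": 5},
    "C": {"Alice": 4, "Bob": 5},
    "D": {"Bob": 4, "Carol": 2},
}

def item_sim(i, j):
    """Cosine similarity between two items over users who rated both."""
    common = set(ratings[i]) & set(ratings[j])
    if not common:
        return 0.0
    num = sum(ratings[i][u] * ratings[j][u] for u in common)
    di = math.sqrt(sum(ratings[i][u] ** 2 for u in common))
    dj = math.sqrt(sum(ratings[j][u] ** 2 for u in common))
    return num / (di * dj)

def predict(user, item):
    """Weighted sum of the user's own ratings on items similar to `item`."""
    rated = [j for j in ratings if j != item and user in ratings[j]]
    num = sum(item_sim(item, j) * ratings[j][user] for j in rated)
    den = sum(abs(item_sim(item, j)) for j in rated)
    return num / den if den else 0.0

pred = predict("Alice", "D")
```

Note the contrast with the user-based method: similarities are computed between items, and the weighted sum runs over the target user's own past ratings.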
Similarity Measure • Numerical measure of how alike two data instances are. • Higher when the instances are more alike • Examples of similarity measures
Jaccard Similarity • Let x and y be a pair of binary 0/1 vectors • M_ij: number of positions in which x = i and y = j • Jaccard(x, y) = M11 / (M01 + M10 + M11) • Example: Jaccard(John, Mary) = 0.25
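The counting in the formula above can be made concrete with a short sketch. The binary vectors below are hypothetical (the slide's John/Mary data is not reproduced); they are chosen so the result is also 0.25:

```python
def jaccard(x, y):
    """Jaccard similarity for binary 0/1 vectors: M11 / (M01 + M10 + M11)."""
    m11 = sum(1 for a, b in zip(x, y) if a == 1 and b == 1)
    m01 = sum(1 for a, b in zip(x, y) if a == 0 and b == 1)
    m10 = sum(1 for a, b in zip(x, y) if a == 1 and b == 0)
    return m11 / (m01 + m10 + m11)

# Hypothetical binary preference vectors (not the slide's actual data):
john = [1, 0, 1, 0, 0]
mary = [1, 1, 0, 1, 0]
print(jaccard(john, mary))  # 1 match out of 4 non-(0,0) positions -> 0.25
```

Positions where both vectors are 0 (M00) are deliberately ignored, which is what distinguishes Jaccard from simple matching on sparse binary data.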
Cosine Similarity • If d1 and d2 are two document vectors, then cos(d1, d2) = (d1 · d2) / (||d1|| ||d2||), where · indicates the vector dot product and ||d|| is the length of vector d. • Example: d1 = 3 2 0 5 0 0 0 2 0 0 d2 = 1 0 0 0 0 0 0 1 0 2 d1 · d2 = 3*1 + 2*0 + 0*0 + 5*0 + 0*0 + 0*0 + 0*0 + 2*1 + 0*0 + 0*2 = 5 ||d1|| = (3*3 + 2*2 + 0*0 + 5*5 + 0*0 + 0*0 + 0*0 + 2*2 + 0*0 + 0*0)^0.5 = (42)^0.5 = 6.481 ||d2|| = (1*1 + 0*0 + 0*0 + 0*0 + 0*0 + 0*0 + 0*0 + 1*1 + 0*0 + 2*2)^0.5 = (6)^0.5 = 2.449 cos(d1, d2) = 5 / (6.481 * 2.449) = 0.3150
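The worked example above can be reproduced directly in code, using the same two document vectors from the slide:

```python
import math

def cosine(d1, d2):
    """Cosine similarity: dot product divided by the product of lengths."""
    dot = sum(a * b for a, b in zip(d1, d2))
    n1 = math.sqrt(sum(a * a for a in d1))
    n2 = math.sqrt(sum(b * b for b in d2))
    return dot / (n1 * n2)

# The slide's example vectors:
d1 = [3, 2, 0, 5, 0, 0, 0, 2, 0, 0]
d2 = [1, 0, 0, 0, 0, 0, 0, 1, 0, 2]
print(round(cosine(d1, d2), 4))  # matches the worked value of about 0.315
```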
Gaussian Radial Basis Function • Let x and y be the feature vectors for two data instances • RBF(x, y) = exp(−||x − y||^2 / (2σ^2)), where σ is the bandwidth parameter • Example: σ = 0.1
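A minimal sketch of this kernel follows, assuming the standard Gaussian RBF form with the slide's σ = 0.1 as the bandwidth (the slide's numeric example is not reproduced):

```python
import math

def rbf(x, y, sigma=0.1):
    """Gaussian RBF similarity: exp(-||x - y||^2 / (2 * sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2 * sigma ** 2))

# Identical vectors give similarity 1; similarity decays rapidly with distance.
print(rbf([1.0, 2.0], [1.0, 2.0]))  # 1.0
```

Unlike cosine similarity, the RBF depends on Euclidean distance, and a small σ makes the similarity fall off sharply away from exact matches.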
Technique: Matrix Factorization • Items are not independent and have inherent groupings • Movies can be grouped based on genres • Books can be grouped based on their topic areas • The groups can be treated as “latent” features of the data Given: ratings matrix R (users x items)
Technique: Matrix Factorization • Movie ratings What if genre is not the optimal grouping (since some movies may belong to multiple genres)? Can we automatically find an appropriate grouping of features?
Technique: Matrix Factorization • Given: ratings matrix R (users x items) • Goal is to factorize R into a product of two latent matrices, U and M, such that the following quantity is minimized: Σ_{(i,j) ∈ Ω(R)} (R_ij − (U M^T)_ij)^2 • where Ω(R) is the set of non-missing ratings in R
Technique: Matrix Factorization • Given: ratings matrix R (users x items) • Goal: decompose R into a product of matrices U and M^T (the superscript T denotes the matrix transpose operation) that best approximates R: R ≈ U M^T • U: user feature matrix (users x features) • M: item feature matrix (items x features)
Technique: Matrix Factorization • Given: an incomplete matrix R and parameter k (the number of latent features) • Alternating Least Squares (ALS) algorithm • Randomly initialize U, M, and the missing values in R • Repeat until convergence • Find M such that ||R − UM^T||_F is minimized • Find U such that ||R^T − MU^T||_F is minimized • For each missing value R_ij, replace it with the corresponding value of (UM^T)_ij
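The ALS loop above can be sketched with NumPy. The ratings matrix, the choice k = 2, and the mean-imputed initialization are all illustrative assumptions, not the lecture's actual example:

```python
import numpy as np

# Minimal ALS sketch for the imputation-style algorithm above (illustrative data).
rng = np.random.default_rng(0)
R = np.array([[5.0, 3.0, 0.0],
              [4.0, 0.0, 2.0],
              [1.0, 1.0, 5.0]])
mask = R > 0                 # observed entries; 0 denotes a missing rating
k = 2                        # number of latent features
n_users, n_items = R.shape
U = rng.random((n_users, k))
M = rng.random((n_items, k))
Rf = np.where(mask, R, R[mask].mean())   # start by filling missing with the mean

for _ in range(100):
    # Fix U, solve least squares for M; then fix M, solve for U
    M = np.linalg.lstsq(U, Rf, rcond=None)[0].T
    U = np.linalg.lstsq(M, Rf.T, rcond=None)[0].T
    # Replace the missing entries of R with the current predictions
    Rf = np.where(mask, R, U @ M.T)

pred = U @ M.T               # predicted ratings, including formerly missing entries
```

Production implementations (e.g., Spark MLlib's ALS) instead minimize the squared error over observed entries only, with regularization, but the alternation between the two least-squares subproblems is the same idea.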
Example: ratings matrix R (users x items), with snapshots of U, M, and UM^T shown at iterations 1, 50, 100, 200, and 500; as the ALS iterations proceed, the product UM^T converges to an increasingly accurate approximation of R.
Cold-Start Problem • What will you recommend to a new user who has not provided any ratings? • Utilize side information to make the recommendation • Examples: demographic and item content information • How to incorporate side information? • Use factorization machines or, more generally, cast recommendation as a regression problem on user features and item features
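The regression casting can be illustrated with a deliberately simple sketch: each training row concatenates hypothetical user features with hypothetical item features, the target is the rating, and plain least squares stands in for a factorization machine (which would add pairwise feature interactions):

```python
import numpy as np

# Hypothetical feature rows: [user age, user gender flag, item is-action, item is-comedy]
# All features, values, and ratings below are invented for illustration.
X = np.array([
    [25.0, 1.0, 1.0, 0.0],
    [25.0, 1.0, 0.0, 1.0],
    [40.0, 0.0, 1.0, 0.0],
    [40.0, 0.0, 0.0, 1.0],
])
y = np.array([5.0, 2.0, 1.0, 4.0])   # ratings to predict

# Ordinary least squares with a bias column as the simplest stand-in model
A = np.column_stack([np.ones(len(X)), X])
w, *_ = np.linalg.lstsq(A, y, rcond=None)

# A brand-new user (age 30, male) with no ratings can still get a prediction
new_row = np.array([1.0, 30.0, 0.0, 1.0, 0.0])
pred = float(new_row @ w)
```

The key point is that the model never needs the new user's rating history, only their features, which is exactly what sidesteps the cold-start problem.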