Collaborative Filtering: Latent Variable Model LIU Tengfei Computer Science and Engineering Department April 13, 2011
Outline • Overview of CF approaches • Model based approach-latent variable model • Probabilistic latent semantic analysis (PLSA) • Other latent variable models • Summary
Overview of CF Approaches • CF Categories • Memory-based CF • Performs some form of nearest-neighbor search to predict the rating for a particular user-item pair. • Model-based CF • Trains a compact model that explains the given data, so that ratings can be predicted from the model.
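The memory-based idea above can be sketched in a few lines. This is only an illustration under assumptions not stated on the slide: ratings are stored as nested dictionaries, similarity is cosine similarity over co-rated items, and the prediction is a similarity-weighted average over the k nearest users. The function names and the toy data are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity over the items both users rated (dicts: item -> rating)."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[i] * b[i] for i in common)
    norm_a = math.sqrt(sum(a[i] ** 2 for i in common))
    norm_b = math.sqrt(sum(b[i] ** 2 for i in common))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def predict_memory_based(ratings, user, item, k=2):
    """Similarity-weighted average over the k most similar users who rated `item`."""
    neighbors = [
        (cosine_similarity(ratings[user], ratings[v]), ratings[v][item])
        for v in ratings
        if v != user and item in ratings[v]
    ]
    neighbors.sort(key=lambda pair: pair[0], reverse=True)
    top = [(s, r) for s, r in neighbors[:k] if s > 0]
    if not top:
        return None  # no usable neighbor rated this item
    return sum(s * r for s, r in top) / sum(s for s, _ in top)

# Toy data (hypothetical): alice has not rated item "C".
ratings = {
    "alice": {"A": 5, "B": 3},
    "bob": {"A": 5, "B": 3, "C": 4},
    "carol": {"A": 1, "B": 5, "C": 2},
}
prediction = predict_memory_based(ratings, "alice", "C")
```

Every prediction scans all other users, which is the computational cost that motivates the model-based alternative.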
Model based approach • Question: • What are the shortcomings of memory-based methods? • Reasons: • Suboptimal solution problem • Little knowledge learned from data • Computationally expensive in local-neighbor search • ……
Probabilistic Latent Semantic Analysis • The Problem • We want to predict the rating r that user u may assign to item i • Why a latent variable model? • Consider a simple case: • User x likes/dislikes item y “because of” some reason • The reason cannot be observed, but may exist • We introduce a latent variable to model it
Probabilistic Latent Semantic Analysis • Question: what rating is user u likely to give to item i? • Can we describe it with a probability? • The probability of the rating that a user gives to an item is decomposed into a sum of products: $p(r \mid u, i) = \sum_z p(z \mid u)\, p(r \mid i, z)$ • $p(r \mid i, z)$: probability that class z (can be seen as a community in CF) would assign score r to item i • $p(z \mid u)$: mixing proportion • z is the latent variable
Probabilistic Latent Semantic Analysis Intuitive Graph Representation
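Probabilistic Latent Semantic Analysis • The probability that class z assigns score r to item i is modeled as a Gaussian distribution: $p(r \mid i, z) = \mathcal{N}(r;\, \mu_{i,z}, \sigma_{i,z}^2)$ • The mixing proportion $p(z \mid u)$ can be modeled as a categorical distribution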
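A minimal sketch of how the two pieces combine into a rating density for one user-item pair: the user's categorical mixing proportions weight one Gaussian per class. The parameter values and the names `rating_density`, `p_z_given_u`, `item_means`, and `item_stds` are assumptions made for illustration, not values from the slides.

```python
import math

def gaussian_pdf(r, mean, std):
    """Density of a normal distribution with the given mean and standard deviation."""
    return math.exp(-0.5 * ((r - mean) / std) ** 2) / (std * math.sqrt(2.0 * math.pi))

def rating_density(r, p_z_given_u, item_means, item_stds):
    """Mixture density: sum over classes z of p(z | u) * N(r; mu_{i,z}, sigma_{i,z}^2)."""
    return sum(
        p_z * gaussian_pdf(r, mu, sd)
        for p_z, mu, sd in zip(p_z_given_u, item_means, item_stds)
    )

# Hypothetical parameters for one user and one item with two classes (communities).
p_z_given_u = [0.8, 0.2]   # categorical mixing proportions, must sum to 1
item_means = [4.5, 2.0]    # class 0 tends to rate the item high, class 1 low
item_stds = [0.5, 1.0]

density_at_4 = rating_density(4.0, p_z_given_u, item_means, item_stds)
density_at_1 = rating_density(1.0, p_z_given_u, item_means, item_stds)
```

Because this user belongs mostly to the class that rates the item highly, the density near 4 is much larger than near 1.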
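Probabilistic Latent Semantic Analysis • To make predictions, we compute the expected rating: $E[r \mid u, i] = \sum_z p(z \mid u)\, \mu_{i,z}$, where $p(z \mid u)$ is the mixing proportion and $\mu_{i,z}$ is the mean of the Gaussian for item i in class z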
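Under a Gaussian mixture, the expected rating is the mixing-proportion-weighted average of the per-class means. The short sketch below assumes the parameters are already learned; the function name and the numbers are hypothetical.

```python
def expected_rating(p_z_given_u, item_means):
    """Expected rating: sum over classes z of p(z | u) * mu_{i,z}."""
    return sum(p_z * mu for p_z, mu in zip(p_z_given_u, item_means))

# Hypothetical learned parameters: the user is 80% in class 0 and 20% in class 1,
# and the two classes rate this item 4.5 and 2.0 on average.
predicted = expected_rating([0.8, 0.2], [4.5, 2.0])
```

The variances do not enter the point prediction; they only matter for the likelihood during training.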
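Probabilistic Latent Semantic Analysis • Model parameters can be learned by maximizing the log-likelihood of the observed data: $\mathcal{L} = \sum_{(u, i, r)} \log \sum_z p(z \mid u)\, p(r \mid i, z)$, where the outer sum runs over all observed (user, item, rating) triples • This can be readily solved using the EM algorithm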
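One way the EM procedure could look is sketched below, assuming the Gaussian form for the class-conditional rating distribution. It is an illustrative sketch, not the reference implementation from the cited papers: it omits refinements such as rating normalization and regularization, and the random initialization, the variance floor `min_std`, the function name `fit_gaussian_plsa`, and the toy data are all assumptions.

```python
import math
import random

def gaussian_pdf(r, mean, std):
    return math.exp(-0.5 * ((r - mean) / std) ** 2) / (std * math.sqrt(2.0 * math.pi))

def fit_gaussian_plsa(triples, n_classes, n_iters=30, min_std=0.3, seed=0):
    """EM for a mixture with per-user proportions p(z|u) and per-item Gaussians."""
    rng = random.Random(seed)
    users = sorted({u for u, _, _ in triples})
    items = sorted({i for _, i, _ in triples})
    # Random mixing proportions per user, normalized to sum to 1.
    p_z_u = {}
    for u in users:
        raw = [rng.random() + 0.1 for _ in range(n_classes)]
        p_z_u[u] = [x / sum(raw) for x in raw]
    # Means start near the global mean rating; standard deviations start at 1.
    global_mean = sum(r for _, _, r in triples) / len(triples)
    mu = {i: [global_mean + rng.uniform(-1, 1) for _ in range(n_classes)] for i in items}
    sigma = {i: [1.0] * n_classes for i in items}
    history = []  # log-likelihood measured at the start of each iteration
    for _ in range(n_iters):
        # E-step: responsibility of each class z for each observed triple.
        resp, log_lik = [], 0.0
        for u, i, r in triples:
            w = [p_z_u[u][z] * gaussian_pdf(r, mu[i][z], sigma[i][z])
                 for z in range(n_classes)]
            total = max(sum(w), 1e-300)  # guard against log(0)
            log_lik += math.log(total)
            resp.append([x / total for x in w])
        history.append(log_lik)
        # M-step for p(z|u): average responsibility over the user's ratings.
        for u in users:
            rows = [q for (v, _, _), q in zip(triples, resp) if v == u]
            p_z_u[u] = [sum(q[z] for q in rows) / len(rows) for z in range(n_classes)]
        # M-step for mu and sigma: responsibility-weighted mean and variance.
        for i in items:
            rows = [(r, q) for (_, j, r), q in zip(triples, resp) if j == i]
            for z in range(n_classes):
                weight = sum(q[z] for _, q in rows)
                if weight < 1e-12:
                    continue  # class unused for this item: keep old parameters
                mean = sum(q[z] * r for r, q in rows) / weight
                var = sum(q[z] * (r - mean) ** 2 for r, q in rows) / weight
                mu[i][z] = mean
                sigma[i][z] = max(math.sqrt(var), min_std)
    return p_z_u, mu, sigma, history

# Toy data (hypothetical): two groups of users with opposite tastes.
triples = [
    ("u1", "A", 5), ("u1", "B", 5), ("u1", "C", 1),
    ("u2", "A", 5), ("u2", "B", 4), ("u2", "C", 1),
    ("u3", "A", 1), ("u3", "B", 2), ("u3", "C", 5),
    ("u4", "A", 2), ("u4", "B", 1), ("u4", "C", 5),
]
p_z_u, mu, sigma, history = fit_gaussian_plsa(triples, n_classes=2)
```

The E-step computes how strongly each class explains each observed rating; the M-step then re-estimates the parameters as responsibility-weighted averages. Tracking `history` is a simple sanity check, since the log-likelihood should not decrease from one iteration to the next.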
Probabilistic Latent Semantic Analysis • Question 1: • How are the model parameters learned with the EM algorithm? • Question 2: • How should the EM algorithm be understood?
Other latent variable models • Probabilistic latent preference analysis (PLPA) • Reference: • N. N. Liu et al., Probabilistic Latent Preference Analysis for Collaborative Filtering, CIKM 2009
Summary • CF is popular • Memory-based methods • Advantages and shortcomings • Model-based methods • Latent variable model • Probabilistic latent semantic analysis • Other latent variable models
Thank you! Questions?
References • Thomas Hofmann, Collaborative Filtering via Gaussian Probabilistic Latent Semantic Analysis, SIGIR 2003 • Thomas Hofmann, Latent Semantic Models for Collaborative Filtering, ACM Transactions on Information Systems, 2004 • Abhinandan Das et al., Google News Personalization: Scalable Online Collaborative Filtering, WWW 2007 • N. N. Liu et al., Probabilistic Latent Preference Analysis for Collaborative Filtering, CIKM 2009 • Xiaoyuan Su et al., A Survey of Collaborative Filtering Techniques, 2009