Active Learning for Preferences Elicitation in Recommender Systems Lior Rokach, Department of Information Systems Engineering
Agenda • Background - Active Learning and Recommender Systems • Proposed Method • Experimental Procedure • Results and Discussion • Conclusions and Future Work
Recommender Systems • Users are overloaded by options to consider before making a decision • such as which item to purchase • Recommender systems aim at supporting the user in the processes of • decision-making • planning • purchasing
Collaborative Filtering • Maintain users’ ratings of a variety of items. • For a given user: • Find other, similar users whose ratings strongly correlate with those of the current user • Recommend items rated highly by these similar users, but not yet rated by the current user. • Almost all existing commercial recommenders use this approach (e.g. Amazon).
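A minimal sketch of this neighborhood idea, assuming in-memory rating dictionaries; the function names and data layout are illustrative, not from the talk:

```python
import numpy as np

def pearson(u_ratings, v_ratings):
    """Pearson correlation over the items both users have rated."""
    common = set(u_ratings) & set(v_ratings)
    if len(common) < 2:
        return 0.0
    a = np.array([u_ratings[i] for i in common], dtype=float)
    b = np.array([v_ratings[i] for i in common], dtype=float)
    if a.std() == 0 or b.std() == 0:
        return 0.0
    return float(np.corrcoef(a, b)[0, 1])

def recommend(user, ratings, top_n=5):
    """ratings: dict user -> {item: rating}. Score items the user has not rated
    by the similarity-weighted ratings of positively correlated users."""
    sims = {v: pearson(ratings[user], ratings[v]) for v in ratings if v != user}
    scores = {}
    for v, s in sims.items():
        if s <= 0:
            continue
        for item, r in ratings[v].items():
            if item not in ratings[user]:
                scores[item] = scores.get(item, 0.0) + s * r
    return sorted(scores, key=scores.get, reverse=True)[:top_n]
```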
Active Learning • Traditional supervised learning algorithms • passively accept labeled training data and induce a prediction model • Active learning • useful when unlabeled data is abundant • labels are expensive • allows intelligent selection of which examples to label (diagrams: Passive Learning vs. Active Learning)
Using Active Learning for Initial Preferences Elicitation • The cold start problem • very little is known about the preferences of new users • Possible modus operandi • Ask the user to rate a few items • Which items?
Using Active Learning for Initial Preferences Elicitation (illustration)
Active Learning in Critique-Based Recommender Systems (Ricci and Nguyen, 2007) • A series of interaction cycles to • narrow down the user’s query • until the desired item is obtained
Integrating Active Learning in CF-based Recommender Systems • Active Learning (AL) in RecSys • accurately predicts items of interest to the user • while gaining information about her preferences. • In this lecture we focus on • Uncertainty Active Collaborative Filtering • Boutilier et al. (2003) • Rong and Luo (2004) • …
Our Contributions • Incorporate the exploration and exploitation trade-off: the value of information of new ratings VS. the alternative utility of not presenting the best items • Work local – think global • Use the ratings of one user to contribute to other users • Introduce Cost-Sensitivity (not going to talk about that)
Agenda • Background - Active Learning and Recommender Systems • Proposed Method • Experimental Procedure • Results and Discussion • Conclusions and Future Work
Preliminaries • Binary rating: Like/Dislike • Explicit • Implicit – based on user actions such as: • Buy • Click the item for additional details • Provide a recommendation of the top n items • The user selects from this list • We ignore the fact that she can browse the remaining items. • We use a simple item-to-item NN CF • with a similarity measure such as Pearson correlation.
Item-to-Item NN CF with Binary Ratings • r*ui can be used to approximate the probability that user u would like item i • Some use the Jaccard coefficient instead
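A minimal sketch of such an item-to-item predictor under these assumptions: binary ratings, Jaccard similarity between items, and a hypothetical map item_likers[j] giving the set of users who liked item j (none of these names come from the talk):

```python
def jaccard(users_i, users_j):
    """Similarity of two items from the sets of users who liked each of them."""
    union = len(users_i | users_j)
    return len(users_i & users_j) / union if union else 0.0

def predict(u_ratings, item_likers, i, k=20):
    """r*_ui: similarity-weighted average of user u's binary ratings over the
    k rated items most similar to item i (item-to-item NN CF)."""
    neighbours = sorted(
        ((jaccard(item_likers[i], item_likers[j]), r)
         for j, r in u_ratings.items() if j != i),
        reverse=True)[:k]
    denom = sum(s for s, _ in neighbours)
    return sum(s * r for s, r in neighbours) / denom if denom else 0.0
```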
Probabilistic Approach • Employ the rule of succession (Laplace correction) • to find the conditional probability of a positive response in the next presentation of item i to user u: • where itemSim should be normalized such that:
Mathematical interlude: Rule of succession • The proportion p of positive responses is treated as a uniformly distributed random variable • Some claim that p is not random, but uncertain • We assign a probability distribution to p to express uncertainty, not to attribute randomness • Let Xi,j be an indicator variable • it equals 1 when user i responded positively to item j, with probability pj of success (0 otherwise) • it has a Bernoulli distribution.
Mathematical interlude: Rule of succession – cont. • Suppose these Xs are conditionally independent given pj; thus the likelihood is: • The conditional probability distribution of pj given the data Xi,j, i = 1, ..., n, is proportional to the product of the "prior" (i.e., marginal) probability measure assigned to pj and the likelihood function (Bayes' theorem)
Mathematical interlude: Rule of succession – cont. • The posterior probability density function is a beta distribution (its expected value is given in the sketch below). • The rule of succession: the conditional probability of a positive response in the next presentation of item j, given pj, is just pj.
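A standard derivation sketch for the interlude above, assuming a uniform prior on pj and s positive responses out of n:

```latex
f(p_j \mid X_{1,j},\dots,X_{n,j}) \;\propto\; p_j^{\,s}\,(1-p_j)^{\,n-s}
\quad\Longrightarrow\quad
p_j \mid \text{data} \;\sim\; \mathrm{Beta}(s+1,\; n-s+1),
\qquad
\Pr\bigl(X_{n+1,j}=1 \mid \text{data}\bigr) \;=\; \mathbb{E}\bigl[p_j \mid \text{data}\bigr] \;=\; \frac{s+1}{n+2}.
```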
The Benefit and Risk of a Top 1 Recommendation • A simple scenario: • Recommend the best (top 1) item from only two possible items • The risk: the presented item (item 1) is not selected by the user, but if item 2 had been presented to the user it would have been chosen
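The slide's formula itself was not preserved; one plausible formalization of the stated risk, under the probability estimates P(u,i) introduced earlier (an assumption for illustration, not the slide's own equation):

```latex
\mathrm{Risk} \;=\; \bigl(1 - P(u,1)\bigr)\cdot P(u,2)
```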
Risk Reduction • Risk reduces as more ratings become available
Risk Reduction Calculation • Estimate CurrentRisk • Obtain the rating rui: positive with probability P(u,i) = 0.2, negative with probability 1 − P(u,i) = 0.8 • Rebuild the recommendation list assuming rui = 1 and estimate NewRisk; rebuild it assuming rui = 0 and estimate NewRisk • RiskReduction = CurrentRisk − NewRisk (see the sketch below)
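Read as pseudocode, the flow above is an expected-value calculation; a hedged sketch in which estimate_risk and rebuild_list are hypothetical callbacks into the CF model (not functions named in the talk), and the expectation over the two outcomes is taken explicitly:

```python
def expected_risk_reduction(u, i, p_ui, estimate_risk, rebuild_list):
    """Value of asking user u to rate item i.
    rebuild_list(u, {item: rating}) returns the recommendation list rebuilt with
    the hypothetical extra ratings; estimate_risk(u, rec_list) scores its risk."""
    current_risk = estimate_risk(u, rebuild_list(u, {}))

    # Outcome 1: the user responds positively (probability p_ui).
    risk_pos = estimate_risk(u, rebuild_list(u, {i: 1}))
    # Outcome 2: the user responds negatively (probability 1 - p_ui).
    risk_neg = estimate_risk(u, rebuild_list(u, {i: 0}))

    expected_new_risk = p_ui * risk_pos + (1 - p_ui) * risk_neg
    return current_risk - expected_new_risk
```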
Loss/Utility • If the net revenues of the items are known, the risk/benefit is easily converted into loss/utility.
The Benefit and Risk of Top 1 Recommendation • Extended scenario: • Recommending the best (top 1) item from n possible items • The benefit: as before
The Benefit and Risk of Top 1 Recommendation • A high number of items limits the use of this formula in practice • Fortunately, easy-to-calculate tight lower and upper bounds exist (Prekopa and Gao, 2005)
Cascaded risk reduction for top n • Assumptions: • The user selects only one item (positive response) • The user reviews the items according to their order in the list • Cascade: estimate CurrentRisk; ru1 is positive with probability P(u,1), otherwise (probability 1 − P(u,1)) the user moves on to ru2, then ru3, and so on; estimate NewRisk at each branch (see the sketch below)
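A hedged sketch of that cascade, reusing the hypothetical estimate_risk / rebuild_list callbacks from the earlier sketch; the exact bookkeeping is an assumption, not the talk's implementation:

```python
def cascaded_risk_reduction(u, presented, probs, estimate_risk, rebuild_list):
    """presented: the top-n list shown to user u; probs[k] is P(u, presented[k]).
    The user scans the list in order and selects at most one item."""
    current_risk = estimate_risk(u, rebuild_list(u, {}))
    expected_new_risk = 0.0
    observed = {}   # implicit negative responses gathered while scanning
    reach = 1.0     # probability of reaching this position without a selection
    for item, p in zip(presented, probs):
        # Branch: the user selects this item (positive response here,
        # negative responses for everything scanned before it).
        branch = {**observed, item: 1}
        expected_new_risk += reach * p * estimate_risk(u, rebuild_list(u, branch))
        # Otherwise she skips it and keeps scanning.
        observed[item] = 0
        reach *= 1 - p
    # Final branch: no item was selected at all.
    expected_new_risk += reach * estimate_risk(u, rebuild_list(u, dict(observed)))
    return current_risk - expected_new_risk
```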
Multiple Users • When user u provides an additional rating, • not only does the risk/benefit of user u evolve • but so does the risk/benefit of other users (Collaborative Filtering)
Goal Formulation • U – set of Users • I – set of Items • DRLj – default ranked list for user j. • For example, the list that would be selected by CF according to r*ui. • A Ranked List is an ordered set of pairs
Goal Formulation – cont. • Find PRLu (the ranked list to be presented to user u) such that: • wv – weight for user v • Frequent users should have larger weights. • Tk is used to control the exploration/exploitation tradeoff • We employ simulated annealing with a simple and common exponential schedule:
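The schedule itself was part of the lost slide graphic; a common exponential cooling schedule of the kind referred to (the symbols T0 and α are assumptions for illustration) is:

```latex
T_k \;=\; T_0 \cdot \alpha^{\,k}, \qquad 0 < \alpha < 1,
```

so the exploration weight decays geometrically as the session index k grows.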
Switching from active to passive • Risk reduction converges to zero as the number of ratings tends to infinity • When sufficient ratings have been obtained, go from active to passive
Proposition 1: Who is affected? • When a new rating for item i by user u is added to an item-to-item NN CF, • the recommendation list of user v ≠ u is revised iff user v has rated at least one item that has been rated by user u. • Proof: • Straightforward
Illustration of Proposition 1 (figure: affected vs. not-affected users)
Proposition 2: How many are affected? • Assumption: the provided ratings are scattered uniformly over the item-user matrix • The expected proportion of users affected by adding a new rating is: • where • N is the total number of items • n is the mean number of ratings provided by a single user • Example • N = 2,000,000, n = 210 → prop = 2% • N = 17,000, n = 210 → prop = 91%
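The proportion formula was lost with the slide graphic; one reconstruction consistent with both numerical examples (my assumption, not the slide's own equation) is the probability that a user with n uniformly scattered ratings shares at least one rated item with the rating user:

```latex
\mathrm{prop} \;=\; 1 - \frac{\binom{N-n}{n}}{\binom{N}{n}} \;\approx\; 1 - \Bigl(1 - \frac{n}{N}\Bigr)^{\!n}
```

which gives roughly 2% for N = 2,000,000, n = 210 and roughly 91–93% for N = 17,000, n = 210.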
A Greedy Algorithm • Finding the optimal ranked list for user u is a computationally intractable problem • Approximate solution (see the sketch below): • Greedily select the items to be presented from the top k·l items of user u, • where l is the number of items presented on a single page, • and k is a small integer • Calculate the risk reduction for a sample of m users selected randomly from all potentially affected users • Approximate the actual reduction by simple scaling
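A hedged sketch of that greedy loop; the callbacks benefit(u, item) and risk_reduction(v, item) are illustrative assumptions rather than the talk's implementation, and the per-user weights wv and the annealing temperature Tk are omitted for brevity:

```python
import random

def greedy_presented_list(u, drl, l, k, m, affected_users, benefit, risk_reduction):
    """Pick the l items to present to user u from the top k*l candidates of the
    default ranked list drl, scoring each item by its benefit to u plus the
    risk reduction estimated on a random sample of m potentially affected users."""
    candidates = list(drl[:k * l])
    sample = random.sample(affected_users, min(m, len(affected_users)))
    # Scale the sampled risk reduction up to the full affected population.
    scale = len(affected_users) / max(1, len(sample))

    def score(item):
        exploration = scale * sum(risk_reduction(v, item) for v in sample)
        return benefit(u, item) + exploration

    presented = []
    while len(presented) < l and candidates:
        best = max(candidates, key=score)
        presented.append(best)
        candidates.remove(best)
    return presented
```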
Computational Complexity • Assuming hash map structures for: • Rated items for each user • Rating users for each item • DRL for each user • (Complexity table: greedily selecting the items in the list, benefit, risk reduction)
Agenda • Background - Active Learning and Recommender Systems • Proposed Method • Experimental Procedure • Results and Discussion • Conclusions and Future Work
The Experiments’ Goals • Compare the proposed active learning algorithm to passive learning. • Evaluate • contribution of the global effect • Monte-Carlo procedure for selecting the affected users • scalability of the greedy algorithm • Our main evaluation criterion: • Precision
Data • We actively select items to be presented to the user and expect to obtain the user’s response to these items. • Available offline datasets (such as Netflix) are sparse and therefore cannot guarantee response to all items we present. • Three options: • Find several sub-matrices that are dense • Filter DRLs according to the items known to be rated by the user. • Work online.
Offline evaluation • Six mutually exclusive dense submatrices of 50 users over 50 movies were extracted from Netflix • The provided ratings were transformed into a binary scale (ratings above the user’s average are considered positive). • In each iteration we randomly selected a user and simulated a request for a recommendation, assuming l=5, k=5. • The initial probability estimates of items for all users are assumed to be uniform.
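A small sketch of the binarization step described above; the column names and the pandas layout are assumptions for illustration:

```python
import pandas as pd

def binarize(ratings: pd.DataFrame) -> pd.DataFrame:
    """Ratings above the user's own average become 1 (like), the rest 0.
    Assumes columns 'user' and 'rating' (hypothetical names)."""
    user_mean = ratings.groupby("user")["rating"].transform("mean")
    out = ratings.copy()
    out["rating"] = (ratings["rating"] > user_mean).astype(int)
    return out
```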
Agenda • Background - Active Learning and Recommender Systems • Proposed Method • Experimental Procedure • Results and Discussion • Conclusions and Future Work
Offline (Netflix) Results: Passive vs. Active • Both methods display unimodal, quadratic-like growth up to a peak. • Both converge to the same value. • The positive effect of active learning is greatest at around 200 sessions, with an improvement of 15%.
Offline (Netflix) Results: The effect of recommendation list size (k)
Offline (Netflix) Results: The effect of the number of referred users (m)
How much time does it really take? • In a real application • 8,000,000 items • 1,000,000 users • l=10, k=10, m=200 • About 12 msec with an Intel Core 2 Duo CPU E7400 @ 2.80 GHz. • l=10, k=5, m=200 • About 1.5 msec
Agenda • Background - Active Learning and Recommender Systems • Proposed Method • Experimental Procedure • Results and Discussion • Conclusions and Future Work
Conclusions • A new Uncertainty Active Collaborative Filtering method has been developed. • The new method takes into consideration the global effect. • The new method can improve objective and subjective performance.
Drawbacks • Like any uncertainty-based AL, reducing uncertainty may not always improve accuracy (Rubens et al., 2010) • A more intensive calculation than passive CF.
Future Work • Evaluate on a large dataset (under investigation) • Extend the method to other CF algorithms and compare to other Active Learning CF methods (under investigation) • Evaluate the method on a large-scale online system (scheduled for 4/2010) • Extend the algorithm to a non-binary scale (5 stars) • Develop a batch-mode algorithm • Develop a better sampling method for selecting the affected users • Consider other heuristics • Take the temporal aspect into consideration (Netflix)
Thank You, Lior Rokach Email: liorrk@bgu.ac.il