
The Wisdom of the Few: A Collaborative Filtering Approach Based on Expert Opinions from the Web

Xavier Amatriain, Josep M. Pujol, Nuria Oliver (Telefonica Research, Spain), Neal Lathia (University College London, UK), Haewoon Kwak (KAIST, Korea)



Presentation Transcript


  1. The Wisdom of the Few: A Collaborative Filtering Approach Based on Expert Opinions from the Web
Xavier Amatriain, Josep M. Pujol, Nuria Oliver (Telefonica Research, Spain), Neal Lathia (University College London, UK), Haewoon Kwak (KAIST, Korea)
Session: Recommenders, SIGIR'09, July 19-23, 2009
Presented by SungEun Park, Intelligent Database Systems Lab, School of Computer Science & Engineering, Seoul National University, Seoul, Korea, 2010-02-10

  2. Brief Summary
[Figure: a small set of experts connected to many users]
• A collaborative filtering approach: instead of relying only on the wisdom of the crowd, try expert opinions as an external data source for CF.
• Neighbor-CF vs. expert-CF.
• Key contribution: experiments and analysis on the datasets and the expert-CF method.

  3. Contents
• Goal
• Dataset Analysis
• Expert Nearest-Neighbors Approach
• Experiment
  • Mean Absolute Error
  • Top-N Recommendation Precision
  • User Study
• Discussion

  4. Goal
• The goal is not to increase CF accuracy, but to:
  • Understand how the preferences of a large population can be predicted using a very small set of users.
  • Understand the potential of an independent and uncorrelated data set to generate recommendations.
  • Analyze whether professional raters are good predictors for general users.
  • Discuss how this approach addresses some of the traditional pitfalls in CF.

  5. Dataset
• Expert data set: Rottentomatoes.com
  • Aggregates the opinions of movie critics from various media sources.
  • 169 of 1,750 experts kept (threshold: more than 250 ratings each).
• User data set: Netflix.com movie reviews
  • 8,000 of the 17,770 movies, matched to those in Rotten Tomatoes.
  • 20% of the movies have only one rating.

  6. Dataset Analysis (1)
[Figure: CDF of per-user rating counts]
• CDF (cumulative distribution function): defined as the cumulative integral of the probability density function P[u] of a random variable u.
• Number of ratings and data sparsity:
  • The expert matrix is less sparse than the user matrix and more evenly distributed.
  • Sparsity coefficient: users 0.01 vs. experts 0.07.
  • A single expert typically has more ratings than a single user.
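The sparsity coefficient above is just the fraction of filled cells in the rating matrix. A minimal sketch; the matrix sizes and densities below are synthetic, chosen only to mimic the reported user vs. expert coefficients:

```python
import numpy as np

def sparsity_coefficient(matrix):
    """Fraction of non-zero (i.e. filled) cells in a rating matrix, 0 = unrated."""
    return np.count_nonzero(matrix) / matrix.size

# Synthetic rating matrices with ~1% and ~7% of cells filled (ratings 1-5),
# mimicking the reported user vs. expert sparsity coefficients.
rng = np.random.default_rng(0)
users = (rng.random((1000, 500)) < 0.01) * rng.integers(1, 6, (1000, 500))
experts = (rng.random((169, 500)) < 0.07) * rng.integers(1, 6, (169, 500))

print(round(sparsity_coefficient(users), 3))
print(round(sparsity_coefficient(experts), 3))
```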

  7. Dataset Analysis (2)
[Figure: average rating distributions per movie and per user]
• Average rating distribution:
  • Experts tend to act similarly on individual movies.
  • Their overall opinions across movies are more varied.

  8. Dataset Analysis (3)
[Figure: rating standard deviation; the users' curve is wider]
• Rating standard deviation:
  • Experts tend to agree more than regular users (lower standard deviation).
  • Experts tend to deviate less from their personal average rating.

  9. Expert Nearest-Neighbors Approach
• Compute the similarity between each user and the expert set.
• The similarity measure includes an adjusting factor that takes into account the number of items co-rated by both users:
  • a, b: users
  • Na, Nb: the number of items rated by each user
  • Na∩b: the number of co-rated items
• Look only for experts whose similarity to the given user is greater than δ (risk of finding very few neighbors).
• Confidence threshold τ: the minimum number of expert neighbors who must have rated the item in order to trust their prediction.
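A sketch of the adjusted similarity in Python. The slide does not spell out the exact formula, so I assume the common "significance weighting" form of the adjusting factor, 2·Na∩b / (Na + Nb), applied to cosine similarity over co-rated items:

```python
import numpy as np

def adjusted_similarity(ra, rb):
    """Cosine similarity over co-rated items, damped by an assumed
    significance weight 2*|Na∩b| / (Na + Nb) so that pairs with few
    co-rated items get a lower similarity. 0 marks an unrated item."""
    co = (ra > 0) & (rb > 0)                  # co-rated items
    n_co = np.count_nonzero(co)
    if n_co == 0:
        return 0.0
    a, b = ra[co], rb[co]
    cos = a @ b / (np.linalg.norm(a) * np.linalg.norm(b))
    weight = 2 * n_co / (np.count_nonzero(ra > 0) + np.count_nonzero(rb > 0))
    return weight * cos

ra = np.array([5, 3, 0, 4, 0])
rb = np.array([4, 0, 2, 5, 1])
print(adjusted_similarity(ra, rb))
```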

  10. Expert Nearest-Neighbors Approach
• Given sim: V × V → R, define a set of experts E = {e1, ..., ek} ⊆ V and a set of users U = {u1, ..., uN} ⊆ V. For a particular user u ∈ U and a value δ, find the set of experts E′ ⊆ E such that ∀e ∈ E′: sim(u, e) ≥ δ.
• Let E′′ ⊆ E′ be such that ∀e ∈ E′′: rei ≠ ◦, where rei is the rating of item i by expert e ∈ E′, and ◦ marks an unrated item.
• With E′′ = {e1, ..., en}:
  • No prediction when n < τ.
  • The predicted rating is computed when n ≥ τ.
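The decision rule above can be sketched as follows. The weighted average here is a plain similarity-weighted mean; the paper may additionally center ratings on each expert's mean, so treat this as an illustrative simplification. The defaults δ = 0.01 and τ = 10 follow the values used later in the experiments:

```python
import numpy as np

def predict(user_sims, expert_item_ratings, delta=0.01, tau=10):
    """Expert-CF prediction for one (user, item) pair.

    user_sims: similarity of the target user to each expert (the set E).
    expert_item_ratings: each expert's rating of the item (0 = unrated, '◦').
    Returns None when fewer than tau similar experts rated the item (n < τ);
    otherwise a similarity-weighted average of their ratings.
    """
    mask = (user_sims >= delta) & (expert_item_ratings > 0)   # builds E''
    if np.count_nonzero(mask) < tau:
        return None                                           # no prediction
    w = user_sims[mask]
    return float(w @ expert_item_ratings[mask] / w.sum())

sims = np.full(12, 0.5)
ratings = np.full(12, 4)
print(predict(sims, ratings))          # 12 similar experts rated it 4 -> 4.0
print(predict(sims[:5], ratings[:5]))  # only 5 expert neighbors -> None
```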

  11. Experiment
• Mean Absolute Error and Coverage
• Top-N Recommendation Precision
• User Study

  12. Experiment (1): Mean Absolute Error and Coverage
[Figure: interplay between τ and δ; the setting used is τ = 10 and δ = 0.01]

  13. Experiment (1): Mean Absolute Error and Coverage
• 5-fold cross-validation: the user data set is split by random sampling into 80% training / 20% testing sets, with τ = 10 and δ = 0.01.
• Worst-case baseline: the "critics' choice" recommendation.
• Expert-CF yields a significant accuracy improvement over using the experts' average.
• Compared with NN-CF, expert-CF shows 0.08 higher MAE but 6% higher coverage.
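For reference, MAE and coverage as used here can be computed like this (the predictions are toy values; None marks items where the τ threshold blocked a prediction):

```python
import numpy as np

def mae_and_coverage(preds, actuals):
    """MAE over the items that received a prediction, plus coverage:
    the fraction of items for which a prediction was made at all."""
    pairs = [(p, a) for p, a in zip(preds, actuals) if p is not None]
    coverage = len(pairs) / len(preds)
    mae = float(np.mean([abs(p - a) for p, a in pairs]))
    return mae, coverage

mae, cov = mae_and_coverage([4.0, None, 3.0, 5.0], [5, 2, 3, 4])
print(mae, cov)  # MAE of 2/3 over 3 predicted items, coverage 0.75
```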

  14. Experiment (1): Mean Absolute Error and Coverage
[Figure: per-user MAE; expert-CF is better at one end of the range, NN-CF at the other]
• Expert-CF is better for users with MAE > 1.0.
• NN-CF is better for users with small MAE (less than 0.5), a minority of around 10% of the population.
• Expert-CF and NN-CF are similar over the rest of the range.

  15. Experiment (2): Top-N Recommendation Precision
• Classify items as recommendable or not recommendable given a threshold σ.
• For a given user, compute all predictions and present those greater than or equal to σ.
• If there is no item in the test set worth recommending to a given target user, simply return an empty list.
• For all predicted items present in the user's test set, check whether each is a true positive (actual user rating ≥ σ) or a false positive (actual user rating < σ).
• Compute the precision of the classifications using the classical definition of this measure.
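The steps above amount to ordinary classification precision over the recommended items; a minimal sketch with made-up ratings, using σ = 4:

```python
def topn_precision(preds, actuals, sigma=4):
    """Precision of 'recommendable' classifications: among items predicted
    at or above sigma, the fraction the user actually rated >= sigma.
    Returns None when nothing is recommended (the empty-list case)."""
    recommended = [(p, a) for p, a in zip(preds, actuals) if p >= sigma]
    if not recommended:
        return None
    true_positives = sum(1 for _, a in recommended if a >= sigma)
    return true_positives / len(recommended)

print(topn_precision([4.5, 3.2, 4.1, 2.0], [5, 4, 3, 1], sigma=4))  # -> 0.5
```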

  16. Experiment (2): Top-N Recommendation Precision
[Figure: precision as a function of σ]
• For σ = 4, NN-CF clearly outperforms expert-CF.
• For σ = 3, the precision of both methods is similar.
• For users willing to accept recommendations for any above-average (≥ 3) item, the expert-based method behaves as well as a standard NN-CF.

  17. Experiment (3): User Study
• Asked 57 participants to rate 100 preselected movies, then compared four recommendation lists:
  1. Random list: a random sequence of movie titles.
  2. Critics' choice: the movies with the highest mean rating given by the experts.
  3. Neighbor-CF: based on Netflix users similar to each survey respondent.
  4. Expert-CF: like (3), but using the expert data set instead of the Netflix ratings.
• The recommendation lists were generated from limited user feedback: the average number of ratings was 14.5 per participant → a cold-start condition.

  18. Experiment (3): User Study
[Figure: overall quality of the recommendation lists, measured as the average response]
• The expert-CF approach is the only method that obtains an average rating higher than 3.
• The only approaches qualified as "very good" (critics' choice and expert-CF) are expert based.

  19. Experiment (3): User Study
• Ratings for the question "the list contains movies I think I would like or not like."
• An important aspect of evaluating a recommender system is how often the user is disappointed by the results: recommending wrong items undermines the user's assessment of the system and compromises its usability.
• The expert-CF approach generates the least negative response compared to the other methods.

  20. Experiment (3): User Study
• Analysis of variance (ANOVA): test whether the differences between the four recommendation lists are statistically significant.
• Null hypothesis: the average user evaluation is the same for the four lists.
• P-values smaller than 0.01 mean rejection of the null hypothesis:
  • The differences in user satisfaction among the three baseline methods are not statistically significant.
  • The differences in user satisfaction between expert-CF and the baselines are statistically significant.
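The test in question can be sketched from first principles: the one-way ANOVA F statistic is the ratio of between-group to within-group variance, and the p-value comes from comparing F against the F distribution. The score groups below are made up for illustration, not the study's data:

```python
import statistics

def anova_f(groups):
    """One-way ANOVA F statistic: between-group mean square divided by
    within-group mean square. Each group is a list of observations."""
    k = len(groups)                         # number of groups (lists)
    n = sum(len(g) for g in groups)         # total observations
    grand = sum(sum(g) for g in groups) / n
    ss_between = sum(len(g) * (statistics.mean(g) - grand) ** 2 for g in groups)
    ss_within = sum(sum((x - statistics.mean(g)) ** 2 for x in g) for g in groups)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Made-up evaluation scores for two lists with clearly different means;
# a large F (relative to the F distribution's critical value, which gives
# the p-value) would reject the null hypothesis of equal averages.
print(anova_f([[1, 2, 3], [4, 5, 6]]))  # -> 13.5
```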

  21. Discussion
• Using a limited set of external experts to generate the predictions addresses:
  • Data sparsity: domain experts are more likely to have rated a large percentage of the items.
  • Noise and malicious ratings: experts are expected to be more consistent and conscientious with their ratings, reducing noise.
  • Cold-start problem: motivated expert users typically rate a new item entering the collection as soon as they know of its existence, minimizing item cold-start.

  22. Discussion
• Scalability: computing the similarity matrix costs O(N²M) for N users in an M-item collection; the expert set is at a much smaller scale than the user set (169 experts vs. 500,000 potential neighbors).
• Privacy: prediction only needs the target user's profile and the current expert ratings.

  23. Conclusion
• A reduced set of expert ratings can predict the ratings of a large population.
• The method's performance is comparable to traditional CF algorithms, even when using an extremely small expert set.
• It addresses some of the shortcomings of traditional CF: data sparsity, scalability, noise in user feedback, privacy, and the cold-start problem.

  24. Q&A Thank you…
