
Rachid Guerraoui, EPFL




Presentation Transcript


  1. Rachid Guerraoui, EPFL

  2. Recommendation systems are good. What is a good recommendation system?

  3. A good recommendation system is one that provides good recommendations. What is a good recommendation?

  4. You know it when you see it • “I shall not today attempt further to define the kinds of material I understand to be embraced within that shorthand description ["hard-core pornography"]; and perhaps I could never succeed in intelligibly doing so. But I know it when I see it, and the motion picture involved in this case is not that.” • Justice Potter Stewart, US Supreme Court, 1964

  5. What is a good recommendation system? • Ideally: build and deploy your system • Pragmatically: transform past into future

  6. Example • Members of program committee (20) want to evaluate the submitted papers (200) • Nobody has enough time to read all papers • Each researcher is assigned a subset of papers • A recommendation system uses the scores to find the opinion of all members about all papers

  7. What is a good recommendation? • It depends on the correlation • Theory to the rescue

  8. General recommendation model • n users • k·n objects • For each user and object: a grade • The grades of a user form his preference vector • The vectors of users form the preference matrix • Grades may be binary, discrete, or continuous

  9. Input? Vectors of grades: v(p) (known partially to the players) • Output? Vectors of grades: w(p) (seeking to approximate v(p))

  10. Ideal output: w(p) = v(p) • Target output: minimize max |w(p) - v(p)| (Hamming distance)
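
The target on slide 10 can be sketched in a few lines. This is a minimal illustration assuming binary grades stored as Python lists; the function names `hamming` and `max_error` and the sample data are hypothetical and do not come from the talk.

```python
def hamming(u, v):
    """Number of positions where two equal-length grade vectors differ."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return sum(a != b for a, b in zip(u, v))


def max_error(true_prefs, outputs):
    """Worst-case error max over players p of |w(p) - v(p)|."""
    return max(hamming(true_prefs[p], outputs[p]) for p in true_prefs)


# Hypothetical data: two players, four objects, binary grades.
v = {"alice": [1, 0, 1, 1], "bob": [0, 0, 1, 0]}  # true preferences v(p)
w = {"alice": [1, 0, 0, 1], "bob": [0, 0, 1, 0]}  # algorithm output w(p)

print(max_error(v, w))  # 1
```

An output is ideal exactly when `max_error` returns 0, which matches w(p) = v(p) for every player.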

  11. How to account for the level of correlation? • Compare with a perfect on-line algorithm

  12. The perfect on-line algorithm (1) • All players know all partial vectors • Shared billboard

  13. The perfect on-line algorithm (2) • Chooses elements of the partial vectors to fill (budget B) • The algorithm assigns initial papers • The player is initially indulgent (learning phase)

  14. The perfect on-line algorithm (3) • Knows the level of correlation • Hamming diameter of a set P
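
The slide names the Hamming diameter of a set P without spelling it out. The sketch below assumes the usual meaning, the largest pairwise Hamming distance among the preference vectors in P; the helper names and the sample vectors are illustrative only.

```python
from itertools import combinations


def hamming(u, v):
    """Number of positions where two equal-length grade vectors differ."""
    return sum(a != b for a, b in zip(u, v))


def diameter(vectors):
    """Largest pairwise Hamming distance; 0 for fewer than two vectors."""
    return max((hamming(u, v) for u, v in combinations(vectors, 2)), default=0)


# Hypothetical set P of three players grading four objects.
P = [
    [1, 0, 1, 1],
    [1, 0, 0, 1],
    [0, 0, 0, 1],
]

print(diameter(P))  # 2
```

A diameter of 0 means all players in P have the same taste, the situation described on slide 15.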

  15. 20 PC members; 200 papers • Every member can read 10 papers • All have the same taste • Perfect solution possible?

  16. 20 PC members; 200 papers • Two clusters of 10 have the same taste • Perfect solution possible? • Every member needs to read 20

  17. Assume player p can probe B objects • How many other players does p need to collaborate with to fill its vector? • (n/B)·k - 1
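
The count on slide 17 can be checked against the program-committee numbers of slides 15 and 16. The sketch below assumes n users, k·n objects, and identical tastes among collaborators; rounding up with `math.ceil` when B does not divide k·n is an assumption of this sketch, and both function names are hypothetical.

```python
import math


def collaborators_needed(n, k, B):
    """Other players p must pool probes with to cover all k*n objects."""
    return math.ceil(k * n / B) - 1


def probes_per_member(num_objects, cluster_size):
    """Probes each member needs if a cluster splits the objects evenly."""
    return math.ceil(num_objects / cluster_size)


# Slide 15: n = 20 members, k*n = 200 papers (k = 10), budget B = 10.
print(collaborators_needed(n=20, k=10, B=10))  # 19

# Slide 16: two clusters of 10 members, each cluster covers 200 papers.
print(probes_per_member(num_objects=200, cluster_size=10))  # 20
```

With 19 like-minded collaborators available, the 20 members of slide 15 can cover every paper exactly once, whereas the clusters of slide 16 only work if each member reads 20 papers.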

  18. 20 PC members; 200 papers • 4 clusters of 5 with diameter 8 • Every member reads 20 • What is the minimal error rate?

  19. Ideal algorithm (k=1) • A player p has to use the ideas of (n/B) - 1 other players to estimate her/his preferences • The error rate for p depends on the Hamming distance between p and the other (n/B) - 1 players • This is within a constant factor of the diameter of these n/B players • In the worst case, p cannot do better

  20. Claim For every B-algorithm, there is some distribution of preferences such that (with constant probability)

  21. Proof (sketch) • Consider a constant D > 2B • Define the preference vectors as follows: let P be a set of n/B players • Let p in P have a random preference vector • Assign random preference vectors to the players outside P • Choose a set S of D objects • For every player q in P, v(q) = v(p) except on S, where v(q) is random

  22. Proof (sketch) • Probes outside P provide no information to p • Probes inside P provide no information to p w.r.t. S • Since p probes at most B objects and S contains D > 2B objects, there are at least D/2 objects for which p has no information • No algorithm can do better than guess preferences in S • The rate of error is at least D/4 and the diameter of P is less than D
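
The D/4 figure in the sketch can be spelled out as a short calculation. This assumes binary grades that are independent and uniformly random on S, and it bounds the error in expectation only; the symbol U (the objects of S that p did not probe) is introduced here for illustration.

```latex
% U = objects of S that p has not probed (notation assumed for this sketch).
\begin{align*}
  |U| &\ge |S| - B = D - B > D - \tfrac{D}{2} = \tfrac{D}{2}
      && \text{since } D > 2B, \\
  \mathbb{E}\bigl[\text{errors on } U\bigr]
      &= \sum_{j \in U} \Pr\bigl[w(p)_j \ne v(p)_j\bigr]
       = \frac{|U|}{2} > \frac{D}{4}
      && \text{each guess is wrong with probability } \tfrac12 .
\end{align*}
```

Players in P differ only on S, so the distance between any two of them is bounded by |S| = D, which is why the error is comparable to the diameter of P.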

  23. Optimality An algorithm is (B,c)-optimal if for every input set of preferences

  24. So what? • The best we can do is find clusters of players that are small enough (small diameter) to provide “accurate” preferences and big enough to cover all objects • Practically speaking? Try different sizes of clusters

  25. Optimality • Assume each player can evaluate B objects. • Given B and the level of correlation among players, there is a minimum error rate that can be achieved. • There is an algorithm that obtains a constant approximation of this error rate, with each player evaluating O(B · polylog(n)) objects.

  26. Definition of Optimality • An algorithm is asymptotically optimal in terms of error rate if, for every player p, we have: $|w(p) - v(p)| < \min_{|P| > n/B - 1} c \cdot D(P)$ • where c is a constant and D(P) is the diameter of the set P; P can be any set of players of size at least n/B.
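
For very small instances, the benchmark in this definition can be computed by brute force. The sketch below is an illustration, not the algorithm from the talk: it assumes k = 1, full knowledge of the preference matrix, and that P must contain the player p (the slide does not say so). Adding players to a set never lowers its diameter, so it is enough to try sets of the smallest allowed size, ceil(n/B). All names and data are hypothetical.

```python
import math
from itertools import combinations


def hamming(u, v):
    """Number of positions where two equal-length grade vectors differ."""
    return sum(a != b for a, b in zip(u, v))


def diameter(vectors):
    """Largest pairwise Hamming distance; 0 for fewer than two vectors."""
    return max((hamming(u, v) for u, v in combinations(vectors, 2)), default=0)


def best_diameter(prefs, p, B):
    """Smallest D(P) over sets P that contain p and have ceil(n/B) players."""
    size = math.ceil(len(prefs) / B)
    others = [q for q in prefs if q != p]
    return min(
        diameter([prefs[p]] + [prefs[q] for q in group])
        for group in combinations(others, size - 1)
    )


# Hypothetical preference matrix: 4 players, 4 objects, binary grades.
prefs = {
    "a": [1, 1, 0, 0],
    "b": [1, 1, 0, 1],
    "c": [0, 0, 1, 1],
    "d": [0, 0, 1, 0],
}

print(best_diameter(prefs, "a", B=2))  # 1: pairing "a" with "b"
print(best_diameter(prefs, "a", B=1))  # 4: all four players are needed
```

The search is exponential in the number of players, so it only serves to check the definition on toy inputs; a larger budget B permits smaller sets and therefore a smaller benchmark error.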
