
Matchin: Eliciting User Preferences with an Online Game

Severin Hacker and Luis von Ahn, Carnegie Mellon University. SIGCHI 2009. Matchin is a game that asks two randomly chosen partners "which of these two images do you think your partner prefers?"


Presentation Transcript


  1. Matchin: Eliciting User Preferences with an Online Game • Severin Hacker and Luis von Ahn • Carnegie Mellon University • SIGCHI 2009

  2. Matchin • A game that asks two randomly chosen partners "which of these two images do you think your partner prefers?"

  3. Some Findings • It is possible to extract a global "beauty" ranking within a large collection of images. • It is possible to extract a person's general image preferences. • The model can determine a player's gender with high probability.

  4. A Taxonomy of Methods • Absolute Versus Relative Judgments • Total Versus Partial Judgments • Random Access Versus Predefined Access • "I Like" Versus "Others Like" • Direct Versus Indirect

  5. Existing Methods • Flickr Interestingness • Voting • Hot or Not

  6. The Mechanism • Matchin is a two-player game that is played over the Internet. • Every game takes two minutes. • One pair of images usually takes between two and five seconds. • Matchin uses a collection of 80,000 images from Flickr that were gathered in October 2007.

  7. The Scoring Function • Matchin uses a sigmoid function for scoring games. • Constant scoring function: players could get many points by quickly picking images at random. • Exponential scoring function: the rewards sometimes became too high.
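
The trade-off between the three scoring shapes can be sketched as follows. This is only an illustration: the slide does not say what the scoring function takes as input, so the sketch assumes it is the number of consecutive agreements (`streak`), and every constant (`points`, `base`, `unit`, `max_points`, `midpoint`, `steepness`) is hypothetical.

```python
import math

# Assumption: the score for a round depends on `streak`, the number of
# consecutive agreements so far. All constants below are hypothetical.

def constant_score(streak: int, points: float = 100.0) -> float:
    """Same reward every round, so fast random clicking still pays off."""
    return points

def exponential_score(streak: int, base: float = 2.0, unit: float = 10.0) -> float:
    """Reward grows without bound as the streak gets longer."""
    return unit * base ** streak

def sigmoid_score(streak: int, max_points: float = 1000.0,
                  midpoint: float = 5.0, steepness: float = 1.0) -> float:
    """Reward rises with the streak but saturates below max_points."""
    return max_points / (1.0 + math.exp(-steepness * (streak - midpoint)))
```

Under these assumptions the sigmoid rewards sustained agreement like the exponential does, while capping the payout the way the constant function does.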

  8. The Data • The game was launched on May 15, 2008. • Within only four months, 86,686 games had been played by 14,993 players. • There have been 3,562,856 individual decisions (clicks) on images. • An individual decision/record is stored in the form: • <id, game_id, player, better, worse, time, waiting_time>

  9. Ranking Functions • Empirical Winning Rate (EWR) • ELO Rating • TrueSkill Rating

  10. Empirical Winning Rate (EWR) • Function: the number of comparisons an image won divided by the number of comparisons it appeared in. • Two problems: • For images that have a low degree (few comparisons), the empirical winning rate might be artificially high or low. • It does not take the quality of the competing image into account.
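
A minimal sketch of the empirical winning rate, assuming each decision is reduced to a `(better, worse)` pair taken from the record format on slide 8. The function name and the toy data are illustrative, not from the paper.

```python
from collections import Counter

def empirical_winning_rate(decisions):
    """Return {image: wins / comparisons} for a list of (better, worse) pairs."""
    wins, games = Counter(), Counter()
    for better, worse in decisions:
        wins[better] += 1
        games[better] += 1
        games[worse] += 1
    return {image: wins[image] / games[image] for image in games}

# Toy data: image "d" appears once and wins, so its rate is a perfect 1.0.
rates = empirical_winning_rate([("a", "b"), ("a", "c"), ("b", "c"), ("d", "a")])
```

In the toy data, image "d" outranks image "a" after a single comparison, which illustrates the low-degree problem named on the slide.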

  11. ELO Rating (1/2) • The ELO rating system was introduced for rating chess players. • Each chess player’s performance in a game is modeled as a normally distributed random variable. • The mean of that random variable should reflect the player’s true skill and is called the player’s ELO rating.

  12. ELO Rating (2/2) • Expected score: • ELO rating:
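
For orientation, the widely used chess form of the two quantities named on this slide is sketched below. The slide's own formulas are not reproduced here, so the logistic curve with a scale of 400 and the step size `k=32` are assumptions borrowed from common chess practice, not values confirmed by the paper.

```python
def expected_score(r_a: float, r_b: float) -> float:
    """Expected score of A against B (common logistic form, scale 400)."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))

def elo_update(r_winner: float, r_loser: float, k: float = 32.0):
    """Move both ratings by k times the gap between actual and expected score."""
    delta = k * (1.0 - expected_score(r_winner, r_loser))
    return r_winner + delta, r_loser - delta

# Two images with equal ratings: the preferred one gains what the other loses.
new_winner, new_loser = elo_update(1500.0, 1500.0)
```

Unlike the empirical winning rate, the size of the update depends on the opponent: beating a much higher-rated image moves the ratings more than beating a lower-rated one.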

  13. TrueSkill Rating (1/2) • Every player’s skill s is modeled as a normally distributed random variable centered around a mean μ with per-player variance σ². • A player’s particular performance in a game is then drawn from a normal distribution with mean s and per-game variance β².

  14. TrueSkill Rating (2/2) • Update: • Conservative skill estimate:
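
A sketch of the two-player TrueSkill update in its simplest published form (one winner, one loser, no draws, no skill drift between games), followed by the conservative estimate μ − 3σ. The defaults μ = 25, σ = 25/3 and β = 25/6 are TrueSkill's conventional starting values and are assumed here; the slides do not state which parameters Matchin used.

```python
import math

def _pdf(x: float) -> float:
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def _cdf(x: float) -> float:
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def trueskill_update(winner, loser, beta: float = 25.0 / 6.0):
    """winner, loser are (mu, sigma) pairs; returns the two updated pairs."""
    mu_w, sigma_w = winner
    mu_l, sigma_l = loser
    c = math.sqrt(2.0 * beta ** 2 + sigma_w ** 2 + sigma_l ** 2)
    t = (mu_w - mu_l) / c
    v = _pdf(t) / _cdf(t)      # mean correction; large when the win was a surprise
    w = v * (v + t)            # variance correction, between 0 and 1
    new_winner = (mu_w + sigma_w ** 2 / c * v,
                  sigma_w * math.sqrt(1.0 - sigma_w ** 2 / c ** 2 * w))
    new_loser = (mu_l - sigma_l ** 2 / c * v,
                 sigma_l * math.sqrt(1.0 - sigma_l ** 2 / c ** 2 * w))
    return new_winner, new_loser

def conservative_estimate(mu: float, sigma: float) -> float:
    """Rank by a value the true skill is very likely to exceed."""
    return mu - 3.0 * sigma

# Two unrated images meet once; the preferred one moves up, the other down.
start = (25.0, 25.0 / 3.0)
(win_mu, win_sigma), (lose_mu, lose_sigma) = trueskill_update(start, start)
```

Ranking by the conservative estimate means an image with few comparisons (large σ) is not placed near the top on the strength of a few lucky wins.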

  15. Collaborative Filtering (1/2) • In the collaborative filtering setting, they want to learn each individual's preferences in order to: • recommend images to each user based on his/her preferences • compare users and images with each other • They have developed a new collaborative filtering algorithm they call “Relative SVD”

  16. Collaborative Filtering (2/2) • The user feature vectors: • The image feature vectors: • The amount by which user i likes image j • Data: a set D of triplets (i,j,k) • The error for a particular decision: • The total sum of squared errors (SSE):
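
The overall shape of such a model can be sketched as below. This is not the paper's exact formulation: the sketch assumes that the amount by which user i likes image j is the dot product of their feature vectors, that a triplet (i, j, k) means user i preferred image j over image k, and that the per-decision error is the gap between the predicted preference difference and a target margin of 1. The learning rate, regularization, dimension, and epoch count are hypothetical.

```python
import random

def train_relative_svd(triplets, n_users, n_images, dim=2,
                       lr=0.05, reg=0.01, epochs=500, seed=0):
    """Fit user and image vectors by stochastic gradient descent on the SSE."""
    rng = random.Random(seed)
    U = [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(n_users)]
    V = [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(n_images)]
    for _ in range(epochs):
        for i, j, k in triplets:
            u, vj, vk = U[i], V[j], V[k]
            # Predicted amount by which user i prefers image j over image k.
            diff = sum(u[f] * (vj[f] - vk[f]) for f in range(dim))
            err = 1.0 - diff  # assumed target margin of 1
            for f in range(dim):
                uf, vjf, vkf = u[f], vj[f], vk[f]
                u[f] += lr * (err * (vjf - vkf) - reg * uf)
                vj[f] += lr * (err * uf - reg * vjf)
                vk[f] += lr * (-err * uf - reg * vkf)
    return U, V

def preference(U, V, i, j):
    """Amount by which user i likes image j under the fitted model."""
    return sum(a * b for a, b in zip(U[i], V[j]))

# Two users with opposite tastes over the same pair of images.
U, V = train_relative_svd([(0, 0, 1), (1, 1, 0)], n_users=2, n_images=2)
```

Because each user has their own vector, the two users in the toy data can disagree about the same pair of images, which a single global ranking cannot represent.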

  17. Comparison of the Models

  18. Local Minimum • Do humans learn while playing the game? • They compared the agreement rate of first-time players with that of other players. • First-time players: 69.0% • More experienced players: 71.8% • They also measured whether people learn within a game. • First half of the game: 67% • Second half of the game: 64%

  19. Gender Prediction • The conditional entropy: • The necessary conditional probabilities Pr(G=g|X=x) can be computed with Bayes' rule given the class conditionals Pr(X=x|G=g). • The naïve Bayes classifier will maximize the likelihood of the data: • The total accuracy is 78.3%
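
The Bayes-rule step can be sketched as a small naive Bayes classifier. The data layout is an assumption made for illustration: each player is a `(gender, answers)` pair, where `answers` maps a hypothetical question id (one image pair) to 0 or 1 depending on which image was picked. The Laplace smoothing constant `alpha` is also an assumption, not something the slides specify.

```python
import math
from collections import defaultdict

def train_naive_bayes(samples, alpha=1.0):
    """Count class frequencies and per-question answer frequencies."""
    class_counts = defaultdict(int)
    answer_counts = defaultdict(int)  # keyed by (gender, question, answer)
    for gender, answers in samples:
        class_counts[gender] += 1
        for question, answer in answers.items():
            answer_counts[(gender, question, answer)] += 1
    return class_counts, answer_counts, alpha

def posterior(model, answers):
    """Pr(G=g | X=x) from the class conditionals Pr(X=x | G=g) via Bayes' rule."""
    class_counts, answer_counts, alpha = model
    total = sum(class_counts.values())
    log_joint = {}
    for gender, n_g in class_counts.items():
        log_p = math.log(n_g / total)
        for question, answer in answers.items():
            seen = (answer_counts[(gender, question, 0)]
                    + answer_counts[(gender, question, 1)])
            # Smoothed estimate of Pr(X_question = answer | G = gender).
            log_p += math.log((answer_counts[(gender, question, answer)] + alpha)
                              / (seen + 2.0 * alpha))
        log_joint[gender] = log_p
    top = max(log_joint.values())
    norm = sum(math.exp(v - top) for v in log_joint.values())
    return {g: math.exp(v - top) / norm for g, v in log_joint.items()}

model = train_naive_bayes([("f", {"q1": 1}), ("f", {"q1": 1}),
                           ("m", {"q1": 0}), ("m", {"q1": 0})])
post = posterior(model, {"q1": 1})
```

The "naive" part is the independence assumption: each answer contributes its own factor to the product, regardless of the player's other answers.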

  20. The Top Ranked Images

  21. Discussion (1/2) • The highest ranked pictures: • sunsets, animals, flowers, churches, bridges, and famous tourist attractions • neither provocative nor offensive • The worst pictures: • taken indoors and include a person • blurry or too dark • screenshots or pictures of documents or text

  22. Discussion (2/2) • There are substantial differences among players in judging images, and taking those differences into account can greatly help in predicting the users’ behavior on new images. • More experienced players had about the same error rate as new players.

  23. Conclusion • The main contribution of this paper is a new method to elicit user preferences. • They compared several algorithms for combining these relative judgments into a total ordering and found that they can correctly predict a user’s behavior in 70% of the cases. • They describe a new algorithm called Relative SVD to perform collaborative filtering on pair-wise relative judgments. • They present a gender test that asks users to make some relative judgments and can predict a random user’s gender in roughly 4 out of 5 cases.
