
Rethinking the ESP Game





  1. Rethinking the ESP Game Stephen Robertson, Milan Vojnovic, Ingmar Weber* Microsoft Research & Yahoo! Research *This work was done while I was a visiting researcher at MSRC.

  2. The ESP Game – Live Demo Show it live. (2min) Alternative version.

  3. The ESP Game - Summary • Two players try to agree on a label to be added to an image • No way to communicate • Entered labels only revealed at end • Known labels are “off-limits” • ESP refers to “Extrasensory perception” • Read the other person’s mind

  4. The ESP Game - History • Developed by Luis von Ahn and Laura Dabbish at CMU in 2004 • Goal: Improve image search • Licensed by Google in 2006 • A prime example of harvesting human intelligence for difficult tasks • Many variants (music, shapes, …)

  5. The ESP Game – Strengths and Weaknesses • Strengths • Creative approach to a hard problem • Fun to play • Vast majority of labels are appropriate • Difficult to spam • Powerful idea: Reaching consensus with little or no communication

  6. The ESP Game – Strengths and Weaknesses • Weaknesses • The ultimate objective is ill-defined • Finds mostly general labels • Already millions of images for these • “Lowest common denominator” problem • Human time is used sub-optimally

  7. A “Robot” Playing the ESP Game Video of recorded play.

  8. The ESP Game – Labels are Predictable • Synonyms are redundant • “guy” => “man” for 81% of images • Co-occurrence reduces “new” information • “clouds” => “sky” for 68% of images • Colors are easy to agree on • “black” is 3.3% of all occurrences

  9. How to Predict the Next Label T = {“beach”, “water”}, next label t = ??

  10. How to Predict the Next Label Want to know: P(“blue” next label | {“beach”, “water”}) P(“car” next label | {“beach”, “water”}) P(“sky” next label | {“beach”, “water”}) P(“bcn” next label | {“beach”, “water”}) Problem of data sparsity!

  11. How to Predict the Next Label Want to know: P(“t” next label | T) = P(T | “t” next label) · P(“t”) / P(T) Use conditional independence … Give a random topic to two people. Ask them to each think of 3 related terms. Bayes’ Theorem: P(A,B) = P(A|B) · P(B) = P(B|A) · P(A)
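As a toy numeric check of this Bayes inversion (all probabilities below are made up for illustration and do not come from the paper):

```python
# A toy numeric check of the Bayes inversion above; every probability here
# is hypothetical, chosen only to make the arithmetic easy to follow.
p_T_given_t = 0.02  # P({"beach", "water"} | next label "sky"), hypothetical
p_t = 0.05          # prior P("sky"), hypothetical
p_T = 0.004         # P({"beach", "water"}), hypothetical

# P(t next | T) = P(T | t next) * P(t) / P(T)
p_t_given_T = p_T_given_t * p_t / p_T  # = 0.25
```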

  12. Conditional Independence Topic “Spain” → p1: Madrid, sun, paella; p2: beach, soccer, flamenco. Topic “blue” → p1: sky, water, eyes; p2: azul, blau, bleu. P(“p1: sky”, “p2: azul” | “blue”) = P(“p1: sky” | “blue”) · P(“p2: azul” | “blue”) P(A,B|C) = P(A|C) · P(B|C)

  13. How to Predict the Next Label C.I. assumption violated in practice, but “close enough”. P({s1, s2} | “t”) · P(“t”) / P(T) = P(s1 | “t”) · P(s2 | “t”) · P(“t”) / P(T) P(s | “t”) will still be zero very often → smoothing: P(s | “t”) = (1 − λ) · P(s | “t”) + λ · P(s) Non-zero background probability
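The smoothing step can be sketched as follows. The tag sets are invented toy data; only λ = 0.85 comes from the slides (chosen on a validation set):

```python
# A toy sketch of the smoothing formula on the slide. Tag sets are invented;
# only LAMBDA = 0.85 is taken from the slides.
from collections import Counter

tag_sets = [
    {"beach", "water", "sky", "blue"},
    {"beach", "sand", "sun"},
    {"water", "blue", "boat"},
]
LAMBDA = 0.85

counts = Counter(t for ts in tag_sets for t in ts)
total = sum(counts.values())

def p_background(s):
    """Unconditional label frequency P(s)."""
    return counts[s] / total

def p_cond_raw(s, t):
    """Unsmoothed P(s | t): fraction of tag sets containing t that also contain s."""
    with_t = [ts for ts in tag_sets if t in ts]
    return sum(s in ts for ts in with_t) / len(with_t) if with_t else 0.0

def p_smoothed(s, t):
    """P(s | t) = (1 - lambda) * P(s | t) + lambda * P(s)."""
    return (1 - LAMBDA) * p_cond_raw(s, t) + LAMBDA * p_background(s)
```

With these toy counts, "boat" never co-occurs with "beach", so the raw conditional is zero, yet the smoothed estimate stays positive thanks to the background term.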

  14. How to Predict the Next Label P(“t” next label | T already present) = ∏s∈T P(s | “t”) · P(“t”) / C, where C is a normalizing constant. λ chosen using a “validation set”; λ = 0.85 in the experiments. Model trained on ~13,000 tag sets. Also see: Naïve Bayes classifier (conditional independence assumption + Bayes’ Theorem)
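A minimal sketch of the full predictor, combining Bayes' Theorem, the conditional-independence assumption, and the smoothed conditionals. The tag sets are invented; only λ = 0.85 comes from the slides:

```python
# A minimal sketch of the next-label predictor described on the slide:
# P(t next | T) ∝ P(t) * prod_{s in T} P(s | t), with smoothed conditionals.
# The tag sets are hypothetical toy data, not the authors' training set.
from collections import Counter

tag_sets = [
    {"beach", "water", "sky"},
    {"beach", "water", "sky", "sun"},
    {"beach", "sand", "sun"},
    {"water", "sky", "blue"},
]
LAMBDA = 0.85

counts = Counter(t for ts in tag_sets for t in ts)
total = sum(counts.values())
vocab = set(counts)

def p_bg(s):
    """Background (unconditional) label probability P(s)."""
    return counts[s] / total

def p_cond(s, t):
    """Smoothed P(s | t) = (1 - lambda) * raw P(s | t) + lambda * P(s)."""
    with_t = [ts for ts in tag_sets if t in ts]
    raw = sum(s in ts for ts in with_t) / len(with_t) if with_t else 0.0
    return (1 - LAMBDA) * raw + LAMBDA * p_bg(s)

def predict_next(T):
    """Rank labels t not in T by P(t) * prod_{s in T} P(s | t), normalized by C."""
    scores = {}
    for t in vocab - set(T):
        score = p_bg(t)
        for s in T:
            score *= p_cond(s, t)
        scores[t] = score
    c = sum(scores.values())  # the normalizing constant C from the slide
    return sorted(((t, v / c) for t, v in scores.items()), key=lambda kv: -kv[1])

ranking = predict_next({"beach", "water"})  # "sky" comes out on top
```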

  15. Experimental Results: Part 1 • Number of games played: 205; images encountered: 1,335; images with an off-limits term (OLT): 1,105 • Percentage with a match: all images: 69%; only images with OLTs: 81%; all entered tags: 17% • Average number of labels entered: per image: 4.1; per game: 26.7 • Agreement index: mean: 2.6; median: 2.0 • The “robot” plays reasonably well, and it plays in a human-like way.

  16. Quantifying “Predictability” and “Information” So, labels are fairly predictable. But how can we quantify “predictability”?

  17. Quantifying “Predictability” and “Information” • “sunny” vs. “cloudy” tomorrow in BCN • The roll of a six-sided die • The next single letter in “barcelo*” • The next single letter in “re*” • Clicked search result for “yahoo research”

  18. Entropy and Information • An event occurring with probability p corresponds to an information of −log2(p) bits … the number of bits required to encode it in an optimally compressed encoding • Example: compressed weather forecast: P(“sunny”) = 0.5 → 0 (1 bit) P(“cloudy”) = 0.25 → 10 (2 bits) P(“rain”) = 0.125 → 110 (3 bits) P(“thunderstorm”) = 0.125 → 111 (3 bits)
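The forecast example can be checked directly: each outcome's code length equals −log2(p), and the expected message length equals the Shannon entropy of the distribution. The numbers below are exactly those on the slide:

```python
# Check of the slide's compressed-forecast example: each outcome's ideal
# code length is -log2(p) bits, and the prefix code shown attains it.
import math

forecast = {"sunny": 0.5, "cloudy": 0.25, "rain": 0.125, "thunderstorm": 0.125}
code = {"sunny": "0", "cloudy": "10", "rain": "110", "thunderstorm": "111"}

# Ideal code length per outcome, in bits.
ideal_bits = {w: -math.log2(p) for w, p in forecast.items()}

# Expected message length equals the Shannon entropy of the distribution.
entropy = -sum(p * math.log2(p) for p in forecast.values())
avg_len = sum(p * len(code[w]) for w, p in forecast.items())  # 1.75 bits
```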

  19. Entropy and Information • p = 1 → 0 bits of information • e.g. a six-sided die showed a number in [1,6] • p ≈ 0 → many, many bits of information • e.g. the numbers drawn in the lottery • “information” = “amount of surprise”

  20. Entropy and Information • Expected information for p1, p2, …, pn: Σi −pi · log(pi) = (Shannon) entropy • Might not know the true p1, p2, …, pn, but think they are q1, q2, …, qn. Then, w.r.t. p you observe Σi −pi · log(qi), which is minimized for q = p. • q is given by the earlier model; p is then observed.
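A small sketch of this claim: coding with a model distribution q while events actually follow p costs Σᵢ −pᵢ · log2(qᵢ) bits, which is minimized, and equals the entropy of p, exactly when q = p. Both distributions below are made up:

```python
# Sketch of the cross-entropy claim on the slide. The distributions p and q
# are hypothetical; the point is only that cross_entropy(p, q) >= cross_entropy(p, p).
import math

p = [0.5, 0.25, 0.25]  # "true" distribution, hypothetical
q = [0.4, 0.4, 0.2]    # model distribution, hypothetical

def cross_entropy(p, q):
    """Expected bits when events follow p but are coded according to q."""
    return -sum(pi * math.log2(qi) for pi, qi in zip(p, q))

h_p = cross_entropy(p, p)   # Shannon entropy of p = 1.5 bits
h_pq = cross_entropy(p, q)  # strictly larger, since q != p
```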

  21. Experimental Results: Part 2 Later labels are more predictable. Equidistribution (uniform) = 12.3 bits; “static” distribution = 9.3 bits. Humans think harder and harder as the game progresses.

  22. Improving the ESP Game • Could score points according to −log2(p): the number of bits of information added to the system • Have an activation time limit for “obvious” labels: removes the immediate satisfaction of simple matches • Hide off-limits terms: players have to be more careful to avoid “obvious” labels • Try to match “experts”: use previous tags or meta information • Educate players: use previously labeled images to unlearn behavior • Automatically expand the off-limits list: easy, but 10+ terms is not practical
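The first suggestion can be sketched in a few lines: award −log2(p) points for a matched label, where p is the model's predicted probability of that label. The probabilities below are hypothetical:

```python
# Sketch of the proposed scoring rule: points for a matched label equal the
# bits of information it adds, -log2(p). Probabilities are hypothetical.
import math

def score(p_label):
    """Points for a match = bits of information the label adds."""
    return -math.log2(p_label)

common_points = score(0.25)    # a predictable label like "sky": 2.0 points
rare_points = score(1 / 1024)  # a specific, surprising label: 10.0 points
```

Rare, specific labels earn far more than predictable ones, which directly counters the "lowest common denominator" problem noted earlier.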

  23. Questions Thank you! ingmar@yahoo-inc.com
