Peekaboom: A Game for Locating Objects in Images • Roy Liu, Carnegie Mellon University • Lamps of Aladdin Presentation • Joint work with Luis von Ahn and Manuel Blum
Object Location in Images • Given an image, determine what objects there are and locate them. • Example labels: Woman, Man, Umbrella, Tree, Sailboat, Dog
As Things Stand Now • No algorithm is known for taking an image and determining what objects are in it, let alone telling you where they are. • Fortunately, this talk isn’t about developing such an algorithm. Let’s try a different approach.
Harnessing the Power of Human Cycles • “Math is hard. Let’s go shopping!” –Barbie • Along a similar line of thinking: • Programming computers to locate objects in images is hard, so… • Let’s not think about it. • Instead, let’s get humans to do the work for us.
Problems • Wait! Your average human probably wants: • Enjoyment – they want to have a good time • Incentives – they want something in return • How do we address both?
A Game • Have people do the work for us by playing a game. • Many design issues arise: • What will be the core idea of the game? • How do we collect data? • How do we ensure the quality of the data?
An Earlier Idea: Luis von Ahn’s ESP Game • Two players, with no communication, each try to guess what the other is thinking about a particular image they both see. • If they agree on a word, the game moves on and increases both players’ scores.
A Sample Run • Player 1 guesses: Pants, Model, Lady • Player 2 guesses: Woman, Shirt, Denim, Girl, Model • Server: Agreed, “Model”
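The agreement check in the sample run can be sketched in a few lines of Python. This is only an illustration, not the ESP Game's actual server code: the function name `first_agreement`, the event format, the case-insensitive matching, and the interleaving of the two players' guesses are all assumptions made for the example.

```python
from typing import Iterable, Optional, Tuple


def first_agreement(events: Iterable[Tuple[int, str]]) -> Optional[str]:
    """Return the first word that both players have guessed, or None.

    `events` is a time-ordered stream of (player, word) pairs, where
    player is 1 or 2. Normalizing case and whitespace is an assumption.
    """
    seen = {1: set(), 2: set()}
    for player, word in events:
        word = word.strip().lower()
        other = 2 if player == 1 else 1
        if word in seen[other]:
            return word  # the partner already typed this word
        seen[player].add(word)
    return None


# The guesses from the sample run, in one hypothetical arrival order.
sample_run = [
    (1, "Pants"), (2, "Woman"), (1, "Model"), (2, "Shirt"),
    (2, "Denim"), (1, "Lady"), (2, "Girl"), (2, "Model"),
]
agreed = first_agreement(sample_run)
```

Neither player sees the other's guesses; only the server holds both sets, which is why the match can serve as a check on the label.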
Why ESP Works • By agreeing on a word, the players: • Say what it is – we call this assigning a “label” to the image. • Check their own work – the fact that two strangers agree on a label is a witness of the label’s quality.
The Limitations of ESP • The ESP Game can label images (and consequently tell you what’s in them), but it cannot: • Find the objects being labeled. • Determine the way in which the object appears – does the label “car” refer to the text “car” or an actual car in the image?
Completing the Image Cycle • [Diagram: unlabeled images → ESP game server → labeled images → Peekaboom game server → located images]
A New Idea: Peekaboom • Two players are assigned the roles of “revealer” and “guesser”. • The revealer sees an image with a label. The guesser sees nothing. • The revealer shows the guesser parts of the image. If the guesser guesses correctly, the game moves on.
Statement of Purpose • We would like to collect data about images systematically and en masse. • We hope our collection will form the basis for data sets that can be used to train computer vision algorithms.
The Revealer clicks on parts of the image and shows them to the Guesser. • The Guesser guesses: Flower, Petal, Butterfly • Server: Correct, “Butterfly”
Why Peekaboom Works • By getting the guesser to guess correctly, the revealer locates objects by clicking on the relevant parts of the image.
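One way to turn a revealer's clicks into a location is to take the bounding box of everything that was revealed. The sketch below is an assumption, not the method used by Peekaboom: the talk does not say how much of the image one click reveals, so the fixed circular `radius`, the function name `bounding_box`, and the pixel coordinate convention are all hypothetical.

```python
from typing import List, Tuple

Click = Tuple[int, int]          # (x, y) pixel position of one click
Box = Tuple[int, int, int, int]  # (left, top, right, bottom), inclusive


def bounding_box(clicks: List[Click], radius: int,
                 width: int, height: int) -> Box:
    """Smallest axis-aligned box covering every revealed circle.

    Assumes each click reveals a circle of `radius` pixels and clips
    the result to an image of size width x height.
    """
    if not clicks:
        raise ValueError("at least one click is required")
    xs = [x for x, _ in clicks]
    ys = [y for _, y in clicks]
    left = max(0, min(xs) - radius)
    top = max(0, min(ys) - radius)
    right = min(width - 1, max(xs) + radius)
    bottom = min(height - 1, max(ys) + radius)
    return (left, top, right, bottom)


# Three hypothetical clicks on a 100 x 100 image.
box = bounding_box([(50, 60), (70, 65), (55, 90)], radius=20,
                   width=100, height=100)
```

A bounding box is the crudest possible summary; keeping the raw click positions would preserve more detail for later use.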
But Wait, There’s More • Peekaboom not only locates objects, but also: • Gives the context necessary to identify them. • Classifies the image as “Text”, “Noun”, or “Verb” by way of hints. • Let’s examine how Peekaboom does both.
Object Context • The label: “trunk” • Pings help separate the object itself from its surrounding context. • They help the guesser distinguish “trunk” from other possibly correct labels like “elephant”, “tusk”, and “ear”.
Hints • The label “car” is ambiguous: one image shows a car, and another shows the word “car”. • The hints help distinguish the manner in which the label “car” appears: this is the object “car” versus this is the text “car”.
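To see how a hint disambiguates a label, it can help to picture each finished round as a small record. The sketch below is hypothetical: the three hint values come from the talk (“Text”, “Noun”, “Verb”), but the `Round` record, its field names, the image identifiers, and the `group_by_sense` helper are assumptions made for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List, Tuple


class Hint(Enum):
    TEXT = "Text"
    NOUN = "Noun"
    VERB = "Verb"


@dataclass(frozen=True)
class Round:
    image_id: str  # hypothetical identifier for the image shown
    label: str     # the word the guesser had to find
    hint: Hint     # how the label appears in the image


def group_by_sense(rounds: List[Round]) -> Dict[Tuple[str, Hint], List[str]]:
    """Group image ids by (label, hint) so each sense of a label is separate."""
    groups: Dict[Tuple[str, Hint], List[str]] = defaultdict(list)
    for r in rounds:
        groups[(r.label, r.hint)].append(r.image_id)
    return dict(groups)


rounds = [
    Round("img1", "car", Hint.NOUN),  # a photo of a car
    Round("img2", "car", Hint.TEXT),  # the word "car" on a sign
    Round("img3", "car", Hint.NOUN),
]
senses = group_by_sense(rounds)
```

With the hint stored next to the label, images of an actual car and images of the written word end up in different groups.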
The Architecture of Peekaboom • [Diagram components: players, game server, raw data, labeled images, located images, compiler, researchers]
Peekaboom is… • fun • novel • aesthetically appealing • networked • scalable • widely deployable
Conclusions • Peekaboom will be released to a general audience within two months. • We hope that it will solve difficult AI tasks while achieving: • Low costs – One game server. • Quality – Accurately locate objects in images. • Quantity – Locate objects in millions of images.
Advertisement • The best way to understand this talk is to try the game out for yourself: www.peekaboom.org • We look forward to collecting your cycles!