
How Crowdsourcable is Your Task?




Presentation Transcript


  1. How Crowdsourcable is Your Task? Carsten Eickhoff, Arjen P. de Vries. WSDM 2011 Workshop on Crowdsourcing for Search and Data Mining (CSDM 2011), Hong Kong, China, February 9–12, 2011.

  2. Outline • The Crowdsourcing Boom • Crowdsourcing, a Tale of Great Romance • A Journey to the Dark Side of Crowdsourcing • Is all Lost? • Conclusions

  3. The Crowdsourcing Boom • Billions of judgements are being crowdsourced each year • CrowdFlower – Judgement volume doubled (2009-2010) • Significant numbers of research publications rely on crowdsourcing to create scientific resources • ...but is it actually reliable?

  4. Outline • The Crowdsourcing Boom • Crowdsourcing, a Tale of Great Romance • A Journey to the Dark Side of Crowdsourcing • Is all Lost? • Conclusions

  5. Crowdsourcing – A Tale of Great Romance • Summer 2008 • How do I quickly get a large number of judgements? • Task: Message grouping for discourse understanding • Crowdsourcing produced very reliable results

  7. Crowdsourcing – A Tale of Great Romance • Fall 2008 • Crowdsourcing has become a standard data source • The excitement wears off

  8. Crowdsourcing – A Tale of Great Romance • A dark and cold day in late autumn 2009 • You need judgements for yet another experiment

  9. Crowdsourcing – A Tale of Great Romance • A dark and cold day in late autumn 2009 • You need judgements for yet another experiment • You get cheated!

  10. Crowdsourcing – A Tale of Great Romance • A dark and cold day in late autumn 2009 • You need judgements for yet another experiment • You get cheated! • Again and again...

  11. Outline • The Crowdsourcing Boom • Crowdsourcing, a Tale of Great Romance • A Journey to the Dark Side of Crowdsourcing • Is all Lost? • Conclusions

  12. A Journey to the Dark Side • Task-based overview • What is it that malicious workers do? • Do we have remedies?

  13. A Journey to the Dark Side • Task: Closed class questions • Possible cheat: uniform answering (all yes/no) • Possible cheat: arbitrary answers • Remedy: Good gold standard data helps • Pitfall: Cheaters who think about the task at hand can cause a lot of trouble (e.g. relevance judgements)
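
  To make the gold-standard remedy concrete, here is a minimal Python sketch (ours, not from the slides) that keeps only workers whose accuracy on embedded gold-standard questions clears a threshold; the data layout, the function name trusted_workers and the 0.7 cut-off are illustrative assumptions.

      def trusted_workers(answers, gold, threshold=0.7):
          """answers: {worker_id: {question_id: answer}} collected from the HIT.
          gold:    {question_id: known_correct_answer}, the hidden check questions.
          Returns the set of workers who pass the gold-standard check."""
          trusted = set()
          for worker, given in answers.items():
              checked = [qid for qid in given if qid in gold]
              if not checked:
                  continue  # worker saw no gold questions; handle separately
              accuracy = sum(given[qid] == gold[qid] for qid in checked) / len(checked)
              if accuracy >= threshold:
                  trusted.add(worker)
          return trusted

  A uniform yes/no answerer scores around chance level on balanced gold questions and is filtered out, which is exactly why this remedy struggles against cheaters who actually think about the task, such as sloppy but plausible relevance judgements.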

  14. A Journey to the Dark Side • Task: Open class questions • Possible cheat (1): Copy and paste standard text • Possible cheat (2): Copy and paste domain-specific text • Remedy: (1) is easy to detect; (2) is problematic
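
  As an illustration of why cheat (1) is easy to detect, a normalise-and-count sketch like the one below (our assumption, not the authors' tool) flags answers pasted verbatim into many free-text fields; pasted domain-specific text (cheat 2) looks unique per question and slips through.

      from collections import Counter

      def pasted_boilerplate(free_text_answers, max_repeats=2):
          """free_text_answers: all free-text responses from one worker.
          Returns answers that recur more than max_repeats times after
          whitespace/case normalisation - a strong sign of pasted boilerplate."""
          normalised = (" ".join(a.lower().split()) for a in free_text_answers)
          counts = Counter(normalised)
          return {text for text, n in counts.items() if n > max_repeats}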

  15. A Journey to the Dark Side • Task: Internal quality control • Possible cheat: artificially boost your own confidence • Possible cheat: even worse, do so in a network • Remedy: We need a better confidence measure than prior acceptance rate • Pitfall: Due to the large scale of HITs it is hard to find a reliable confidence measure
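
  One candidate for a better confidence measure (our suggestion, not a remedy proposed on the slide) is a lower confidence bound on the worker's observed agreement with gold or majority answers; unlike the raw prior acceptance rate, it stays low until enough evidence has accumulated and is harder to inflate artificially.

      import math

      def agreement_lower_bound(correct, total, z=1.96):
          """Wilson score lower bound on a worker's true agreement rate,
          given `correct` agreements out of `total` checked judgements.
          3/3 correct scores ~0.44 while 95/100 scores ~0.89, so a handful
          of boosted judgements cannot fake a high confidence estimate."""
          if total == 0:
              return 0.0
          p = correct / total
          centre = p + z * z / (2 * total)
          margin = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total))
          return (centre - margin) / (1 + z * z / total)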

  16. A Journey to the Dark Side • Task: External quality control • Setup: redirect workers to your own site and let them do the HITs there • Possible cheat: make up confirmation token • Possible cheat: re-use genuine token • Possible cheat: claim that you did not get a token • Remedy: all of the above are easy to detect
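
  For illustration, all three cheats can be ruled out with a keyed token plus a redemption log; the sketch below is an assumed setup, not the authors' implementation. An HMAC makes made-up tokens fail verification, and a record of redeemed tokens stops genuine ones from being re-used, while your own completion log refutes "I never got a token".

      import hmac, hashlib

      SECRET = b"server-side-secret"   # stays on your own site, never shown to workers
      redeemed = set()                 # tokens that have already been claimed

      def issue_token(worker_id, assignment_id):
          """Handed out by your external site when the worker finishes the task there."""
          msg = f"{worker_id}:{assignment_id}".encode()
          return hmac.new(SECRET, msg, hashlib.sha256).hexdigest()[:16]

      def verify_token(worker_id, assignment_id, token):
          """Made-up tokens fail the HMAC check; re-used tokens fail the redemption check."""
          if not hmac.compare_digest(issue_token(worker_id, assignment_id), token):
              return False
          if token in redeemed:
              return False
          redeemed.add(token)
          return True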

  17. Outline • The Crowdsourcing Boom • Crowdsourcing, a Tale of Great Romance • A Journey to the Dark Side of Crowdsourcing • Is all Lost? • Conclusions

  18. Is all Lost? • Posterior detection and filtering of cheaters works reliably • But we waste resources (money, time, nerves...) • Can we discourage cheaters from doing our HIT in the first place?

  19. Is all Lost? • Which HIT types do cheaters like? • The Summer 2008 HIT hardly attracted any cheaters • The one in Autumn was swamped by them • The Summer task required a lot of creativity whereas the Autumn one was a straightforward relevance judgement

  20. Is all Lost? • Hypothesis: “If the HIT conveys the impression of requiring creativity, cheaters are less likely to take it.” • 2 HIT types • Suitability for children • Standard relevance judgements

  21. Task/Interface Design

  22. Crowd Filtering

  23. Conclusion • The share of malicious workers can be significantly reduced by making your task innovative, creative, and non-repetitive • Crowd Filtering can help to reduce the share of malicious workers at the cost of higher completion time • Previous acceptance rate is not a robust predictor of worker reliability

  24. Thank You!

  25. Questions, Remarks, Concerns? c.eickhoff@tudelft.nl
