Investigating Factors Influencing Crowdsourcing Tasks with High Imaginative Load. Raynor Vliegendhart Martha Larson Christoph Kofler Carsten Eickhoff (speaker) Johan Pouwelse. WSDM 2011 Workshop on Crowdsourcing for Search and Data Mining (CSDM 2011), Hong Kong, China, February 9–12, 2011.
First Things First • That title sounds pretty esoteric. What is this all about? • We are dealing with two phenomena: • HIT titles that try to prepare the worker for the task • HITs that require the worker to project into different roles or situations • We refer to this property of HITs as “Imaginative Load”
Outline • Why “Imaginative Load”? • Observations • Further Investigation • Conclusions
Why “Imaginative Load”? • Evaluation context: • Novel search-related feature for a file-sharing system • Term clouds as content descriptors • Required the workers to project themselves into the role of a user
Why “Imaginative Load”? The actual HIT was preceded by a recruitment step: • Recruitment HIT: 100/100 assignments completed • Qualified workers: 81 • Evaluation HIT: 0/405 assignments completed
Why “Imaginative Load”? The turnout: • Recruitment HIT: 100/100 assignments completed • Qualified workers: 81 • Evaluation HIT: 10/405 assignments completed
Why “Imaginative Load”? Perhaps we need more eligible workers? • Second Recruitment HIT: 100/100 assignments completed (79 additional workers qualified) • Qualified workers: 160 in total • Evaluation HIT: 12/405 assignments completed
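To make the funnel concrete, the slide numbers above can be turned into conversion rates. This is an illustrative calculation over the reported figures, not part of the original study; the round structure is an assumption based on the two recruitment batches shown.

```python
# Recruitment-to-evaluation funnel, using the numbers reported on the slides.
# Round 1: 100 recruitment assignments, 81 qualified workers, 10/405 evaluation
# assignments done. Round 2 adds 100 more recruits (160 qualified in total).
rounds = [
    {"recruited": 100, "qualified": 81,  "eval_done": 10, "eval_total": 405},
    {"recruited": 200, "qualified": 160, "eval_done": 12, "eval_total": 405},
]

for i, r in enumerate(rounds, 1):
    qual_rate = r["qualified"] / r["recruited"]   # recruitment -> qualification
    uptake = r["eval_done"] / r["eval_total"]     # qualification -> actual work
    print(f"Round {i}: {qual_rate:.0%} qualified, "
          f"{uptake:.1%} of evaluation assignments completed")
```

Even with roughly 80% of recruits qualifying, under 3% of the evaluation assignments were picked up, which is the gap the paper sets out to explain.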
Observation I • HIT uptake is slow • Most workers do not do the actual HIT • If they do, then they don't do many iterations • Hypothesis: • “The recruitment task and the HIT titles were misleading. Once the workers realised what they were supposed to do, they lost interest.”
Projection Into Different Roles • Can workers guess which types of content are available? • HITs with and without term clouds • Several variations of term clouds
Projection Into Different Roles Jim and his large circle of friends have a huge collection of files that they are sharing with a very popular file-sharing program. The file-sharing program is a make-believe program. Please imagine that it looks something like this sketch:
Questions If you could download one of these files, which one would it be? Why would you choose this particular file for download and viewing? Think again about the file that you chose. Why did you guess that Jim or one of his friends would have this file in their collection?
Observation II • Some workers match literally between the mock-up frame and the questions • The majority of workers are able to generalize from the situation or the mock-up frame • Hypothesis: • “HIT design can enhance the workers' success at completing projection tasks.”
Further Investigation • Answer quality and HIT uptake under the influence of: • Title • Questionnaire design • 5 experimental conditions: • 5 HITs per condition • 10 workers per HIT • $0.10 reward per assignment
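The scale of the follow-up experiment follows directly from the numbers above. A minimal sketch of the arithmetic (worker rewards only; the slides do not state platform fees, so those are deliberately left out):

```python
# Experimental scale and worker cost, from the figures on the slide.
conditions = 5            # experimental conditions
hits_per_condition = 5    # HITs per condition
workers_per_hit = 10      # assignments per HIT
reward_cents = 10         # $0.10 reward per assignment, in cents

assignments = conditions * hits_per_condition * workers_per_hit
worker_cost_usd = assignments * reward_cents / 100

print(assignments, worker_cost_usd)  # 250 25.0
```

So the full study comprised 250 assignments and $25.00 in worker rewards, which is characteristically cheap for an MTurk experiment of this kind.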
Further Investigation Title conditions: • A: Jim, his friends and a make-believe file-sharing program • B: Jim, his friends and digital stuff to download • C: Jim, his friends and interesting stuff to download
Further Investigation • All HITs yielded serious results (two assignments rejected due to cheating) • Title A: more than 2 days to complete • Titles B and C: each completed within a day
Further Investigation Question conditions (using Title B): • No preference questions at all, only an explanation of all given judgments • No justification of preference asked
Further Investigation Absence of preference questions: • Cut-and-paste strategies possible due to generality of question • Serious answers become more verbose to capture the generalized situation No explanation of preference: • Slight decrease in the degree to which the workers managed to project into Jim and his friends
Conclusions • “High imaginative load” tasks can be successfully run on MTurk • The key appears to be a combination of: • Signaling to workers the unique nature of the tasks (which are possibly quite different from those they generally choose), which results in faster HIT uptake • Requiring an individualized free-text justification response in each HIT, which improves the workers' ability to project and generalize