Turkomatic
Collaboratively Crowdsourcing Complex Work With Turkomatic
Anand Kulkarni, Björn Hartmann, University of California, Berkeley
Matthew Can, Stanford University
Microtask marketplaces excel at simple, repetitive work. Transcribe a business card. Look up a fact online.
Much of the work we do in our daily lives is not simple or repetitive. “Arrange my trip to Seattle.” “Create algebra problems for my mathematics exam.” “Write a research paper.” “Create a small piece of software.” “Write a blog about Mechanical Turk with a few good entries.”
Complex work with crowds
• Soylent: Editing word processing documents (Bernstein et al. ’10)
• VizWiz: Answering queries about visual scenes (Bigham et al. ’10)
• More complex applications: PlateMate [NHZG11], Adrenaline [BBMK11], CrowdForge [KSK11], …
Workflows: Crowd Algorithms. Divide complex tasks into a sequence of microtasks arranged in a workflow. (Soylent, Bernstein et al., UIST 2010)
Workflow design is labor-intensive
1. Design individual HITs
2. Implement parallelism to make sure tasks are done correctly
3. Write software to launch HITs and parse worker results
4. Test workflow by running program
5. Identify errors
6. Iterate from step 1
Workflow design is labor-intensive
• Difficult and domain-specific: Workflow design requires extensive up-front iteration and experimentation and is specific to a given task domain.
• Inaccessible to non-experts: Few have the patience to implement this process in code.
What is Turkomatic? Turkomatic is a system for crowdsourcing high-level complex and creative work where the crowd designs the workflow.
What is Turkomatic? Create a new blog about Mechanical Turk with two posts.
Price-Divide-Solve (PDS) How do we induce the crowd to design a workflow?
Price-Divide-Solve (PDS)
PDS is a divide-and-conquer algorithm to create workflows.
Price: Can this task be solved for 20 cents?
• If yes: Solve task and return the answer.
• If no: Divide task into multiple steps. For each step, recurse. Merge steps into solution.
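The recursion described in this slide can be sketched in a few lines of Python. This is a minimal illustration, not Turkomatic's actual implementation: the `Crowd` container, its four callables, and the `max_depth` guard are hypothetical names introduced here, and a real system would back each callable with HITs posted to workers.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Crowd:
    """Hypothetical stand-in for the crowd: each callable would be a HIT in practice."""
    can_solve: Callable[[str], bool]        # Price: solvable for the fixed price?
    solve: Callable[[str], str]             # Solve: return an answer for the task
    divide: Callable[[str], List[str]]      # Divide: split the task into steps
    merge: Callable[[str, List[str]], str]  # Merge: combine the step answers


def price_divide_solve(task: str, crowd: Crowd, depth: int = 0, max_depth: int = 5) -> str:
    """Recursive Price-Divide-Solve sketch.

    max_depth is an assumption added here to force termination; it is not
    part of the algorithm as described in the slides.
    """
    if depth >= max_depth or crowd.can_solve(task):
        return crowd.solve(task)
    steps = crowd.divide(task)
    answers = [price_divide_solve(s, crowd, depth + 1, max_depth) for s in steps]
    return crowd.merge(task, answers)


# Toy crowd: a task counts as "cheap enough" once it contains no " and ".
toy_crowd = Crowd(
    can_solve=lambda t: " and " not in t,
    solve=lambda t: f"[done: {t}]",
    divide=lambda t: t.split(" and "),
    merge=lambda t, answers: " + ".join(answers),
)

result = price_divide_solve(
    "create a blog and write post one and write post two", toy_crowd
)
print(result)
# [done: create a blog] + [done: write post one] + [done: write post two]
```

The toy crowd only exercises the control flow; the interesting behavior in practice comes from what human workers return at each of the four hooks.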
Price-Divide-Solve (PDS)
Redundancy is used at each step to ensure quality.
[Diagram: each step is given to several workers in parallel]
• Price check: majority → consensus on price
• Divide task: vote → best subdivision
• Solve task: vote → best solution
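One simple way to realize this kind of redundancy is to ask several workers the same question and keep the most common response. The sketch below is an assumption about how such a vote could be tallied, not the system's actual quality-control code; `majority_vote` and the sample responses are hypothetical.

```python
from collections import Counter
from typing import Hashable, Optional, Sequence


def majority_vote(responses: Sequence[Hashable]) -> Optional[Hashable]:
    """Return the response given by more than half of the workers, else None."""
    if not responses:
        return None
    winner, count = Counter(responses).most_common(1)[0]
    return winner if count > len(responses) / 2 else None


price_votes = ["no", "no", "yes", "no", "yes"]
print(majority_vote(price_votes))    # no
print(majority_vote(["yes", "no"]))  # None (tie, no consensus)
```

When no strict majority exists, a caller could, for example, recruit additional workers and tally again; that policy is left open here.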
Price-Divide-Solve (PDS)
Task: Create a new blog about Mechanical Turk with two posts.
Price: Can we solve it for 20 cents? No.
Divide: Divide it into two or more steps.
• Create a new blog on Wordpress.com.
• Write one entry for a blog.
• Write a second entry for a blog.
Price: Can we solve each step for 20 cents? Yes. Yes. Yes.
Solve: Workers returned the following blog entries.
“Welcome to my blog about Mechanical Turk! Here, I’ll be posting some of my favorite recipes for Mechanical Turk. You’ll be able to follow along at home and create delicious HITs. From the comfort of your own home! Stay tuned and i’ll show you some of the best strategies for keeping your Turk workers engaged.”
“You may be inclined to price your HITs at the lowest possible rate, but this isn’t always the best choice. Instead, you should base your pricing on:
-How long will the HIT take?
-Is the HIT similar to other HITs? If so, price it slightly less than theirs.
-If the HIT involves a lot of qualifications, you may want to price it higher, to attract more qualified workers.”
Merge: Combine the results of solved steps. Result: mtworker.wordpress.com
Worker interface, Price: “Can this task be solved for 20 cents?” Task: Write a blog about Mechanical Turk. Options: Yes / No. [Submit]
Worker interface, Divide: “Break down the following task.” Task: Write a blog about Mechanical Turk. Fields: Step 1, Step 2, [Add Step]. [Submit]
Worker interface, Solve: “Solve the following task.” Task: Create a new blank blog on Wordpress. [Submit]
Worker interface, Merge: “Merge the following subtasks.” Task: Write a blog about Mechanical Turk. “Workers previously divided this task into simpler steps and solved each step. Combine their work into a complete solution.” Step 1: Create a blank blog about Mechanical Turk [answer: www...] Step 2: Write a blog post about Mechanical Turk. [answer: This post is…] [Submit]
Price-Divide-Solve (PDS)
PDS guides the crowd to design workflows in a particular way. It can attempt to create a workflow for any task, but it can’t produce all workflows. Example of an iterative workflow shown on this slide: Write a sentence. → Improve the previous worker’s answer. → Check that the previous answer was improved.
System Recap
[Diagram with components: Requester Interface, Algorithm (Price, Divide, Solve), Worker Interface, System Output]
Experiment 1: Can the crowd plan and execute workflows using PDS?
Over 150 trials, including:
• Java programming
• Booking restaurants
• Sorting and cleaning data
• Blogging
• Creating self-portraits
• Solving an SAT
• Logo design
• Travel planning
• Writing essays
• Web research
• …
Experiment 1: Success Modes
Task: Write a 3-paragraph essay about whether it’s ever OK to lie.
• Write one paragraph arguing it’s OK to lie sometimes.
• Write one paragraph suggesting it’s never OK to lie.
• Write a conclusion reconciling the two.
  • Write one sentence to open the conclusion.
  • Write 2-3 sentences in the middle of the conclusion.
  • Write a concluding sentence.
Experiment 1: Success Modes Data: • 6 subnodes were produced • 44 separate worker judgments were used • Task completed with a full essay
Experiment 1: Success Modes “…although many people believe it is always essential to tell the truth, sometimes it may be better to lie. There is credibility in both views. And like many ethical decisions, sometimes the circumstances dictate. When you tell the truth you develop a stronger bond of trust with those around you. A relationship can not exist without trust. If you lie, you end up telling more lies to cover the first….”
Experiment 1: Failure Modes
We found two ways that the algorithm could fail:
• Failing to terminate at all
• Completing, but producing wrong answers
Experiment 1: Failing to terminate Plan a trip from New York to S.F. that visits 5 interesting places. Think about where to go next in Ohio. Think about where to go next in Ohio.
Experiment 1: Wrong answers
Task: List the department chairs of the top 20 US programs in CS.
Answer returned:
• aalto armchair
• poang lounge chair
• adirondack chair
• aeron chair
• balans chair
• ball chair
• ….
Why does the crowd lose context? Turkomatic worker: “…I’ve taken a look at your instructions, and I understand them perfectly. However, this task seems to have been inadvertently sabotaged by other turkers who do not understand what you are asking them to do…”
Long workflows involve increasing chains of trust. Each individual worker has a ~30% probability of failure [Chi/Kittur/Suh ’08, Bernstein et al. ’10]. Weakest link problem: If one worker early in the workflow design process makes mistakes, the subsequent decompositions will fail.
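To see why long chains are fragile, consider a simplified model in which every worker in a chain fails independently with the same probability. Independence and the fixed 30% rate are modeling assumptions made here for illustration, and `chain_success_probability` is a hypothetical helper; the model also ignores the redundancy PDS applies at each step.

```python
def chain_success_probability(n_workers: int, p_fail: float = 0.3) -> float:
    """Probability that all n_workers in a chain succeed, assuming independent failures."""
    return (1.0 - p_fail) ** n_workers


for n in (1, 3, 5, 10):
    print(n, round(chain_success_probability(n), 3))
# 1 0.7
# 3 0.343
# 5 0.168
# 10 0.028
```

Under these assumptions, a chain of five unchecked workers succeeds only about 17% of the time, which is one way to read the weakest link problem.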
One explanation What if we used more competent workers?
Experiment 2: Can expert workers make Turkomatic work? Setup: We recruited five graduate students with experience as requesters on Mechanical Turk. We ran the PDS algorithm on three complex tasks with this crowd: online research, essay writing, and creating a blog.
Experiment 2: Can expert workers make Turkomatic work? Results: Each of three tested tasks completed correctly when we used only expert workers! Conclusion: PDS works well with qualified crowds.