
Research Case in Crowdsourcing

Understand research methods in crowdsourcing using comparison studies with binary and N-ary questions, exploring cost, latency, and accuracy trade-offs. Highlights case studies and challenges in crowd research projects. Discusses crowdsourced sort and select operations.


Presentation Transcript


  1. Research Case in Crowdsourcing

  2. Learning Objectives • Understand an important research case in crowdsourcing • Identify research methods used in the research case

  3. Size of Comparison • Diverse forms of questions in a HIT • Different sizes of comparisons in a question • Binary question: “Which is better?” • N-ary question: “Which is the best?” • [Figure: binary vs. N-ary questions against cost, latency, and accuracy]

  4. Size of Batch • Repetitions of questions within a HIT • Eg, two n-ary questions (batch factor b=2) • [Figure: smaller b vs. larger b against cost, latency, and accuracy]

  5. Response (r) • # of human responses sought for a HIT • Eg, r = 1 (worker W1) vs. r = 3 (workers W1, W2, W3) for the question “Which is better?” • [Figure: smaller r vs. larger r against cost, latency, and accuracy]

  6. Round (= Step) • Algorithms are executed in rounds • # of rounds ≈ latency • [Figure: “Which is better?” questions in Round #1 and Round #2, contrasting parallel execution with sequential execution]

  7. New Challenges • Open-world assumption (OWA) • Eg, workers suggest a new relevant image • Non-deterministic algorithmic behavior • Eg, different answers by the same or different workers • Trade-off among cost, latency, and accuracy • [Figure: latency, accuracy, cost; image source: http://www.info.teradata.com]

  8. Crowdsourcing DB Projects • CDAS @ NUS • CrowdDB @ UC Berkeley & ETH Zurich • MoDaS @ Tel Aviv U. • Qurk @ MIT • sCOOP @ Stanford & UCSC

  9. Sort Operation • Rank N items using crowdsourcing with respect to the constraint C • Often C is subjective, fuzzy, ambiguous, and/or difficult-for-machines-to-compute • Eg, • Which image is the most “representative” one of Brazil? • Which animal is the most “dangerous”? • Which actress is the most “beautiful”?

  10. Sort Operation • SELECT * FROM SoccerPlayers AS P WHERE P.WorldCupYear = '2014' ORDER BY CrowdOp('most-valuable')

  11. Sort Operation • Eg, “Which of two players is better?” • Naïve all pair-wise comparison takes N(N-1)/2 comparisons • Optimal # of comparisons is O(N log N) • [Figure: grid of pair-wise “Who is better?” questions]
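
To make the gap between the two counts on this slide concrete, the following minimal sketch compares the all-pairs count N(N-1)/2 with the information-theoretic lower bound ceil(log2(N!)) for comparison sorting, which grows as N log N. The function names are illustrative and do not come from the slides.

```python
import math


def all_pairs_comparisons(n: int) -> int:
    """Questions needed when every pair of the n items is compared once."""
    return n * (n - 1) // 2


def sort_lower_bound(n: int) -> int:
    """Lower bound ceil(log2(n!)) on comparisons needed to sort n items."""
    return math.ceil(math.log2(math.factorial(n)))


if __name__ == "__main__":
    for n in (5, 10, 100):
        print(n, all_pairs_comparisons(n), sort_lower_bound(n))
```

For N = 5 the all-pairs count is 10, which matches the pair-wise question count used later for the naïve top-k solution.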

  12. Sort Operation • Conflicting opinions may occur • Cycle: A > B, B > C, and C > A • If no cycle occurs • Naïve all pair-wise comparison takes N(N-1)/2 comparisons • If cycle exists • More comparisons are required • [Figure: cycle among items A, B, and C]

  13. Sort [Marcus-VLDB11] • Proposed 3 crowdsourced sort algorithms • #1: Comparison-based Sort • Workers rank S of the N items per HIT • Each HIT yields S(S-1)/2 pair-wise comparisons • Build a directed graph using all pair-wise comparisons from all workers • If i > j, then add an edge from i to j • Break a cycle in the graph: “head-to-head” • Eg, if i > j occurs 3 times and i < j occurs 2 times, keep only i > j • Perform a topological sort on the DAG
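
The comparison-based steps above can be sketched roughly as follows. This is an illustration under assumptions, not the implementation from [Marcus-VLDB11]: the function names are hypothetical, each worker ranking is assumed to list items best first, and exact ties between the two directions of a pair are simply dropped. The head-to-head rule only resolves two-item conflicts, so a longer cycle (A > B, B > C, C > A) would still make `graphlib.TopologicalSorter` raise `CycleError`.

```python
from collections import Counter
from graphlib import TopologicalSorter  # Python 3.9+
from itertools import combinations


def pairwise_votes(rankings):
    """Turn each worker ranking (best first) into (winner, loser) votes."""
    votes = Counter()
    for ranking in rankings:
        for better, worse in combinations(ranking, 2):
            votes[(better, worse)] += 1
    return votes


def head_to_head_edges(votes):
    """Keep only the majority direction for each pair; drop exact ties."""
    edges = set()
    for (i, j), n_ij in votes.items():
        if n_ij > votes.get((j, i), 0):
            edges.add((i, j))
    return edges


def comparison_sort(items, rankings):
    """Return the items best first via a topological sort of the graph."""
    edges = head_to_head_edges(pairwise_votes(rankings))
    # graphlib emits a node's predecessors before the node itself,
    # so for an edge i > j we register i as a predecessor of j.
    graph = {item: set() for item in items}
    for i, j in edges:
        graph[j].add(i)
    return list(TopologicalSorter(graph).static_order())
```

With N=5 and S=3, four hypothetical HITs such as `[["A","B","C"], ["B","C","D"], ["C","D","E"], ["A","C","E"]]` already determine a single total order over A to E.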

  14. Sort [Marcus-VLDB11] • [Figure: orderings of items 1-5 with an error marked; labels: 5 3 4 1 2, 2 3 1 5 4]

  15. Sort [Marcus-VLDB11] • Example with N=5, S=3 • [Figure: rankings from workers W1-W4 over items A-E and the resulting weighted directed graph, with one edge removed (✖)]

  16. Sort [Marcus-VLDB11] • Example with N=5, S=3 • [Figure: DAG over items A-E, its topological sort, and the sorted result]

  17. Sort [Marcus-VLDB11] • #2: Rating-based Sort • W workers rate each item along a numerical scale • Compute the mean of W ratings of each item • Sort all items using their means • Requires W*N HITs: O(N) • [Figure: items with mean ratings, eg, 1.3, 3.6, ..., 8.2]
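
Rating-based sort reduces to averaging and ordering, as in this minimal sketch. It assumes that a higher rating means a better item and that `ratings` maps each item to the list of its W worker ratings; both the name `rating_sort` and that data layout are illustrative choices.

```python
from statistics import mean


def rating_sort(ratings):
    """Order items by descending mean of their worker ratings."""
    means = {item: mean(scores) for item, scores in ratings.items()}
    return sorted(means, key=means.get, reverse=True)


if __name__ == "__main__":
    print(rating_sort({"a": [1, 2, 1], "b": [8, 9, 8], "c": [3, 4, 4]}))
```

Each item is rated independently, which is why the number of HITs grows linearly in N rather than with the number of pairs.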

  18. Sort [Marcus-VLDB11]

  19. Sort [Marcus-VLDB11] • #3: Hybrid Sort • First, do rating-based sort → sorted list L • Second, do comparison-based sort on a subset S of L • How to select S • Random • Confidence-based • Sliding window
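
One way to picture the sliding-window variant is the single-pass sketch below. It is a simplification under stated assumptions and not the procedure from [Marcus-VLDB11]: `compare_rank` is a hypothetical stand-in for one comparison HIT that returns its input re-ranked best first, and the `window` and `step` parameters are illustrative.

```python
def hybrid_sort(rated_order, compare_rank, window=3, step=1):
    """Refine a rating-based order with comparison HITs on sliding windows.

    rated_order  : items as ordered by the rating-based sort (list L)
    compare_rank : stand-in for one comparison HIT; takes a short list and
                   returns it re-ranked best first (hypothetical helper)
    """
    order = list(rated_order)
    last_start = max(len(order) - window + 1, 1)
    for start in range(0, last_start, step):
        # Re-rank only the items currently inside the window.
        order[start:start + window] = compare_rank(order[start:start + window])
    return order
```

The ratings place items roughly, and the more expensive comparisons are spent only on neighbors that the ratings may have confused.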

  20. Sort [Marcus-VLDB11] • [Figure: rank correlation between comparison-based and rating-based sort] • [Figure: worker agreement]

  21. Sort [Marcus-VLDB11]

  22. Select Operation • Given N items, select k items that satisfy a predicate P • ≈Filter, Find, Screen, Search

  23. Select [Yan-MobiSys10] Improving mobile image search using crowdsourcing

  24. Select [Yan-MobiSys10] • Ensuring accuracy with majority voting • Given accuracy, optimize cost and latency • Deadline as latency in mobile phones
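
Majority voting over the responses collected for one candidate image can be sketched as below. The strict-majority rule and the name `majority_vote` are assumptions made for illustration; the slide does not specify how ties are handled.

```python
from collections import Counter


def majority_vote(responses):
    """Return the answer given by a strict majority, or None if none exists."""
    if not responses:
        return None
    answer, count = Counter(responses).most_common(1)[0]
    return answer if count > len(responses) / 2 else None


if __name__ == "__main__":
    print(majority_vote(["yes", "yes", "no"]))
    print(majority_vote(["yes", "no"]))
```

Asking for more responses per candidate makes a strict majority more reliable but raises both cost and latency, which is the trade-off this slide points at.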

  25. Select [Yan-MobiSys10] Goal: For a query image Q, find the first relevant image I with min cost before the deadline

  26. Select [Yan-MobiSys10] Parallel crowdsourced validation

  27. Select [Yan-MobiSys10] Sequential crowdsourced validation

  28. Select [Yan-MobiSys10] CrowdSearch: using early prediction on the delay and outcome to start the validation of next candidate early

  29. Select [Yan-MobiSys10]

  30. Count Operation • Given N items, estimate the fraction M of items that satisfy a predicate P • Selectivity estimation in DB → crowd-powered query optimizers • Evaluating queries with GROUP BY + COUNT/AVG/SUM operators • Eg, “Find photos of females with red hair” • Selectivity(“female”) ≈ 50% • Selectivity(“red hair”) ≈ 2% • Better to process predicate(“red hair”) first
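
The benefit of processing the more selective predicate first can be checked with a little arithmetic. The sketch below assumes independent predicates, one crowd check per item per predicate, and that items failing a predicate are dropped before the next one; `expected_checks` is an illustrative name.

```python
def expected_checks(n_items, selectivities):
    """Expected crowd checks when predicates are applied in the given order."""
    remaining = float(n_items)
    total = 0.0
    for s in selectivities:
        total += remaining  # every surviving item is checked once
        remaining *= s      # only the matching fraction survives
    return total


if __name__ == "__main__":
    print(expected_checks(1000, [0.5, 0.02]))  # "female" first
    print(expected_checks(1000, [0.02, 0.5]))  # "red hair" first
```

For 1,000 photos this gives about 1,500 checks with “female” first versus about 1,020 with “red hair” first, under the stated assumptions.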

  31. Count Operation • Q: “How many teens are participating in the Hong Kong demonstration?”

  32. Count Operation • Using Face++, guess the age of a person • http://www.faceplusplus.com/demo-detect/ • [Figure: faces with estimated age ranges 10 - 56, 10 - 37, 15 - 29]

  33. Count [Marcus-VLDB13] • Hypothesis: Humans can estimate the frequency of objects’ properties in a batch without having to explicitly label each item • Two approaches • #1: Label Count • Sampling based • Have workers label samples explicitly • #2: Batch Count • Have workers estimate the frequency in a batch
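
The two approaches can be contrasted with a minimal sketch. It is illustrative only: `label_fn` stands in for a worker's explicit yes/no label on one item, `batch_estimates` is a list of fractions that workers reported after looking at whole batches, and plain averaging is an assumed aggregation rule rather than the estimator from [Marcus-VLDB13].

```python
import random
from statistics import mean


def label_count_estimate(items, label_fn, sample_size, rng=random):
    """Label Count: sample items, label each one, return the positive fraction."""
    sample = rng.sample(items, sample_size)
    return sum(1 for item in sample if label_fn(item)) / sample_size


def batch_count_estimate(batch_estimates):
    """Batch Count: average the per-batch fractions estimated by workers."""
    return mean(batch_estimates)
```

Label Count pays for one judgment per sampled item, while Batch Count pays for one judgment per batch, which is where the potential cost saving comes from.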

  34. Count [Marcus-VLDB13] Label Count (via sampling)

  35. Count [Marcus-VLDB13] Batch Count

  36. Count [Marcus-VLDB13] • Findings on accuracy • Images: Batch count > Label count • Texts: Batch count < Label count • Further Contributions • Detecting spammers • Avoiding coordinated attacks

  37. Top-1 Operation • Find the top-1, either MAX or MIN, among N items w.r.t. some criteria • Objective • Avoid sorting all N items to find top-1

  38. Max [Venetis-WWW12] • Introduced two Max algorithms • Bubble Max • Tournament Max • Parameterized framework • si: size of sets compared at the i-th round • ri: # of human responses at the i-th round • [Figure: “Which is better?” with si = 2, ri = 3; “Which is the best?” with si = 3, ri = 2]

  39. Max [Venetis-WWW12] • Bubble Max Case #1 • N = 5 • Rounds = 3 • Round parameters: s1 = 2, r1 = 3; s2 = 3, r2 = 3; s3 = 2, r3 = 5 • # of questions = r1 + r2 + r3 = 11

  40. Max [Venetis-WWW12] • Bubble Max Case #2 • N = 5 • Rounds = 2 • Round parameters: s1 = 4, r1 = 3; s2 = 2, r2 = 5 • # of questions = r1 + r2 = 8
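
The two Bubble Max cases can be replayed with the sketch below, which carries the current winner into each round together with fresh items and counts one question per worker response. It is a sketch under assumptions: `ask` is a hypothetical stand-in for a single worker's answer, and taking the plurality of the r responses is an assumed aggregation rule.

```python
from collections import Counter


def bubble_max(items, set_sizes, responses, ask):
    """Bubble Max sketch; returns (winner, total worker responses).

    set_sizes[i] is s_i, responses[i] is r_i, and ask(candidates) returns
    the item that one worker picks as best among the candidates.
    """
    pending = list(items)
    winner, total = None, 0
    for s, r in zip(set_sizes, responses):
        take = s if winner is None else s - 1  # the winner fills one slot
        candidates = ([] if winner is None else [winner]) + pending[:take]
        pending = pending[take:]
        picks = Counter(ask(candidates) for _ in range(r))
        winner = picks.most_common(1)[0][0]    # plurality of r responses
        total += r
    return winner, total


if __name__ == "__main__":
    data = [3, 1, 4, 5, 2]
    print(bubble_max(data, [2, 3, 2], [3, 3, 5], max))  # Case #1 parameters
    print(bubble_max(data, [4, 2], [3, 5], max))        # Case #2 parameters
```

With an error-free `ask` such as `max`, both parameter settings find the same item while collecting 11 and 8 responses respectively, matching the totals on the two slides.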

  41. Max [Venetis-WWW12] • Tournament Max • N = 5 • Rounds = 3 • Question parameters: s1 = 2, r1 = 1; s2 = 2, r2 = 1; s3 = 2, r3 = 3; s4 = 2, r4 = 5 • # of questions = r1 + r2 + r3 + r4 = 10

  42. Max [Venetis-WWW12] • How to find the optimal parameters si and ri? • Tuning Strategies (using Hill Climbing) • Constant si and ri • Constant si and varying ri • Varying si and ri

  43. Max [Venetis-WWW12] • Bubble Max • Worst case: with si=2, O(N) comparisons needed • Tournament Max • Worst case: with si=2, O(N) comparisons needed • Bubble Max is a special case of Tournament Max
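
The O(N) bound for si = 2 follows because every pair-wise question eliminates exactly one item, so N - 1 questions suffice. The sketch below runs a single-elimination tournament and counts questions and rounds; it assumes one response per question, a non-empty input, and a hypothetical `better(a, b)` callback that returns the preferred item.

```python
def tournament_max(items, better):
    """Pair-wise tournament; returns (winner, comparisons, rounds)."""
    current = list(items)
    comparisons = rounds = 0
    while len(current) > 1:
        nxt = []
        for k in range(0, len(current) - 1, 2):
            nxt.append(better(current[k], current[k + 1]))
            comparisons += 1
        if len(current) % 2:  # the odd item out advances without a question
            nxt.append(current[-1])
        current = nxt
        rounds += 1
    return current[0], comparisons, rounds


if __name__ == "__main__":
    print(tournament_max([3, 1, 4, 5, 2], max))
```

The questions within one round are independent and can be posted in parallel, so the number of rounds, not the number of questions, drives latency.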

  44. Max [Venetis-WWW12]

  45. Max [Venetis-WWW12]

  46. Top-k Operation • Find top-k items among N items w.r.t. some criteria • Top-k list vs. top-k set • Objective • Avoid sorting all N items to find top-k

  47. Top-k Operation • Naïve solution is to “sort” N items and pick top-k items • Eg, N=5, k=2, “Find the two best Bali images” • Ask 5(5-1)/2 = 10 pair-wise questions to get a total order • Pick top-2 images

  48. Top-k: Tournament Solution (k = 2) • Phase 1: Building a tournament tree • For each comparison, only winners are promoted to the next round • Total: 4 questions with 3 rounds • [Figure: tournament tree with Rounds 1-3]

  49. Top-k: Tournament Solution (k = 2) • Phase 2: Updating a tournament tree • Iteratively asking pair-wise questions from the bottom level • [Figure: tournament tree with Rounds 1-3]

  50. Top-k: Tournament Solution (k = 2) • Phase 2: Updating a tournament tree • Iteratively asking pair-wise questions from the bottom level • Total: 6 questions with 5 rounds • [Figure: tournament tree extended with Rounds 4-5]
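
For k = 2 the two phases can be sketched as follows. Phase 1 builds the tournament and remembers whom each item beat; Phase 2 compares, from the bottom level upward, only the items that lost directly to the winner, which amounts to replaying the winner's path through the tree. The sketch assumes at least two distinct, hashable items, consistent answers, and a hypothetical `better(a, b)` callback.

```python
def tournament_top2(items, better):
    """Return (best, second_best, questions) using pair-wise questions."""
    beaten = {item: [] for item in items}
    current, questions = list(items), 0
    # Phase 1: single-elimination tournament, recording each winner's victims.
    while len(current) > 1:
        nxt = []
        for k in range(0, len(current) - 1, 2):
            a, b = current[k], current[k + 1]
            w = better(a, b)
            beaten[w].append(b if w == a else a)
            nxt.append(w)
            questions += 1
        if len(current) % 2:  # the odd item out advances without a question
            nxt.append(current[-1])
        current = nxt
    best = current[0]
    # Phase 2: the runner-up must be among the items that lost to the winner.
    runner_up = beaten[best][0]
    for challenger in beaten[best][1:]:
        runner_up = better(runner_up, challenger)
        questions += 1
    return best, runner_up, questions


if __name__ == "__main__":
    print(tournament_top2([5, 1, 4, 3, 2], max))
```

For this five-item input the sketch asks 4 questions in Phase 1 and 2 more in Phase 2, in line with the 6 questions on the slide and below the 10 questions of the naïve solution.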
