Crowdscreen: Algorithms for Filtering Data using Humans
Aditya Parameswaran, Stanford University
(Joint work with Hector Garcia-Molina, Hyunjung Park, Neoklis Polyzotis, Aditya Ramesh, and Jennifer Widom)
Crowdsourcing: A Quick Primer
Asking the crowd for help to solve problems.
Why? Many tasks are done better by humans:
• Is this a photo of a car?
• Pick the "cuter" cat
How? We use an internet marketplace.
• Requester: Aditya • Reward: $1 • Time: 1 day
Crowd Algorithms
• Working on fundamental data processing algorithms that use humans: Max [SIGMOD12], Filter [SIGMOD12], Categorize [VLDB11], Cluster [KDD12], Search, Sort
• Using human unit operations: Predicate Eval., Comparisons, Ranking, Rating
Goal: Design efficient crowd algorithms
Efficiency: Fundamental Tradeoffs
• Which questions do I ask humans?
• Do I ask in sequence or in parallel?
• How much redundancy in questions?
• How do I combine the answers?
• When do I stop?
Three competing dimensions:
• Cost: how much $$ can I spend?
• Latency: how long can I wait?
• Uncertainty: what is the desired quality?
Filter
[Figure: a dataset of items flows through predicates 1 through k; each predicate is a crowd question such as "Is this an image of Paris?", "Is the image blurry?", or "Does it show people's faces?"; items that satisfy all predicates form the filtered dataset]
Single predicate: does item X satisfy the predicate? (Yes/No)
Applications: Content Moderation, Spam Identification, Determining Relevance, Image/Video Selection, Curation, and Management, …
Parameters
• Given:
  • Per-question human error probability (FP/FN)
  • Selectivity
• Goal: Compose filtering strategies, minimizing across all items:
  • Overall expected cost (# of questions)
  • Overall expected error
Our Visualization of Strategies
[Figure: a grid with # of YESs on the vertical axis and # of NOs on the horizontal axis; each point is marked "continue", "decide PASS", or "decide FAIL"]
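To make the grid concrete, here is a minimal sketch in Python of how such a strategy can be represented and executed; the representation (a dict from grid points to decisions) and the ask() callback standing in for a crowd-marketplace query are my own illustrative choices, not the paper's.

import random

def run_strategy(strategy, ask):
    # Walk the grid: start at (0, 0), step up on each Yes answer and
    # right on each No, and stop at the first "decide" point.
    yes = no = 0
    while strategy[(yes, no)] == 'continue':
        if ask():
            yes += 1
        else:
            no += 1
    return strategy[(yes, no)]

# Example: a tiny strategy, run against a simulated worker who
# answers Yes 90% of the time.
strategy = {(0, 0): 'continue', (1, 0): 'pass', (0, 1): 'continue',
            (1, 1): 'pass', (0, 2): 'fail'}
print(run_strategy(strategy, ask=lambda: random.random() < 0.9))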
Common Strategies
• Triangular strategy: always ask X questions, return the most likely answer
• Rectangular strategy: if X YESs return "Pass", if Y NOs return "Fail", else keep asking
• Chopped-off triangle: ask until |#YES - #NO| > X, or at most Y questions
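Each of these shapes is easy to generate in the grid representation above. The constructors below are my own sketches of the three shapes, with ties in the majority vote broken toward "pass".

def rectangular(x, y):
    # Decide PASS after x YESs, FAIL after y NOs, else keep asking.
    return {(ys, ns): ('pass' if ys == x else
                       'fail' if ns == y else 'continue')
            for ys in range(x + 1) for ns in range(y + 1)}

def triangular(x):
    # Always ask exactly x questions, then return the majority answer.
    return {(ys, ns): ('continue' if ys + ns < x else
                       'pass' if ys >= ns else 'fail')
            for ys in range(x + 1) for ns in range(x + 1 - ys)}

def chopped_triangle(x, y):
    # Ask until |#YES - #NO| > x, or at most y questions.
    strat = {}
    for ys in range(y + 1):
        for ns in range(y + 1 - ys):
            if abs(ys - ns) > x:
                strat[(ys, ns)] = 'pass' if ys > ns else 'fail'
            elif ys + ns == y:
                strat[(ys, ns)] = 'pass' if ys >= ns else 'fail'
            else:
                strat[(ys, ns)] = 'continue'
    return strat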
Filtering: Outline • How do we evaluate strategies? • Hasn’t this been done before? • What is the best strategy? (Formulation 1) • Formal statement • Brute force approach • Pruning strategies • Probabilistic strategies • Experiments • Extensions
Evaluating Strategies
Cost = Σ over termination points (x + y) · Pr. of reaching (x, y)
Error = Σ over termination points Pr. of reaching (x, y) and filtering incorrectly
Pr. of reaching (x, y) = Pr. of reaching (x, y - 1) and getting a Yes + Pr. of reaching (x - 1, y) and getting a No
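The recurrence gives a simple dynamic program over the grid. Below is a sketch that evaluates a deterministic strategy, under simplifying assumptions of my own: a single symmetric per-question error rate e (the paper allows separate FP/FN rates) and selectivity s as the prior that an item truly passes.

def evaluate(strategy, m, e, s):
    # Returns (expected # of questions, expected error) for one item.
    cost = err = 0.0
    for truth, prior in ((True, s), (False, 1 - s)):
        p_yes = 1 - e if truth else e        # chance of hearing "Yes"
        reach = {(0, 0): 1.0}                # Pr. of reaching each point
        for n in range(m + 1):               # n = questions asked so far
            for ys in range(n + 1):
                pt = (ys, n - ys)
                p = reach.get(pt, 0.0)
                if p == 0.0:
                    continue
                if strategy[pt] == 'continue':
                    up, right = (ys + 1, n - ys), (ys, n - ys + 1)
                    reach[up] = reach.get(up, 0.0) + p * p_yes
                    reach[right] = reach.get(right, 0.0) + p * (1 - p_yes)
                else:
                    cost += prior * p * n
                    if (strategy[pt] == 'pass') != truth:   # wrong call
                        err += prior * p
    return cost, err

# e.g. evaluate(triangular(5), m=5, e=0.1, s=0.5)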
Hasn't this been done before?
• Solutions from elementary statistics guarantee the same error per item
• Important in contexts like automobile testing and medical diagnosis
• We're worried about aggregate error over all items: a uniquely data-oriented problem
• We don't care if every item is perfect as long as the overall error target is met
• As we will see, this results in $$$ savings
What is the best strategy? (Formulation 1)
Find the strategy with minimum overall expected cost, such that:
• Overall expected error is less than a threshold
• Number of questions per item never exceeds m
Brute Force Approaches
• Try all O(3^p) strategies, p = O(m²): takes too long!
• Try all "hollow" strategies: still too long!
[Figure: example grids of a full strategy and a hollow strategy on the YES/NO axes]
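A back-of-the-envelope count (my own, assuming a budget of m = 6 questions) shows why full enumeration is hopeless:

m = 6
# Grid points (ys, ns) with ys + ns <= m; p grows as O(m^2).
p = sum(1 for ys in range(m + 1) for ns in range(m + 1 - ys))
print(p, 3 ** p)   # 28 points -> 3^28, roughly 2.3 * 10^13 labelings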
Pruning Hollow Strategies
For every hollow strategy, there is a ladder strategy that is as good or better.
[Figure: a ladder-shaped decision boundary on the YES/NO grid]
Other Pruning Examples
[Figure: side-by-side grids of a hollow strategy and the corresponding ladder strategy]
Probabilistic Strategies
• Each grid point (x, y) carries probabilities: continue(x, y), pass(x, y), fail(x, y)
[Figure: example grid labeling each point with its (continue, pass, fail) triple, e.g., (0.5, 0.5, 0) or (0, 0, 1)]
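The dynamic program from before extends directly: instead of branching on a single decision, each point splits its probability mass according to the triple. A sketch, with the same simplifying assumptions as evaluate() above (and assuming the continue probability is 0 once n = m):

def evaluate_prob(strategy, m, e, s):
    # strategy maps each point to a (continue, pass, fail) triple.
    cost = err = 0.0
    for truth, prior in ((True, s), (False, 1 - s)):
        p_yes = 1 - e if truth else e
        reach = {(0, 0): 1.0}
        for n in range(m + 1):
            for ys in range(n + 1):
                pt = (ys, n - ys)
                p = reach.get(pt, 0.0)
                if p == 0.0:
                    continue
                cont, pas, fail = strategy[pt]
                # Mass that stops here, split across the two decisions.
                cost += prior * p * (pas + fail) * n
                err += prior * p * (fail if truth else pas)
                # Remaining mass flows on to the two neighbours.
                up, right = (ys + 1, n - ys), (ys, n - ys + 1)
                reach[up] = reach.get(up, 0.0) + p * cont * p_yes
                reach[right] = reach.get(right, 0.0) + p * cont * (1 - p_yes)
    return cost, err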
Best probabilistic strategy
• Finding the best strategy can be posed as a Linear Program!
• Insight 1: Pr. of reaching (x, y) = (# of paths into (x, y)) × Pr. of one path
• Insight 2: The probability of filtering incorrectly at a point is independent of the number of paths
• Insight 3: At least one of pass(x, y) or fail(x, y) must be 0
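Putting the insights together: because the answer-probability factor at a point is the same for every path into it (Insight 1), the strategy can be encoded with one flow variable per point and action rather than one per path; flow conservation then gives linear constraints, and cost and error become linear in those variables. Below is my own sketch of such an encoding with scipy, again assuming a single symmetric error rate e; the helper name best_strategy_lp and the variable layout are illustrative, not from the paper.

import numpy as np
from scipy.optimize import linprog

def best_strategy_lp(m, e, s, tau):
    # Per grid point: a = mass that continues, bp = mass deciding PASS,
    # bf = mass deciding FAIL (path counts weighted by continue probs).
    pts = [(ys, ns) for ys in range(m + 1) for ns in range(m + 1 - ys)]
    idx = {pt: i for i, pt in enumerate(pts)}
    P = len(pts)
    A0, BP, BF = 0, P, 2 * P       # offsets into the variable vector

    def q(ys, ns, truth):
        # Pr. of one particular sequence of ys Yes / ns No answers;
        # by Insight 1 it is the same for every path into the point.
        py = 1 - e if truth else e
        return py ** ys * (1 - py) ** ns

    cost, err = np.zeros(3 * P), np.zeros(3 * P)
    A_eq, b_eq = np.zeros((P, 3 * P)), np.zeros(P)
    for (ys, ns), i in idx.items():
        w = s * q(ys, ns, True) + (1 - s) * q(ys, ns, False)
        cost[BP + i] = cost[BF + i] = (ys + ns) * w    # questions paid
        err[BP + i] = (1 - s) * q(ys, ns, False)       # wrong PASS
        err[BF + i] = s * q(ys, ns, True)              # wrong FAIL
        # Flow conservation: a + bp + bf = mass flowing in from parents.
        A_eq[i, [A0 + i, BP + i, BF + i]] = 1.0
        if (ys, ns) == (0, 0):
            b_eq[i] = 1.0                              # unit source
        else:
            for parent in ((ys - 1, ns), (ys, ns - 1)):
                if parent in idx:
                    A_eq[i, A0 + idx[parent]] = -1.0
    # No continuing once the per-item budget m is exhausted.
    bounds = [(0.0, 0.0) if k < P and sum(pts[k]) == m else (0.0, None)
              for k in range(3 * P)]
    return linprog(cost, A_ub=err[None, :], b_ub=[tau],
                   A_eq=A_eq, b_eq=b_eq, bounds=bounds)

A point's continue/pass/fail probabilities are recovered by normalizing its three masses by their sum; Insight 3 shows up here, since at an optimal vertex at most one of bp and bf is nonzero at any point.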
Experimental Setup
• Goal: study the cost savings of the probabilistic strategy relative to the others
• Pipeline: parameters → generate strategies → compute cost
• Strategies compared: Probabilistic vs. deterministic (Hollow, Ladder, Rect, Growth, Shrink)
• Two sample plots:
  • Varying false positive error (other parameters fixed)
  • Varying selectivity (other parameters varying)
Other Issues and Factors
• Other formulations
• Multiple filters
• Categorize (output > 2 types)
Ref: "CrowdScreen: Algorithms for Filtering Data with Humans" [SIGMOD 2012]
Natural Next Steps
• Expertise
• Spam Workers
• Task Difficulty
• Latency
• Error Models
• Pricing
→ Algorithms for the skyline of cost, latency, and error
Related Work on Crowdsourcing
• Workflows, Platforms and Libraries: Turkit [Little et al. 2009], HProc [Heymann 2010], CrowdForge [Kittur et al. 2011], Turkomatic [Kulkarni and Can 2011], TurKontrol/Clowder [Dai, Mausam and Weld 2010-11]
• Games: GWAP, Matchin, Verbosity, Input Agreement, Tagatune, Peekaboom [Von Ahn & group 2006-10], Kisskissban [Ho et al. 2009], Foldit [Cooper et al. 2010-11], Trivia Masster [Deutch et al. 2012]
• Marketplace Analysis: [Kittur et al. 2008], [Chilton et al. 2010], [Horton and Chilton 2010], [Ipeirotis 2010]
• Apps: VizWiz [Bigham et al. 2010], Soylent [Bernstein et al. 2010], ChaCha, CollabMap [Stranders et al. 2011], Shepherd [Dow et al. 2011]
• Active Learning: Survey [Settles 2010], [Raykar et al. 2009-10], [Sheng et al. 2008], [Welinder et al. 2010], [Dekel 2010], [Snow et al. 2008], [Shahaf 2010], [Dasgupta, Langford et al. 2007-10]
• Databases: CrowdDB [Franklin et al. 2011], Qurk [Marcus et al. 2011], Deco [Parameswaran et al. 2011], Hlog [Chai et al. 2009]
• Algorithms: [Marcus et al. 2011], [Gomes et al. 2011], [Ailon et al. 2008], [Karger et al. 2011]
Thanks for listening! Questions?
[Image: Schrödinger's cat]