Generative Models for Crowdsourced Data


Presentation Transcript


  1. Generative Models for Crowdsourced Data

  2. Outline • What is Crowdsourcing? • Modeling the labeling process • Example with real data • Extensions • Future Directions

  3. What is Crowdsourcing? • Human-based computation. • Outsourcing certain steps of a computation to humans. • “Artificial artificial intelligence.” • Data science: • Making an immediate decision. • Creating a labeled data set for learning.

  4. Immediate Decision Workflow

  5. Labeled Data Set Workflow

  6. An Example HIT

  7. An Example HIT

  8. Funny enough … • Not everybody agrees on the gender of a Twitter profile. • Difficult Instances • Worker Ability / Motivation • Worker Bias • Adversarial Behaviour

  9. Difficult Instance

  10. Difficult Instance

  11. Difficult Instance

  12. Worker Ability

  13. Worker Ability

  14. Worker Ability

  15. Worker Motivation

  16. Worker Motivation

  17. Worker Bias

  18. Worker Bias

  19. Worker Bias

  20. Disagreements • When some workers say “male” and some workers say “female”, what to do?

  21. Majority Rules Heuristic • Assign label l to item x if a majority of workers agree. • Otherwise item x remains unlabeled.

  22. Majority Rules Heuristic • Assign label l to item x if a majority of workers agree. • Otherwise item x remains unlabeled. • Ignores prior worker data.

  23. Majority Rules Heuristic • Assign label l to item x if a majority of workers agree. • Otherwise item x remains unlabeled. • Ignores prior worker data. • Introduces bias in labeled data.
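
The heuristic on slides 21 to 23 can be sketched in a few lines. This is only an illustration: the function name `majority_label` is hypothetical, and "majority" is assumed here to mean a strict majority (more than half of the votes).

```python
from collections import Counter

def majority_label(worker_labels):
    """Return the label chosen by a strict majority of workers, else None.

    worker_labels: list of labels, one per worker, for a single item.
    None means the item stays unlabeled (no strict majority).
    """
    if not worker_labels:
        return None
    label, count = Counter(worker_labels).most_common(1)[0]
    # Strict majority: more than half of all votes (an assumption of this sketch).
    return label if 2 * count > len(worker_labels) else None

print(majority_label(["male", "male", "female"]))
print(majority_label(["male", "female"]))
```

Note that the function looks only at the votes for one item, so it cannot use anything known about the workers from earlier items, which is the limitation the slide points out.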

  24. Train on all labels • For labeled data set workflow. • Add all item-label pairs to the data set. • Equivalent to cost vector of: • $P(l \mid \{l_w\}) = \frac{1}{n_w} \sum_w 1\{l = l_w\}$

  25. Train on all labels • For labeled data set workflow. • Add all item-label pairs to the data set. • Equivalent to cost vector of: • $P(l \mid \{l_w\}) = \frac{1}{n_w} \sum_w 1\{l = l_w\}$ • Ignores prior worker data.

  26. Train on all labels • For labeled data set workflow. • Add all item-label pairs to the data set. • Equivalent to cost vector of: • $P(l \mid \{l_w\}) = \frac{1}{n_w} \sum_w 1\{l = l_w\}$ • Ignores prior worker data. • Models the crowd, not the “ground truth.”
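
The distribution on slides 24 to 26 is just the fraction of workers who chose each label. A minimal sketch, with the hypothetical name `empirical_label_distribution` and an explicit `label_set` argument added for illustration:

```python
from collections import Counter

def empirical_label_distribution(worker_labels, label_set):
    """P(l | {l_w}) = (1 / n_w) * sum over workers of 1{l = l_w}.

    worker_labels: non-empty list of labels from the n_w workers for one item.
    label_set: every label that should appear in the output.
    """
    n_w = len(worker_labels)
    counts = Counter(worker_labels)
    return {label: counts[label] / n_w for label in label_set}

dist = empirical_label_distribution(["male", "male", "female"], ["male", "female"])
print(dist)
```

Every worker's vote gets the same weight 1 / n_w, which is why this approach models the crowd rather than the ground truth.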

  27. What is ground truth • Different theoretical approaches. • PAC learning with noisy labels. • Fully-adversarial active learning. • Bayesians have been very active. • “Easy” to posit a functional form and quickly develop inference algorithms. • Issue of model correctness is ultimately empirical.

  28. Bayesian Literature • (2009) Whitehill et al. GLAD framework. • (1979) Dawid and Skene. Maximum Likelihood Estimation of Observer Error-Rates Using the EM Algorithm. • (2010) Welinder et al. The Multidimensional Wisdom of Crowds. • (2010) Raykar et al. Learning from Crowds.

  29. Bayesian Approach • Define ground truth via a generative model which describes how “ground truth” is related to the observed output of crowdsource workers. • Fit to observed data. • Extract posterior over ground truth. • Make decision or train classifier.

  30. Generative Model

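  31. Example: Binary Classification • Each worker has a matrix $\alpha = \begin{pmatrix} -1 & \alpha_{01} \\ \alpha_{10} & -1 \end{pmatrix}$. • Each item has a scalar difficulty $\beta > 0$. • $P(l_w = j \mid z = i) = e^{-\beta \alpha_{ij}} / \sum_k e^{-\beta \alpha_{ik}}$ • $\alpha_{ij} \sim N(\mu_{ij}, 1)$; $\mu_{ij} \sim N(0, 1)$ • $\log \beta \sim N(\rho, 1)$; $\rho \sim N(0, 1)$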
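
The label likelihood on slide 31 is a softmax over one row of the worker's matrix, scaled by the item parameter. The sketch below evaluates it for a single row; the function name `label_likelihood` and the example value 1.0 for the off-diagonal entry are assumptions made for illustration, not values from the deck.

```python
import math

def label_likelihood(alpha_row, beta):
    """Return [P(l_w = j | z = i) for each j], given row i of a worker's alpha matrix.

    Implements e^{-beta * alpha_ij} / sum_k e^{-beta * alpha_ik}.
    The maximum logit is subtracted before exponentiating for numerical stability.
    """
    logits = [-beta * a for a in alpha_row]
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Row for true label z = 0: diagonal entry -1, off-diagonal entry assumed to be 1.0.
row_for_z0 = [-1.0, 1.0]
print(label_likelihood(row_for_z0, beta=1.0))
print(label_likelihood(row_for_z0, beta=3.0))
```

With the diagonal fixed at -1, a larger off-diagonal entry makes the corresponding wrong label less likely, and a larger β sharpens the whole distribution.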

  32. Other Problems • Multiclass classification: • Same as binary with larger confusion matrix. • Ordinal classification: (“Hot or not”) • Confusion matrix has special form. • $O(L)$ parameters instead of $O(L^2)$. • Multilabel classification: • Reduce to multiclass on power set. • Assume low-rank confusion matrix.

  33. EM

  34. EM • Initially all workers are assumed moderately accurate and without bias. • Implies initial estimate of ground truth distribution favors consensus. • Disagreeing with the majority is a likely error.

  35. EM • Initially all workers are assumed moderately accurate. • Workers consistently in the minority have their confusion probabilities increase.

  36. EM • Initially all workers are assumed moderately accurate. • Workers consistently in the minority have their confusion probabilities increase. • Workers with higher confusion probabilities contribute less to the distribution of ground truth.
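
The dynamics described on slides 34 to 36 can be seen in a much simpler model than the one in this deck. The sketch below is a deliberate simplification: binary labels, one accuracy number per worker instead of a confusion matrix, no item difficulty, and a uniform prior over the true label. The name `em_worker_accuracy`, the starting accuracy of 0.7, and the smoothing constants are all assumptions of this sketch.

```python
def em_worker_accuracy(votes, n_iter=20, init_acc=0.7):
    """Simplified EM for binary crowd labels (not the alpha/beta model of the deck).

    votes: dict mapping item -> list of (worker, label) pairs, label in {0, 1}.
    Returns (q, acc): q[item] = P(z = 1 | votes), acc[worker] = estimated accuracy.
    """
    workers = {w for pairs in votes.values() for w, _ in pairs}
    acc = {w: init_acc for w in workers}  # everyone starts moderately accurate
    q = {}
    for _ in range(n_iter):
        # E-step: posterior over the true label of each item.
        for item, pairs in votes.items():
            p1 = p0 = 1.0
            for w, label in pairs:
                p1 *= acc[w] if label == 1 else 1.0 - acc[w]
                p0 *= acc[w] if label == 0 else 1.0 - acc[w]
            q[item] = p1 / (p1 + p0)
        # M-step: expected fraction of correct votes, smoothed to stay inside (0, 1).
        correct = {w: 1.0 for w in workers}
        total = {w: 2.0 for w in workers}
        for item, pairs in votes.items():
            for w, label in pairs:
                correct[w] += q[item] if label == 1 else 1.0 - q[item]
                total[w] += 1.0
        acc = {w: correct[w] / total[w] for w in workers}
    return q, acc

# Worker "c" always disagrees with workers "a" and "b".
votes = {i: [("a", 1), ("b", 1), ("c", 0)] for i in range(5)}
votes.update({i: [("a", 0), ("b", 0), ("c", 1)] for i in range(5, 10)})
q, acc = em_worker_accuracy(votes)
print(acc)
```

Because all workers start equally trusted, the first E-step favors the consensus, and the worker who is always in the minority sees the estimated accuracy fall in later iterations.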

  37. “Different” workers are marginalized

  38. “Different” workers are marginalized • Workers that are consistently in the minority will not contribute strongly to the posterior distribution over ground truth. • Even if they are actually more accurate. • Can correct when one or more accurate workers are paired with some inaccurate workers. • Good for breaking ties. • Raykar et al.

  39. Example with real data

  40. Online EM • Given a set of worker-label pairs for a single item: • (Inference) Using current α, find most likely β* and distribution q* over ground truth. • (Training) Do SGD update of α with respect to EM auxiliary function evaluated at β* and q*.

  41. Online EM • Given a set of worker-label pairs for a single item: • (Inference) Using current α, find most likely β* and distribution q* over ground truth. • (Training) Do SGD update of α with respect to EM auxiliary function evaluated at β* and q*.
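
Part of the inference step on slides 40 and 41 can be sketched as follows: given the current worker matrices and a fixed β, compute the distribution q over the ground truth for one item. This sketch does not search for β* and does not perform the SGD update of α. The function name `posterior_over_truth`, the uniform default prior, and the example matrix values are assumptions for illustration.

```python
import math

def posterior_over_truth(labels, alphas, beta, prior=None):
    """q[i] proportional to prior[i] * product over workers of P(l_w | z = i).

    labels: dict worker -> observed label index j for this item.
    alphas: dict worker -> K x K matrix (list of rows), one row per true label i.
    beta: item parameter, held fixed here (finding beta* is not shown).
    """
    k = len(next(iter(alphas.values())))
    if prior is None:
        prior = [1.0 / k] * k
    log_q = []
    for i in range(k):
        total = math.log(prior[i])
        for w, j in labels.items():
            logits = [-beta * a for a in alphas[w][i]]
            m = max(logits)
            log_norm = m + math.log(sum(math.exp(x - m) for x in logits))
            total += logits[j] - log_norm  # log P(l_w = j | z = i)
        log_q.append(total)
    # Normalize in log space to avoid underflow with many workers.
    m = max(log_q)
    unnorm = [math.exp(x - m) for x in log_q]
    s = sum(unnorm)
    return [u / s for u in unnorm]

# Two workers with the same assumed matrix, both voting for label 1.
alpha = [[-1.0, 1.0], [1.0, -1.0]]
q = posterior_over_truth({"a": 1, "b": 1}, {"a": alpha, "b": alpha}, beta=1.0)
print(q)
```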

  42. Things to do with q* • Take an immediate cost-sensitive decision • $d^* = \arg\min_d E_{z \sim q^*}[f(z, d)]$ • Train an (importance-weighted) classifier • cost vector $c_d = E_{z \sim q^*}[f(z, d)]$ • e.g. 0/1 loss: $c_d = 1 - q^*_d$ • e.g. binary 0/1 loss: $|c_1 - c_0| = |1 - 2 q^*_1|$ • No need to decide what the true label is! • Raykar et al.: why not jointly estimate classifier and worker confusion?
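
Both uses of q* on slide 42 reduce to computing an expected cost for each possible decision. A minimal sketch, where the names `expected_costs`, `best_decision`, and `zero_one`, and the example posterior, are hypothetical:

```python
def expected_costs(q, loss):
    """c_d = E_{z ~ q}[loss(z, d)] for every decision d.

    q: dict mapping label -> posterior probability.
    loss: function (true_label, decision) -> cost.
    """
    return {d: sum(q[z] * loss(z, d) for z in q) for d in q}

def best_decision(q, loss):
    """d* = argmin over d of the expected cost."""
    costs = expected_costs(q, loss)
    return min(costs, key=costs.get)

def zero_one(z, d):
    """0/1 loss: no cost when the decision matches the true label."""
    return 0.0 if z == d else 1.0

q_star = {0: 0.2, 1: 0.8}  # example posterior, made up for illustration
costs = expected_costs(q_star, zero_one)
print(costs)
print(best_decision(q_star, zero_one))
print(abs(costs[1] - costs[0]))
```

Under 0/1 loss each cost equals one minus the posterior probability of that label, and in the binary case the gap between the two costs can serve as the importance weight for the training example.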

  43. Raykar et al. insight • Cost vector is constructed by estimating worker confusion matrices. • Subsequently, classifier is trained; it will sometimes disagree with workers. • Would be nice to use that disagreement to inform the worker confusion matrices. • Circular dependency suggests joint estimation.

  44. Generative Model

  45. Generative Model

  46. Online Joint Estimation

  47. Online Joint Estimation • Initially the classifier will output an uninformative prior and therefore will be trained to follow the consensus of workers. • Eventually workers who disagree with the classifier will have their confusion probabilities increase. • Workers consistently in the minority can contribute strongly to the posterior if they tend to agree with the classifier.

  48. Additional Resources • Software • http://code.google.com/p/nincompoop • Blog • http://machinedlearnings.com/
