
Crowdsourcing Complexity: Lessons for Software Engineering



Presentation Transcript


  1. Crowdsourcing Complexity:Lessons for Software Engineering Lydia Chilton 2 June 2014 ICSE Crowdsourcing Workshop

  2. Clarification: Human Computation Mechanical Turk Microtasks:

  3. 2007: JavaScript Calculator

  4. 2007: JavaScript Calculator

  5. Evolution of Complexity in Human Computation Task Decomposition: Cascade & Frenzy

  6. Evolution of Complexity

  7. 1. Collective Intelligence 1906: 787 aggregated guesses of an ox’s weight averaged 1,197 lbs. Actual weight: 1,198 lbs.

  8. 1. Collective Intelligence Principles: • Small tasks • Intuitive tasks • Independent answers • Simple aggregation Application: - ESP Game
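
Taken together, these principles say: collect many small, intuitive, independent answers and combine them with a trivial aggregation step. Below is a minimal Python sketch, with illustrative data and function names that are not from the talk: averaging independent numeric guesses (as in the 1906 example) and keeping only labels that multiple independent workers agree on (in the spirit of the ESP Game).

```python
from collections import Counter
from statistics import mean

def aggregate_estimates(guesses):
    """Simple aggregation of many small, independent numeric answers."""
    return mean(guesses)

def aggregate_labels(labels):
    """Keep only labels that at least two independent workers agree on."""
    return [label for label, n in Counter(labels).items() if n >= 2]

print(aggregate_estimates([1150, 1210, 1232, 1196]))              # 1197.0
print(aggregate_labels(["dog", "puppy", "dog", "grass", "dog"]))  # ['dog']
```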

  9. 2. Iterative Workflows: work → improve → vote → improve → vote → … (builds on collective intelligence)

  10. 2. Iterative Workflows Principles: • Use fresh eyes • Vote to ensure improvement Application: - Bug finding “given enough eyeballs, all bugs are shallow”
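
As a rough sketch of the work / improve / vote loop, here is a hedged Python version with placeholder crowd tasks. In a real system each placeholder would post a task to a platform such as Mechanical Turk; the function names and simulated behavior are illustrative only.

```python
import random

def improve_task(text):
    """Placeholder: a fresh worker returns a candidate revision of the text."""
    return text + " (revised)"  # simulated edit

def vote_task(current, candidate):
    """Placeholder: workers vote on whether the candidate is an improvement."""
    return candidate if random.random() > 0.5 else current

def iterative_workflow(seed_text, rounds=3):
    best = seed_text
    for _ in range(rounds):
        candidate = improve_task(best)     # fresh eyes on the current best version
        best = vote_task(best, candidate)  # voting guards against regressions
    return best

print(iterative_workflow("A blurry photo of a street sign"))
```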

  11. 3. Psychological Boundaries

  12. 3. Psychological Boundaries Applications: • Manager / programmer • Writer / editor • Write code / test code • Addition / subtraction Principles: • Task switching is hard • Respect natural boundaries between tasks

  13. 4. Task Decomposition Legion:Scribe Real Time Audio Captioning on MTurk

  14. 4. Task Decomposition Principles: • Must be able to break apart tasks AND put them back together. • Complex aggregation • Hint: Solve backwards. Find what people can do, and build up from there.
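
To make the break-apart-and-reassemble idea concrete, here is a toy Python sketch in the spirit of the Legion:Scribe example, under a big simplifying assumption: the audio has been split into chunks whose captions overlap by a fixed number of words. The real system performs far more careful alignment; this only illustrates that decomposition is useless without an aggregation step that puts the pieces back together.

```python
def stitch(captions, overlap_words=2):
    """Join per-chunk captions, skipping the words repeated in each overlap."""
    merged = captions[0].split()
    for caption in captions[1:]:
        merged.extend(caption.split()[overlap_words:])  # drop already-covered words
    return " ".join(merged)

chunk_captions = [
    "the quick brown fox",
    "brown fox jumps over",
    "jumps over the lazy dog",
]
print(stitch(chunk_captions))  # the quick brown fox jumps over the lazy dog
```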

  15. 5. Worker Choice Mobi: Trip Planning on MTurk with an open UI.

  16. 5. Worker Choice Applications: • Trip planning • Conference time table • Conference session-making Principles: • Giving workers freedom relieves requesters’ burden of task decomposition. • Workers feel more involved and empowered. • But the open-ended interface is complex and difficult to scale.

  17. 6. Learning and Doing

  18. 6. Learning and Doing Applications: • Peer assessment • Grade others’ assignments before doing your own • Task feedback Principles: • Teaching workers makes them better. • How long will they stay?

  19. Lessons for Software Engineering • Propose and vote • Find natural psychological boundaries between tasks (e.g., switching between 221 + 473 and −221 + 473) • Find the tasks people can do, then assemble them using complex aggregation techniques • Teach

  20. Evolution of Complexity in Human Computation Task Decomposition: Cascade & Frenzy

  21. Task decomposition is the key to crowdsourcing software engineering

  22. Cascade: Crowdsourcing Taxonomy Creation. Lydia Chilton (UW), Greg Little (oDesk), Darren Edge (MSR Asia), Dan Weld (UW), James Landay (UW)

  23. Problem

  24. 1000 eGovernment suggestions • 50 top product reviews • 100 employee feedback comments • 1000 answers to “Why did you decide to major in Computer Science?” Machines can’t analyze this data, and people don’t have time to: it is time consuming, overwhelming, and there is no right answer.

  25. Solution

  26. Solution: Crowdsourced Taxonomies

  27. Toy Application: Colors

  28. Initial Prototypes

  29. Iterative Improvement. Problems: the hierarchy grows and becomes overwhelming; workers have to decide what to do. Lesson: break up the task more.

  30. Initial Approach 2: Category Comparison. Problem: without context, it’s hard to judge relationships (flying vs. flights; TSA liquids vs. removing liquids; packing vs. what to bring). Lesson: don’t compare abstractions to abstractions; instead compare data to abstractions.

  31. Use Lesson #3: find the tasks people can do, then assemble them using complex aggregation techniques. The three tasks: Generate Labels → Select Best Labels → Categorize.

  32. Cascade Algorithm: for a subset of items, Generate Labels → Select Best Labels → {good labels}; then, for all items and all good labels, Categorize; then recurse.
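
A minimal Python sketch of this control flow, with placeholder crowd tasks standing in for real Mechanical Turk HITs; the subset size, the simulated votes, and the stopping rule are assumptions for illustration, not the paper’s exact settings.

```python
import random

def generate_labels(items):
    """Placeholder: workers propose a category label for each item in a small subset."""
    return {f"label-for-{item}" for item in items}

def select_best_labels(labels):
    """Placeholder: workers vote; keep only the labels that win enough votes."""
    return {label for label in labels if random.random() > 0.3}

def categorize(items, labels):
    """Placeholder: for every (item, label) pair, workers judge whether the item fits."""
    return {label: [i for i in items if random.random() > 0.5] for label in labels}

def cascade(items, subset_size=8, min_items=3, depth=2):
    subset = random.sample(items, min(subset_size, len(items)))
    good_labels = select_best_labels(generate_labels(subset))
    taxonomy = {}
    for label, members in categorize(items, good_labels).items():
        if depth > 0 and len(members) > min_items:
            taxonomy[label] = cascade(members, subset_size, min_items, depth - 1)  # recurse
        else:
            taxonomy[label] = members
    return taxonomy

print(cascade([f"color-{i}" for i in range(20)]))
```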

  33. Aggregate Data into Taxonomy (diagram: the per-label categorizations for Blue, Light Blue, Green, and Other are merged into one taxonomy, handling redundant labels, nested labels such as Light Blue under Blue, and singletons)

  34. Cascade Results: 100 Colors

  35. How can we get a global picture from workers who see only subsets of the data?

  36. Propose, Vote, Test. Workers have good heuristics: let them propose categories. Vote on categories to weed out bad ones. Test the heuristics by verifying them on the data. Propose → Vote → Test
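
A small hedged sketch of the Propose / Vote / Test pattern in Python; the vote threshold and the substring-based “test on data” check are illustrative assumptions. The point is that a proposed category is kept only if it survives both a vote and a check against the actual data.

```python
def propose_vote_test(data, proposals, votes, coverage_threshold=0.05):
    kept = []
    for category in proposals:
        if votes.get(category, 0) < 2:        # Vote: weed out unpopular proposals
            continue
        matches = [d for d in data if category.lower() in d.lower()]  # Test on the data
        if len(matches) / len(data) >= coverage_threshold:
            kept.append(category)
    return kept

data = ["dark blue sky", "light blue water", "green grass", "blue jeans"]
print(propose_vote_test(data, ["Blue", "Purple"], votes={"Blue": 3, "Purple": 1}))  # ['Blue']
```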

  37. Lesson Propose, Vote, Test.

  38. Deploy Cascade to Real Needs • CHI 2013 Program Committee: organize 430 accepted papers to help with session making • 40 CrowdCamp hack-a-thon participants: organize 100 hack-a-thon ideas to help organize teams

  39. 430 CHI Papers: Good Results, but… (screenshot of the raw Cascade output: a long, partly garbled dump of paper titles grouped under the generated categories, e.g. Visualization (19) with subcategories evaluating infovis (9), text (2), video (6), visualizing time data (5), gaze (4), gaze tracking (3), user requirements (3), and color schemes (2))

  40. “Don’t treat me like a Turker.” “I just want to see all the data”

  41. Lesson Authority and Responsibility should be aligned.

  42. Frenzy: Collaborative Data Organization for Creating Conference Sessions Lydia Chilton (UW), Juho Kim (MIT), Paul Andre (CMU), Felicia Cordeiro (UW), James Landay (Cornell?), Dan Weld (UW), Steven Dow (CMU), Rob Miller (MIT), Haoqi Zhang (NW)

  43. Groupware. Creating conference sessions is a social process. Grudin: social processes are often guided by personalities, tradition, and convention. Challenge: support the process without seeking to replace these behaviors. Challenge: remain flexible and do not impose rigid structures.

  44. DEMO

  45. Light-weight contributions: Label, Vote, Categorize

  46. 2-Stage Workflow. Stage 1 (Collect Meta Data): 60 PC members, low authority, low responsibility. Stage 2 (Session Making): 11 PC members, high authority, high responsibility.

  47. Goals: collect data (labels, votes) and session-making.

  48. Results: sessions created in a record-setting 88 minutes.
