Learning from the Crowd: Collaborative Filtering Techniques for Identifying On-the-Ground Twitterers during Mass Disruptions

Kate Starbird, University of Colorado Boulder, ATLAS Institute. Grace Muzny, University of Washington, Computer Science. Leysia Palen, University of Colorado Boulder, Computer Science.

Presentation Transcript


  1. Learning from the Crowd: Collaborative Filtering Techniques for Identifying On-the-Ground Twitterers during Mass Disruptions • Kate Starbird, University of Colorado Boulder, ATLAS Institute • Grace Muzny, University of Washington, Computer Science • Leysia Palen, University of Colorado Boulder, Computer Science

  2. Social Media & Mass Disruption Events

  3. Social Media & Mass Disruption Events

  4. Sociologists of disaster: After a disaster event, people will converge on the scene to, among other things, offer help

  5. Sociologists of disaster: After a disaster event, people will converge on the scene to, among other things, offer help • Spontaneous volunteers

  6. Social Media & Mass Disruption Events • Mass Disruption = Mass Convergence

  7. Opportunities for Digital Convergence

  8. Opportunities for Digital Convergence: Citizen Reporting

  9. Opportunities for Digital Convergence: Citizen Reporting • Challenges of Digital Convergence

  10. Opportunities for Digital Convergence: Citizen Reporting • Challenges of Digital Convergence: Volume

  11. Opportunities for Digital Convergence: Citizen Reporting • Challenges of Digital Convergence: Volume, Noise

  12. Opportunities for Digital Convergence: Citizen Reporting • Challenges of Digital Convergence: Volume, Noise, Misinformation & Disinformation

  13. Opportunities for Digital Convergence: Citizen Reporting, Crowd Work! • Challenges of Digital Convergence: Volume, Noise, Misinformation & Disinformation

  14. Signal

  15. Signal

  16. Signal Noise?

  17. Signal

  18. Signal • Starbird, K., Palen, L., Hughes, A.L., & Vieweg, S. (2010). Chatter on The Red: What Hazards Threat Reveals about the Social Life of Microblogged Information. CSCW 2010

  19. Signal • Original Information

  20. Signal • Original Information: first-hand info; new info coming in to the space for the first time

  21. Signal • Original Information: first-hand info; new info coming in to the space for the first time • Derivative Behavior: re-sourced info, reposts, links/URLs, network connections

  22. Signal • Use Derivative Behavior (re-sourced info, reposts, links/URLs, network connections) to find Original Information (first-hand info; new info coming in to the space for the first time)

  23. Signal • Use Derivative Behavior to find Original Information [network diagram: follow, RT, and @mention links between accounts]

  24. Signal • Use Derivative Behavior to find Original Information [network diagram: follow, RT, and @mention links between accounts]

  25. Signal • Use Derivative Behavior to find Original Information [network diagram: follow, RT, and @mention links between accounts]

  26. Signal • Use Derivative Behavior to find Original Information • Collaborative Filtering [network diagram: follow, RT, and @mention links between accounts]

  27. Signal • Use Derivative Behavior to find Original Information • Collaborative Filtering • Crowd Work [network diagram: follow, RT, and @mention links between accounts]

  28. Learning from the Crowd: A Collaborative Filter for Identifying Locals

  29. Learning from the Crowd: A Collaborative Filter for Identifying Locals • Background • Why Identify Locals? • Empirical Study on Crowd Work during Egypt Protests • Test Machine Learning Solution for Identifying Locals • Event - Occupy Wall Street in NYC • Data Collection & Analysis • Findings • Discussion • Leveraging Crowd Work • From Empirical Work to Computational Solutions

  30. Why Help Identify Locals? • Citizen Reporting: first hand info can contribute to situational awareness • Info not already in the larger information space • Digital volunteers often work to identify and create lists of on-the-ground Twitterers

  31. Why Help Identify Locals? • Crisis events vs. protest events

  32. Why Help Identify Locals? • Crisis events vs. protest events • Tunisia Protests - activists tweeting from the ground were a valuable source of info for journalists (Lohan, 2011) • Egypt Protests - protestors on the ground were actively fostering solidarity from the remote crowd (Starbird and Palen, 2012)

  33. Why Help Identify Locals? • Occupy Wall Street (OWS) Protests: Protestors on the ground wanted to publicize their numbers, foster solidarity with the crowd, and solicit assistance • @jeffrae: We could really use a generator down here at Zuccotii Park. Can anyone help? #occupyWallStreet #takewallst #Sept17

  34. Why Help Identify Locals? • Occupy Wall Street (OWS) Protests: Protestors on the ground wanted to publicize their numbers, foster solidarity with the crowd, and solicit assistance • @jeffrae: We could really use a generator down here at Zuccotii Park. Can anyone help? #occupyWallStreet #takewallst #Sept17 • OWS Protests: Remote supporters aggregated and published lists of those on the ground • @CassProphet: Follow on-scene @AACina @Jeffrae @DhaniBagels @Korgasm_ @brettchamberlin #TakeWallStreet #OurWallStreet #OccupyWallStreet #yeswecamp • @djjohnso: We have 20 livetweeters for this list. Are there others? @djjohnso/occupywallstreetlive #takewallstreet #OurWallStreet #needsoftheoccupiers

  35. Empirical Study of Crowd Work during Political Protests

  36. Learning from the Crowd Empirical Study of Crowd Work during 2011 Egypt Revolution • Collected #egypt #jan25 tweets • 2,229,129 tweets • 338,895 Twitterers • Identified most-RTed Twitterers • Determined location for sample

  37. Learning from the Crowd Empirical Study of Crowd Work during 2011 Egypt Revolution • Crowd may work to identify on-the-ground Twitterers

  38. Learning from the Crowd Empirical Study of Crowd Work during 2011 Egypt Revolution • Crowd may work to identify on-the-ground Twitterers • Identified several recommendation and user behavior features that had significant relationships to being “on the ground”

  39. Learning from the Crowd Empirical Study of Crowd Work during 2011 Egypt Revolution • Crowd may work to identify on-the-ground Twitterers • Identified several recommendation and user behavior features that had significant relationships to being “on the ground” • More times retweeted = more likely to be on the ground

  40. Learning from the Crowd Empirical Study of Crowd Work during 2011 Egypt Revolution • Crowd may work to identify on-the-ground Twitterers • Identified several recommendation and user behavior features that had significant relationships to being “on the ground” • More times retweeted = more likely to be on the ground • More unique retweets = more likely to be on the ground

  41. Learning from the Crowd Empirical Study of Crowd Work during 2011 Egypt Revolution • Crowd may work to identify on-the-ground Twitterers • Identified several recommendation and user behavior features that had significant relationships to being “on the ground” • More times retweeted = more likely to be on the ground • More unique retweets = more likely to be on the ground • More followers at beginning of event = less likely to be on the ground

  42. Learning from the Crowd Empirical Study of Crowd Work during 2011 Egypt Revolution • Crowd may work to identify on-the-ground Twitterers • Identified several recommendation and user behavior features that had significant relationships to being “on the ground” • More times retweeted = more likely to be on the ground • More unique retweets = more likely to be on the ground • More followers at beginning of event = less likely to be on the ground • Callout: Feature not available in tweet metadata. Identified through qualitative analysis, then calculated and evaluated through quantitative analysis.
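
The three features listed on this slide could be computed from a collected tweet set roughly as follows. This is a minimal sketch, not the authors' implementation: the Tweet record layout and the function name crowd_features are hypothetical, and "unique retweets" is assumed here to mean the number of distinct tweets by an author that were retweeted at least once.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass
class Tweet:
    # Hypothetical record layout; field names are illustrative only.
    tweet_id: str
    author: str
    timestamp: int                            # seconds since epoch
    author_followers: int                     # follower count attached to this tweet
    retweet_of_id: Optional[str] = None       # id of the retweeted tweet, if any
    retweet_of_author: Optional[str] = None   # author of the retweeted tweet, if any


def crowd_features(tweets):
    """Return {author: {times_retweeted, unique_retweets, initial_followers}}."""
    times_retweeted = defaultdict(int)
    retweeted_ids = defaultdict(set)
    first_seen = {}  # author -> (timestamp, followers) of their earliest tweet

    for t in tweets:
        prev = first_seen.get(t.author)
        if prev is None or t.timestamp < prev[0]:
            first_seen[t.author] = (t.timestamp, t.author_followers)
        if t.retweet_of_author is not None:
            # Credit the retweet to the original author, not the retweeter.
            times_retweeted[t.retweet_of_author] += 1
            retweeted_ids[t.retweet_of_author].add(t.retweet_of_id)

    authors = set(first_seen) | set(times_retweeted)
    return {
        a: {
            "times_retweeted": times_retweeted.get(a, 0),
            # Assumption: distinct original tweets that received any retweet.
            "unique_retweets": len(retweeted_ids.get(a, ())),
            # None when the author was retweeted but never seen tweeting.
            "initial_followers": first_seen[a][1] if a in first_seen else None,
        }
        for a in authors
    }
```

Follower count at the start of the event is approximated by the count attached to each author's earliest collected tweet, which only works when tweets carry profile information.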

  43. Goal: Test Viability of a Machine Learning Solution to Identify Locals using Crowd Recommendation Behavior

  44. Goal: Test Viability of a Machine Learning Solution to Identify Locals using Crowd Recommendation Behavior Move from Empirical Work to Computational Solution

  45. Event: Occupy Wall Street Protests • September 15-21, 2011 • NYC site - Zuccotti Park

  46. Data Collection and Sampling • 270,508 Tweets - Search API, Streaming API

  47. Data Collection and Sampling • 270,508 Tweets - Search API, Streaming API

  48. Data Collection and Sampling • 270,508 Tweets - Search API, Streaming API • 53,296 Total Twitterers

  49. Data Collection and Sampling • 270,508 Tweets - Search API, Streaming API • 53,296 Total Twitterers • 23,847 Twitterers sent >= 2 tweets, allowing us to capture profile change

  50. Data Collection and Sampling • 270,508 Tweets - Search API, Streaming API • 53,296 Total Twitterers • 23,847 Twitterers sent >= 2 tweets, allowing us to capture profile change • Tweets from Streaming API contain Twitter profile information
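
The ">= 2 tweets" sampling step can be illustrated with a short sketch. It assumes each collected tweet carries a snapshot of the sender's profile (as the Streaming API bullet suggests) and, purely for illustration, compares the profile location field between a user's first and last tweet; the slides do not say which profile fields were compared. The tuple layout and the name profile_changes are hypothetical.

```python
from collections import defaultdict


def profile_changes(tweets, min_tweets=2):
    """tweets: iterable of (user, timestamp, profile_location) tuples (hypothetical layout).

    Keeps users with at least `min_tweets` tweets and reports whether the
    profile location differs between their first and last collected tweet.
    """
    by_user = defaultdict(list)
    for user, timestamp, location in tweets:
        by_user[user].append((timestamp, location))

    result = {}
    for user, rows in by_user.items():
        if len(rows) < min_tweets:
            continue  # one tweet gives one profile snapshot, so no change is observable
        rows.sort(key=lambda r: r[0])  # order snapshots by time
        first_loc, last_loc = rows[0][1], rows[-1][1]
        result[user] = {
            "tweets": len(rows),
            "location_changed": first_loc != last_loc,
        }
    return result
```

A user with a single tweet is dropped because one snapshot cannot show a change, which matches the reason the slide gives for the threshold.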
