
The Power of Crowdsourcing: What can (and cannot) be predicted with Social Media


Presentation Transcript


  1. The Power of Crowdsourcing: What can (and cannot) be predicted with Social Media
     CS 315 – Web Search and Data Mining

  2. Overview
     • The power of crowdsourcing
     • Predicting flu outbreaks
     • Predicting “the present” through Google Insights!
     • Predicting movie success!
     • Predicting elections!
     • Predicting elections?
     • What can (and cannot) be predicted
     • How (not to) predict

  3. Tracking seasonal flu through the CDC.gov map (map taken on April 18)
     • Based on reports from hospitals
     • Takes a couple of weeks to record

  4. google.org/flutrends/us (map taken on April 18)
     • Based on keywords being searched
     • It is updated immediately
     • Data can be downloaded and studied

  5. Why does it work so well? “close relationship between how many people search for flu-related topics and how many people actually have flu symptoms”

  6. Google Trends predicts flu outbreak!

  7. Observing the crowd
     • It makes sense: people search about things they want to be informed about, including flu symptoms
     • Another example: on which day of the week are there the most queries containing the term “hangover”?

  8. Observing the crowd
     • It makes sense: people search about things they want to be informed about, including flu symptoms
     • Another example: on which day of the week are there the most queries containing the term “hangover”?
     • “Civil war”: what do you expect to see?

  9. Predicting “the future”
     Geography, time window, category
     • Sample data
       • Not identical when repeated
       • Preserve privacy
     • Normalized data
       • Peak at 100%
     • You can disambiguate
       • Apple in computer & electronics
       • Apple in food & drink
     • Downloadable
       • Must be logged in
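
The “normalized data, peak at 100%” point can be illustrated with a small sketch. It assumes that normalization simply means dividing every value by the series maximum and scaling to 100; the function name and the sample counts are hypothetical, and Google's actual processing (sampling, privacy thresholds) is not reproduced here.

```python
def normalize_to_peak(values):
    """Scale a series so that its largest value becomes 100 (assumed definition)."""
    peak = max(values)
    if peak == 0:
        # An all-zero series has no peak to scale to.
        return [0.0 for _ in values]
    return [100.0 * v / peak for v in values]


weekly_counts = [120, 300, 240, 60]  # made-up raw query counts
print(normalize_to_peak(weekly_counts))  # prints [40.0, 100.0, 80.0, 20.0]
```

One consequence of this scaling is that two normalized series are comparable in shape but not in absolute volume.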

  10. Basic Econometrics Forecasting Models
      • Autoregressive: value at time t depends on value at time t-1
      • Seasonal adjustment: value at time t depends on value at time t-12
      • Transfer function: value at time t depends on other contemporaneous or lagging variables
      • Seasonal autoregressive transfer model: value at time t depends on
        • Value at time t-12 (seasonality)
        • Value at time t-1 (recent behavior)
        • Other lagging or contemporaneous variables (such as Google Trends data)
      • Typical question of interest
        • How much more accurate are the forecasts with the additional variables, over and above the accuracy obtained from the history of the time series itself?
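
The “typical question of interest” can be sketched on synthetic data: fit a seasonal autoregressive model with and without one extra contemporaneous variable and compare the errors. Everything here is an assumption made for illustration: the helper names, the made-up monthly series, the use of ordinary least squares, and the in-sample comparison (a real study would evaluate out-of-sample forecasts).

```python
import random


def solve(A, b):
    """Solve the square system A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [A[i][:] + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        known = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - known) / M[i][i]
    return x


def least_squares(X, y):
    """Ordinary least squares via the normal equations (X'X) beta = X'y."""
    k = len(X[0])
    XtX = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    Xty = [sum(r[i] * t for r, t in zip(X, y)) for i in range(k)]
    return solve(XtX, Xty)


def design(y, x=None, season=12):
    """Build rows [1, y[t-1], y[t-season], (x[t])] with target y[t]."""
    rows, targets = [], []
    for t in range(season, len(y)):
        row = [1.0, y[t - 1], y[t - season]]
        if x is not None:
            row.append(x[t])  # the extra contemporaneous variable
        rows.append(row)
        targets.append(y[t])
    return rows, targets


def mean_abs_error(X, y, beta):
    preds = [sum(b * v for b, v in zip(beta, row)) for row in X]
    return sum(abs(p - t) for p, t in zip(preds, y)) / len(y)


# Synthetic monthly data: y depends on its own lags and on x (all values made up).
rng = random.Random(0)
x = [rng.uniform(0, 10) for _ in range(120)]
y = [rng.uniform(60, 90) for _ in range(12)]
for t in range(12, 120):
    y.append(5 + 0.5 * y[t - 1] + 0.3 * y[t - 12] + 2 * x[t])

X_base, target = design(y)
X_full, _ = design(y, x)
mae_base = mean_abs_error(X_base, target, least_squares(X_base, target))
mae_full = mean_abs_error(X_full, target, least_squares(X_full, target))
print(f"MAE without extra variable: {mae_base:.3f}")
print(f"MAE with extra variable:    {mae_full:.3f}")
```

Because the synthetic series was generated from exactly this model, the fit that includes the extra variable is nearly perfect; with real data the gain is usually far smaller.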

  11. Analysis and Forecasting
      Method: fit the other data as well as you can, then add Trends data to improve the prediction
      Model: Y_t = 446.1 + 0.864 * Y_{t-1} - 4.340 * us378.1 + 4.198 * us96.2 - 0.001 * AvgP_{t-1}
      • Y_t: new houses sold in month t
      • AvgP_{t-1}: average sales price of new one-family houses sold in month t-1
      • us378.1: Google Trends for vertical id = 378 (Rental Listings & Referrals), 1st week of month t
      • us96.2: Google Trends for vertical id = 96 (Real Estate Agent), 2nd week of month t
      July 2008: Actual = 515K, Predicted = 442.98K, Z-score = 2.53
      August 2008: Prediction = 417.52K
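
As a minimal sketch, the fitted model on this slide can be written as a function, and the reported z-score can be used to back out the residual spread it implies. The coefficients and the July 2008 figures come from the slide; the function names are hypothetical, the input units are not stated in the source, and the z-score is assumed to be (actual - predicted) / standard deviation.

```python
def predict_new_home_sales(prev_sales, us378_1, us96_2, prev_avg_price):
    """Evaluate the slide's model for one month (coefficients copied from the slide)."""
    return (446.1 + 0.864 * prev_sales - 4.340 * us378_1
            + 4.198 * us96_2 - 0.001 * prev_avg_price)


def implied_residual_sd(actual, predicted, z_score):
    """Standard deviation implied by z = (actual - predicted) / sd (assumed definition)."""
    return (actual - predicted) / z_score


# July 2008 figures from the slide, in thousands of houses.
sd = implied_residual_sd(515.0, 442.98, 2.53)
print(round(sd, 2))  # prints 28.47
```

Under that assumption, the July 2008 miss of about 72K houses corresponds to a residual standard deviation of roughly 28.5K.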

  12. Google Trends “can predict the present”

  13. Predicted with Google Trends
      • Home sales
      • Movie box-office success
      • Product sales (e.g., video games)
      • Travel to Hong Kong
      • Unemployment rates
      • …Consumer behavior, in general? (Goel paper)
      Is there anything that could NOT be predicted with Google Trends?
      Is Twitter chat volume as good?

  14. Twitter Predicts Movie Box-Office Sales!

  15. Movie buzz creates tweets…
      The rate at which movie tweets are generated can be used to build a powerful model for predicting movie box-office revenue (better than the “gold-standard” Hollywood Stock Exchange).
      Tweet-rate(movie) = tweets(movie)/hour
      Predictions (linear regression):
      • 7-days before release data
      • thent: # theaters playing
      • HSX index
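
A minimal sketch of the tweet-rate idea follows: compute tweets per hour over the week before release, then fit a one-variable linear regression of revenue on that rate. The slide's model also uses the theater count and the HSX index; this sketch leaves them out, and the function names and all numbers are made up for illustration.

```python
def tweet_rate(tweet_count, hours):
    """Tweet-rate(movie) = tweets(movie) / hour, as defined on the slide."""
    return tweet_count / hours


def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept for a single predictor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


HOURS_IN_WEEK = 7 * 24  # the 7 days before release

# Hypothetical tweet counts and opening revenues (in $ millions) for four movies.
rates = [tweet_rate(c, HOURS_IN_WEEK) for c in (1680, 3360, 5040, 6720)]
revenue = [7.0, 12.0, 17.0, 22.0]

slope, intercept = fit_line(rates, revenue)
new_movie_rate = tweet_rate(4200, HOURS_IN_WEEK)
print(slope * new_movie_rate + intercept)
```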

  16. Twitter monitors Poll Sentiment (!) For more information, see “oconnor – tweets to polls AAPOR panel.ppt”

  17. Smoothed (15 days) comparisons: SentimentRatio(“jobs”)
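
The slide does not define SentimentRatio. The sketch below assumes it is the daily count of positive messages mentioning the keyword divided by the daily count of negative ones, and that smoothing is a trailing moving average (15 days on the slide). The function names, the handling of the first few days, and the counts are all assumptions.

```python
def sentiment_ratio(pos_counts, neg_counts):
    """Daily ratio of positive to negative messages (assumes no day has zero negatives)."""
    return [p / n for p, n in zip(pos_counts, neg_counts)]


def trailing_mean(series, window=15):
    """Average each day with up to window - 1 preceding days."""
    smoothed = []
    for i in range(len(series)):
        chunk = series[max(0, i - window + 1): i + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


# Made-up daily counts of messages containing the keyword.
positive = [4, 6, 9, 10]
negative = [2, 3, 3, 2]

daily = sentiment_ratio(positive, negative)
print(trailing_mean(daily, window=2))  # prints [2.0, 2.0, 2.5, 4.0]
```

A longer window gives a smoother curve but reacts more slowly to real shifts in sentiment.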

  18. US Presidential elections not predicted
      • 2008 elections
        • SR(“obama”) and SR(“mccain”) sentiment do not correlate
        • But “obama” and “mccain” volume: r = .79, .74 (!)
        • Simple indicator of election news?
      • 2009 job approval
        • SR(“obama”): r = .72
        • Looks easier: simple decline
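
The r values on this slide are correlation coefficients between a Twitter series and a poll series. As a hedged sketch, assuming r is the Pearson correlation, it can be computed as below; the function name and the two short series are invented for illustration.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Made-up smoothed sentiment values and poll percentages over five periods.
sentiment = [1.9, 1.7, 1.6, 1.4, 1.3]
approval = [62.0, 60.0, 57.0, 55.0, 52.0]
print(round(pearson_r(sentiment, approval), 2))
```

Two series that both decline steadily correlate strongly whatever their cause, which is one reading of the slide's “looks easier: simple decline” remark.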

  19. In the meantime, in Germany…

  20. Twitter can Predict Elections (?!) For more info, see “icwsm2010_Tumasjan-Predicting elections with Twitter.pdf”

  21. Not so fast, speedy… It seems that they forgot the party with the biggest tweet share…
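
The objection on this slide can be made concrete with a small sketch. It assumes the prediction method uses each party's share of all party mentions on Twitter as a stand-in for its vote share, so the outcome depends on which parties are counted. The party labels, the counts, and the function name are hypothetical.

```python
def tweet_shares(mention_counts, exclude=()):
    """Share of mentions per party, computed only over the parties that are kept."""
    kept = {p: c for p, c in mention_counts.items() if p not in exclude}
    total = sum(kept.values())
    return {p: c / total for p, c in kept.items()}


# Made-up mention counts for three hypothetical parties.
mentions = {"A": 50, "B": 30, "C": 20}

print(tweet_shares(mentions))                 # prints {'A': 0.5, 'B': 0.3, 'C': 0.2}
print(tweet_shares(mentions, exclude=("A",)))  # prints {'B': 0.6, 'C': 0.4}
```

Dropping the most-mentioned party changes both the predicted winner and every other party's share, which is why the choice of parties has to be fixed before looking at the data.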

  22. Maybe Google Trends can predict US Elections…

  23. Can Google Trends predict elections?
      • 2008 US Congressional Elections Data Collection
      • 2010 US Congressional Elections Data Collection
      • The Competitors for Prediction:

  24. US congressional elections 2008 & 2010

  25. Prediction of all races (unfair to Google Trends)

  26. Prediction of races where one candidate had no G-trends visibility

  27. Prediction of races where both candidates had G-trends visibility

  28. What about the one success case?

  29. Conclusions
      • Google Trends: bad predictor of election results
      • Google Trends: good predictor of election defeat!
      • But what about other social media?
      What do YOU think?

  30. High G-trends may be bad news!
      • Liberal activists openly collaborate to Google-bomb search results of political opponents in 2006
      • Liberal activists try again, unsuccessfully, in 2010
      • Conservative activists launch a Twitter-bomb in Jan. 2010
