Tracking the Flu Pandemic by Monitoring the Social Web

Vasileios Lampos and Nello Cristianini. Jedsada Chartree, 04/11/11.

Presentation Transcript


  1. Tracking the Flu Pandemic by Monitoring the Social Web. Vasileios Lampos and Nello Cristianini. Jedsada Chartree, 04/11/11

  2. Introduction
     • Growing interest in monitoring disease outbreaks.
     • Growing number of Twitter users:
       - February 2010: 50 million tweets/day
       - June 2010: 65 million tweets/day (750 tweets/s)
       - 190 million users (Source: http://en.wikipedia.org/wiki/Twitter)
       - 5.5 million users in the UK (2009)

  3. Introduction
     • Official national statistics report flu rates with a delay of 1 to 2 weeks.
     • Twitter can reveal the current situation.

  4. Methodology
     • Data
       1. Official health reports from the Health Protection Agency (HPA), UK.
       2. Twitter, UK
          - Daily average of 160,000 tweets (24 weeks, from 06/22/2009 to 12/06/2009)
          - Twitter geolocation (geographical coordinates)

  5. Methodology
     • Data
       - Region A = Central England & Wales
       - Region B = South England
       - Region C = North England
       - Region D = England & Wales
       - Region E = Wales & Northern Ireland
     RCGP = Royal College of General Practitioners
     Qsur = Qsurveillance, University of Nottingham and Egton Medical Information Systems

  6. Methodology [flow diagram with the elements: HPA flu rates, Twitter data, flu-score, correlation coefficient]

  7. Methodology
     • Flu-score
       - k = total number of markers
       - n = total number of tweets for one day
       - i ∈ [1, k], j ∈ [1, n]
       - M = set of textual markers = {m_i}
       - T = daily set of tweets = {t_j}
       - s(t_j) = the flu-score of a tweet
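The quantities on slide 7 can be turned into a small computation. The following is a minimal sketch, assuming that each marker counts 1 when it occurs in a tweet and 0 otherwise, that a tweet's flu-score is the fraction of the k markers it contains, and that the daily flu-score is the mean over the n tweets of that day. The marker list, the example tweets, and the function names are hypothetical.

```python
def tweet_flu_score(tweet, markers):
    """Fraction of the k markers that occur in one tweet (assumed binary presence)."""
    text = tweet.lower()
    hits = sum(1 for marker in markers if marker.lower() in text)
    return hits / len(markers)


def daily_flu_score(tweets, markers):
    """Mean of the per-tweet flu-scores over the n tweets of one day."""
    if not tweets:
        return 0.0  # assumption: a day without tweets scores zero
    return sum(tweet_flu_score(t, markers) for t in tweets) / len(tweets)


# Hypothetical markers and tweets, for illustration only.
markers = ["fever", "sore throat", "cough", "headache"]
tweets = [
    "Stuck in bed with a fever and a bad cough",
    "Great match last night",
]
score = daily_flu_score(tweets, markers)
```

Plain substring matching is used here for brevity; a real pipeline would more likely tokenise and stem the tweets first, as the stemmed markers on the later slides suggest.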

  8. Results Flu rates from the Health Protection Agency (HPA)

  9. Results Twitter’s flu-scores for regions A-E (weeks 26 to 49, 2009)

  10. Results Correlation coefficients between Twitter’s flu-score and HPA’s rates
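The comparison on slide 10 can be sketched as follows, assuming the coefficient is Pearson's linear correlation between a flu-score time series and the HPA rates for the same period. The two series below are made-up numbers, not values from the slides, and `pearson` is a hypothetical helper name.

```python
import math


def pearson(x, y):
    """Pearson linear correlation coefficient of two equal-length series."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two series of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0.0 or var_y == 0.0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


# Made-up series for illustration only.
flu_scores = [0.8, 1.1, 1.9, 3.2, 2.7, 1.5]
hpa_rates = [10.0, 14.0, 25.0, 41.0, 33.0, 20.0]
r = pearson(flu_scores, hpa_rates)
```

Both series must cover the same days or weeks in the same order before they are compared.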

  11. Results Twitter’s flu-score and HPA rates for region D (England & Wales)

  12. Methodology
     • Learning HPA’s flu rates from the Twitter flu-score
       - k = total number of markers
       - n = total number of tweets for one day
       - i ∈ [1, k], j ∈ [1, n]
       - M = set of textual markers = {m_i}
       - T = daily set of tweets
       - w = weight of a marker
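One way to learn the weights w from slide 12 is ordinary least squares: each day is described by one value per marker (for example, the fraction of that day's tweets containing the marker), and the weights are chosen so that the weighted sum matches the HPA rate as closely as possible. This is a sketch under those assumptions, with no intercept term, made-up data, and hypothetical function names; it is not taken from the slides.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular system")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def least_squares_weights(X, y):
    """Weights minimising ||X w - y||^2, via the normal equations."""
    k = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * target for row, target in zip(X, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def weighted_flu_score(day_features, weights):
    """Weighted sum of one day's per-marker values."""
    return sum(w * f for w, f in zip(weights, day_features))


# Made-up data: rows are days, columns are per-marker values.
X = [[0.1, 0.0], [0.0, 0.1], [0.1, 0.1], [0.2, 0.1]]
y = [2.0, 5.0, 7.0, 9.0]  # made-up HPA rates
weights = least_squares_weights(X, y)
prediction = weighted_flu_score([0.1, 0.2], weights)
```

The normal equations are adequate for a handful of markers; with many correlated markers the system becomes ill-conditioned, which is one motivation for a shrinkage method such as the LASSO.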

  13. Results Linear regression using the markers

  14. Methodology
     • Automatic extraction of ILI textual markers
       1. Creating candidate markers from:
          - Encyclopedic references
          - Informal references
       2. Forming the flu-subscore time series.
          - Ranking the weights by applying the LASSO method.

  15. Methodology: LASSO
     - t = shrinkage parameter
     - w = the sparse solution vector
     - w(ls) = the least-squares estimate for the regression problem
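The effect of the LASSO on slide 15 can be illustrated with a small solver. The sketch below uses coordinate descent on the penalised form, minimising (1/2)·||Xw − y||² + λ·||w||₁, which is a swapped-in formulation: it corresponds to the constrained form with a shrinkage parameter for a matching value of λ, and it is not necessarily the solver used in the presented work. The data and function names are hypothetical.

```python
def soft_threshold(value, lam):
    """Shrink value towards zero by lam; values within [-lam, lam] become 0."""
    if value > lam:
        return value - lam
    if value < -lam:
        return value + lam
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=200):
    """Minimise 0.5 * ||X w - y||^2 + lam * ||w||_1 by cyclic coordinate descent."""
    k = len(X[0])
    w = [0.0] * k
    col_sq = [sum(row[j] ** 2 for row in X) for j in range(k)]
    for _ in range(n_iter):
        for j in range(k):
            if col_sq[j] == 0.0:
                continue  # a marker that never occurs keeps weight 0
            rho = 0.0
            for row, target in zip(X, y):
                # Prediction from all markers except j.
                partial = sum(row[l] * w[l] for l in range(k) if l != j)
                rho += row[j] * (target - partial)
            w[j] = soft_threshold(rho, lam) / col_sq[j]
    return w


# Made-up data: rows are days, columns are candidate markers.
X = [[1.0, 0.0, 1.0], [0.0, 1.0, 1.0], [1.0, 1.0, 0.0], [2.0, 1.0, 1.0]]
y = [2.0, 1.0, 3.0, 5.0]
sparse_w = lasso_coordinate_descent(X, y, lam=0.5)
selected = [j for j, weight in enumerate(sparse_w) if weight != 0.0]
```

Markers whose weight is driven exactly to zero drop out of the model, so the surviving indices in `selected` play the role of the extracted markers; a larger `lam` keeps fewer of them.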

  16. Methodology Stemmed markers extracted by applying LASSO regionally

  17. Results Linear regression using the markers on the test sets after performing LASSO

  18. Methodology Stemmed markers extracted by applying LASSO on the aggregated data

  19. Conclusion
     • Tracking the flu outbreak in the UK using Twitter messages.
     • High correlation (greater than 95%) between the flu-score and the HPA flu rates.

  20. Reference
     • V. Lampos and N. Cristianini. 2010. Tracking the flu pandemic by monitoring the Social Web. International Workshop on Cognitive Information Processing. 6 pp.
