
Avar

Avar. Health. Sentinel. Public. Dashboard. Health Alert Level. Team Mentor: Avaré Stewart. Monitoring the blogosphere for emerging, health-related events, so Health Officials don't have to. Event-Based Biosurveillance.



Presentation Transcript


  1. Avar Health Sentinel Public Dashboard Health Alert Level • Team Mentor: Avaré Stewart • Monitoring the blogosphere for emerging, health-related events, so Health Officials don't have to

  2. Event-Based Biosurveillance • Monitor time series, textual data to provide early alerts to anomalies ... stimulate investigation of potential outbreaks (Shmueli 2010)

  3. What If ...? • Could we have detected the emergence of the 2009 Swine Flu Pandemic from blogs and social media?

  4. Questions to Consider • What approach can be used to create Simulated Outbreaks for blog text? • What outbreak patterns/signatures exist? • How can text be generated to simulate a given outbreak pattern? • What tools would be useful in creating Simulated Textual and Numerical Outbreak Data?

  5. Feature Selection and Counts for Outbreak Data • Event-Based (Text) Data: • Select textual features, e.g. the number of documents containing mentions of "flu" • Get timestamps • Create counts from features • Time Series Data: • Select features, e.g. hospital visits, death rate • Get timestamps • Create counts from features
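The event-based (text) steps on this slide can be sketched as follows. This is a minimal illustration only: the `DOCS` sample, the `daily_mention_counts` helper, and the simple case-insensitive substring match are assumptions made for the example, not part of the original deck.

```python
from collections import Counter
from datetime import date

# Hypothetical example documents as (timestamp, text) pairs.
DOCS = [
    (date(2009, 4, 24), "Half my office is home with the flu."),
    (date(2009, 4, 24), "Great weather for a picnic today."),
    (date(2009, 4, 25), "Swine flu is all over the news."),
    (date(2009, 4, 25), "Flu symptoms since last night, staying in bed."),
]

def daily_mention_counts(docs, term):
    """Count, per day, the documents whose text mentions `term`.

    Matching is a case-insensitive substring test, so "flu" also matches
    "influenza". Days with no matching document are omitted.
    """
    term = term.lower()
    counts = Counter()
    for day, text in docs:
        if term in text.lower():
            counts[day] += 1
    return dict(sorted(counts.items()))

# Feature: number of documents per day that mention "flu".
FLU_COUNTS = daily_mention_counts(DOCS, "flu")
```

The same select, timestamp, count pipeline would apply to the numerical time series (e.g. hospital visits), with the feature read from a record field instead of matched in text.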

  6. What is the Task? • Using existing data, tools and references: • Part 1: Design an approach to creating Simulated Data from example data • Part 2: Design an approach for adding Noise to the Simulated Data • Part 3: List and summarize any additional tools and approaches that would be useful in creating Simulated Data • Document your design: • Discuss the motivation for your work/results • Outline algorithms with pseudocode • Provide several example inputs and outputs • Highlight strengths and shortcomings
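One possible reading of Parts 1 and 2, sketched under stated assumptions: a simulated series is built by adding an outbreak "signature" to a flat baseline of daily counts, and noise is then added as a bounded random perturbation. The function names, the baseline, the spike shape, and the uniform noise model are all hypothetical choices for illustration; the deck does not prescribe any of them.

```python
import random

def simulate_counts(baseline, signature, start):
    """Add an outbreak signature to baseline counts, beginning at index `start`."""
    counts = list(baseline)
    for offset, extra in enumerate(signature):
        if start + offset < len(counts):
            counts[start + offset] += extra
    return counts

def add_noise(counts, spread, seed=0):
    """Shift each count by a uniform integer in [-spread, spread], clipped at zero."""
    rng = random.Random(seed)  # seeded so the example is reproducible
    return [max(0, c + rng.randint(-spread, spread)) for c in counts]

BASELINE = [5] * 14        # hypothetical: 5 matching documents per day for 14 days
SPIKE = [2, 6, 12, 6, 2]   # hypothetical single-peak outbreak signature
CLEAN = simulate_counts(BASELINE, SPIKE, start=4)
NOISY = add_noise(CLEAN, spread=2)
```

Other signatures (for example a slow ramp or a plateau) can be passed in the same way, which keeps the outbreak pattern and the noise model independent of each other.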

  7. What Will You Learn? • How to build Simulated Data from Event-Based Biosurveillance • How to organize, design, implement, and deliver small-scale project results • That your contributions are valuable ...

  8. Starting Points • Papers: • The Nature of Outbreaks and Their Determination (Shmueli, Section 3.2) • Characteristic shapes of outbreak news (Collier, Figure 5) • Model-Specific Generation: • Data Simulation Using R • http://biostat.mc.vanderbilt.edu/wiki/pub/Main/AngelAn/myslides5.pdf • Random Generation: • "Generating Random Text with Bigrams" • http://nltk.googlecode.com/svn/trunk/doc/book/ch02.html • "How to Generate Random Text in Word 2003" • http://www.ehow.com/how_2183058_generate-random-text-word.html
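The "random generation" starting point can be illustrated with a small bigram generator. This sketch uses only the Python standard library rather than the NLTK code the slide links to, and the sample sentence, function names, and seed are assumptions made for the example.

```python
import random
from collections import defaultdict

def build_bigram_model(text):
    """Map each word to the list of words observed immediately after it."""
    words = text.split()
    model = defaultdict(list)
    for current, following in zip(words, words[1:]):
        model[current].append(following)
    return model

def generate_text(model, seed_word, length, rng):
    """Walk the bigram model from `seed_word`, producing at most `length` words."""
    words = [seed_word]
    for _ in range(length - 1):
        followers = model.get(words[-1])
        if not followers:  # the last word was never followed by anything
            break
        words.append(rng.choice(followers))
    return " ".join(words)

# Hypothetical training text; a real run would use collected blog posts.
SAMPLE = "the flu is spreading and the flu is getting worse and the clinic is full"
MODEL = build_bigram_model(SAMPLE)
TEXT = generate_text(MODEL, "the", 8, random.Random(1))
```

Generating a set number of such documents per day is one way text could be made to follow a target count series, though the generated text is only as realistic as the training sample.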

  9. Background Reading • Statistical Challenges Facing Early Outbreak Detection in Biosurveillance, Galit Shmueli • What's Unusual in Online Disease Outbreak News, Nigel Collier • Detecting Influenza Outbreaks by Analyzing Twitter Messages, Aron Culotta
