A Data Intensive High Performance Simulation & Visualization Framework for Disease Surveillance. Arif Ghafoor, David Ebert, Madiha Sahar, Ross Maciejewski, Shehzad Afzal, Farrukh Arslan.
A Data Intensive High Performance Simulation & Visualization Framework for Disease Surveillance
Arif Ghafoor, David Ebert, Madiha Sahar, Ross Maciejewski, Shehzad Afzal, Farrukh Arslan
Acknowledgement: Project partially funded by Cyber Center
Objective and Goals
Objective: To address infectious disease surveillance challenges and develop a collaborative capability that lets all stakeholders monitor and manage outbreaks of infectious diseases in large cities.
Approach: Develop a high performance computing (HPC) framework employing robust and novel infectious disease epidemiology models with real-time inference and pre/exercise planning capabilities.
Objective and Goals
• Real-time data analysis capabilities, providing a model for infrastructure development where lessons learned can be used to develop best practice models
• A comparative assessment of disease modeling techniques, focusing on the tradeoff between the level of granularity used in creating the model and the model efficacy
• Novel visual analytics paradigms integrating decision support and resource allocation tools with live streaming data and disease simulation scenarios
Conceptual view of Proposed Infectious Disease Surveillance Framework
Tasks:
Task A: Data Intensive Multi-Resolution Simulation Modeling
Task B: High Performance Simulation Modeling on HADOOP
Task A: Initial Research Results
• Challenge: The notion of context is important for syndromic surveillance. A syndromic data set needs:
  • Contextual attributes
  • Behavioral attributes
• We have proposed an HPC data mining framework for contextual and behavioral attributes using a Syndrome Ontology (assumption: domain knowledge is available)
• Currently pursuing system implementation with WEKA: Machine Learning & Data Mining in Java (http://www.cs.waikato.ac.nz/ml/weka/index.html)
Task A: Data Intensive Multi-Resolution Simulation Modeling (initial results)
• Proposed an HPC framework for mining contextual (e.g. spatio-temporal) and behavioral attributes using a Syndrome Ontology
• Domain knowledge is available through a domain ontology
Ontological Syndromic and Climate Classifiers
Exploring decision trees that span multiple distributed domains, representing semantic knowledge at the temporal, spatial, and socio-economic levels.
CoCo Classifier
Developing Novel Statistical Heterogeneous Agent-Based SIR Model
• Adding age-based and gender-based classification
• Demographic impacts on spread rate (socioeconomic classification)
• Capturing seasonal trends of disease spread
• Effect of decision making on preventive measures (inoculation of the population, allocation of healthcare resources)
Task B: High Performance Simulation Modeling on HADOOP (in progress)
Objective: Development of agent-based and multi-granularity homogeneous mixing models for HPC-based simulation.
Task B: High Performance Simulation Modeling on HADOOP
• Development of an agent-based SIR model for heterogeneous networks
• Simulation-based disease spread behavior
• Analysis of decision making for preventive measures
SIR IN HETEROGENEOUS NETWORKS
• Each node can be in one of three states: Susceptible, Infected, or Recovered (S, I, R)
• Once infected, a node can transmit the infection to each neighboring susceptible node with probability β
• Infected nodes stay infected for a duration d
• The recovery rate of infected nodes, υ, is 1/d
• The susceptibility of an individual may vary with the number of infected neighbors
• Within-group interaction parameters:
  β: probability of contracting the disease during a contact
  d: duration of infection
  υ: recovery rate (1/d)
  N: total population
Figure: State diagram of SIR Model
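The node-level rules above can be illustrated with a minimal discrete-time sketch. This is not the project's implementation. The function name `simulate_sir`, the one-step-per-time-unit update, and the assumption that each infected neighbor transmits independently (so a susceptible node with k infected neighbors is infected with probability 1 - (1 - β)^k) are all illustrative choices. Infection lasts a fixed d steps, which corresponds to a mean recovery rate of 1/d.

```python
import random


def simulate_sir(neighbors, beta, d, initial_infected, steps, seed=0):
    """Discrete-time SIR on a contact network (illustrative sketch).

    neighbors: dict mapping each node to a list of neighboring nodes
    beta: per-contact transmission probability per time step
    d: number of time steps a node stays infected (recovery rate 1/d)
    Returns a list of {"S", "I", "R"} counts: the initial state, then one
    entry after each step.
    """
    rng = random.Random(seed)
    state = {n: "S" for n in neighbors}
    days_left = {}
    for n in initial_infected:
        state[n] = "I"
        days_left[n] = d

    def counts():
        c = {"S": 0, "I": 0, "R": 0}
        for s in state.values():
            c[s] += 1
        return c

    history = [counts()]
    for _ in range(steps):
        # Assumption: infected neighbors transmit independently, so the
        # infection probability grows with the number of infected neighbors.
        newly_infected = []
        for n, s in state.items():
            if s != "S":
                continue
            k = sum(1 for m in neighbors[n] if state[m] == "I")
            if k > 0 and rng.random() < 1 - (1 - beta) ** k:
                newly_infected.append(n)
        # Nodes that were infected before this step move toward recovery.
        for n in list(days_left):
            days_left[n] -= 1
            if days_left[n] == 0:
                state[n] = "R"
                del days_left[n]
        # Apply new infections after recoveries so updates are synchronous.
        for n in newly_infected:
            state[n] = "I"
            days_left[n] = d
        history.append(counts())
    return history
```

For example, `simulate_sir({0: [1], 1: [0, 2], 2: [1]}, beta=0.3, d=4, initial_infected=[0], steps=20)` returns the S, I, and R counts for a three-node chain over 20 steps. The synchronous update means a node infected in a step cannot infect others until the next step.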
SOCIAL NETWORK MODELING FOR PREDICTION & MANAGEMENT OF EPIDEMICS
• Development of an agent-based social networking model to simulate the spread of infectious disease
• The population is divided into groups by age, gender, occupation, and location, a phenomenon known as Assortative Mixing
• The distribution of contacts plays a key role in determining the onset of the expansion phase of an epidemic
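One simple way to picture a grouped population with assortative mixing is a random contact graph in which members of the same group are linked more often than members of different groups. The sketch below is an assumption-laden illustration, not the model used in the project: the function name `build_group_network` and the two probabilities `p_within` and `p_between` are hypothetical, and a real model would likely use group-pair-specific contact rates.

```python
import random
from itertools import combinations


def build_group_network(group_of, p_within, p_between, seed=0):
    """Random contact graph with group-dependent link probabilities.

    group_of: dict mapping each individual to a group label
              (e.g. an age band, occupation, or location)
    p_within: probability of a contact between two members of one group
    p_between: probability of a contact between members of different groups
    Returns a dict mapping each individual to a list of contacts.
    """
    rng = random.Random(seed)
    neighbors = {n: [] for n in group_of}
    for a, b in combinations(list(group_of), 2):
        # Assortative mixing: choose p_within > p_between so that
        # same-group pairs are more likely to be in contact.
        p = p_within if group_of[a] == group_of[b] else p_between
        if rng.random() < p:
            neighbors[a].append(b)
            neighbors[b].append(a)
    return neighbors


def degree_distribution(neighbors):
    """Count how many individuals have each number of contacts."""
    dist = {}
    for contacts in neighbors.values():
        dist[len(contacts)] = dist.get(len(contacts), 0) + 1
    return dist
```

Calling `degree_distribution(build_group_network(groups, 0.2, 0.01))` on a hypothetical `groups` mapping gives the distribution of contacts, the quantity the slide identifies as driving the onset of the expansion phase.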
HETEROGENEOUS GRAPH MODEL FOR MULTI-GROUP POPULATION INTERACTION
CURRENT STATUS
• Development of heterogeneous models and evaluation of their fidelity, with simulation in NetLogo
• Simulation objectives:
  • Effect of demographic properties
  • Effect of weather on epidemic disease spread and seasonal trends
  • Effect of pharmaceutical and other decision measures on epidemic spread
Summary and Status
• Proposed an HPC-based data mining framework for contextual and behavioral attributes using a Syndrome Ontology (assumption: domain knowledge is available)
• Currently pursuing system implementation with WEKA: Machine Learning & Data Mining in Java
• Development of an agent-based SIR heterogeneous population model for HPC-based simulation of large cities (in progress)
• Proposal (in preparation): Gates Foundation Grand Challenges Explorations for Global Health
• Potential collaboration with MSR