Efficient Implementation of Complex Interventions in Large Scale Epidemic Simulations
Jiangzhuo Chen
Joint work with Yifei Ma, Keith Bisset, Suruchi Deodhar, and Madhav Marathe
Winter Simulation Conference, December 14, 2011
Network Dynamics & Simulation Science Laboratory

Talk Outline
• Background
  • Interventions in large scale epidemic simulations
  • Indemics simulation framework
• Productivity Enhancement with Indemics
  • Efficient intervention implementation
  • Comparison results from real experiences
• Performance Modeling and Prediction
  • Methodology: explained by examples
  • Experiment results
• Summary

Large Scale Agent-Based Epidemic Simulation
• Disease diffusion in a population (millions of agents) through agent-agent contacts (billions)
• Real-world intervention policies for epidemics can be very complex and difficult to predefine or code in advance
• Many possible interventions with multiple configurable parameters: large (factorial) simulation design
Ideally we would like:
• Fast simulation
• Capability to represent complicated, realistic interventions
• Appropriate experiment design for a given study with a deadline

Complex Interventions (listed from not so realistic to more realistic)
• Vaccinate randomly chosen people
• Vaccinate people with high degrees in the contact network
• Keep all school-age children home for 2 weeks
• Each county decides to close its schools if the number of diagnosed students in the county exceeds a threshold; students from closed schools stay home
• Same as above, plus for each student who stays home, if age < 12 then a guardian must stay home too

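The last two interventions in the list can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Indemics or EpiFast implementation: the `Student` record, the `guardian_pid` field, and the function name `county_school_closure` are hypothetical, and the threshold is taken to be an absolute count of diagnosed students per county.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Set, Tuple


@dataclass(frozen=True)
class Student:
    pid: int           # person id of the student
    county: str        # county whose schools the student attends
    age: int
    diagnosed: bool    # True if the student has been diagnosed
    guardian_pid: int  # hypothetical link to one guardian


def county_school_closure(
    students: Iterable[Student], threshold: int
) -> Tuple[Set[str], Set[int]]:
    """Return (counties that close schools, person ids that stay home)."""
    students = list(students)

    # Count diagnosed students per county.
    diagnosed: Dict[str, int] = {}
    for s in students:
        if s.diagnosed:
            diagnosed[s.county] = diagnosed.get(s.county, 0) + 1

    # A county closes its schools when the count exceeds the threshold.
    closed = {county for county, n in diagnosed.items() if n > threshold}

    # Students of closed schools stay home; under 12, a guardian stays too.
    home: Set[int] = set()
    for s in students:
        if s.county in closed:
            home.add(s.pid)
            if s.age < 12:
                home.add(s.guardian_pid)
    return closed, home


# Made-up roster, for illustration only.
roster = [
    Student(1, "A", 8, True, 101),
    Student(2, "A", 15, True, 102),
    Student(3, "A", 11, False, 103),
    Student(4, "B", 9, True, 104),
    Student(5, "B", 16, False, 105),
]
closed, home = county_school_closure(roster, threshold=1)
print(sorted(closed), sorted(home))
```

Even this small rule needs a per-county aggregate and a relation between people (student to guardian), which is the kind of logic that is hard to predefine in simulator code.
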
Indemics: System Architecture (diagram; component and arrow labels recovered from the figure)
• Indemics web-interface client on PC (analyst sees only this module)
• Interactive Client, Batch Client
• Indemics Server, running on head node of HPC
• Indemics database running on a data server: relational database, temporal database, semi-structured database
• HPC Epidemic Simulator (e.g. EpiFast)
• Indemics Adapter
• Arrow labels: Queries & Interventions, New Interventions, New epidemic dynamics

Epidemic Intervention Implementation
• EpiFast: diffusion code (C++), intervention code (C++)
• Indemics: EpiFast diffusion code (C++), intervention script, DBMS, framework code (Java)

Scenario 1: Benefit of Indemics
• EpiFast is a fast epidemic simulation tool in our lab
• It can represent interventions of the form:
  • if predefined global conditions and local conditions are satisfied for a predefined set of nodes, then change node properties and/or labels of edges incident on them
  • (a) antiviral prophylaxis to randomly chosen people
  • (b) keep all primary school students home
• It took too much coding effort (weeks) to implement
  • (a’) antiviral treatment to sick people
  • (b’) keep all primary school students home and let their guardians stay home too
• With Indemics, it took only hours to script (a’) or (b’)

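The rule form described above (global condition, local condition, predefined node set, property change) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: `RuleIntervention`, `apply_rule`, and the node dictionaries are hypothetical and do not reflect EpiFast's actual C++ data structures, and edge labels are left out for brevity.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, List


@dataclass(frozen=True)
class RuleIntervention:
    node_set: FrozenSet[int]                  # fixed before the run starts
    global_condition: Callable[[dict], bool]  # test on simulation-wide state
    local_condition: Callable[[dict], bool]   # test on one node's properties
    action: Callable[[dict], None]            # change that node's properties


def apply_rule(rule: RuleIntervention, sim_state: dict,
               nodes: Dict[int, dict]) -> List[int]:
    """Apply the rule once; return the ids of the nodes that were changed."""
    if not rule.global_condition(sim_state):
        return []
    changed = []
    for pid in sorted(rule.node_set):
        node = nodes[pid]
        if rule.local_condition(node):
            rule.action(node)
            changed.append(pid)
    return changed


# Intervention (b): keep all primary school students home (made-up data).
nodes = {
    1: {"school_type": "primary", "at_home": False},
    2: {"school_type": None, "at_home": False},
    3: {"school_type": "primary", "at_home": False},
}
keep_home = RuleIntervention(
    node_set=frozenset(p for p, n in nodes.items()
                       if n["school_type"] == "primary"),
    global_condition=lambda state: state["day"] >= 5,  # hypothetical start day
    local_condition=lambda node: not node["at_home"],
    action=lambda node: node.update(at_home=True),
)
print(apply_rule(keep_home, {"day": 5}, nodes))
```

In this sketch the node set is fixed up front. Interventions (a’) and (b’) do not obviously fit that shape, since the sick people and the guardians of affected students are only known during the run, which is one plausible reading of why they were costly to code directly.
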
Performance and Productivity
• We have been concerned about performance of HPC simulation tools
• Human effort starts to be the bottleneck
  • Understand, implement, and verify new intervention strategies designed by epidemiologists
  • Set up simulations; run simulations
  • Post simulation analysis
• Indemics: improve human productivity while maintaining simulation performance
  • Development cost reduction

Compare Different Ways to Implement Interventions
• Development cost
  • Triggered intervention: when the fraction of diagnosed school-age children exceeds 20%, close all schools
  • Targeted intervention: treat diagnosed school-age children with antivirals
  • School (block) intervention: vaccinate all people in any school (census block) if over 5% in that school (block) are diagnosed

Scenario 2: Motivation for Performance Modeling
• An epidemiologist in our lab wanted to run a large experiment (factorial design) with complex interventions
• Simulation results were needed in a week
• We decided to use Indemics. But could the simulation finish in time?
• We applied the performance model and predicted a running time of two weeks
• The epidemiologist revised the experiment design (cut it in half)
• The simulation was done in one week

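The arithmetic behind this kind of decision can be sketched as follows. All numbers and factor names are made up for illustration (they are not the lab's actual design), and the sketch assumes runs execute one after another with a single predicted time per run.

```python
from itertools import product

# Hypothetical factorial design: every combination of levels is one cell.
factors = {
    "trigger_threshold": [0.01, 0.05, 0.10],
    "vaccine_efficacy": [0.3, 0.5, 0.7, 0.9],
    "compliance": [0.25, 0.50, 0.75, 1.00],
}
REPLICATES = 7       # hypothetical number of replicates per cell
HOURS_PER_RUN = 1.0  # hypothetical predicted time of one simulation run


def design_cells(factors):
    """Enumerate all cells of the full factorial design."""
    return list(product(*factors.values()))


def total_days(factors, replicates, hours_per_run):
    """Predicted wall-clock days if all runs execute serially."""
    return len(design_cells(factors)) * replicates * hours_per_run / 24.0


full = total_days(factors, REPLICATES, HOURS_PER_RUN)

# Revised design: drop half of the levels of one factor.
revised_factors = dict(factors, compliance=[0.50, 1.00])
revised = total_days(revised_factors, REPLICATES, HOURS_PER_RUN)

print(f"full design: {full} days, revised design: {revised} days")
```

Because the number of cells is a product of the level counts, halving the levels of a single factor halves the predicted running time.
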
Example of Indemics Intervention Script
School intervention: provide vaccines to all students in any school where more than 5% of students are sick
    initialization;
    define School_Trigger as SCHOOL_DIAGNOSED_TOTAL.persons > 0.05 * SCHOOL_INTERVENED.size;
    reset table SCHOOL_INTERVENED intervened_day = NULL;
    for Day = 1 to 10 do
        count new_diagnosed each school save_to SCHOOL_DIAGNOSED_TODAY;
        count accum_diagnosed each school save_to SCHOOL_DIAGNOSED_TOTAL;
        set SCHOOL_INTERVENED intervened_day = Day if intervened_day = NULL and School_Trigger = true;
        apply Vaccination to school in SCHOOL_INTERVENED where intervened_day = Day;
    done

Translated into SQL
Each script statement is followed, where the slide shows one, by its SQL translation (marked "SQL:").
    initialization;
    define School_Trigger as SCHOOL_DIAGNOSED_TOTAL.persons > 0.05 * SCHOOL_INTERVENED.size;
    reset table SCHOOL_INTERVENED intervened_day = NULL;
        SQL: update SCHOOL_INTERVENED set intervened_day = -1;
    for Day = 1 to 10 do
        count new_diagnosed each school save_to SCHOOL_DIAGNOSED_TODAY;
            SQL: insert into SCHOOL_DIAGNOSED_TODAY
                 select school, count(pid) as persons, Day as diag_time
                 from STUDENT, DIAGNOSED
                 where pid = diagnosed_pid and diagnosed_time = Day
                 group by school;
        count accum_diagnosed each school save_to SCHOOL_DIAGNOSED_TOTAL;
        set SCHOOL_INTERVENED intervened_day = Day if intervened_day = NULL and School_Trigger = true;
        apply Vaccination to school in SCHOOL_INTERVENED where intervened_day = Day;
            SQL: select pid from STUDENT, SCHOOL_INTERVENED
                 where STUDENT.school = SCHOOL_INTERVENED.school and intervened_day = Day;
    done

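To make the translation concrete, the following sketch replays the school intervention against an in-memory SQLite database using Python's standard `sqlite3` module. It is an approximation under several assumptions: the slides use Oracle, not SQLite; the table layouts are guessed from the column names in the queries above; the accumulated count is computed inline rather than saved to SCHOOL_DIAGNOSED_TOTAL; and the diagnoses are supplied up front instead of arriving from a running simulator, so vaccination does not feed back into the epidemic. The function name `run_school_intervention` is hypothetical.

```python
import sqlite3


def run_school_intervention(students, diagnoses, days=10, fraction=0.05):
    """students: (pid, school) pairs; diagnoses: (pid, day) pairs.

    Returns {day: sorted pids to vaccinate on that day}.
    """
    db = sqlite3.connect(":memory:")
    db.executescript("""
        create table STUDENT (pid integer primary key, school text);
        create table DIAGNOSED (diagnosed_pid integer, diagnosed_time integer);
        create table SCHOOL_INTERVENED (
            school text primary key, size integer, intervened_day integer);
    """)
    db.executemany("insert into STUDENT values (?, ?)", students)
    db.executemany("insert into DIAGNOSED values (?, ?)", diagnoses)
    # "reset table": -1 stands for "not yet intervened", as in the slide.
    db.execute("""insert into SCHOOL_INTERVENED
                  select school, count(pid), -1 from STUDENT group by school""")

    vaccinated = {}
    for day in range(1, days + 1):
        # Trigger: accumulated diagnosed students exceed fraction * size.
        db.execute("""
            update SCHOOL_INTERVENED set intervened_day = ?
            where intervened_day = -1
              and (select count(*) from STUDENT, DIAGNOSED
                   where pid = diagnosed_pid
                     and STUDENT.school = SCHOOL_INTERVENED.school
                     and diagnosed_time <= ?) > ? * size
        """, (day, day, fraction))
        # "apply Vaccination": students of schools triggered today.
        rows = db.execute("""
            select pid from STUDENT, SCHOOL_INTERVENED
            where STUDENT.school = SCHOOL_INTERVENED.school
              and intervened_day = ?
            order by pid
        """, (day,)).fetchall()
        if rows:
            vaccinated[day] = [pid for (pid,) in rows]
    db.close()
    return vaccinated


# Made-up data: school A has 50 students, school B has 30.
students = [(p, "A") for p in range(1, 51)] + [(p, "B") for p in range(51, 81)]
diagnoses = [(1, 1), (2, 2), (3, 4), (51, 3)]
result = run_school_intervention(students, diagnoses)
print({day: len(pids) for day, pids in result.items()})
```

With this data school A needs more than 2.5 diagnosed students to trigger, which happens on day 4; school B needs more than 1.5 and never triggers.
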
Performance Prediction
• Prepare SQL atom statement performance lookup table
• Given Indemics intervention script (S) for a study
  • Generate corresponding SQL query statements (Q1, Q2, …, Qi)
  • Decompose query statements into atoms
  • Estimate configurations of the atoms (size of table and result)
  • Look up atom running time AP(a)
  • For each query compute query time QP(Q) = sum of AP(a)
  • Compute script running time SP(S) = sum of query time QP(Q)
Predicted running time is only a rough estimate.

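The steps above can be sketched as a small Python model. The lookup values are placeholders invented for this example, not measurements from the talk; the atom kinds, the rounding-up rule in `atom_time`, and the `(repetitions, atoms)` script encoding are likewise assumptions. Times are kept in integer milliseconds.

```python
from bisect import bisect_left

# Hypothetical lookup table: atom kind -> sorted (table rows, milliseconds).
ATOM_TABLE = {
    "select": [(1_000, 10), (100_000, 500), (10_000_000, 40_000)],
    "join":   [(1_000, 20), (100_000, 1_500), (10_000_000, 120_000)],
    "update": [(1_000, 15), (100_000, 800), (10_000_000, 60_000)],
}


def atom_time(kind, rows):
    """AP(a): time of the smallest tabulated size >= rows.

    Falls back to the largest entry when rows exceeds the table.
    """
    entries = ATOM_TABLE[kind]
    sizes = [size for size, _ in entries]
    index = min(bisect_left(sizes, rows), len(entries) - 1)
    return entries[index][1]


def query_time(atoms):
    """QP(Q): sum of AP(a) over the atoms of one query."""
    return sum(atom_time(kind, rows) for kind, rows in atoms)


def script_time(script):
    """SP(S): sum of QP(Q) over every executed query of the script."""
    return sum(repetitions * query_time(atoms) for repetitions, atoms in script)


# Each entry: (times the query runs, atoms with estimated table sizes).
script = [
    (1, [("update", 500)]),                         # reset table, run once
    (10, [("join", 50_000), ("select", 50_000)]),   # daily count, 10 days
    (10, [("join", 800)]),                          # daily apply, 10 days
]
print(script_time(script), "ms predicted")
```

Rounding each atom up to the next tabulated size keeps the estimate on the conservative side, in line with the caveat that the prediction is only a rough estimate.
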
Atomic SQL Statements
Examples of basic SQL statements in Indemics script

Example of Performance Lookup Table
• Data collected for Oracle 10g on a server with 4 quad-core 2.4GHz Xeon processors and 64GB memory

Summary
• Indemics is a database-driven, high-performance, high-productivity epidemic simulation framework
• It enables realistic representation and efficient implementation of complex intervention strategies
• We provide performance modeling for predicting simulation running time before running it.