Introduction to baselines & significance for interpretation

Sheena Sullivan, MPH, PhD

Baselines and thresholds • Baseline • The usual or average level of influenza activity that occurs during a typical year • Threshold • The level of influenza activity that signals the occurrence of a specific activity • Seasonal threshold • The level of influenza activity that signals the start and end of the annual influenza season • Alert threshold • A level above which influenza activity is higher than most years.

Baselines and thresholds • Used as a point of reference to detect • Start/end of the season • Severity of season • Outbreaks • Useful to • inform public health actions • improve clinical diagnosis • stimulate diagnosis • encourage early prescription of antivirals • Indicate uptake and timeliness of vaccine • Used retrospectively and prospectively

Example: Victorian sentinel data [Figure annotations: worse; outbreak; start] Tay et al (submitted)

Data considerations • Baselines are difficult to establish • Rely on very stable data collected over a long period of time • Multiple years of data are needed to account for fluctuations in activity from year to year • Severity of virus types • Changes in data collection methods, diagnosis coding, participating clinical facilities, insurance eligibility, testing practices • Most surveillance systems cannot adjust quickly to changing baselines, so shifts in health care utilization may trigger false alarms; public health crises and major public events may undermine health surveillance systems at the very times they are needed most

Data considerations • Five years of surveillance data collected consistently is considered standard for a reliable baseline • 3 years minimum • Several months may be sufficient with sophisticated modelling (e.g. Cowling 2006) • Difficult in tropical areas or areas with many different causes of ILI

Sources of data • ILI • E.g. number of consultations at sentinel GPs among all patients seen • E.g. proportion of total outpatient visits • SARI • E.g. hospital admissions • Deaths • E.g. number or rate of influenza or pneumonia deaths • Laboratory notifications • E.g. percentage of influenza-positive specimens among respiratory tests • Over-the-counter pharmaceutical sales • Call volume to information/advice lines • Frequency of online searches (e.g. Google Flu Trends, Ginsberg et al 2009)

Sources of data • Multiple sources of data can be combined to develop composite indicators of baseline and thresholds • E.g. start of season = week in which ILI crosses a certain value and the percentage of specimens testing positive reaches a certain value
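The composite start-of-season rule above can be sketched in a few lines. This is only an illustration: the function name `season_start_week`, the cutoff values and the weekly figures are all made up for the example and are not taken from any surveillance system.

```python
def season_start_week(ili_rate, pct_positive, ili_cutoff, pos_cutoff):
    """Return the index of the first week in which BOTH indicators meet
    their cutoffs, or None if no week qualifies."""
    for week, (ili, pos) in enumerate(zip(ili_rate, pct_positive)):
        if ili >= ili_cutoff and pos >= pos_cutoff:
            return week
    return None


# Hypothetical weekly values, for illustration only.
ili_rate = [1.0, 1.4, 2.1, 2.6, 3.8, 5.0]          # ILI per 1000 consultations
pct_positive = [2.0, 4.0, 6.0, 12.0, 18.0, 25.0]   # % of specimens positive

# Cutoffs are assumptions; in practice they come from the historical baseline.
start = season_start_week(ili_rate, pct_positive, ili_cutoff=2.5, pos_cutoff=10.0)
print("Season starts at week index:", start)
```

Requiring both indicators guards against declaring a season on a rise in ILI that is driven by other respiratory pathogens.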

Types of data • Count • May be used where denominators are not known, or cases are rare • Require very stable data • An average number of cases may be used where denominators are unknown • Proportion • Better parameter to use • Can correct for shifts both in numerator and denominator activity (e.g. health care utilization and in disease activity) • Help to clarify/magnify an outbreak signal

Example: counts v proportions [Figure legend: total number of visits; counts of ILI] Burkom et al. 2008

Individual site vs. aggregated site • Baselines may be calculated for individual sites, to monitor activity in particular locations, or may use data compiled from all surveillance sites • In locations with few sites, individual baselines help control for non-reporting sites, regional variation within a country, or relative representativeness of sites of the national or regional populations • In locations with many sites, data aggregated from all sites might be the best way to set a baseline

Data represented • The type of threshold calculated depends on data available and usage • Indicate start of season • Comparison of this week’s value with the expected value • Outbreak indicator

Timeliness of data • Depends on needs • Early-warning systems require near-real-time data • If wanting to report what happened in a season, timeliness less important • Real-time • E.g. China • Weekly or fortnightly reports • E.g. US • Death certificates • May take months to verify

Methods for determining thresholds

Types of baselines • Static (flat) • Does not change with changes in seasonal patterns found in data • Uses data for a specified period of time (may be for a whole year, may be for only the period when surveillance is conducted) • Baseline - defines the start and end of an influenza season • Average and above average - describe the intensity of a season • Figure 2. ILI presentation rates at metropolitan and rural general practice sentinel sites, 1997 to 2012. Source: The 2012 Victorian Influenza Vaccine Effectiveness Audit Report: http://www.victorianflusurveillance.com.au

Visual inspection Watts et al., 2003

Shewhart charts • Simplest control chart • Developed by industry for QA/QC • Assumes normal distribution • Binomial (Bernoulli) & Poisson variants • Mean (µ) and standard deviation (σ) based on previous data • Control limits • Upper limit = µ + kσ • k either predetermined, usually 2 or 3 (k = 2 is roughly the upper limit of the 95% CI), or set based on recent observations • Alert declared when observations exceed this upper limit • i.e. |yt − µ| > kσ • False alarms • Mean and SD should be estimated from a large dataset to avoid false alarms • Steiner et al 2010 • Figure 1. Victorian weekly laboratory notifications of influenza 2002-2008 with Shewhart chart threshold of 6.5.
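A minimal sketch of the upper control limit described above, assuming weekly counts and a one-sided alert (only values above the limit are flagged). The function name `shewhart_alerts` and the data are hypothetical; with so short a history the estimates of the mean and SD would be unreliable in practice.

```python
from statistics import mean, stdev


def shewhart_alerts(history, current, k=3.0):
    """Return (upper_limit, alert_weeks).

    history: past observations used to estimate the mean and SD.
    current: new observations to check against the limit.
    k:       number of SDs above the mean (typically 2 or 3).
    """
    mu = mean(history)
    sigma = stdev(history)          # sample standard deviation
    upper = mu + k * sigma
    alerts = [i for i, y in enumerate(current) if y > upper]
    return upper, alerts


# Hypothetical weekly laboratory notifications.
history = [2, 3, 1, 4, 2, 3, 2, 3]
current = [3, 4, 9, 2]

upper, alerts = shewhart_alerts(history, current, k=3.0)
print(f"Upper limit: {upper:.2f}; alert weeks: {alerts}")
```

A smaller k signals earlier but raises more false alarms; a larger k does the opposite.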

Example: US ILI Baseline • Plots the % of patient visits to healthcare providers for ILI reported each week, weighted by state population, and compares it with a baseline • % visits = n(ILI) / N, where N = total patients for that week • y = % visits × w, where w = weight for the state • Baseline = mean % of patient visits for ILI during non-influenza weeks for the previous three seasons plus 2 standard deviations (mean + 2 SD) • Non-influenza weeks = periods of two or more consecutive weeks in which each week accounted for less than 2% of the season’s total number of specimens that tested positive for influenza • National baseline = 2.2%, but each region also has its own baseline • Does not include summer data, so it is not known whether activity is outside the norm for the summer months • http://www.cdc.gov/flu/weekly/overview.htm#Outpatient
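The baseline rule on this slide can be sketched as follows. This is a simplified illustration, not the CDC implementation: it handles a single season rather than pooling three, omits the state population weighting, and uses hypothetical function names and data.

```python
from statistics import mean, stdev


def non_influenza_weeks(positives, share=0.02, min_run=2):
    """Flag weeks belonging to runs of >= min_run consecutive weeks in which
    each week has fewer than `share` of the season's positive specimens."""
    total = sum(positives)
    low = [p < share * total for p in positives]
    keep = [False] * len(positives)
    run_start = None
    for i, is_low in enumerate(low + [False]):   # sentinel closes a trailing run
        if is_low and run_start is None:
            run_start = i
        elif not is_low and run_start is not None:
            if i - run_start >= min_run:
                for j in range(run_start, i):
                    keep[j] = True
            run_start = None
    return keep


def ili_baseline(pct_visits, positives, n_sd=2.0):
    """Mean % ILI visits over non-influenza weeks plus n_sd standard deviations."""
    keep = non_influenza_weeks(positives)
    quiet = [v for v, k in zip(pct_visits, keep) if k]
    return mean(quiet) + n_sd * stdev(quiet)


# Hypothetical season: weekly positive specimens and % of visits for ILI.
positives = [1, 1, 0, 30, 40, 20, 1, 5, 1, 1]
pct_visits = [1.0, 1.2, 0.8, 3.0, 4.5, 3.2, 1.5, 1.6, 1.1, 0.9]

baseline = ili_baseline(pct_visits, positives)
print(f"Baseline: {baseline:.2f}% of visits")
```

Note that the isolated low week (index 6) is excluded because it is not part of a run of two or more consecutive low weeks.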

[Figure annotations: national baseline; outbreak/epidemic activity]

Types of baselines • Cyclical (seasonal) • Good for data with regular seasonality • May be inappropriate in regions with unclear seasons (e.g. tropics) • Cyclical changes in baseline reflect seasonal pattern of disease activity • Distinguishes disease-related increases from normal seasonal increases in a syndrome (e.g. pneumonia) • Different methods used: • Regression models, moving averages, time series • Figure. Rate of deaths classified as influenza and pneumonia from the NSW Registered Death Certificates, 1 January 2007 to 21 September 2012 (figure annotation: unusual activity). Source: Australia Influenza Surveillance Report, Oct 2012

Example: 122 Cities Mortality Baseline • Weekly report of total death certificates and the total for which pneumonia or influenza (P&I) was listed as underlying cause of death, by age group • The percentage of deaths due to P&I is compared with seasonal baseline and epidemic threshold values calculated for each week • Seasonal baseline is calculated using a periodic regression model that incorporates a robust regression procedure applied to data from the previous five years • An increase of 1.645 standard deviations above the seasonal baseline is considered the “epidemic threshold”
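A rough sketch of a periodic (Serfling-type) regression baseline with a 1.645 SD epidemic threshold. Two assumptions to note: ordinary least squares is used here in place of the robust regression procedure mentioned on the slide, and the model form (intercept, linear trend, one annual sine/cosine pair, 52-week period) is a common choice rather than something the slide specifies. All names and data are hypothetical.

```python
import math


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def design_row(t, period=52):
    """Regressors for week t: intercept, trend, annual sine and cosine."""
    w = 2 * math.pi * t / period
    return [1.0, float(t), math.sin(w), math.cos(w)]


def fit_seasonal_baseline(y, period=52):
    """Least-squares fit; returns (coefficients, residual SD)."""
    rows = [design_row(t, period) for t in range(len(y))]
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * v for r, v in zip(rows, y)) for i in range(p)]
    beta = solve(xtx, xty)
    fitted = [sum(b * x for b, x in zip(beta, r)) for r in rows]
    rss = sum((v - f) ** 2 for v, f in zip(y, fitted))
    return beta, math.sqrt(rss / (len(y) - p))


def epidemic_threshold(beta, resid_sd, t, period=52, z=1.645):
    """Return (seasonal baseline, epidemic threshold) for week t."""
    baseline = sum(b * x for b, x in zip(beta, design_row(t, period)))
    return baseline, baseline + z * resid_sd


# Synthetic five-year series of % deaths due to P&I (illustration only).
pct_pi = [7.0 + 1.5 * math.cos(2 * math.pi * t / 52) + (0.2 if t % 2 == 0 else -0.2)
          for t in range(5 * 52)]

beta, resid_sd = fit_seasonal_baseline(pct_pi)
baseline, threshold = epidemic_threshold(beta, resid_sd, t=5 * 52)
print(f"Next week: baseline {baseline:.2f}%, epidemic threshold {threshold:.2f}%")
```

In practice, weeks from past epidemics are usually excluded or down-weighted before fitting, which is the role the robust procedure plays in the original method.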

Example: 122 Cities Mortality Baseline http://www.cdc.gov/flu/weekly/

Interpreting data: which method is best? • Depends on the application • Methods which use a static baseline may be better for defining the beginning/end of a season • Methods that rely on seasonality may be inappropriate for the tropics • Can formally evaluate based on: • Sensitivity – true alarm rate (proportion of epidemic periods that trigger an alarm) • Specificity – 1 − false alarm rate (proportion of non-epidemic periods with no alarm) • Positive predictive value – ratio of true positive epidemic alarms over the total number of alerts • Timeliness – how quickly the method signals an outbreak (run-length)
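The evaluation measures above can be computed week by week once each week has been labelled as epidemic or not. The sketch below assumes such a reference labelling exists; the function name `evaluate_alarms`, the choice of measuring timeliness as weeks from the first epidemic week to the first alarm, and the data are all illustrative.

```python
def evaluate_alarms(truth, alarm):
    """Compare weekly alarms against reference epidemic labels.

    truth, alarm: equal-length lists of booleans, one entry per week.
    Returns a dict; a measure is None when its denominator is zero.
    """
    pairs = list(zip(truth, alarm))
    tp = sum(1 for t, a in pairs if t and a)
    fp = sum(1 for t, a in pairs if not t and a)
    fn = sum(1 for t, a in pairs if t and not a)
    tn = sum(1 for t, a in pairs if not t and not a)

    # Timeliness: weeks between the first epidemic week and the first alarm
    # raised at or after it.
    delay = None
    if True in truth:
        first_epidemic = truth.index(True)
        for week in range(first_epidemic, len(alarm)):
            if alarm[week]:
                delay = week - first_epidemic
                break

    return {
        "sensitivity": tp / (tp + fn) if tp + fn else None,
        "specificity": tn / (tn + fp) if tn + fp else None,
        "ppv": tp / (tp + fp) if tp + fp else None,
        "delay_weeks": delay,
    }


# Hypothetical ten-week period.
truth = [False, False, False, True, True, True, True, False, False, False]
alarm = [False, True, False, False, True, True, True, False, False, False]

result = evaluate_alarms(truth, alarm)
print(result)
```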

Interpreting data • Knowing your data is key to interpreting it • Time series data may show anomalies associated with holidays, long weekends, etc. • Interpretation of those anomalies depends on the data analyst’s knowledge of trends not accounted for in the detection algorithm • Baselines may be influenced by changes in • Data collection methods • Provider participation • Case definitions • Population/health care use

Example: understanding and interpreting data Ungchusak et al. 2012

Example: Categories of influenza season in Victoria for six surveillance datasets, 2002 - 2011 Tay et al (submitted)

Understanding source data • Lab-confirmed influenza • May not be available • Mortality data • Lag time between influenza circulation and death • Hospitalisation • Issues with coding • Lag time between infection and rise in presentations to hospital • Weekly versus daily data • Greater variation with daily reports

Example: understanding and interpreting data

Summary • Baselines and epidemic thresholds help to understand the significance of increased influenza activity • Knowing when a flu season has begun • Knowing whether a spike in activity is a real spike • An indicator that activity is unusual or outside the norm • An indicator for public health action • Graphical representation of current surveillance data compared with baseline data and previous years’ data provides a meaningful snapshot of current activity to public health practitioners, policy makers, and others

References • Burkom, et al. Developments in the Roles, Features, and Evaluation of Alerting Algorithms for Disease Outbreak Monitoring. Johns Hopkins APL Technical Digest. 2008. • Clothier HJ, et al. A comparison of data sources for the surveillance of seasonal and pandemic influenza in Victoria. Commun Dis Intell. 2006;30(3):345-9. • Cooper DL, et al. Can syndromic thresholds provide early warning of national influenza outbreaks? J Public Health. 2009;31(1):17-25. • Cowling BJ, et al. Methods for monitoring influenza surveillance data. Int J Epidemiol. 2006;35(5):1314-21. • Dedman D, Watson J. The use of thresholds to describe levels of influenza activity. PHLS Microbiol Dig. 1997;14:206-8. • Ginsberg J, Mohebbi MH, Patel RS, Brammer L, Smolinski MS, Brilliant L. Detecting influenza epidemics using search engine query data. Nature. 2009;457(7232):1012-4. • Goldstein E, et al. Improving the estimation of influenza-related mortality over a seasonal baseline. Epidemiology. 2012;23(6):829-38. • Goldstein E, et al. Predicting the epidemic sizes of influenza A/H1N1, A/H3N2, and B: a statistical method. PLoS Med. 2011;8(7):e1001051. • Hashimoto S, et al. Detection of epidemics in their early stage through infectious disease surveillance. Int J Epidemiol. 2000;29(5):905-10. • Health Protection Agency (HPA). Surveillance of influenza and other respiratory viruses in the UK: 2010-2011. London: HPA; May 2011. Available from: http://www.hpa.org.uk/Publications/InfectiousDiseases/Influenza/1105influenzareport/ • Hutwagner LC, Maloney EK, Bean NH, Slutsker L, Martin SM. Using laboratory-based surveillance data for prevention: an algorithm for detecting salmonella outbreaks. Emerg Infect Dis. 1997;3:395-400. • Kelly HA, et al. The significance of increased influenza notifications during spring and summer of 2010-11 in Australia. Influenza Other Respi Viruses. 2012. • Kuang J, et al. Epidemic features affecting the performance of outbreak detection algorithms. BMC Public Health. 2012;12(1):418. • O'Brien SJ, Christie P. Do CuSums have a role in routine communicable disease surveillance? Public Health. 1997;111(4):255-8. • Serfling RE. Methods for current statistical analysis of excess pneumonia-influenza deaths. Public Health Rep. 1963;78:494-506. • Steiner SH, et al. Detecting the start of an influenza outbreak using exponentially weighted moving average charts. BMC Med Inform Decis Mak. 2010;10. • Ungchusak et al. Lessons Learned from Influenza A(H1N1)pdm09 Pandemic Response in Thailand. EID 2012:18, • Vega T, et al. Influenza surveillance in Europe: establishing epidemic thresholds by the Moving Epidemic Method. Influenza Other Respi Viruses. 2012. See also http://cran.r-project.org/web/packages/mem/mem.pdf • Watts CG, et al. Establishing thresholds for influenza surveillance in Victoria. Aust N Z J Public Health. 2003;27(4):409-12. • World Health Organization. WHO Interim Global Epidemiological Surveillance Standards for Influenza. Geneva: Global Influenza Programme, Surveillance and Monitoring team, World Health Organization, 2012. http://www.wpro.who.int/emerging_diseases/documents/docs/GuideforDesigningandConductingInfluenzaStudies.pdf

Exercise - Defining baseline curves and alert threshold • Determining the baseline • Align the transmission peaks of several years’ data around the median week of peak reporting • Calculate an average weekly number for each week centred on the median peak week of transmission

Defining alert threshold • Display the lowest and highest season, excluding exceptional events (e.g. pandemic)

Defining alert threshold • Calculate the standard deviation of the mean for each week and create a curve based on those values

Examining data • Plot the current year’s data on the curve
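The exercise steps can be sketched as below, under some assumptions the slides leave open: each season is a list of weekly values of equal length, the alert curve is taken as the weekly mean plus `k` standard deviations (the slides do not state the multiplier, so `k` is a hypothetical parameter), and an even number of seasons resolves the median peak week to the lower of the two middle values. Exceptional seasons such as a pandemic should be removed before calling these functions. All names and data are illustrative.

```python
from statistics import mean, median_low, stdev


def align_seasons(seasons):
    """Shift each season so its peak falls on the median peak week.
    Weeks shifted in from outside the data are left as None."""
    peaks = [s.index(max(s)) for s in seasons]
    target = median_low(peaks)
    n = len(seasons[0])
    aligned = []
    for season, peak in zip(seasons, peaks):
        shift = target - peak
        row = [None] * n
        for week, value in enumerate(season):
            if 0 <= week + shift < n:
                row[week + shift] = value
        aligned.append(row)
    return aligned, target


def weekly_curves(aligned, k):
    """Return (baseline, threshold): weekly mean and mean + k * SD.
    Weeks with fewer than two values get None."""
    baseline, threshold = [], []
    for values in zip(*aligned):
        present = [v for v in values if v is not None]
        if len(present) < 2:
            baseline.append(None)
            threshold.append(None)
        else:
            m = mean(present)
            baseline.append(m)
            threshold.append(m + k * stdev(present))
    return baseline, threshold


def weeks_above(current, threshold):
    """Weeks in which the current year's value exceeds the alert threshold."""
    return [w for w, (c, t) in enumerate(zip(current, threshold))
            if t is not None and c > t]


# Three hypothetical seasons of weekly ILI counts.
seasons = [
    [1, 2, 5, 9, 4, 2],
    [1, 4, 8, 5, 2, 1],
    [2, 3, 4, 6, 10, 3],
]
aligned, target = align_seasons(seasons)
baseline, threshold = weekly_curves(aligned, k=2.0)

current_year = [2, 3, 6, 12, 5, 2]
print("Median peak week:", target)
print("Weeks above alert threshold:", weeks_above(current_year, threshold))
```

Plotting `baseline`, `threshold` and `current_year` on the same axes gives the curve the exercise asks for.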
