Intro to Statistics for Infection Preventionists Presented By: Jennifer McCarty, MPH, CIC Shana O’Heron, MPH, PhD, CIC
Objectives • Describe the important role statistics play in infection prevention. • Describe the most common types of statistics used in hospital epidemiology. • Provide examples of how statistics are used in hospital epidemiology.
Role of Statistics in Hospital Epidemiology • Aid in organizing and summarizing data • Population characteristics • Frequency distributions • Calculation of infection rates • Make inferences about data • Suggest association • Infer causality • Communicate findings • Prepare reports for committees • Monitor the impact of interventions
Role of Statistics in Hospital Epidemiology • When evaluating a study or white paper • Are the findings statistically significant? • Was the sample size large enough to show a difference if there is one? • Are the groups being compared truly similar? • When investigating an unusual cluster • Describe the outbreak • Select control subjects • Determine the appropriate test to use when measuring exposure
Descriptive Epidemiology • Descriptive Statistics: techniques concerned with the organization, presentation, and summarization of data. • Measures of central tendency • Measures of dispersion • Use of proportions, rates, ratios
Descriptive Statistics • Variable: “Anything that is measured or manipulated in a study” • Types of variables: • Qualitative • Nominal, Ordinal • Quantitative • Interval, Ratio • Independent vs. Dependent Variables • Continuous vs. Discrete variables
Measures of Central Tendency • Mean: mathematical average of the values in a data set. • Calculation: Patient Length of Stay: 12, 9, 3, 5, 7, 6, 13, 8, 4, 15, 6 • Mean (x̄) = (sum of each patient’s length of stay) / (number of patients) = (12 + 9 + 3 + 5 + 7 + 6 + 13 + 8 + 4 + 15 + 6) / 11 = 88 / 11 = 8 days
Measures of Central Tendency • Median: the value falling in the middle of the data set. • Calculation: Patient Length of Stay: 12, 9, 3, 5, 7, 6, 13, 8, 4, 15, 6 • Sorted values: 3, 4, 5, 6, 6, 7, 8, 9, 12, 13, 15 • Median = 7 days (the 6th of the 11 sorted values)
Measures of Central Tendency • Mode: the most frequently occurring value in a data set. • Calculation: Patient Length of Stay: 12, 9, 3, 5, 7, 6, 13, 8, 4, 15, 6 • Sorted values: 3, 4, 5, 6, 6, 7, 8, 9, 12, 13, 15 • Mode = 6 days (the only value that occurs twice)
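The three measures of central tendency above can be checked with a short Python sketch. It uses only the standard-library `statistics` module and the length-of-stay values from the slides; the variable names are illustrative and do not come from the presentation.

```python
import statistics

# Patient length of stay in days, taken from the slides
length_of_stay = [12, 9, 3, 5, 7, 6, 13, 8, 4, 15, 6]

mean_los = statistics.mean(length_of_stay)      # sum of the values / number of values
median_los = statistics.median(length_of_stay)  # middle value of the sorted data
mode_los = statistics.mode(length_of_stay)      # most frequently occurring value

print(f"Mean: {mean_los} days, median: {median_los} days, mode: {mode_los} days")
```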
Measures of Dispersion • Range: the difference between the smallest and largest values in a data set. • Calculation: Patient Length of Stay: 12, 9, 3, 5, 7, 6, 13, 8, 4, 15, 6 Range = 15 – 3 = 12 days
Measures of Dispersion • Standard Deviation: measure of dispersion that reflects the variability in values around the mean. • Deviation: the difference between an individual data point and the mean value for the data set. • SD = √[ Σ(x − x̄)² / (n − 1) ] • “Take all the deviations from the mean, square them, then divide their sum by the total number of observations minus one and take the square root of the resulting number” • Variance: a measure of variability that is equal to the square of the standard deviation.
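As a rough illustration of the dispersion measures, the sketch below computes the range, sample variance, and sample standard deviation (n − 1 in the denominator) for the same length-of-stay data. The slides do not work the standard deviation by hand, so the numbers here are derived from the formula rather than quoted from the presentation.

```python
import statistics

# Patient length of stay in days, taken from the slides
length_of_stay = [12, 9, 3, 5, 7, 6, 13, 8, 4, 15, 6]

# Range: largest value minus smallest value
los_range = max(length_of_stay) - min(length_of_stay)

# Sample variance: sum of squared deviations from the mean, divided by n - 1
mean_los = sum(length_of_stay) / len(length_of_stay)
squared_deviations = [(x - mean_los) ** 2 for x in length_of_stay]
variance_los = sum(squared_deviations) / (len(length_of_stay) - 1)

# Standard deviation: square root of the variance
sd_los = variance_los ** 0.5

print(f"Range: {los_range} days")
print(f"Variance: {variance_los:.2f}, SD: {sd_los:.2f} days")
```

The hand-rolled result should agree with `statistics.variance` and `statistics.stdev`, which also use n − 1.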
Normal Distribution Continuous distribution Bell shaped curve Symmetric around the mean
Non-Normal Distributions • Skew • Non-symmetric distribution • Positive or Negative • Refers to the direction of the long tail • Bi/Multi-Modal • May have distinct peaks, each with its own central tendency • No central tendency
Use of Proportions, Rates & Ratios • Proportions: A fraction in which the numerator is part of the denominator. • Rates: A fraction in which the denominator involves a measure of time. • Ratios: A fraction in which there is not necessarily a relationship between the numerator and the denominator.
Proportions • Prevalence: proportion of persons with a particular disease within a given population at a given time.
Rates • Rate =x/y × k • x = The number of times the event (e.g., infections) has occurred during a specified time interval. • y = The population (e.g., number of patients at risk) from which those experiencing the event were derived during the same time interval. • k = A constant used to transform the result of division into a uniform quantity so that it can be compared with other, similar quantities.
Rates • Example: Foley-Associated UTIs in the ICU • Step 1: Time period • April 2014 • Step 2: Patient population • Patients in the Medical / Surgical ICU of Hospital X who have Foley catheters • Step 3: Infections (numerator) • April CAUTI infections in the ICU = 2 • Step 4: Device-days (denominator) • Total number of days that patients in the ICU had Foley catheters in place = 920 • Step 5: Device-associated infection rate • Rate = (2 / 920) × 1000 = 2.17 per 1000 Foley-days
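The rate formula (x / y × k) and the CAUTI example can be sketched in a few lines of Python. The function name `rate_per_k` is a hypothetical helper introduced here for illustration; the inputs are the April 2014 figures from the slide.

```python
def rate_per_k(events, population_at_risk, k=1000):
    """Return x / y * k, where x is the event count, y the population
    at risk (here device-days), and k the scaling constant."""
    if population_at_risk <= 0:
        raise ValueError("population_at_risk must be positive")
    return events / population_at_risk * k

# Slide example: 2 CAUTIs over 920 Foley-days in April 2014
cauti_rate = rate_per_k(events=2, population_at_risk=920)
print(f"{cauti_rate:.2f} CAUTIs per 1000 Foley-days")
```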
Ratios • Calculation of Device Utilization Ratio • Step 1: Time period • April 2014 • Step 2: Patient population • Patients in the Medical / Surgical ICU of Hospital X who have Foley catheters • Step 3: Device-days (numerator) • Total number of days that patients in the ICU had Foley catheters in place = 920 • Step 4: Patient-days (denominator) • Total number of days that patients are in the ICU = 1176 • Step 5: Device utilization ratio • Ratio = 920 / 1176 = 0.78
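The device utilization ratio is a single division, sketched below with the slide's figures. The function name is a hypothetical helper, not part of any NHSN tool.

```python
def device_utilization_ratio(device_days, patient_days):
    """Return device-days divided by patient-days for the same unit and period."""
    if patient_days <= 0:
        raise ValueError("patient_days must be positive")
    return device_days / patient_days

# Slide example: 920 Foley-days out of 1176 patient-days in April 2014
foley_dur = device_utilization_ratio(device_days=920, patient_days=1176)
print(f"Device utilization ratio: {foley_dur:.2f}")
```

A ratio of 0.78 means a Foley catheter was in place on roughly 78% of patient-days in the unit.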
What does this tell you? • When examined together, the device-associated infection rate and device utilization ratio can be used to appropriately target preventative measures. • Consistently high rates and ratios may signify a problem and further investigation is suggested. • Potential overuse/improper use of device • Consistently low rates and ratios may suggest underreporting of infection or the infrequent use or short duration of use of devices.
Analytic Epidemiology • Inferential statistics: procedures used to make inferences about a population based on information from a sample of measurements from that population. • Z-test/T-test • Chi Square • SIR
Hypothesis Testing Studies • Null Hypothesis (Ho): a hypothesis of no association between two variables. • The hypothesis to be tested • Alternate Hypothesis (Ha): a hypothesis of association between two variables.
Hypothesis Testing: Error Decision
Significance Testing • A p value is not the probability that your finding is due to random chance alone • It is the probability, assuming the null hypothesis is true, of collecting a random sample of the same size from the same population that yields a result at least as extreme as the one you just calculated • Level of Significance (α level) is the probability of rejecting a null hypothesis when it is true • The level of risk a researcher is willing to take of being wrong • Usually set to 0.05 or 0.01
Hypothesis Testing: Error • Type I Error: rejecting the null hypothesis when the null hypothesis is true. • α = probability of making a type I error • Type II Error: failing to reject the null hypothesis when the alternate hypothesis is true. • β = probability of making a type II error • Power: probability of correctly concluding that the outcomes differ • 1 - β = power
Parametric Tests • Assume Normal distribution of the sample population • Usually continuous-interval variables • z Test • Student’s t Test
z Test • Tests the difference between two means or two proportions (two tailed) • Use when: • Sample size is greater than 30 • The data follow a normal distribution • Example: Comparing your mean infection rate to NHSN mean rates
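A minimal sketch of a two-tailed, two-proportion z test follows, using only the standard library. The counts are hypothetical (20 infections in 100 patients versus 10 in 100) and are not from the presentation or from NHSN; the pooled-proportion standard error used here is one common textbook form of the test.

```python
import math

def two_proportion_z(events_1, n_1, events_2, n_2):
    """Return (z, two-tailed p value) for the difference between two proportions."""
    p1 = events_1 / n_1
    p2 = events_2 / n_2
    # Pooled proportion under the null hypothesis of no difference
    pooled = (events_1 + events_2) / (n_1 + n_2)
    std_error = math.sqrt(pooled * (1 - pooled) * (1 / n_1 + 1 / n_2))
    z = (p1 - p2) / std_error
    # Two-tailed p value from the standard normal distribution
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical data: 20/100 infections in one group vs. 10/100 in the other
z_stat, p_val = two_proportion_z(20, 100, 10, 100)
print(f"z = {z_stat:.2f}, p = {p_val:.3f}")
```

With these made-up counts the p value falls just under 0.05, so the difference would be called statistically significant at the 0.05 level but not at 0.01.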
z Test • http://www.dimensionresearch.com/resources/calculators/ztest.html
t Tests • Test the difference in means (one or two tailed) • Use when: • Sample size is less than 30 • Assumes • Independence of populations & values • Variance is equal for both sets of data • No confounding variables • Types of t Tests: • Independent sample (experiment vs. control) • Paired sample (before and after)
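The independent-sample t test under the equal-variance assumption listed above can be sketched as follows. The two small samples are hypothetical values chosen for easy arithmetic; the function name is illustrative. The sketch returns the t statistic and degrees of freedom only, which would then be compared against a t table or passed to a statistics package for a p value.

```python
import math
import statistics

def pooled_t_statistic(sample_a, sample_b):
    """Return (t, degrees of freedom) for two independent samples,
    assuming equal variances in both groups."""
    n_a, n_b = len(sample_a), len(sample_b)
    var_a = statistics.variance(sample_a)  # sample variance, n - 1 denominator
    var_b = statistics.variance(sample_b)
    df = n_a + n_b - 2
    # Pooled variance weights each group's variance by its degrees of freedom
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    std_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t = (statistics.mean(sample_a) - statistics.mean(sample_b)) / std_error
    return t, df

# Hypothetical lengths of stay (days) for an intervention and a control group
t_stat, deg_freedom = pooled_t_statistic([5, 7, 9], [2, 4, 6])
print(f"t = {t_stat:.3f} with {deg_freedom} degrees of freedom")
```

For a paired (before and after) design, the same idea applies to the list of within-pair differences rather than to two separate groups.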
t Tests • http://www.dimensionresearch.com/resources/calculators/ttest.html
t Tests • http://www.usablestats.com/calcs/2samplet
Non-Parametric Tests • Do not assume normal distribution • Used with more types of data: • Nominal, Ordinal, Interval, Discrete (infection vs no infection) • Chi Square (χ²) • Compares observed values against expected values • Example: Comparing SSI rates for Dr. X and Dr. Y • http://www.gifted.uconn.edu/siegle/research/ChiSquare/chiexcel.htm
Chi square • http://faculty.vassar.edu/lowry/newcs.html
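To show how observed values are compared against expected values, here is a sketch of the chi-square statistic for a 2×2 table. The SSI counts for the two surgeons are hypothetical, and the function is an illustration rather than the method used by the linked calculators; it applies no continuity correction.

```python
def chi_square_2x2(table):
    """Return the chi-square statistic for a table of observed counts.

    Expected count for each cell = row total * column total / grand total.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            chi2 += (observed - expected) ** 2 / expected
    return chi2

# Hypothetical counts: [SSI, no SSI] for Dr. X and Dr. Y
observed_counts = [[10, 90],
                   [20, 80]]
chi2_stat = chi_square_2x2(observed_counts)
print(f"Chi-square = {chi2_stat:.2f}")
```

With one degree of freedom, the critical value at the 0.05 level is 3.841, so a statistic above that suggests the two SSI rates differ by more than chance alone would explain.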
Relative Risk • Comparing the risk of disease in exposed individuals to individuals who were not exposed • RR = (Disease incidence in exposed) / (Disease incidence in non-exposed) = [a / (a + b)] / [c / (c + d)]
Relative Risk • RR = 1 • Risk in exposed equal to risk in non-exposed • No association • RR > 1 • Risk in exposed greater than risk in non-exposed • Positive association, possibly causal • RR < 1 • Risk in exposed less than risk in non-exposed • Negative association, possibly protective
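The relative risk formula can be sketched directly from the 2×2 cell labels, where a and b are exposed individuals with and without disease and c and d are non-exposed individuals with and without disease. The counts below are hypothetical.

```python
def relative_risk(a, b, c, d):
    """Return [a / (a + b)] / [c / (c + d)].

    a: exposed, diseased       b: exposed, not diseased
    c: non-exposed, diseased   d: non-exposed, not diseased
    """
    risk_exposed = a / (a + b)
    risk_unexposed = c / (c + d)
    return risk_exposed / risk_unexposed

# Hypothetical cohort: 20 of 100 exposed and 10 of 100 non-exposed developed disease
rr = relative_risk(a=20, b=80, c=10, d=90)
print(f"RR = {rr:.2f}")
```

An RR of 2 in this made-up cohort would be read as the exposed group having twice the risk of the non-exposed group.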
Odds Ratio • Comparing the odds that a disease will develop • OR = (Odds that a case was exposed) / (Odds that a control was exposed) = (a / c) / (b / d) = ad / bc
Odds Ratio • OR = 1 • Exposure not related to the disease • OR > 1 • Exposure positively related to disease • OR < 1 • Exposure negatively related to the disease
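The odds ratio reduces to the cross-product ad / bc, sketched below with hypothetical case-control counts. Cell labels follow the same 2×2 layout: a = exposed cases, b = exposed controls, c = non-exposed cases, d = non-exposed controls.

```python
def odds_ratio(a, b, c, d):
    """Return (a / c) / (b / d), which simplifies to (a * d) / (b * c)."""
    if b == 0 or c == 0:
        raise ValueError("b and c must be non-zero to compute an odds ratio")
    return (a * d) / (b * c)

# Hypothetical study: 20 exposed cases, 80 exposed controls,
# 10 non-exposed cases, 90 non-exposed controls
or_estimate = odds_ratio(a=20, b=80, c=10, d=90)
print(f"OR = {or_estimate:.2f}")
```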
95% Confidence Interval • Confidence Interval: a computed interval of values that, with a given probability, contains the true value of the population parameter. • 95% CI: if the study were repeated many times, about 95% of the intervals computed this way would contain the true value. • Allows you to assess variability of an estimated statistic • For ratio measures such as RR, OR, and SIR, if the confidence interval includes the value of 1, then the result is not statistically significant
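As an illustration of checking whether an interval includes 1, the sketch below computes an approximate 95% confidence interval for a relative risk. The slides do not give a CI formula; this uses the common log-based (Katz) approximation, which is an assumption introduced here, and the counts are hypothetical.

```python
import math

def relative_risk_ci(a, b, c, d, z=1.96):
    """Return (RR, lower, upper) using the log-based approximation.

    a, b: exposed with / without disease
    c, d: non-exposed with / without disease
    z = 1.96 corresponds to a 95% interval.
    """
    rr = (a / (a + b)) / (c / (c + d))
    # Standard error of ln(RR)
    se_log_rr = math.sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d))
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper

# Hypothetical cohort: 20 of 100 exposed and 10 of 100 non-exposed developed disease
rr_est, ci_low, ci_high = relative_risk_ci(a=20, b=80, c=10, d=90)
includes_one = ci_low <= 1 <= ci_high
print(f"RR = {rr_est:.2f}, 95% CI {ci_low:.2f} to {ci_high:.2f}")
print("Not statistically significant" if includes_one else "Statistically significant")
```

In this made-up example the point estimate is 2, yet the interval still reaches just below 1, so the association would not be called significant at the 95% level.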
Standardized Infection Ratio (SIR) • Compare the HAI experience among one or more groups of patients to that of a standard population’s (e.g. NHSN) • Risk-adjusted summary measure • Available for CAUTI, CLABSI, and SSI data • Details can be found in the SIR Newsletter, available at: http://www.cdc.gov/nhsn/PDFs/Newsletters/NHSN_NL_OCT_2010SE_final.pdf
SIR • Observed # of HAI – the number of events that you enter into NHSN • Expected or predicted # of HAI – comes from national baseline data* • The formula for calculating the number of expected CLABSI infections is: • Expected # of CLABSI = # central line days × (NHSN rate / 1000) • *Source of national baseline data: NHSN Report, Am J Infect Control 2009;37:783-805. Available at: http://www.cdc.gov/nhsn/PDFs/dataStat/2009NHSNReport.PDF
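The expected-count formula, and the SIR as observed divided by expected, can be sketched as below. All numbers are hypothetical placeholders (they are not NHSN baseline rates), and the function names are illustrative. Real NHSN SIRs sum the expected counts across several risk strata, which this single-unit sketch does not attempt.

```python
def expected_clabsi(central_line_days, nhsn_rate_per_1000):
    """Return expected infections = central line days * (NHSN rate / 1000)."""
    return central_line_days * (nhsn_rate_per_1000 / 1000)

def standardized_infection_ratio(observed, expected):
    """Return observed / expected infections."""
    if expected <= 0:
        raise ValueError("expected must be positive")
    return observed / expected

# Hypothetical unit: 1000 central line days, baseline rate of 2.0 per 1000 line days,
# and 3 CLABSIs observed
expected_count = expected_clabsi(central_line_days=1000, nhsn_rate_per_1000=2.0)
sir = standardized_infection_ratio(observed=3, expected=expected_count)
print(f"Expected = {expected_count:.1f}, SIR = {sir:.2f}")
```

An SIR above 1 indicates more infections than the baseline predicts, and below 1 indicates fewer; whether the difference is meaningful depends on the confidence interval around it.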