A Brief Introduction to Epidemiology - XIII (Critiquing the Research: Statistical Considerations) Betty C. Jung, RN, MPH, CHES
Learning/Performance Objectives • Quick review • Basics of inferential statistics • Common measures of association • To be able to statistically critique studies • Statistical Caveats • Statistical Issues • Statistical Rules of Thumb
Introduction • Refresh your memory • Basics of inferential statistics • Common measures of association used in epidemiologic studies
Measures of Association & Hypothesis Testing • Test Statistic = (Observed Association - Expected Association) / Standard Error of the Association • Type I Error: Concluding there is an association when one does not exist • Type II Error: Concluding there is no association when one does exist
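As one illustration of the test statistic formula, the sketch below computes a z statistic for the difference between two independent proportions. The counts are hypothetical, the function name is an assumption made for this example, and the pooled standard error is one common choice rather than the only one.

```python
import math

def two_proportion_z(events_a, n_a, events_b, n_b):
    """z statistic for the difference between two independent proportions."""
    p_a = events_a / n_a
    p_b = events_b / n_b
    # Pooled proportion under the null hypothesis of no association
    p_pool = (events_a + events_b) / (n_a + n_b)
    standard_error = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    observed = p_a - p_b
    expected = 0.0  # difference expected if there is no association
    return (observed - expected) / standard_error

# Hypothetical data: 30 of 100 exposed and 15 of 100 unexposed develop the outcome
z = two_proportion_z(30, 100, 15, 100)
print(round(z, 2))
```

A large absolute value of z suggests the observed association is unlikely under the null hypothesis; whether it matters in practice is a separate question.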
Measures of Association • Two Main Types of Measures • Difference Measures (Two Independent Means, Two Independent Proportions, The Attributable Risk) • Ratio Measures (Relative Risk, Relative Prevalence, Odds Ratio)
Measures of Association: Difference Measures • Two Independent Means • Two Independent Proportions • The Attributable Risk
Attributable Risk (AR) • The difference between 2 proportions • Quantifies the number of occurrences of a health outcome that is due to, or can be attributed to, the exposure or risk factor • Used to assess the impact of eliminating a risk factor
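A minimal sketch of the attributable risk as a difference between two proportions. The counts and the function name are hypothetical and serve only to illustrate the arithmetic.

```python
def attributable_risk(exposed_cases, exposed_total, unexposed_cases, unexposed_total):
    """Risk in the exposed group minus risk in the unexposed group."""
    risk_exposed = exposed_cases / exposed_total
    risk_unexposed = unexposed_cases / unexposed_total
    return risk_exposed - risk_unexposed

# Hypothetical cohort: 30 of 100 exposed and 10 of 100 unexposed develop the outcome
ar = attributable_risk(30, 100, 10, 100)
print(f"Attributable risk: {ar:.2f} (about {ar * 100:.0f} excess cases per 100 exposed)")
```

Read this way, the AR estimates how many cases among the exposed might be avoided if the risk factor were eliminated, assuming the association is causal.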
Measures of Association: Ratio Measures • Relative Risk (RR) • Relative Prevalence (RP) • Odds Ratio (OR)
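The ratio measures can be sketched from a 2x2 table. The cell labels (a, b, c, d), the function names, and the counts below are assumptions for illustration. Relative prevalence uses the same formula as relative risk, applied to prevalence rather than incidence data.

```python
def relative_risk(a, b, c, d):
    """a, b: exposed with/without outcome; c, d: unexposed with/without outcome."""
    return (a / (a + b)) / (c / (c + d))

def odds_ratio(a, b, c, d):
    """Cross-product ratio of the same 2x2 table."""
    return (a * d) / (b * c)

# Hypothetical 2x2 table
a, b = 30, 70   # exposed: outcome present, outcome absent
c, d = 10, 90   # unexposed: outcome present, outcome absent

rr = relative_risk(a, b, c, d)
or_ = odds_ratio(a, b, c, d)
print(f"RR = {rr:.2f}, OR = {or_:.2f}")
```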
Strength of Association (Relative Risk; Relative Prevalence; Odds Ratio)
Value below 1.0    Value above 1.0    Strength of Association
0.83-1.00          1.0-1.2            None
0.67-0.83          1.2-1.5            Weak
0.33-0.67          1.5-3.0            Moderate
0.10-0.33          3.0-10.0           Strong
<0.01              >10.0              Approaching Infinity
Source: Handler, A., Rosenberg, D., Monahan, C., Kennelly, J. (1998) Analytic Methods in Maternal and Child Health. p. 69.
Caveats about Classifying Data • All persons in an epidemiologic study must be classifiable • All study reports should clearly state criteria used for classifying variables • Studies that use different criteria for defining the presence of any health state are not comparable with respect to reported rates of that health state
Caveats about Quantitative & Categorical Variables • Information on variability between persons is lost when quantitative data are categorized • Collapsing a quantitative variable into a categorical variable with two or more categories may obscure the fact that the underlying variable has a much larger range in one category than in another category
Caveats about Quantitative & Categorical Variables (Continued) • Be careful about comparing ranges because a larger sample will generally have a larger range • Collapsing quantitative variables into categories limits the choices of appropriate statistical tests of significance • Try using commonly used categories (such as five- or ten-year age bands) to facilitate comparisons across studies
Berkson’s Fallacy • Associations based on hospital or clinic data are influenced by differential admission rates among groups of people • A similar source of selection bias occurs when associations are based on autopsy data
Caveats about P-Values • The size of the p-value has no relationship to the potential practical significance of the findings • The p-value reveals nothing about the magnitude of effect (i.e., how much one group differs from another), or the precision of measurement (i.e., the amount of random error) • The nature of the sample, not the p-value, will determine whether inferences to the population of interest can be made (and the sample must be representative of the population)
Confidence Interval Estimation • Uses the sample mean to construct an interval (range) of numbers to estimate the effect • Provides some indication of how probable it is (e.g., 68%, 90%, 95%), or how “confident” one can be, that the true mean lies within the range of numbers in the interval estimate
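A rough sketch of interval estimation for a mean, using only the Python standard library. It assumes a normal approximation, which suits larger samples (for small samples a t-based interval is more appropriate). The data values and the function name are hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev

def mean_confidence_interval(values, level=0.95):
    """Normal-approximation confidence interval for the mean of a sample."""
    n = len(values)
    sample_mean = mean(values)
    standard_error = stdev(values) / math.sqrt(n)
    # Two-sided critical value, e.g. about 1.96 for a 95% interval
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return sample_mean - z * standard_error, sample_mean + z * standard_error

# Hypothetical systolic blood pressure readings (mmHg)
readings = [118, 125, 130, 122, 135, 128, 121, 127, 133, 124]

for level in (0.68, 0.90, 0.95):
    low, high = mean_confidence_interval(readings, level)
    print(f"{level:.0%} CI: {low:.1f} to {high:.1f}")
```

The interval widens as the confidence level rises, which is the trade-off between confidence and precision.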
Greenhalgh’s Questions to Ask About the Analysis (A) • Have the authors set the scene correctly? • Have they determined whether their groups are comparable, and, if necessary, adjusted for baseline differences? • What sort of data have they got, and have they used appropriate statistical tests?
Greenhalgh’s Questions to Ask About the Analysis (B) • If the authors have used obscure statistical tests, why have they done so and have they referenced them? • Are the data analyzed according to the original protocol? • Were paired tests performed on paired data?
Greenhalgh’s Questions to Ask About the Analysis (C) • Was a two-tailed test performed whenever the effect of an intervention could conceivably be a negative one? • Were “outliers” analyzed with both common sense and appropriate statistical adjustments? • Have assumptions been made about the nature and direction of causality?
Greenhalgh’s Questions to Ask About the Analysis (D) • Have “P values” been calculated and interpreted appropriately? • Have confidence intervals been calculated, and do the authors’ conclusions reflect them? • Have the authors expressed the effects of an intervention in terms of the likely benefit or harm which an individual patient can expect?
Statistical Issues: Epidemiological Studies • Logistic regression for binary outcomes • Cox regression for survival analysis • Poisson distribution for disease incidence or prevalence • Odds ratio approximates relative risk when disease is rare
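The last point can be checked numerically. The sketch below fixes a relative risk and computes the corresponding odds ratio at a rare and a common baseline risk. The baseline risks and the function name are hypothetical choices for illustration.

```python
def odds_ratio_for(risk_unexposed, relative_risk):
    """Odds ratio implied by a given baseline risk and relative risk."""
    risk_exposed = risk_unexposed * relative_risk
    odds_exposed = risk_exposed / (1 - risk_exposed)
    odds_unexposed = risk_unexposed / (1 - risk_unexposed)
    return odds_exposed / odds_unexposed

# Same relative risk (2.0) at a rare and a common baseline risk
or_rare = odds_ratio_for(0.001, 2.0)    # outcome affects 0.1% of the unexposed
or_common = odds_ratio_for(0.20, 2.0)   # outcome affects 20% of the unexposed

print(f"Rare outcome:   RR = 2.00, OR = {or_rare:.3f}")
print(f"Common outcome: RR = 2.00, OR = {or_common:.3f}")
```

When the outcome is rare, the odds ratio stays close to the relative risk; when it is common, the odds ratio moves further from 1 than the relative risk does.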
Statistical Issues: Environmental Studies • Good statistical models are hard to come by • Publication bias can exaggerate excess risk • Odds ratios less than two (or greater than 0.5) can be interesting
Statistical Issues: Environmental Studies (Continued) • What is the statistical basis for the environmental standard? • Variability vs. uncertainty • What is the quality of the metadata? • Biomarkers as surrogates for clinical outcomes
Statistical Issues: Risk Assessment • Hazard identification • Dose-response evaluation • Exposure assessment • Risk characterization • Risk management
Statistical Rules of Thumb • Use a logarithmic formulation to calculate sample size for cohort studies • Use no more than 4 or 5 controls per case for case-control studies • Obtain at least 10 subjects for every variable investigated for logistic regression
Statistical Rules of Thumb • Increase sample size in proportion to the expected dropout rate. If the dropout rate is expected to be 20%, increase the sample size to n/0.80 • If dropout is greater than 20%, review reasons for dropouts • Accept substitutes with caution
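The dropout adjustment amounts to dividing the required sample size by the fraction of subjects expected to remain. A minimal sketch, with a hypothetical function name and a hypothetical target of 200 completers:

```python
import math

def enrollment_target(required_n, dropout_rate):
    """Number to enroll so that about required_n subjects remain after dropout."""
    if not 0 <= dropout_rate < 1:
        raise ValueError("dropout_rate must be in [0, 1)")
    # Round first to guard against floating-point noise, then round up to whole subjects
    return math.ceil(round(required_n / (1 - dropout_rate), 9))

# Hypothetical study needing 200 completers with 20% expected dropout: n / 0.80
to_enroll = enrollment_target(200, 0.20)
print(to_enroll)
```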
Statistical Rules of Thumb • Choosing cutoff points • Do not dichotomize unless absolutely necessary • Select an additive or multiplicative model according to: theoretical justification, practical application, and computer implication
References • For Internet Resources on the topics covered in this lecture, check out my Web site: http://www.bettycjung.net/ • Other lectures in this series: http://www.bettycjung.net/Bite.htm