Learn about differentiating statistical significance from substantive importance in research, key criteria for assessing significance, and the limitations of inferential statistics. Get insights into interpreting multivariate results.
Differentiating between statistical significance and substantive importance Jane E. Miller, PhD
Overview • Substantive significance defined • Quick review of statistics • What questions can they answer? • What questions can’t they answer? • How to implement a balanced presentation of multivariate results that covers both • Statistical significance • Substantive importance
Objective of most research papers • Few people who write about multivariate analysis are focused solely on statistical mechanics such as developing new computer algorithms or formal statistical tests. • Some statisticians and methodologists will have those interests. • Most of us are interested in studying some relationship among social science or health concepts. • Test a hypothesis, derived from theory or previous empirical studies. • Inferential statistics are a necessary tool for hypothesis testing in quantitative research.
What is substantive significance? • Substantive significance of an association between two variables. • “So what?” • “How much does it matter?” • Real-world relevance to topic • In various disciplines, substantive significance = • “clinically… • “economically… • “educationally… • …meaningful” variation.
Example: Body Mass Index & mortality • Body mass index (BMI) shows a statistically significant positive association with mortality. • But is that gradient substantively significant? • Is it worth designing an intervention to decrease BMI as a way of decreasing mortality?
Key criteria for assessing substantive significance • Is the association causal? • Will changing the hypothesized cause lead to change in the purported effect? • Will weight loss (reduced BMI) yield lower mortality? • Is the effect big enough to matter? • Is the excess mortality among overweight or obese persons large enough to justify a program? • Can the hypothesized cause be changed? • Is BMI malleable?
Example prose • “For every hour a boy played a video game, he read just two minutes less than a boy who didn’t play video games. Notably, non-gaming boys didn’t read much at all either, spending only eight minutes a day with a book.” • From a NYT summary of Cummings and Vandewater, 2007. “Relation of Adolescent Video Game Play to Time Spent in Other Activities.” Archives of Pediatrics and Adolescent Medicine.
Start with a hypothesis • The authors hypothesized that the more time adolescents spent on video games, the less time they spent on homework. • So far, the description is purely in terms of the concepts under study. • No statistical jargon, yet… • To formalize this for statistical testing: • Homework time = dependent variable (Y) • Gaming time = independent variable (Xi) • Ha: gaming time is negatively associated with homework time. • In other words, Xi is inversely associated with Y.
Contrast it against the null hypothesis • The assumption of “no difference between groups” is called the null hypothesis (H0). • In the study on effects of gaming on homework time, H0: homework time among gamers = homework time among non-gamers, OR homework time among gamers - homework time among non-gamers = 0 • In words, the null hypothesis states that there is no difference in the amount of time spent on homework by gamers versus non-gamers.
What question does inferential statistics answer? • “How likely would it be to obtain a difference at least as large as that observed between groups in the sample if in fact there is no difference between groups in the population?” • The p-value is that probability, calculated under the assumption that the null hypothesis is true. Rejecting H0 only when p<.05 limits the chance of falsely rejecting a true null hypothesis to 5%. • Conventional level of “statistical significance”: p<.05 • Strictly speaking, p<.05 tells us that for a large sample such as that used in the gaming study (N~1,400), the estimated effect of time spent gaming is at least 1.96 times its standard error in absolute value.
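The 1.96 criterion above can be illustrated with a short sketch. It assumes a large sample, so the normal approximation applies, and the estimate and standard error are hypothetical numbers chosen for illustration, not values from the gaming study.

```python
import math


def two_sided_p_value(estimate, std_error):
    """Two-sided p-value for H0: effect = 0, using the normal approximation."""
    z = estimate / std_error
    # For a standard normal Z, P(|Z| >= |z|) equals erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2))


# Hypothetical values (assumptions, not results from the gaming study).
estimate = -2.0   # change in the outcome, in minutes, per hour of gaming
std_error = 0.8   # standard error of that estimate

p = two_sided_p_value(estimate, std_error)
print(f"z = {estimate / std_error:.2f}, p = {p:.4f}")

# An estimate exactly 1.96 standard errors from zero sits at the p = .05 boundary.
print(f"p at |z| = 1.96: {two_sided_p_value(1.96, 1.0):.4f}")
```

Note that the calculation says nothing about whether the estimated effect is causal or large enough to matter.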
What questions DOESN’T it answer? • Whether the relationship is • Causal • Association ≠ causation • In the expected direction • The difference could be statistically significant but in the opposite of the hypothesized direction. • Big enough to matter in the real-world context • Each hour spent gaming reduced reading time by 2 minutes. Is that enough to induce genuine concern from parents or teachers? • Malleable
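One way to judge whether an effect is "big enough to matter" is to set it against the baseline level of the outcome. The sketch below uses the two figures quoted from the NYT summary (2 minutes less reading per hour of gaming, 8 minutes a day among non-gamers). Treating the relationship as linear, and the function name `reading_minutes`, are assumptions made for illustration only.

```python
def reading_minutes(gaming_hours, baseline=8.0, change_per_hour=-2.0):
    """Predicted daily reading minutes, assuming a linear relationship.

    baseline and change_per_hour come from the quoted summary; the
    linear form is an illustrative assumption, not the authors' model.
    """
    # Reading time cannot drop below zero minutes.
    return max(0.0, baseline + change_per_hour * gaming_hours)


for hours in (0, 1, 2):
    minutes = reading_minutes(hours)
    share = minutes / reading_minutes(0)
    print(f"{hours} h gaming: {minutes:.0f} min reading ({share:.0%} of non-gamer level)")
```

Seen this way, the absolute difference is small, and the more striking number is how little anyone in the sample read.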
Conclusion: Don’t stop at “p<.05”! • “p<.05” answers only part of what we want to know about our research question. • It is a necessary but not sufficient part of statistical analysis. • Also need to consider questions about: • Substantive significance • Direction • Size • Causality • Non-causal associations should not be used to inform policy or program changes. • Confounding or spurious associations should be ruled out. • Often why a multivariate model is estimated.
Substantive significance overlooked • Many statistics textbooks show how to assess and present statistical significance. • Few if any show how to assess and present substantive significance.
Balance presentation of statistical and substantive significance • How to include both: • Inferential statistics for formal hypothesis testing. • Interpretation of substantive significance of findings in the context of the specific research question. • Critical for policy-makers and others not formally trained in statistics.
Principles for presenting results • Name the specific variables. Avoid: • Writing about “my dependent variable” or “the effect size.” • Using acronyms from your database. • Report numbers in tables. • Interpret numbers in text. • Incorporate units and categories for variables into the prose description.
What to report when comparing numbers • Direction (AKA “sign”) • For categorical independent variables (IV), which category has higher value of the dependent variable (DV)? • For continuous IVs, is the trend in the DV up, down, or level? • Magnitude • How big is the difference in the DV across values of the IV? • Statistical significance
Gender as a predictor of birth weight • Poor: “Boys weigh significantly more at birth than girls.” • Concepts and direction but not magnitude. • Statistical significance is ambiguous: Is the term “significant” intended in the statistical sense or to describe a large difference? • Slightly better: “Gender is associated with a difference of 116.1 grams in birth weight (p<.01).” • Concepts, magnitude, and statistical significance but not direction: Was birth weight higher for boys or for girls? • Best: “At birth, boys weigh on average 116 grams more than girls (p<.01).” • Concepts, reference category, direction, magnitude, and statistical significance.
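The "best" sentence pattern can be sketched as a small helper that always reports direction, magnitude, and statistical significance together. The function names, the group means, and the p-value below are hypothetical; only the 116-gram difference and the sentence wording come from the example above.

```python
def p_label(p):
    """Convert a p-value to a conventional threshold label."""
    for cutoff, label in ((0.001, "p<.001"), (0.01, "p<.01"), (0.05, "p<.05")):
        if p < cutoff:
            return label
    return "not statistically significant"


def describe_difference(context, group_a, mean_a, group_b, mean_b, verb, units, p):
    """Write one sentence naming both groups, the direction, the size, and the p-value."""
    diff = mean_a - mean_b
    if diff == 0:
        return f"{context}, {group_a} and {group_b} {verb} the same on average."
    # Name the group with the higher mean first so the direction is explicit.
    higher, lower = (group_a, group_b) if diff > 0 else (group_b, group_a)
    return (f"{context}, {higher} {verb} on average {abs(diff):.0f} {units} "
            f"more than {lower} ({p_label(p)}).")


# Hypothetical means chosen so that the difference is 116.1 grams.
print(describe_difference("At birth", "boys", 3416.1, "girls", 3300.0,
                          "weigh", "grams", p=0.004))
```

Rounding to whole grams in the prose follows the example sentence; the unrounded value belongs in the table.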
Substantive significance in the discussion • Place findings back in the broader perspective of the original research question. • Do they correspond to your hypothesis in terms of • Direction (sign) of the effect? • Size? • Whether the effect size was attenuated when potential confounders or mediators were taken into account? • What is the evidence for a causal relationship? • If not causal, what explains the association? • If causal, what are the implications for policy, programs, etc.?
A substantive issue from gaming study • “But the meaning of the finding [that girls who are gamers spend less time than non-gamers on homework] is not clear, as high-academic achievers often spend less time on homework as well.” • Places the finding in broader context by discussing other correlates of homework time.
Another substantive issue • “Although only a small % of girls played video games, our findings suggest that gaming may have different social implications for boys than for girls.” • Raises the question of selection effects: which girls play video games, and do their other characteristics affect how they spend their time?
Relate findings to previous studies • Are your findings consistent with the published literature on the subject in terms of statistical significance, sign, and approximate size? • If not, why not? • Different sample (place, time, subgroup) • Different data source or study design • Different model specification • Included potential confounders not previously analyzed. • Tested for possible mediating effects of one or more factors.
Statistical significance in the discussion • Describe in words, not numbers. • No detailed standard errors, p-values, or test statistics in the discussion section. • Focus on the purpose of the statistical tests • Did the main variable of interest increase proportion of variance explained by the model? • Did some other variable “explain” the association between your key variable and the outcome?
Summary • Emphasize the substantive issues behind the statistical analyses. • Design the specification to match topic and data. • Choose plausible, relevant numeric contrasts. • Aim for a balanced presentation of statistical significance and substantive importance. • Use prose to ask and answer research question. • Use tables to report comprehensive, detailed statistics. • Use charts if needed to convey complex patterns.
Suggested resources • Chapter 3 (Statistical significance, substantive significance, and causality) in • Miller, J.E., 2015. The Chicago Guide to Writing about Numbers, 2nd Edition • Miller J.E. and Y.V. Rodgers, 2008. “Economic Importance and Statistical Significance: Guidelines for Communicating Empirical Research.” Feminist Economics. 14(2):117-149.
Suggested online resources • Podcast on comparing two numbers or series
Suggested practice exercises • Study guide to The Chicago Guide to Writing about Numbers, 2nd Edition. • Questions #2 and #4 from the problem set for chapter 3 • Suggested course extensions for chapter 3 • “Reviewing” exercises #1–3 • “Writing and revising” exercises #1–3
Contact information Jane E. Miller, PhD jmiller@ifh.rutgers.edu Online materials available at http://press.uchicago.edu/books/miller/numbers/index.html