Problems with statistical methods in management research … and a few solutions

Presentation for the INFORMS Annual Meeting, Austin, USA, November 2010
Michael Wood, Portsmouth University, UK
michael.wood@port.ac.uk
http://userweb.port.ac.uk/~woodm/presentations.htm
Presentation based on revised version of Wood (2010)

A case study of a typical journal paper leads to three conclusions: (1) The value of any statistical approach is seriously limited by various factors, e.g. the difficulty of generalizing to contexts other than the sample studied. (2) The conventional hypothesis testing format makes results almost meaningless. Instead, I suggest using confidence levels for hypotheses, and two ways of obtaining them (one a bootstrap method on a spreadsheet). (3) The analysis should be far more user-friendly. I’ll start with 3, then 2. Probably no time for 1.

Case study: Glebbeek and Bax (2004) • ...wanted to test the hypothesis “that employee turnover and firm performance have an inverted U-shaped relationship: overly high or low turnover is harmful.” • To do this, they analyzed the performance (profitability) of “110 offices of a temporary employment agency” by building regression models, using “net result per office” (p. 281) as the measure of performance, and staff turnover, and the square of turnover as independent variables, as well as three control variables.
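The regression described on this slide can be written out as below. This is a sketch only: the symbols are chosen here for illustration and are not taken from Glebbeek and Bax (2004). P_i stands for net result of office i, T_i for its staff turnover, and C_ik for the three control variables.

```latex
\[
P_i = \beta_0 + \beta_1 T_i + \beta_2 T_i^2 + \sum_{k=1}^{3} \gamma_k C_{ik} + \varepsilon_i
\]
\[
\text{Inverted U-shape: } \beta_2 < 0, \qquad \text{with the peak at } T^* = -\frac{\beta_1}{2\beta_2}
\]
```

With this notation, the inverted U-shape hypothesis is a claim about the sign of the squared term and about where the peak falls, not about a single coefficient being different from zero.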

Part of Table 2 in Glebbeek and Bax (2004)

Could easily be more user-friendly: Results and curvilinear predictions for Region 1 and mean absenteeism and age (Model 4)

Or ... Equivalent of Model 4 …

Null hypothesis testing and p values are not a good idea • Many people don’t know what they mean or misinterpret them • They don’t tell you how big the effect is, or how likely it is that the curvilinear hypothesis is right • Inverted U-shape hypothesis too obvious to be worth proving? Instead … use confidence intervals, but these are only possible for a single parameter, or …

Use bootstrapping: the bold line is the prediction from the data; the other lines are predictions from resamples, which represent other samples from the same source.

Obviously some of these resamples don’t confirm the inverted U-shape hypothesis, some do • Spreadsheet shows that 65% of 10 000 resamples produce inverted U-shape predictions • So confidence in hypothesis is 65% … which is what we want to know! • Notice that the two p values (both > 10%) don’t give you this information • See Wood (2009b). Analysis uses this spreadsheet: http://userweb.port.ac.uk/~woodm/BRQ.xls
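The resampling idea can be sketched in Python as follows. This is not the spreadsheet's implementation: the function name is hypothetical, the control variables are left out for brevity, and the test for an "inverted U-shape" (negative squared term with the peak inside the observed turnover range) is an assumption made here, not necessarily the criterion used in Wood (2009b).

```python
import numpy as np

def inverted_u_confidence(turnover, performance, n_resamples=10_000, seed=0):
    """Proportion of bootstrap resamples whose fitted quadratic is an inverted U.

    Assumption (not from the slides): a resample counts as an inverted U when
    the squared-term coefficient is negative and the peak of the fitted curve
    lies inside the observed range of turnover.
    """
    rng = np.random.default_rng(seed)
    turnover = np.asarray(turnover, dtype=float)
    performance = np.asarray(performance, dtype=float)
    n = len(turnover)
    lo, hi = turnover.min(), turnover.max()

    hits = 0
    for _ in range(n_resamples):
        # Draw n offices with replacement: one "other sample from the same source".
        idx = rng.integers(0, n, size=n)
        x, y = turnover[idx], performance[idx]
        # Fit performance = b0 + b1*turnover + b2*turnover^2 (highest power first).
        b2, b1, b0 = np.polyfit(x, y, 2)
        if b2 < 0 and lo < -b1 / (2 * b2) < hi:
            hits += 1
    return hits / n_resamples
```

Called with one turnover value and one performance value per office, the function returns a proportion between 0 and 1, which plays the role of the confidence level on the slide. Changing the criterion inside the loop changes the hypothesis being assessed.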

Sometimes possible to use p values to work out confidence levels • P value for Turnover in (linear) Model 3 is 0.7%, the unstandardized reg. coefficient is –1 778 • So confidence that coefficient is negative = 99.65% (100% minus half the 0.7% p value) … again avoids null hypotheses (Wood, 2009a)
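The arithmetic behind the 99.65% figure can be sketched as below. It assumes the reported p value is two-tailed and that the sampling distribution of the coefficient is symmetric; the function name is hypothetical, and Wood (2009a) should be consulted for the conditions under which this interpretation holds.

```python
def confidence_in_sign(p_two_tailed):
    """Confidence that the true coefficient has the same sign as the estimate.

    Assumption: p_two_tailed is a two-tailed p value against a null of zero,
    so half of it is the tail on the opposite side of zero from the estimate.
    """
    return 1 - p_two_tailed / 2

# Turnover in Model 3: p = 0.7%, estimated coefficient negative.
print(f"{confidence_in_sign(0.007):.2%}")  # prints 99.65%
```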

Consider whether any statistical approach is likely to be useful … The advantage of statistical methods is that they enable us to peer through the fog of noise variables to see patterns. But for a statistical analysis to be worthwhile it is necessary to check four issues: • Whether the necessity to focus on easily measurable variables damages the credibility of the results • Whether the target population is likely to be of lasting interest • Whether the amount of variation explained is likely to be sufficient to justify the effort, and, taking these factors into consideration, • Whether the research makes a useful addition to existing knowledge (including “common sense”).

Conclusions • Consider the usefulness of any statistical approach • Avoid a series of hypotheses to test. Instead, look at the size / nature of effects • Use confidence intervals, not p values • Or use confidence levels for hypotheses, derived from p values or from bootstrapping • Make results as user-friendly as possible

References • Glebbeek, A. C., & Bax, E. H. (2004). Is high employee turnover really harmful? An empirical test using company records. Academy of Management Journal, 47(2), 277-286. • Wood, M. (2009a). Liberating research from null hypotheses: confidence levels for substantive hypotheses instead of p values. http://arxiv.org/abs/0912.3878 • Wood, M. (2009b). Bootstrapping confidence levels for hypotheses about regression models. http://arxiv.org/abs/0912.3880 • Wood, M. (2010). The use of statistical methods in management research: suggestions from a case study. http://arxiv.org/abs/0908.0067v2

Thank you • Any thoughts? • All comments gratefully received (michael.wood@port.ac.uk)
