
Michael Wood June 2011 userweb.port.ac.uk/~woodm/presentations.htm

What on earth is a p value, a Process sigma, Cronbach’s alpha, the Black-Scholes formula, a Priority in AHP, or the Sunday Times score for Portsmouth University? On the interpretability of measurements based on mathematical models. Michael Wood June 2011


Presentation Transcript


  1. What on earth is a p value, a Process sigma, Cronbach’s alpha, the Black-Scholes formula, a Priority in AHP, or the Sunday Times score for Portsmouth University? On the interpretability of measurements based on mathematical models. Michael Wood June 2011 http://userweb.port.ac.uk/~woodm/presentations.htm

  2. Management makes use of many measurements based on mathematical models, but these are often difficult to interpret sensibly. This talk will look at some examples of such measurements, and the consequences of the problems of their interpretation – including the employment of unnecessary academics to teach what should be obvious, and supporting the bad decisions which led to the recent financial crash. I will then discuss how these, and other, measurements could be redesigned to make them more useful and user-friendly.

  3. I’ll look at four examples: • Six sigma and the process sigma measurement • Null hypothesis significance tests and p values • University league tables • Risk measurements and the normal (Gaussian) distribution

  4. Four examples … with some imaginary dialogues between the expert and a naive user ...

  5. Process sigma – the measurement linked to the Six Sigma philosophy The process sigma for this process is 4.833. What on earth does this mean? It means there are 430 dpmo (defects per million opportunities). Use this Sigma calculator. So why not just say 430 dpmo? Keep it simple! But this would be dumbing down. Life is difficult and we mustn’t join the modern trend of trying to make it easier. Why not? The complicated version adds nothing except confusing the uninitiated. (Similar comments apply to Cpk.) ... which must be a good thing!
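
The conversion in this dialogue can be sketched in a few lines. The slide does not say how the Sigma calculator works, so this is only a sketch under an assumption: that it uses the common Six Sigma convention of a normal distribution with a 1.5 sigma shift. The function name `dpmo_from_process_sigma` is made up for illustration.

```python
from statistics import NormalDist

# Assumption (not stated on the slide): the conventional 1.5 sigma shift.
SHIFT = 1.5

def dpmo_from_process_sigma(process_sigma: float, shift: float = SHIFT) -> float:
    """Defects per million opportunities implied by a process sigma."""
    # Probability that one opportunity falls beyond the shifted limit.
    defect_probability = 1.0 - NormalDist().cdf(process_sigma - shift)
    return defect_probability * 1_000_000

print(round(dpmo_from_process_sigma(4.833)))
```

Under that assumption a process sigma of 4.833 comes out close to the 430 dpmo quoted in the dialogue, which is the naive user's point: the two numbers carry the same information.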

  6. p values We’ve done a survey and found that women are more intelligent than men. p value is 0.004. What does the p value mean? It tells us how sure we can be about our results, taking sampling error into account. 0.0002 is very small. Not very impressive! It’s a bit difficult to explain p values to someone like you, but smaller is better. Less than 5% means you can be fairly sure women are cleverer than men; less than 1% is almost conclusive. Sounds like you’re trying to confuse me … Reverse measure of wrong thing, misinterpreted. Statman bits. User-friendly units – $/inch, etc.

  7. … p values I’m told that if the p value is 0.004 this means that we can be 99.8% confident that women really are more intelligent based on this data. Isn’t that a better way to put it? No, that’s a common misunderstanding ... you need to go on a course, although I’m not sure you’ll take it in ... There are lots of common misunderstandings, but I’m sure about the 99.8% confident ...
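
What a p value does measure can be sketched with a permutation test. The scores below are invented purely for illustration (the slides give no data), and `permutation_p_value` is a hypothetical helper, not the method used in the survey described. It estimates how often a difference at least as large as the observed one would arise if group labels made no difference at all.

```python
import random

def permutation_p_value(group_a, group_b, n_shuffles=10_000, seed=0):
    """Two-sided permutation p value for a difference in means."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    n_a = len(group_a)
    observed = sum(group_a) / n_a - sum(group_b) / len(group_b)
    pooled = list(group_a) + list(group_b)
    at_least_as_extreme = 0
    for _ in range(n_shuffles):
        rng.shuffle(pooled)  # pretend the group labels are arbitrary
        diff = sum(pooled[:n_a]) / n_a - sum(pooled[n_a:]) / (len(pooled) - n_a)
        if abs(diff) >= abs(observed):
            at_least_as_extreme += 1
    return at_least_as_extreme / n_shuffles

# Hypothetical test scores, invented for this example only.
women = [104, 110, 98, 115, 107, 112, 101, 109]
men = [99, 103, 95, 108, 100, 97, 105, 102]
print(permutation_p_value(women, men))
```

The result is the probability of data this extreme assuming there is no real difference. It is not the probability that the conclusion is wrong, so subtracting it from 1 does not give a confidence level in the everyday sense.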

  8. University League tables The Sunday Times score for Portsmouth University is 599. What does that mean? Well … e.g. Southampton got 783 points so Southampton is obviously a better place to study What are the points based on? Lots of things: e.g. Student satisfaction, Research quality So do Southampton do better on these two? ...

  9. ... University League tables Actually Portsmouth do a little better on student satisfaction (174 vs 169 out of 250), but Southampton do better on research quality (136 vs 112 out of 200). But student satisfaction is more important to students than research quality ... You’ve got to balance the two. The experts at the Sunday Times have done this. But different people may want different things ...
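
The point that different people may want different things can be made concrete with the two component scores quoted above. The weighting scheme below is an assumption made for illustration; it is not how the Sunday Times combines its components, and the names `weighted_score` and `preferred` are hypothetical.

```python
# Component scores quoted on the slide: (score, maximum).
SCORES = {
    "Portsmouth": {"satisfaction": (174, 250), "research": (112, 200)},
    "Southampton": {"satisfaction": (169, 250), "research": (136, 200)},
}

def weighted_score(university: str, satisfaction_weight: float) -> float:
    """Blend the two components, each rescaled to the range 0-1."""
    satisfaction, satisfaction_max = SCORES[university]["satisfaction"]
    research, research_max = SCORES[university]["research"]
    return (satisfaction_weight * satisfaction / satisfaction_max
            + (1 - satisfaction_weight) * research / research_max)

def preferred(satisfaction_weight: float) -> str:
    """University with the higher blended score for this weighting."""
    return max(SCORES, key=lambda u: weighted_score(u, satisfaction_weight))

for weight in (0.5, 0.9):
    print(weight, preferred(weight))
```

With equal weights Southampton comes out ahead; a student who puts 90% of the weight on satisfaction would rank Portsmouth first. A single published score hides that choice.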

  10. Measurements of risk Muddled Michael has a habit of losing his car keys when he goes on holiday. He reckons he has a 25% chance of losing his keys. He decides to consult an expert on risk … Easy! If he takes 9 spare keys with him, then the probability of losing all 10 keys is 0.25^10, which is about one chance in a million … which seems an acceptable risk. Michael puts all 10 keys on the same key ring (he doesn’t want to confuse himself by putting them in different places) and goes on holiday. The problem here is that the maths assumes that losing each key is an independent event. In fact if he loses one key he will probably lose the rest as well, so a more realistic estimate of losing all his keys is 25%! There are similar assumptions underlying most risk calculations – but if the calculations are more complicated it is easy not to notice.
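
The arithmetic behind the key example is short enough to check directly. This is a minimal sketch; the two function names are made up for illustration, and the "same ring" case simply treats the ten keys as a single item, as the slide does.

```python
P_LOSE_ONE = 0.25  # Michael's own estimate from the slide
N_KEYS = 10

def p_lose_all_independent(p: float = P_LOSE_ONE, n: int = N_KEYS) -> float:
    """All keys lost, if each loss were an independent event."""
    return p ** n

def p_lose_all_same_ring(p: float = P_LOSE_ONE) -> float:
    """All keys lost when they share one ring: one loss loses them all."""
    return p

print(p_lose_all_independent())  # a little under one in a million
print(p_lose_all_same_ring())    # 0.25
```

The same inputs give answers that differ by a factor of more than 250,000; only the independence assumption has changed.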

  11. Risk and the weather • The probability of more than 1 mm of rain falling in Southampton in one day is 31.5% (Estimated from Met Office graph based on 1971-2000 data.) • Then, theoretically, the probability of a week when it rains every day is 0.315^7, which suggests that this happens about every 9 years. • Two weeks with rain every day is a “once in 29000 years” event. • Almost certainly happens more often – last time was 20-30 November 2009, and the time before was 10-16 of the same month (Southampton Weather website) • The theory is wrong because the assumptions are wrong!
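
The "every 9 years" and "once in 29000 years" figures can be reproduced as follows. This is a sketch under two assumptions: days are treated as independent (the assumption the slide criticises), and every day is counted as a possible start of a wet run, which is what makes the results match the slide's numbers. The function name is hypothetical.

```python
P_RAIN = 0.315          # daily probability of more than 1 mm, from the slide
DAYS_PER_YEAR = 365.25

def years_between_wet_runs(run_length_days: int, p_rain: float = P_RAIN) -> float:
    """Expected years between runs of consecutive wet days, assuming independence."""
    p_run = p_rain ** run_length_days   # chance that a given day starts such a run
    return 1 / p_run / DAYS_PER_YEAR

print(round(years_between_wet_runs(7), 1))   # roughly 9 years
print(round(years_between_wet_runs(14)))     # roughly 29000 years
```

Since rainy days tend to cluster, the independence assumption fails and real wet weeks turn up far more often than this calculation suggests.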

  12. Risk and the normal distribution Very similar assumptions underlie the normal (Gaussian) distribution. This assumes that the variable depends on a large number of small independent factors. If not, the predictions can be misleading, especially for rare events. Many finance measurements depend on the normal distribution and similar assumptions – e.g. the Black-Scholes formula. OK in normal times, but tends to seriously underestimate the probability of big falls. “If the Dow Jones Industrial average moved in accordance with a normal distribution, it would have moved by 4.5% or more on only six days between 1996 and 2003 …. In reality … 366 times” (Mandelbrot cited by Buckley, 2011, p. 140). Black Monday (1987) was a 20 sd event, once in a million year event, experienced several times by people much younger than a million years (Buckley, 2011, p. 141). Measures “understood” but not assumptions … trust in a misunderstood version …
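
How quickly the normal distribution writes off large moves can be seen from its tail probabilities. The sketch below only evaluates the standard normal tail; it does not reproduce the Dow Jones figures, which would need the index's daily standard deviation, and the slide does not give that. The function name `normal_tail` is made up for illustration.

```python
import math

def normal_tail(k: float) -> float:
    """Probability that a standard normal variable falls k or more sd below its mean."""
    return 0.5 * math.erfc(k / math.sqrt(2))

for k in (2, 5, 10, 20):
    print(f"{k:>2} sd: {normal_tail(k):.3g}")
```

The probability collapses as k grows, so a model built on the normal distribution treats a fall of many standard deviations as effectively impossible. That is why its answers for rare events are so sensitive to whether the underlying assumptions hold.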

  13. What can go wrong? • Unnecessary time and effort expended: e.g. 50% of time spent on stats courses could be saved by redesigning concepts? Big savings in time and effort possible! • Failure to understand: complete, or of subtleties • Misunderstanding: of basic concept, or of assumptions, leading to misleading uses

  14. ... for example ... • P values • Massive amount of wasted time and energy (think of all those journal articles), general confusion, misinterpretations like significant=important • University league tables • scores taken too seriously, specific requirements ignored, creates uniformity because everyone thinks the same; rational world would be more varied • Risk • ignoring unrealistic assumptions led to over-confidence in mathematical measures which helped the financial crash ...

  15. Principles for designing measurements for understanding • Remember most measurements are determined by historical accident – therefore they can probably be improved for current users and uses. Design, not discovery. • Name should reflect meaning of result, not the method used to get there. • Make sure the direction is intuitive; use units and percentages as appropriate. • Must be an accurate description of meaning of measurement in users’ language. • Users must understand key assumptions (which are not irrelevant technicalities). • If possible, users should follow general idea of derivation.

  16. Reasons for the persistence of strange measurements • Aim is often ticking a box, not understanding • Users don’t see the problem • Interests of experts and teachers: mystification is good for business! Some measurements (e.g. process sigma) invented solely for this purpose? • The dumbing down myth: increased user-friendliness should lead to more, not less, powerful use of measurements • We need to dumb up so that even the dumb won’t do dumb things

  17. References • Buckley, Adrian (2011). Financial Crisis: causes, context and consequences. Harlow: Pearson Education. • iSixSigma (2011). Sigma calculator, available at http://www.isixsigma.com • Met Office graph • Southampton Weather website
