STORIES AND STATISTICS. Prepared by Frank Swain, National Coordinator for Science Training for Journalists, Royal Statistical Society. f.swain@rss.org.uk, 020 7614 3947.
Contents: Communicating numbers, Percentages & percentage points, Surveys, Averages, Uncertainty, Trends, Correlation versus causation, Probabilities (what makes a value unusual?), Absolute and relative risk, Imagery
#1 Communicating numbers
Breaking down big numbers Your numbers are characters in the story – give them some personality
Breaking down big numbers “1.4 million photos are uploaded a second” Realistic? 1.4m photos × 86,400 seconds in a day ÷ 500 million users ≈ 240 photos per person per day
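The sanity check on this slide is plain arithmetic. A minimal Python sketch, using only the slide's figures (the variable names are illustrative):

```python
# Sanity-check a "big number" by scaling it to one person per day.
photos_per_second = 1_400_000   # the quoted claim
seconds_per_day = 24 * 60 * 60  # 86,400
users = 500_000_000             # user count given on the slide

photos_per_user_per_day = photos_per_second * seconds_per_day / users
print(f"{photos_per_user_per_day:.1f} photos per user per day")  # 241.9
```

Roughly 240 uploads per user every day is implausible, which suggests the quoted rate is wrong.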
Putting numbers in context Numbers often need to be scaled to be meaningful e.g. per person, per passenger mile etc. Tourist info centres Hospitals
Putting numbers in context “The implant has been used by around 1.4 million women since it was introduced in 1999. In its 11 years of use, medicine regulators have recorded 584 pregnancies among users” “…for every 1,000 women using it, less than one will get pregnant over a three-year period”
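One rough way to rescale the implant figures is to divide the recorded pregnancies by the number of users. This is a crude rate that ignores how long each woman used the implant, so treat it as an order-of-magnitude check on the quoted "less than one per 1,000", not as the regulator's own calculation.

```python
# Crude rate: recorded pregnancies per 1,000 users (figures from the quote).
pregnancies = 584
users = 1_400_000

per_1000 = pregnancies / users * 1000
print(f"{per_1000:.2f} pregnancies per 1,000 users")  # 0.42
```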
Percentages Percentages below 1% are difficult to interpret: “3 in every 10,000” is clearer than 0.03%. Also be careful with percentages above 100%; it can be better to say double, triple, etc.
Percentages Know the difference between a percentage and a percentage point. VAT increased from 17.5% to 20% in January 2011. This is a rise of 2.5 percentage points, not a rise of 2.5%.
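The VAT example can be worked through in a few lines. This sketch assumes the previous standard rate of 17.5%, which is implied by the 2.5-point rise to 20%.

```python
# Percentage points vs. percent, using the VAT rise as the example.
old_rate = 17.5  # percent (assumed previous rate: 20 minus 2.5 points)
new_rate = 20.0  # percent

point_change = new_rate - old_rate                        # percentage points
relative_change = (new_rate - old_rate) / old_rate * 100  # percent of the old rate

print(f"{point_change:.1f} percentage points")   # 2.5
print(f"{relative_change:.1f}% relative rise")   # 14.3
```

The tax rate went up by 2.5 percentage points, but the tax itself became about 14% larger.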
UK smoking rate. 1948: 26m smokers, 65%. 1970: 25m smokers, 55%. “The smoking population shrank by 4 per cent” “The smoking rate has declined 10 percentage points” [Pictogram key: one figure = 1 million smokers or 1 million non-smokers]
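Both headlines on the smoking slide come from the same four numbers. The sketch below reads the earlier, higher figures as 1948 and the later ones as 1970.

```python
# Two ways of describing the same change in smoking.
smokers_1948, smokers_1970 = 26_000_000, 25_000_000
rate_1948, rate_1970 = 65.0, 55.0  # percent who smoke

# Change in the number of smokers, relative to the starting number.
smokers_change_pct = (smokers_1970 - smokers_1948) / smokers_1948 * 100
# Change in the smoking rate, in percentage points.
rate_change_points = rate_1970 - rate_1948

print(f"Smoking population: {smokers_change_pct:.1f}%")         # -3.8
print(f"Smoking rate: {rate_change_points:.0f} percentage points")  # -10
```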
#2 Surveys
What’s been counted? How many… chairs? footprints? hearts beating? ballot papers? …people?
Polls and surveys Polls are ways of finding out what a population thinks without asking everyone. Sample size: a poll of 1,000 people has a margin of error of about ±3 percentage points from sampling alone. So be careful with small subgroups of the sample: 100 people gives about ±10 percentage points.
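The ±3 and ±10 figures match the usual textbook approximation for a margin of error. The sketch below makes assumptions the slide does not state: a simple random sample, a 95% confidence level (z = 1.96), and the worst-case proportion p = 0.5. The function name is illustrative.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Approximate 95% margin of error, in percentage points,
    for a proportion p estimated from a simple random sample of size n."""
    return z * math.sqrt(p * (1 - p) / n) * 100

for n in (1000, 100):
    print(f"n = {n}: ±{margin_of_error(n):.1f} points")
# n = 1000: ±3.1 points
# n = 100: ±9.8 points
```

Real polls use weighting and quotas, so their true uncertainty is usually somewhat larger than this formula suggests.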
Survey example “…couples now expect to blow an average of £20,273 tying the knot…” • Which average? • Whose wedding? • Who’s asking?
#3 Do you have the exact questions the pollster asked? Are they precise and fair?
Polls and surveys Do the people surveyed reflect the wider population? (selection bias) Were the questions asked in a fair way? (response bias) Who commissioned the survey?
Statistical significance • So how do we know if an event really is interesting or if it was just random variation? • That’s what ‘statistical significance’ is about. • For example, is a cluster of cancer cases in an area suspicious or likely to be just natural variation?
League tables League tables are often meaningless because the natural variation is far bigger than the differences in the table
#4 There are many different ways of calculating an average. Which is the appropriate one to use?
Variation and distributions We often want to summarise a distribution of values with one number – an average. But there are different types of average: mean, median and mode.
Averages Average does not mean the same thing as typical. Different averages tell different stories – say which you are using.
Averages Mode, £275; Median, £377; Mean, £463
Averages Bottom line: Give an idea of the size and shape of the spread around the average.
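To see how the three averages can tell different stories, here is a small sketch with made-up, right-skewed figures. The data are hypothetical and are not the figures behind the slide.

```python
import statistics

# Hypothetical amounts in pounds: most are modest, one is very large.
amounts = [200, 275, 275, 275, 350, 400, 450, 600, 1500]

mean = statistics.mean(amounts)      # pulled upwards by the large value
median = statistics.median(amounts)  # the middle value
mode = statistics.mode(amounts)      # the most common value

print(f"mode £{mode}, median £{median}, mean £{mean:.0f}")
# Report the spread as well as the average.
print(f"range £{min(amounts)} to £{max(amounts)}")
```

With skewed data like this the mode is lowest and the mean is highest, the same ordering as on the slide.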
Normal distribution About 68.2% of values lie within one standard deviation of the mean, and about 95.4% within two.
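The two percentages can be checked with the standard library's error function: for a normal distribution, the share of values within k standard deviations of the mean is erf(k / √2). The helper name is illustrative.

```python
import math

def share_within(k):
    """Share of a normal distribution within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))

print(f"{share_within(1):.4f}")  # 0.6827
print(f"{share_within(2):.4f}")  # 0.9545
```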
#5 How accurate are the figures?
“The number of people out of work rose by 38,000 to 2.49 million in the three months to June, official figures show.” GOLDACRE: “The estimated change over the past quarter is 38,000, but the 95% confidence interval is ± 87,000, running from -49,000 to 125,000. That wide range clearly includes zero, no change at all.”
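The check described in the quote is easy to reproduce: build the interval from the estimate and its margin, then ask whether "no change" falls inside it. Figures are taken from the quote; the variable names are illustrative.

```python
# Does the 95% confidence interval for the change include zero?
estimated_change = 38_000
margin = 87_000

lower = estimated_change - margin
upper = estimated_change + margin
includes_zero = lower <= 0 <= upper

print(lower, upper, includes_zero)  # -49000 125000 True
```

When the interval includes zero, the data are consistent with no change at all, so "unemployment rose" is not a safe headline.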
#6 One change in the numbers does not make a trend. Blips often happen.
#7 Beware spurious connections that don’t amount to ‘a causes b’.
Correlation and causation A significant correlation between two variables does not imply one causes the other. Often there is a common cause for both variables, or it’s just a coincidence.
“Regression to the mean” The most abused correlation in the world!
#8 “One in a million”.
Probability and coincidences The chance of an event can be very small, but if it has lots of opportunities to happen, it can be near certain. Most weeks someone wins the lottery.
Probability “the chances… an astonishing 48 million to one” Actually it’s only 133,000 to one… …and there are around 167,000 third children born in the UK each year. Always think about how many opportunities there were for a coincidence to happen.
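The slide does not say what the coincidence was. The two odds are consistent with three siblings sharing a birthday, so the sketch below assumes that reading, along with 365 equally likely birthdays. Under that assumption, 48 million is 365 cubed (all three born on one named date) and 133,000 is 365 squared (the second and third child matching whatever date the first was born on).

```python
# Assumption (not stated on the slide): the story is three siblings
# sharing a birthday, with 365 equally likely birth dates.
days = 365
odds_specific_date = days ** 3  # all three born on one pre-chosen date
odds_any_shared_date = days ** 2  # children 2 and 3 match child 1's date

third_children_per_year = 167_000  # figure from the slide
expected_per_year = third_children_per_year / odds_any_shared_date

print(odds_specific_date)    # 48627125
print(odds_any_shared_date)  # 133225
print(f"{expected_per_year:.2f} expected cases per year")  # 1.25
```

On these assumptions the "astonishing" event would be expected about once a year in the UK.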
#10 You should know what the absolute and the relative risks are, and communicate both.
Risk Google tells me… diabetes, weight gain, cigarette smoke, HRT, solariums… all “double” my risk of cancer. What, me worry?
Risk example “Bacon increases risk of colorectal cancer by 20%” But how bad is that?
Risk example About 5 out of 100 people develop colorectal cancer.
Risk example If all 100 ate 3 extra rashers every day, the number would rise to 6 out of 100.
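The bacon example shows why both kinds of risk should be reported. A minimal sketch using the slide's figures (the variable names are illustrative):

```python
# Relative vs. absolute risk, using the bacon example.
baseline_risk = 5 / 100   # about 5 in 100 develop colorectal cancer
relative_increase = 0.20  # "increases risk by 20%"

new_risk = baseline_risk * (1 + relative_increase)
absolute_increase_points = (new_risk - baseline_risk) * 100

print(f"New risk: about {new_risk * 100:.0f} in 100")                          # 6
print(f"Absolute increase: {absolute_increase_points:.0f} percentage point")   # 1
```

A 20% relative increase sounds alarming, but here it means one extra case per 100 people.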