Learn how to avoid being fooled by statistics by understanding the potential pitfalls and tricks used to misrepresent data. Gain valuable insights into data collection, analysis, reporting, and visualization to make more informed decisions.
FOOLING BY STATISTICS 5 Ways to Avoid Being Fooled By Statistics by Jiafeng Li on August 8, 2013 in Market Research http://www.iacquire.com/blog/5-ways-to-avoid-being-fooled-by-statistics and http://www.webmechanix.com/data-misrepresentation-issues-marketing-agencies
If the validity of our data is compromised, the decisions we base on it are compromised too. • Statistics come from data, so there are many ways they can go wrong. Getting from data to statistics involves several stages: • data collection • data entry • data analysis • data reporting • data visualization • Each stage leaves room for error or malpractice. For example, the data collection method may be biased; errors may creep in during data entry; the data analysis may be flawed; the results may be misinterpreted during data reporting; and the data visualization may be misleading.
How Data Misrepresentation Can Cost You Thousands (Or More!)
Graphical Misrepresentation of Data • One of the easiest ways to make sense of large data sets is with a visual aid such as a graph or chart. While helpful for reporting, visual representations of data can be very misleading if used improperly. • The example below uses the number of “leads” generated over the course of 10 weeks. NOTE: Assume week 8 is simply an anomaly. No extra marketing efforts were made; it was just one great, random week.
A screenshot from Fox News in 2009. What?! The statistics in the pie chart add up to 167%? Aren’t they supposed to add up to 100%? If you see a chart like this, don’t try to guess what it means; just discard it!
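The sanity check described above is easy to automate. The following is a minimal sketch, not the method used by anyone in the source: the function name `shares_sum_to_whole`, the tolerance, and all slice values are assumptions made up for illustration (the suspicious slices were chosen only so that they total 167, as on the slide).

```python
def shares_sum_to_whole(shares, tolerance=0.5):
    """Return True when percentage slices of one whole total roughly 100."""
    return abs(sum(shares) - 100.0) <= tolerance


# Hypothetical slice values, chosen only so that they total 167 like the
# chart described above; they are NOT the figures from the broadcast.
suspicious_slices = [60, 55, 52]
# Hypothetical slices that do describe a single whole.
plausible_slices = [45, 35, 20]

print(sum(suspicious_slices), shares_sum_to_whole(suspicious_slices))
print(sum(plausible_slices), shares_sum_to_whole(plausible_slices))
```

Percentages can legitimately exceed 100% in total when respondents may choose more than one answer, but in that case a pie chart is the wrong chart type, because a pie shows parts of a single whole.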
Did you catch it? There are no labels on the x-axis. We have no clue where it starts. But look how scary it seems: the line zooms up from some point at the bottom to 9.9%. Oh my god! Prices are going up and times are bad.
This is a new trend. The presenter wants to show that the sales of Brand X have doubled, so the second image is twice as tall as the first. So what’s wrong? The flaw is that when we double the height, the width doubles too. Even though the label says 40 million, the second image covers 4 times the area of the first. Hence, to the eye and the mind, the growth 'looks' much larger than it is.
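The arithmetic behind the pictogram trick can be sketched in a few lines. This is an illustrative sketch only; the function names `pictogram_area_ratio` and `honest_scale_factor` are made up here and do not come from the original slides.

```python
import math


def pictogram_area_ratio(value_ratio):
    """Area ratio when BOTH height and width are scaled by value_ratio."""
    return value_ratio ** 2


def honest_scale_factor(value_ratio):
    """Scale to apply to height and width so that AREA grows by value_ratio."""
    return math.sqrt(value_ratio)


doubling = 2.0  # sales doubled, as in the slide
print(pictogram_area_ratio(doubling))   # area grows 4x, not 2x
print(honest_scale_factor(doubling))    # scale each side by about 1.414 instead
```

If a picture must be used at all, scaling each side by the square root of the ratio keeps the visible area proportional to the value; a plain bar that only grows in height avoids the problem entirely.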
Advertisements like to manipulate consumers’ minds with statistics • Look at the advertisement by AT&T below. Don’t believe it unless you are given a detailed report behind it. You don’t know how, where, or from whom the data was collected, and these factors can produce very different results. So statistics like these are not convincing either.
The last figure of 9.01% looks like a big jump from 8.06%. Inflation has shot up! Wait, where does the vertical axis (the y-axis) start? At 7. Should it not start at 0? That is what we were taught in school. Here lies the trick. To make the jump look significant, start the axis at 7. If the axis starts at zero, the jump does not look as high, so it is not a 'saleable story' and will never make the front page.
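How much the truncated axis exaggerates the jump can be worked out directly from the numbers on the slide (8.06%, 9.01%, and an axis starting at 7). The sketch below is an illustration, and the helper name `apparent_ratio` is an assumption introduced here: it compares how tall the two values appear above the axis baseline.

```python
def apparent_ratio(old, new, baseline):
    """Ratio of plotted heights above the axis baseline (new vs. old)."""
    return (new - baseline) / (old - baseline)


old_rate, new_rate = 8.06, 9.01  # figures quoted on the slide

true_ratio = apparent_ratio(old_rate, new_rate, baseline=0.0)
truncated_ratio = apparent_ratio(old_rate, new_rate, baseline=7.0)

print(round(true_ratio, 2))       # axis from 0: about 1.12 times as tall
print(round(truncated_ratio, 2))  # axis from 7: about 1.9 times as tall
```

With the axis starting at 0, the new value is drawn about 12% taller than the old one; with the axis starting at 7, it is drawn almost twice as tall, even though the data are identical.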
A statistic without a source is useless. If the source is provided, always check the authority of the source. • Credible statistics look like these:
Sampling Bias • If samples are not representative, the statistics will be biased. It is also good practice to check the sample size: if the sample is too small, the results can easily be skewed by chance. • During data collection, sampling bias can arise from unrepresentative demographics, unrepresentative geographic locations, etc. With sampling bias, the results have little or no value, since they can be quite different from what the actual world is like. • Remember the 1936 presidential election between Roosevelt and Landon? The Literary Digest, one of the most respected magazines at that time, predicted that Landon would win the election by a large margin, while the real election results turned out to be the opposite. The cause was sampling bias. The Literary Digest polled over 10 million people and received 2.4 million responses. Those who responded to the poll were mostly upper-class people, who were more likely to vote for the Republican candidate. A large sample does not help when the sample is unrepresentative.
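The mechanism behind the Literary Digest failure can be sketched with a small calculation. Apart from the 10 million and 2.4 million figures quoted above, every number below is hypothetical and chosen only to illustrate the effect: the group shares, support levels, and chances of ending up in the sample are assumptions, not historical data.

```python
# All group figures are HYPOTHETICAL and exist only to illustrate the mechanism.
groups = {
    "upper_class": {
        "population_share": 0.20, "support_landon": 0.70, "chance_in_sample": 0.90,
    },
    "everyone_else": {
        "population_share": 0.80, "support_landon": 0.30, "chance_in_sample": 0.10,
    },
}


def true_support(groups):
    """Support for Landon in the whole (hypothetical) electorate."""
    return sum(g["population_share"] * g["support_landon"] for g in groups.values())


def polled_support(groups):
    """Support for Landon among those who actually end up in the sample."""
    weights = {
        name: g["population_share"] * g["chance_in_sample"]
        for name, g in groups.items()
    }
    total = sum(weights.values())
    return sum(weights[n] * groups[n]["support_landon"] for n in groups) / total


# Response rate implied by the figures in the text ("over 10 million" polled).
response_rate = 2.4e6 / 10e6

print(round(true_support(groups), 3))    # minority support in the electorate
print(round(polled_support(groups), 3))  # majority support in the biased sample
print(response_rate)                     # roughly 0.24
```

In this made-up electorate Landon has 38% support, yet the biased sample reports about 58%, so the poll calls the wrong winner no matter how many millions of responses it collects.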
Statistics That Are Skewed Purposely • Even when the data results are correct, statistics can be misinterpreted, so you will see wrong conclusions drawn from accurate data analysis. In other cases, statistics are skewed or visually exaggerated to serve the author’s purposes. In this part, we address the issues that arise at the “data reporting” and “data visualization” stages.
GAS PRICES • Fox Chart Showed Gas Prices Were Consistently Rising. On February 20, Fox News displayed a graphic that used three random data points: One was the national average gas price from the day the graphic aired, the other two were chosen from the previous week and the previous year. From Fox News' America's Newsroom: • In Reality, Fox Cherry Picked Data To Hide Fact That Fluctuating Gas Prices Had Fallen From High Points. An accurate representation of gas prices over the 12-month period starting in February 2011 showed that gas prices in February 2012 -- the highest point on Fox's graphic -- were actually down from their high in April-May of 2011. From AAA:
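The cherry-picking effect described above can be reproduced with any fluctuating series. The sketch below uses a hypothetical list of 13 monthly prices invented for illustration; they are not the AAA figures and not the values shown in the broadcast.

```python
def is_strictly_increasing(values):
    """True when every value is larger than the one before it."""
    return all(a < b for a, b in zip(values, values[1:]))


# HYPOTHETICAL monthly average prices over one year (oldest first).
prices = [3.10, 3.50, 3.90, 3.95, 3.70, 3.60, 3.55,
          3.45, 3.40, 3.30, 3.35, 3.50, 3.55]

# Pick only three points: a year ago, the previous period, and today.
cherry_picked = [prices[0], prices[-2], prices[-1]]

print(is_strictly_increasing(cherry_picked))  # the three points only go up
print(is_strictly_increasing(prices))         # the full series does not
print(max(prices) > prices[-1])               # today's price is below the peak
```

Three well-chosen points tell a story of steady increase, while the full series shows that the latest price sits below an earlier peak. Asking for the complete series is the simplest defense.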
Misinterpretation and Logical Fallacies • The conversation below is one I overheard between a couple: • Boyfriend: You’re cool when you’re drunk. Girlfriend: So I am not cool when I am not drunk?! Boyfriend: WTF?? • This is a typical logical fallacy: countering a proposition with a second one as if the two covered every possibility, when they are not collectively exhaustive. Collectively exhaustive means that one of the propositions must be true and no other possibility exists. However, “cool when drunk” and “not cool when not drunk” are not collectively exhaustive. “Cool when not drunk” is also a possibility, and the girlfriend simply ignores it. • People make the same kind of logical mistake when interpreting data results. Given that 37% of New York City citizens have been to Central Park once, a conclusion like “this indicates 63% of NYC citizens have never been to Central Park” is incorrect. Zero visits and one visit are not collectively exhaustive. People may have been to Central Park 2 times, 3 times, etc. So the 63% includes not only those who have never been to Central Park, but also those who have been there multiple times. Whenever you see an interpretation like this, watch out for this logical fallacy.
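The Central Park example can be checked with a tiny breakdown. Only the 37% figure comes from the text above; the split of the remaining 63% between "never" and "more than once" is hypothetical and chosen purely for illustration.

```python
# Percent of residents by number of visits. Only the 37 comes from the text;
# the other two values are HYPOTHETICAL.
visit_share_percent = {
    "never": 10,
    "exactly_once": 37,
    "more_than_once": 53,
}

# The three categories together ARE collectively exhaustive.
assert sum(visit_share_percent.values()) == 100

not_exactly_once = 100 - visit_share_percent["exactly_once"]
never = visit_share_percent["never"]

print(not_exactly_once)  # 63: everyone who did NOT visit exactly once
print(never)             # 10: far fewer than 63 have never been
```

The complement of "exactly once" is "never or more than once", so the 63% says nothing about how many people have never visited until the full breakdown is known.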
A video of Oxford mathematician Peter Donnelly reveals the common mistakes humans make in interpreting statistics, and the devastating impact these errors can have on the outcome of criminal trials. Let’s watch the video: https://www.youtube.com/watch?v=kLmzxmRcUTo