Join Michael Hansen as he explores the theme of the evening, discussing the importance of statistics, biases, and the art of manipulating data. Discover how statistics can be used to prove or distort cause-and-effect relationships. Don't miss this eye-opening presentation!
Tonight’s Quantitative Thinking Presentation (Theme TBA Later) “There are three kinds of lies: Lies, d______ed lies, and statistics.” —attributed to Mark Twain Michael Hansen St. Albans School of Public Service Washington, DC June 29, 2006
My background . . . • Math and music (piano, violin, viola) • M.S., Applied Mathematics, UIUC • 1986–1998: U.S. Government contractor, primarily in the Pentagon • 1998–present: STA math teacher
Icebreaker Question • In what city was I born? (Hint: It is one of the ten largest cities in the U.S.)
Your Backgrounds . . . • Tell me something unusual that is not in your capsule bio
Which of these subjects are you most likely to use later in life? • Geometry • Algebra • Statistics • Calculus
Why are geometry, algebra, and calculus taught in high school? • Goodness only knows. • Life is cruel. • Math teachers are greedy and want full employment. • These subjects provide a chance to teach abstract reasoning skills.
mortar : brick :: glue : • stone • wood • water • oil
arachnid : spider :: marsupial : • lizard • kangaroo • fish • beetle
What have we been doing here? • Answer: Gathering statistics. • What is a statistic? • A number computed from data.
Which of these are examples of statistics? • A batting average • A vote tally for a political candidate • A census (headcount) for a community • A person who is injured in a traffic accident • The statement, “Most Americans support our troops in Iraq” • The statement, “51% of the adult voters polled stated that the war in Iraq has been a mistake” • The statement, “51% of adult American voters think the war in Iraq was a mistake”
Terms • Data = FACTS (plural); one fact is a datum • Statistic = a number computed from data • Parameter = a number that describes a population • Key idea: We use statistics to estimate parameters. • Bias = error in parameter estimation that is systematic (i.e., tending to one side or the other) • Sampling error = the inevitable result of trying to estimate parameters with a sample that is smaller than the entire population • Margin of error (m.o.e.) = an estimate of sampling error (m.o.e. shrinks as sample size grows, assuming you have a random sample) • Note: Bias is avoidable. Sampling error is not.
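The difference between sampling error and bias can be seen in a small simulation. This is only a sketch under made-up assumptions: the true parameter (50%), the response rates (90% and 60%), and the function names are all hypothetical and do not come from the presentation. The random-sample statistic drifts toward the parameter as the sample grows, while the biased statistic settles near the wrong value (about 60%) no matter how large the sample gets.

```python
import random

TRUE_P = 0.50  # hypothetical parameter: true share of the population holding an opinion


def random_sample_estimate(n, rng):
    """Statistic computed from a simple random sample of n people."""
    hits = sum(rng.random() < TRUE_P for _ in range(n))
    return hits / n


def biased_sample_estimate(n, rng, p_yes_responds=0.9, p_no_responds=0.6):
    """Statistic from n respondents when 'yes' people are likelier to respond.

    The response rates are illustrative assumptions that model nonresponse bias.
    """
    yes = total = 0
    while total < n:
        holds_opinion = rng.random() < TRUE_P
        response_rate = p_yes_responds if holds_opinion else p_no_responds
        if rng.random() < response_rate:
            total += 1
            yes += holds_opinion  # True counts as 1, False as 0
    return yes / total


rng = random.Random(2006)  # fixed seed so the run is repeatable
for n in (100, 10_000, 100_000):
    unbiased = random_sample_estimate(n, rng)
    biased = biased_sample_estimate(n, rng)
    print(f"n={n:>7}  random sample: {unbiased:.3f}  biased sample: {biased:.3f}")
```

A larger sample shrinks the sampling error in both columns, but only a better sampling method removes the bias in the second one.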
Fun Chart: Who wins this election? (5.8 million votes cast) Statistics: Because of errors in the tabulation process, the m.o.e. for each candidate is at least several thousand votes, possibly as high as 60,000 votes. The initial tally shows Candidate A with a lead of 537 votes over Candidate B.
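One way to put the numbers on this slide side by side is sketched below. The vote total, the lead, and the 60,000 upper figure come from the slide; the value 3,000 is only a stand-in for "several thousand", and the helper name is hypothetical.

```python
VOTES_CAST = 5_800_000
LEAD = 537
MOE_LOW = 3_000    # assumption: a stand-in for "at least several thousand"
MOE_HIGH = 60_000  # upper figure quoted on the slide


def lead_exceeds_moe(lead, moe):
    """True only if the lead is larger than the margin of error."""
    return lead > moe


print(f"Lead as a share of all votes: {LEAD / VOTES_CAST:.4%}")
for moe in (MOE_LOW, MOE_HIGH):
    print(f"m.o.e. {moe:>6,}: lead exceeds m.o.e.? {lead_exceeds_moe(LEAD, moe)}")
```

Under either assumed m.o.e., the lead is smaller than the uncertainty in the count itself.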
Who do you think won? • Candidate A, by 537 votes (no recount) • Candidate A, maintaining a lead even after the recount • Candidate B, pulling ahead after the recount • Tie
Tonight’s Theme (Try to guess it from these examples.) • Space aliens stare at Earth through powerful telescopes. They observe that most of the people buying diet soda are a bit overweight, to put it mildly. Conclusion: Diet soda makes people fat. • A teacher enforces discipline in the classroom by deducting one point for each minor infraction (burping, tardiness, etc.) and tells a student who has lost several points, “Your actions have lowered your grade from A minus to B plus.” • Terrorists hijack airplanes and crash them into buildings, killing several thousand Americans. Spokesmen for the terrorists, as well as a number of international commentators, say that U.S. government policies are to blame. • A Roman army unit performs poorly in battle, with several of the men deserting in fear. After the battle, the commander decimates the troops (i.e., kills every tenth man present). The troops learn that if they desert, they will be sealing a death warrant for their friends still in the unit. • A village under Nazi occupation revolts. In reprisal, the Nazis murder nearly all the villagers, but they also lay waste to several neighboring villages. The revolts quickly cease. • A sheriff running for re-election proudly states, “During my four years in office, the violent crime rate has dropped by 28%. Vote for me to continue the progress!”
What is the theme? • Space aliens are not very bright • Correlation does not imply causation • Distortion of cause and effect • How to lie with statistics
Cause and Effect • A most interesting subject: Even babies are fascinated by it! • All public policy decisions hinge on cause and effect • Warning: Remember the “Law of Unintended Consequences” (examples of refrigerator subsidies, Australian rabbit invasion, AFDC) • Difficulties: • Mathematics has nothing to say on the subject. • We must turn to the relatively new science of statistics. • There is only one way to prove cause and effect, and the conditions are rarely met. • But . . . that does not stop politicians, journalists, and activists from asserting cause-and-effect relationships.
Making it Real—Iran Group • What economic statistics or intelligence statistics can be cited to show whether Iran is violating the NPT? • What can we infer about Iran’s timetable for acquiring a nuclear capability? What extrapolation, subject to what set(s) of assumptions, can be made upon these statistics? • What is the strategic effect upon Iran of the threat of massive nuclear retaliation? What quantitative evidence can we use, or is any evaluation simply a “gut feeling”? • Can economic sanctions or military strikes cause changes in Iranian policies? Which would be more effective and under what circumstances? • What would be the macroeconomic effects on the Iranian economy, other regional economies, and world commodities markets if Iranian oil supplies were disrupted? • What is U.S. public opinion toward putting economic and/or military pressure on Iran, and how is public opinion likely to shift over time? • Baseline statistics: Daily oil consumption of the U.S., key allies, China, India, and the world; daily oil production of Iran • Oil reserve estimates based on petroleum engineers’ statistics
Making it Real—Iran Group • Bottom line for this group: Your analysis will probably be qualitative, not quantitative. • Geopolitical interactions are grounded in economic and statistical realities, but predicting how things will play out is extraordinarily difficult. (Chaos theory?) • Diplomacy, nuance, gamesmanship, alliances, politics, and p.r. are more important factors. • Mathematicians are of little value in this realm. • Statisticians may be able to help with polling and econometric modeling.
Making it Real—Energy Group • The science of statistics is probably more useful for you than for the Iran group. • Does CO2 cause global warming? • Do other greenhouse gases cause global warming? • What fraction of global warming is caused by human activities (e.g., burning of fossil fuels)? • Will changing our behaviors cause a reduction or reversal of global warming? • What government policies on current and alternative fuels will cause changes in people’s behavior? • To what degree do U.S. energy policies cause changes in other countries’ economies or behaviors? • A high reliance on oil revenues in a country appears to be correlated with a low level of democracy and personal freedom. Is this correlation evidence of a cause-and-effect relationship, or can the correlation be explained by lurking variables?
Making it Real—Cause and Effect in the News • Do mercury-based preservatives in childhood immunizations cause autism? Many parents whose children suddenly became autistic shortly after having an immunization are convinced that this is so. • Did Vioxx cause heart attacks and strokes? If so, should Merck be required to pay damages to patients who suffered a heart attack or stroke after taking Vioxx? • Does second-hand cigarette smoke cause lung cancer or other diseases? If so, should smoking be banned in all indoor locations? • Do American high school classrooms cause boys to lag behind girls academically? If so, should curricula and teaching styles be revamped? What would be the effects of doing that? • Do silicone implants cause lupus and other autoimmune disorders in women? • Did Zicam cause people to lose their senses of taste and smell? • Do cell phones cause automobile accidents?
Some Common Fallacies • Anecdotal data • Emotional appeals or ad hominem attacks • Conflict of interest (COI), appearance of COI, or accusations of COI • “Post hoc, ergo propter hoc” (roughly translated: “after this, therefore caused by this”) • Similar: Correlation confused with causation • Extrapolation (i.e., assuming trends will continue) • Overanalyzing time series of uncontrolled systems (e.g., trying to predict the stock market by using “technical analysis”)
Are there any people who know what they’re doing? • Yes, a few. • Everyone should take a course in statistics. • A statistics course is one of the few courses where you are unlikely to study any actual statistics! (You will study the science and practice of statistics.) Actual statistics are seen in nearly all other courses: history, biology, sociology, etc.
“Of the 1300 randomly chosen adult American voters who were polled, 38% were satisfied with President George W. Bush’s performance in office. The margin of error is plus or minus 3%.” • I know exactly what the statement means. • I know quite well what the statement means. • I do not know what the statement means. • I do not even know whether or not I know what the statement means.
“Of the 1300 randomly chosen adult American voters who were polled, 38% were satisfied with President George W. Bush’s performance in office. The margin of error is plus or minus 3%.” Exactly what does this statement mean? • The true parameter is between 35% and 41%. • The true statistic is between 35% and 41%. • The true parameter is probably between 35% and 41%. • The true statistic is probably between 35% and 41%.
On the previous slide, why did the correct answer involve the word “probably”? • sampling error • bias • lack of randomness in the sample • sample size was too small for a believable survey of the entire nation
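For readers curious where a figure like "plus or minus 3%" could come from, here is a minimal sketch using the standard formula for the margin of error of a sample proportion. The 95% confidence level (z = 1.96) is an assumption; the poll statement does not say which level was used, and the function name is hypothetical.

```python
import math


def margin_of_error(p_hat, n, z=1.96):
    """Margin of error for a sample proportion from a simple random sample.

    z = 1.96 corresponds to the conventional 95% confidence level (assumed here).
    """
    return z * math.sqrt(p_hat * (1 - p_hat) / n)


p_hat = 0.38  # the statistic: share of the 1300 polled voters who were satisfied
n = 1300      # sample size from the poll statement

moe = margin_of_error(p_hat, n)
print(f"m.o.e. = {moe:.1%}")  # about 2.6%, which rounds to the quoted 3%
print(f"interval using the quoted 3%: {p_hat - 0.03:.0%} to {p_hat + 0.03:.0%}")
```

The interval describes where the parameter probably lies; a random sample can still land outside it by chance, which is the sampling error the slide refers to.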
If you are making public policy decisions, why should you take nearly all news reporting (as opposed to news analysis or commentary) with a grain of salt? • sampling error • the most accurate reporting is usually internal to the government, not coming from the news media • media bias • anecdotal data
How to Talk Back to a Statistic (Source: Darrell Huff’s classic 1950s book, How to Lie With Statistics) • Who says so? (Beware of COI.) • How does he or she know? (What methods were used to compute the statistic? Many numbers are simply unknowable in a practical sense.) • What’s missing? (Ask what the m.o.e. is, what assumptions were used, what time period was used, and whether there was a control group.) • Did someone change the subject? (Beware of “semiattached data,” “gee-whiz” graphs, and extrapolation.) • Does it make sense? Modern-day buzzterm: FACE VALIDITY.
Regarding Face Validity . . . • Use your common sense and read critically. Even reputable sources contain errors. • Compute ratios or per capita values. (For any national budget number, simply take the number of billions and multiply by 3 or 4. Example: A $50 billion program is costing each adult in the country about $200.) • Excerpt from The Washington Post Magazine on April 2, 2006 (posted on their website): Adult education is thriving nationwide, with more than 92 million adults taking college classes. At the nearly 70 two- and four-year colleges in the Washington area, an estimated 175,000 adults are enrolled, 40 percent of them on a part-time basis. • Which statistic in this excerpt lacks face validity?
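The per capita habit on this slide is easy to sketch in code. The figure of 250 million adults is an assumption, chosen because it is what the slide's own example implies ($50 billion divided by $200 per adult); the constant and function names are hypothetical. The same kind of ratio can be applied to the numbers in the excerpt as a quick plausibility check.

```python
ADULTS_IN_US = 250_000_000  # assumption implied by the slide's $50 billion -> $200 example


def cost_per_adult(program_cost_dollars, adults=ADULTS_IN_US):
    """Spread a national budget number evenly over the adult population."""
    return program_cost_dollars / adults


# The slide's example: a $50 billion program
print(f"${cost_per_adult(50_000_000_000):,.0f} per adult")

# Ratios for the figures quoted in the excerpt
share_in_college = 92_000_000 / ADULTS_IN_US
students_per_college = 175_000 / 70
print(f"Share of all adults taking college classes: {share_in_college:.0%}")
print(f"Adult students per Washington-area college: {students_per_college:,.0f}")
```

Turning each raw count into a ratio makes it easier to judge whether the number passes a common-sense test.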
What is a statistic? • a number that describes a population • a fact • a number computed from data • a person who is the victim of crime or an accident
What is a parameter? • a number that describes a population • a fact • a number computed from data • an adjustable constant, or a boundary condition for a problem
What is the only way to establish cause and effect? • a careful observational study • a careful observational study with a sufficiently large sample size and freedom from bias • an experiment • a controlled experiment
On a scale of 1 to 7, please rate how you feel about this statement: “I am planning to take a statistics course sometime within the next three years.” (Mark 7 if you have already taken a statistics course in high school, e.g., AP Statistics.) • Strongly Disagree • Disagree • Somewhat Disagree • Neutral • Somewhat Agree • Agree • Strongly Agree
Q and A Thank you for your time and attention! Michael Hansen (e-mail: modd “at sign” modd.net) St. Albans School of Public Service Washington, DC June 29, 2006