Understanding Lies, Damn Lies, and Statistics: A Look At Why So Many People Find Statistics Frustrating John P. Holcomb, Jr. Cleveland State University Ohio MAA Section Meeting April 1, 2005
Outline • Why do statisticians find public reporting of statistics frustrating? • Why does the public find statistics frustrating? • Why do students find statistics frustrating? • What are some major differences between statisticians and mathematicians? • Emphasize our similarities
"There are Three Kinds of Lies: Lies, Damn Lies and Statistics." • Attributed to Benjamin Disraeli (1804 - 1881) • Prime Minister (1868, 1874 - 1880) • Said to be popularized by Mark Twain in the United States
Statistics Affirming Quotations • Frederick Mosteller (Harvard University) • “It is easy to lie with statistics, but it is easier to lie without them.”
What Drives Statisticians Nuts? Yahoo! News, (September 7, 2004)
Study Links TV to Teen Sexual Activity • “Teenagers who watch a lot of television with sexual content are twice as likely to engage in intercourse than those who watch few such programs.” (Reuters)
Rebecca Collins, “This is the strongest evidence yet that the sexual content of television programs encourages adolescents to initiate sexual intercourse and other sexual activities.” CAUSES
The problem is that this is an Observational Study • Researchers did not sit 1,792 adolescents down and force them to watch television • Adolescents chose their own “treatment”
Confounding • Occurs when some other variable(s) affects both the independent variable (TV watching) and the dependent variable (Sexual Activity) • Can be obvious and not-so-obvious • This is hard for statistics students when it is covered in class, but for the public …
Problem with All Observational Studies • Cannot assume there is no confounding • So critics always have an opportunity to criticize observational studies • This was the tobacco industry's defense against studies linking smoking to disease
So why am I concerned? • There is no mention of the role of parental supervision • What is the consequence? • The public is misled about the meaning of the result
Experiments • Allow researchers to make “causal” conclusions • Randomly assign subjects to “treatments” and “control” to ensure balance • Control does not necessarily mean “sugar pill” • Both groups are alike, on average, with respect to every known variable as well as every unknown variable EXCEPT the treatment variable
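A minimal simulation sketch of the contrast between self-selected "treatment" and random assignment. Everything in it is hypothetical: the hidden variable `supervised`, the probabilities, and the function name `simulate` are illustrative assumptions, not figures from the TV study. The exposure is given no causal effect at all, yet the self-selected version still shows a strong association.

```python
import random


def rate(outcomes):
    """Fraction of True values in a list of booleans."""
    return sum(outcomes) / len(outcomes)


def simulate(n, randomized, seed=0):
    """Return (outcome rate among exposed, outcome rate among unexposed).

    Hypothetical model: 'supervised' is a hidden confounder, and the
    exposure has NO causal effect on the outcome.
    """
    rng = random.Random(seed)
    exposed_outcomes, unexposed_outcomes = [], []
    for _ in range(n):
        supervised = rng.random() < 0.5
        if randomized:
            # Experiment: a coin flip assigns the treatment.
            exposed = rng.random() < 0.5
        else:
            # Observational study: unsupervised subjects choose exposure more often.
            exposed = rng.random() < (0.2 if supervised else 0.8)
        # The outcome depends only on the confounder, never on the exposure.
        outcome = rng.random() < (0.1 if supervised else 0.3)
        (exposed_outcomes if exposed else unexposed_outcomes).append(outcome)
    return rate(exposed_outcomes), rate(unexposed_outcomes)


obs_exposed, obs_unexposed = simulate(200_000, randomized=False)
exp_exposed, exp_unexposed = simulate(200_000, randomized=True)
print(f"self-selected: exposed/unexposed risk ratio = {obs_exposed / obs_unexposed:.2f}")
print(f"randomized:    exposed/unexposed risk ratio = {exp_exposed / exp_unexposed:.2f}")
```

Under these made-up numbers the self-selected ratio comes out well above 1 while the randomized ratio sits near 1, because the coin flip balances the hidden variable across both groups.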
Example II • July 9, 2002, The Journal of the American Medical Association releases the results of the “Women’s Health Initiative (WHI)” • Headlines Across America warned women about the risks from Hormone Replacement Therapy (HRT) • New York Times: Study Is Halted Over Rise Seen In Cancer Risk
Belief: Estrogen and Progesterone would help women live healthier lives Findings: • Increased risk for breast cancer (26%) • Increased risk of heart disease (29%) • Increased risk of Stroke (41%)
Previous Good News • 1962 – Observational study suggests estrogen therapy reduces risk of breast and genital cancers • 1980 – A study shows that estrogen and progesterone together reduce risk for endometrial cancer • 1985 – The Nurses’ Health Study, with 121,964 subjects finds lower rate of heart disease in those taking progesterone • 1995 – Same study finds that estrogen and progesterone reduce heart attack risk by 39%
Ethical Question • For the WHI, can we deprive the control group of this great treatment?
What Went Wrong? • One major issue – Nurses’ Health Study is observational • WHI is a clinical Trial • One theory is the confounder is health – healthier nurses took the HRT and stayed on the HRT • Another theory is the nature of the study – those who had some kind of heart ailment stopped taking medicine
Even though WHI was a clinical trial (experiment), informed consent can add bias • Also, women in WHI were older (most were 60 or older rather than just going through menopause)
Caution • Observational Studies are not useless • Often point to issues needing further investigation • Experiments • Animal Studies
What Did Not Make the Headlines (or Even the Article) • Recall the earlier increase: breast cancer (26%) • In absolute terms: 8 more cases for every 10,000 women • For 8 more cases to equal a 26% increase: P(Breast Cancer in Placebo Group) = 31/10,000 = .0031 • P(Breast Cancer in the HRT Group) = 39/10,000 = .0039
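The arithmetic on this slide can be checked with a short sketch. It uses only the counts given above (31 and 39 cases per 10,000 women); the variable names are illustrative.

```python
from fractions import Fraction

# Counts from the slide: cases per 10,000 women in each group.
placebo_cases, hrt_cases, per = 31, 39, 10_000

risk_placebo = Fraction(placebo_cases, per)
risk_hrt = Fraction(hrt_cases, per)

# Absolute risk increase: the difference between the two risks.
absolute_increase = risk_hrt - risk_placebo
# Relative risk increase: that difference as a fraction of the baseline risk.
relative_increase = absolute_increase / risk_placebo

print(f"risk (placebo): {float(risk_placebo):.4f}")
print(f"risk (HRT):     {float(risk_hrt):.4f}")
print(f"absolute increase: {float(absolute_increase * per):.0f} per {per:,} women")
print(f"relative increase: {float(relative_increase):.1%}")
```

The same 8 extra cases read as "about 26% more" in relative terms and as "8 in 10,000" in absolute terms, which is the distinction the headlines left out.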
Frustrations: • Difference between observational studies and experiments is subtle • For statisticians, there is no contradiction, but for the public and even scientists, there is a glaring contradiction • Confirms the culture of disbelief – and who is blamed? • There is inherent uncertainty in the process
Statistics is Perfect for the Law • Since all conclusions are based on probability – we can never say anything definitively • Probabilities of exactly 0 or 1 are rarely, if ever, achieved in practice
Implications for Teaching • These are the topics we need to discuss • Study Design • Confounding and Causation • Treatment vs. Placebo • Absolute and Relative Risk • Uncertainty • “All models are wrong, but some are useful” • George Box (University of Wisconsin)
Further Implications • In the courses: • Introductory statistics • Statistical literacy • Mathematics for liberal arts • “Statistical thinking will one day be as necessary a qualification for efficient citizenship as the ability to read and write.” • H.G. Wells
Rational vs Emotional • Statistics and Mathematics have the perception of being rule enforcers • People do not like being told what to do or what not to do • We are constantly saying do not play the Lottery • My life is a personal failure
Mega Millions • July 2, 2004 • Mega-Millions jackpot reaches $290,000,000 • Probability of winning is .000000007399 = 7.399 × 10^-9
Dr. Killjoy • 57 times more likely to die from a motor vehicle accident that day than win MegaMillions • 21 times more likely to die from a lightning strike in a year than win MegaMillions
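A rough sketch of where these numbers could come from. The game format is an assumption not stated in the slides: 5 white balls chosen from 52 plus 1 Mega Ball chosen from 52, which reproduces the quoted probability. The two "implied" probabilities are simply the slide's ratios multiplied back out, not independently sourced mortality figures.

```python
from math import comb

# Assumed 2004 game format (not given in the slides): 5 of 52 white balls, 1 of 52 Mega Balls.
jackpot_combinations = comb(52, 5) * 52
p_jackpot = 1 / jackpot_combinations

# Ratios quoted on the "Dr. Killjoy" slide, multiplied back into probabilities.
p_vehicle_death_that_day = 57 * p_jackpot
p_lightning_death_in_year = 21 * p_jackpot

print(f"jackpot combinations: {jackpot_combinations:,}")
print(f"P(win jackpot with one ticket): {p_jackpot:.3e}")
print(f"implied P(motor vehicle death that day): {p_vehicle_death_that_day:.3e}")
print(f"implied P(lightning death in a year):    {p_lightning_death_in_year:.3e}")
```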
Why Do Students Find Statistics Frustrating? • Stilted Language • Recall an earlier phrase • “Cannot assume there is no confounding” • We are the masters of the double negative
Confidence Intervals • Students want to say • The probability the mean is in the interval is 95% • What we require them to say • “We are 95% confident the interval (a,b) captures the unknown population mean” • When drawing random samples from a population, calculating the intervals in this manner captures the unknown mean 95% of the time.
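The last bullet describes a long-run property of the procedure, which a small simulation can make concrete. This is a sketch under stated assumptions: a normal population with a known standard deviation (so a z-interval applies), and arbitrary illustrative values for `mu`, `sigma`, `n`, and `reps`.

```python
import random


def coverage(reps=10_000, n=30, mu=100.0, sigma=15.0, seed=1):
    """Fraction of 95% z-intervals (sigma assumed known) that capture mu."""
    rng = random.Random(seed)
    half_width = 1.96 * sigma / n ** 0.5
    hits = 0
    for _ in range(reps):
        # Draw one random sample and compute its mean.
        xbar = sum(rng.gauss(mu, sigma) for _ in range(n)) / n
        # Did this particular interval capture the true mean?
        if xbar - half_width <= mu <= xbar + half_width:
            hits += 1
    return hits / reps


rate = coverage()
print(f"fraction of intervals capturing the true mean: {rate:.3f}")
```

Each individual interval either contains the mean or it does not; the 95% describes how often the method succeeds across repeated samples, which is why the "95% confident" wording is required.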
Hypothesis Testing • Students want to say “Accept Null” • Have to say “Fail to Reject Null” • (AND we make them put it in context) • Again we statisticians can’t be certain (or accepting) of anything
3. Statistics Taught By Folks Who Are Not Trained Statisticians • Statistics was added “on the side” to their training • Not sure of the “why”, so it is difficult to motivate • Teaching statistics is “scraping the bottom of the barrel” in classroom assignments
“In God We Trust, All Others Bring Data” • W. Edwards Deming (TQM Guru)
At CSU, there are at least 7 different departments teaching some kind of introductory statistics comprising over 100 faculty • Only 4 faculty on campus have a Ph.D. in Statistics • At many schools that number may be even lower
Differences Between Mathematics and Statistics • Statistics is too dirty • Mathematics is pure and pristine • Mathematics is built on axioms, definitions, and theorems • Statistics is built on “flawed” processes right from the very beginning
Giant Leaps of Faith • Assume the population is definable • Assume the population is stable • Assume the sample is representative (bias free) • If all this is true, then we can rely on Mathematics for our confidence interval to capture the mean 95% of the time.
Often mathematicians want “perfect” studies or nothing • “If you do not know what to measure, measure anyway, you’ll learn what to measure next time.” • David Moore (Purdue University) • Assessment
No Quod Erat Demonstrandum • I get a representative sample • The sample size is large enough to invoke the Central Limit Theorem • I calculate • I still do not know if my interval contains the unknown mean
ERGO • I have to wonder . . . • Mathematicians do not like uncertainty
Difference #2 • Applied Statisticians have to communicate with other researchers • These researchers often have limited statistical training • (Present company excluded), mathematicians are not exactly known for their patience with those deemed less worthy
The main challenge is to take a scientific hypothesis and turn it into a testable statistical hypothesis • Have to convince researchers that input prior to collecting data is critical • Cleveland Cavaliers • Have to educate them not to “Stone the Messenger”
Difference #3 • Statisticians make more money • Statisticians have more job options
Try going to www.idoproofs.com • Great Opportunities in Math • 101 Careers in Mathematics • http://www.maa.org
My Own History • BS in Mathematics • MS in Mathematics • Took Prelims in Real Analysis, Topology, Complex Analysis, and Math Stat • Would have gotten a Ph.D. in mathematics … • I do love Mathematics and Mathematicians • HONEST!