
Some Controversies of Statistics Education and Practice


Presentation Transcript


  1. Some Controversies of Statistics Education and Practice • Do we teach the right stuff? Larry Weldon

  2. Does it matter what we teach? • Just mental exercise? • Content not so crucial? • But modern statistics is a new subject • Need new tools, concepts, culture

  3. Overview of talk • The role of parametric inference • Is it declining in favor of data analysis? • The practice of statistics • Are we serving practitioners? • Problems of pedagogy • Do our students learn what we intend?

  4. Part I: Focus on Parametrics Is it still appropriate? More Parametric Modeling? Less Parametric Inference?

  5. Ex 1: A time series

  6. Something we should do? Teach more smoothing and time series at an early stage
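As an illustration of how little machinery early smoothing needs, here is a minimal sketch of a centered moving average in Python. The series is simulated (a sine trend plus Gaussian noise) and is purely hypothetical; it is not the time series shown on slide 5, and the window width of 11 is an arbitrary choice.

```python
import math
import random


def moving_average(values, window):
    """Centered moving average; the window shrinks near the ends of the series."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    return smoothed


def rms_error(estimate, truth):
    """Root-mean-square distance between an estimate and the true trend."""
    return math.sqrt(sum((e - t) ** 2 for e, t in zip(estimate, truth)) / len(truth))


random.seed(1)
# Hypothetical data: a slow sine-wave trend plus independent Gaussian noise.
trend = [10 + 3 * math.sin(2 * math.pi * t / 50) for t in range(200)]
series = [m + random.gauss(0, 2) for m in trend]
smooth = moving_average(series, window=11)

raw_rmse = rms_error(series, trend)
smooth_rmse = rms_error(smooth, trend)
print(f"RMS error of raw series vs trend:      {raw_rmse:.2f}")
print(f"RMS error of smoothed series vs trend: {smooth_rmse:.2f}")
```

No parametric model of the trend is fitted here; the smooth is simply plotted or compared with the data, which is the kind of result a graph reports best.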

  7. Ex 2: Modeling Variability

  8. Ex 3: Regression Setting: Approximation Model • E.g. predict house price ($000) from square feet and lot size • In South Delta: Price = -200 + 0.1*LTSZ + 0.1*SQFT • In North Delta: Price = -350 + 0.067*LTSZ + 0.067*SQFT
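The two fitted equations on slide 8 can be evaluated directly. The sketch below does so for one house; the lot size and floor area are hypothetical values chosen for illustration, and the assumption that LTSZ is measured in square feet is not stated on the slide.

```python
def price_south_delta(lot_size, sq_feet):
    """Approximate price in $000, using the South Delta equation from slide 8."""
    return -200 + 0.1 * lot_size + 0.1 * sq_feet


def price_north_delta(lot_size, sq_feet):
    """Approximate price in $000, using the North Delta equation from slide 8."""
    return -350 + 0.067 * lot_size + 0.067 * sq_feet


# Hypothetical house: 6000 sq ft lot, 2000 sq ft of floor area.
lot_size, sq_feet = 6000, 2000
south = price_south_delta(lot_size, sq_feet)
north = price_north_delta(lot_size, sq_feet)
print(f"South Delta: about ${south:.0f},000")
print(f"North Delta: about ${north:.0f},000")
```

For this hypothetical house the two equations give roughly 600 and 186 ($000). The same inputs produce very different prices in the two areas, which suggests that each linear model is a local approximation rather than a general law.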

  9. Linear Model Useful?

  10. Why do we focus on parametric inference? • Before computers for graphics and simulation: need for data reduction • Pre-computer: intense interest in “best” methods for estimating parameters … e.g. unbiasedness criterion

  11. Ex 4. Unbiasedness • Being exactly right, on average! • Better to be close often? • E.g. estimation of σ²: MMSE estimator?

  12. Normal Model

  13. Expo Model

  14. MMSE Estimator? • Does MSE really tell us what we want to know about our estimator of VARiance? • What is distribution of signed error of estimate of VAR?
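The questions on slides 11 and 14 can be explored by simulation. The sketch below is an illustration under stated assumptions, not the author's demo: it draws normal samples of size 10 with true variance 1 and compares the divisors n-1 (unbiased), n, and n+1 (the standard minimum-MSE choice for normal data). It reports bias, MSE, and the share of samples in which the variance is underestimated. The sample size, replication count, and seed are arbitrary.

```python
import random


def sum_sq_dev(sample):
    """Sum of squared deviations from the sample mean."""
    mean = sum(sample) / len(sample)
    return sum((x - mean) ** 2 for x in sample)


def compare_divisors(n=10, reps=20000, true_var=1.0, seed=42):
    """Simulate signed errors of variance estimates SS/d for d = n-1, n, n+1."""
    rng = random.Random(seed)
    divisors = (n - 1, n, n + 1)
    errors = {d: [] for d in divisors}
    sd = true_var ** 0.5
    for _ in range(reps):
        ss = sum_sq_dev([rng.gauss(0.0, sd) for _ in range(n)])
        for d in divisors:
            errors[d].append(ss / d - true_var)  # signed error of the estimate
    summary = {}
    for d, errs in errors.items():
        summary[d] = {
            "bias": sum(errs) / reps,
            "mse": sum(e * e for e in errs) / reps,
            "share_under": sum(e < 0 for e in errs) / reps,
        }
    return summary


results = compare_divisors()
for d, s in results.items():
    print(f"divisor {d:2d}: bias {s['bias']:+.3f}  MSE {s['mse']:.3f}  "
          f"share underestimating {s['share_under']:.2f}")
```

The lists of signed errors collected inside `compare_divisors` could be drawn as histograms rather than reduced to three numbers. A single MSE figure hides how often, and in which direction, each estimator misses.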

  15. Typical Error or Whole Dist’n? • MSE measures typical error. • Distribution of error is more informative & easy to report. • Whole distributions often do not need parametric summary! Use Graph.

  16. Ex 6. Does Variance measure Variation? • E.g. Variance of Yield in Bushels Squared?

  17. Analysis of Variance: SST = SSR + SSE • How does it compare with Analysis of SD? • Is R-squared a ratio of useful units? • Is “64% of variance” as useful as “80% of SD”?
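The link between “64% of variance” and “80% of SD” is a square root: sqrt(0.64) = 0.8. The sketch below checks the decomposition SST = SSR + SSE on a small least-squares fit and reports both ratios. The data are hypothetical and made up only for this illustration.

```python
import math


def fit_line(x, y):
    """Ordinary least-squares intercept and slope for a simple linear regression."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


# Hypothetical data, not taken from the talk.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2.1, 2.9, 4.2, 3.8, 6.1, 5.4, 7.9, 7.2]

intercept, slope = fit_line(x, y)
fitted = [intercept + slope * xi for xi in x]
y_bar = sum(y) / len(y)

sst = sum((yi - y_bar) ** 2 for yi in y)              # total sum of squares
ssr = sum((fi - y_bar) ** 2 for fi in fitted)         # regression sum of squares
sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))  # residual sum of squares

r_squared = ssr / sst
sd_ratio = math.sqrt(ssr / sst)        # SD of fitted values / SD of y
resid_sd_ratio = math.sqrt(sse / sst)  # SD of residuals / SD of y

print(f"SST = {sst:.3f}, SSR + SSE = {ssr + sse:.3f}")
print(f"share of variance explained: {r_squared:.2f}")
print(f"fitted SD as share of SD of y: {sd_ratio:.2f}")
print(f"residual SD as share of SD of y: {resid_sd_ratio:.2f}")
```

Sums of squares add but SDs do not. With R-squared of 0.64, the fitted values have 80% of the SD of y while the residuals still have 60% of it.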

  18. Anova Table
                 DF  Sum Sq  Mean Sq  F value    Pr(>F)
      block       5  343.29    68.66   4.4467  0.015939 *
      N           1  189.28   189.28  12.2587  0.004372 **
      P           1    8.40     8.40   0.5441  0.474904
      K           1   95.20    95.20   6.1657  0.028795 *
      N:P         1   21.28    21.28   1.3783  0.263165
      N:K         1   33.14    33.14   2.1460  0.168648
      P:K         1    0.48     0.48   0.0312  0.862752
      Residuals  12  185.29    15.44
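The arithmetic behind the table on slide 18 is short enough to reproduce. The sketch below recomputes Mean Sq = Sum Sq / DF and F = Mean Sq / residual Mean Sq from the printed DF and Sum Sq columns. Because the printed sums of squares are rounded, the recomputed F values agree with the table only to about three decimal places. The p-values are not recomputed, since that would need an F-distribution routine outside the standard library.

```python
import math

# (term, DF, Sum Sq) copied from the Anova table on slide 18.
rows = [
    ("block", 5, 343.29),
    ("N", 1, 189.28),
    ("P", 1, 8.40),
    ("K", 1, 95.20),
    ("N:P", 1, 21.28),
    ("N:K", 1, 33.14),
    ("P:K", 1, 0.48),
]
resid_df, resid_ss = 12, 185.29
resid_ms = resid_ss / resid_df  # residual mean square, in squared response units

f_values = {}
for term, df, ss in rows:
    mean_sq = ss / df
    f_values[term] = mean_sq / resid_ms
    print(f"{term:6s} Mean Sq = {mean_sq:7.2f}   F = {f_values[term]:7.4f}")

# The residual SD is back in the units of the response itself.
resid_sd = math.sqrt(resid_ms)
print(f"Residual Mean Sq = {resid_ms:.2f}, residual SD = {resid_sd:.2f}")
```

The residual SD, unlike the residual mean square, is in the same units as the response and can be compared directly with the size of the effects.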

  19. Variance? • Students need to know squared units are weird!

  20. Role of Simulation • Exploring intractable strategies • Exploring model estimates • Calibrating complex models to match outcome data One use of parametric models is to do simulations. But this is different from “inference” as we usually teach it.

  21. Traffic Demo • Accordion Effect in heavy highway traffic • Thanks to Andrej Blejec for teaching me R

  22. Ex 7: Traffic Accordion • Simple rule: adjust speed to allow a 2.5-second gap (and add a little noise) • Uses only simple models. Go to R …
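The talk's demo was run in R and is not reproduced in this transcript. The following Python sketch is a reconstruction under stated assumptions, not the author's code. Each following car sets its speed to (distance to the car ahead) / 2.5 seconds, with a small multiplicative noise term. The lead car cruises at 25 m/s and brakes to 10 m/s for ten seconds. The number of cars, the speeds, the noise level, the time step, and the braking episode are all hypothetical choices.

```python
import random


def simulate_traffic(n_cars=20, steps=600, dt=0.5, time_gap=2.5,
                     cruise=25.0, noise_sd=0.02, seed=0):
    """Single-lane car following: each follower aims for a fixed time gap."""
    rng = random.Random(seed)
    spacing = cruise * time_gap  # equilibrium distance between cars
    positions = [-i * spacing for i in range(n_cars)]  # car 0 leads
    speed_history, gap_history = [], []
    for step in range(steps):
        t = step * dt
        # Hypothetical disturbance: the lead car brakes between t = 30 s and 40 s.
        lead_speed = 10.0 if 30.0 <= t < 40.0 else cruise
        speeds, gaps = [lead_speed], []
        for i in range(1, n_cars):
            gap = positions[i - 1] - positions[i]
            gaps.append(gap)
            # Simple rule: the speed that makes the time gap equal to time_gap,
            # perturbed by a little multiplicative noise (an assumption).
            noisy = (gap / time_gap) * (1.0 + rng.gauss(0.0, noise_sd))
            speeds.append(max(0.0, noisy))
        positions = [p + v * dt for p, v in zip(positions, speeds)]
        speed_history.append(speeds)
        gap_history.append(gaps)
    return speed_history, gap_history


def time_of_min_speed(speed_history, car, dt=0.5):
    """Time (in seconds) at which the given car is slowest."""
    speeds = [row[car] for row in speed_history]
    return speeds.index(min(speeds)) * dt


speed_history, gap_history = simulate_traffic()
for car in (1, 5, 10, 15):
    slowest = min(row[car] for row in speed_history)
    when = time_of_min_speed(speed_history, car)
    print(f"car {car:2d}: slowest speed {slowest:5.1f} m/s at t = {when:5.1f} s")
```

In this sketch the slowdown reaches cars further back later and in a more spread-out form, so the gaps compress and then stretch along the line. Nothing is being estimated or tested; the parametric pieces (a speed rule and a noise distribution) serve only to generate behaviour that can be watched.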

  23. Use of Parametric Models For simulation! One reason why applied prob’y modeling is so useful.

  24. Part II: Needs of Stats Practice Students prepared for practice? Preparation for fast learning of applied stats?

  25. Target Students? Student populations in stats • 1000 in first course • 250 in second course • 100 in third course • 50 in fourth course Most students take only 1-2 courses! What goals for the 1000 + 250 (= 90%)? What goals for the 100 + 50 (= 10%)?

  26. Quote from Cleveland (1993) A very limited view of statistics is that it is practiced by statisticians. … The wide view has far greater promise of a widespread influence of the intellectual content of the field of data science.

  27. Service vs Mainstream • Service = anyone more interested in applications than developing new techniques (90 %?) • Mainstream = enabling development of new techniques (10 %?) • Various Levels for each ….

  28. First Year Course Either • “stat appreciation” (service) or • “stat strategies” (mainstream)

  29. Second Year Course Either (Service) • How to read data-based research papers Or (Mainstream) • Regression, Data Analysis, and some Experiment Design

  30. Third Year Course • Mainstream and Service • Design of Experiments • Probability Models and Parametric Inference • Sampling Surveys • Software Options • Multivariate

  31. Fourth Year Course Mainstream only: • Linear Models • Bayesian Methods & inference options • Math-Stat • Advanced Graphical Methods

  32. Changes? • The courses we have do allow these things • Most radical suggestions are at lower division • Some minor (?) suggestions for LD and UD …

  33. Gripe 1: Decision Making vs Statistical Significance • Significance = (In-)Credibility of Null • Not really decision-making machinery, yet “Type I and Type II errors” suggests decisions are being made. • Decision making requires Loss Fcn, Priors
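To make the contrast on slide 33 concrete, here is a minimal sketch of a two-hypothesis decision that uses both a prior and a loss function. All of the numbers (prior, likelihoods, losses) and the action names are hypothetical. The point is only that the same evidence can lead to different actions under different losses, which a significance level alone cannot express.

```python
def posterior_h1(prior_h1, lik_h0, lik_h1):
    """Posterior probability of H1 from Bayes' rule for two simple hypotheses."""
    numerator = prior_h1 * lik_h1
    return numerator / (numerator + (1.0 - prior_h1) * lik_h0)


def best_action(p_h1, loss):
    """Pick the action with the smallest posterior expected loss.

    loss maps each action to its loss under each state: {"H0": ..., "H1": ...}.
    """
    expected = {
        action: states["H0"] * (1.0 - p_h1) + states["H1"] * p_h1
        for action, states in loss.items()
    }
    return min(expected, key=expected.get), expected


# Hypothetical inputs: even prior odds, data four times as likely under H1.
p = posterior_h1(prior_h1=0.5, lik_h0=0.05, lik_h1=0.20)

# Symmetric losses: either mistake costs 1.
symmetric = {"act": {"H0": 1.0, "H1": 0.0}, "wait": {"H0": 0.0, "H1": 1.0}}
# Asymmetric losses: acting when H0 is true costs 10.
asymmetric = {"act": {"H0": 10.0, "H1": 0.0}, "wait": {"H0": 0.0, "H1": 1.0}}

action_sym, loss_sym = best_action(p, symmetric)
action_asym, loss_asym = best_action(p, asymmetric)
print(f"posterior P(H1) = {p:.2f}")
print(f"symmetric losses:  choose '{action_sym}'  {loss_sym}")
print(f"asymmetric losses: choose '{action_asym}'  {loss_asym}")
```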

  34. Gripe 2: Data: By Design or Serendipity? • Purpose of Analysis = Purpose of Data Collection? e.g. Designed Expts, Some Observational Studies • Purpose of Analysis ≠ Purpose of Data Collection? e.g. Serendipity -> Data Mining • Inference Sample -> Population?

  35. Gripe 3: Role of Graphics • Preliminary Data Analysis and Screening • Model Analysis • Model Testing (via Residual Plot) • Model Fit Result (Data + Fit) (Graph enhances other methods) • What if no model? • As in non-par smooth fit • As in simulation relationship (Graph is only way to show result) • Enhanced Role of Graph as Result Report

  36. Gripe 4: SPC • Our STAT 340/440 used to teach some ideas that are basic stats • Management by exception & response costs • Incremental improvement (QC, EVOP ideas) • Alternative variability measures • Role of industrial experiments (robust design)

  37. Part III: Pedagogy • Logical Sequence vs Case Studies • Logical 1 var, 2 var, 3 var, … • 0-1 data, categorical, ordinal, interval, … • Case study approach • Spatial patterns, time series variability, smoothing, biological diversity, …

  38. Tests and Exams • Determines what students learn • What do we want students to learn? • How and What? or Why and When? Do we ask students to • Explain to a Prof & TA, or to a Peer or Lay? • Hand Calculation or Software Output interpretation? • Memorize (Closed Book) or Understand (Open Book)?

  39. Common Sense • How does it fit with stat culture? • Stat as the tool of Inference Police. • Never assume something is simple • Never jump to conclusions • Never assume naive thinking will help • Are students afraid to use their own “common sense”? • Maybe Stat as Discovery Tools

  40. Changes? • More conceptual approach? • More simulation? • More graphics? • More admission of parametric limitations? • More options for inference? • More creativity? • More data analysis? • More time series, and decision tools?

  41. What Less? • Math-Stat, optimization, lin. models • Parametric Inference (but more modeling) • Least Squares • Unbiasedness • Hand reproduction of stat package results (even at lower division)

  42. Summary • More context-specific data analysis • Less focus on parametric inference • Better use of simulation and graphics

  43. “The question I wish to raise is whether the 21st century statistics discipline should be equated so strongly to the traditional core topics and activities as they are now. Personally I prefer a more inclusive interpretation of statistics that reflects its strong interdisciplinary character.” Kettenring (1997) Former ASA President

  44. Thank you for listening! Your Comments Please!

  45. Normal Model

  46. Expo Model
