Discover how to redesign introductory statistics courses for math majors, emphasizing real studies, genuine data, and active investigation. Explore statistical principles, computational tools, and mathematical underpinnings.
CAUSE Webinar: Introducing Math Majors to Statistics
Allan Rossman and Beth Chance
Cal Poly – San Luis Obispo
April 8, 2008

Outline
• Goals
• Guiding principles
• Content of an example course
• Assessment
• Examples (four)

Goals
• Redesign introductory statistics course for mathematically inclined students in order to:
  • Provide balanced introduction to the practice of statistics at appropriate mathematical level
  • Offer a better alternative to the “Stat 101” or “Math Stat” sequence as math majors’ first statistics course

Guiding principles (Overview)
• Put students in role of active investigator
• Motivate with real studies, genuine data
• Repeatedly experience entire statistical process from data collection to conclusion
• Emphasize connections among study design, inference technique, scope of conclusions
• Use variety of computational tools
• Investigate mathematical underpinnings
• Introduce probability “just in time”

Principle 1: Active investigator
• Curricular materials consist of investigations that lead students to discover statistical concepts and methods
• Students learn through constructing own knowledge, developing own understanding
  • Need direction, guidance to do that
• Students spend class time engaged with these materials, working collaboratively, with technology close at hand

Principle 2: Real studies, genuine data
• Almost all investigations focus on a recent scientific study, existing data set, or student-collected data
• Statistics as a science
• Frequent discussions of data collection issues and cautions
• Wide variety of contexts, research questions

Real studies, genuine data
• Popcorn and lung cancer
• Historical smoking studies
• Night lights and myopia
• Effect of observer with vested interest
• Kissing the right way
• Do pets resemble their owners
• Who uses shared armrest
• Halloween treats
• Heart transplant mortality
• Lasting effects of sleep deprivation
• Sleep deprivation and car crashes
• Fan cost index
• Drive for show, putt for dough
• Spock legal trial
• Hiring discrimination
• Comparison shopping
• Computational linguistics

Principle 3: Entire statistical process
• First two weeks:
  • Data collection
    • Observation vs. experiment (confounding, random assignment vs. random sampling, bias)
  • Descriptive analysis
    • Segmented bar graph
    • Conditional proportions, relative risk, odds ratio
  • Inference
    • Simulating randomization test for p-value, significance
    • Hypergeometric distribution, Fisher’s exact test
• Repeat, repeat, repeat, …
  • Random assignment → dotplots/boxplots/means/medians → randomization test
  • Sampling → bar graph → binomial → normal approximation

Principle 4: Emphasize connections
• Emphasize connections among study design, inference technique, scope of conclusions
• Appropriate inference technique determined by randomness in data collection process
  • Simulation of randomization test (e.g., hypergeometric)
  • Repeated sampling from population (e.g., binomial)
• Appropriate scope of conclusion also determined by randomness in data collection process
  • Causation
  • Generalizability

Principle 5: Variety of computational tools
• For analyzing data, exploring statistical concepts
• Assume that students have frequent access to computing
  • Not necessarily every class meeting in computer lab
• Choose right tool for task at hand
  • Analyzing data: statistics package (e.g., Minitab)
  • Exploring concepts: applets (interactivity, visualization)
  • Immediate updating of calculations: spreadsheet (Excel)

Principle 6: Mathematical underpinnings
• Primary distinction from “Stat 101” course
• Some use of calculus but not much
• Assume some mathematical sophistication
  • E.g., function, summation, logarithm, optimization, proof
• Often occurs as follow-up homework exercises
• Examples
  • Counting rules for probability
  • Hypergeometric, binomial distributions
  • Principle of least squares, derivatives to find minimum
    • Univariate as well as bivariate setting
  • Margin-of-error as function of sample size, population parameters, confidence level

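As one illustration of the "derivatives to find minimum" item in the univariate setting, the sketch below shows the standard calculus argument that the sample mean minimizes the sum of squared deviations. The notation (S, c, x_i, n) is chosen here for illustration and is not taken from the slides.

```latex
\[
S(c) = \sum_{i=1}^{n} (x_i - c)^2, \qquad
S'(c) = -2 \sum_{i=1}^{n} (x_i - c) = 0
\;\Longrightarrow\;
c = \frac{1}{n} \sum_{i=1}^{n} x_i = \bar{x},
\qquad
S''(c) = 2n > 0 .
\]
```

The positive second derivative confirms that the critical point is a minimum; the bivariate (regression line) case follows the same pattern with two partial derivatives.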
Principle 7: Probability “just in time”
• Whither probability?
  • Not the primary goal
  • Studied as needed to address statistical issues
• Often introduced through simulation
  • Tactile and then computer-based
  • Addressing “how often would this happen by chance?”
• Examples
  • Hypergeometric distribution: Fisher’s exact test for 2×2 table
  • Binomial distribution: Sampling from random process
  • Continuous probability models as approximations

Content of Example Course (ISCAM)

Assessments
• Investigations with summaries of conclusions
• Worked-out examples
• Practice problems
  • Quick practice, opportunity for immediate feedback, adjustment to class discussion
• Homework exercises
• Technology explorations (labs)
  • e.g., comparison of sampling variability with stratified sampling vs. simple random sampling
• Student projects
  • Student-generated research questions, data collection plans, implementation, data analyses, report

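A technology exploration of the kind mentioned above (stratified vs. simple random sampling) might look roughly like the following Python sketch. The population, stratum means, and sample size are entirely hypothetical and chosen only to make the contrast visible; the course itself uses Minitab and applets, not this code.

```python
import random
import statistics


def compare_sampling_variability(n_reps=2000, sample_size=20, seed=1):
    """Return (SD of SRS sample means, SD of stratified sample means).

    The population is hypothetical: two equal-sized strata whose
    values are centered at different levels.
    """
    rng = random.Random(seed)
    stratum_a = [rng.gauss(50, 5) for _ in range(500)]
    stratum_b = [rng.gauss(80, 5) for _ in range(500)]
    population = stratum_a + stratum_b

    srs_means = []
    stratified_means = []
    for _ in range(n_reps):
        # Simple random sample from the whole population.
        srs_means.append(statistics.mean(rng.sample(population, sample_size)))
        # Stratified sample: half from each stratum (strata are equal-sized,
        # so the plain mean of the combined sample is the right estimate).
        half = sample_size // 2
        combined = rng.sample(stratum_a, half) + rng.sample(stratum_b, half)
        stratified_means.append(statistics.mean(combined))

    return statistics.stdev(srs_means), statistics.stdev(stratified_means)


srs_sd, stratified_sd = compare_sampling_variability()
print(f"SD of sample means, simple random sampling: {srs_sd:.2f}")
print(f"SD of sample means, stratified sampling:    {stratified_sd:.2f}")
```

Because the strata differ a lot between themselves and little within, the stratified estimates vary noticeably less from sample to sample.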
Example 1: Friendly Observers
• Psychology experiment
  • Butler and Baumeister (1998) studied the effect of observer with vested interest on skilled performance
• How often would such an extreme experimental difference occur by chance, if there was no vested interest effect?

Example 1: Friendly Observers
• Students investigate this question through
  • Hands-on simulation (playing cards)
  • Computer simulation (Java applet)
  • Mathematical model
    • Counting techniques

Example 1: Friendly Observers
• Focus on statistical process
  • Data collection, descriptive statistics, inferential analysis
  • Arising from genuine research study
• Connection between the randomization in the design and the inference procedure used
• Scope of conclusions depends on study design
  • Cause/effect inference is valid
• Use of simulation motivates the derivation of the mathematical probability model
• Investigate/answer real research questions in first two weeks

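The path from card shuffling to a counting-based model can be sketched in Python as follows. This is a minimal illustration, not the course's applet: the function names are made up here, and the counts used in the usage lines (3 successes out of 12 in one group, 8 out of 12 in the other) are illustrative placeholders, since the slides do not list the study's data.

```python
import random
from math import comb


def simulated_p_value(successes_a, n_a, successes_b, n_b, reps=10_000, seed=0):
    """Estimate how often random re-assignment gives group A
    as few successes as observed (or fewer)."""
    rng = random.Random(seed)
    total_successes = successes_a + successes_b
    # One "card" per subject: 1 = success, 0 = failure.
    cards = [1] * total_successes + [0] * (n_a + n_b - total_successes)
    extreme = 0
    for _ in range(reps):
        rng.shuffle(cards)
        # The first n_a cards play the role of group A.
        if sum(cards[:n_a]) <= successes_a:
            extreme += 1
    return extreme / reps


def hypergeometric_p_value(successes_a, n_a, successes_b, n_b):
    """Exact lower-tail probability from counting (Fisher's exact test, one-sided)."""
    total = n_a + n_b
    total_successes = successes_a + successes_b
    favorable = sum(
        comb(total_successes, k) * comb(total - total_successes, n_a - k)
        for k in range(0, successes_a + 1)
    )
    return favorable / comb(total, n_a)


# Illustrative placeholder counts (not taken from the slides).
print(simulated_p_value(3, 12, 8, 12))
print(hypergeometric_p_value(3, 12, 8, 12))
```

The simulated value should settle near the exact one as the number of repetitions grows, which is the same comparison students make between the applet and the counting argument.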
Example 2: Sleep Deprivation
• Physiology experiment
  • Stickgold, James, and Hobson (2000) studied the long-term effects of sleep deprivation on a visual discrimination task (3 days later!)

sleep condition    n    Mean   StDev   Median   IQR
deprived          11    3.90   12.17     4.50   20.7
unrestricted      10   19.82   14.73    16.55   19.53

• How often would such an extreme experimental difference occur by chance, if there was no sleep deprivation effect?

Example 2: Sleep Deprivation
• Students investigate this question through
  • Hands-on simulation (index cards)
  • Computer simulation (Minitab)
  • Mathematical model
[Figure: 15.92 marked on the plot; annotations “p-value = .0072” and “p-value .002”]

Example 2: Sleep Deprivation
• Experience the entire statistical process again
  • Develop deeper understanding of key ideas (randomization, significance, p-value)
• Tools change, but reasoning remains same
  • Tools based on research study, question – not for their own sake
• Simulation as a problem solving tool
  • Empirical vs. exact p-values

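The contrast between empirical and exact p-values for a difference in group means can be sketched as below. The raw improvement scores are not given in the slides, so the two small groups here are hypothetical and this code does not attempt to reproduce the reported p-values; the function names are likewise made up for illustration.

```python
import random
from itertools import combinations
from statistics import mean


def empirical_p_value(group_a, group_b, reps=5000, seed=0):
    """Approximate p-value: shuffle the pooled responses many times and
    count how often the re-randomized difference in means (B minus A)
    is at least as large as the observed one."""
    rng = random.Random(seed)
    observed = mean(group_b) - mean(group_a)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(reps):
        rng.shuffle(pooled)
        diff = mean(pooled[n_a:]) - mean(pooled[:n_a])
        if diff >= observed - 1e-12:  # small tolerance for float ties
            extreme += 1
    return extreme / reps


def exact_randomization_p_value(group_a, group_b):
    """Exact p-value: enumerate every way to choose which responses
    land in group A (feasible only for small groups)."""
    pooled = list(group_a) + list(group_b)
    n_a, n_b = len(group_a), len(group_b)
    total = sum(pooled)
    observed = mean(group_b) - mean(group_a)
    extreme = 0
    n_assignments = 0
    for indices in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in indices)
        diff = (total - sum_a) / n_b - sum_a / n_a
        n_assignments += 1
        if diff >= observed - 1e-12:
            extreme += 1
    return extreme / n_assignments


# Hypothetical scores, for illustration only.
deprived = [1, 2, 3, 4]
unrestricted = [5, 6, 7, 9]
print(empirical_p_value(deprived, unrestricted))
print(exact_randomization_p_value(deprived, unrestricted))
```

With 11 and 10 subjects, full enumeration covers 352,716 assignments, which is still manageable by computer but makes clear why simulation is the natural first tool.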
Example 3: Infants’ Social Evaluation
• Sociology study
  • Hamlin, Wynn, Bloom (2007) investigated whether infants would prefer a toy showing “helpful” behavior to a toy showing “hindering” behavior
  • Infants were shown a video with these two kinds of toys, then asked to select one
  • 14 of 16 10-month-olds selected helper
• Is this result surprising enough (under null model of no preference) to indicate a genuine preference for the helper toy?

Example 3: Infants’ Social Evaluation
• Simulate with coin flipping
• Then simulate with applet

Example 3: Infants’ Social Evaluation
• Then learn binomial distribution, calculate exact p-value

Example 3: Infants’ Social Evaluation
• Learn probability distribution to answer inference question from research study
• Again the analysis is completed with
  • Tactile simulation
  • Technology simulation
  • Mathematical model
• Modeling process of statistical investigation
  • Examination of methodology, further questions in study
• Follow-ups
  • Different number of successes
  • Different sample size

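For the reported result of 14 helper choices among 16 infants, the coin-flipping simulation and the binomial calculation can be sketched in Python as follows. The function names are illustrative, and the one-sided "at least 14" tail is an assumption about how the p-value is defined in this investigation.

```python
import random
from math import comb


def simulated_upper_tail(successes, n, reps=20_000, seed=0):
    """Flip n fair coins repeatedly and record how often the number
    of heads is at least `successes`."""
    rng = random.Random(seed)
    extreme = 0
    for _ in range(reps):
        heads = sum(rng.random() < 0.5 for _ in range(n))
        if heads >= successes:
            extreme += 1
    return extreme / reps


def binomial_upper_tail(successes, n, p=0.5):
    """Exact P(X >= successes) for X ~ Binomial(n, p)."""
    return sum(
        comb(n, k) * p**k * (1 - p) ** (n - k)
        for k in range(successes, n + 1)
    )


print(simulated_upper_tail(14, 16))
print(binomial_upper_tail(14, 16))  # 137/65536, about 0.0021
```

The follow-ups listed above (different number of successes, different sample size) amount to changing the two arguments and watching how the tail probability responds.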
Example 4: Sleepless Drivers
• Sociology case-control study
  • Connor et al. (2002) investigated whether those in recent car accidents had been more sleep deprived than a control group of drivers

Example 4: Sleepless Drivers
• Sample proportion that were in a car crash
  • Sleep deprived: .581
  • Not sleep deprived: .484
• Odds ratio: 1.48
• How often would such an extreme observed odds ratio occur by chance, if there was no sleep deprivation effect?

Example 4: Sleepless Drivers
• Students investigate this question through
  • Computer simulation (Minitab)
    • Empirical sampling distribution of odds ratio
    • Empirical p-value
  • Approximate mathematical model

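Example 4: Sleepless Drivers
• SE(log-odds) = sqrt(1/a + 1/b + 1/c + 1/d), where a, b, c, d are the four cell counts of the 2×2 table (standard large-sample formula)
• Confidence interval for population log odds:
  • sample log-odds ± z* × SE(log-odds)
• Back-transformation
  • 90% CI for odds ratio: 1.05 – 2.08
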
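The log-scale interval and its back-transformation can be sketched in Python as below. The standard error uses the usual large-sample formula for a log odds ratio, which is an assumption about what the slide showed. The cell counts in the last usage line are hypothetical, since the slides report only the two proportions, the odds ratio, and the final interval.

```python
from math import exp, log, sqrt
from statistics import NormalDist


def odds_ratio_from_proportions(p1, p2):
    """Odds ratio comparing group 1 to group 2, given each group's proportion."""
    return (p1 / (1 - p1)) / (p2 / (1 - p2))


def odds_ratio_ci(a, b, c, d, confidence=0.90):
    """Large-sample CI for an odds ratio from a 2x2 table.

    a, b: crash / no-crash counts in the first group
    c, d: crash / no-crash counts in the second group
    Returns (estimate, lower, upper).
    """
    estimate = (a * d) / (b * c)
    se_log = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    z_star = NormalDist().inv_cdf(0.5 + confidence / 2)
    lower = exp(log(estimate) - z_star * se_log)  # back-transform endpoints
    upper = exp(log(estimate) + z_star * se_log)
    return estimate, lower, upper


# Proportions reported on the slide: gives roughly 1.48.
print(odds_ratio_from_proportions(0.581, 0.484))

# Hypothetical counts, for illustration only.
print(odds_ratio_ci(30, 20, 40, 45))
```

Because the interval is symmetric on the log scale, the estimate is the geometric mean of the back-transformed endpoints; the reported 1.05 and 2.08 are consistent with 1.48 in that sense.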
Example 4: Sleepless Drivers
• Students understand process through which they can investigate statistical ideas
• Students piece together powerful statistical tools learned throughout the course to derive new (to them) procedures
  • Concepts, applications, methods, theory

For more information
• Investigating Statistical Concepts, Applications, and Methods (ISCAM), Cengage Learning, www.cengage.com
• Instructor resources: www.rossmanchance.com/iscam/
  • Solutions to investigations, practice problems, homework exercises
  • Instructor’s guide
  • Sample syllabi
  • Sample exams