Using StatCrunch to Teach Statistics Using Resampling Techniques
Webster West, Texas A&M University
Background • George Cobb started the interest in using resampling methods for introductory statistics with his plenary talk at the first USCOTS in 2007. • Several groups are now working on integrating these approaches into the curriculum. • I have added numerous resampling procedures to StatCrunch, which is widely used in teaching introductory statistics. • Roger Woodard and I have an NSF grant, INCIST, to develop teaching materials that incorporate these methods. • We have conducted numerous workshops around the country where we have presented these materials to statistics teachers.
A Randomization Activity • Students were randomly assigned to a version of an exam (Yellow or Green) when they entered the classroom. • Afterwards, both sets of students complained about the exam, saying their version was harder. • Students investigate whether the observed difference of 6.3 between the two group means could plausibly occur by random chance alone. • They shuffle cards with scores written on them into yellow and green groups and then calculate the difference between the two means. • They place a Post-it note on a whiteboard at the location of their difference and evaluate the resulting randomization distribution.
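The card-shuffling procedure can be sketched in code. This is a minimal illustration, not StatCrunch's implementation: the function name `randomization_test` and the exam scores are invented for the example, so the observed difference here will not match the 6.3 seen in class.

```python
import random
from statistics import mean

def randomization_test(group_a, group_b, n_shuffles=10_000, seed=None):
    """Two-sided randomization test for a difference in means."""
    rng = random.Random(seed)
    observed = mean(group_a) - mean(group_b)
    pooled = list(group_a) + list(group_b)   # the deck of score cards
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_shuffles):
        rng.shuffle(pooled)                  # reshuffle the cards
        # Deal the first n_a cards to one group, the rest to the other
        diff = mean(pooled[:n_a]) - mean(pooled[n_a:])
        if abs(diff) >= abs(observed):       # at least as extreme as observed
            extreme += 1
    return observed, extreme / n_shuffles

# Hypothetical exam scores, invented for illustration only
yellow = [78, 85, 69, 91, 74, 88, 80, 67]
green = [72, 64, 81, 70, 59, 77, 68, 75]

observed, p_value = randomization_test(yellow, green, seed=1)
print(f"observed difference: {observed:.2f}")
print(f"proportion of shuffles at least as extreme: {p_value:.3f}")
```

Each pass through the loop plays the role of one student's shuffle and Post-it note; the returned proportion is the share of notes that land at least as far from zero as the observed difference.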
A Sampling Distribution Activity • A very inconvenient printed roster of 12,000 students at a fictitious university is provided to the students. • They build a sampling distribution by each collecting a random sample of size 30 and reporting the mean number of Facebook friends for their sample via a StatCrunch survey. • From the sampling distribution, students see the normal curve is a good descriptor of the sampling variability of the sample mean. • This leads to the CLT and the idea of statistic ± 2×standard error as a 95% confidence interval for an unknown population mean.
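The step from the normal-shaped sampling distribution to the ±2 rule can be written out briefly. The symbols below (μ for the population mean, σ for the population standard deviation, n for the sample size) are standard notation rather than anything from the slides, and the statement is a sketch under the usual CLT conditions.

```latex
\bar{X} \;\overset{\text{approx.}}{\sim}\; N\!\left(\mu,\ \frac{\sigma^{2}}{n}\right),
\qquad
\mathrm{SE}(\bar{X}) = \frac{\sigma}{\sqrt{n}}

P\!\left(\lvert \bar{X} - \mu \rvert \le 2\,\mathrm{SE}(\bar{X})\right) \approx 0.954
\quad\Longrightarrow\quad
\bar{x} \pm 2\,\mathrm{SE}(\bar{X}) \ \text{covers } \mu \text{ for roughly 95\% of samples}
```

In practice σ is unknown and is replaced by the sample standard deviation s, so for the samples of size 30 used here the standard error is about s/√30 ≈ 0.18 s.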
A Sampling Distribution Activity • [Figure; annotation: "Mary's Sample Mean"]
A Sampling Distribution Activity • The instructor then uses their access to the data in electronic form within StatCrunch to compute 1000 sample means, each based on a random sample of 30 students. • This is like doing the activity with 1000 students instead of 10.
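The instructor's step can be imitated outside StatCrunch. The sketch below is an assumption-laden stand-in: the roster is not available here, so a synthetic, right-skewed population of 12,000 friend counts is generated instead, and the name `simulate_sample_means` is hypothetical.

```python
import random
from statistics import mean, pstdev

rng = random.Random(2024)

# Synthetic stand-in for the roster (NOT the real data): 12,000 right-skewed counts
population = [int(rng.expovariate(1 / 300)) for _ in range(12_000)]
mu = mean(population)
sigma = pstdev(population)

def simulate_sample_means(pop, n, reps, rng):
    """Draw `reps` simple random samples of size n and return their means."""
    return [mean(rng.sample(pop, n)) for _ in range(reps)]

sample_means = simulate_sample_means(population, n=30, reps=1000, rng=rng)

theoretical_se = sigma / 30 ** 0.5        # sigma / sqrt(n)
empirical_se = pstdev(sample_means)       # spread of the 1000 sample means
within_2se = sum(abs(m - mu) <= 2 * theoretical_se for m in sample_means) / len(sample_means)

print(f"population mean: {mu:.1f}")
print(f"SE from formula: {theoretical_se:.1f}, SE from simulation: {empirical_se:.1f}")
print(f"share of sample means within 2 SE of the population mean: {within_2se:.3f}")
```

Even with a skewed population, the spread of the simulated sample means should sit close to σ/√30, and the share of sample means within two standard errors of the population mean should be near 95%.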
A Bootstrapping Activity • Building off the sampling distribution activity, students are then tasked with estimating the standard error of the sample mean using a single sample. • Using the sample as a proxy for the population, each student draws a resample of 30 values, taken with replacement, from the common sample data. • Each student reports the mean of their bootstrap sample. • The student results are then augmented with applet results to compute a 95% confidence interval for the population mean.
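A rough sketch of the bootstrap step follows. The slides do not say how the applet forms the interval, so this example assumes the statistic ± 2 × standard error form used earlier in the deck; the sample values and the name `bootstrap_means` are invented for illustration.

```python
import random
from statistics import mean, stdev

def bootstrap_means(sample, reps=1000, seed=None):
    """Means of `reps` resamples, each drawn with replacement at the original size."""
    rng = random.Random(seed)
    n = len(sample)
    return [mean(rng.choices(sample, k=n)) for _ in range(reps)]

# Hypothetical common sample of 30 friend counts (invented, not class data)
sample = [120, 340, 45, 610, 275, 180, 95, 410, 230, 305,
          150, 520, 60, 385, 210, 330, 140, 475, 260, 190,
          80, 440, 295, 170, 355, 225, 110, 580, 245, 315]

boot = bootstrap_means(sample, reps=1000, seed=7)
se_boot = stdev(boot)                     # bootstrap estimate of the standard error
x_bar = mean(sample)

# Assumed interval form: statistic +/- 2 x standard error
ci = (x_bar - 2 * se_boot, x_bar + 2 * se_boot)
print(f"sample mean: {x_bar:.1f}, bootstrap SE: {se_boot:.1f}")
print(f"approximate 95% interval for the population mean: ({ci[0]:.1f}, {ci[1]:.1f})")
```

The interval is centered on the original sample mean and is a statement about the unknown population mean; the resamples serve only to estimate how much a sample mean varies from sample to sample.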
What we have learned about the randomization approach • This approach appeals nicely to a basic intuition that most people have about the problem. • The tactile simulation adds a great deal of value in terms of students' understanding of what is taking place in the applet. • It can be introduced with little or no background in other statistical concepts such as normal theory or even the jargon of hypothesis testing. • It can be easily used at a variety of time points in the standard introductory course. • Students and instructors seem to really take to it!
What we have learned about the bootstrapping approach • The bootstrap can be used to reinforce the idea of a sampling distribution. • The bootstrap approach requires a great deal of backstory before it can be effectively introduced. • It is probably best to rely on technology alone for the bootstrap after students have done the sampling activity in a more tactile way. • Students seem to like it but instructors not so much! • The bootstrap may also lead to misconceptions on the part of students: “Am I getting a confidence interval for the sample mean?”
For discussion • With these approaches, two people can get different results even when using the same data set. • Taking a simple random sample from a large list of values is difficult to do in a tactile way. Must we always rely on help from the computer? • How can we make one-sample problems more interesting to students? • Students are interested in samples but not in population parameters! Why do we care about the mean number of Facebook friends? • Are we too focused on inference in the introductory course? How often will the average student ever be confronted with a random sample?
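The first discussion point can be made concrete with a small sketch. The names `shuffle_p_value`, `p_alice`, and `p_bob` and the scores are hypothetical; the point is only that two analysts with different random shuffles get slightly different answers from identical data, while a shared seed reproduces the result exactly.

```python
import random
from statistics import mean

def shuffle_p_value(a, b, n_shuffles, seed):
    """Proportion of random regroupings with a mean difference at least as large as observed."""
    rng = random.Random(seed)
    observed = abs(mean(a) - mean(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(pooled)
        if abs(mean(pooled[:len(a)]) - mean(pooled[len(a):])) >= observed:
            hits += 1
    return hits / n_shuffles

# Hypothetical scores, invented for illustration only
yellow = [78, 85, 69, 91, 74, 88, 80, 67]
green = [72, 64, 81, 70, 59, 77, 68, 75]

p_alice = shuffle_p_value(yellow, green, n_shuffles=1000, seed=1)
p_bob = shuffle_p_value(yellow, green, n_shuffles=1000, seed=2)        # same data, different shuffles
p_alice_again = shuffle_p_value(yellow, green, n_shuffles=1000, seed=1)

# Rough size of the simulation noise in a proportion based on 1000 shuffles
mc_se = (p_alice * (1 - p_alice) / 1000) ** 0.5

print(f"Alice: {p_alice:.3f}, Bob: {p_bob:.3f}, Alice rerun with same seed: {p_alice_again:.3f}")
print(f"approximate Monte Carlo standard error: {mc_se:.3f}")
```

The gap between the two answers is typically on the order of the Monte Carlo standard error and shrinks as the number of shuffles grows.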