Using StatCrunch in a Large Enrollment Course
Roger Woodard
Department of Statistics
NC State University
ST311
Introductory Statistics
• 700 students per semester
• Majors from social and biological sciences
• Not for business or engineering
Sections
• Taught by graduate students
• 10 to 12 per semester
• 65 students per section
No computer labs
GAISE
Guidelines for Assessment and Instruction in Statistics Education
GAISE Recommendations
• Stress conceptual understanding rather than mere knowledge of procedures
• Emphasize statistical literacy and thinking
• Use real data
• Foster active learning in the classroom
• Use assessments to improve and evaluate student learning
• Use technology for developing conceptual understanding and analyzing data
Use software?
Requirements
• Good graphics
• Good statistics
• Full range of procedures up through multiple regression
• Minimal technical overhead
• Easy to use
• Low cost
Software?
R
• Free for everyone
• Reasonable graphics
• Must be installed, command line
JMP
• Free for NCSU students
• Good graphics
• Must be installed
Software?
Excel
• Everyone has it
• Doesn’t do statistics well
• Graphics good for some but not others
• Not interactive
StatCrunch
Does not need to be installed
• Runs from within a web browser
• www.statcrunch.com
Low cost
• $12 per 6 months
• Site license also possible
Point and click interface, menu driven
• Ease of use
StatCrunch
Look at StatCrunch:
Interactive features Allow better understanding of statistical issues
Interactive features Interactive examination of outliers
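One way to see what this kind of examination reveals is to recompute summaries with and without a suspect point. The sketch below is plain Python, not StatCrunch; the dollar amounts and the helper name `summary_without` are made up for illustration.

```python
from statistics import mean, median

def summary_without(values, index=None):
    """Return (mean, median) of values, optionally dropping one observation."""
    kept = [v for i, v in enumerate(values) if i != index]
    return mean(kept), median(kept)

# Hypothetical dollar amounts; the last value plays the role of an outlier.
costs = [350, 380, 400, 420, 450, 1000]

with_outlier = summary_without(costs)
without_outlier = summary_without(costs, index=len(costs) - 1)

print("with outlier:    mean=%.1f median=%.1f" % with_outlier)
print("without outlier: mean=%.1f median=%.1f" % without_outlier)
```

With these made-up values, dropping the one extreme point moves the mean from 500 to 400, while the median only moves from 410 to 400.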
Web based advantages
Opens data from a variety of sources
• Computer
• Websites
• Paste
Can be shared to Facebook, Twitter, etc.
Can administer surveys
Direct data link
Direct data link
• Instructors do not need to reload data sets
• Accessible from all computers across campus
Recommendations
Concentrate on the statistics
• Minimize the amount of work students need to do to use software
• Avoid busy work
• Easy links to get data into software
• Build up techniques over several assignments
• Avoid giving software a “bad reputation”
Recommendations
Use video instructions
• Students don’t read text documentation
• Videos are easy
• www.youtube.com
• Split videos into short segments (30 seconds to 3 minutes)
• Organize videos by the task to perform
Recommendations
Ask questions that matter
• Ask what real world conclusions can be gleaned
Get students involved in the data
• Use data sets that are understandable
• Why would the distribution look like it does?
• Why would there be outliers?
• What other sources of variability are there?
Survey of students
Textbooks
• Why is the average around $400?
• Why would some be around $0?
• Why would some be up near $1000?
Survey of students
Car Age?
• How old are students’ cars?
• Do males or females have older cars?
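The second question is a comparison of one numeric variable across two groups. A minimal sketch of that comparison in plain Python (not StatCrunch) follows; the survey rows, the column names `sex` and `car_age`, and the helper `summarize_by_group` are all hypothetical, not the actual class data.

```python
from collections import defaultdict
from statistics import mean, median

def summarize_by_group(rows, group_key, value_key):
    """Group rows by group_key and return {group: (n, mean, median)} of value_key."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_key]].append(row[value_key])
    return {g: (len(v), mean(v), median(v)) for g, v in groups.items()}

# Made-up survey responses (car age in years), for illustration only.
survey = [
    {"sex": "F", "car_age": 3},
    {"sex": "F", "car_age": 5},
    {"sex": "F", "car_age": 10},
    {"sex": "M", "car_age": 2},
    {"sex": "M", "car_age": 8},
    {"sex": "M", "car_age": 14},
]

for sex, (n, avg, mid) in sorted(summarize_by_group(survey, "sex", "car_age").items()):
    print(f"{sex}: n={n} mean={avg:.1f} median={mid}")
```

Reporting the group size alongside the mean and median keeps the spread of a small group in view before students draw a conclusion.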
Assessment
Assessment should match what we want students to do.
• Think about a real world question
• Use software to explore data
Data exploration problem
John is a new college graduate working at his first job. After years of living in an apartment, he has decided to purchase a home. He has found a great neighborhood from which he can walk to work. Before buying a home in the area, he has decided to collect some data on the homes in this neighborhood. A data set has been compiled that represents a sample of 100 homes in the neighborhood he is considering. The variables in this data set are:
• Value: the current value of the home as determined by the county tax assessor.
• Size: the size of the home in square feet.
• Year: the year the home was built.
• Basement: whether the home has a basement (y=yes, n=no).
• Fireplace: whether the home has a fireplace (y=yes, n=no).
• Type: whether the structure is a single family house or a townhouse (house or townhouse).
Create histograms for each of the numeric variables and bar charts for each of the categorical variables. Use these graphs to explore the data and determine which of the following best fits this situation.
Data exploration problem
• The histogram for value is clearly bimodal. The reason appears to be that the neighborhood has higher priced single family houses and lower priced townhouses.
• The histogram for value is clearly bimodal. The reason appears to be that the neighborhood was built in two phases: the newer phase consists of larger, more expensive homes and the older phase consists of smaller, less expensive homes.
• The histogram for value is clearly bimodal. The reason appears to be that the neighborhood was built in two phases: the older phase consists of larger, more expensive homes and the newer phase consists of smaller, less expensive homes.
• The histogram for value is clearly bimodal. The reason appears to be that the neighborhood has some homes with basements, which tend to be larger, and another group of homes without basements, which tend to be smaller.
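Choosing among explanations like these comes down to splitting the Value histogram by a second variable and seeing whether each group accounts for one mode. The sketch below shows that check in plain Python rather than StatCrunch. The six homes are made up (values in thousands of dollars) and arranged to separate by Type purely for illustration; they are not the course data set and say nothing about which choice is correct. The helper names `binned_counts` and `text_histogram` are hypothetical.

```python
from collections import Counter

def binned_counts(values, bin_width):
    """Count values per bin; each bin is labeled by its lower edge."""
    return Counter((v // bin_width) * bin_width for v in values)

def text_histogram(values, bin_width=50):
    """Print one row of asterisks per bin, including empty bins in between."""
    counts = binned_counts(values, bin_width)
    for edge in range(min(counts), max(counts) + bin_width, bin_width):
        print(f"{edge:>4}-{edge + bin_width - 1:<4} {'*' * counts[edge]}")

# Made-up (Type, Value) pairs, for illustration only.
homes = [
    ("townhouse", 150), ("townhouse", 160), ("townhouse", 175),
    ("house", 290), ("house", 310), ("house", 320),
]

print("All homes")
text_histogram([v for _, v in homes])
for kind in ("house", "townhouse"):
    print(kind)
    text_histogram([v for t, v in homes if t == kind])
```

The same split can be repeated with Year or Basement as the grouping variable to test the other explanations.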
Web links
GAISE report:
• http://www.amstat.org/education/gaise/
StatCrunch:
• www.statcrunch.com
Video instructions:
• Youtube.com
• http://www4.stat.ncsu.edu/~woodard/statcrunch/