Join Bill Sundstrom and Michael Kevane from the Department of Economics as they explore the power and beauty of teaching data analysis with R. Discover the arguments for and against using R in teaching, how they incorporate R in their courses, and their vision for adopting R as a standard across the university.
Adventures in teaching and learning data analysis with R Bill Sundstrom and Michael Kevane Department of Economics October 9, 2017
Agenda • What is R? • Arguments for and against using R in teaching • How we use R in teaching ECON 41/42: Data Analysis and Econometrics • Looking forward: How we could adopt R as a standard across the university and make it work • Illustrations of the power and beauty of R
What is R? • “R is a free software environment for statistical computing and graphics.” • Large user community: statisticians, academics, business users
Why teach with R? Instead of Excel, SPSS, Stata • Powerful: R does everything • Open-source: Free • Open-source: Dynamic • Script-based • Replicability; collaborative verification and trouble-shooting; portability • Not easy, but not as hard as it looks • R is a great skill for our students!
Essentials of R • Download R • Download RStudio (interface) • Script-based language • Easy to read in .csv files and data files off the Internet • ggplot command for graphing • Basic statistics use very simple commands
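The essentials listed on this slide can be sketched in a few lines. This is only an illustration, not one of the course scripts: the built-in `mtcars` data set stands in for a course data file, the temporary .csv exists only to keep the example self-contained, and the variable names are hypothetical. The plot step assumes the `ggplot2` package is installed and is skipped otherwise.

```r
# Write a built-in data set to a temporary .csv so the example is self-contained.
# In practice the path would point to a file on disk or to a URL.
csv_path <- tempfile(fileext = ".csv")
write.csv(mtcars, csv_path, row.names = FALSE)

# Read the .csv file into a data frame
cars <- read.csv(csv_path)

# Basic statistics with simple commands
mean_mpg <- mean(cars$mpg)
sd_mpg <- sd(cars$mpg)
print(summary(cars$mpg))

# Scatter plot with ggplot (only if the ggplot2 package is available)
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(cars, aes(x = wt, y = mpg)) +
    geom_point() +
    labs(x = "Weight (1000 lbs)", y = "Miles per gallon")
  print(p)
}
```

Because `read.csv` also accepts a URL in place of a local path, the same call covers data files read off the Internet.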
Common concerns • “It’s too hard” • “It’s too easy” • “It’s too anarchic”
“It’s too hard” • Command-driven rather than pull-down menus • Scripts (command files) • BUT: These are virtues! • Most R commands are actually intuitive • Scripts are important: more efficient in long run, documentation, replication, robustness/sensitivity, sharing, algorithmic thinking
“It’s too easy” • The availability of canned routines and packages makes it too easy to use techniques without understanding them. • BUT: • This is a problem with any statistical package • You can do everything from scratch in R! (Example: OLS)
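As one possible version of the "from scratch" OLS example, the sketch below computes the coefficients from the normal equations and checks them against R's canned `lm` routine. The simulated data, the true coefficients (1 and 2), and all variable names are assumptions made for illustration; they are not taken from the course materials.

```r
# Simulate data from a known linear model (illustrative values only)
set.seed(42)
n <- 200
x <- rnorm(n)
y <- 1 + 2 * x + rnorm(n)

# Design matrix with a column of ones for the intercept
X <- cbind(Intercept = 1, x = x)

# OLS coefficients: solve the normal equations (X'X) b = X'y
XtX <- t(X) %*% X
beta_hat <- solve(XtX, t(X) %*% y)

# Residual variance and conventional (homoskedastic) standard errors
resid <- y - X %*% beta_hat
sigma2 <- sum(resid^2) / (n - ncol(X))
se_hat <- sqrt(diag(sigma2 * solve(XtX)))

# Compare with the canned routine
fit <- lm(y ~ x)
print(cbind(from_scratch = as.vector(beta_hat), lm = unname(coef(fit))))
```

Writing out the matrix algebra once lets students see that `lm` is not a black box: the two columns of estimates should agree up to numerical precision.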
“It’s too anarchic” • The Wikipedia diss: How can we trust open source? • R is kind of “wild west” compared with Stata or Excel • Often multiple ways to do the same thing • Can we trust the user community? • BUT: • There are many sophisticated users vetting the most common routines and packages. • Welcome to Silicon Valley!
Genuine issues • If we teach with R, downstream instructors will need to learn it… • … but it’s easy if you know any other stat software • How to make it even easier: • More adopters… tipping point
How we make it work • ECON 41: Data Analysis and Econometrics • Econ majors now take this in place of OMIS 41 • ECON 42: Data Analysis Applications • 2-unit lab course: R-based
Specific challenges • Many students have little programming exposure • Most of our students don’t know much about their own computers, e.g., folder structure and where their files are • If R is taught in the context of a statistics course, students face the double challenge of mastering both difficult statistical concepts and the software • Hence the lab section is extremely helpful: it partitions the cognitive load and creates a low-stress environment for learning the software
In the lab • We do not try to teach coding from scratch • We do insist that students run R from scripts • We provide sample scripts and tutorials for all the basic procedures they will need • Trouble-shooting for the first two weeks, then running regressions and interpreting results • Culminates in a short data analysis project
Examples • Tutorial #6: Replicate a table from their textbook • Data analysis project: Replicate and reevaluate a classic economics article using up-to-date data • Barry Chiswick, “The Effect of Americanization on the Earnings of Foreign-Born Men” (JPE, 1978)
Continuing work… • Guide to R • Community of practice • Fall 2017: Mondays 4:00-5:15 • Econ major data analysis concentration • Psychology switching to R for intro stats • Goals: • Help desk in library with excellent and trained student assistants