Review of Coursera Data Analysis Course Jim Thompson JamesThompsonC@gmail.com
To make sense of my comments… • Who’s the reviewer • What is a MOOC • Overview of the course (through this reviewer’s eyes)
The Reviewer (Who am I?) Not a professional data analyst: • Chemist by training • Develop and commercialize new materials and applications by profession. Not a data analysis layman: • Data analysis as a hobby, on and off for 25 years. • Downloaded R in January 2009, used it ever since. “Data Analysts Captivated by R’s Power,” The New York Times, January 2009 http://www.nytimes.com/2009/01/07/technology/business-computing/07program.html?pagewanted=all
How I taught myself R: whatever took my fancy at the moment • No mentor or colleague • Books (> 10 on R), Internet articles, R vignettes • Learning by doing, mainly with work data, for fun not for work. Because it was a hobby, I lacked discipline in: • Clean code • Reporting • Reproducible research • Appropriate use of statistical techniques
I tried Open University • Excellent teachers • One-hour-long lectures • Class homework provided • No grading • Complete at your own pace. But I don’t have one-hour chunks of time, nor the discipline. Intro to Programming, Stanford
“The Year of the MOOC,” The New York Times [1] • A massive open online course (MOOC) is … aimed at large-scale interactive participation and open access via the web. [2] • www.Udacity.com • www.edX.org • www.Coursera.org • [1] http://www.nytimes.com/2012/11/04/education/edlife/massive-open-online-courses-are-multiplying-at-a-rapid-pace.html?pagewanted=all&_r=0 • [2] http://en.wikipedia.org/wiki/Massive_open_online_course
Data Analysis by Jeffrey Leek An applied statistics course focusing on data analysis, not mathematical details. How to: • Organize and perform analysis • Interpret results • Diagnose potential problems • Write up data analyses Statistical methods: Requires a working knowledge of R
How does this work? • Time-bound (i.e., 6 weeks) • Plan on 3-10 hrs/week • Watch three to five videos a week, each 10-15 min long • Weekly quizzes • Submit two papers/reports • Slides, video, R code available for download • A certificate
Weeks 1 and 2 • Structure the analysis: Tips on finding, organizing, and cleaning the data and the code. Personal comments: Very useful. Biggest Benefit I
Weeks 3 and 4 • Exploratory & Inferential: Clustering for exploratory analysis
Weeks 5 and 6 • Inferential & Predictive Analysis: Learned new techniques, best practices
Advanced Techniques: Good stuff, but I was running out of gas. Week 5 Week 5
Submit Two Reports • Inference analysis of mortgage data: “This analysis considers whether any other variables have an important association with interest rate after taking into account the applicant's FICO score” • Predictive modeling using sensors on cell phones: “Given the output [of a] Samsung phone, can we predict whether the owner is sitting, laying, standing, walking, walking up stairs, or walking down stairs.” • Biggest Benefit II: submitting mine, analyzing others’
Data analysis rubric
Main text
• Does the analysis have an introduction, methods, analysis, and conclusions?
• Are figures labeled and referred to by number in the text?
• Is the analysis written in grammatically correct English?
• Are the names of variables reported in plain language, rather than in coded names?
• Does the analysis report the number of samples?
• Does the analysis report any missing data or other unusual features?
• Does the analysis include a discussion of potential confounders?
• Are the statistical models appropriately applied?
• Are estimates reported with appropriate units and measures of uncertainty?
• Are estimators/predictions appropriately interpreted?
• Does the analysis make concrete conclusions?
• Does the analysis specify potential problems with the conclusions?
Data analysis rubric
Figure
• Is the figure caption descriptive enough to stand alone?
• Does the figure focus on a key issue in the processing/modeling of the data?
• Are axes labeled and are the labels large enough to read?
References
• Does the analysis include references for the statistical methods used?
R script
• Can the analysis be reproduced with the code provided?
Final comments
On MOOCs
• Thumbs up!
On Data Analysis by Jeffrey Leek
• Thumbs up!
• Target audience: I might be the sweet spot
• Excellent reference (links attached)
On submitting reports
• Learned most by writing the reports and grading others’
NOTE: Intro to R course scheduled for September 2013
Data Analysis by Jeffrey Leek The Class • https://www.coursera.org/course/dataanalysis • https://github.com/jtleek/dataanalysis The Prof • http://www.biostat.jhsph.edu/~jleek/ • http://simplystatistics.org/