S-052 Shopping – Applied Data Analysis

S-052 Shopping – Applied Data Analysis. Andrew Ho, Harvard Graduate School of Education. Tuesday, January 22, 2013.

Presentation Transcript


  1. S-052 Shopping – Applied Data Analysis • Andrew Ho • Harvard Graduate School of Education • Tuesday, January 22, 2013

  2. Disciplined Perception: Experts vs. Novices

  3. What You’ve Learned
     • A single outcome variable
       • Continuous, interval scaled (noncategorical)
       • May be transformed to meet regression assumptions of normally distributed residuals
     • A single predictor variable… Multiple predictor variables
       • May be dichotomous or polychotomous
       • May be transformed to achieve linearity
       • Interactions: Products of predictors
       • Quadratic/Polynomial Regression for nonlinear relationships
     • Independent and identically normally distributed residuals centered on 0
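The interaction and quadratic terms listed on slide 3 can be illustrated with a small sketch. This is not course material: the data are made up and noise-free, the function names (`fit_ols`, `design_row`, `solve_linear_system`) are hypothetical, and the fit uses plain-Python normal equations rather than Stata, which is what the course itself uses. It only shows that a product of predictors and a squared predictor enter the model as ordinary extra columns.

```python
# Hypothetical illustration: names and data are made up, not from the course.

def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b], copied so the inputs are not modified.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ols(rows, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def design_row(x1, x2):
    """Intercept, two predictors, their product (interaction), and x1 squared."""
    return [1.0, x1, x2, x1 * x2, x1 ** 2]


# Made-up, noise-free data generated from known coefficients.
true_beta = [2.0, 0.5, -1.0, 0.25, 0.1]
points = [(x1, x2) for x1 in range(-3, 4) for x2 in (0, 1, 2)]
rows = [design_row(float(x1), float(x2)) for x1, x2 in points]
y = [sum(b * v for b, v in zip(true_beta, row)) for row in rows]

# With no noise, the fit should recover true_beta up to rounding error.
beta_hat = fit_ols(rows, y)
```

Because the interaction and the squared term are just additional columns, the model stays linear in its coefficients even though the fitted relationship between the outcome and `x1` is curved and depends on `x2`.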

  4. What you will learn: The S-052 Roadmap
     • Multiple Regression Analysis
     • Do your residuals meet the required assumptions?
       • Test for residual normality
       • Use influence statistics to detect atypical datapoints
     • If your residuals are not independent, replace OLS by GLS regression analysis
       • Specify a Multilevel Model
       • Are the data longitudinal? Use Individual growth modeling
     • If your outcome is categorical, you need to use…
       • Binomial logistic regression analysis (dichotomous outcome)
       • Multinomial logistic regression analysis (polychotomous outcome)
       • Discriminant Analysis
     • If time is a predictor, you need discrete-time survival analysis…
     • If your outcome vs. predictor relationship is non-linear,
       • Transform the outcome or predictor
       • Use non-linear regression analysis
     • If you have more predictors than you can deal with,
       • Create taxonomies of fitted models and compare them.
       • Form composites of the indicators of any common construct.
       • Conduct a Principal Components Analysis
       • Use Cluster Analysis

  5. What you will learn: The S-052 Roadmap (8 Units)
     1. Taxonomies of Regression Models
     2. Nonlinear Regression
     3. Nonindependent Residuals
     4. Logistic Regression
     5. Discrete-Time Survival Analysis
     6. Forming Composites
     7. Cluster Analysis
     8. Factor Analysis

  6. Disciplined Perception: Gender in Math Instruction http://www.edweek.org/ew/articles/2013/01/16/17gender.h32.html http://ftp.iza.org/dp6453.pdf

  7. Disciplined Perception: Massively Open Online Courses

  8. The Flow of S-052. Two steps forward. One step back.
     • Principal components?!
     • Whoa, fixed and random effects?
     • Ack, Discrete Time what now?
     • This sounds familiar!
     • Final project
     • Clustering… seems intuitive
     • Logistic regression isn’t so ba-
     • Scared!

  9. How you’ll spend your time in S-052, Part I: What we’ll do in class
     Each unit has a three-part structure
     • I. Research Questions and Data Sets
       • What predicts attrition in massively open online courses?
       • Do teacher qualifications have a particularly strong impact when female teachers teach girls?
       • What characteristics distinguish Academy Award-winning actors and movies from their competition?
     • II. Delve into the new statistical content that the RQs (and the unit) demand
       • What aspect of the model do we need to learn more about?
       • How do we represent this aspect of the model algebraically & graphically?
       • What assumptions are we making (and how do we evaluate whether these make sense)?
     • III. Interpreting & presenting results
       • How do we interpret computer output?
       • What conclusions can we draw, and what conclusions don’t necessarily follow?
       • How do we write up our results in words, graphs, tables, PowerPoints?
       • How do we communicate results to both technical and non-technical audiences?
     • Lectures with your questions: Active participation is encouraged, time permitting
     • Note-taking: On laptops (in laptop zones at the edges or the back of the lecture hall) or printouts of handouts
     • Please be courteous: No cellphones, email, websurfing, IM, texting or other electronic distractions during class

  10. How you’ll spend your time in S-052, Part II: What you’ll do outside of class
     • Assignments
       • Six homework assignments, consisting of one or more datasets & questions that guide you through a complete analysis (1/2 of your grade). Submitting assignments in pairs is mandatory for all assignments!
       • One final exam, completed individually, will give you a chance to review all the material in a comprehensive series of analyses (1/2 of your grade).
     • Individual and group work
       • Our strong emphasis on collaboration reflects our philosophy that learning statistics is like learning a language and must therefore be spoken actively and in a participatory context.
       • It also reflects the realities of today’s team-driven statistical practice.
       • Work in study groups as you’d like, but write and submit HWs as pairs.
       • The final exam must be completed individually.
     • Weekly Sections
       • All students will have a “homeroom” section and TF on Tuesday, Wednesday, or Thursday afternoon, to be scheduled via a Doodle poll.
       • Sections both reinforce and supplement lecture content. There will be Stata labs, additional examples, and opportunities for questions.
       • Attendance is not mandatory but strongly, strongly encouraged.
     Course website: http://isites.harvard.edu/icb/icb.do?keyword=k92522
     Instructor Office Hours: http://andrew-ho-office-hours.wikispaces.com

  11. Six things you should do before the first class meeting, next Tuesday
     • 1. Make sure you have the prerequisites
       • A solid regression class (S-030, S-040, or equivalent)
       • Experience fitting regression models with statistical software (Stata or other)
     • 2. Register for the course: http://www.gse.harvard.edu/about/administration/registration/cross_registration.html
       • Note that GSAS, HBS, HLS, HMD, HSDM, GSD, HDS and HSPH students must fill out a new online cross-registration form.
     • 3. Read the School’s policy on plagiarism
       • All written work submitted is to be in your own words or those of your partner.
     • 4. Familiarize yourself with the S-052 website
       • Bookmark the site: http://isites.harvard.edu/icb/icb.do?keyword=k92522
       • Read the syllabus: it includes many more details and represents our learning contract.
     • 5. Decide how you want to access Stata
       • Visit the LTC on Gutman 3
       • Google “HGSE ordering Stata”
       • Think about whether it makes sense for you to purchase a Stata license.
     • 6. Get used to accessing the handouts before class.
       • I’ll be posting the 1st handout to the website before class next week.
       • You don’t have to read it, but you may find it helpful to bring it.
     Hope to see you next Tuesday, 10AM, in Larsen G08!
