Class 21 Social Statistics (I)
Class Outline • Descriptive Statistics • Introduction to Regression • Discussion of Readings • Criminal Violence in NFL Players • Sex and Sports
Descriptive Statistics • Used to summarize the data being studied. • Can be used to summarize the following: • The distribution of attributes on a single variable. • Associations between variables, in the form of a correlation matrix or a cross-tabulation. • Two purposes: • Introduce readers to the study sample. • Prepare readers for multivariate analysis.
Introduction to Regression Analysis • A statistical technique for characterizing the pattern of a relationship between two or more variables in terms of a linear or nonlinear equation. • General form of linear regression: Y = a + b*X + e • Y: dependent variable • X: independent variable • a: intercept. The mean value of Y when X is 0. • b: slope. The amount of change in Y associated with a 1-unit change in X. • e: error, also called residual. It is the difference between the observed and the predicted value of Y (observed minus predicted).
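To make the roles of a, b, and e concrete, here is a minimal sketch of an ordinary least-squares fit in plain Python. Least squares is one common way to choose a and b; the slides do not say which fitting method is used, so treat that as an assumption. The function names and the five data points are made up for illustration.

```python
from statistics import mean

def fit_line(xs, ys):
    """Return (a, b) for the least-squares line Y = a + b*X."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx            # slope: change in Y per 1-unit change in X
    a = y_bar - b * x_bar    # intercept: predicted Y when X is 0
    return a, b

def residuals(xs, ys, a, b):
    """Residual e = observed Y minus predicted Y, one per data point."""
    return [y - (a + b * x) for x, y in zip(xs, ys)]

# Made-up illustrative data (not from the lecture)
xs = [0, 1, 2, 3, 4]
ys = [1.0, 3.1, 4.9, 7.2, 8.8]

a, b = fit_line(xs, ys)
res = residuals(xs, ys, a, b)
print(round(a, 2), round(b, 2))
```

With a least-squares fit that includes an intercept, the residuals sum to zero (up to rounding), which is a quick sanity check on the fit.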
Example: MPG and Vehicle Weight • Fitted equation: mpg = 39.4 - 0.006 * weight + e • Prediction: Based on this equation, what do you expect the mean value of mpg to be for a vehicle weighing 3000 lbs? • [Scatterplot: mpg vs. vehicle weight, with two annotated points.] • A vehicle with a large positive residual has a much higher mpg than its weight would predict. • A vehicle with a large negative residual has a much lower mpg than expected based on its weight.
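The prediction question can be worked directly from the fitted equation: 39.4 - 0.006 * 3000 = 39.4 - 18 = 21.4 mpg. The sketch below does the same arithmetic and also computes a residual. The coefficients come from the slide; the function names and the observed value of 30 mpg are hypothetical, chosen only to illustrate a positive residual.

```python
INTERCEPT = 39.4   # a, from the slide's equation
SLOPE = -0.006     # b, mpg change per additional lb of weight

def predicted_mpg(weight_lbs):
    """Predicted mean mpg for a vehicle of the given weight."""
    return INTERCEPT + SLOPE * weight_lbs

def residual(observed_mpg, weight_lbs):
    """Observed minus predicted mpg; positive means better than predicted."""
    return observed_mpg - predicted_mpg(weight_lbs)

pred = predicted_mpg(3000)
print(round(pred, 1))  # 21.4

# Hypothetical vehicle: 3000 lbs, observed 30 mpg (value invented for illustration)
print(round(residual(30.0, 3000), 1))  # 8.6
```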
Uses of Regression Analysis • General use: to summarize the correlation between variables. • To describe a causal relationship between variables, when the study design supports a causal interpretation. • To predict the value of Y based on the value of X.
What Can Go Wrong with Regression? • Don’t fit a straight line to a nonlinear relationship. • Beware of outliers. • Don’t extrapolate beyond the data. (Interpolation is usually okay.) • Don’t infer that x causes y just because there is a good linear model for their relationship.
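Two of these warnings are easy to see numerically. The sketch below first shows how a single outlier can change a least-squares slope, then shows what the mpg equation gives far outside a plausible data range. The small data set and the 7000 lb weight are made up for illustration, and the second part assumes the original data contained no vehicles near that weight.

```python
from statistics import mean

def slope(xs, ys):
    """Least-squares slope of Y on X."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx

# Outliers: five points lying exactly on Y = 2X, then one added outlier at (6, 0)
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 6, 8, 10]
slope_clean = slope(xs, ys)
slope_outlier = slope(xs + [6], ys + [0])
print(slope_clean, round(slope_outlier, 2))

# Extrapolation: the mpg equation applied to a weight assumed to be
# far beyond the observed data gives a physically impossible value.
def predicted_mpg(weight_lbs):
    return 39.4 - 0.006 * weight_lbs

far_outside = predicted_mpg(7000)
print(round(far_outside, 1))  # negative mpg
```

One point out of six is enough to pull the slope from 2 down to well under 1, and the extrapolated prediction is a negative mpg, which no vehicle can have.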
Regression to the Mean • Sir Francis Galton was the first to call line-fitting "regression." • Extremely tall people as a group are likely to have children shorter than themselves, and extremely short people as a group are likely to have children taller than themselves. • Examples • Son's height and father's height • Pretest and posttest • Sophomore jinx
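Regression to the mean appears whenever two measurements are imperfectly correlated, and a small simulation can show it. The sketch below is an illustration under stated assumptions, not Galton's data: the mean height (69 in), standard deviation (3 in), and father-son correlation (0.5) are all invented values.

```python
import random
from statistics import mean

random.seed(0)
MEAN, SD, R = 69.0, 3.0, 0.5   # assumed mean, SD (inches), and correlation
N = 20000

# Simulate fathers, then sons whose heights correlate with fathers at about R
fathers = [random.gauss(MEAN, SD) for _ in range(N)]
noise_sd = SD * (1 - R ** 2) ** 0.5   # keeps sons' SD equal to fathers' SD
sons = [MEAN + R * (f - MEAN) + random.gauss(0, noise_sd) for f in fathers]

# Select the tallest 5% of fathers and compare group averages
cutoff = sorted(fathers)[int(0.95 * N)]
tall = [(f, s) for f, s in zip(fathers, sons) if f >= cutoff]
tall_fathers = mean(f for f, _ in tall)
tall_sons = mean(s for _, s in tall)

print(round(tall_fathers, 1), round(tall_sons, 1))
```

The sons of the tallest fathers are, on average, taller than the population mean but shorter than their fathers. The same pattern holds for pretest and posttest scores or a standout rookie season followed by a more ordinary one.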
Criminal Violence in NFL Players • Where did the data for this study come from? • Why can't we say that NFL players have a very high violent crime rate based on Benedict and Yaeger's numbers? • The analyses in this paper are primarily based on arrest rates, not conviction rates. Discuss the pros and cons of using either indicator for criminal violence.
Criminal Violence in NFL Players • What would be a suitable comparison group for NFL players? • Discuss whether each of the following factors should or should NOT be controlled for when we compare NFL players' arrest rate to that of the general population, and why. • Race? • Earnings? • College graduation? • Age?
Sex and Sports: Comparing Progress Trajectories • [Figure: Bronze medal speeds in the Olympic 100 m sprint, men vs. women. Vertical axis: speed (m/s); horizontal axis: year, 1896 to 2000.]
Sex and Sports • Why is it incorrect to extrapolate from historical data of world records (figures 1-7) and conclude that women will eventually surpass men in running and swimming? • Wainer et al. compared the trajectories of bronze medal performance for men and women over decades and extracted two parameters from the comparisons. What are the two parameters and what did they signify according to the authors? (p. 9) • The sex gap in performance in swimming appears to be smaller than that in running. What might be the physiological and social explanations for that?
Sex and Sports • Find at least two criticisms of the Wainer et al. paper in Martin's comment (pp. 16-17). • Find two places where Phillip Price (pp. 18-20) praises Wainer et al.'s study. • Find two major criticisms of this study in Price's comment.