Statistics Workshop
Bivariate Descriptive Statistics
J-Term 2009
Bert Kritzer
Describing Relationships Between Two Variables
• Variables
  • X: predictor (“independent”) variable
    • Xi is the value for the ith observation
  • Y: response (“dependent”) variable
    • Yi is the value for the ith observation
• The appropriate description depends on the nature of the two variables (e.g., two nominal, two interval, etc.)
  • Simple table
    • Percentages: the right and the wrong way
  • Difference of means or medians
  • Multiple boxplots
  • Regression: fitting a line through a “scatterplot” of points (the Xi and Yi values)
  • Correlation: measuring the strength of the relationship
Data Spreadsheet Paired Values
Simple “Crosstabulation”: Trust in the Police
Question: How much of the time do you think you can trust the local police?
How Not To Do Percentages Source: Sarver, Kaheny, & Szmer, The Attorney Gender Gap in U.S. Supreme Court Litigation, 91 Judicature 238, 248 (2008).
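A small sketch of the two directions in which a crosstabulation can be percentaged. The counts, group labels, and function names below are hypothetical, invented only for illustration. A common convention, assumed here to be the point of the slide, is to percentage within categories of the predictor so that the groups can be compared directly.

```python
# Hypothetical counts, invented purely for illustration (not from the slides).
# Outer keys: categories of the predictor; inner keys: response categories.
counts = {
    "Group A": {"Most of the time": 60, "Some of the time": 30, "Hardly ever": 10},
    "Group B": {"Most of the time": 20, "Some of the time": 20, "Hardly ever": 10},
}


def percent_within_predictor(table):
    """Percentage the responses within each predictor group (100% per group)."""
    result = {}
    for group, responses in table.items():
        total = sum(responses.values())
        result[group] = {resp: 100.0 * n / total for resp, n in responses.items()}
    return result


def percent_within_response(table):
    """Percentage the groups within each response category (100% per response)."""
    response_labels = list(next(iter(table.values())).keys())
    result = {}
    for resp in response_labels:
        total = sum(table[group][resp] for group in table)
        result[resp] = {group: 100.0 * table[group][resp] / total for group in table}
    return result


by_predictor = percent_within_predictor(counts)
by_response = percent_within_response(counts)

# Within groups: 60% of Group A vs. 40% of Group B answer "Most of the time".
print(by_predictor["Group A"]["Most of the time"], by_predictor["Group B"]["Most of the time"])
# Within responses: Group A supplies 75% of the "Most of the time" answers,
# a figure that mostly reflects Group A being twice as large as Group B.
print(by_response["Most of the time"]["Group A"])
```

With these made-up counts, percentaging within the response category gives 75% versus 25%, which mainly reflects the unequal group sizes, while percentaging within the predictor gives the more comparable 60% versus 40%.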
Graphics for Crosstabulations: Multiple Pie Charts
Trust in the Police by Race
Feeling Thermometer Source: http://www.laits.utexas.edu/txp_media/html/poll/features/feeling/slide1.html (visited September 4, 2008)
FT-SCOTUS: Means with Standard Deviation Bars
Note: Red dots represent the mean; lines extend one standard deviation above and below the mean.
The Regression Line
The Line: An Observation: A Prediction:
ei is the difference between the actual observed value, Yi, and the value of Y on the line that corresponds to Xi.
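In one common notation, the three pieces named on this slide can be written as follows. The symbols a (intercept), b (slope), and the hat on the predicted value are assumptions here, not taken from the slide; they are chosen to match the form of the fitted line Y = 12.89 – 0.10X used later in the deck.

```latex
\begin{align*}
\text{The line:}       \quad & Y = a + bX \\
\text{An observation:} \quad & Y_i = a + bX_i + e_i \\
\text{A prediction:}   \quad & \hat{Y}_i = a + bX_i \\
\text{so that}         \quad & e_i = Y_i - \hat{Y}_i
\end{align*}
```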
Fitting the Line
• Eyeball
• Split medians
• Minimize sum of errors
• Minimize sum of absolute errors
• Minimize sum of squared errors (“Least Squares”)
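The last three criteria can be compared in a short sketch. The data points and function names below are hypothetical, chosen only for illustration. The code fits the least-squares line with the standard closed-form formulas and also shows why the plain sum of errors is a poor criterion: positive and negative errors cancel, so any line through the point of means, even a flat one, has a sum of errors of zero.

```python
# Hypothetical paired values, invented for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]


def least_squares_line(xs, ys):
    """Return (intercept, slope) minimizing the sum of squared errors."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def errors(intercept, slope, xs, ys):
    """Observed Y minus the value on the line, for each observation."""
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]


def sum_of_errors(intercept, slope, xs, ys):
    return sum(errors(intercept, slope, xs, ys))


def sum_of_absolute_errors(intercept, slope, xs, ys):
    return sum(abs(e) for e in errors(intercept, slope, xs, ys))


def sum_of_squared_errors(intercept, slope, xs, ys):
    return sum(e ** 2 for e in errors(intercept, slope, xs, ys))


a, b = least_squares_line(xs, ys)
flat_a, flat_b = sum(ys) / len(ys), 0.0  # flat line through the mean of Y

# Both lines have a (near-)zero sum of errors, so that criterion cannot choose.
print(sum_of_errors(a, b, xs, ys), sum_of_errors(flat_a, flat_b, xs, ys))
# The squared-error criterion does distinguish them.
print(sum_of_squared_errors(a, b, xs, ys), sum_of_squared_errors(flat_a, flat_b, xs, ys))
```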
The Fitted Regression Line
Y = 12.89 – 0.10X
For every ten-point increase in citizen liberalism, the line predicts one fewer tort reform adopted.
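The interpretation of the slope can be checked with a few lines of code. The coefficients come from the fitted line Y = 12.89 – 0.10X; the function name and the liberalism scores 50 and 60 are hypothetical values picked only to illustrate a ten-point increase.

```python
INTERCEPT = 12.89  # from the fitted line on the slide
SLOPE = -0.10      # from the fitted line on the slide


def predicted_y(citizen_liberalism):
    """Value of Y on the fitted line for a given citizen liberalism score."""
    return INTERCEPT + SLOPE * citizen_liberalism


# Hypothetical scores ten points apart.
change = predicted_y(60) - predicted_y(50)
print(round(change, 2))  # ten points times the slope of -0.10
```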
Correlation
• Measure of association; strength of relationship
• Range: 0 to 1, or -1 through 0 to +1
• Proportional reduction in error (“PRE”)
  • Determining a prediction method
  • Setting a baseline
• Non-PRE correlation coefficients
Product Moment Correlation Traditional formula for r:
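For reference, the standard definitional form of the product-moment correlation is given below. It is assumed, not confirmed by the source, that this is the formula the slide displayed; the bar notation for means is likewise an assumption.

```latex
\[
r = \frac{\sum_{i} (X_i - \bar{X})(Y_i - \bar{Y})}
         {\sqrt{\sum_{i} (X_i - \bar{X})^2 \; \sum_{i} (Y_i - \bar{Y})^2}}
\]
```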
Other Ways of Computing r
Cope’s Method
Traditional Method
• Sum the values of X and sum the values of Y to get ΣX and ΣY
• Compute the square of each value of X and Y
• Sum the values of X² and sum the values of Y² to get ΣX² and ΣY²
• Multiply together each pair of values for X and Y to get the product XY
• Sum the values of the product XY to get ΣXY
• Use the values in the formula below to get r
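The listed steps can be sketched in code. The final formula is an assumption: it is the standard raw-score (computational) formula for r, which also needs the number of observations N. The function name and the sample data are hypothetical.

```python
import math


def pearson_r_from_sums(xs, ys):
    """Compute r from the raw sums, following the listed steps."""
    n = len(xs)
    sum_x = sum(xs)                               # ΣX
    sum_y = sum(ys)                               # ΣY
    sum_x2 = sum(x * x for x in xs)               # ΣX²
    sum_y2 = sum(y * y for y in ys)               # ΣY²
    sum_xy = sum(x * y for x, y in zip(xs, ys))   # ΣXY
    # Assumed formula:
    # r = (NΣXY - ΣXΣY) / sqrt([NΣX² - (ΣX)²][NΣY² - (ΣY)²])
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )
    return numerator / denominator


# Hypothetical paired values, invented for illustration only.
r = pearson_r_from_sums([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(round(r, 3))
```

This form gives the same value as the definitional formula based on deviations from the means; it only rearranges the arithmetic so that the sums can be accumulated in a single pass.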
Time Plots: Showing Change Over Time
Women Law Graduates & Women SC Clerks
eta²: FT-SCOTUS by Ideology
Baseline = 332462.11
Alternative = 316208.24
eta² = (332462.11 – 316208.24)/332462.11 = .049
eta = .221
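The arithmetic on this slide follows the proportional-reduction-in-error pattern: (baseline error minus alternative error) divided by baseline error. A minimal sketch using the two figures from the slide is shown below. The function name is hypothetical, and reading the baseline as error around the overall mean and the alternative as error around the ideology-group means is an assumption about how the figures were produced.

```python
import math


def proportional_reduction_in_error(baseline_error, alternative_error):
    """Share of the baseline prediction error removed by the alternative method."""
    return (baseline_error - alternative_error) / baseline_error


BASELINE = 332462.11     # from the slide
ALTERNATIVE = 316208.24  # from the slide

eta_squared = proportional_reduction_in_error(BASELINE, ALTERNATIVE)
eta = math.sqrt(eta_squared)

print(round(eta_squared, 3), round(eta, 3))  # 0.049 0.221
```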
Moving Beyond Two Variables Tort Reform by Citizen Liberalism & Elite Liberalism
Multiple Regression ONE PREDICTOR: TWO PREDICTORS: or k PREDICTORS: or
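One conventional way to write the three cases is sketched below. The symbols a, b, and the subscripted b's and X's are assumptions, chosen to match the fitted equations elsewhere in the deck; the alternative forms that followed each "or" on the slide are not recoverable, so only a summation form is offered for the k-predictor case.

```latex
\begin{align*}
\text{One predictor:}  \quad & \hat{Y} = a + bX \\
\text{Two predictors:} \quad & \hat{Y} = a + b_1 X_1 + b_2 X_2 \\
\text{$k$ predictors:} \quad & \hat{Y} = a + b_1 X_1 + b_2 X_2 + \cdots + b_k X_k
                               \;=\; a + \sum_{j=1}^{k} b_j X_j
\end{align*}
```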
Multiple Regression: Tort Reform by Citizen and Elite Liberalism
TortReformIndex = 13.032 – 0.062∙Citizen – 0.040∙Elite
R² = 0.264
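A brief sketch of how each coefficient in this equation is read: it is the predicted change in the index for a one-point change in that predictor while the other predictor is held fixed. The coefficients come from the slide; the function name and the scores 50 and 60 are hypothetical.

```python
def predicted_tort_reform_index(citizen, elite):
    """Predicted index from the two-predictor equation on the slide."""
    return 13.032 - 0.062 * citizen - 0.040 * elite


# Hypothetical scores: raise one predictor by ten points, hold the other at 50.
citizen_effect = predicted_tort_reform_index(60, 50) - predicted_tort_reform_index(50, 50)
elite_effect = predicted_tort_reform_index(50, 60) - predicted_tort_reform_index(50, 50)

print(round(citizen_effect, 2), round(elite_effect, 2))
```

With elite liberalism in the equation, a ten-point increase in citizen liberalism corresponds to a predicted change of about 0.62 in the index, compared with the 1.0 implied by the one-predictor line Y = 12.89 – 0.10X.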
With Illustrations by the Author, A SQUARE (Edwin A. Abbott, 1838-1926), 1884
Descriptive Statistics: Summary
• Summarize and describe data
• Univariate
  • Central tendency & dispersion
  • Distribution
• Bivariate
  • Describe the relationship
  • Degree of relationship
• Multivariate