Introduction to Statistics: Political Science (Class 3) Calculating R-Squared, Dichotomous and Nominal Variables, F-tests
R-Squared Example
• Measure of proportion of variance in Y explained by the IVs

10 Random Cases (R2 = .5336)
                        Coef.    St.Err   T       P
Bush FT                 -.090    .489     -0.18   0.860
Party Identification    12.31    7.47     1.65    0.143
Constant                50.16    18.23    2.75    0.028

Full Sample
                        Coef.    St.Err   T       P
Bush FT                 -.165    .019     -8.72   0.000
Party Identification    7.354    .278     26.44   0.000
Constant                65.28    .962     67.89   0.000
Obama FT = 50.16 + (-.090)(Bush FT) + 12.31(Party Identification) • First, we need the variance of Y • Mean = 66, so:
Total variation in Y (sum of squared deviations from the mean) = 6190 • R2 = 1 - (Sum of Squared Residuals / 6190) = .5336
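The R2 arithmetic can be sketched in a few lines of Python. This is an illustration only: the function name `r_squared` and the four data points are made up (they are not the ten random cases from the slide), and the sketch assumes the usual definition R2 = 1 - SSR/SST for a regression with a constant.

```python
def r_squared(y, y_hat):
    """Share of the variation in y explained by the predictions y_hat."""
    mean_y = sum(y) / len(y)
    # Total variation: squared deviations of Y from its mean
    sst = sum((yi - mean_y) ** 2 for yi in y)
    # Unexplained variation: squared residuals
    ssr = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))
    return 1 - ssr / sst

# Hypothetical observed values and fitted values (not the slide's data)
y = [2.0, 4.0, 6.0, 8.0]
y_hat = [3.0, 3.0, 7.0, 7.0]
example_r2 = r_squared(y, y_hat)

# Using the slide's numbers: total variation 6190 and R2 = .5336
# imply a sum of squared residuals of about 2887.
implied_ssr = (1 - 0.5336) * 6190
print(example_r2, implied_ssr)
```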
What is a “good” R2? • Predict feelings about Obama with: • Party ID and feelings about Bush • Education • Zodiac sign
Non-continuous IVs: Dealing with Dichotomous and Nominal Variables
Democratic Peace • Is the sum of democracy scores the right measure? • Alternative: Are both countries in the pair democracies? • Indicator/dummy/dichotomous variable: • 1 if both countries have democracy scores >5 • 0 otherwise
Dichotomous IV
DV: Years at peace

Model 1 (R-squared = 0.0057)
                            Coef     SE Coef   T        P
Democratic Pair (1=yes)     5.18     0.362     14.31    0.000
Constant                    24.35    0.171     142.45   0.000

Model 2 (R-squared = 0.0242)
                            Coef     SE Coef   T        P
Democratic Pair (1=yes)     4.74     0.369     12.84    0.000
Military Spending ($mil)    0.053    0.002     25.59    0.000
Constant                    22.21    0.204     108.98   0.000
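In the model with only the Democratic Pair indicator, predicted years at peace is 24.35 for non-democratic pairs and 24.35 + 5.18 = 29.53 for democratic pairs. More generally, when a regression contains a single 0/1 IV, the constant is the mean of Y in the 0 group and the coefficient is the difference in group means. The sketch below checks this on made-up data; `ols_simple` and the five observations are hypothetical, not the course dataset.

```python
def ols_simple(x, y):
    """Least-squares intercept and slope for one IV."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Hypothetical data: 1 = both countries democratic, 0 = otherwise
dem_pair = [0, 0, 0, 1, 1]
years_at_peace = [20.0, 25.0, 30.0, 28.0, 32.0]

intercept, slope = ols_simple(dem_pair, years_at_peace)

# Group means computed directly, for comparison
mean_0 = sum(y for d, y in zip(dem_pair, years_at_peace) if d == 0) / 3
mean_1 = sum(y for d, y in zip(dem_pair, years_at_peace) if d == 1) / 2
print(intercept, slope, mean_0, mean_1 - mean_0)
```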
Nominal variables • Speed dating survey: You have 100 points to distribute among the following attributes -- give more points to those attributes that are more important in a potential date, and fewer points to those attributes that are less important in a potential date. • Attractive • Fun • Intelligent • Sincere • Ambitious • Shared Interests
How do people’s perspectives/goals affect what’s important to them? • What is your primary goal in participating in this event? • Seemed like a fun night out=1 • To meet new people=2 • To get a date=3 • Looking for a serious relationship=4 • To say I did it=5 • Does this make sense as a linear scale?
Who is likely to say each of the following is important? • Attractiveness? Fun? • Seemed like a fun night out=1 • To meet new people=2 • To get a date=3 • Looking for a serious relationship=4 • To say I did it=5 • Does this make sense as a linear scale?
Effects of Nominal Variable • One Variable: • Seemed like a fun night out=1 • To meet new people=2 • To get a date=3 • Looking for a serious relationship=4 • To say I did it=5 • Five Variables: • Seemed like a fun night out (1=yes) • To meet new people (1=yes) • To get a date (1=yes) • Looking for a serious relationship (1=yes) • To say I did it (1=yes)
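Recoding the single 1-5 variable into five 0/1 indicators can be sketched as follows. The short variable names and the `to_indicators` helper are hypothetical labels chosen for this illustration; the optional `reference` argument drops one category so it can serve as the baseline.

```python
# Survey codes mapped to hypothetical short indicator names
GOALS = {
    1: "seemed_fun",
    2: "meet_people",
    3: "date",
    4: "serious_relationship",
    5: "say_did",
}

def to_indicators(goal_code, reference=None):
    """Turn one nominal code into 0/1 indicators, optionally omitting a reference category."""
    return {
        name: int(code == goal_code)
        for code, name in GOALS.items()
        if name != reference
    }

# A respondent whose goal was "to get a date" (code 3)
all_five = to_indicators(3)
four_with_baseline = to_indicators(3, reference="say_did")
print(all_five)
print(four_with_baseline)
```

Exactly one indicator equals 1 for each respondent when all five are kept, which is why the five indicators always sum to 1.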
Importance of Attribute = β0 + β1(Seemed Fun) + β2(Meet People) + β3(Date) + β4(Serious Relationship) + β5(Say Did) + u What would β0 correspond to in this model?
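“Reference Group” • Leave one indicator out (here, “to say I did it”) • Importance of Attribute = β0 + β1(Seemed Fun) + β2(Meet People) + β3(Date) + β4(Serious Relationship) + u • β0 is the expected value for the reference group; each other coefficient is the difference from that group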
(Remember: reference group is “to say I did it”) What if we want to know whether people who want a date and those who want a serious relationship differ in how important they think attractiveness is?
Easiest way: change reference category • For example, make “to get a date” the reference group: Importance of Attribute = β0 + β1(Seemed Fun) + β2(Meet People) + β4(Serious Relationship) + β5(Say Did) + u • The test on β4 then answers directly: Do people who want a date and those who want a serious relationship differ in how important they think attractiveness is?
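To see why changing the reference category works, consider the special case where the model contains only the category indicators (an assumption made for this sketch; the course model may include other IVs). In that case the constant equals the reference group's mean and each coefficient equals that group's mean minus the reference mean. The helper `dummy_coefficients` and the six data points are hypothetical.

```python
from collections import defaultdict

def dummy_coefficients(groups, y, reference):
    """Constant and coefficients for a model with only category indicators."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for g, yi in zip(groups, y):
        sums[g] += yi
        counts[g] += 1
    means = {g: sums[g] / counts[g] for g in sums}
    # Each coefficient is a difference from the reference group's mean
    coefs = {g: means[g] - means[reference] for g in means if g != reference}
    return means[reference], coefs

# Hypothetical points allocated to attractiveness
groups = ["date", "date", "serious", "serious", "say_did", "say_did"]
points = [30.0, 26.0, 20.0, 24.0, 22.0, 18.0]

const_a, coefs_a = dummy_coefficients(groups, points, reference="say_did")
const_b, coefs_b = dummy_coefficients(groups, points, reference="date")
print(const_a, coefs_a)
print(const_b, coefs_b)
```

With "say_did" as the reference, the date versus serious-relationship gap is hidden in the difference between two coefficients; with "date" as the reference, it appears as a single coefficient that can be tested on its own.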
Nominal and Dichotomous IVs • Estimated points allocated to attractiveness for men who attended because it seemed fun?
F-Tests: Testing the joint significance of variables
F-test • Way of testing joint significance of variables, i.e., whether a set of variables significantly improves explanatory power • When to use: • Nominal variables • Variables likely to be highly correlated, but important predictors
Terminology • Unrestricted model – includes IVs you want to test joint significance of • Restricted model – same model, excluding IVs to be tested • SSR – Sum of Squared Residuals
Formula • F = [(SSR_restricted - SSR_unrestricted) / q] / [SSR_unrestricted / (n - k - 1)] • q = # of variables being tested • n = number of cases • k = number of IVs in unrestricted model
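A minimal sketch of the calculation, assuming the standard textbook form of the F-statistic: the drop in the sum of squared residuals per tested variable, divided by the unrestricted sum of squared residuals per denominator degree of freedom (n - k - 1). The function name and the SSR values below are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def f_statistic(ssr_restricted, ssr_unrestricted, q, n, k):
    """F-statistic for the joint significance of q variables.

    q: number of variables being tested
    n: number of cases
    k: number of IVs in the unrestricted model
    """
    numerator = (ssr_restricted - ssr_unrestricted) / q
    denominator = ssr_unrestricted / (n - k - 1)
    return numerator / denominator

# Hypothetical sums of squared residuals (not from the course data)
f_value = f_statistic(ssr_restricted=1100.0, ssr_unrestricted=1000.0,
                      q=4, n=105, k=4)
print(f_value)
```

The restricted model can never fit better than the unrestricted one, so the numerator is never negative; a large F means the tested variables removed a lot of unexplained variation.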
Who values fun people? • What if we want to know whether the reason-for-attending variables, as a group, improve the explanatory power of the model?
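• q = # of variables being tested = 4 • n = number of cases = 2484 • k = number of IVs in unrestricted = 5 • F = [(SSR_restricted - SSR_unrestricted) / q] / [SSR_unrestricted / (n - k - 1)] = 9.25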
Statistical significance of F-test • What does an F value of 9.25 mean? • Similar idea to a t-test, but shape of F-distribution depends (heavily) on degrees of freedom • Numerator = number of IVs being tested • Denominator = N-(number of IVs)-1 • Here: 4 and 2478 (2484-5-1)
Look up critical value in a table or use Minitab • Calc > Probability Distributions > F

Cumulative Distribution Function
F distribution with 4 DF in numerator and 2478 DF in denominator
   x    P( X <= x )
9.25        1.00000

• Note: this will give you the area under the curve up to your F value, so use 1-p
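Because the Minitab output is rounded to 1.00000, the upper-tail area (1-p) is easier to see if it is computed directly. The sketch below is one way to do that in plain Python, not the Minitab procedure: it uses a closed-form expression for the upper tail of the F distribution that holds only when the numerator degrees of freedom are even (true here, with 4). The function name `f_upper_tail` is hypothetical.

```python
def f_upper_tail(x, df_num, df_den):
    """P(F > x) for an F distribution; valid only for even numerator df.

    Uses the finite-sum form of the regularized incomplete beta function
    I_z(a, b) for integer b, with a = df_den/2, b = df_num/2 and
    z = df_den / (df_den + df_num * x).
    """
    if df_num % 2 != 0:
        raise ValueError("this sketch only handles an even numerator df")
    a = df_den / 2.0
    b = df_num // 2
    z = df_den / (df_den + df_num * x)
    term = 1.0
    total = 1.0
    for j in range(1, b):
        # Builds the next term of sum_j C(a + j - 1, j) * (1 - z)^j
        term *= (a + j - 1) / j * (1 - z)
        total += term
    return z ** a * total

# The F-test from the slides: F = 9.25 with 4 and 2478 degrees of freedom
p_value = f_upper_tail(9.25, 4, 2478)
print(p_value)  # a very small number, far below 0.001
```

A p-value this small means the tested variables are jointly significant at any conventional level.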
Notes and Next Time • Graded homework will be handed back next time and model answers will be posted online early next week • New homework will be handed out next time (and due next Thursday) • Next time: • Functional form in multivariate regression