Dummy Variables; Multiple Regression, July 21, 2008. Ivan Katchanovski, Ph.D. POL 242Y-Y
Dummy Variables • Dummy variable: a variable that includes two categories which assume values 1 and 0 • Very useful in regression analysis • Nominal and ordinal variables can be transformed into dummy variables • Example: “gender”=nominal variable • Transformed into dummy variable “female”: • female=1 • male=0
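The gender-to-female recoding above can be sketched in a few lines of Python. The helper name `to_dummy` and the sample responses are made up for illustration; they are not part of the course material.

```python
def to_dummy(values, category):
    """Return 1 where a value equals `category`, else 0."""
    return [1 if v == category else 0 for v in values]

# Hypothetical responses to a nominal "gender" item.
gender = ["female", "male", "female", "male", "male"]

# Dummy variable "female": female=1, male=0.
female = to_dummy(gender, "female")
print(female)  # [1, 0, 1, 0, 0]
```

The same helper could turn one category of an ordinal variable into a dummy, at the cost of discarding the ordering among the remaining categories.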
Multiple Regression • Multiple Regression: Assesses effects of several independent variables on the dependent variable • Widely used in political science research • Multiple Regression Formula: Y = a + b1X1 + b2X2 + ... + bkXk • Y = the value of the dependent variable • a = constant or the Y intercept • bi = the regression coefficient, the partial slope of the regression line • the amount of change produced in the dependent variable (Y) by a unit change in an independent variable, keeping the other independent variables constant • Xi = the value of the independent variable • k = the number of independent variables
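To make the formula concrete, here is a minimal sketch of estimating a and the bi by ordinary least squares using only the Python standard library. The function names `solve` and `fit_ols` and the data are illustrative assumptions; the course itself relies on SPSS, and in practice a statistics package would do this work.

```python
def solve(A, b):
    """Solve the linear system A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def fit_ols(X, y):
    """Return [a, b1, ..., bk] for Y = a + b1X1 + ... + bkXk via the normal equations."""
    rows = [[1.0] + list(r) for r in X]  # leading 1 carries the constant a
    p = len(rows[0])
    XtX = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    Xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    return solve(XtX, Xty)

# Made-up data generated exactly from Y = 1 + 2*X1 + 3*X2, where X2 is a dummy.
X = [(1, 0), (2, 1), (3, 0), (4, 1), (5, 1), (6, 0)]
y = [1 + 2 * x1 + 3 * x2 for x1, x2 in X]

a, b1, b2 = fit_ols(X, y)
print(round(a, 6), round(b1, 6), round(b2, 6))
```

Because the data contain no noise, the fit recovers the generating values (a close to 1, b1 close to 2, b2 close to 3) up to floating-point error. Here b2 is the difference in Y between the dummy's two categories at any fixed value of X1.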
Standardized Regression Coefficient (Beta) • Standardized Regression Coefficient (Beta): The slope of the relationship between a particular independent variable and the dependent variable when all scores have been standardized (converted to z-scores) • change in the dependent variable (Y), expressed in standard deviations (s), produced by a change of one standard deviation in an independent variable • Useful in comparing relative effects of independent variables which are measured in different units • Canadian dollars, years, etc.
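One way to see why Beta does not depend on units is the standard identity Beta = b × s(X) / s(Y). The sketch below applies it to made-up numbers; the function name `standardized_coefficient` is an assumption for illustration.

```python
from statistics import stdev

def standardized_coefficient(b, x, y):
    """Convert an unstandardized slope b into Beta using sample standard deviations."""
    return b * stdev(x) / stdev(y)

# Made-up example: the same predictor recorded in $1000s and in dollars.
y = [2, 4, 6, 8, 10]
x_thousands = [1, 2, 3, 4, 5]
x_dollars = [1000 * v for v in x_thousands]

# The unstandardized slope changes with the unit (2 per $1000 = 0.002 per $1)...
beta_thousands = standardized_coefficient(2, x_thousands, y)
beta_dollars = standardized_coefficient(0.002, x_dollars, y)

# ...but the standardized coefficient does not.
print(round(beta_thousands, 3), round(beta_dollars, 3))  # 1.0 1.0
```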
Statistical Significance • Statistical significance of unstandardized regression coefficient (bi): • Statistically significant if p(obtained)<p(critical)=.05 or .01 or .001 • Statistically nonsignificant if p(obtained)>p(critical)=.05 • Direction of association should be reported only for statistically significant regression coefficients • Statistical significance of regression: • Statistically significant if for F-statistic p(obtained)<p(critical)=.05 or .01 or .001
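The decision rule above can be written as a small function that reports the strictest conventional level a p-value clears. This is only a sketch; the name `significance_level` and the sample p-values are assumptions, not output from any real analysis.

```python
def significance_level(p, levels=(0.001, 0.01, 0.05)):
    """Return the strictest critical value that p falls below, or None if nonsignificant."""
    for alpha in sorted(levels):
        if p < alpha:
            return alpha
    return None

# Hypothetical p-values for three coefficients.
for p in (0.0004, 0.03, 0.2):
    level = significance_level(p)
    label = f"significant at the {level} level" if level else "nonsignificant"
    print(p, label)
```

The same check applies to the p-value of the F-statistic when judging the regression as a whole.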
Coefficient of Multiple Determination (R square) • Coefficient of Multiple Determination (R square): • The proportion of the total variation in the dependent variable explained by all independent variables combined • Ranges between 0 (no association) and 1 (perfect association) • Adjusted Coefficient of Multiple Determination: • R square adjusted for the number of independent variables • Preferable to non-adjusted R square in multiple regression • Cannot exceed 1 (perfect association); usually falls between 0 and 1, but can be slightly negative when the independent variables explain almost none of the variation
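The adjustment is usually computed as 1 - (1 - R square)(N - 1)/(N - k - 1), where N is the number of cases and k the number of independent variables. The sketch below assumes that standard formula; the input values are made up.

```python
def adjusted_r_squared(r2, n, k):
    """Adjust R square for k independent variables estimated on n cases."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

# Made-up example: R square of .40 from 50 cases and 3 independent variables.
print(round(adjusted_r_squared(0.40, 50, 3), 3))  # 0.361
```

Adding independent variables never lowers the unadjusted R square, so the penalty in the denominator is what makes the adjusted version the fairer basis for comparing models of different sizes.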
Example: Multiple Research Hypotheses • First Research Hypothesis: The level of economic development has a positive effect on the level of democracy • Second Research Hypothesis: Former British colonies are more likely to be democratic compared to other countries • Third Research Hypothesis: Protestant countries are more likely to be democratic compared to other countries • Dataset: World
Example: Variables • Dependent Variable: • Freedom House democracy rating, reversed so that higher scores indicate more democracy • Interval-ratio • Independent Variables: • GDP per capita ($1000) • Interval-ratio • Former British colony • Dummy variable: Yes (British colony)=1; No (Not British colony)=0 • Protestant country • Dummy variable: Yes (Protestant)=1; No (All Other)=0
Example: Regression Coefficients • Unstandardized Regression Coefficient of GDP per capita variable=.217 • An increase of $1000 in GDP per capita increases the democracy score, on a scale from 1 to 7, by .217, keeping the other independent variables constant • Unstandardized Regression Coefficient of the British colony variable=.045 • The average former British colony has a democracy score .045 units higher than other countries • Unstandardized Regression Coefficient of the Protestant country variable=-.054 • The average Protestant country has a democracy score .054 units lower than non-Protestant countries
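Since the constant is not reported here, the coefficients can only be used to compute changes in the predicted score, not the score itself. The sketch below does that arithmetic with the three reported coefficients; the dictionary keys and the function name `predicted_change` are hypothetical labels.

```python
# Unstandardized coefficients reported in the example above.
COEFFICIENTS = {
    "gdp_per_capita_1000": 0.217,
    "british_colony": 0.045,
    "protestant": -0.054,
}

def predicted_change(changes):
    """Change in the predicted democracy score for the given changes in the predictors."""
    return sum(COEFFICIENTS[name] * delta for name, delta in changes.items())

# A $5000 rise in GDP per capita, other variables held constant.
print(round(predicted_change({"gdp_per_capita_1000": 5}), 3))  # 1.085

# A Protestant former British colony versus a country that is neither.
print(round(predicted_change({"british_colony": 1, "protestant": 1}), 3))  # -0.009
```

These are linear extrapolations, so large changes in GDP per capita could push a prediction outside the 1 to 7 range of the scale.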
Example: Standardized Regression Coefficients • Standardized Regression Coefficient of GDP per capita variable=.612 • Standardized Regression Coefficient of the British colony variable=.012 • Standardized Regression Coefficient of the Protestant country variable=-.012 • The GDP per capita variable has a much bigger effect on the level of democracy than the British colony variable and the Protestant country variable
Example: Statistical Significance • Number of cases: N=111 • .1 or 10% significance level can be used • Regression coefficient of the GDP variable: • SPSS: p(obtained)=.000 <p(critical)=.001=.1% • Statistically significant at the .001 or .1% level • Regression coefficient of the British colony variable: • SPSS: p(obtained)=.878>p(critical)=.1 • Statistically insignificant • Regression coefficient of the Protestant country variable: • SPSS: p(obtained)=.890>p(critical)=.1 • Statistically insignificant
Example: Interpretation • Adjusted R square=.351 • GDP per capita, British colony, and Protestant country variables explain 35.1% of variation in the Freedom House democracy scale • The first research hypothesis is supported by multiple regression analysis • The level of economic development has a positive and statistically significant effect on democracy • The second and the third research hypotheses are not supported by multiple regression analysis
Limitations of Multiple Regression • Correlation is not always causation • Assumes linear relationship between variables • Omitted variables problem: • Potentially relevant factors are not included in multiple regression • Multicollinearity problem: • Two independent variables are very strongly correlated (absolute value of the correlation coefficient is higher than .80) • Possible solution: exclude one of these independent variables from multiple regression
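A simple screen for the multicollinearity problem is to correlate every pair of independent variables before running the regression and flag pairs beyond the .80 rule of thumb. The sketch below does this with made-up data; the variable names, the values, and the helper `collinear_pairs` are assumptions for illustration only.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def collinear_pairs(predictors, threshold=0.80):
    """Return (name1, name2, r) for every pair whose |r| exceeds the threshold."""
    names = list(predictors)
    flagged = []
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            r = pearson_r(predictors[first], predictors[second])
            if abs(r) > threshold:
                flagged.append((first, second, round(r, 3)))
    return flagged

# Hypothetical independent variables for six countries.
predictors = {
    "gdp_per_capita": [1, 2, 3, 4, 5, 6],
    "household_income": [1.1, 2.0, 3.2, 3.9, 5.1, 6.0],
    "british_colony": [1, 0, 0, 1, 0, 1],
}

for first, second, r in collinear_pairs(predictors):
    print(first, second, r)
```

With these numbers only the GDP and income pair is flagged, which is the situation where dropping one of the two variables would be the suggested remedy. Pairwise correlations will not catch collinearity that involves three or more variables jointly.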