
Dummy Variables; Multiple Regression July 21, 2008

Ivan Katchanovski, Ph.D. POL 242Y-Y. Dummy Variables. Dummy variable: a variable that includes two categories which assume values 1 and 0. Very useful in regression analysis.


Presentation Transcript


  1. Dummy Variables; Multiple Regression, July 21, 2008. Ivan Katchanovski, Ph.D. POL 242Y-Y

  2. Dummy Variables • Dummy variable: a variable with two categories, coded 1 and 0 • Very useful in regression analysis • Nominal and ordinal variables can be transformed into dummy variables • Example: “gender” = nominal variable • Transformed into the dummy variable “female”: • female=1 • male=0
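The recoding described on this slide can be sketched in a few lines of Python. The helper name `to_dummy` and the sample observations are made up for illustration; they are not part of the presentation.

```python
def to_dummy(values, category):
    """Return 1 where a value equals `category`, otherwise 0."""
    return [1 if v == category else 0 for v in values]


# Hypothetical observations of the nominal variable "gender".
gender = ["female", "male", "male", "female"]

# Dummy variable "female": female=1, male=0.
female = to_dummy(gender, "female")
print(female)
```

The same helper could be applied once per category of a variable with more than two categories, leaving one category out as the reference group.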

  3. Multiple Regression • Multiple regression: assesses the effects of several independent variables on the dependent variable • Widely used in political science research • Multiple regression formula: Y = a + b1X1 + b2X2 + ... + bkXk • Y = the value of the dependent variable • a = the constant, or Y intercept • bi = the regression coefficient, the partial slope of the regression line • the amount of change produced in the dependent variable (Y) by a unit change in the independent variable Xi, keeping the other independent variables constant • Xi = the value of the i-th independent variable • k = the number of independent variables
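As a rough illustration of how a and the bi in this formula are estimated, the sketch below fits ordinary least squares by solving the normal equations in plain Python. The function names and the toy data are assumptions made for this example; the presentation itself relies on SPSS, and a real analysis would normally use a statistics package.

```python
def solve_linear_system(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting.

    Assumes A is square and non-singular (no perfectly collinear columns).
    """
    n = len(A)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]  # augmented matrix
    for col in range(n):
        # Swap in the row with the largest pivot for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    # Back substitution.
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def fit_ols(X, y):
    """Return [a, b1, ..., bk] that minimise the sum of squared errors."""
    rows = [[1.0] + list(r) for r in X]  # leading 1.0 is the intercept column
    p = len(rows[0])
    XtX = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    Xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    return solve_linear_system(XtX, Xty)


# Toy data generated exactly from Y = 1 + 2*X1 + 3*X2 (X2 is a dummy).
X = [(1, 0), (2, 1), (3, 0), (4, 1), (5, 0)]
y = [1 + 2 * x1 + 3 * x2 for x1, x2 in X]

a, b1, b2 = fit_ols(X, y)
print(round(a, 6), round(b1, 6), round(b2, 6))
```

Because the toy data contain no noise, the fit recovers the constant and both partial slopes used to generate them.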

  4. Standardized Regression Coefficient (Beta) • Standardized Regression Coefficient (Beta): The slope of the relationship between a particular independent variable and the dependent variable when all scores have been standardized (converted to z-scores) • the change in the dependent variable (Y), expressed in standard deviations (s), produced by a change of one standard deviation in an independent variable • Useful for comparing the relative effects of independent variables that are measured in different units • Canadian dollars, years, etc.
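One common way to obtain Beta from an unstandardized coefficient is Beta = b × (s of X) / (s of Y). The sketch below applies that conversion to made-up data; the function name `standardized_beta` and the numbers are illustrative assumptions, not values from the presentation.

```python
from statistics import stdev


def standardized_beta(b, x, y):
    """Convert an unstandardized slope b into a standardized coefficient."""
    return b * stdev(x) / stdev(y)


# Made-up data where y = 2 * x exactly, so the unstandardized slope is 2.
x_thousands = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]
beta = standardized_beta(2.0, x_thousands, y)

# The same variable measured in single units instead of thousands:
# the unstandardized slope shrinks by a factor of 1000, Beta does not change.
x_units = [1000 * v for v in x_thousands]
beta_units = standardized_beta(2.0 / 1000, x_units, y)

print(beta, beta_units)
```

This is why Beta is convenient for comparing variables measured in different units: rescaling a variable changes b but leaves Beta as it was.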

  5. Statistical Significance • Statistical significance of an unstandardized regression coefficient (bi): • Statistically significant if p(obtained)<p(critical)=.05 or .01 or .001 • Statistically nonsignificant if p(obtained)>p(critical)=.05 • The direction of association should be reported only for statistically significant regression coefficients • Statistical significance of the regression as a whole: • Statistically significant if, for the F-statistic, p(obtained)<p(critical)=.05 or .01 or .001
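The decision rule on this slide amounts to comparing p(obtained) with a set of critical values. A minimal sketch, with a hypothetical helper name and made-up p-values:

```python
def significance_level(p, levels=(0.001, 0.01, 0.05)):
    """Return the strictest critical level that p falls below, or None."""
    for level in sorted(levels):
        if p < level:
            return level
    return None  # not statistically significant at any listed level


print(significance_level(0.0004))  # significant at the .001 level
print(significance_level(0.03))    # significant at the .05 level only
print(significance_level(0.2))     # None: not significant
```

Passing a different tuple, for example `levels=(0.1,)`, applies a more lenient critical value.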

  6. Coefficient of Multiple Determination (R square) • Coefficient of Multiple Determination (R square): • The proportion of the total variation in the dependent variable that is explained by all independent variables combined • Ranges between 0 (no association) and 1 (perfect association) • Adjusted Coefficient of Multiple Determination: • R square adjusted for the number of independent variables • Preferable to non-adjusted R square in multiple regression • Never exceeds 1 (perfect association); usually falls between 0 and 1, but can drop slightly below 0 when the independent variables explain almost nothing
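Both quantities can be computed directly from observed and predicted values. The sketch below uses the standard formulas R square = 1 − SS(residual)/SS(total) and adjusted R square = 1 − (1 − R square)(n − 1)/(n − k − 1); the function names and the data are made up for illustration.

```python
def r_squared(y, y_hat):
    """Share of the variation in y explained by the predictions y_hat."""
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


def adjusted_r_squared(r2, n, k):
    """Adjust R square for n cases and k independent variables."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


# Made-up observed values and predictions from a one-variable model.
y = [1, 2, 3, 4, 5]
y_hat = [1.1, 1.9, 3.2, 3.8, 5.0]

r2 = r_squared(y, y_hat)
adj = adjusted_r_squared(r2, n=len(y), k=1)
print(r2, adj)

# A weak model with several predictors: the adjustment goes below zero.
weak = adjusted_r_squared(0.02, n=10, k=3)
print(weak)
```

The adjustment penalizes each added independent variable, so the adjusted value is never larger than the unadjusted one.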

  7. Example: Multiple Research Hypotheses • First Research Hypothesis: The level of economic development has a positive effect on the level of democracy • Second Research Hypothesis: Former British colonies are more likely to be democratic compared to other countries • Third Research Hypothesis: Protestant countries are more likely to be democratic compared to other countries • Dataset: World

  8. Example: Variables • Dependent Variable: • Freedom House democracy rating, reversed so that higher scores indicate more democracy: • Interval-ratio • Independent Variables: • GDP per capita ($1000) • Interval-ratio • Former British colony • Dummy variable: Yes (British colony)=1; No (Not British colony)=0 • Protestant country • Dummy variable: Yes (Protestant)=1; No (All Other)=0

  9. Example: Regression Coefficients • Unstandardized regression coefficient of the GDP per capita variable=.217 • An increase of $1000 in GDP per capita raises the democracy score (on a scale from 1 to 7) by .217, holding the other variables constant • Unstandardized regression coefficient of the British colony variable=.045 • The average former British colony has a democracy score that is .045 units higher than that of other countries, holding the other variables constant • Unstandardized regression coefficient of the Protestant country variable=-.054 • The average Protestant country has a democracy score that is .054 units lower than that of non-Protestant countries, holding the other variables constant
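These interpretations can be checked with a small calculation. The slides do not report the constant (a), so the sketch below only computes differences in predicted scores, where the constant cancels out. The function name and the scenarios are illustrative assumptions; the three coefficients are the ones given on the slide.

```python
# Unstandardized coefficients reported on the slide.
B_GDP = 0.217         # per $1000 of GDP per capita
B_BRITISH = 0.045     # former British colony (1) vs. other (0)
B_PROTESTANT = -0.054  # Protestant country (1) vs. other (0)


def predicted_difference(d_gdp=0.0, d_british=0, d_protestant=0):
    """Difference in predicted democracy score between two countries.

    Arguments are differences in the independent variables; the unknown
    constant cancels because it is the same for both countries.
    """
    return B_GDP * d_gdp + B_BRITISH * d_british + B_PROTESTANT * d_protestant


# Hypothetical comparison: $5000 more GDP per capita, otherwise identical.
richer = predicted_difference(d_gdp=5)

# Hypothetical comparison: former British colony vs. not, otherwise identical.
colony = predicted_difference(d_british=1)

print(round(richer, 3), round(colony, 3))
```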

  10. Example: Standardized Regression Coefficients • Standardized regression coefficient of the GDP per capita variable=.612 • Standardized regression coefficient of the British colony variable=.012 • Standardized regression coefficient of the Protestant country variable=-.012 • The GDP per capita variable has a much larger effect on the level of democracy than the British colony variable or the Protestant country variable

  11. Example: Statistical Significance • Number of cases: N=111 • The .1 or 10% significance level can be used • Regression coefficient of the GDP variable: • SPSS: p(obtained)=.000 (that is, below .0005) < p(critical)=.001 or .1% • Statistically significant at the .001 or .1% level • Regression coefficient of the British colony variable: • SPSS: p(obtained)=.878 > p(critical)=.1 • Not statistically significant • Regression coefficient of the Protestant country variable: • SPSS: p(obtained)=.890 > p(critical)=.1 • Not statistically significant

  12. Example: Interpretation • Adjusted R square=.351 • The GDP per capita, British colony, and Protestant country variables together explain 35.1% of the variation in the Freedom House democracy scale • The first research hypothesis is supported by the multiple regression analysis • The level of economic development has a positive and statistically significant effect on democracy • The second and third research hypotheses are not supported by the multiple regression analysis

  13. Limitations of Multiple Regression • Correlation does not necessarily imply causation • Assumes a linear relationship between variables • Omitted variables problem: • Potentially relevant factors are not included in the multiple regression • Multicollinearity problem: • Two independent variables are very strongly correlated (as a rule of thumb, a correlation coefficient higher than .80) • Possible solution: exclude one of these independent variables from the multiple regression
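A simple screen for the multicollinearity problem is to compute the correlation between every pair of independent variables and flag pairs above the .80 threshold mentioned on the slide. The sketch below does this in plain Python; the variable names and values are made up, and the pairwise check is only a first pass (it would not catch collinearity that involves three or more variables at once).

```python
from itertools import combinations
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def collinear_pairs(variables, threshold=0.80):
    """Return (name_a, name_b, r) for pairs with |r| above the threshold."""
    flagged = []
    for (name_a, a), (name_b, b) in combinations(variables.items(), 2):
        r = pearson_r(a, b)
        if abs(r) > threshold:
            flagged.append((name_a, name_b, r))
    return flagged


# Hypothetical independent variables for six countries.
variables = {
    "gdp": [1, 2, 3, 4, 5, 6],
    "income": [1.1, 2.0, 3.2, 3.9, 5.1, 6.0],  # nearly the same as gdp
    "colony": [1, 0, 0, 1, 0, 1],
}

flagged = collinear_pairs(variables)
for name_a, name_b, r in flagged:
    print(name_a, name_b, round(r, 3))
```

Here only the gdp and income pair is flagged, so one of the two would be a candidate for exclusion.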
