
Quantitative Methods in Social Sciences (E774)

Quantitative Methods in Social Sciences (E774). Sudip Ranjan Basu, Ph.D. 20 November 2009. Regressions: causal relationships. Source: S.R. Basu (2008): A new way to link development to institutions, policies and geography, United Nations, New York and Geneva. Lecture 10 - Sudip R. Basu.

Presentation Transcript


  1. Quantitative Methods in Social Sciences (E774) Sudip Ranjan Basu, Ph.D. 20 November 2009

  2. Regressions: Causal relationships Source: S.R. Basu (2008): A new way to link development to institutions, policies and geography, United Nations, New York and Geneva.

  3. Linear relationship To analyse how values of Y tend to change according to values of X. Y: response variable; X: explanatory variable. The mean of Y is a linear function of X: E(Y) = α + βX, where α is the y-intercept and β is the slope. Models are simple approximations of reality.
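The linear function above can be illustrated in a couple of lines; the intercept and slope values here are hypothetical, chosen only for illustration:

```python
# E(Y) = alpha + beta * X: the mean of Y shifts by beta (the slope) for each
# one-unit increase in X. Hypothetical coefficients, for illustration only.
alpha, beta = 2.2, 0.6
mean_y = [alpha + beta * x for x in (1, 2, 3, 4, 5)]  # ≈ [2.8, 3.4, 4.0, 4.6, 5.2]
```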

  4. Assumptions of Statistical Inference The data are a random sample. The mean of Y is related to X by the linear equation E(Y) = α + βX. The conditional standard deviation σ is identical at each X-value (homoscedasticity). The conditional distribution of Y at each value of X is normal.

  5. Least Squares Prediction Prediction equation: ŷ = a + bx estimates the linear relationship in the sample; a and b are the estimated coefficients of the prediction equation. Residual: y − ŷ for an observation, the difference between an observed value and the predicted value of y; it is the prediction error. The least-squares estimates a and b are the values that provide the prediction equation for which the residual sum of squares, SSE = Σ(y − ŷ)², is a minimum.
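The standard least-squares formulas (b = Sxy/Sxx, a = ȳ − b·x̄) can be sketched directly; the data below are illustrative, not from the lecture:

```python
# Least-squares estimates for y-hat = a + b*x, using the textbook formulas
# b = S_xy / S_xx and a = ybar - b * xbar. Illustrative data only.
def least_squares(x, y):
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    sxx = sum((xi - xbar) ** 2 for xi in x)
    b = sxy / sxx
    a = ybar - b * xbar
    return a, b

x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
a, b = least_squares(x, y)              # a = 2.2, b = 0.6 for these data
residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]
sse = sum(e ** 2 for e in residuals)    # residual sum of squares, minimised by a and b
```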

  6. Linear Regression Model The regression function is a mathematical function that describes how the mean of Y changes according to the value of X: E(Y) = α + βX. β is the regression coefficient; σ is the conditional standard deviation. The estimate r² measures how well the prediction equation performs: r² = (TSS − SSE)/TSS, the proportional reduction in prediction error.

  7. Inferences for « Slope » Test of independence: H0: β = 0 (the variables are statistically independent). Test statistic: t = b / se(b). Standard error of b: se(b) = s / √Σ(x − x̄)², where s estimates the conditional standard deviation. P-value for Ha: β ≠ 0: two-tail probability from the t distribution with degrees of freedom df = n − 2. Confidence interval for the slope: b ± t·se(b). A small P-value for H0: β = 0 indicates that the regression line has a nonzero slope.

  8. Inferences for « Correlation » The sample correlation r = 0 corresponds to b = 0 in the sample; the population correlation ρ = 0 corresponds to β = 0 in the population. H0: ρ = 0 (the variables are statistically independent). Test statistic: t = r√(n − 2) / √(1 − r²).
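For the same illustrative data (r² = 0.6, n = 5), this test statistic equals the one from the slope test, since the two tests are equivalent:

```python
import math

# Test H0: rho = 0 with t = r * sqrt(n - 2) / sqrt(1 - r^2).
# r and n come from the running illustrative example (r^2 = 0.6, n = 5).
r = math.sqrt(0.6)
n = 5
t = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)   # same value as the slope t statistic
```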

  9. Model Assumptions and Violations The linear regression equation; extrapolation is dangerous; influential observations; factors influencing correlation; the regression model with error terms; models and reality.

  10. Some concepts Control variable: used to understand the influences of related variables. Lurking variable: a variable not measured in the model but that influences the association of interest. Statistical interaction: exists between X1 and X2 in their effects on Y when the true effect of one predictor on Y changes as the value of the other predictor changes.
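The interaction definition can be made concrete with a product term. In a model with E(Y) = a + b1·X1 + b2·X2 + b3·X1·X2, the effect of X1 is b1 + b3·X2, so it changes with X2; the coefficients below are hypothetical:

```python
# Statistical interaction: with a cross-product term X1*X2 in the model,
# the effect of X1 on the mean of Y is b1 + b3*X2 -- it depends on X2.
# Hypothetical coefficients, for illustration only.
a, b1, b2, b3 = 1.0, 2.0, 0.5, -0.4

def mean_y(x1, x2):
    return a + b1 * x1 + b2 * x2 + b3 * x1 * x2

slope_at_x2_0 = mean_y(1, 0) - mean_y(0, 0)   # effect of X1 when X2 = 0 -> b1 = 2.0
slope_at_x2_5 = mean_y(1, 5) - mean_y(0, 5)   # effect of X1 when X2 = 5 -> b1 + 5*b3 = 0.0
```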

  11. Multiple Regressions Source: S.R. Basu (2008): A new way to link development to institutions, policies and geography, United Nations, New York and Geneva.

  12. Theory of Multiple Regression Model Multiple regression function (mrf): E(Y) = α + β1X1 + β2X2. A slope in the mrf describes the effect of an explanatory variable while controlling for the effects of the other explanatory variables in the model. β1 and β2 are partial regression coefficients. R-squared, in (0, 1), is the coefficient of multiple determination: R² = (TSS − SSE)/TSS. If R² = 1, the model predicts the sample Y values perfectly (SSE = 0); if R² = 0, the explanatory variables have no linear association with Y.
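A minimal sketch of fitting the mrf by solving the normal equations (X′X)β = X′y; the tiny Gauss-Jordan solver and the data are illustrative only (the data are built exactly from y = 1 + 2·x1 + 3·x2, so the fit recovers those coefficients and R² = 1):

```python
# Fit E(Y) = a + b1*X1 + b2*X2 by solving the normal equations (X'X) beta = X'y
# with a small Gauss-Jordan elimination. Illustrative data and solver only.
def fit(rows, y):
    X = [[1.0] + list(r) for r in rows]      # prepend intercept column
    n, k = len(X), len(X[0])
    A = [[sum(X[i][p] * X[i][q] for i in range(n)) for q in range(k)]
         for p in range(k)]                  # X'X
    c = [sum(X[i][p] * y[i] for i in range(n)) for p in range(k)]   # X'y
    for p in range(k):                       # Gauss-Jordan elimination
        piv = A[p][p]
        A[p] = [v / piv for v in A[p]]
        c[p] /= piv
        for q in range(k):
            if q != p:
                f = A[q][p]
                A[q] = [vq - f * vp for vq, vp in zip(A[q], A[p])]
                c[q] -= f * c[p]
    return c                                 # [a, b1, b2]

x1 = [1, 2, 3, 4]
x2 = [0, 1, 0, 1]
y  = [3, 8, 7, 12]                           # exactly y = 1 + 2*x1 + 3*x2
a, b1, b2 = fit(list(zip(x1, x2)), y)        # recovers 1, 2, 3; here R^2 = 1
```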

  13. Inference for multiple regression coefficients Testing the collective influence of the Xi: H0: β1 = … = βk = 0. Alternative hypothesis: at least one βi ≠ 0. Test statistic (F distribution): F = (R²/k) / [(1 − R²)/(n − k − 1)], with df1 = k and df2 = n − k − 1.
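The overall F statistic needs only R², n, and k; the values below are hypothetical, chosen to make the arithmetic easy to check by hand:

```python
# Overall F test: F = (R^2 / k) / ((1 - R^2) / (n - k - 1)).
# Hypothetical values: n = 25 observations, k = 2 predictors, R^2 = 0.6.
n, k, r2 = 25, 2, 0.6
F = (r2 / k) / ((1 - r2) / (n - k - 1))   # (0.6/2) / (0.4/22) = 16.5
```

A large F (compared with the F distribution on df1 = 2, df2 = 22) gives evidence against H0 that all slopes are zero.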

  14. Model Selection Procedures Selecting explanatory variables for a model: maximum R²; backward elimination (retain only significant coefficients); forward selection (adding variables); stepwise regression (drop variables if they lose their significance as other variables are added). Exploratory vs. explanatory research.

  15. Regression Diagnostics Examine the residuals; plot residuals against the explanatory variables.

  16. Detecting Influential Observations Remove outliers. Leverage: a nonnegative statistic such that the larger its value, the greater the weight that observation receives in determining the fit. DFFIT: effect on the fit of deleting an observation; the larger its absolute value, the greater the influence of that observation on the fitted values. DFBETA: effect on the model parameter estimates of removing an observation from the dataset; the larger its absolute value, the greater the influence of the observation on the parameter estimates. Cook's distance: summarises the effect that observation i has on all the predicted values.
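For simple regression, leverage has a closed form, h_i = 1/n + (x_i − x̄)²/Sxx, which makes the idea easy to see: points far from x̄ get more weight. A sketch on the running illustrative x values:

```python
# Leverage in simple regression: h_i = 1/n + (x_i - xbar)^2 / S_xx.
# Observations far from xbar receive more weight in determining the fitted line.
x = [1, 2, 3, 4, 5]
n = len(x)
xbar = sum(x) / n
sxx = sum((xi - xbar) ** 2 for xi in x)
h = [1 / n + (xi - xbar) ** 2 / sxx for xi in x]   # [0.6, 0.3, 0.2, 0.3, 0.6]
# The leverages sum to the number of parameters (2: intercept + slope).
```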

  17. Effects of multicollinearity Multicollinearity: explanatory variables ‘overlap’ considerably, giving high R² values. Multicollinearity inflates standard errors. Variance inflation factor: the multiplicative increase in the variance (squared se) of the estimator due to xj being correlated with the other predictors: VIFj = 1/(1 − Rj²), where Rj² is from regressing xj on the other predictors.
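The VIF formula shows how quickly the inflation grows as a predictor becomes collinear with the others; the Rj² values below are hypothetical:

```python
# VIF_j = 1 / (1 - R_j^2), where R_j^2 is the R-squared from regressing
# x_j on the other predictors. Hypothetical R_j^2 values for illustration.
vifs = [1 / (1 - r2j) for r2j in (0.0, 0.5, 0.9, 0.99)]   # ≈ [1, 2, 10, 100]
```

So a predictor explained 99% by the others has its coefficient's variance inflated a hundredfold.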

