Learn about simple linear regression, including assumptions and statistical solutions, as well as multiple linear regression with matrix form modeling. Explore SPSS examples, interpretation of results, including regression coefficients, interactions, and the use of dummy variables. Improve your understanding of regression analysis techniques.
Introduction • Simple regression = relationship between two variables • Multiple regression = relationship among more than two variables • Dependent variable (outcome, explained variable) • Independent variable (predictor, explanatory variable) • Dependent variable: cardinal (scale) • Independent variable(s): cardinal or binary (but see dummy variables later)
Introduction • Main goal: explain the dependent variable by the independent variable(s) • Assumption: the relationship between the dependent variable and the independent variable(s) is linear (can be described by a line – example) • Statistical solution: find the equation for the relationship and describe it
Simple lin. regression • Cardinal dependent variable and one cardinal independent variable • Assumption: the relationship between the dependent variable and the independent variable is linear • Example in SPSS (chart and fit line): Graphs – Chart Builder – Scatter/Dot (Add Fit Line at Total) • Reco: always plot the data in a chart before any regression computation
Some details • How to fit the "best" line? • What is the meaning of the regression equation? • Is my regression good enough? • How can I improve my regression?
Ordinary least squares (OLS) • How to find the "best" line? Minimize the sum of squared residuals
Ordinary least squares (OLS) • Residual = difference between the real value of the dep. var. and the estimate from the reg. line • b0 = intercept (intersection of the regression line and the Y axis; value of the dep. var. when the independent variable is zero) • b1 = regression coeff./slope/gradient (average increase/decrease in the dep. var. for a unit change in the indep. var.)
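For reference, the OLS criterion and the resulting estimates in simple regression can be written as follows (standard textbook formulas, not shown on the scraped slide):

\min_{b_0,\,b_1}\ \sum_{i=1}^{n} \left( y_i - b_0 - b_1 x_i \right)^2,
\qquad
b_1 = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sum_i (x_i - \bar{x})^2},
\qquad
b_0 = \bar{y} - b_1 \bar{x}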
SR results • R – correlation between the real values of the dep. var. and the estimates from the reg. line • R2 – square of R; Reco: multiply by 100 and interpret in % as the percentage of explained variance (measures the strength of the relationship) • Slope and intercept • T-test (does the relationship hold in the population, i.e. can it be generalized to the population?) • Example in SPSS
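As a reminder (standard definition, not from the original slides), R2 can be computed from the residual and total sums of squares:

R^2 = 1 - \frac{\sum_i (y_i - \hat{y}_i)^2}{\sum_i (y_i - \bar{y})^2}

For example, R2 = 0.42 means that 42 % of the variance of the dependent variable is explained by the model.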
Multiple regression – Data matrix (indep. variables X)
X1    X2    X3    X4    ...
YES   204   M     1.2
NO    180   F     4.3
NO    178   F     2.3
NO    187   M     3.8
YES   192   M     2.6
...

Multiple regression – Vector y (dependent variable)
Y
135
112
135
187
189
...

Multiple regression – Vector β (regression coeffs.)
β0 – intercept
β1 – reg. coeff. for the 1st variable
β2 – reg. coeff. for the 2nd variable
β3 – reg. coeff. for the 3rd variable
...
Multiple linear regression model • Model for the population: y = β0 + β1x1 + β2x2 + ... + βpxp + ε • Regression equation for the population: E(y) = β0 + β1x1 + β2x2 + ... + βpxp • Estimate of the reg. eq. (based on a sample): ŷ = b0 + b1x1 + b2x2 + ... + bpxp
Model in matrix form • Model: y = Xβ + ε (y: n×1 vector of the dependent variable, X: n×(p+1) data matrix including a column of ones for the intercept, β: (p+1)×1 vector of regression coefficients, ε: n×1 vector of errors)
Estimates by OLS • What does it mean? Excursus into vector/matrix algebra: https://www.mathsisfun.com/algebra/matrix-multiplying.html and https://www.khanacademy.org/math/precalculus/precalc-matrices/multiplying-matrices-by-matrices/v/multiplying-a-matrix-by-a-matrix
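In matrix form the standard OLS solution is b = (XᵀX)⁻¹Xᵀy (a textbook result, not shown explicitly on the scraped slide). A minimal sketch in Python/NumPy, using the numeric columns of the example data matrix above as made-up illustration data:

import numpy as np

# Made-up example data: 5 observations, 2 numeric independent variables
X_raw = np.array([[204, 1.2],
                  [180, 4.3],
                  [178, 2.3],
                  [187, 3.8],
                  [192, 2.6]])
y = np.array([135, 112, 135, 187, 189], dtype=float)

# Add a column of ones so the model includes the intercept b0
X = np.column_stack([np.ones(len(y)), X_raw])

# OLS estimate in matrix form: b = (X'X)^(-1) X'y
b = np.linalg.inv(X.T @ X) @ X.T @ y
print("intercept and slopes:", b)

# Fitted values and residuals
y_hat = X @ b
residuals = y - y_hat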
Multiple regression in SPSS • Result in SPSS: regression equation of a line, plane or hyperplane, statistical tests for the model and the coeffs., and regression diagnostics (see next week) • Menu in SPSS – basic options
Syntax in SPSS • Syntax for stepwise selection and selected outputs:
REGRESSION
  /MISSING LISTWISE
  /STATISTICS COEFF OUTS R ANOVA
  /CRITERIA=PIN(.05) POUT(.10)
  /NOORIGIN
  /DEPENDENT Y
  /METHOD=STEPWISE X1 X2 X3.
SPSS outputs for regression • Example of multiple regression in SPSS • Interpretation of results: ANOVA table, T-tests, R, R2, R2 Adj. • Regression coeffs.: interpretation (ceteris paribus principle) • Beta coeffs.: comparison of the individual impact of variables (regression coeffs. for standardized data) • What is standardization?
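Standardization (z-scores) and its link to the beta coefficients, as a reminder (standard formulas, not from the original slides):

z = \frac{x - \bar{x}}{s_x},
\qquad
\beta_j = b_j \cdot \frac{s_{x_j}}{s_y}

A standardized variable has mean 0 and standard deviation 1, so beta coefficients are comparable across independent variables measured in different units.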
Regression in SPSS • Selection of variables – forward, backward, stepwise (principles) • Stepwise is very often used but not recommended by statisticians (why?) • Predicted outcomes from the regression • Residuals: meaning and saving
Dummy variables • Possibility to use nominal and/or ordinal variables as independent variables in a regression model via a set of dummy variables • Basic rule – number of dummy variables = number of categories − 1 • The "omitted" category is called the reference category – all other categories are compared to it (example in SPSS; see the sketch below) • Reco: use the category with the lowest level of the dependent variable as the reference (all coeffs. will be positive and the interpretation will be simple)
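A minimal sketch of dummy coding in Python/pandas (a hypothetical "education" variable with three categories; drop_first=True omits the first category, which becomes the reference – this illustrates the categories-minus-one rule, it is not the original SPSS example):

import pandas as pd

# Hypothetical nominal variable with 3 categories
df = pd.DataFrame({"education": ["primary", "secondary", "tertiary",
                                 "secondary", "primary"]})

# 3 categories -> 3 - 1 = 2 dummy variables; "primary" becomes the reference
dummies = pd.get_dummies(df["education"], prefix="edu", drop_first=True)
print(dummies)
# Each row shows 0/1 membership in "edu_secondary" and "edu_tertiary";
# a row of zeros means the observation belongs to the reference category ("primary")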
Dummy – more info • The principle of dummy variables can be applied in other techniques (e.g. logistic regression) • Omitting the last category is applied in special statistical techniques (loglinear models, logit models) • Some procedures in SPSS will create dummies by default (e.g. procedures for logistic regression)
Interactions • Combine two or more variables into one new variable • Necessary to prepare it in the data • Why use interactions? A) joint effect of two variables (synergy) B) to allow different relationships in different groups • Example – two-way interaction (two variables), one cardinal and one binary (see the sketch below) • Interactions by picture and practical application in SPSS
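A minimal sketch of a two-way interaction between a cardinal variable x and a binary variable d (illustrative model with made-up data, not the original SPSS example): in the model y = b0 + b1·x + b2·d + b3·(x·d), the slope of x is b1 for the d = 0 group and b1 + b3 for the d = 1 group.

import numpy as np

# Hypothetical data: cardinal x, binary d (0/1), dependent y
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 1.5, 2.5, 3.5, 4.5, 5.5])
d = np.array([0,   0,   0,   0,   0,   1,   1,   1,   1,   1  ])
y = np.array([2.1, 3.0, 4.2, 4.9, 6.1, 3.0, 5.1, 7.2, 8.8, 11.0])

# The interaction term is simply the product of the two variables
xd = x * d

# Design matrix: intercept, x, d, x*d
X = np.column_stack([np.ones(len(y)), x, d, xd])
b, *_ = np.linalg.lstsq(X, y, rcond=None)

# b[1] = slope of x for the d = 0 group; b[1] + b[3] = slope for the d = 1 group
print("b0, b1, b2, b3:", b)
print("slope when d=0:", b[1], "slope when d=1:", b[1] + b[3])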