Linear Regression Hein Stigum Presentation, data and programs at: http://folk.uio.no/heins/courses H.S.
Linear regression: Concepts
Outcome and regression types
• Numerical data
  • Discrete (number of partners): Poisson regression
  • Continuous (weight): Linear regression
• Categorical data
  • Nominal (disease/no disease): Logistic regression
  • Ordinal (small/medium/large): Ordinal regression
Regression idea
Measures and Assumptions
• Adjusted effects
  • b1 is the increase in weight per day of gestational age
  • b1 is adjusted for the other covariate in the model (the one with coefficient b2)
• Assumptions
  • Independent errors
  • Linear effects
  • Constant error variance
• Robustness
  • Influence
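The adjusted-effects bullets can be written out as a model. This is a minimal sketch assuming two covariates; the symbols y (birth weight), x1 (gestational age in days), x2 (a second covariate), b0 (constant) and ε (error) are notation introduced here, not taken from the slides.

```latex
\begin{aligned}
y &= b_0 + b_1 x_1 + b_2 x_2 + \varepsilon \\
E[y \mid x_1 + 1,\, x_2] - E[y \mid x_1,\, x_2] &= b_1
\end{aligned}
```

The second line is what "adjusted" means in this sketch: b1 is the change in expected weight for one more day of gestation while x2 is held fixed.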
Workflow
• DAG
• Plots: distribution and scatter
• Bivariate analysis
• Regression
  • Model estimation
  • Test of assumptions
    • Independent errors (discuss)
    • Linear effects (plot)
    • Constant error variance (plot)
  • Robustness
    • Influence
Continuous outcome: Linear regression, Birth weight. Analysis
DAGs
[Figure: DAG with exposure E (gest age), outcome D (birth weight) and covariates C1 (sex) and C2 (parity)]
• Associations: bivariate (unadjusted)
• Causal effects: multivariable (adjusted)
Draw your assumptions before your conclusions
Plot outcome by exposure
Effects on linear regression (annotations from the plots):
• OK. Be clear on the research question:
  • overall birth weight: linear regression
  • low birth weight: logistic regression
  • linear and logistic can give opposite results
• May lead to non-constant error variance
• May have highly influential outliers
Plot outcome by exposure, cont. Linear effects? Yes
Bivariate analysis. Outcome: birth weight
Continuous outcome: Linear regression, Birth weight. Regression
Categorical covariates
• 2 categories: OK, but know the coding
• 3+ categories: use “dummies”
  • “Dummies” are 0/1 variables used to create contrasts
  • Want 3 categories for parity: 0, 1 and 2-7 children
  • Choose 0 as reference
  • Make dummies for the two other categories:
    generate Parity1 = (parity==1) if parity<.
    generate Parity2_7 = (parity>=2) if parity<.
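The dummy coding on this slide can be sketched outside Stata as well. The block below is a minimal illustration in Python (not the Stata used in the deck); the function name `parity_dummies` and the use of `None` for a missing value are assumptions made for this example.

```python
from typing import Optional, Tuple


def parity_dummies(parity: Optional[int]) -> Tuple[Optional[int], Optional[int]]:
    """Return (Parity1, Parity2_7) for one observation; parity 0 is the reference."""
    if parity is None:
        # Mirrors the `if parity<.` guard: a missing parity gives missing dummies.
        return (None, None)
    # Each dummy is 1 when the observation is in that category, otherwise 0.
    return (int(parity == 1), int(parity >= 2))


if __name__ == "__main__":
    for p in [0, 1, 2, 7, None]:
        print(p, parity_dummies(p))
```

The reference category (parity 0) is the one where both dummies are 0, so each dummy coefficient is a contrast against women with no previous children.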
Model estimation
Syntax:
    regress weight gest sex Parity1 Parity2_7
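To show what the estimation step computes, here is a minimal least-squares sketch in plain Python. It is not how Stata's regress is implemented; it solves the normal equations directly. The data rows and the "true" coefficients are made up for illustration (noise-free, so the fit should recover them), and sex is assumed to be coded 1/2.

```python
from typing import List


def solve(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ols(rows: List[List[float]], y: List[float]) -> List[float]:
    """Least-squares coefficients [constant, b1, b2, ...] from the normal equations."""
    X = [[1.0] + list(r) for r in rows]  # prepend the constant term
    k = len(X[0])
    xtx = [[sum(x[i] * x[j] for x in X) for j in range(k)] for i in range(k)]
    xty = [sum(x[i] * yi for x, yi in zip(X, y)) for i in range(k)]
    return solve(xtx, xty)


# Hypothetical data: columns are gest, sex, Parity1, Parity2_7.
rows = [
    [270, 1, 0, 0], [280, 2, 0, 0], [290, 1, 1, 0], [275, 2, 1, 0],
    [285, 1, 0, 1], [265, 2, 0, 1], [295, 2, 0, 0], [280, 1, 1, 0],
]
# Made-up coefficients: constant, gest, sex, Parity1, Parity2_7.
true_coefs = [-1500.0, 18.0, -120.0, 90.0, 140.0]
weight = [true_coefs[0] + sum(b * v for b, v in zip(true_coefs[1:], r)) for r in rows]

estimates = ols(rows, weight)

if __name__ == "__main__":
    for name, est in zip(["_cons", "gest", "sex", "Parity1", "Parity2_7"], estimates):
        print(f"{name:10s} {est:10.3f}")
```

With real data the outcome contains noise, so the estimates would differ from any underlying values and would come with standard errors, which this sketch does not compute.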
Create meaningful constant
Expected birth weight at:
• gest=0, sex=0, parity=0 (what the constant estimates as coded; not a meaningful point)
• gest=280, sex=1, parity=0 (a meaningful point)
Alternative: center variables
    gen gest280 = gest - 280
gest280 has a meaningful zero at 280 days
    gen sex0 = sex - 1
sex0 has a meaningful zero at boys
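Why centering works can be seen by substituting the centered variables into the model. This is a sketch using coefficient names b0 to b4 that are introduced here as notation (they do not appear on the slide); the variable names are the ones from the deck.

```latex
\begin{aligned}
\text{weight} &= b_0 + b_1\,\text{gest} + b_2\,\text{sex} + b_3\,\text{Parity1} + b_4\,\text{Parity2\_7} \\
 &= b_0 + b_1(\text{gest280} + 280) + b_2(\text{sex0} + 1) + b_3\,\text{Parity1} + b_4\,\text{Parity2\_7} \\
 &= \underbrace{(b_0 + 280\,b_1 + b_2)}_{\text{new constant}} + b_1\,\text{gest280} + b_2\,\text{sex0} + b_3\,\text{Parity1} + b_4\,\text{Parity2\_7}
\end{aligned}
```

The slopes are unchanged; only the constant moves, and it now equals the expected birth weight at gest=280, sex=1, parity=0.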
Model results
Test of assumptions
• Discuss
  • Independent residuals?
• Plot residuals versus predicted y
  • Linear effects?
  • Constant variance?
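The quantities behind a residual-versus-predicted plot can be sketched as follows. This is a minimal Python illustration with a single covariate and made-up data; the function names are hypothetical, and the plotting itself is left out. Curvature in these pairs would suggest non-linear effects, and a fan shape would suggest non-constant variance.

```python
from typing import List, Tuple


def fit_line(x: List[float], y: List[float]) -> Tuple[float, float]:
    """Return (constant, slope) of the least-squares line of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def residuals_vs_predicted(x: List[float], y: List[float]) -> List[Tuple[float, float]]:
    """Return (predicted, residual) pairs, the points of the diagnostic plot."""
    b0, b1 = fit_line(x, y)
    predicted = [b0 + b1 * xi for xi in x]
    return [(p, yi - p) for p, yi in zip(predicted, y)]


# Made-up data: gestational age (days) and birth weight (grams).
gest = [260.0, 270.0, 280.0, 290.0, 300.0]
weight = [3000.0, 3300.0, 3500.0, 3600.0, 3900.0]
pairs = residuals_vs_predicted(gest, weight)

if __name__ == "__main__":
    for predicted, residual in pairs:
        print(f"predicted={predicted:8.1f}  residual={residual:7.1f}")
```

By construction, least-squares residuals sum to zero and are uncorrelated with the predicted values, so the plot is read for patterns (curves, funnels), not for an overall trend.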
Violations of assumptions
• Dependent residuals: use linear mixed models
• Non-linear effects: add a square term, or use piecewise linear
• Non-constant variance: use robust variance estimation
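The two remedies for non-linear effects amount to adding constructed covariates to the model. The Python sketch below shows one way to build them; the knot at 280 days and the centering of the square term at 280 are assumptions chosen for illustration, and the function names are hypothetical. Mixed models and robust variance estimation are not sketched here.

```python
from typing import Tuple


def square_term(gest: float, center: float = 280.0) -> float:
    """Squared gestational age, centered first (assumed center: 280 days)."""
    return (gest - center) ** 2


def piecewise_terms(gest: float, knot: float = 280.0) -> Tuple[float, float]:
    """Return (gest, extra): extra is 0 up to the knot and gest - knot beyond it."""
    return (gest, max(0.0, gest - knot))


if __name__ == "__main__":
    for g in [260.0, 280.0, 290.0]:
        print(g, square_term(g), piecewise_terms(g))
```

In a model containing both piecewise terms, the coefficient of `gest` is the slope up to the knot, and the coefficient of the extra term is the change in slope after the knot.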
Influence
Measures of influence
• Measure change in:
  • Predicted outcome
  • Deviance
  • Coefficients (beta): delta beta
• Remove obs 1, see change; remove obs 2, see change; and so on
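The "remove one observation, see the change" idea can be sketched directly. The Python block below refits a one-covariate regression with each observation left out and reports the raw change in the slope. The data are made up, with the last observation placed as an outlier; statistical packages often report a standardized version of this quantity, which is not what is computed here.

```python
from typing import List


def slope(x: List[float], y: List[float]) -> float:
    """Least-squares slope of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy / sxx


def delta_beta(x: List[float], y: List[float]) -> List[float]:
    """For each observation: slope without it minus slope with all data."""
    full = slope(x, y)
    return [
        slope(x[:i] + x[i + 1:], y[:i] + y[i + 1:]) - full
        for i in range(len(x))
    ]


# Made-up data: the first five points lie on one line, the sixth is an outlier.
gest = [260.0, 270.0, 280.0, 290.0, 300.0, 200.0]
weight = [3040.0, 3250.0, 3460.0, 3670.0, 3880.0, 4000.0]
changes = delta_beta(gest, weight)

if __name__ == "__main__":
    for i, d in enumerate(changes, start=1):
        print(f"obs {i}: delta beta = {d:8.2f}")
```

The observation with the largest absolute delta beta is the most influential one for that coefficient; in this made-up example it is the outlier.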
Delta beta for gestational age: if observation no. 539 is removed, beta will change from 6 to 16
Removing outlier
[Tables: Full data | Outlier removed]
One outlier affected two estimates
Final model
Summing up
• DAGs
  • Guide analysis
• Plots
  • Unequal variance, non-linearity, outliers
• Bivariate analysis
• Linear regression
  • Fit model
  • Check assumptions
  • Check robustness
  • Make meaningful constant