Modeling the association between a binary outcome, Y, and an “exposure”, X. Slides are from Research Professor M. Thompson. We might want to model pX = P(Y=1|X). What are the characteristics of pX? 0 ≤ pX ≤ 1; pX possibly monotone in X.
Modeling the association between a binary outcome, Y, and an “exposure”, X. Slides are from Research Professor M. Thompson. BIOST 536 Thompson
We might want to model pX = P(Y=1|X)
What are the characteristics of pX?
• 0 ≤ pX ≤ 1
• pX possibly monotone in X
[Figure: logit and probit transforms of p. Vertical axis: transform of p (-5 to 5). Horizontal axis: probability (0 to 1).]
Model: g(pX) = β0 + β1 X
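The two link functions named on this slide can be sketched in a few lines. This is an illustration only: the helper names `logit`, `expit` and `probit` are chosen here and do not come from the slides, and the probit uses the standard normal quantile function from Python's standard library.

```python
import math
from statistics import NormalDist

def logit(p):
    """Log odds: maps a probability in (0, 1) to the whole real line."""
    return math.log(p / (1.0 - p))

def expit(eta):
    """Inverse logit: maps a linear predictor back to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))

def probit(p):
    """Standard normal quantile of p (the probit transform)."""
    return NormalDist().inv_cdf(p)

# Both transforms are monotone in p and equal 0 at p = 0.5.
for p in (0.1, 0.3, 0.5, 0.7, 0.9):
    print(f"p={p:.1f}  logit={logit(p):+.3f}  probit={probit(p):+.3f}")
```

Either transform keeps the fitted pX inside (0, 1) whatever the value of β0 + β1 X, which is the reason for modelling g(pX) rather than pX directly.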
Logistic regression with a single binary risk factor
Cohort or cross-sectional study
With the 2x2 table of counts (Table A):
          X=1    X=0
Y=1        a      b
Y=0        c      d
Total      m1     m0
a/m1 estimates P(Y=1 | X=1)
b/m0 estimates P(Y=1 | X=0)
ad/(bc) estimates the odds ratio
Under the logistic model: logit(P(Y=1|X)) = β0 + β1 X
ln(OR) = ln(Ψ) = logit(P(Y=1|X=1)) - logit(P(Y=1|X=0)) = β0 + β1 - β0 = β1
i.e. Ψ = exp(β1)
And: logit(P(Y=1|X=0)) = β0
P(Y=1|X=0) is estimated by exp(β̂0) / (1 + exp(β̂0))
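The identity Ψ = exp(β1) can be checked numerically. In this sketch the coefficient values are hypothetical, picked only for illustration, and the helper `expit` is a name introduced here.

```python
import math

def expit(eta):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-eta))

# Hypothetical coefficients, for illustration only.
beta0, beta1 = -1.0, 0.7

p1 = expit(beta0 + beta1)   # P(Y=1 | X=1) under the model
p0 = expit(beta0)           # P(Y=1 | X=0) under the model

# Odds ratio computed from the two probabilities.
odds_ratio = (p1 / (1 - p1)) / (p0 / (1 - p0))

print(odds_ratio, math.exp(beta1))
```

The two printed numbers agree up to floating-point rounding, whatever value β0 takes.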
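The logistic equations:
P(Y=1 | X) = exp(β0 + β1 X) / (1 + exp(β0 + β1 X))
For binary X:
P(Y=1 | X=1) = exp(β0 + β1) / (1 + exp(β0 + β1))
P(Y=1 | X=0) = exp(β0) / (1 + exp(β0))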
Case-control study
Let Z = 1 if the individual was sampled, = 0 otherwise
Define π1 = P(Z=1 | Y=1); π0 = P(Z=1 | Y=0)
Let pZ(X) = P(Y=1 | X, Z=1)
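We can model logit(pZ(X)). Assuming the chance of being sampled depends only on disease status Y and not on X, Bayes' rule gives:
pZ(X) = π1 p(X) / [π1 p(X) + π0 (1 - p(X))], where p(X) = P(Y=1 | X)
logit(pZ(X)) = ln(π1/π0) + logit(p(X))
If we model logit(pZ(X)) = α + β1 X
Then ln(Ψ) = β1 or Ψ = exp(β1) as before.
But: α = β0 + ln(π1/π0), so β0 (and hence P(Y=1 | X)) cannot be estimated unless the sampling fractions π1 and π0 are known.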
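A small numerical check of the case-control argument, assuming that sampling depends only on Y. The population coefficients and the sampling fractions below are hypothetical values chosen for illustration. The sketch shows the slope of logit(pZ(X)) staying at β1 while the intercept shifts by ln(π1/π0).

```python
import math

def logit(p):
    return math.log(p / (1.0 - p))

def expit(eta):
    return 1.0 / (1.0 + math.exp(-eta))

# Hypothetical population model and sampling fractions.
beta0, beta1 = -3.0, 1.0
pi1, pi0 = 1.0, 0.05      # all cases sampled, 5% of controls sampled

def p_sampled(x):
    """P(Y=1 | X=x, Z=1) by Bayes' rule, with sampling depending only on Y."""
    p = expit(beta0 + beta1 * x)
    return pi1 * p / (pi1 * p + pi0 * (1.0 - p))

alpha = logit(p_sampled(0))                       # intercept in the sampled data
slope = logit(p_sampled(1)) - logit(p_sampled(0))  # log odds ratio in the sampled data

print("slope:", slope, "vs beta1:", beta1)
print("alpha:", alpha, "vs beta0 + ln(pi1/pi0):", beta0 + math.log(pi1 / pi0))
```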
Parameter estimation: Maximum Likelihood
We choose the estimate of the parameters that makes the data most likely to have occurred.
Let's take the simple setting of a cross-sectional study where we want to estimate the prevalence of a disease. Say we take a random sample of N individuals and w of them have the disease. The common sense estimate of the prevalence of disease is: p̂ = w/N
The likelihood
Let w = number diseased in N independent individuals and let the true disease prevalence in the population be p. Then the likelihood of observing w diseased individuals in N is given by:
L(p) = C(N,w) p^w (1 - p)^(N-w), where C(N,w) = N! / (w! (N-w)!) is the binomial coefficient
We want to choose that value of p which maximizes the likelihood or, equivalently, the log of the likelihood:
l(p) = ln C(N,w) + w ln(p) + (N - w) ln(1 - p), where C(N,w) is the binomial coefficient
Taking the derivative of l with respect to p:
dl/dp = w/p - (N - w)/(1 - p)
Setting the derivative equal to zero and solving for p:
p̂ = w/N
In a study involving 53 men with prostate cancer, 20 of the men had nodal involvement. How to estimate the chance of nodal involvement?
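The maximum likelihood answer to this question can be sketched numerically with the counts on this slide (20 of 53). The grid search below is an illustrative stand-in for the calculus, not how any statistics package fits the model. The function name `log_likelihood` is introduced here.

```python
import math

N, w = 53, 20   # men in the study, men with nodal involvement

def log_likelihood(p):
    """Binomial log likelihood, dropping the constant ln C(N, w)."""
    return w * math.log(p) + (N - w) * math.log(1.0 - p)

# Crude grid search over p = 0.001, 0.002, ..., 0.999.
grid = [i / 1000.0 for i in range(1, 1000)]
p_grid = max(grid, key=log_likelihood)

p_closed_form = w / N   # the "common sense" estimate

print("grid maximiser:", p_grid)
print("w / N         :", round(p_closed_form, 4))
```

The grid maximiser lands on the grid point next to w/N, so the estimated chance of nodal involvement is about 38%.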
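Using MLE in the logistic regression setting with a single covariate, X:
Say we have N observations (Yi, Xi), i=1,2,…,N, where Y denotes disease status (0 = non-diseased, 1 = diseased) and X is a risk factor of interest.
Let p(X) denote P(Y=1 | X). Then the i'th observation contributes p(Xi)^Yi (1 - p(Xi))^(1-Yi) to the likelihood, and for independent observations:
L = Π(i=1..N) p(Xi)^Yi (1 - p(Xi))^(1-Yi)
l = ln(L) = Σ(i=1..N) [Yi ln p(Xi) + (1 - Yi) ln(1 - p(Xi))]
Alternative (Binomial) formulation: If X takes on n different values, Xj, j=1,2,…,n, and, for each Xj, there are nj subjects, where Σ nj = N, of whom yj are “diseased”, we can represent the log likelihood (up to an additive constant that does not involve the parameters) as
l = Σ(j=1..n) [yj ln p(Xj) + (nj - yj) ln(1 - p(Xj))]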
If we model logit(p(X)) = β0 + β1 X then, for a single dichotomous risk factor, X, as in Table A, the maximum likelihood estimate of β0 is ln(b/d) and of β1 is ln(ad/bc), and hence the maximum likelihood estimate of P(Y=1 | X=1) is a/m1 and of P(Y=1 | X=0) is b/m0.
Hypothesis testing and confidence intervals
Say we want to establish whether tumor size affects the chance of nodal involvement in men with prostate cancer.

      Nodal |        Tumor
involvement |     large      small |     Total
------------+----------------------+----------
        Yes |        15          5 |        20
            |       56%        19% |       38%
------------+----------------------+----------
         No |        12         21 |        33
            |       44%        81% |       62%
------------+----------------------+----------
      Total |        27         26 |        53
Consider logit(P(nodal involvement | tumor size=X)) = β0 + β1 X
The maximum likelihood estimate of β1 is ln(15×21/(5×12)) = ln(5.25) = 1.66
Hence the OR is estimated by e^1.66 = 5.25 (= 15×21/(5×12))
How do we test the statistical significance of the OR? Calculate a confidence interval?
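The closed-form estimates for a single dichotomous risk factor can be reproduced from the four cell counts of the tumor table. The labels a, b, c, d follow the usage elsewhere in these slides (a and b are the diseased counts for X=1 and X=0). The variable names are otherwise chosen here for illustration.

```python
import math

# Cell counts from the tumor size by nodal involvement table.
a, b = 15, 5     # nodal involvement: large tumor (X=1), small tumor (X=0)
c, d = 12, 21    # no involvement:    large tumor (X=1), small tumor (X=0)

beta0_hat = math.log(b / d)               # log odds of involvement when X=0
beta1_hat = math.log((a * d) / (b * c))   # log odds ratio
odds_ratio = math.exp(beta1_hat)

def expit(eta):
    return 1.0 / (1.0 + math.exp(-eta))

p1_hat = expit(beta0_hat + beta1_hat)     # should equal a / (a + c)
p0_hat = expit(beta0_hat)                 # should equal b / (b + d)

print(round(beta0_hat, 4), round(beta1_hat, 4), round(odds_ratio, 2))
print(round(p1_hat, 3), round(p0_hat, 3))
```

The fitted probabilities reproduce the column percentages of the table (56% and 19%), as expected when the model has as many parameters as there are exposure groups.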
H0: β1 = 0 <=> H0: OR = Ψ = 1
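The deviance compares observed to predicted values via the likelihood:
D = -2 ln[(likelihood of the fitted model) / (likelihood of the saturated model)]
where the saturated model has one parameter per observation and so fits the data exactly.
To assess the role of X in the logistic model: logit(P(Y=1|X)) = β0 + β1 X
we can consider G = D(model without X) - D(model with X) = -2 ln[(likelihood without X) / (likelihood with X)]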
Let Y = nodal involvement in prostate cancer, X = tumor size
We estimate: logit(P(Y=1|X)) = -1.44 + 1.66 X, and OR = Ψ = 5.25
ln L = -31.276
Under the null model, logit(P(Y=1)) = constant, ln L = -35.126
Under the hypothesis H0: β1 = 0, G has a χ² distribution with 1 degree of freedom
Here G = -2 × (-35.126 + 31.276) = 7.7
LR test: P(χ²(1) > 7.7) = .0055
Score test: P(χ²(1) > 7.44) = .0064
Wald test: P(χ²(1) > 6.92) = .0085 (displayed by Stata as 0.009)
STATA gives the LR test for the fitted model versus the null model
STATA does not do the Score test easily
STATA gives the single parameter Wald test
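The likelihood ratio statistic and the three p-values quoted on this slide can be recomputed with the standard library. The sketch relies on the fact that a χ² variable with 1 degree of freedom is the square of a standard normal, so its upper tail is erfc(sqrt(x/2)). The function name `chi2_1df_upper_tail` is introduced here.

```python
import math

def chi2_1df_upper_tail(x):
    """P(chi-square with 1 df > x), via the standard normal tail."""
    return math.erfc(math.sqrt(x / 2.0))

lnL_null = -35.126    # log likelihood, constant-only model
lnL_model = -31.276   # log likelihood, model with tumor size

G = -2.0 * (lnL_null - lnL_model)

p_lr = chi2_1df_upper_tail(G)
p_score = chi2_1df_upper_tail(7.44)   # score statistic as quoted on the slide
p_wald = chi2_1df_upper_tail(6.92)    # Wald statistic as quoted on the slide

print(f"G = {G:.2f}")
print(f"LR p = {p_lr:.4f}, score p = {p_score:.4f}, Wald p = {p_wald:.4f}")
```

The three tests agree closely here. With only 53 observations they need not, since all three rely on large-sample approximations.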
Stata code
. logistic node tumor
Logistic regression                               Number of obs   =         53
                                                  LR chi2(1)      =       7.70
                                                  Prob > chi2     =     0.0055
Log likelihood = -31.276312                       Pseudo R2       =     0.1096
------------------------------------------------------------------------------
        node | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       tumor |       5.25   3.310487     2.63   0.009      1.52552    18.06761
------------------------------------------------------------------------------
. logit
------------------------------------------------------------------------------
        node |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       tumor |   1.658228    .630569     2.63   0.009     .4223355    2.894121
       _cons |  -1.435085   .4976116    -2.88   0.004    -2.410385   -.4597837
------------------------------------------------------------------------------
Pseudo R2 = 1 - lm/l0, where lm and l0 are the log likelihoods of the fitted model and the null model (here 1 - 31.276/35.126 = 0.1096)
The information matrix
Maximum likelihood theory states that the variance estimators for estimates obtained from MLE can be derived from the matrix of second partial derivatives of the log likelihood. Minus this matrix is called the information matrix, I, and the estimated variances and covariances of the parameter estimates are obtained from the inverse of the information matrix.
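Let X be the design matrix whose i'th row is (1, Xi), let β = (β0, β1)' and let V = diag(p̂i (1 - p̂i)), the N×N diagonal matrix built from the fitted values p̂i of P(Yi=1 | Xi)
Then I = X' V X and it can be shown that, approximately, β̂ ~ N(β, I^-1), and so an approximate 95% CI for, e.g., β1 is given by: β̂1 ± 1.96 SE(β̂1), where SE(β̂1) is the square root of the corresponding diagonal element of I^-1, and hence a 95% CI for the OR is obtained by exponentiation of the CI for β1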
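For the tumor size example the information matrix is only 2×2, so it can be built and inverted by hand. The sketch below assumes the design matrix has rows (1, Xi) with X = 1 for a large tumor, and uses the cell counts from the table earlier in these slides. The variable names are illustrative.

```python
import math
from statistics import NormalDist

a, b, c, d = 15, 5, 12, 21          # cell counts from the tumor table
n1, n0 = a + c, b + d               # group sizes for X=1 and X=0
p1, p0 = a / n1, b / n0             # fitted probabilities (equal to observed)

# Sum of p(1-p) over subjects in each group.
w1 = n1 * p1 * (1 - p1)
w0 = n0 * p0 * (1 - p0)

# I = X'VX for rows (1, x): entries are sums of p(1-p), x*p(1-p), x^2*p(1-p).
info = [[w0 + w1, w1],
        [w1,      w1]]

# Invert the 2x2 matrix to get the estimated covariance matrix.
det = info[0][0] * info[1][1] - info[0][1] * info[1][0]
cov = [[ info[1][1] / det, -info[0][1] / det],
       [-info[1][0] / det,  info[0][0] / det]]

se_b0 = math.sqrt(cov[0][0])
se_b1 = math.sqrt(cov[1][1])

beta1_hat = math.log((a * d) / (b * c))
z = NormalDist().inv_cdf(0.975)      # about 1.96
ci_beta1 = (beta1_hat - z * se_b1, beta1_hat + z * se_b1)
ci_or = (math.exp(ci_beta1[0]), math.exp(ci_beta1[1]))

print("SE(beta0), SE(beta1):", round(se_b0, 4), round(se_b1, 4))
print("95% CI for beta1:", [round(v, 3) for v in ci_beta1])
print("95% CI for OR   :", [round(v, 2) for v in ci_or])
```

The standard error of β̂1 simplifies to sqrt(1/a + 1/b + 1/c + 1/d), the familiar formula for the log odds ratio in a 2×2 table, and the interval matches the one in the Stata output for this model.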
Interpretation of coefficients
Dichotomous X (coded 0 or 1)
Here OR = exp(β1), or ln(OR) = β1
Interpretation of β0 depends on study design.
Polytomous X
Polytomous X with k categories
We define X1, X2, …, Xk-1 dummy 0-1 design variables and consider the model:
logit(P(Y=1 | X)) = β0 + β1 X1 + β2 X2 + … + βk-1 Xk-1
exp(βj) is the odds ratio for the j'th category of X relative to the baseline category.
Stata code:
. input chd smoke count
  1 3 39
  1 2 50
  1 1 70
  1 0 98
  0 3 253
  0 2 355
  0 1 735
  0 0 1554
. end
. xi: logit chd i.smoke [fweight = count]
i.smoke           _Ismoke_0-3         (naturally coded; _Ismoke_0 omitted)

Iteration 0:   log likelihood = -890.62187
Iteration 1:   log likelihood = -876.52013
Iteration 2:   log likelihood = -875.84853
Iteration 3:   log likelihood = -875.84738

Logistic regression                               Number of obs   =       3154
                                                  LR chi2(3)      =      29.55
                                                  Prob > chi2     =     0.0000
Log likelihood = -875.84738                       Pseudo R2       =     0.0166
------------------------------------------------------------------------------
         chd |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
   _Ismoke_1 |   .4122448   .1627693     2.53   0.011     .0932229    .7312667
   _Ismoke_2 |   .8035253   .1834786     4.38   0.000     .4439138    1.163137
   _Ismoke_3 |   .8937922   .2010989     4.44   0.000     .4996455    1.287939
       _cons |   -2.76362   .1041517   -26.53   0.000    -2.967754   -2.559486
------------------------------------------------------------------------------
. xi: logistic chd i.smoke [fweight=count]
i.smoke           _Ismoke_0-3         (naturally coded; _Ismoke_0 omitted)

Logistic regression                               Number of obs   =       3154
                                                  LR chi2(3)      =      29.55
                                                  Prob > chi2     =     0.0000
Log likelihood = -875.84738                       Pseudo R2       =     0.0166
------------------------------------------------------------------------------
         chd | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
   _Ismoke_1 |   1.510204   .2458148     2.53   0.011     1.097706    2.077711
   _Ismoke_2 |     2.2334   .4097812     4.38   0.000     1.558796    3.199955
   _Ismoke_3 |   2.444382   .4915626     4.44   0.000     1.648137    3.625307
------------------------------------------------------------------------------
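Because the model has one parameter per smoking category, its maximum likelihood estimates coincide with the empirical log odds, so the odds ratios in the output can be recovered directly from the counts entered with `input`. The dictionary layout and variable names below are chosen for illustration.

```python
import math

# counts[smoke] = (number with chd = 1, number with chd = 0)
counts = {0: (98, 1554), 1: (70, 735), 2: (50, 355), 3: (39, 253)}

def odds(category):
    cases, noncases = counts[category]
    return cases / noncases

baseline_odds = odds(0)
intercept = math.log(baseline_odds)     # corresponds to _cons

# Odds ratio of each category relative to the baseline category (smoke = 0).
odds_ratios = {j: odds(j) / baseline_odds for j in (1, 2, 3)}
coefficients = {j: math.log(v) for j, v in odds_ratios.items()}

print("_cons:", round(intercept, 5))
for j in (1, 2, 3):
    print(f"smoke={j}: OR = {odds_ratios[j]:.6f}, coef = {coefficients[j]:.7f}")
```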
. expand count
(3146 observations created)

. xi: logit chd i.smoke
Logistic regression                               Number of obs   =       3154
                                                  LR chi2(3)      =      29.55
                                                  Prob > chi2     =     0.0000
Log likelihood = -875.84738                       Pseudo R2       =     0.0166
------------------------------------------------------------------------------
         chd |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
   _Ismoke_1 |   .4122448   .1627693     2.53   0.011     .0932229    .7312667
   _Ismoke_2 |   .8035253   .1834786     4.38   0.000     .4439138    1.163137
   _Ismoke_3 |   .8937922   .2010989     4.44   0.000     .4996455    1.287939
       _cons |   -2.76362   .1041517   -26.53   0.000    -2.967754   -2.559486
------------------------------------------------------------------------------

. xi: logistic chd i.smoke
------------------------------------------------------------------------------
         chd | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
   _Ismoke_1 |   1.510204   .2458148     2.53   0.011     1.097706    2.077711
   _Ismoke_2 |     2.2334   .4097812     4.38   0.000     1.558796    3.199955
   _Ismoke_3 |   2.444382   .4915626     4.44   0.000     1.648137    3.625307
------------------------------------------------------------------------------
. lincom _Ismoke_2 - _Ismoke_1, or
 ( 1)  - _Ismoke_1 + _Ismoke_2 = 0
------------------------------------------------------------------------------
         chd | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         (1) |   1.478873   .2900367     2.00   0.046     1.006916    2.172044
------------------------------------------------------------------------------

. lincom _Ismoke_3 - _Ismoke_2, or
 ( 1)  - _Ismoke_2 + _Ismoke_3 = 0
------------------------------------------------------------------------------
         chd | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         (1) |   1.094466   .2505588     0.39   0.693      .698771    1.714234
------------------------------------------------------------------------------

. lincom _Ismoke_3 - _Ismoke_1, or
 ( 1)  - _Ismoke_1 + _Ismoke_3 = 0
------------------------------------------------------------------------------
         chd | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         (1) |   1.618577   .3442644     2.26   0.024     1.066809    2.455728
------------------------------------------------------------------------------
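The first `lincom` contrast (category 2 versus category 1) can be reproduced by hand. This sketch works from the cell counts rather than from Stata's covariance matrix, which is valid here only because the model has one parameter per category. The standard error of the odds ratio uses the delta method (OR times the standard error of the log odds ratio). The names are illustrative.

```python
import math
from statistics import NormalDist

# (chd = 1, chd = 0) counts for smoking categories 1 and 2.
cases1, noncases1 = 70, 735
cases2, noncases2 = 50, 355

# Contrast beta2 - beta1 is the log odds ratio of category 2 versus category 1.
log_or = math.log((cases2 / noncases2) / (cases1 / noncases1))
se_log_or = math.sqrt(1/cases1 + 1/noncases1 + 1/cases2 + 1/noncases2)

odds_ratio = math.exp(log_or)
se_or = odds_ratio * se_log_or             # delta-method standard error

z_crit = NormalDist().inv_cdf(0.975)
ci = (math.exp(log_or - z_crit * se_log_or),
      math.exp(log_or + z_crit * se_log_or))
z_stat = log_or / se_log_or

print(round(odds_ratio, 6), round(se_or, 6), round(z_stat, 2))
print([round(v, 4) for v in ci])
```

The baseline category cancels out of the contrast, which is why only the counts for categories 1 and 2 appear in the standard error.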
Continuous X
Here interpretation of β1 depends on the units of X.
If the logit is linear in X, then β1 represents the change in log odds for a 1 unit increase in X.
exp(β1) is the odds ratio corresponding to a 1 unit increase in X.
Example: Effect of age on nodal involvement in prostate cancer
. logistic node age
Logit estimates                                   Number of obs   =         53
                                                  LR chi2(1)      =       1.09
                                                  Prob > chi2     =     0.2965
Log likelihood = -34.581125                       Pseudo R2       =     0.0155
------------------------------------------------------------------------------
    node | Odds Ratio   Std. Err.       z     P>|z|       [95% Conf. Interval]
---------+--------------------------------------------------------------------
     age |   .9526993   .0445086     -1.037   0.300       .8693389    1.044053
------------------------------------------------------------------------------
. logit
------------------------------------------------------------------------------
    node |      Coef.   Std. Err.       z     P>|z|       [95% Conf. Interval]
---------+--------------------------------------------------------------------
     age |   -.048456   .0467184     -1.037   0.300      -.1400223    .0431104
   _cons |   2.366605   2.770912      0.854   0.393      -3.064283    7.797493
------------------------------------------------------------------------------
NOTES
• The OR for nodal involvement corresponding to a ten year age difference is exp(10 βAGE), estimated by .953^10 = .62
• The 95% CI for 10 βAGE (the log of the 10-year OR) is given by: 10 × (-.140, .043) = (-1.40, .43)
• Hence the 95% CI for the 10-year OR is given by: (.25, 1.54)
• This OR is the same comparing 40 year olds with 30 year olds as comparing 60 year olds with 50 year olds, etc.
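The rescaling in these notes is simple arithmetic on the log odds scale: multiply the coefficient and its confidence limits by 10, then exponentiate. A short sketch using the age coefficient and interval from the Stata output (the variable names are chosen here):

```python
import math

beta_age = -0.048456                   # coefficient per year of age
ci_beta = (-0.1400223, 0.0431104)      # 95% CI for the per-year coefficient
years = 10

or_10yr = math.exp(years * beta_age)
ci_10yr = (math.exp(years * ci_beta[0]), math.exp(years * ci_beta[1]))

# Equivalent route: raise the one-year odds ratio to the 10th power.
or_1yr = math.exp(beta_age)
or_10yr_alt = or_1yr ** years

print(round(or_10yr, 2), [round(v, 2) for v in ci_10yr])
```

Raising the limits of the one-year odds ratio interval to the 10th power gives the same interval, because exponentiation is monotone.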
Multiple logistic regression
logit(P(Y=1 | X1, X2, …, Xk)) = β0 + β1 X1 + β2 X2 + … + βk Xk
Estimation
Assume we have N observations (Yi, Xi1, Xi2, …, Xik), i=1,2,…,N
As before, we can use maximum likelihood to obtain estimates of β0, β1, β2, …, βk that maximize the likelihood:
L = Π(i=1..N) pi^Yi (1 - pi)^(1-Yi), where pi = P(Yi=1 | Xi1, Xi2, …, Xik)
and we can estimate the variances and covariances of the estimates from the inverse of the information matrix, I.
Hypothesis testing
The Wald, Likelihood Ratio and Score tests generalize to the case of k X variables.
In general:
Full model: logit(p) = β0 + β1 X1 + β2 X2 + … + βk Xk
Reduced model: logit(p) = β0 + β1 X1 + β2 X2 + … + βp Xp, p < k
(the p inside logit(p) is the probability; the subscript p is the number of covariates in the reduced model)
H0: βp+1 = βp+2 = … = βk = 0
Ha: βj ≠ 0 for at least one j in p+1, …, k
Likelihood ratio test
LR statistic = -2 [ln L(reduced) - ln L(full)] = Deviance(reduced) - Deviance(full)
Approximate distribution under H0: χ² with k-p degrees of freedom
We must fit two models to calculate the LR statistic
Stata provides the LR test of the current model relative to the null model: H0: β1 = β2 = … = βk = 0
Score test
• If H0 implies β = β* then Score statistic = S(β*)' I^-1 S(β*), where S denotes the score vector (the first derivatives of the log likelihood) and I denotes the information matrix, both evaluated at β*
• Approximate distribution under H0: χ² with k-p degrees of freedom
• Only need to fit the reduced model to calculate the Score statistic
• Stata does not perform the Score test easily.
Wald test
• For a single parameter: β̂j / SE(β̂j) ~ N(0,1), approximately, under H0: βj = 0.
• The Wald test can be generalized to multiple parameters, where it also follows a χ² distribution with k-p degrees of freedom under H0.
• Most confidence intervals are based on the Wald test statistic
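As a worked instance of the single-parameter Wald test, the sketch below uses the tumor size coefficient and standard error reported in the Stata output earlier in these slides. The two-sided p-value comes from the standard normal tail. The variable names are illustrative.

```python
import math

beta_hat = 1.658228    # tumor size coefficient from the Stata output
se_hat = 0.630569      # its standard error

z = beta_hat / se_hat                  # Wald z statistic
wald_chi2 = z * z                      # same test on the chi-square (1 df) scale

# Two-sided p-value: P(|Z| > |z|) for a standard normal Z.
p_value = math.erfc(abs(z) / math.sqrt(2.0))

print(f"z = {z:.2f}, chi2 = {wald_chi2:.2f}, p = {p_value:.4f}")
```

The result is z of about 2.63 and a chi-square of about 6.92, with a p-value of roughly 0.0085 that Stata rounds to 0.009 in its three-decimal column.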
LR tests using Stata
In general:
Fit the "full" model, then:
. est store A
(saves the log-likelihood from the most recently fitted model and labels it "A")
Fit the reduced model, then:
. est store B
(saves the log-likelihood from the most recently fitted model and labels it "B")
Carry out the LR test comparing the "full" model (A) with the reduced model (B):
. lrtest A B, stats
Example: prostate cancer study
Fitting “full” model:
. logistic node tsize xray
Logistic regression                               Number of obs   =         53
                                                  LR chi2(2)      =      16.90
                                                  Prob > chi2     =     0.0002
Log likelihood = -26.676709                       Pseudo R2       =     0.2405
------------------------------------------------------------------------------
    node | Odds Ratio   Std. Err.       z     P>|z|       [95% Conf. Interval]
---------+--------------------------------------------------------------------
   tsize |   4.895297   3.426809      2.269   0.023       1.241425    19.30357
    xray |   8.326496   6.218498      2.838   0.005       1.926448     35.9888
------------------------------------------------------------------------------
. logit
------------------------------------------------------------------------------
    node |      Coef.   Std. Err.       z     P>|z|       [95% Conf. Interval]
---------+--------------------------------------------------------------------
   tsize |   1.588275   .7000206      2.269   0.023       .2162598     2.96029
    xray |   2.119443   .7468325      2.838   0.005       .6556779    3.583208
   _cons |  -2.044627   .6099686     -3.352   0.001      -3.240144   -.8491109
------------------------------------------------------------------------------
. est store A
Fitting “reduced” model:
. logistic node tsize
Logistic regression                               Number of obs   =         53
                                                  LR chi2(1)      =       7.70
                                                  Prob > chi2     =     0.0055
Log likelihood = -31.276312                       Pseudo R2       =     0.1096
------------------------------------------------------------------------------
    node | Odds Ratio   Std. Err.       z     P>|z|       [95% Conf. Interval]
---------+--------------------------------------------------------------------
   tsize |       5.25   3.310487      2.630   0.009        1.52552    18.06761
------------------------------------------------------------------------------
. logit
------------------------------------------------------------------------------
    node |      Coef.   Std. Err.       z     P>|z|       [95% Conf. Interval]
---------+--------------------------------------------------------------------
   tsize |   1.658228    .630569      2.630   0.009       .4223355    2.894121
   _cons |  -1.435085   .4976116     -2.884   0.004      -2.410385   -.4597837
------------------------------------------------------------------------------
. est store B
Likelihood ratio test: comparing models for nodal involvement with and without effect of xray
. lrtest A B, stats
Likelihood-ratio test                                  LR chi2(1)  =      9.20
(Assumption: B nested in A)                            Prob > chi2 =    0.0024
------------------------------------------------------------------------------
       Model |    Obs    ll(null)   ll(model)     df          AIC         BIC
-------------+----------------------------------------------------------------
           B |     53   -35.12608   -31.27631      2     66.55262    70.49321
           A |     53   -35.12608   -26.67671      3     59.35342    65.26429
------------------------------------------------------------------------------
What hypothesis is this testing?
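The `lrtest` output can be rebuilt from the two log likelihoods in the table. The sketch assumes the usual definitions AIC = -2 ln L + 2 df and BIC = -2 ln L + df × ln(N), which reproduce the printed values. The chi-square tail is for 1 degree of freedom only, since the two models differ by one parameter. The names are illustrative.

```python
import math

n_obs = 53
ll_full, df_full = -26.67671, 3        # model A: tsize + xray + constant
ll_reduced, df_reduced = -31.27631, 2  # model B: tsize + constant

lr_stat = -2.0 * (ll_reduced - ll_full)
lr_df = df_full - df_reduced

# Upper tail of chi-square with 1 df, via the standard normal tail.
p_value = math.erfc(math.sqrt(lr_stat / 2.0))

def aic(ll, df):
    return -2.0 * ll + 2.0 * df

def bic(ll, df, n):
    return -2.0 * ll + df * math.log(n)

print(f"LR chi2({lr_df}) = {lr_stat:.2f}, p = {p_value:.4f}")
print("B:", round(aic(ll_reduced, df_reduced), 5), round(bic(ll_reduced, df_reduced, n_obs), 5))
print("A:", round(aic(ll_full, df_full), 5), round(bic(ll_full, df_full, n_obs), 5))
```

The statistic tests H0: the xray coefficient is zero in a model that already contains tumor size.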
Fitted probabilities in the “full” model:
. predict pnode, p
P(node | tumor=0, xray=0) = .1146 (.1429)
P(node | tumor=1, xray=0) = .3879 (.3529)
P(node | tumor=0, xray=1) = .5187 (.4000)
P(node | tumor=1, xray=1) = .8407 (.9000)
Note: these are slightly different from what we would get if we used the raw data without modelling. Why?
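The four fitted probabilities follow from applying the inverse logit to the coefficients of the "full" model. The sketch below uses the coefficients printed in the Stata output for that model. The function and variable names are chosen here for illustration.

```python
import math

# Coefficients of the full model from the Stata output.
b_cons, b_tsize, b_xray = -2.044627, 1.588275, 2.119443

def expit(eta):
    return 1.0 / (1.0 + math.exp(-eta))

def fitted_probability(tumor, xray):
    """P(node = 1 | tumor, xray) under the additive logit model."""
    return expit(b_cons + b_tsize * tumor + b_xray * xray)

fitted = {(t, x): fitted_probability(t, x) for t in (0, 1) for x in (0, 1)}

for (t, x), p in sorted(fitted.items(), key=lambda kv: (kv[0][1], kv[0][0])):
    print(f"tumor={t}, xray={x}: {p:.4f}")
```

The model uses three parameters to describe four groups, so it cannot match all four raw proportions exactly. Adding a tumor by xray interaction term would make it saturated, and the fitted values would then equal the raw proportions.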
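Confidence intervals
• A 100(1-α)% Likelihood Ratio based confidence region for β is given by: {β : 2 [ln L(β̂) - ln L(β)] ≤ χ²(q; 1-α)}, where q is the number of parameters in β and χ²(q; 1-α) is the 1-α quantile of the χ² distribution with q degrees of freedom
• Stata provides Wald-based CIs for individual parameters
• CIs for odds ratios can be obtained by exponentiation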