
The %LRpowerCorr10 SAS Macro Power Estimation for Logistic Regression

The %LRpowerCorr10 SAS Macro: Power Estimation for Logistic Regression Models with Several Predictors of Interest in the Presence of Covariates. D. Keith Williams, M.P.H., Ph.D., and Zoran Bursac, M.P.H., Ph.D., Department of Biostatistics, University of Arkansas for Medical Sciences.


Presentation Transcript


  1. The %LRpowerCorr10 SAS Macro: Power Estimation for Logistic Regression Models with Several Predictors of Interest in the Presence of Covariates. D. Keith Williams, M.P.H., Ph.D., and Zoran Bursac, M.P.H., Ph.D., Department of Biostatistics, University of Arkansas for Medical Sciences

  2. The Premise for Linear and Logistic Regression Power and Sample Size • Power to detect significance among specific predictors in the presence of other covariates in a model. • For linear regression Proc Power works great! • Logistic regression power estimation is ‘quirky’

  3. Common Approaches to Estimate Logistic Regression Power • Power for one predictor, possibly in the presence of other covariates. • The %powerlog macro can allow for correlation among these predictors. • A weakness: commonly we are interested in power to detect the significance of more than one predictor.

  4. A quick look at %LRpowerCorr10

  5. LRpowerCorr10 • Up to 10 predictors • 2 binary, 4 uniform (-3,3), and 4 normal • Specify a correlation among predictors • Specify an odds ratio value for the predictors • Specify the set of factors of interest and the set of covariates

  6. A Power Scenario logit = -2.2 + ln(1.5) x1 + ln(1.5) x2 + ln(1.1) x3 + ln(1.05) x4 + ln(1.02) x5 + ln(1.05) x6 + ln(1.01) x7 + ln(1.05) x8 + ln(1.02) x9 + ln(1.03) x10 • Risk factors of interest: x1, x2, x3 • Covariates: x4 through x10
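To make the scenario concrete, the following minimal Python sketch (an illustration only, not part of the SAS macro) turns the listed odds ratios into coefficients and evaluates the probability of a ‘1’ implied by the logit. The function name `prob_of_one` is hypothetical. With every predictor set to 0, the intercept of -2.2 gives a probability of about 0.10, which appears to correspond to the P(Y=1) = .1 setting used in the example that follows.

```python
import math

# Intercept and odds ratios for x1..x10, copied from the scenario above.
intercept = -2.2
odds_ratios = [1.5, 1.5, 1.1, 1.05, 1.02, 1.05, 1.01, 1.05, 1.02, 1.03]

# Each coefficient is the natural log of its odds ratio.
betas = [math.log(o) for o in odds_ratios]

def prob_of_one(x):
    """P(Y=1) implied by the logit for one vector of predictor values (hypothetical helper)."""
    eta = intercept + sum(b * v for b, v in zip(betas, x))
    return 1.0 / (1.0 + math.exp(-eta))

# Probability when all ten predictors equal 0.
baseline = prob_of_one([0.0] * 10)
print(round(baseline, 4))
```

Raising a single predictor by one unit multiplies the odds by its odds ratio, which is why the coefficients enter the logit as ln(OR).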

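  7. %LRpowerCorr Example %LRpowerCorr10(2000, 1000, .2, .1, 1.5, 1.5, 1.1, 1.05, 1.02, 1.05, 1.01, 1.05, 1.02, 1.03, cx1 cx2 cx3 cx4 cx5 cx6 cx7 cx8 cx9 cx10, cx4 cx5 cx6 cx7 cx8 cx9 cx10, .05, 3, 0.1, 0.5); Arguments in order: • 2000: n • 1000: number of simulations • .2: correlation among predictors • .1: mean proportion of ‘1’s in the outcome • 1.5 through 1.03: odds ratios for cx1 to cx10 • cx1 to cx10: full model, which includes the 3 risk factors of interest (cx1, cx2, cx3) • cx4 to cx10: reduced model • .05: level of significance • 3: the number of terms of interest • 0.1, 0.5: probability of ‘1’ for the binary cx1 and cx2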

  8. %LRpowerCorr10 Output • Sample size = 2000; Simulations = 1000; Rho = .2; P(Y=1) = .1 • OR1=1.5, OR2=1.5, OR3=1.1, OR4=1.05, OR5=1.02, OR6=1.05, OR7=1.01, OR8=1.05, OR9=1.02, OR10=1.03 • Full Model: cx1 cx2 cx3 cx4 cx5 cx6 cx7 cx8 cx9 cx10 • Reduced Model: cx4 cx5 cx6 cx7 cx8 cx9 cx10 • Power = 88% (LCL = 86%, UCL = 90%)
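The reported limits are consistent with a normal-approximation (Wald) interval for a proportion estimated from 1000 simulations. The Python sketch below shows that calculation. It is an assumption that the macro computes its limits this way, and the count of 880 rejections is a hypothetical value chosen to match the 88% shown.

```python
import math
from statistics import NormalDist

def power_interval(rejections, simulations, level=0.95):
    """Simulated power with a normal-approximation (Wald) confidence interval."""
    power = rejections / simulations
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    half_width = z * math.sqrt(power * (1 - power) / simulations)
    return power, power - half_width, power + half_width

# Hypothetical count: 880 rejections out of 1000 simulated data sets.
power, lcl, ucl = power_interval(880, 1000)
print(f"{power:.0%} {lcl:.0%} {ucl:.0%}")  # prints "88% 86% 90%"
```

The interval width shrinks with the square root of the number of simulations, so quadrupling the simulations roughly halves the distance between LCL and UCL.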

  9. A look at regular linear regression. The basic structure is the same.

  10. A Key Point about Linear Regression • We rarely have conjectured values for particular betas in a regular linear regression. • Therefore, for linear regression models, one conjectures the difference in R-square between a model that includes the predictors of interest and a model without these predictors.

  11. Example Data Set

  12. The Hypothetical Scenario: A model with 4 terms • Predictors of PSA that we choose to power: SVI, c_volume • Two covariates to be included: cpen, gleason

  13. Details The full model We want to power the test that a model with these 2 predictors is statistically better than a model excluding them. The reduced model

  14. The Corresponding Hypothesis H(o): the coefficients of the two predictors of interest (SVI and c_volume) are both zero. H(a): At least one of the above is non-zero in the full model, when the difference in R-square = ?

  15. Let's go back through those last 3 slides again

  16. Note Hypothetical Full Model Predictors of interest

  17. Hypothetical Reduced Model • Note the R-square difference: 0.45 - 0.34 = 0.11

  18. proc power;

          proc power;
            multreg
              model=fixed
              alpha=.05
              nfullpredictors=4
              ntestpredictors=2
              rsqfull=0.45
              rsqreduced=0.34
              ntotal=97 80 70 60 50 40
              power=.;
            plot x=n min=40 max=100 key=oncurves
              yopts=(ref=0.8 .977 crossref=yes);
          run;

          The POWER Procedure
          Type III F Test in Multiple Regression

          Fixed Scenario Elements
          Method                                Exact
          Model                                 Random X
          Number of Predictors in Full Model    4
          Number of Test Predictors             2
          Alpha                                 0.05
          R-square of Full Model                0.45
          Difference in R-square                0.11

          Computed Power
          Index    N Total    Power
          1        97         0.979
          2        80         0.949
          3        70         0.916
          4        60         0.864
          5        50         0.787
          6        40         0.677

  19. Now the logistic regression case

  20. Logistic Regression LR test review

  21. Model Fit Statistics

          Criterion    Intercept Only    Intercept and Covariates
          AIC          124.318           113.996
          SC           126.903           139.846
          -2 Log L     122.318           93.996

          Analysis of Maximum Likelihood Estimates
          Parameter    DF    Estimate    Standard Error    Wald Chi-Square    Pr > ChiSq
          Intercept    1     -5.5161     2.2471            6.0260             0.0141
          age          1      0.0646     0.0583            1.2294             0.2675
          sesdum2      1     -1.7862     3.0841            0.3354             0.5625
          sesdum3      1      0.2955     2.2550            0.0172             0.8957
          sector       1      2.9796     1.2481            5.6988             0.0170
          age_ses2     1      0.1054     0.0559            3.5514             0.0595
          age_ses3     1      0.0140     0.0316            0.1952             0.6586
          age_sect     1     -0.0342     0.0309            1.2231             0.2688
          ses2_sect    1     -0.3094     1.4409            0.0461             0.8300
          ses3_sect    1     -0.7396     1.2489            0.3507             0.5537

  22. Model Fit Statistics

          Criterion    Intercept Only    Intercept and Covariates
          AIC          124.318           111.054
          SC           126.903           123.979
          -2 Log L     122.318           101.054

          Analysis of Maximum Likelihood Estimates
          Parameter    DF    Estimate    Standard Error    Wald Chi-Square    Pr > ChiSq
          Intercept    1     -3.8874     0.9955            15.2496            <.0001
          age          1      0.0297     0.0135            4.8535             0.0276
          sesdum2      1      0.4088     0.5990            0.4657             0.4950
          sesdum3      1     -0.3051     0.6041            0.2551             0.6135
          sector       1      1.5746     0.5016            9.8543             0.0017

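  23. The Corresponding Hypothesis H(o): the coefficients of the five interaction terms dropped from the full model (age_ses2, age_ses3, age_sect, ses2_sect, ses3_sect) are all zero. H(a): At least one of the above is non-zero in the full model. LR chi-square = 101.054 - 93.996 = 7.0582 on 5 d.f.; p-value = 0.22 (implies none are helpful)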
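The likelihood ratio test above can be checked by hand. The Python sketch below (an illustration, not the presenters' code) takes the two -2 Log L values from the fitted models, uses 9 - 4 = 5 degrees of freedom for the five dropped terms, and evaluates the chi-square upper tail with a closed-form expression for integer degrees of freedom. The helper name `chi2_sf` is hypothetical.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variate with integer df (closed form)."""
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{j < df/2} (x/2)^j / j!
        term, total = 1.0, 1.0
        for j in range(1, df // 2):
            term *= (x / 2) / j
            total += term
        return math.exp(-x / 2) * total
    # Odd df: erfc term plus a finite sum of x^(j-1/2) / (1*3*...*(2j-1)).
    total, term = 0.0, math.sqrt(x)
    for j in range(1, (df - 1) // 2 + 1):
        total += term
        term *= x / (2 * j + 1)
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 / math.pi) * math.exp(-x / 2) * total

# -2 Log L of the reduced model minus -2 Log L of the full model.
lr = 101.054 - 93.996
df = 9 - 4  # predictors in the full model minus predictors in the reduced model
p_value = chi2_sf(lr, df)
print(round(lr, 3), round(p_value, 2))
```

The result agrees with the p-value of 0.22 quoted on the slide.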

  24. Power for Logistic Models Background • Most existing tools are based on the Hsieh, Bloch, and Larsen (1998) paper and the Agresti (1996) text. • %powerlog macro and other software. • Recent publication by Demidenko (2008)

  25. SAS 9.2 Proc Power for Logistic The LOGISTIC statement performs power and sample size analyses for the likelihood ratio chi-square test of a single predictor in binary logistic regression, possibly in the presence of one or more covariates. All predictor variables are assumed to be independent of each other. So, this analysis is not applicable to studies with correlated predictors — for example, most observational studies (as opposed to randomized studies).

  26. Common Approaches to Estimate Logistic Regression Power • Calculate the power to detect significance of one predictor, possibly in the presence of other predictors. • The %powerlog macro can allow for correlation among these predictors. • A weakness: in many instances we are interested in power to detect the significance of more than one predictor.

  27. A demonstration of the %Powerlog macro

  28. The %PowerLog Macro: Logistic Regression • Power for a one s.d. unit increase from the mean of X1 • Any number of other covariates in the model are accounted for by supplying the R-square from a regular regression of X1 on those covariates.

  29. %Powerlog Function Example %powerlog(p1=.5, p2=.6667, power=.8, rsq=%str(0, .0565, .1141), alpha=.05); • p1: probability of ‘1’ at the mean of X1 • p2: probability of ‘1’ at the mean + 1 SD of X1 • rsq: three hypothetical values of the R-square of X1 regressed on any number of other covariates
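For intuition about what such a call computes, here is a minimal Python sketch of the sample size approximation for a single normal predictor given in Agresti (1996), inflated by 1/(1 - R-square) for the other covariates. That %powerlog uses exactly this form, with a one-sided alpha, is an assumption and is not taken from the macro source; the function name `approx_n` is hypothetical. Under that assumption the three R-square values give sample sizes close to the n = 70, 75, and 80 used in the comparison with %LRpowerCorr10.

```python
import math
from statistics import NormalDist

def approx_n(p1, p2, power, alpha=0.05, rsq=0.0):
    """Approximate n for one normal predictor (Agresti 1996 form, one-sided alpha assumed)."""
    # Log odds ratio for a one-SD increase in X1.
    lam = math.log(p2 / (1 - p2)) - math.log(p1 / (1 - p1))
    z_a = NormalDist().inv_cdf(1 - alpha)
    z_b = NormalDist().inv_cdf(power)
    shrink = math.exp(-lam ** 2 / 4)
    delta = (1 + (1 + lam ** 2) * math.exp(5 * lam ** 2 / 4)) / (1 + shrink)
    n = (z_a + z_b * shrink) ** 2 * (1 + 2 * p1 * delta) / (p1 * lam ** 2)
    # Inflate for the correlation of X1 with the other covariates.
    return n / (1 - rsq)

for rsq in (0.0, 0.0565, 0.1141):
    print(rsq, round(approx_n(0.5, 0.6667, 0.8, rsq=rsq), 1))
```

The R-square adjustment is a variance inflation factor: the more X1 is explained by the other covariates, the less independent information each observation carries about its coefficient.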

  30. %Powerlog Output Alpha=.05, p1=.5 p2=.6667

  31. %LRpowerCorr10 versus %powerlog n=70 • Sample size = 70; Simulations = 1000; Rho = 0; P(Y=1) = .5 • OR1=1, OR2=1, OR3=1, OR4=1, OR5=1, OR6=1, OR7=2, OR8=1, OR9=1, OR10=1 • Full Model: cx7 cx8 cx9 cx10 • Reduced Model: cx8 cx9 cx10 • Power = 79% (LCL = 76%, UCL = 81%)

  32. %LRpowerCorr10 versus %powerlog n=75 • Sample size = 75; Simulations = 1000; Rho = .1; P(Y=1) = .5 • OR1=1, OR2=1, OR3=1, OR4=1, OR5=1, OR6=1, OR7=2, OR8=1, OR9=1, OR10=1 • Full Model: cx7 cx8 cx9 cx10 • Reduced Model: cx8 cx9 cx10 • Power = 81% (LCL = 78%, UCL = 83%)

  33. %LRpowerCorr10 versus %powerlog n=80 • Sample size = 80; Simulations = 1000; Rho = .2; P(Y=1) = .5 • OR1=1, OR2=1, OR3=1, OR4=1, OR5=1, OR6=1, OR7=2, OR8=1, OR9=1, OR10=1 • Full Model: cx7 cx8 cx9 cx10 • Reduced Model: cx8 cx9 cx10 • Power = 80% (LCL = 77%, UCL = 82%)

  34. Again…only one predictor of interest using %powerlog

  35. The %LRpowerCorr10 Macro • Power Estimation • One or more predictors of interest • Different distributions of predictors • Other covariates in model • Correlation among predictors • Specify OR values associated with predictors • Average proportion of ‘1’s

  36. %LRpowerCorr (N, Simulations, Correlation) • Define the logit: specify the association between each covariate x and the outcome y through its parameter estimate (beta). • Loop over the simulations: (a) Draw a sample of size N from the specified logit. (b) Convert the logits to binary outcomes. (c) PROC LOGISTIC: fit the full multivariate model and save its -2 Log Likelihood. (d) PROC LOGISTIC: fit the reduced multivariate model and save its -2 Log Likelihood. (e) Perform the likelihood ratio test (the difference between the reduced and full -2 Log Likelihoods). (f) Is the resulting chi-square test statistic greater than the chi-square critical value, with respect to the correct number of d.f.? If so, reject the null; if not, fail to reject the null. (g) Save the result. • Calculate the proportion of correct rejections (i.e. the power to detect the specified associations).
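The loop described above can be sketched in a few dozen lines. The Python version below is a simplified illustration under stated assumptions, not the macro itself: every predictor is normal (the macro also offers binary and uniform predictors), a shared-factor construction supplies a common non-negative correlation rho, the models are fitted by a hand-written Newton-Raphson routine in place of PROC LOGISTIC, and the test compares the p-value with alpha, which is equivalent to comparing the statistic with the critical value. All function names are hypothetical.

```python
import math
import random

def chi2_sf(x, df):
    """Upper-tail chi-square probability for integer df (closed form)."""
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        term, total = 1.0, 1.0
        for j in range(1, df // 2):
            term *= (x / 2) / j
            total += term
        return math.exp(-x / 2) * total
    total, term = 0.0, math.sqrt(x)
    for j in range(1, (df - 1) // 2 + 1):
        total += term
        term *= x / (2 * j + 1)
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 / math.pi) * math.exp(-x / 2) * total

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[piv] = m[piv], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][k] * x[k] for k in range(r + 1, n))) / m[r][r]
    return x

def neg2loglik(rows, y, iters=25):
    """Fit a logistic model by Newton-Raphson and return its -2 log-likelihood."""
    p = len(rows[0])
    beta = [0.0] * p
    for _ in range(iters):
        grad = [0.0] * p
        hess = [[0.0] * p for _ in range(p)]
        for xi, yi in zip(rows, y):
            mu = 1 / (1 + math.exp(-sum(b * v for b, v in zip(beta, xi))))
            for j in range(p):
                grad[j] += (yi - mu) * xi[j]
                for k in range(p):
                    hess[j][k] += mu * (1 - mu) * xi[j] * xi[k]
        step = solve(hess, grad)
        beta = [b + s for b, s in zip(beta, step)]
        if max(abs(s) for s in step) < 1e-8:
            break
    ll = 0.0
    for xi, yi in zip(rows, y):
        eta = sum(b * v for b, v in zip(beta, xi))
        ll += yi * eta - math.log1p(math.exp(eta))
    return -2 * ll

def lr_power(n, sims, rho, intercept, odds_ratios, n_interest, alpha=0.05, seed=1):
    """Simulated power of the LR test that the first n_interest coefficients are zero."""
    rng = random.Random(seed)
    betas = [math.log(o) for o in odds_ratios]
    rejections = 0
    for _ in range(sims):
        full, reduced, y = [], [], []
        for _ in range(n):
            shared = rng.gauss(0, 1)  # common factor; requires 0 <= rho < 1
            x = [math.sqrt(rho) * shared + math.sqrt(1 - rho) * rng.gauss(0, 1) for _ in betas]
            eta = intercept + sum(b * v for b, v in zip(betas, x))
            y.append(1 if rng.random() < 1 / (1 + math.exp(-eta)) else 0)
            full.append([1.0] + x)
            reduced.append([1.0] + x[n_interest:])
        lr = neg2loglik(reduced, y) - neg2loglik(full, y)
        if chi2_sf(lr, n_interest) < alpha:
            rejections += 1
    return rejections / sims

# Small illustrative run: one predictor of interest (OR = 2) plus one covariate.
print(lr_power(n=150, sims=50, rho=0.2, intercept=0.0, odds_ratios=[2.0, 1.0], n_interest=1))
```

Because the estimate is a proportion over simulated data sets, its precision depends on the number of simulations, which is why the macro output reports LCL and UCL alongside the power.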

  37. %LRpowerCorr10 Variables

  38. Example from Hosmer, Applied Logistic Regression: ‘The low birth weight study’ • Primary Risk Factors of Interest • Confounders

  39. We wish to find the power to detect significance for at least one of the risk factors in the full model Full Model Reduced Model

  40. The Corresponding Hypothesis H(o): the coefficients of the primary risk factors of interest are all zero. H(a): At least one of the above is non-zero in the full model

  41. Hypothesized Odds Ratios • AGE OR=1.1 (CX7) Normal • LBT OR=1.5 (CX1) Binary • RACE OR=1.5 (CX2) Binary • FTV OR=1.1 (CX3) Uniform • SMOKE OR=1.02 (CX8) Normal • PLT OR=1.02 (CX9) Normal • HT OR=1.02 (CX10) Normal • UI OR=1.02 (CX4) Uniform • P(Y=1)=0.1 • Investigate N = 900 • Rho=0.2

  42. Macro Commands %LRpowerCorr10 (900,1000,.2,.1,1.5,1.5, 1.1,1.02,1.02,1.02, 1.1,1.02,1.02,1.02, cx1 cx2 cx3 cx7 cx4 cx8 cx9 cx10 , cx4 cx8 cx9 cx10, .05, 4, 0.25,0.5);

  43. Output • Sample size = 900; Simulations = 1000; Rho = .2; P(Y=1) = .1 • OR1=1.5, OR2=1.5, OR3=1.1, OR4=1.02, OR5=1.02, OR6=1.02, OR7=1.1, OR8=1.02, OR9=1.02, OR10=1.02 • Full Model: cx1 cx2 cx3 cx7 cx4 cx8 cx9 cx10 • Reduced Model: cx4 cx8 cx9 cx10 • Power = 73% (LCL = 70%, UCL = 75%)

  44. Recent Development %Quickpower Macro %quickpower2(100,.2,.1, 1.5,1.5, 1.1,1.02,1.02,1.02, 1.1,1.02,1.02,1.02, cx1 cx2 cx3 cx7 cx4 cx8 cx9 cx10 , cx4 cx8 cx9 cx10, 8, .05, 4, 0.25,0.5);

  45. A trick to get a good guess for N

          The POWER Procedure
          Type III F Test in Multiple Regression

          Fixed Scenario Elements
          Method                                Exact
          Model                                 Random X
          Number of Predictors in Full Model    8
          Number of Test Predictors             4
          Alpha                                 0.05
          R-square of Full Model                0.01971
          R-square of Reduced Model             0.007397
          Nominal Power                         0.8

          Computed N Total
          Actual Power    N Total
          0.800           962

  46. Resulting in… • Sample size = 962; Simulations = 1000; Rho = .2; P(Y=1) = .1 • OR1=1.5, OR2=1.5, OR3=1.1, OR4=1.02, OR5=1.02, OR6=1.02, OR7=1.1, OR8=1.02, OR9=1.02, OR10=1.02 • Full Model: cx1 cx2 cx3 cx7 cx4 cx8 cx9 cx10 • Reduced Model: cx4 cx8 cx9 cx10 • Power = 76% (LCL = 74%, UCL = 79%)

  47. LRpowerCorr C Macro: Approximate Power Curve

          %LRpowerCorr10C(50, 150, 500, .1, .5,
            1, 1,
            1, 1, 1, 1,
            2.0, 1, 1, 1,
            cx7 cx8 cx9 cx10,
            cx8 cx9 cx10,
            .05,
            1,
            .25, .25);
          ods graphics on;
          proc logistic data=base desc plots(only)=(roc(id=obs) effect);
            model reject=n1/;
          run;
          ods graphics off;

  48. The SAS Macros • www.uams.edu/biostat/williams • Text file versions of the %LRpowerCorr and %quickpower SAS macros with an example • Copy and paste into SAS to run.
