April 11
• Logistic Regression
• Modeling interactions
• Analysis of case-control studies
• Data presentation

Subgroup Analyses: Journal Tables

               Treatment A   Treatment B     OR
Overall            100           150        0.67
Men                 40            90        x.xx
Women               60            60        x.xx
Age < 50            25            30        x.xx
Age 51-60           35            50        x.xx
Age 60+             40            70        x.xx
SBP < 160           40            70        x.xx
SBP ≥ 160           60            80        x.xx

Is there any evidence that the effect of treatment differs among subgroups?

TOMHS Example
• Question: Does the effect of active BP treatment on CVD differ for younger versus older persons?
• Looking at an interaction effect (effect modification)
• Compare
  • Odds ratio for CVD (treatment vs. placebo) in younger patients
  • Odds ratio for CVD (treatment vs. placebo) in older patients

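Logistic regression equation
• Model the log odds of the outcome as a linear function of one or more variables
• Xi = predictors (independent variables)
• b is the increase in log odds for a 1-unit increase in X
• exp(b) is the relative odds (odds ratio) for a 1-unit increase in X
• The model is:
  log(odds) = log(p / (1 - p)) = b0 + b1*X1 + b2*X2 + ... + bk*Xk
  where p is the probability of the outcome
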
Logistic Model For Interaction
• X1 = 1 for active treatment and 0 for placebo
• X2 = 1 for age ≥ 55 and 0 for age < 55
• X3 = X1 * X2

Log Odds (placebo, young) = b0
Log Odds (active, young)  = b0 + b1
Log Odds (placebo, old)   = b0 + b2
Log Odds (active, old)    = b0 + b1 + b2 + b3

Young: difference (active minus placebo) = b1; exp(b1) is the odds ratio (A v P) for young
Old: difference (active minus placebo) = b1 + b3; exp(b1 + b3) is the odds ratio (A v P) for old

What does b3 mean?
exp(b3) = exp(b1 + b3) / exp(b1) = [odds ratio (A v P) for old] / [odds ratio (A v P) for young]
= A ratio of ratios!

Interaction Hypothesis
Ho: b3 = 0
Ha: b3 ≠ 0
Test in SAS just like any other coefficient

TOMHS: Overall Effect of Active Treatment

PROC MEANS DATA=temp N MEAN SUM;
  CLASS active;
  VAR cvd;
RUN;

Analysis Variable : cvd

            N
active    Obs      N        Mean          Sum
=================================================
     0    234    234    0.1623932    38.0000000
     1    668    668    0.1107784    74.0000000
=================================================

Active: 74/668 or 11.1%
Placebo: 38/234 or 16.2%
RR = 0.68 (32% lower rate of CVD with active treatment)

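As a hand check, the crude relative risk and odds ratio can be recomputed from the counts in the PROC MEANS output (74 events among 668 active, 38 among 234 placebo). This is only a sketch; the variable names are made up for illustration.

```sas
/* Recompute the crude RR and OR from the printed counts */
DATA _NULL_;
  events_a = 74;  n_a = 668;   /* active group  */
  events_p = 38;  n_p = 234;   /* placebo group */
  risk_a = events_a / n_a;
  risk_p = events_p / n_p;
  rel_risk   = risk_a / risk_p;
  odds_ratio = (events_a / (n_a - events_a)) / (events_p / (n_p - events_p));
  PUT rel_risk= 6.3 odds_ratio= 6.3;   /* written to the SAS log */
RUN;
```

The log should show values close to the RR of 0.68 above and the odds ratio of about 0.64 that PROC LOGISTIC reports for active treatment.
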
OVERALL (ACTIVE VERSUS PLACEBO)

The LOGISTIC Procedure
Analysis of Maximum Likelihood Estimates

                           Standard        Wald
Parameter   DF  Estimate     Error   Chi-Square   Pr > ChiSq
Intercept    1   -1.6405    0.1773      85.6626       <.0001
active       1   -0.4423    0.2159       4.1964       0.0405

Odds Ratio Estimates

           Point        95% Wald
Effect  Estimate   Confidence Limits
active     0.643     0.421    0.981

Active group has 36% lower odds of CVD compared to placebo.

Reading DATA and Creating indicator variables and interaction variable

LIBNAME tomhs 'C:/';
DATA temp;
  SET tomhs.bpstudy;
  cvd = second;
  IF group = 6 THEN active = 0; ELSE active = 1;
  IF age < 55 THEN old = 0; ELSE old = 1;
  * compute interaction term (x3);
  active_old = active*old;
RUN;

* Get simple counts and proportions first;
PROC MEANS DATA=temp N MEAN SUM;
  CLASS old active;
  VAR cvd;
RUN;

The MEANS Procedure
Analysis Variable : cvd

                   N
old   active     Obs      N        Mean          Sum
========================================================
  0        0     115    115    0.1565217    18.0000000
           1     350    350    0.0714286    25.0000000
  1        0     119    119    0.1680672    20.0000000
           1     318    318    0.1540881    49.0000000

It appears the effect of treatment is mostly in younger patients

PROC LOGISTIC DATA=temp DESCENDING;
  MODEL cvd = active old active_old;
  CONTRAST 'A v P (Young)' active 1 / ESTIMATE=BOTH;
  CONTRAST 'A v P (Old)' active 1 active_old 1 / ESTIMATE=BOTH;
  * The second contrast will give us beta1 + beta3;
RUN;

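The same stratum-specific odds ratios can probably also be requested without a hand-built product variable. The sketch below assumes a SAS release that supports the ODDSRATIO statement in PROC LOGISTIC and reuses the temp data set and the active and old indicators defined earlier; it is an alternative illustration, not the approach used in these slides.

```sas
/* Sketch: interaction written as active*old in the MODEL statement,   */
/* with the A v P odds ratio requested at each level of old (0 and 1). */
PROC LOGISTIC DATA=temp DESCENDING;
  MODEL cvd = active old active*old;
  ODDSRATIO 'A v P' active / AT(old = 0 1);
RUN;
```

Because the fitted model is the same, the two odds ratios should agree with exp(b1) and exp(b1 + b3) from the CONTRAST statements.
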
SAS OUTPUT

Response Profile

Ordered             Total
  Value   cvd   Frequency
      1     1         112
      2     0         790

Probability modeled is cvd=1.

Testing Global Null Hypothesis: BETA=0

Test               Chi-Square   DF   Pr > ChiSq
Likelihood Ratio      15.7787    3       0.0013
Score                 14.7851    3       0.0020
Wald                  14.0735    3       0.0028

The LOGISTIC Procedure
Analysis of Maximum Likelihood Estimates

                            Standard        Wald
Parameter    DF  Estimate     Error   Chi-Square   Pr > ChiSq
Intercept     1   -1.6843    0.2566      43.0730       <.0001
active        1   -0.8806    0.3301       7.1180       0.0076    (b1)
old           1    0.0850    0.3549       0.0573       0.8108    (b2)
active_old    1    0.7771    0.4395       3.1261       0.0770    (b3)

Odds Ratio Estimates

               Point        95% Wald
Effect      Estimate   Confidence Limits
active         0.415     0.217    0.792
old            1.089     0.543    2.183
active_old     2.175     0.919    5.147

Odds ratio for CVD (A v P) for younger patients = exp(b1) = 0.415
Odds ratio for CVD (A v P) for older patients = exp(b1 + b3) = exp(-0.10) = 0.90
Ratio of odds ratios: exp(b3) = 2.175 = 0.90/0.415 (up to rounding)

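Because this model has one parameter for each of the four age-by-treatment cells, its odds ratios can be checked by hand from the cell counts in the earlier PROC MEANS output (18/115, 25/350, 20/119 and 49/318 events). A minimal sketch, with illustrative variable names:

```sas
/* Stratum-specific odds ratios (active vs placebo) from the cell counts */
DATA _NULL_;
  or_young = (25 / (350 - 25)) / (18 / (115 - 18));   /* age < 55  */
  or_old   = (49 / (318 - 49)) / (20 / (119 - 20));   /* age >= 55 */
  ratio    = or_old / or_young;                       /* corresponds to exp(b3) */
  PUT or_young= 6.3 or_old= 6.3 ratio= 6.3;           /* written to the SAS log */
RUN;
```

The three values should line up with exp(b1) = 0.415, exp(b1 + b3) = 0.90 and exp(b3) = 2.175 in the output above.
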
CONTRAST 'A v P (Old)' active 1 active_old 1 / ESTIMATE=BOTH;
Computes 1*beta1 + 0*beta2 + 1*beta3 = beta1 + beta3, plus test and 95% CI

Contrast Rows Estimation and Testing Results

                                       Standard            Lower     Upper
Contrast        Type   Row  Estimate      Error   Alpha    Limit     Limit
A v P (Young)   PARM     1   -0.8806     0.3301    0.05  -1.5275   -0.2337
A v P (Young)   EXP      1    0.4145     0.1368    0.05   0.2171    0.7916
A v P (Old)     PARM     1   -0.1035     0.2902    0.05  -0.6723    0.4653   (b1 + b3)
A v P (Old)     EXP      1    0.9017     0.2617    0.05   0.5105    1.5925   (exp(b1 + b3))

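The confidence limits in the contrast table follow the usual Wald recipe: estimate plus or minus a normal quantile times the standard error on the log-odds scale, then exponentiate. The sketch below redoes that arithmetic for the 'A v P (Old)' row using the printed estimate and standard error; the variable names are illustrative.

```sas
/* Wald 95% CI for beta1 + beta3, then on the odds ratio scale */
DATA _NULL_;
  est = -0.1035;               /* beta1 + beta3 from the PARM row */
  se  =  0.2902;               /* its standard error              */
  z   = PROBIT(0.975);         /* about 1.96                      */
  lower = est - z * se;
  upper = est + z * se;
  or_est   = EXP(est);
  or_lower = EXP(lower);
  or_upper = EXP(upper);
  PUT lower= 8.4 upper= 8.4 or_est= 8.4 or_lower= 8.4 or_upper= 8.4;
RUN;
```

Note that the standard error of beta1 + beta3 depends on the covariance between the two coefficients, so it cannot be rebuilt from the two separate standard errors in the parameter table; the CONTRAST statement handles that.
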
Description of Findings
Patients in the active group had 36% lower odds of CVD compared to the placebo group (OR: 0.64; 95% CI: 0.42-0.98). Analyses by age showed that the benefit of active treatment was greatest in younger patients. In patients under age 55, the odds of CVD were 58% lower with active treatment (OR: 0.42), whereas for patients aged 55 and older the odds were only 10% lower (OR: 0.90). The test for interaction between treatment and age approached significance (p = 0.077).

Logistic Regression for Case-Control Studies
• Same analyses as prospective study
• Outcome:
  • Y = 1 is a case
  • Y = 0 is a control
• Model log (odds) of being a case
• Odds ratios have same meaning
• Estimating probability of being a case not appropriate

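One way to see why the odds ratios carry over while predicted probabilities do not: suppose cases and controls enter the study with sampling probabilities \pi_1 and \pi_0 that do not depend on the predictors (these symbols are introduced here for illustration and are not from the slides). The model fitted to the case-control sample then differs from the population model only in its intercept:

```latex
\[
\log\frac{P(Y=1 \mid X,\ \text{sampled})}{P(Y=0 \mid X,\ \text{sampled})}
  = \Bigl(b_0 + \log\frac{\pi_1}{\pi_0}\Bigr) + b_1 X_1 + \dots + b_k X_k
\]
```

The slopes b_1, ..., b_k, and therefore the odds ratios, are unchanged, but the intercept is shifted by a quantity that is usually unknown, so fitted probabilities of being a case reflect the sampling design rather than the population.
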
Example Colon Polyp Study
• Cases (N=574)
  • Patients diagnosed with colorectal polyps from colonoscopy
• Controls (N=707)
  • Patients clear of colorectal polyps from colonoscopy
• Risk Factors Under Study
  • FH of colon cancer
  • Smoking and alcohol
  • Reproductive history factors
  • Obesity and adiposity (waist-to-hip measures)

Example Colon Polyp Study
• Variables
  • CC Status (1=case, 2=control)
  • Age (years)
  • FH colon cancer (1=Y, 0=N)
  • Current Smoking (1=Y, 0=N)
  • Gender (1=Men, 0=Women)
  • Waist to Hip Ratio
• Variable Names
  • CC, AGE, FHCC, SMOKERS, MEN, and WHIP

PROC LOGISTIC DATA=temp;
  MODEL cc = age fhcc smokers men whip;
  UNITS whip = 0.1;
RUN;

Response Profile

Ordered             Total
  Value   CC1   Frequency
      1     1         561
      2     2         690

Probability modeled is CC=1.

Testing Global Null Hypothesis: BETA=0

Test               Chi-Square   DF   Pr > ChiSq
Likelihood Ratio     165.4379    5       <.0001
Score                155.7546    5       <.0001
Wald                 139.8082    5       <.0001

Analysis of Maximum Likelihood Estimates

                           Standard        Wald
Parameter   DF  Estimate     Error   Chi-Square   Pr > ChiSq
Intercept    1   -4.0683    0.5953      46.7054       <.0001
AGE          1    0.0497   0.00618      64.8156       <.0001
FHCC         1   -0.4434    0.1505       8.6798       0.0032
smokers      1    0.5272    0.1623      10.5537       0.0012
men          1    0.8379    0.1503      31.0610       <.0001
WHIP         1    0.7491    0.6287       1.4197       0.2335

Odds Ratio Estimates

            Point        95% Wald
Effect   Estimate   Confidence Limits
AGE         1.051     1.038    1.064
FHCC        0.642     0.478    0.862
smokers     1.694     1.233    2.329
men         2.312     1.722    3.104
WHIP        2.115     0.617    7.253

Result of UNITS whip = 0.1;

Effect     Unit   Estimate   95% Confidence Limits
WHIP     0.1000      1.078     0.953    1.219

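The UNITS line simply rescales the coefficient before exponentiating: the odds ratio for a 0.1-unit increase is exp(0.1 * b). A quick hand check using the printed WHIP coefficient (variable names are illustrative):

```sas
/* Odds ratio for WHIP per 1 unit and per 0.1 unit */
DATA _NULL_;
  b_whip = 0.7491;                      /* WHIP estimate from the output */
  or_per_unit  = EXP(b_whip);           /* per 1.0 increase in WHIP      */
  or_per_tenth = EXP(0.1 * b_whip);     /* per 0.1 increase in WHIP      */
  PUT or_per_unit= 6.3 or_per_tenth= 6.3;
RUN;
```

A full unit of waist-to-hip ratio is far outside the range seen between patients, which is why the per-0.1 odds ratio (about 1.08) is the more readable summary.
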
Interaction Model
• Is the relationship of waist-to-hip ratio to case status different for men and women?
• Define interaction term
  • whip_men = whip * men

PROC LOGISTIC DATA=temp DESCENDING;
  MODEL cc = age fhcc smokers men whip whip_men;
RUN;

Analysis of Maximum Likelihood Estimates

                           Standard        Wald
Parameter   DF  Estimate     Error   Chi-Square   Pr > ChiSq
Intercept    1   -2.4771    0.8632       8.2349       0.0041
AGE          1    0.0511   0.00626      66.7467       <.0001
FHCC         1   -0.4528    0.1511       8.9866       0.0027
smokers      1    0.5487    0.1631      11.3203       0.0008
men          1   -2.5148    1.3103       3.6838       0.0549
WHIP         1   -1.2470    1.0235       1.4846       0.2231    (slope and P-value for women, men = 0)
whip_men     1    3.7225    1.4392       6.6897       0.0097

Odds Ratio Estimates

             Point        95% Wald
Effect    Estimate   Confidence Limits
AGE          1.052     1.040     1.065
FHCC         0.636     0.473     0.855
smokers      1.731     1.257     2.383
men          0.081     0.006     1.055
WHIP         0.287     0.039     2.136
whip_men    41.367     2.464   694.576

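With the interaction in the model, the WHIP row describes women only; the slope for men is the sum of the WHIP and whip_men coefficients. The sketch below shows one way to get sex-specific odds ratios per 0.1 increase in waist-to-hip ratio. The data set name temp2 is hypothetical, and DESCENDING is left out on the assumption that cc is coded 1 = case, 2 = control as in the variable list, so that the default ordering models cc = 1.

```sas
/* Build the product term, then estimate the WHIP effect separately by sex */
DATA temp2;                      /* hypothetical data set name */
  SET temp;
  whip_men = whip * men;
RUN;

PROC LOGISTIC DATA=temp2;
  MODEL cc = age fhcc smokers men whip whip_men;
  /* women: 0.1 * b(whip) */
  CONTRAST 'WHIP +0.1 (women)' whip 0.1 / ESTIMATE=BOTH;
  /* men: 0.1 * (b(whip) + b(whip_men)) */
  CONTRAST 'WHIP +0.1 (men)' whip 0.1 whip_men 0.1 / ESTIMATE=BOTH;
RUN;
```

From the printed coefficients, these work out to roughly exp(0.1 * -1.2470) = 0.88 for women and exp(0.1 * (-1.2470 + 3.7225)) = 1.28 for men; the CONTRAST output adds the confidence limits.
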
Some Practical Aspects for Analyses
• Divide the continuous variable of interest into 3-5 categories and compute relative odds for increasing categories.
• Summarize results with the beta coefficient from a model that uses the factor as a continuous variable.

Advantages
• Can determine if risk increases linearly with increasing levels of the factor
• No assumptions about the pattern of risk when using categories
• Can determine if there is a threshold effect
• Eliminates possible effect of outliers

Analysis
• Create indicator variables for quintiles of omega-3 and run logistic regression
• Run regression using omega-3 as a continuous variable

In Class Exercise
• Investigate whether the odds of CVD increase linearly with age
• Divide age into 4 categories
  • < 50; 50-54; 55-59; 60+
• Two CVD endpoints:
  • Clinical: major CVD
  • Second: major + minor CVD
• Compute the percent with CVD within each age category
• Run logistic regression with 3 indicator variables using < 50 as the reference group
• Run logistic regression using age as a continuous variable

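One possible starting point for the exercise, sketched for the second endpoint (cvd = second, as defined in the earlier DATA step). The data set name agecat and the indicator names are made up for illustration, and the code assumes age is never missing.

```sas
/* Age categories and indicator variables; < 50 is the reference group */
DATA agecat;                          /* hypothetical data set name */
  SET temp;
  IF      age < 50 THEN agegrp = 1;
  ELSE IF age < 55 THEN agegrp = 2;
  ELSE IF age < 60 THEN agegrp = 3;
  ELSE                  agegrp = 4;
  age50_54 = (agegrp = 2);            /* 1 if true, 0 otherwise */
  age55_59 = (agegrp = 3);
  age60p   = (agegrp = 4);
RUN;

/* Proportion with CVD in each age category */
PROC MEANS DATA=agecat N MEAN SUM;
  CLASS agegrp;
  VAR cvd;
RUN;

/* Categorical model: three indicators, reference group age < 50 */
PROC LOGISTIC DATA=agecat DESCENDING;
  MODEL cvd = age50_54 age55_59 age60p;
RUN;

/* Continuous model: one slope per year of age */
PROC LOGISTIC DATA=agecat DESCENDING;
  MODEL cvd = age;
RUN;
```

If the three indicator coefficients rise in roughly equal steps, the single slope from the continuous model is a reasonable summary; the same steps can be repeated with the clinical endpoint as the outcome.
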