Conditional Logistic Regression
• Up to this point, we only know how to make inferences about the odds ratio from matched data without adjusting for any confounding variables.
• To make inferences about the adjusted odds ratio, we specify a logistic regression model for each stratum, allowing the slope coefficients of the unmatched variables to be homogeneous across strata while the intercepts differ to capture stratum-specific effects.
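Conditional Logistic Regression
• Let $X = (X_1, \dots, X_p)$ be the unmatched variables, i.e., exposure + confounding variables + concomitant variables. The model for the i-th stratum is
$$\operatorname{logit} P(Y = 1 \mid X) = \alpha_i + \beta^T X, \qquad i = 1, \dots, n,$$
where $\alpha_i$ is the stratum-specific intercept and $\beta$ is the vector of slope coefficients common to all strata.
• Problem: Since the number of parameters in the model grows as the sample size increases (one intercept per stratum), the unconditional likelihood would lead to biased estimators.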
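The bias can be seen in the simplest setting. The sketch below assumes 1-1 matched pairs with a single binary exposure and uses made-up discordant-pair counts (`n10` pairs in which only the case is exposed, `n01` pairs in which only the control is exposed); the function names are illustrative. For a discordant pair, the intercept that maximizes the unconditional likelihood has the closed form alpha_i = -beta/2, which leaves a profile log-likelihood in beta alone. Concordant pairs contribute only a constant and are dropped.

```python
import math

def log_expit(z):
    """Numerically stable log of the logistic function 1 / (1 + exp(-z))."""
    if z >= 0:
        return -math.log1p(math.exp(-z))
    return z - math.log1p(math.exp(z))

def profile_loglik(beta, n10, n01):
    """Unconditional log-likelihood after maximizing over every stratum intercept.

    With alpha_i = -beta/2, a pair with only the case exposed contributes
    expit(beta/2)**2 and a pair with only the control exposed contributes
    expit(-beta/2)**2.
    """
    return 2 * n10 * log_expit(beta / 2) + 2 * n01 * log_expit(-beta / 2)

def argmax_concave(f, lo=-10.0, hi=10.0, iters=200):
    """Ternary search for the maximizer of a concave function on [lo, hi]."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if f(m1) < f(m2):
            lo = m1
        else:
            hi = m2
    return (lo + hi) / 2

n10, n01 = 30, 10  # hypothetical counts, for illustration only
beta_unconditional = argmax_concave(lambda b: profile_loglik(b, n10, n01))
beta_matched = math.log(n10 / n01)  # log of the discordant-pair odds ratio

print("unconditional estimate of beta:", beta_unconditional)
print("log(n10 / n01):                ", beta_matched)
```

Under these assumptions the unconditional estimate comes out at twice the log of the discordant-pair ratio, so the estimated odds ratio is the square of n10/n01 no matter how many pairs are collected.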
Conditional Logistic Regression
• One solution to this problem is to eliminate the nuisance parameters (the stratum intercepts) by working with an appropriate conditional distribution.
• We consider the simplest case, M = 1, where each case is matched to a single control. The extension of the method to other matched designs is straightforward. Let the value of the X variables for the case in stratum i be denoted by $X_{i1}$ and the value for the control by $X_{i2}$.
Conditional Logistic Regression
Suppose that for a given stratum we knew the unordered values $X_{i1}, X_{i2}$ for the two subjects, but did not know which value was associated with the case and which with the control. Consider the conditional probability that $X_{i1}$ belongs to the case and $X_{i2}$ belongs to the control, given that we observed the unordered values $X_{i1}, X_{i2}$ for the two subjects.
Conditional Logistic Regression
To calculate this conditional probability, we calculate two probabilities: the probability that $X_{i1}$ belongs to the case and $X_{i2}$ belongs to the control, and the probability of observing the unordered values $X_{i1}, X_{i2}$ for the two subjects. The ratio of these two probabilities is the desired conditional probability.
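Conditional Logistic Regression
The probability that $X_{i1}$ belongs to the case and $X_{i2}$ belongs to the control is
$$P(X_{i1} \mid Y = 1)\, P(X_{i2} \mid Y = 0),$$
and the probability of observing the unordered values $X_{i1}, X_{i2}$ for the two subjects is
$$P(X_{i1} \mid Y = 1)\, P(X_{i2} \mid Y = 0) + P(X_{i2} \mid Y = 1)\, P(X_{i1} \mid Y = 0).$$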
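Conditional Logistic Regression
The desired conditional probability is
$$\frac{P(X_{i1} \mid Y = 1)\, P(X_{i2} \mid Y = 0)}{P(X_{i1} \mid Y = 1)\, P(X_{i2} \mid Y = 0) + P(X_{i2} \mid Y = 1)\, P(X_{i1} \mid Y = 0)}.$$
Using Bayes' rule, we can rewrite the above conditional probability as
$$\frac{P(Y = 1 \mid X_{i1})\, P(Y = 0 \mid X_{i2})}{P(Y = 1 \mid X_{i1})\, P(Y = 0 \mid X_{i2}) + P(Y = 1 \mid X_{i2})\, P(Y = 0 \mid X_{i1})}.$$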
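The Bayes' rule step can be spelled out as follows. This is a sketch in which all probabilities are understood within stratum i, and $P(x)$ denotes the marginal probability of covariate value $x$ in that stratum (notation assumed here, not taken from the slides).

```latex
% Bayes' rule for one subject with covariate value x and outcome y:
P(x \mid Y = y) = \frac{P(Y = y \mid x)\, P(x)}{P(Y = y)}

% Applied to the two orderings of the pair:
P(X_{i1} \mid Y = 1)\, P(X_{i2} \mid Y = 0)
  = P(Y = 1 \mid X_{i1})\, P(Y = 0 \mid X_{i2})
    \cdot \frac{P(X_{i1})\, P(X_{i2})}{P(Y = 1)\, P(Y = 0)}

P(X_{i2} \mid Y = 1)\, P(X_{i1} \mid Y = 0)
  = P(Y = 1 \mid X_{i2})\, P(Y = 0 \mid X_{i1})
    \cdot \frac{P(X_{i1})\, P(X_{i2})}{P(Y = 1)\, P(Y = 0)}
```

The same factor multiplies the numerator and both terms of the denominator, so it cancels from the ratio and only the probabilities of the outcome given the covariates remain.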
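Conditional Logistic Regression
Finally, applying the logistic regression model for the i-th stratum to each probability in the last expression, the desired conditional probability is
$$\frac{e^{\beta^T X_{i1}}}{e^{\beta^T X_{i1}} + e^{\beta^T X_{i2}}}.$$
The stratum-specific intercept cancels, so the expression depends only on the slope coefficients $\beta$.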
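The cancellation of the stratum intercept in this last step can be checked numerically. The sketch below assumes a single scalar covariate for brevity; `pair_conditional_prob` evaluates the ratio using the full stratum-specific logistic model, and `closed_form` evaluates the simplified expression exp(beta·x1) / (exp(beta·x1) + exp(beta·x2)). Both function names and all numeric values are illustrative.

```python
import math

def expit(z):
    """Logistic function 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-z))

def pair_conditional_prob(alpha, beta, x_case, x_control):
    """P(x_case belongs to the case | unordered pair), using the stratum model
    logit P(Y = 1 | x) = alpha + beta * x for both subjects."""
    p_case = expit(alpha + beta * x_case)
    p_control = expit(alpha + beta * x_control)
    numerator = p_case * (1.0 - p_control)
    denominator = numerator + p_control * (1.0 - p_case)
    return numerator / denominator

def closed_form(beta, x_case, x_control):
    """Simplified expression in which the intercept no longer appears."""
    a = math.exp(beta * x_case)
    b = math.exp(beta * x_control)
    return a / (a + b)

# The result is the same for every value of the stratum intercept alpha.
for alpha in (-3.0, 0.0, 2.5):
    print(alpha, pair_conditional_prob(alpha, 0.4, 23.0, 20.0), closed_form(0.4, 23.0, 20.0))
```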
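Conditional Logistic Regression
• The conditional likelihood is the product of the conditional probabilities for i = 1, …, n as specified on the previous slide, i.e.,
$$L(\beta) = \prod_{i=1}^{n} \frac{e^{\beta^T X_{i1}}}{e^{\beta^T X_{i1}} + e^{\beta^T X_{i2}}}.$$
• The conditional maximum likelihood estimate of $\beta$ is the value that maximizes $L(\beta)$.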
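As a rough illustration of how the conditional likelihood for 1-1 matching can be maximized, the sketch below runs Newton-Raphson for a single scalar covariate. It relies on the fact that each pair's contribution equals expit(beta·d_i), where d_i is the case value minus the control value. The function names and the data are hypothetical, and the sketch assumes the differences are not all zero and not all of one sign (otherwise no finite maximizer exists).

```python
import math

def expit(z):
    """Logistic function 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-z))

def score_and_information(beta, diffs):
    """Derivative and negative second derivative of sum_i log expit(beta * d_i)."""
    score = sum(d * (1.0 - expit(beta * d)) for d in diffs)
    info = sum(d * d * expit(beta * d) * (1.0 - expit(beta * d)) for d in diffs)
    return score, info

def conditional_mle(diffs, tol=1e-10, max_iter=50):
    """Newton-Raphson estimate of beta and its estimated standard error."""
    beta = 0.0
    for _ in range(max_iter):
        score, info = score_and_information(beta, diffs)
        step = score / info
        beta += step
        if abs(step) < tol:
            break
    _, info = score_and_information(beta, diffs)
    return beta, 1.0 / math.sqrt(info)

# Hypothetical case-minus-control differences for six matched pairs.
diffs = [1.0, 2.0, -1.0, 0.5, -0.5, 1.5]
beta_hat, se_hat = conditional_mle(diffs)
print(beta_hat, se_hat)
```

With a binary exposure the differences are +1 or -1 for discordant pairs, and the same routine returns the log of the ratio of the two discordant-pair counts.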
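Conditional Logistic Regression
• Suppose that the i-th stratum contains $M_i$ controls in addition to the case, with covariate values $X_{i1}$ for the case and $X_{i2}, \dots, X_{i,M_i+1}$ for the controls. Then the conditional likelihood function is
$$L(\beta) = \prod_{i=1}^{n} \frac{e^{\beta^T X_{i1}}}{\sum_{j=1}^{M_i+1} e^{\beta^T X_{ij}}},$$
• which is a direct generalization of the conditional likelihood function in the 1-1 matching case.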
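A minimal sketch of the log of this likelihood for one case and any number of controls per stratum is given below. The function name and the data layout (a list of `(case_values, list_of_control_values)` tuples) are assumptions made for illustration. The single stratum used in the demonstration takes the CHK and AGP1 values of stratum 1 from the data listing in the example that follows, and the coefficients are the estimates reported in the PHREG output there.

```python
import math

def conditional_loglik(beta, strata):
    """Conditional log-likelihood for 1:M matched sets.

    beta   : list of slope coefficients
    strata : list of (x_case, [x_control_1, ..., x_control_M]) tuples, where
             each x is a list of covariate values with the same length as beta
    """
    total = 0.0
    for x_case, x_controls in strata:
        members = [x_case] + list(x_controls)
        # Linear predictor beta' x for the case (first) and each control.
        scores = [sum(b * x for b, x in zip(beta, xs)) for xs in members]
        # Log of the denominator, computed with the log-sum-exp trick.
        top = max(scores)
        log_denominator = top + math.log(sum(math.exp(s - top) for s in scores))
        total += scores[0] - log_denominator
    return total

# Stratum 1 of the example listing: (CHK, AGP1) for the case and its 3 controls.
stratum_1 = ([1.0, 23.0], [[0.0, 16.0], [0.0, 20.0], [1.0, 21.0]])

print(conditional_loglik([-1.25719, 0.11854], [stratum_1]))
print(conditional_loglik([0.0, 0.0], [stratum_1]))  # equals -log(4) when beta = 0
```

When every stratum has a single control, the denominator has two terms and the function reduces to the 1-1 case.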
Example
• To explore a variety of potential risk factors associated with benign breast disease, a study identified 50 women who had benign breast disease (cases). Each case was matched to three control women of the same age at interview. The resulting data set consists of 50 cases and 150 age-matched controls.
Example
Variable  Description                       Code/Value
STR       Stratum                           1-50
AGMT      Age of the subject at interview
FNDX      Final Diagnosis                   1=case, 0=control
HIGD      Highest Grade in School
DEG       Degree                            0=none, 1=High School, 2=Jr. College, 3=College, 4=Masters, 5=Doctoral
CHK       Regular Medical Checkup           1=Yes, 0=No
AGP1      Age at first pregnancy
AGMN      Age at Menarche
NLV       Number of Stillbirths
LIV       Number of Live Births
WT        Weight of the subject
AGLP      Age at last Menstrual Period
MST       Marital Status                    1=Married, 2=Divorced, 3=Separated, 4=Widowed, 5=Never Married
Example
STR  AGMT  FNDX  HIGD  DEG  CHK  AGP1  AGMN  NLV  LIV  WT   AGLP  MST
1    39    1     9     0    1    23    13    0    5    118  39    1
1    39    0     10    0    0    16    11    1    3    175  39    3
1    39    0     11    0    0    20    12    1    3    135  39    2
1    39    0     12    1    1    21    11    0    3    125  40    1
2    38    1     14    2    1    .     14    .    .    118  39    1
2    38    0     12    1    0    20    15    0    2    183  38    1
2    38    0     9     0    0    19    11    0    5    218  38    1
2    38    0     13    1    1    23    13    0    2    192  37    1
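Example
libname dog "c:\yougui\linear_model";

/* Recode the outcome so that cases (fndx=1) get "time" 1 and controls (fndx=0) get "time" 2 */
data match13;
  set dog.match13;
  fndx1 = 2 - fndx;
run;

/* Conditional logistic regression via PROC PHREG, stratified on the matched set */
proc phreg data=match13;
  strata str;
  model fndx1 = chk agp1 / details ties=discrete rl;
run;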
Example
The PHREG Procedure
Analysis of Maximum Likelihood Estimates

                Parameter  Standard                          Hazard  95% Hazard Ratio
Variable  DF    Estimate   Error     Chi-Square  Pr > ChiSq  Ratio   Confidence Limits
chk        1    -1.25719   0.47091   7.1274      0.0076      0.284   0.113    0.716
agp1       1     0.11854   0.05308   4.9878      0.0255      1.126   1.015    1.249
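In this setting the column that PHREG labels "Hazard Ratio" is read as the adjusted (conditional) odds ratio, exp(estimate). The sketch below recomputes the ratio, the 95% Wald limits, and the two-sided p-value from the printed estimates and standard errors; `wald_summary` is an illustrative helper, not part of any SAS or Python library, and small differences in the last digit can arise because the printed inputs are already rounded.

```python
import math
from statistics import NormalDist

def wald_summary(estimate, std_error, level=0.95):
    """Ratio exp(estimate), Wald confidence limits, chi-square and p-value."""
    z_crit = NormalDist().inv_cdf(0.5 + level / 2)
    ratio = math.exp(estimate)
    lower = math.exp(estimate - z_crit * std_error)
    upper = math.exp(estimate + z_crit * std_error)
    z = estimate / std_error
    chi_square = z * z
    # A 1-df chi-square statistic is a squared standard normal deviate.
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return ratio, lower, upper, chi_square, p_value

# Estimates and standard errors copied from the PHREG output.
results = {
    "chk": wald_summary(-1.25719, 0.47091),
    "agp1": wald_summary(0.11854, 0.05308),
}
for name, (ratio, lower, upper, chi_square, p_value) in results.items():
    print(f"{name}: ratio={ratio:.3f}  CI=({lower:.3f}, {upper:.3f})  "
          f"chi2={chi_square:.4f}  p={p_value:.4f}")
```

Read this way, the odds of benign breast disease for women with regular medical checkups are estimated at about 0.28 times those of women without, holding age at first pregnancy fixed, and each additional year of age at first pregnancy multiplies the odds by about 1.13.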
Homework
The data used in this homework come from a study of the effect of exogenous oestrogens on the risk of endometrial cancer, based on cases occurring in a retirement community near Los Angeles, California, from 1971 to 1975. Each case was matched to two control women who were alive and living in the community at the time the case was diagnosed, who were born within one year of the case, who had the same marital status, and who had entered the community at approximately the same time. In addition, controls were chosen from among women who had not had a hysterectomy prior to the time the case was diagnosed, and who were therefore still at risk for the disease.
Information on the history of use of several specific types of medicines, including oestrogens, anti-hypertensives, sedatives and tranquilizers, was abstracted from the medical record of each case and control. Other abstracted data relate to pregnancy history, mention of certain diseases, and obesity. You can download the dataset from my course web site.
The analysis of these data is aimed at studying the risk associated with the use of oestrogens as well as with a history of gall bladder disease. Carry out the analysis and interpret your findings.