Cohort studies: Statistical analysis. Jan Wohlfahrt Department of Epidemiology Research Statens Serum Institut. Contents. A research question and a wrong answer What kind of data is needed How to analyse data Confounder adjustment Poisson regression Cox regression.
Cohort studies: Statistical analysis
Jan Wohlfahrt
Department of Epidemiology Research
Statens Serum Institut
Contents
• A research question and a wrong answer
• What kind of data is needed
• How to analyse data
• Confounder adjustment
• Poisson regression
• Cox regression
Danish Epidemiology Science Centre, Copenhagen, Denmark
1.1. A research question
Does MMR vaccination increase the risk of autistic disorder?
1.1. Material
• All children born 1991 to 1998 (537,000 children).
• Register-based information on MMR vaccination (441,000) and autistic disorder (412 cases).
Data sources:
• Cohort: Danish Civil Registration System
• Information on autism: Danish Psychiatric Central Register
• Information on MMR: Danish National Board of Health
1.2. A wrong answer I
What proportion of the vaccinated and the non-vaccinated children in the cohort had autism before the end of 2000 (end of follow-up)?
Relative risk = 0.079/0.064 = 1.23.
1.2. A wrong answer II
• The simple comparison of proportions is not correct, because:
  • autism may be diagnosed before MMR vaccination
  • there is no age adjustment, and time under risk is not taken into account
• Conclusion: Compare person-time under risk, not the number of persons under risk.
2.1. Information on time
• Time of study entry (date of first birthday)
• Time of status change (date of vaccination)
• Time of outcome (date of autism diagnosis)
• Time of study exit (date of autism diagnosis, death, emigration, disappearance, or end of study)
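The four dates listed above are enough to split one child's follow-up into unvaccinated and vaccinated person-time. The following is a minimal Python sketch of that bookkeeping, not the method used in the study; the function name `split_person_time` and the example dates are illustrative assumptions.

```python
from datetime import date

def split_person_time(entry, exit_, vaccinated=None):
    """Return (unvaccinated_days, vaccinated_days) between entry and exit.

    entry      : date of study entry
    exit_      : date of study exit (outcome, death, emigration, end of study)
    vaccinated : date of vaccination, or None if never vaccinated
    """
    if vaccinated is None or vaccinated >= exit_:
        # Never vaccinated during follow-up: all time is unvaccinated.
        return (exit_ - entry).days, 0
    # Vaccination before entry means all follow-up time is vaccinated.
    switch = max(vaccinated, entry)
    return (switch - entry).days, (exit_ - switch).days

# Illustrative (hypothetical) child: entry, vaccination, then exit.
unvac_days, vac_days = split_person_time(
    entry=date(1995, 9, 11),
    exit_=date(1997, 10, 17),
    vaccinated=date(1997, 4, 4),
)
print(unvac_days, vac_days)
```

Summing these two quantities over all children, separately for unvaccinated and vaccinated time, gives the person-time denominators used in the rate calculations.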
2.2 Datalines
[Figure not reproduced. Annotation in the figure: "Not before end of 2000"]
3.1 Cox vs. Poisson regression
• Poisson regression in large datasets with time-dependent variables
• Cox regression in small datasets
3.6 Rate ratio calculation
(Incidence) rate = number of new autism cases per person-year = cases/pyrs
Rate ratio = RR(+vacn vs -vacn) = rate(+vacn) / rate(-vacn) = 1.40
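As a worked illustration of the two formulas, the sketch below computes incidence rates and their ratio in Python. The counts are hypothetical round numbers chosen for readability; they are not the study data behind the 1.40 reported on the slide.

```python
def incidence_rate(cases, pyrs):
    """Incidence rate: new cases per person-year."""
    return cases / pyrs

def rate_ratio(cases_exp, pyrs_exp, cases_unexp, pyrs_unexp):
    """Rate in the exposed group divided by rate in the unexposed group."""
    return incidence_rate(cases_exp, pyrs_exp) / incidence_rate(cases_unexp, pyrs_unexp)

# Hypothetical counts, for illustration only.
rr = rate_ratio(cases_exp=30, pyrs_exp=100_000,
                cases_unexp=20, pyrs_unexp=100_000)
print(round(rr, 2))
```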
4.3 Person-years by age and period
(9 ages) x (9 periods) x (2 vaccination groups) = 162 groups
e.g.
Age  Period  Vacn  Pyrs   Cases
4    1996    yes   54626  9
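The group count and the rate in the example cell can be checked with a few lines of Python. This is only an arithmetic sketch; the dictionary layout is an assumption made for illustration, and the numbers are those of the example row above.

```python
# Number of strata: ages x periods x vaccination groups.
n_groups = 9 * 9 * 2

# The example cell from the slide.
cell = {"age": 4, "period": 1996, "vacn": True, "pyrs": 54626, "cases": 9}

# Incidence rate in this cell, expressed per 100,000 person-years.
rate_per_100k = cell["cases"] / cell["pyrs"] * 100_000
print(n_groups, round(rate_per_100k, 1))
```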
4.4 Relative rates by age and period
[Table not reproduced. Footnotes: 1) in thousands; 2) per 100,000 years]
5.1 Regression analysis of the rates
log(rate) = const + aI(vacn) + bI(5-9) + cI(96-00)
I(vacn) = 1 if vaccinated, 0 otherwise
I(5-9) = 1 if aged 5-9 years, 0 otherwise
I(96-00) = 1 in period 1996-2000, 0 otherwise
For non-vaccinated children aged 6 in 1997, log(rate) is modelled by: const + b + c.
5.2 Log-linear Poisson regression (I)
log(rate) = log((nr of cases)/pyrs) = log(nr of cases) - log(pyrs)
i.e. log(nr of cases) = log(pyrs) + log(rate)
With log(rate) = const + aI(vacn) + bI(5-9) + cI(96-00), this gives
log(nr of cases) = log(pyrs) + const + aI(vacn) + bI(5-9) + cI(96-00)
5.3 Log-linear Poisson regression (II)
• log(nr of cases) = log(pyrs) + const + aI(vacn) + bI(5-9) + cI(96-00)
• The number of cases is Poisson-distributed.
• The log of the expected number of cases is modelled with a linear function.
• log(pyrs) is considered known for every cell and is called an offset.
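To make the role of the offset concrete, the sketch below writes out the Poisson log-likelihood for a reduced model with only a constant and a vaccination term, log(nr of cases) = log(pyrs) + const + aI(vacn). The two cells are hypothetical, and the function name `poisson_loglik` is an assumption. With one parameter per cell, the maximum-likelihood estimates are simply the crude log rate and the crude log rate ratio, which the last lines check by perturbing each parameter.

```python
import math

def poisson_loglik(cells, const, a):
    """Poisson log-likelihood for cells given as (cases, pyrs, vacn)."""
    total = 0.0
    for cases, pyrs, vacn in cells:
        # log(pyrs) enters with a fixed coefficient of 1: this is the offset.
        log_mu = math.log(pyrs) + const + a * vacn
        total += cases * log_mu - math.exp(log_mu) - math.lgamma(cases + 1)
    return total

# Hypothetical data: (cases, person-years, vaccinated indicator).
cells = [(20, 100_000.0, 0), (30, 100_000.0, 1)]

# Closed-form estimates for this two-cell model.
const_hat = math.log(20 / 100_000.0)                   # log rate, unvaccinated
a_hat = math.log((30 / 100_000.0) / (20 / 100_000.0))  # log rate ratio

best = poisson_loglik(cells, const_hat, a_hat)
# Moving either parameter away from its estimate lowers the log-likelihood.
print(best > poisson_loglik(cells, const_hat, a_hat + 0.1))
print(best > poisson_loglik(cells, const_hat - 0.1, a_hat))
```

In the full model with age and period terms there is no closed form, and the estimates are found iteratively by the fitting procedure.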
5.4 Parameters and rate ratios
log(rate) = k + aI(vacn) + bI(5-9) + cI(96-00)
rate = exp(k + aI(vacn) + bI(5-9) + cI(96-00))
     = exp(k) exp(aI(vacn)) exp(bI(5-9)) exp(cI(96-00))
For children aged 5-9 years in the period 1996-2000:
RR(+vacn vs -vacn) = rate(+vacn) / rate(-vacn)
                   = (exp(k) exp(a) exp(b) exp(c)) / (exp(k) exp(b) exp(c))
                   = exp(a)
5.5 A more complicated model
log(rate) = k + aI(vacn)
          + b1I(1yr) + b2I(2yr) + b3I(3yr) + b4I(4yr) + b5I(5yr) + b6I(6yr) + b7I(7yr) + b8I(8yr)
          + c1I(92-93) + c2I(94) + c3I(95) + c4I(96) + c5I(97) + c6I(98) + c7I(99)
with non-vaccinated as the vaccination reference, age 9 years as the age reference, and period 2000 as the period reference.
5.6 SAS dataset for Poisson regression
data mmrdata;
  input age period vacn cases pyrs;
  logpyrs = log(pyrs);
  datalines;
1 92 0 0 20301.68
1 92 1 0 12027.50
. . . . .
. . . . .
8 00 0 0 9553.12
8 00 1 2 54829.64
9 00 0 1 4844.91
9 00 1 0 26937.23
;
run;
5.7 SAS procedure for Poisson regression
proc genmod data=mmrdata;
  class age period;
  model cases = age period vacn / dist=poisson link=log offset=logpyrs;
run;
5.8 SAS output
Parameter      DF  Estimate  Std Err  ChiSquare  Pr>Chi
INTERCEPT       1  -10.2733   1.0063   104.2281  0.0001
AGE 1           1    0.0720   1.0488     0.0047  0.9453
AGE 2           1    1.6150   1.0137     2.5381  0.1111
AGE 3           1    2.4219   1.0088     5.7637  0.0164
AGE 4           1    2.3435   1.0093     5.3913  0.0202
AGE 5           1    2.0080   1.0118     3.9386  0.0472
AGE 6           1    1.6608   1.0168     2.6679  0.1024
AGE 7           1    1.0864   1.0338     1.1044  0.2933
AGE 8           1    0.4626   1.0966     0.1780  0.6731
AGE 9           0    0.0000   0.0000          .       .
PERIOD 1992     1   -1.4554   0.7289     3.9869  0.0459
PERIOD 1994     1   -0.6997   0.3148     4.9397  0.0262
PERIOD 1995     1   -0.9527   0.2619    13.2350  0.0003
PERIOD 1996     1   -0.6582   0.2033    10.4808  0.0012
PERIOD 1997     1   -0.3866   0.1728     5.0079  0.0252
PERIOD 1998     1    0.0366   0.1478     0.0614  0.8044
PERIOD 1999     1    0.1157   0.1423     0.6614  0.4161
PERIOD 2000     0    0.0000   0.0000          .       .
VACN 1          1   -0.1111   0.1348     0.6791  0.4099
VACN 999        0    0.0000   0.0000          .       .
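The estimates in this output can be turned back into modelled rates by summing the relevant terms and exponentiating. The Python sketch below does this for children aged 4 in period 1996, using the INTERCEPT, AGE 4, PERIOD 1996 and VACN 1 estimates from the table; the dictionary keys and the function name `modelled_rate` are illustrative assumptions. For vaccinated children the modelled rate comes out at roughly 16.7 per 100,000 person-years, and the ratio of the two modelled rates is exp(-0.1111) whichever age and period are chosen.

```python
import math

# Estimates copied from the SAS output above (log-rate scale).
est = {
    "intercept": -10.2733,
    "age4": 2.3435,
    "period1996": -0.6582,
    "vacn": -0.1111,
}

def modelled_rate(vaccinated):
    """Modelled rate per person-year for age 4, period 1996."""
    linear_predictor = est["intercept"] + est["age4"] + est["period1996"]
    if vaccinated:
        linear_predictor += est["vacn"]
    return math.exp(linear_predictor)

rate_vac = modelled_rate(True)
rate_unvac = modelled_rate(False)
print(round(rate_vac * 100_000, 1))      # rate per 100,000 person-years
print(round(rate_vac / rate_unvac, 2))   # rate ratio, vaccinated vs not
```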
5.9 Confidence interval
RR(+vacn vs -vacn) = exp(-0.1111) = 0.89
Confidence interval:
RR_lower = exp(estimate - 1.96 x StdErr)
RR_upper = exp(estimate + 1.96 x StdErr)
RR(+vacn vs -vacn) = 0.89 (0.69-1.17)
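The interval can be reproduced from the VACN estimate (-0.1111) and its standard error (0.1348) in the SAS output. A minimal Python sketch, with `rate_ratio_ci` as an illustrative function name:

```python
import math

def rate_ratio_ci(estimate, std_err, z=1.96):
    """Rate ratio and 95% confidence limits from a log-scale estimate."""
    rr = math.exp(estimate)
    lower = math.exp(estimate - z * std_err)
    upper = math.exp(estimate + z * std_err)
    return rr, lower, upper

rr, lower, upper = rate_ratio_ci(-0.1111, 0.1348)
print(f"{rr:.2f} ({lower:.2f}-{upper:.2f})")  # prints "0.89 (0.69-1.17)"
```

The interval is computed on the log scale and then exponentiated, which is why it is not symmetric around 0.89.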
X.X Time since vaccination
[Figure not reproduced. Values shown in the figure: 0.8 years, 5.9 years]
6.2 Cox regression
log(rate) = k + aI(vacn)
          + b1I(1yr) + b2I(2yr) + b3I(3yr) + b4I(4yr) + b5I(5yr) + b6I(6yr) + b7I(7yr) + b8I(8yr)
          + c1I(92-93) + c2I(94) + c3I(95) + c4I(96) + c5I(97) + c6I(98) + c7I(99)
          + l(age)
6.4 Data for Cox regression
data coxdata;
  input @1 intime date7. @9 vactime date7. @17 auttime date7. @25 othtime date7.;
  datalines;
11sep95 04apr97 17oct97 .
13dec94 .       .       24jan00
27jan90 .       .       .
23jul93 04nov95 01jan98 .
15nov00 .       .       .
15jun97 03apr99 .       15apr01
03may92 .       06nov95 .
.....
;
run;
6.5 Cox SAS program
data coxdata2;
  set coxdata;
  outtime = min(auttime, othtime, "31dec2000"d);
  time = (outtime - intime);
  if auttime = outtime then status = 1;
  else status = 0;
run;

proc phreg data=coxdata2;
  model time*status(0) = vacn;
  if (vactime = . or time < (vactime - intime)) then vacn = 0;
  else vacn = 1;
run;
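The two steps of the SAS program can be mirrored in Python to show what each derived variable means: the data step computes follow-up time and event status per child, and the statements inside the procedure evaluate the vaccination indicator at a given time since entry, so that it can change during follow-up. This is a sketch of the logic only, not a Cox fit; the function names `follow_up` and `vacn_at` are illustrative assumptions, and `None` plays the role of a SAS missing value.

```python
from datetime import date

END_OF_STUDY = date(2000, 12, 31)

def follow_up(intime, auttime=None, othtime=None, end=END_OF_STUDY):
    """Return (time in days, status): status is 1 if follow-up ends with autism."""
    # Earliest of autism date, other exit date and end of study, skipping missing.
    outtime = min(d for d in (auttime, othtime, end) if d is not None)
    time = (outtime - intime).days
    status = 1 if auttime == outtime else 0
    return time, status

def vacn_at(t, intime, vactime=None):
    """Vaccination indicator at t days after entry (time-dependent covariate)."""
    if vactime is None or t < (vactime - intime).days:
        return 0
    return 1

# First two data lines from the example dataset.
print(follow_up(date(1995, 9, 11), auttime=date(1997, 10, 17)))
print(follow_up(date(1994, 12, 13), othtime=date(2000, 1, 24)))
# The same child is unvaccinated early in follow-up and vaccinated later.
print(vacn_at(100, date(1995, 9, 11), date(1997, 4, 4)),
      vacn_at(600, date(1995, 9, 11), date(1997, 4, 4)))
```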