Designing longitudinal studies in epidemiology

1. Designing longitudinal studies in epidemiology. Donna Spiegelman, Professor of Epidemiologic Methods, Departments of Epidemiology and Biostatistics, stdls@channing.harvard.edu. Xavier Basagana, Doctoral Student, Department of Biostatistics, Harvard School of Public Health.

2. Background. We develop methods for the design of longitudinal studies for the most common scenarios in epidemiology. Some formulas for power and sample size calculations already exist in this context, but all prior work has been developed for clinical trials applications.

3. Background. Prior work is based on clinical trials: some methods rely on test statistics that are not valid, or are less efficient, in an observational context (e.g. ANCOVA).

7. Notation and Preliminary Results

8. We study two alternative hypotheses: constant mean difference (CMD) and linearly divergent difference (LDD).

10. Intuitive parameterization of the alternative hypothesis. μ00: the mean response at baseline (or at the mean initial time) in the unexposed group. p1: the percent difference between exposed and unexposed groups at baseline (or at the mean initial time).

11. Intuitive parameterization of the alternative hypothesis (2). p2: the percent change from baseline (or from the mean initial time) to end of follow-up (or to the mean final time) in the unexposed group. When τ is not fixed, p2 is defined at time s instead of at time τ. p3: the percent difference between the change from baseline (or from the mean initial time) to end of follow-up (or mean final time) in the exposed group and the unexposed group. When p2 = 0, p3 will be defined as the percent change from baseline (or from the mean initial time) to the end of follow-up (or to the mean final time) in the exposed group.

12. We consider studies where the interval between visits, s, is fixed but the duration of the study is free (e.g. participants may respond to questionnaires every two years); increasing r, the number of post-baseline measurements, then involves increasing the duration of the study. We also consider studies where the duration of the study, τ, is fixed but the interval between visits is free (e.g. the study is 5 years long); increasing r then involves increasing the frequency of the measurements, i.e. decreasing s. In both cases τ = s r.
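The two design types can be illustrated with a minimal sketch. It assumes a baseline visit at time 0 and equally spaced visits; the function names are hypothetical and are not taken from optitxs.r.

```python
def visit_times_fixed_interval(s, r):
    """Visit times when the interval s is fixed: the duration s * r grows with r."""
    return [j * s for j in range(r + 1)]


def visit_times_fixed_duration(tau, r):
    """Visit times when the duration tau is fixed: the interval tau / r shrinks with r."""
    if r < 1:
        raise ValueError("r must be at least 1 post-baseline measurement")
    s = tau / r
    return [j * s for j in range(r + 1)]


# Questionnaires every two years, three post-baseline measurements.
biennial = visit_times_fixed_interval(2, 3)

# A 5-year study with two post-baseline measurements.
five_year = visit_times_fixed_duration(5, 2)
```

In both functions the last visit falls at τ = s r, so the two designs differ only in which of s and τ is held constant as r changes.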

13. Notation & Preliminary Results: the model, the generalized least squares (GLS) estimator of B, and the power formula.
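For orientation, the textbook forms of a GLS estimator and of an approximate Wald-test power are written out below. The slide's own notation did not survive in this transcript, so every symbol other than B is an assumption: Y_i is the response vector of participant i, X_i its design matrix, Σ_i its residual covariance matrix (taken as known), N the number of participants, and β the single coefficient of B under test.

```latex
% Textbook GLS estimator and its variance (notation assumed, not the slide's)
\hat{B} = \Bigl(\sum_{i=1}^{N} X_i^{\top}\Sigma_i^{-1}X_i\Bigr)^{-1}
          \sum_{i=1}^{N} X_i^{\top}\Sigma_i^{-1}Y_i,
\qquad
\operatorname{Var}(\hat{B}) = \Bigl(\sum_{i=1}^{N} X_i^{\top}\Sigma_i^{-1}X_i\Bigr)^{-1}

% Approximate power of a two-sided Wald test of H_0: \beta = 0 at level \alpha,
% ignoring the negligible rejection probability in the opposite tail
\text{Power} \approx \Phi\!\left(
    \frac{|\beta|}{\sqrt{\operatorname{Var}(\hat{\beta})}} - z_{1-\alpha/2}
\right)
```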

14. Let σ^{lm} be the (l,m)th element of Σ^{-1}, and assume that the time distribution is independent of exposure group. The resulting expressions differ under CMD and under LDD.
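The slide's expressions are not recoverable here, but a simplified special case can be sketched. The sketch assumes a time-invariant binary exposure with prevalence p_e, identical visit times for every participant (so the variance of the initial time is 0), a known compound symmetry covariance with variance sigma2 and correlation rho, and the usual normal approximation to the Wald test. Under those assumptions the GLS variance of the exposure effect under CMD depends on the sum of the elements of the inverse covariance matrix, and under LDD the variance of the exposure-by-time interaction depends on the spread of the visit times. All function and parameter names are hypothetical.

```python
from statistics import NormalDist


def wald_power(effect, variance, alpha=0.05):
    """Approximate power of a two-sided Wald test (opposite tail ignored)."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return NormalDist().cdf(abs(effect) / variance ** 0.5 - z)


def var_cmd_cs(n, p_e, sigma2, rho, r):
    """Variance of the estimated exposure effect under CMD with CS covariance.

    With r + 1 measurements, the elements of the inverse CS matrix sum to
    (r + 1) / (sigma2 * (1 + r * rho)).
    """
    return sigma2 * (1 + r * rho) / (n * p_e * (1 - p_e) * (r + 1))


def var_ldd_cs(n, p_e, sigma2, rho, times):
    """Variance of the estimated exposure-by-time interaction under LDD with CS."""
    t_bar = sum(times) / len(times)
    ss_t = sum((t - t_bar) ** 2 for t in times)
    return sigma2 * (1 - rho) / (n * p_e * (1 - p_e) * ss_t)


# Hypothetical inputs, chosen only to exercise the functions.
power_cmd = wald_power(0.5, var_cmd_cs(n=500, p_e=0.3, sigma2=16.0, rho=0.6, r=2))
power_ldd = wald_power(0.1, var_ldd_cs(n=500, p_e=0.3, sigma2=16.0, rho=0.6, times=[0, 2, 4]))
```

In this special case a larger rho lowers power under CMD and raises it under LDD, which is one reason the two hypotheses lead to different design advice.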

15. We consider three common correlation structures: Compound symmetry (CS).

16. Damped Exponential (DEX)

17. Random intercepts and slopes (RS). Reparameterizing, one parameter is the reliability coefficient at baseline and another is the slope reliability at the end of follow-up (slope reliability = 0 is CS; slope reliability = 1 means all variation in slopes is between subjects). With this correlation structure, the variance of the response changes with time, i.e. it gives a heteroscedastic model.
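The three structures can be written down as covariance matrices in a short sketch. The parameterizations below are common textbook forms and are assumptions, not necessarily the ones used on the slides: DEX takes the correlation between two measurements to be rho raised to the power |t_j - t_k|^theta, with the lag in whatever time unit the caller supplies, and RS is built from a random intercept, a random slope, and independent residual error. All names are hypothetical.

```python
def cs_cov(times, sigma2, rho):
    """Compound symmetry: constant variance, the same correlation for every pair."""
    n = len(times)
    return [[sigma2 if j == k else sigma2 * rho for k in range(n)] for j in range(n)]


def dex_cov(times, sigma2, rho, theta):
    """Damped exponential: correlation rho ** (|lag| ** theta).

    theta = 0 reduces to CS and theta = 1 to AR(1). The diagonal is set
    explicitly because 0 ** 0 evaluates to 1 in Python.
    """
    n = len(times)
    return [
        [
            sigma2 if j == k else sigma2 * rho ** (abs(times[j] - times[k]) ** theta)
            for k in range(n)
        ]
        for j in range(n)
    ]


def rs_cov(times, var_int, var_slope, cov_int_slope, var_resid):
    """Random intercepts and slopes plus independent residual error."""
    n = len(times)
    return [
        [
            var_int
            + (times[j] + times[k]) * cov_int_slope
            + times[j] * times[k] * var_slope
            + (var_resid if j == k else 0.0)
            for k in range(n)
        ]
        for j in range(n)
    ]


# Under RS the diagonal (the variance at each visit) changes with time.
rs_example = rs_cov([0, 2, 4], var_int=1.0, var_slope=0.25, cov_int_slope=0.0, var_resid=0.5)
variances_over_time = [rs_example[j][j] for j in range(3)]
```

Setting var_slope and cov_int_slope to 0 in rs_cov gives back a compound symmetry matrix, which matches the slide's remark that a slope reliability of 0 corresponds to CS.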

18. Goal is to investigate the effect of indicators of socioeconomic status and post-menopausal hormone use on cognitive function (CMD) and cognitive decline (LDD). "Pilot study" by Lee S, Kawachi I, Berkman LF, Grodstein F ("Education, other socioeconomic indicators, and cognitive function." Am J Epidemiol 2003; 157: 712-720), denoted here as Grodstein. Design questions include: the power of the published study to detect effects of specified magnitude; the number and timing of additional tests needed to obtain a study with the desired power to detect effects of specified magnitude; and the optimal number of participants and measurements needed in a de novo study of these issues.

19. At baseline and at one time subsequently, six cognitive tests were administered to 15,654 participants in the Nurses' Health Study. Outcome: Telephone Interview for Cognitive Status (TICS), μ00 = 32.7 (4). The model implies a change of 1 point per 10 years of age.

20. Exposure: Graduate school degree vs. not (GRAD); Corr(GRAD, age) = -0.01. Exposure: Post-menopausal hormone use (CURRHORM); Corr(CURRHORM, age) = -0.06. Time: age (years) is the best choice, not questionnaire cycle or calendar year of test. The mean age was 74 and V(t0) ≈ 4.

21. Estimated covariance parameters.

SAS code to fit the LDD model with CS covariance (gradage is presumably the grad-by-age interaction, created in an earlier data step):
proc mixed; class id; model tics = grad age gradage / s; random id; run;

SAS code to fit the LDD model with RS covariance:
proc mixed; class id; model tics = grad age gradage / s ddfm=bw; random intercept age / type=un subject=id; run;

22. Program optitxs.r makes it all possible

35. Illustration of use of software optitxs.r. We'll calculate the power of Grodstein's published study to detect the observed 70% difference in rates of decline between those with more than high school vs. others. Recall that 6.2% of NHS had more than high school; there was a 0.3% decline in cognitive function per year.

38. Power of current study. To detect the observed 70% difference in cognitive decline by GRAD: CS 44%, RS 35%, DEX 42%. To detect a hypothesized -10% difference in cognitive decline by current hormone use: CS & DEX 7%, RS 6%.

39. How many additional measurements are needed when tests are administered every 2 years, i.e. how many more years of follow-up are needed? To detect the observed 70% difference in cognitive decline by GRAD with 90% power: CS, DEX, RS: 3 post-baseline measurements (τ = 6 years of follow-up), one more 5-year grant cycle. To detect a hypothesized -20% difference in cognitive decline by current hormone use with 90% power: CS, DEX: 6 post-baseline measurements (τ = 12 years of follow-up), more than two 5-year grant cycles. N = 15,000 for these calculations.

40. How many more measurements should be taken in four years (one NIH grant cycle) and eight years (two NIH grant cycles) of follow-up: to detect the observed 70% difference in cognitive decline by GRAD with 90% power? To detect a hypothesized -20% difference in cognitive decline by current hormone use with 90% power?

41. Optimize (N, r) in a new study of cognitive decline. Assume 4 years of follow-up (1 NIH grant cycle); the cost of recruitment and baseline measurements is twice that of subsequent measurements. GRAD: (N, r) = (26,795; 1) for CS, (26,930; 1) for DEX, (28,945; 1) for RS. CURRHORM: (N, r) = (97,662; 1) for CS, (98,155; 1) for DEX, (105,470; 1) for RS.
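The kind of search behind this question can be sketched as follows. This is not optitxs.r and does not reproduce the slide's numbers: it assumes a compound symmetry covariance, identical visit times for everyone, a time-invariant binary exposure, and a normal-approximation Wald test. The covariance inputs in the example are hypothetical placeholders, since the study's estimated covariance parameters are not available in this transcript.

```python
from statistics import NormalDist


def ldd_power_cs(n, p_e, sigma2, rho, slope_diff, s, r, alpha=0.05):
    """Approximate power to detect a slope difference with visits every s time units."""
    times = [j * s for j in range(r + 1)]
    t_bar = sum(times) / len(times)
    ss_t = sum((t - t_bar) ** 2 for t in times)
    variance = sigma2 * (1 - rho) / (n * p_e * (1 - p_e) * ss_t)
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return NormalDist().cdf(abs(slope_diff) / variance ** 0.5 - z)


def min_measurements(target, max_r=50, **design):
    """Smallest number of post-baseline measurements r reaching the target power."""
    for r in range(1, max_r + 1):
        if ldd_power_cs(r=r, **design) >= target:
            return r
    return None  # target not reachable within max_r measurements


# Hypothetical inputs: sigma2, rho and slope_diff are placeholders, not study estimates.
example_design = dict(n=15000, p_e=0.062, sigma2=16.0, rho=0.5, slope_diff=0.07, s=2)
r_needed = min_measurements(0.9, **example_design)
years_needed = None if r_needed is None else r_needed * example_design["s"]
```

Because the interval s is fixed, each extra measurement also lengthens follow-up, so the answer can be read either as a number of measurements or as years of follow-up (r times s).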
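A simplified version of this optimization is sketched below. It assumes a compound symmetry covariance, identical equally spaced visit times over a fixed duration tau, and a cost model in which a participant's baseline visit costs kappa times a follow-up visit, so that total cost is proportional to N (kappa + r). The function names, the cost model, and the example inputs are assumptions made for illustration; the sketch is not the optitxs.r algorithm and is not expected to reproduce the N values on the slide.

```python
from math import ceil
from statistics import NormalDist


def required_n_ldd_cs(p_e, sigma2, rho, slope_diff, tau, r, power=0.9, alpha=0.05):
    """Participants needed to detect slope_diff with r follow-up visits over tau."""
    s = tau / r
    times = [j * s for j in range(r + 1)]
    t_bar = sum(times) / len(times)
    ss_t = sum((t - t_bar) ** 2 for t in times)
    z_sum = NormalDist().inv_cdf(1 - alpha / 2) + NormalDist().inv_cdf(power)
    n = z_sum ** 2 * sigma2 * (1 - rho) / (p_e * (1 - p_e) * slope_diff ** 2 * ss_t)
    return ceil(n)


def cheapest_design(kappa, max_r=20, **design):
    """Return (N, r, cost) minimizing N * (kappa + r) at the required power.

    Cost is expressed in units of one follow-up measurement; ties go to the
    smaller r because only strictly cheaper designs replace the current best.
    """
    best = None
    for r in range(1, max_r + 1):
        n = required_n_ldd_cs(r=r, **design)
        cost = n * (kappa + r)
        if best is None or cost < best[2]:
            best = (n, r, cost)
    return best


# Hypothetical inputs; kappa = 2 mirrors "baseline costs twice a follow-up visit".
example_design = dict(p_e=0.062, sigma2=16.0, rho=0.5, slope_diff=0.07, tau=4)
n_opt, r_opt, cost_opt = cheapest_design(2, **example_design)
```

Under these assumptions, adding visits inside a fixed follow-up window increases the spread of the visit times only slowly, so extra visits pay off only when follow-up measurements are much cheaper than recruitment.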

42. Conclusions

43. CMD: If all observations have the same cost, one would not take repeated measures. If subsequent measures are cheaper, one would take no repeated measures, or just a small number, if the correlation between measures is large. If deviations from CS exist, it is advisable to take more repeated measures. Power also varies with the correlation parameters, and it increases as Var(t0) goes to 0.

44. LDD: If the follow-up period is not fixed, choose the maximum length of follow-up possible (except when RS is assumed). If the follow-up period is fixed, one would take more than one repeated measure only when the subsequent measures are more than five times cheaper. When there are departures from CS, values of the cost ratio κ around 10 or 20 are needed to justify taking 3 or 4 measures. Power also varies with the correlation parameters; it increases as slope reliability goes to 0, as Var(t0) increases, and as the correlation between t0 and exposure goes to 0.

45. LDD: The optimal (N,r) and the resulting power can strongly depend on the correlation structure. Combinations that are optimal for one correlation structure may be bad for another. All these decisions are based on power considerations alone; there might be other reasons to take repeated measures. Sensitivity analysis. Our program.
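The first two CMD statements can be illustrated with a small sketch. It assumes a compound symmetry covariance with correlation rho, identical visit schedules for everyone, and a cost model in which the baseline measurement costs kappa times a follow-up measurement. Under those assumptions the number of participants needed for a fixed power is proportional to (1 + r rho) / (r + 1), and the total cost to N (kappa + r). The names and the cost model are assumptions, not the authors' formulation.

```python
def relative_cost_cmd_cs(kappa, rho, r):
    """Cost, up to a constant factor, of reaching a fixed power under CMD with CS."""
    return (kappa + r) * (1 + r * rho) / (r + 1)


def best_r_cmd_cs(kappa, rho, max_r=20):
    """Number of repeated measures r in 0..max_r with the lowest relative cost."""
    return min(range(max_r + 1), key=lambda r: relative_cost_cmd_cs(kappa, rho, r))


# Equal costs (kappa = 1): the relative cost is 1 + r * rho, lowest at r = 0.
r_equal_cost = best_r_cmd_cs(kappa=1, rho=0.5)

# Cheaper follow-up (kappa = 5) with high or low correlation between measures.
r_high_corr = best_r_cmd_cs(kappa=5, rho=0.8)
r_low_corr = best_r_cmd_cs(kappa=5, rho=0.2)
```

With equal costs the sketch always prefers no repeated measures. With cheaper follow-up it still prefers none when the correlation is high, and only a few when the correlation is low.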

47. Thanks!
