Design and Analysis of Clinical Study 10. Cohort Study

Design and Analysis of Clinical Study 10. Cohort Study Dr. Tuan V. Nguyen Garvan Institute of Medical Research Sydney, Australia

Uses of Cohort Study • Identification of risk factors (or prognostic factors)? • Uses of risk factors (or prognostic factors) • Relation of risk measures to risk factors • “It is estimated that smoking is responsible for 100,000 deaths from lung cancer annually”.

Risk • Risk = • Prospective chance (probability) • Rate of occurrence (incidence) of a health- related event • Measures of risk • Incidence density (including mortality rate) • Cumulative incidence(including "attack rate") • Case-fatality rate • Survival rate

Criteria to be Fulfilled in Cohort Studies • Observation must take place over a meaningful period of time • All members of the cohort must be observed. Drop-outs distort the study.

Types of Cohort Present Future Past Concurrent Cohort Followed into future Assembled now Assembled from past records Historical Cohort Followed till now Exposure and outcome Retrospective Cohort

Analysis of Cohort Studies Exposed Time Diseased (n=39) Healthy n = 30 000 Not exposed Diseased (n=6) Healthy n = 60 000

Modifiable risk factors Non-modifiable risk factors -Medication: Corticosteroids - Bone-related factors: BMD, bone strength indice… - Fall and fall-related factors - Prior fracture - Lifestyle: smoking, alcohol - Advancing age - Family history - Genetics Fracture Intervention strategies Identify high-risk group

Prospective Cohort Study

Prospective Cohort Study • Basal cohort(s) • Sampling from defined population, or • Stratified assembly, or • Matched assembly • Observation for defined period under specified observational protocol • Time of data collection: prospective vs. Retrospective cohort studies

Prospective Cohort Study

Factors in Prospective cohort study • Event (e.g. disease) • Person at risk, population at risk • Person-years

Population at risk (N=200)        

Week 1         O O

Week 2 O         O O O O

Week 3 O         O O O O O O

Person-time • Person-time = # persons x duration 1 o 2 x 2 4 3 4 4 8 5 2 0 2 4 6 8 Time (week) Incidence rate (IR). During (2+4+4+8+2)=20 person-years, there were 2 incident cases: IR = 2/20 = 0.1

Estimation of Incidence Rates • Consider a study where P patient-years have been followed and N cases (eg deaths, survivors, diseased, etc.) were recorded. • Assumption: Poisson distribution. • The estimate of incidence rate is: I = N / P • Standard error of I is: • 95% confidence interval of “true” incidence rate: I+ 1.96 x SD(I)

Relative Risk • Relative risk (RR): Incidence rate of ischemic heart disease (IHD) <2750 kcal >2750 kcal ______________________________________________________________ Person-years 1858 2769 New cases 28 17 ______________________________________________________________ Estimate rate 15.1 6.1 SD of est. rate 2.8 1.5 • L = log(RR) = 0.908 • Standard error of log(RR) • 95% of L: L ±1.96xSE = 0.908 ± 1.96x0.3075 = 0.3055, 1.51 • 95% of RR: = exp(0.3055), exp(1.51) = 1.36, 4.53

Analysis of Difference in Incidence Rates • Difference: D = 15.1 – 6.1 = 8.93 Incidence rate of ischemic heart disease (IHD) <2750 kcal >2750 kcal ______________________________________________________________ Person-years 1858 2769 New cases 28 17 ______________________________________________________________ Estimate rate 15.1 6.1 SD of est. rate 2.8 1.5 • Standard error (SE) of D • 95% of D = D ±1.96xSE = 8.93 ± 1.96x0.032 = 3.65, 14.2

Logistic Regression Analysis using R fracture <- read.table(“fracture.txt”, header=TRUE, na.string=”.”) attach(fulldata) results <- glm(fx ~ bmd, family=”binomial”) summary(results) Deviance Residuals: Min 1Q Median 3Q Max -1.0287 -0.8242 -0.7020 1.3780 2.0709 Coefficients: Estimate Std. Error z value Pr(>|z|) (Intercept) 1.063 1.342 0.792 0.428 bmd -2.270 1.455 -1.560 0.119 (Dispersion parameter for binomial family taken to be 1) Null deviance: 157.81 on 136 degrees of freedom Residual deviance: 155.27 on 135 degrees of freedom AIC: 159.27

Incidence Density Number of new casesID = ––––––––––––––––––– Population time Number of new cases 7ID = ––––––––––––––––––– = –––––– Population time ?

Incidence Rate • Population time at risk: • 200 people for 3 weeks = 600 person-wks • But 2 people became cases in 1st week • 3 people became cases in 2nd week • 2 people became cases in 3rd week • Only 193 people at risk for 3 weeks

Incidence Rate • Population-time: • 2 people who became cases in 1st week were at risk for 0.5 weeks each = 2 @ 0.5 = 1.0 • 3 people who became cases in 2nd week were at risk for 1.5 weeks each = 3 @ 1.5 = 4.5 • 2 people who became cases in 3rd week were at risk for 2.5 weeks each = 2 @ 2.5 = 5.0 • Non-cases = 193 @ 3 = 579 • TOTAL POPULATION – TIME = 589.5 Person-weeks • Incidence rate: 7 ID = –––––– = 0.0119 cases / person-wk 589.5 average over 3 weeks

Incidence Proportion Number of new casesCI = ––––––––––––––––––– Population at risk 73-week CI = –––– = 0.035 200

Summary of Cohort Study’s Results Relative risk (RR) = I1 / I2

Person-time • Person-time = # persons x duration 1 2 2 4 3 4 4 8 5 2 0 2 4 6 8 Time Incidence rate (IR). During (2+4+4+8+2)=20 person-years, there were 2 incident cases: IR = 2/20 = 0.1

Incidence

Estimation of Incidence Rates • Consider a study where P patient-years have been followed and N cases (eg deaths, survivors, diseased, etc.) were recorded. • Assumption: Poisson distribution. • The estimate of incidence rate is: I = N / P • Standard error of I is: • 95% confidence interval of “true” incidence rate: I+ 1.96 x SD(I)

Relative Risk • Relative risk (RR): Incidence rate of ischemic heart disease (IHD) <2750 kcal >2750 kcal ______________________________________________________________ Person-years 1858 2769 New cases 28 17 ______________________________________________________________ Estimate rate 15.1 6.1 SD of est. rate 2.8 1.5 • L = log(RR) = 0.908 • Standard error of log(RR) • 95% of L: L ±1.96xSE = 0.908 ± 1.96x0.3075 = 0.3055, 1.51 • 95% of RR: = exp(0.3055), exp(1.51) = 1.36, 4.53

Analysis of Difference in Incidence Rates • Difference: D = 15.1 – 6.1 = 8.93 Incidence rate of ischemic heart disease (IHD) <2750 kcal >2750 kcal ______________________________________________________________ Person-years 1858 2769 New cases 28 17 ______________________________________________________________ Estimate rate 15.1 6.1 SD of est. rate 2.8 1.5 • Standard error (SE) of D • 95% of D = D ±1.96xSE = 8.93 ± 1.96x0.032 = 3.65, 14.2

Logistic Regression Analysis using R fracture <- read.table(“fracture.txt”, header=TRUE, na.string=”.”) attach(fulldata) results <- glm(fx ~ bmd, family=”binomial”) summary(results) Deviance Residuals: Min 1Q Median 3Q Max -1.0287 -0.8242 -0.7020 1.3780 2.0709 Coefficients: Estimate Std. Error z value Pr(>|z|) (Intercept) 1.063 1.342 0.792 0.428 bmd -2.270 1.455 -1.560 0.119 (Dispersion parameter for binomial family taken to be 1) Null deviance: 157.81 on 136 degrees of freedom Residual deviance: 155.27 on 135 degrees of freedom AIC: 159.27

Dubbo Osteoporosis Epidemiology Study • 1989 – 1993: • Recruit 3000 individuals • Measure bone mineral density (BMD) • Classified BMD into normal and osteoporosis • 1989 – 2005: • Record the number of fractures • Analysis of association between BMD and fracture

Dubbo Osteoporosis Epidemiology Study 1287women Low BMD 345 (27%) Not Low BMD 942 (73%) Fx = 137 (40%) No Fx = 208 (60%) Fx = 191 (20%) No Fx = 751 (80%) 42%

Direct calculation of incidence Time sequence can be established Different outcomes for one agent can be determined Advantages and Disadvantages of Cohort Studies Disadvantages Advantages • Large numbers to be measured over a long time • Subclinical disease may escape diagnosis

Design and Analysis of Clinical Study 10. Cohort Study

Design and Analysis of Clinical Study 10. Cohort Study

Presentation Transcript

Design and Analysis of Clinical Study 12. Randomized Clinical Trials

Cohort Study

Design and Analysis of Clinical Study 8. Cross-sectional Study

Design and Analysis of Clinical Study 6. Case-control Study

COHORT STUDY

Cohort study

Retrospective Cohort Study and Prospective Cohort Study

Cohort Study

Cohort Study

Cohort Study

Design and Analysis of Clinical Study 11. Analysis of Cohort Study

Design and Analysis of Clinical Study 12. Meta-analysis

Cohort study

Clinical Study Design

Design and Analysis of Clinical Study 9. Analysis of Cross-sectional Study

Cohort Study

Analysis and Presentation of Cohort Study Data

Design and Analysis of Clinical Study 7. Analysis of Case-control Study

COHORT STUDY

Cohort Study Analysis

Retrospective Cohort Study and Prospective Cohort Study