Main Points to be Covered • Cumulative incidence using life table method • Difference between cumulative incidence based on proportion of persons at risk and incidence rate based on person-time • Calculating person-time incidence rates • Uses of person-time incidence rates • Relation of prevalence to incidence • Odds versus probability
Two assumptions in survival analysis • Censoring is unrelated to survival (unrelated to the probability of experiencing the outcome) • There are no temporal trends in the probability of the outcome
Long-Term Survival Data May Be Invalid Due to Temporal Trends • Paper in current Lancet analyzes data from the National Cancer Institute’s follow-up of diagnoses 1978–1998 • Overall survival, cohort method = 40% • Overall survival with period analysis allowing for changes in survival over time = 51% • Brenner, The Lancet, Oct 12, 2002
Cumulative incidence: Life table • No exact times of events or censoring needed • Assume events and censoring occurred uniformly during the fixed time intervals (uniformity assumption) • Therefore assume on average each censored person at risk for half of the time period • Subtract one-half of subjects lost during interval from denominator at interval beginning • Calculations just like Kaplan-Meier
Life Table vs. Kaplan-Meier
Life table:
• Time interval of fixed length
• Time intervals usually the same (not required)
• Assume uniform timing of censoring
• Calculate probability of surviving the interval
• Cumulative incidence = (1 – product of interval survival probabilities)
Kaplan-Meier:
• Time interval based on time of events
• Time intervals vary (not required)
• No assumption required about timing of censoring
• Calculate probability of surviving the interval
• Cumulative incidence = (1 – product of interval survival probabilities)
Life Table: Example in Text • Szklo and Nieto use the same example of 10 observations to illustrate life table and KM • Life table uniformity assumption not valid • Life table more commonly used on large secondary data sets where exact failure times are not known • With very large numbers the uniformity assumption is more likely to be valid
Life Table: Primary Biliary Cirrhosis Survival Data
. ltable time d
Interval   Total   Deaths   Lost   Survival   SE       [95% Conf. Int.]
------------------------------------------------------------------------------
0 - 1      184     18       7      0.9003     0.0223   0.8464   0.9360
1 - 2      159     19       9      0.7896     0.0308   0.7214   0.8429
2 - 3      131     13       9      0.7084     0.0349   0.6337   0.7707
3 - 4      109     9        12     0.6465     0.0375   0.5679   0.7145
4 - 5      88      14       7      0.5394     0.0407   0.4563   0.6153
5 - 6      67      7        14     0.4765     0.0424   0.3915   0.5565
6 - 7      46      10       12     0.3574     0.0455   0.2694   0.4461
7 - 8      24      3        4      0.3086     0.0472   0.2193   0.4022
Calculating a Life Table
Interval   Total   D    Lost   N at risk             P(event)              Cumulative survival
0-1        184     18   7      184 – 7/2 = 180.5     18/180.5 = 0.0997     0.9003
1-2        159     19   9      159 – 9/2 = 154.5     19/154.5 = 0.1230     0.7896
2-3        131     13   9      131 – 9/2 = 126.5     13/126.5 = 0.1028     0.7084
• Subtract ½ of lost during interval from denominator
• Cumulative survival = previous cumulative survival x (1 – P(event)), eg, 0.9003 x (1 – 0.1230) = 0.7896
• Repeat for next interval and so forth
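The arithmetic above can be sketched in a few lines of Python. This is a minimal illustration, not the ltable implementation: the function name and data layout are assumptions, and the counts are copied from the primary biliary cirrhosis table. It applies the half-of-lost adjustment to each interval and multiplies the interval survival probabilities.

```python
# Actuarial (life table) survival from grouped counts.
# Each tuple is (total at start of interval, deaths, lost during interval),
# copied from the primary biliary cirrhosis life table.
intervals = [
    (184, 18, 7),
    (159, 19, 9),
    (131, 13, 9),
    (109, 9, 12),
    (88, 14, 7),
    (67, 7, 14),
    (46, 10, 12),
    (24, 3, 4),
]

def life_table_survival(rows):
    """Return cumulative survival at the end of each interval."""
    survival = []
    cumulative = 1.0
    for total, deaths, lost in rows:
        at_risk = total - lost / 2      # uniformity assumption: lost at risk half the interval
        p_event = deaths / at_risk      # conditional probability of the event
        cumulative *= 1 - p_event       # product of interval survival probabilities
        survival.append(cumulative)
    return survival

survival = life_table_survival(intervals)
for year, s in enumerate(survival):
    print(f"{year}-{year + 1}: survival = {s:.4f}, cumulative incidence = {1 - s:.4f}")
```

The printed survival values should agree with the Survival column of the life table to four decimal places.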
Life table estimates:
Interval   Total   Deaths   Lost   Survival   SE       [95% Conf. Int.]
0 - 1      184     18       7      0.9003     0.0223   0.8464   0.9360
1 - 2      159     19       9      0.7896     0.0308   0.7214   0.8429
2 - 3      131     13       9      0.7084     0.0349   0.6337   0.7707
3 - 4      109     9        12     0.6465     0.0375   0.5679   0.7145
4 - 5      88      14       7      0.5394     0.0407   0.4563   0.6153
Kaplan-Meier estimates at nearby times:
Time       Total   Fail     Lost   Survival Function   SE       [95% Conf. Int.]
1.038      159     0        1      0.9011              0.0221   0.8475   0.9365
2.034      131     1        0      0.7846              0.0310   0.7162   0.8385
3.001      109     1        0      0.7025              0.0352   0.6273   0.7653
4.104      88      1        0      0.6397              0.0378   0.5605   0.7083
5.15       67      1        0      0.5327              0.0409   0.4495   0.6090
The Three Elements in Measures of Disease Incidence • E = an event = a disease diagnosis or death • N = number of persons in the population in which the events are observed • T = time period during which the events are observed
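Ways to combine the three elements: • E = count of events • E/N = events per person • E/T = events per unit of time • E/NT = events per unit of person-time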
Two Measures Described as Incidence in the Text • The proportion of individuals who experience the event in a defined time period (E/N during some time T) = cumulative incidence • The number of events divided by the amount of person-time observed (E/NT) = incidence rate or density (not a proportion)
Person-Time Incidence Rates • The numerator is the same as incidence based on proportion of persons = events (E) • The denominator is the sum of the follow-up times for each individual • The resulting ratio of E/NT is not a proportion; may be greater than 1; value depends on unit of time used
Rates: • year 1 = 3/7.083 = 42.4/100 person-years • year 2 = 3/2.50 = 120/100 person-years • both years = 6/9.583 = 62.6/100 person-years
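As a quick check of these figures, the sketch below divides events by person-years and scales to 100 person-years. The helper name is an assumption made for illustration; the event counts and person-years are the ones quoted above.

```python
def rate_per_100_py(events, person_years):
    """Person-time incidence rate, expressed per 100 person-years."""
    return 100 * events / person_years

year1 = rate_per_100_py(3, 7.083)
year2 = rate_per_100_py(3, 2.50)
both_years = rate_per_100_py(6, 7.083 + 2.50)  # events and person-time both add up

print(f"year 1:     {year1:.1f} per 100 person-years")
print(f"year 2:     {year2:.1f} per 100 person-years")
print(f"both years: {both_years:.1f} per 100 person-years")
```

Note that the two-year rate is not the average of the yearly rates; it is total events over total person-time, so the year with more person-time carries more weight.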
We have been calculating average rates; a rate is often defined as the instantaneous change in one measure with respect to a second measure as the interval approaches 0 • [Figure: death rate shown on a plot of population size against time] • In disease, the occurrence rate is often called a hazard or the force of morbidity (mortality)
Rates • We are used to rates being change in a measure with respect to time but time does not have to be involved • Accidents per passenger-mile, for example, is often used in transportation • Economics often uses rates in which time is not an element (eg, energy use per unit of gross national product)
Comparison of cumulative incidence and incidence rate (density) • Kaplan-Meier cumulative incidence estimate for these data was (1 - 0.18) = 0.82 (ie, 82% of persons will experience event in a two-year period) • Two-year incidence density is 62.6 / 100 person-years or 0.626 per person-year • Not a proportion--if calculated per person-days, rate would be 0.17 / 100 person-days
Incidence rate (density) value depends on the time units used An incidence rate of 100 cases per 1 person-yr: • 100 cases/person-year • 10,000 cases/person-century • 8.33 cases/person-month • 1.92 cases/person-week • 0.27 cases/person-day • Note: time period during which rate is measured can differ from the units used
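The conversions in this list can be reproduced with a small sketch. The function and dictionary names are illustrative assumptions, and the conversion factors assume 12 months, 52 weeks and 365 days per year, which is what the rounded slide values imply.

```python
# Number of each time unit in one year (assumes 52 weeks and 365 days).
UNITS_PER_YEAR = {"century": 0.01, "year": 1, "month": 12, "week": 52, "day": 365}

def convert_rate(rate_per_person_year, unit):
    """Re-express a rate given per person-year in another person-time unit."""
    return rate_per_person_year / UNITS_PER_YEAR[unit]

for unit in UNITS_PER_YEAR:
    print(f"{convert_rate(100, unit):,.2f} cases/person-{unit}")
```

The underlying rate is the same in every line; only the unit of person-time in the denominator changes.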
Person-time incidence based on grouped vs. individual data • Szklo and Nieto use rate when based on group data and density when based on individual data (not followed by most) • Total person-time for grouped data is based on the time interval x the average population at risk during the interval • Assumes uniform occurrence of events and of censoring during the interval (like life table)
Calculating person-time incidence using grouped data • Use average number of persons at risk • In the text example, start with 10 persons, 6 die and 3 are lost to follow-up • Subtract 0.5 x (6 + 3) from 10 = 5.5 • uniformity assumption as in life tables • Total person-time is 5.5 x 2-years = 11 person-years. 6 events, so rate 6/11= 0.545 = 54.5 per 100 person-years (compare to 62.6 when calculated using individual data)
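The grouped-data calculation can be written out as a short sketch. The function name and argument names are assumptions for illustration; the numbers are those of the text example quoted above.

```python
def grouped_rate(n_start, events, lost, years):
    """Person-time rate from grouped data under the uniformity assumption."""
    # Persons who die or are lost are assumed, on average, to be at risk
    # for half of the interval, so half of them are subtracted.
    average_at_risk = n_start - 0.5 * (events + lost)
    person_years = average_at_risk * years
    return events / person_years

rate = grouped_rate(n_start=10, events=6, lost=3, years=2)
print(f"{100 * rate:.1f} per 100 person-years")
```

The gap between this figure and the 62.6 per 100 person-years from individual data reflects how far the actual event and loss times depart from uniform spacing.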
Incidence from grouped data • Most commonly used for large secondary data sets where precise information on occurrence of events and on persons leaving and entering population are not available • eg, annual cancer mortality rates per 100,000 population ( = per 100,000 person-years) • If times of events and of censoring available, would normally use individual level data
Group data rates versus individual data rates • Differ depending on how close events and losses are to occurring uniformly • If losses perfectly uniform, they are the same • Analogous to life table assumption of uniform timing of losses versus Kaplan-Meier use of individual data
Individual calculation: 2 deaths / 5 person-years = 0.4 per person-year • Group data: average population = (4 + 1) / 2 = 2.5; rate = 2 / (2.5 x 2) = 0.4 per person-year
Rates based on group data • Uniformity of events and losses likely to be approximately true for large secondary data sets • Rates using secondary data sets on free-living populations assume new members and losses balance out (= approx. stable) • Important for the use of population reference rates (eg, expected mortality in U.S. population)
Calculating Rates in Stata
A few Stata survival analysis and rate commands:
• Declare the data set to be survival data: . stset timevar, fail(failvar)
• Life table analysis and graph: . ltable timevar failvar, graph
• Person-time rate: . strate
• Rates within groups: . strate groupvar
Immediate Commands in Stata • Stata can be used like a calculator for various computations without a data set; these are called immediate commands • Example, to calculate the confidence interval around a person-time rate: . cii #person-time units #events, poisson • Eg, 6 events occur in 10 person-years of follow-up: . cii 10 6, poisson gives 95% CI = 0.220 – 1.306 per person-year
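For readers without Stata, the interval in this example can be reproduced with the sketch below. It assumes the interval is the exact Poisson interval (each limit chosen so that the tail probability of the observed count is 2.5%), which is consistent with the numbers shown; the function names are illustrative and the limits are found by simple bisection using only the standard library.

```python
import math

def poisson_cdf(k, mu):
    """P(X <= k) for X ~ Poisson(mu)."""
    if k < 0:
        return 0.0
    term = math.exp(-mu)
    total = term
    for i in range(1, k + 1):
        term *= mu / i
        total += term
    return total

def bisect(f, lo, hi, tol=1e-10):
    """Root of a monotone function f on [lo, hi], found by bisection."""
    f_lo_positive = f(lo) > 0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if (f(mid) > 0) == f_lo_positive:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def exact_poisson_ci(events, person_time, level=0.95):
    """Exact confidence interval for a person-time rate (assumed method)."""
    alpha = 1 - level
    if events == 0:
        mu_low = 0.0
    else:
        # Lower limit: P(X >= events | mu) = alpha / 2
        mu_low = bisect(
            lambda mu: (1 - poisson_cdf(events - 1, mu)) - alpha / 2, 0.0, events
        )
    # Upper limit: P(X <= events | mu) = alpha / 2
    search_top = events + 10 * math.sqrt(events + 1) + 10
    mu_high = bisect(lambda mu: poisson_cdf(events, mu) - alpha / 2, events, search_top)
    return mu_low / person_time, mu_high / person_time

low, high = exact_poisson_ci(events=6, person_time=10)
print(f"rate = {6 / 10:.3f}, 95% CI = {low:.3f} to {high:.3f} per person-year")
```

The interval is asymmetric around the point estimate of 0.6 per person-year, as expected for a small number of events.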
Assumption of Person-Time Incidence Estimation • T time units of follow-up on N persons is the same as N time units on T persons • Observing 2 deaths in 2 persons followed for 50 years gives the same incidence rate as 2 deaths in 100 persons followed 1 year • Assumption is not reasonable if sample sizes and follow-up times differ greatly
Assumption of Person-Time Incidence Estimation • If looking at relationship between exposure and outcome rate, one rate for a follow-up period implies exposure does not have cumulative effect on probability of event over time • Clearly false for exposures with cumulative effects like length of time smoking
Why use person-time rather than cumulative incidence? • Rates using group data can be calculated in open populations from a variety of data sources where population sizes are estimated • Incidence rates from a cohort study can be compared to standardized rates from the general population to obtain ratio measures called standardized mortality ratio (SMR) or standardized incidence ratio (SIR)
Why use person-time rather than cumulative incidence? • If E is a recurrent event, a rate may seem more natural • For example, cumulative incidence of episodes of the common cold would have to be calculated separately for each episode (ie, proportion with 1st cold, proportion with 2nd cold given that you have had 1, etc.)
Calculating stratified person-time incidence rates in cohorts • For persons followed in a cohort some potential risk factors may be fixed but some may be variable • eg, ethnicity is fixed; occupational exposure to asbestos can change over time with the job • Total person-time in an exposure category is one way to deal with risk factors that change over time
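One way to picture this accumulation is the sketch below. The cohort records are hypothetical, invented purely for illustration, and the data layout and function name are assumptions. Each person's follow-up is split into segments by exposure category, person-time is summed within each category, and an event is credited to the category the person was in when it occurred.

```python
from collections import defaultdict

# Hypothetical follow-up records (invented for illustration, not from the text).
# "segments" lists (years, exposure category) in time order; "event" says whether
# follow-up ended with the outcome, which is credited to the last segment.
cohort = [
    {"segments": [(3.0, "unexposed"), (2.0, "exposed")], "event": True},
    {"segments": [(5.0, "unexposed")], "event": False},
    {"segments": [(1.0, "exposed"), (4.0, "unexposed")], "event": True},
    {"segments": [(2.5, "exposed")], "event": True},
]

def stratified_rates(people):
    """Return {category: (events, person_years, rate)}."""
    person_years = defaultdict(float)
    events = defaultdict(int)
    for person in people:
        for years, category in person["segments"]:
            person_years[category] += years
        if person["event"]:
            final_category = person["segments"][-1][1]
            events[final_category] += 1
    return {
        category: (events[category], total, events[category] / total)
        for category, total in person_years.items()
    }

rates = stratified_rates(cohort)
for category, (n_events, total, rate) in sorted(rates.items()):
    print(f"{category}: {n_events} events / {total} person-years = {rate:.3f} per person-year")
```

The same person can contribute person-time to more than one stratum, which is not possible when each person is counted once in a cumulative incidence denominator.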
Relation of Prevalence and Incidence • Prevalence is a function of incidence and duration of disease by the equation: point prevalence = incidence x duration x (1 - point prevalence) [P = I x D (1 - P)] • For many typically low prevalence diseases prevalence becomes approximately I x D since (1 - P) is close to 1 if P is very low
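Solving P = I x D (1 - P) for P gives P = (I x D) / (1 + I x D), or equivalently P / (1 - P) = I x D. The sketch below compares this exact solution with the I x D approximation. The function names and the incidence and duration values are hypothetical, chosen only to show when the approximation holds.

```python
def prevalence_exact(incidence, duration):
    """Solve P = I * D * (1 - P) for P."""
    product = incidence * duration
    return product / (1 + product)

def prevalence_approx(incidence, duration):
    """Low-prevalence approximation P ~ I * D."""
    return incidence * duration

# Hypothetical values: a rare disease and a common one.
for incidence, duration in [(0.002, 5), (0.05, 10)]:
    exact = prevalence_exact(incidence, duration)
    approx = prevalence_approx(incidence, duration)
    print(f"I={incidence}/yr, D={duration} yr: exact P={exact:.4f}, I x D={approx:.4f}")
```

With these illustrative inputs the two values nearly coincide for the rare disease and diverge clearly for the common one, where (1 - P) is no longer close to 1.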
Prevalence and Etiology • Because prevalence depends both on incidence and duration of disease, it is not a good measure for etiological studies • Cannot examine the determinants of occurrence alone when you have to account for determinants of duration (Rx, etc.) • Etiologic study designs should avoid sampling prevalent cases of disease or prevalent controls
Summary Points • Person-time incidence rate or density is not equivalent to cumulative incidence and is not a proportion • Person-time incidence rate can be calculated with group or individual data • Allows comparison with population reference rates from other data sources • Allows accumulation of time at risk for different strata
Odds versus Probability • Odds are based on probability; they express probability (p) as a ratio: odds = p / (1 - p) • For 0 < p < 1, odds are always > p because p is divided by a number < 1 • For example, if probability of dying = 1/5, then odds of dying = (1/5) / (4/5) = 1/4 • Thinking of odds as 2 outcomes, the numerator is the # of times of one outcome and the denominator the # of times of the other • p = odds / (1 + odds), so (1/4) / (1 + 1/4) = 1/5
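The two conversions can be checked with exact fractions. This is a minimal sketch with illustrative function names, using the probability of 1/5 from the example above.

```python
from fractions import Fraction

def odds_from_probability(p):
    """odds = p / (1 - p)"""
    return p / (1 - p)

def probability_from_odds(odds):
    """p = odds / (1 + odds)"""
    return odds / (1 + odds)

p = Fraction(1, 5)
odds = odds_from_probability(p)
back = probability_from_odds(odds)
print(f"p = {p}, odds = {odds}, back to p = {back}")
```

Using Fraction rather than floating point keeps the results as exact ratios, which matches how the slide writes them.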
Odds versus Probability • Less intuitive than probability (probably wouldn’t say “my odds of dying are 1/4”) • No less legitimate mathematically, just not so easily understood • Used in epidemiology primarily because the log of the ratio of two odds is given by the coefficients in logistic regression equations