EPI-820 Evidence-Based Medicine

EPI-820 Evidence-Based Medicine LECTURE 8: PROGNOSIS Mat Reeves BVSc, PhD

Objectives: • 1. Review definitions. • 2. Understand concept of natural history and inception cohort studies. • 3. Define commonly used measures of prognosis. • 4. Understand origins of bias in follow-up studies. • 5. Understand basic statistical methodology for survival data. • 6. Define characteristics of an ideal prognostic study. • 7. Understand the rationale and development of clinical decision/prediction rules

“Prediction is very difficult, especially about the future” -Niels Bohr

I. Definitions • Prognosis: the prediction of the future course of events following the onset of disease. • can include death, complications, remission/recurrence, morbidity, disability and social or occupational function. • Prognostic factors: factors associated with a particular outcome among diseased subjects. • examples include age, co-morbidities, tumor size, severity of disease etc. • often different from disease risk factors e.g., BMI and pre-menopausal breast CA.

I. Definitions • Natural history: the evolution of disease without medical intervention. • Clinical course: the evolution of disease in response to medical intervention.

Natural History Studies • Degree to which natural history can be studied depends on the medical system and the type of disease. • The natural history of some diseases can be studied because they: • remain unrecognized (i.e., asymptomatic) e.g., anemia, hypertension. • are considered “normal” discomforts e.g., arthritis, mild depression.

Natural History Studies • Natural history studies permit the development of rational strategies for: • early detection of disease • e.g., CIN and Invasive Cervical CA. • treatment of disease • e.g., middle ear infections. • DCIS • Hypertension

A. Study Designs • Measuring natural history or clinical course of disease requires a cohort study: • Most often use a retrospective cohort. • Exposed group = affected (diseased) patients followed to measure outcomes of interest.

Cohort Study Design • To determine if outcome is atypical, need to compare to a non-exposed group which should be: • obtained from the same source population that gave rise to the patients. • monitored with the same intensity. • If unaffected cohort unavailable, use “outside” data e.g., standardized incidence or mortality ratios for cancer studies.

B. The Inception Cohort • Prognosis studies usually involve taking a sample of diseased patients. • Such cohort studies are very susceptible to bias because clinical course of disease can be both prolonged and variable • Key factor in determining disease outcome is when did the time clock start?

Starting Point • Starting point must be well defined and clearly specified: • The same point in the course of the disease for all individuals. • Ideal starting point is near the onset (“inception”) of disease = inception cohort. • Usually it is: • onset of symptoms, time at first diagnosis, or beginning of treatment.

Effect of Using a Defined Inception Cohort on Average Survival Time

II. Bias in Follow-up Studies • A. Selection or Confounding Bias • i) Assembly or susceptibility bias occurs when the exposed and non-exposed groups differ other than by the prognostic factors under study, and • the extraneous factor affects the outcome of the study. • Examples: • differences in starting point of disease (survival cohort) • differences in stage or extent of disease, co-morbidities, prior treatment, age, gender, or race.

Survival Cohorts • Survival cohort (or available patient cohort) studies can be very biased because: • a convenience sample of current patients will include individuals at various stages in the course of their disease. • individuals not accounted for have different experiences from those included e.g., died soon after treatment. • Not a true inception cohort e.g., retrospective case series. • Very common!

Survival Cohorts (Fletcher) • True cohort: assemble cohort (N = 150), then measure outcomes: improved = 75, not improved = 75. Observed improvement = 50%; true improvement = 50%. • Survival cohort: assemble patients and begin follow-up (N = 50), then measure outcomes: improved = 40, not improved = 10. Observed improvement = 80%; true improvement = 50%. • Not observed (N = 100), dropouts: improved = 35, not improved = 65.
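The arithmetic behind the Fletcher figure can be checked with a short sketch. The counts are those given in the figure; the variable and function names are illustrative assumptions, not part of the lecture.

```python
# Counts taken from the Fletcher survival-cohort figure.
observed = {"improved": 40, "not_improved": 10}      # patients available for follow-up
not_observed = {"improved": 35, "not_improved": 65}  # patients missing from the survival cohort


def proportion_improved(group):
    """Proportion of a group whose outcome was 'improved'."""
    return group["improved"] / (group["improved"] + group["not_improved"])


# The true (inception) cohort is everyone: observed plus not observed.
true_cohort = {key: observed[key] + not_observed[key] for key in observed}

observed_improvement = proportion_improved(observed)
true_improvement = proportion_improved(true_cohort)
print(f"Observed: {observed_improvement:.0%}, true: {true_improvement:.0%}")
```

The gap between the two proportions is the bias introduced by following only the patients who happened to be available.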

A. Selection Bias • ii) Migration bias • occurs when patients drop out of the study (lost-to-follow-up). • usually subjects drop out because of a valid reason e.g., died, recovery, side effects or loss of interest. • these factors are often related to prognosis. • assess extent of bias by using a best/worst case analysis. • patients can also cross-over from one exposure group to another • if cross-over occurs at random = non-differential misclassification of exposure
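A best/worst case analysis can be sketched as follows. The function name and all of the numbers are hypothetical, chosen only to show the calculation: the best case assumes none of the patients lost to follow-up had the event, and the worst case assumes all of them did.

```python
def best_worst_case(n_enrolled, n_lost, events_observed):
    """Return (observed, best-case, worst-case) event proportions.

    observed: events among patients with complete follow-up.
    best:     assumes no patient lost to follow-up had the event.
    worst:    assumes every patient lost to follow-up had the event.
    """
    n_followed = n_enrolled - n_lost
    observed = events_observed / n_followed
    best = events_observed / n_enrolled
    worst = (events_observed + n_lost) / n_enrolled
    return observed, best, worst


# Hypothetical study: 100 enrolled, 20 lost to follow-up, 16 deaths observed.
observed, best, worst = best_worst_case(n_enrolled=100, n_lost=20, events_observed=16)
print(f"observed {observed:.0%}, best case {best:.0%}, worst case {worst:.0%}")
```

A wide interval between the best and worst cases suggests that loss to follow-up could materially change the conclusions.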

A. Selection Bias • iii) Generalizability bias • related to the selective referral of patients to tertiary (academic) medical centers. • a highly selected patient pool has a different clinical spectrum of disease. • influences generalizability (see Motulsky, 1978; Melton, 1985).

II. Bias in Follow-Up Studies • B. Measurement bias • Measurement (or assessment) bias occurs when one group has a higher (or lower) probability of having their outcome measured or detected. • likely for softer outcomes • side effects, mild disabilities, subclinical disease or • the specific cause of death.

B. Measurement bias • Measurement bias can be minimized by: • ensuring observers are blinded to the exposure status of the patients. • using careful criteria (definitions) for all outcome events. • applying equally rigorous efforts to ascertain all events in both exposure groups.

III. Commonly Used Measures of Prognosis • A. 5-year Survival Rate • $\text{5-year survival} = \dfrac{\text{number of individuals who survive from } t_0 \text{ to } t_0 + 5 \text{ years}}{\text{number of individuals with disease at } t_0}$ • typically $t_0$ is the point of initial diagnosis or treatment. • the complement of a cumulative incidence (= 1 − risk of death at 5 years); a proportion, not a true rate. • measures the proportion of the original patient population alive at 5 years. • easy to interpret and remember, but fails to indicate how quickly deaths occur within the 5 years.

Fig. Limitation of 5-year Survival Rates (From Fletcher)

B. Case-fatality Rate (CFR) • $\text{CFR} = \dfrac{\text{number of individuals who die during } t_0 \text{ to } t_0 + 1}{\text{number of individuals with disease at } t_0}$ • specific type of cumulative incidence rate (= proportion, not a rate). • measures the risk of death among those individuals who develop disease. • must have an explicit (or implicit) time period that is sufficiently long to ensure that all relevant events have been observed e.g., Legionnaires' disease, stroke, CML. • Same principles apply to response, remission, and recurrence rates.

C. Mortality or Death Rate (MR) • $\text{MR} = \dfrac{\text{number of individuals who die during } t_0 \text{ to } t_0 + 1}{\text{total population time-at-risk during } t_0 \text{ to } t_0 + 1}$ • defined as the incidence rate of death per "population time" • denominator is population time-at-risk. • measures the speed of death due to a specific disease • distinguish from the case-fatality rate! Example: • Stroke 28-day CFR = 23% • Stroke MR = 63/100,000
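The difference between the two denominators can be made concrete with a small sketch. The function names and all counts below are hypothetical and are not the stroke figures quoted in the lecture: the CFR divides by the number of cases, while the MR divides by the population time-at-risk.

```python
def case_fatality_rate(deaths_among_cases, n_cases):
    """Proportion of diseased individuals who die within the stated period."""
    return deaths_among_cases / n_cases


def mortality_rate(deaths, person_years, per=100_000):
    """Deaths per `per` person-years at risk in the whole population."""
    return deaths / person_years * per


# Hypothetical population: 100,000 people followed for one year,
# 200 of whom develop the disease and 50 of whom die from it.
cfr = case_fatality_rate(deaths_among_cases=50, n_cases=200)
mr = mortality_rate(deaths=50, person_years=100_000)
print(f"CFR = {cfr:.0%}; MR = {mr:.0f} per 100,000 person-years")
```

The same 50 deaths give a high CFR and a low MR because the disease is uncommon in the population but often fatal once it occurs.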

IV. Statistical Methods Used in Prognosis Studies • A. Analysis of Survival or Failure Time Data • primary end point of prognosis studies is time until event of interest occurs e.g., death or relapse. • analysis of survival or failure time data requires specific techniques: • Kaplan-Meier estimator • Log-rank test • Cox proportional hazards regression model.

Censoring: • Defn: when the event of interest does not occur in all individuals because: • study was stopped before everyone in the study had the event • loss to follow-up • death from other (competing) causes e.g., road traffic accidents

Censoring • All statistical methods assume censoring is non-informative: • reason for incomplete observation is not related to the underlying risk of failure. • therefore, survival experience of those “lost to follow-up” is assumed to be the same as those that remain.

Survival function (S(t)) • Defn: the probability of survival to a given point in time (t) • S(3) = 0.60 indicates that 60% of the population survived 3 years. • Graphically displayed using a "life table" or "survival curve." • Median survival time = the time at which half the patients have "failed" • a crude but common measure of survival

A Typical Survival Curve Showing the Survival Function (S(t)) Plotted Against Time with a Median Survival Time of 1.25 Years

Hazard function (h(t)) • Defn: the instantaneous rate of an event at a specific moment in time (t), given that the patient has already survived to that point in time. • closely linked to the survival function. • approximates the probability of the patient "failing" during the next short time period. • a direct measure of prognosis.
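The link between the hazard and survival functions can be written out explicitly. The following is the standard textbook relationship for a continuous failure time $T$, added here as a sketch; the symbols $T$, $\Delta t$ and $u$ are notation introduced for this illustration and do not appear in the lecture.

```latex
\begin{align}
h(t) &= \lim_{\Delta t \to 0}
        \frac{P(t \le T < t + \Delta t \mid T \ge t)}{\Delta t}
      = -\frac{d}{dt} \ln S(t), \\
S(t) &= \exp\!\left( -\int_0^t h(u)\, du \right).
\end{align}
```

A constant hazard therefore gives an exponentially declining survival curve, and a higher hazard at any time makes the curve fall more steeply at that time.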

Kaplan-Meier Estimator • a widely accepted method of estimating S(t). • S(t) is expressed as the product of conditional probabilities e.g., • $S(3) = S(1) \times S(2 \mid 1) \times S(3 \mid 2)$ • where: • $S(1)$ = probability of surviving year 1 • $S(2 \mid 1)$ = conditional probability of surviving year 2, given survival to year 1. • $S(3 \mid 2)$ = conditional probability of surviving year 3, given survival to year 2.

Kaplan-Meier Estimator • estimators or "curves" begin at time zero with S(t) = 1 and then decrease in a series of steps corresponding to observed times of failure. • censored observations contribute to the survival probability estimates up until the time of censoring. • no assumptions made about the shape of the survival or hazard function (= a non-parametric technique). • variability of estimates is greatest at the ends of the curves - few subjects and few failures.
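The product-of-conditional-probabilities idea can be sketched in a few lines of plain Python. This is a minimal illustration, not a replacement for a statistical package: the function names and the six follow-up times are hypothetical, and a subject censored at the same time as a failure is assumed to be still at risk at that time.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct observed failure time.

    times:  follow-up time for each subject.
    events: 1 if the failure was observed at that time, 0 if censored.
    """
    failures = Counter(t for t, e in zip(times, events) if e == 1)
    survival = 1.0  # the curve starts at S(0) = 1
    curve = []
    for t in sorted(failures):
        # Censored subjects stay in the risk set until their censoring time.
        at_risk = sum(1 for u in times if u >= t)
        survival *= 1 - failures[t] / at_risk  # conditional survival at t
        curve.append((t, survival))
    return curve


def median_survival(curve):
    """First failure time at which S(t) drops to 0.5 or below (None if never)."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None


# Hypothetical data: follow-up in years; 0 marks a censored observation.
times = [1, 2, 2, 3, 4, 5]
events = [1, 1, 0, 1, 0, 1]
curve = kaplan_meier(times, events)
for t, s in curve:
    print(f"S({t}) = {s:.3f}")
print("median survival:", median_survival(curve))
```

Each step multiplies the running estimate by the proportion of the current risk set that survives the failure time, which is why the curve only drops at observed failures.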

Kaplan-Meier Estimators of the Survival Function (S(t)) for Two Groups (Treatment and Control).

Log-rank test • a statistical test of the difference in survival distributions (see Peto et al, 1977). • at each observed time of failure, compares the observed number of events in one group to the expected number (based on identical hazard functions for the two groups). • gives equal weight to differences at each point in time. • if it makes sense to place greater emphasis on differences at earlier time periods then use the generalized Wilcoxon test (Cox and Oakes, 1984).
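The observed-versus-expected comparison can be sketched for two groups as below. This is a minimal hand-rolled version using the usual hypergeometric variance; the function names and the tiny data set are hypothetical, and real analyses would normally rely on an established statistics package.

```python
import math


def logrank_chi2(times_a, events_a, times_b, events_b):
    """Two-group log-rank chi-square statistic (1 degree of freedom).

    times_*  are lists of follow-up times; events_* are 1 (failure) or 0 (censored).
    """
    all_times = times_a + times_b
    all_events = events_a + events_b
    failure_times = sorted({t for t, e in zip(all_times, all_events) if e == 1})
    observed_a = expected_a = variance = 0.0
    for t in failure_times:
        n_a = sum(1 for u in times_a if u >= t)  # at risk in group A
        n_b = sum(1 for u in times_b if u >= t)  # at risk in group B
        d_a = sum(1 for u, e in zip(times_a, events_a) if u == t and e == 1)
        d_b = sum(1 for u, e in zip(times_b, events_b) if u == t and e == 1)
        n, d = n_a + n_b, d_a + d_b
        observed_a += d_a
        expected_a += d * n_a / n  # expected failures if hazards are identical
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    return (observed_a - expected_a) ** 2 / variance


def chi2_1df_p_value(chi2):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2))


# Hypothetical data: every subject fails; the treated group fails later.
treatment_times, treatment_events = [3, 4], [1, 1]
control_times, control_events = [1, 2], [1, 1]
chi2 = logrank_chi2(treatment_times, treatment_events, control_times, control_events)
print(f"chi-square = {chi2:.3f}, p = {chi2_1df_p_value(chi2):.3f}")
```

Every failure time contributes with the same weight, which matches the equal-weighting property noted above.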

Cox proportional hazards model • Very powerful regression modeling technique based on the hazard function (see Tibshirani 1982). • allows for the full application and flexibility of regression analysis to be applied to survival data. • in its simplest form it is an extension of the log-rank test. • Advantages: • ability to handle a large number of prognostic variables (both discrete and continuous) • can adjust for confounding variables, and • evaluate interaction effects
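For reference, the usual form of the model is sketched below. The notation ($h_0$ for the baseline hazard, $x_1, \dots, x_p$ for the prognostic variables and $\beta_1, \dots, \beta_p$ for their coefficients) is the standard textbook notation and is introduced here as an assumption, since the lecture does not write the model out.

```latex
\begin{align}
h(t \mid x_1, \dots, x_p) &= h_0(t)\,
    \exp(\beta_1 x_1 + \beta_2 x_2 + \dots + \beta_p x_p), \\
\text{HR}_j &= \exp(\beta_j)
    \quad \text{(hazard ratio per one-unit increase in } x_j
    \text{, other variables held constant)}.
\end{align}
```

The baseline hazard $h_0(t)$ is left unspecified, and the "proportional hazards" assumption is that each hazard ratio stays constant over time.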

B. Statistical Control of Common (Selection) Biases • Prognostic studies are essentially observational studies that focus on survival (or some other outcome). • Techniques to control biases are therefore the same as used in observational epidemiology.

Table. Methods for Controlling Selection Bias (from Fletcher) • Columns: Method; Description; Phase of study (Design or Analysis). • Randomization (design phase): Random assignment ensures that known and unknown confounders are equally distributed between exposure groups (this is rarely feasible, however, unless a specific RCT designed to evaluate some aspect of prognosis is being conducted). • Restriction (design phase): If a strong confounding factor is known, such as age or sex, limit the range of the characteristics of patients in the study. • Matching (design phase): Match exposure groups on the basis of important prognostic variables, such as stage of disease, age or sex. • Stratification (analysis phase): Compare event rates within subgroups (strata) with otherwise similar probability of outcomes e.g., sex- or age-group-specific rates.

Table. Adjustment Procedures to Control Selection Bias • Columns: Method; Description; Phase of study (Design or Analysis). • Simple adjustment (analysis phase): Mathematically adjust crude rates for a characteristic known to be an important prognostic factor e.g., age adjustment. • Multiple adjustment (analysis phase): Use mathematical models to adjust risk estimates for several prognostic variables (Cox regression). • Sensitivity analysis (analysis phase): Describe how the results could differ by changing the values of known prognostic factors over plausible ranges. Best/worst case analysis is an example.
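Simple adjustment for age can be illustrated with direct standardization. This is a sketch under stated assumptions: the two patient groups, their counts and the standard weights are all hypothetical, and the function names are illustrative.

```python
def crude_rate(strata):
    """Overall death proportion, ignoring the stratifying factor."""
    deaths = sum(d for d, n in strata.values())
    total = sum(n for d, n in strata.values())
    return deaths / total


def directly_adjusted_rate(strata, weights):
    """Weighted average of stratum-specific rates using standard weights."""
    return sum(weights[name] * d / n for name, (d, n) in strata.items())


# Hypothetical data: (deaths, patients) per age stratum in two patient groups.
group_a = {"young": (10, 100), "old": (90, 300)}  # mostly older patients
group_b = {"young": (30, 300), "old": (30, 100)}  # mostly younger patients
standard = {"young": 0.5, "old": 0.5}             # assumed standard age distribution

for name, group in (("A", group_a), ("B", group_b)):
    print(f"Group {name}: crude {crude_rate(group):.0%}, "
          f"age-adjusted {directly_adjusted_rate(group, standard):.0%}")
```

In this constructed example the crude rates differ only because the groups have different age mixes; the stratum-specific rates are identical, so the adjusted rates agree.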

V. Ideal Characteristics of Prognostic Studies • 1. Was the sample well defined and representative of a definable underlying population? Was the referral pattern well described? • 2. Was an inception cohort assembled? Were all the study patients at a similar well defined point in the course of their disease? • 3. Was the follow-up complete and sufficiently long?

V. Ideal Characteristics of Prognostic Studies • 4. Were objective and unbiased outcome criteria used? • 5. Was the outcome assessment blind? • 6. Was adjustment for extraneous prognostic factors carried out?

Editorial Readings • Melton • What is selection bias, and how can it affect the conclusions of studies? • Motulsky • Why did the author place such emphasis on understanding the selection method?

VI. Clinical Decision Rules (CDR) • clinical tools that combine history, physical examination, and simple diagnostic tests to aid in diagnostic, prognostic or treatment decisions • Outcomes: • Probability of disease/event (risk) • e.g., APGAR, APACHE, CVD Risk Prediction (Framingham), colic prognosis • Diagnostic/treatment decision • e.g., Breast biopsy decisions, Breast CA risk (Gail model), colic surgery

CDRs – 3 Step Development Process • Step 1 - Derivation: • Identify important (predictive) variables • Use statistical methods (logistic regression, recursive partitioning, neural networks), or pick variables based on expert opinion • Initial statistical testing (validation) • Split sample (development/training and test sets) • Bootstrap techniques
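The two initial-testing approaches named in Step 1 can be sketched with the standard library alone. The function names, the 70/30 split, the seed and the list of patient IDs are hypothetical choices for illustration; the lecture does not prescribe any of them.

```python
import random


def split_sample(records, derivation_fraction=0.5, seed=0):
    """Randomly split records into a derivation set and a held-out test set."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    shuffled = list(records)
    rng.shuffle(shuffled)
    cut = round(len(shuffled) * derivation_fraction)
    return shuffled[:cut], shuffled[cut:]


def bootstrap_sample(records, rng):
    """One bootstrap resample: same size as the original, drawn with replacement."""
    return [rng.choice(records) for _ in records]


patients = list(range(100))  # hypothetical patient IDs
derivation, test = split_sample(patients, derivation_fraction=0.7, seed=42)
resample = bootstrap_sample(patients, random.Random(42))
print(len(derivation), len(test), len(set(resample)))
```

In a split-sample design the rule is derived on the first set and its accuracy is checked on the second; with the bootstrap, the derivation is repeated on many resamples to see how stable the selected predictors are.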

CDRs – 3 Step Development Process • Step 2 – Validation • Usually prospective, validation required because • CDR accuracy may be specific to development population (because of severity & disease prevalence) • CDR may not be applied in the same manner in other populations • Narrow: Application to similar patient populations. • Broad: Application to different populations with varying prevalence and disease spectrum.

CDRs – 3 Step Development Process • Step 3 – Impact Analysis • Required because reluctance to use CDR is common. Why? • Concern about different patient population/settings • Risk of false negatives (esp. legal concerns) • Rules are complicated or take too long to use • Doesn't provide a course of action (just a probability!!) • Test effect of CDR on physician behavior and clinical practice and patient outcomes • Rarely done! • Ideal = Randomize individual patients (difficult) or practices. • Or, evaluate using pre – post design

CDR - Hierarchy of Evidence

CDR – Methodological Standards (McGinn, JAMA 2000) • Were all important predictors included in the derivation process? • = content validity • Were they collected prospectively in a blinded fashion for the purposes of CDR development? • Was every patient included, with minimal missing data? • Were predictors present in a large proportion of the study population? • Were the patient population and setting well defined? • age, sex, referral filter • Were all outcome events clearly defined? • Are they of clinical importance? • Were they determined blindly (independent of predictors)?

CDR – Methodological Standards (McGinn, JAMA 2000) • Were appropriate statistical methods used? • Adequate sample size to avoid over-fitting (need 10 outcomes per variable) • Were results of CDR appropriate and clear? • Se/Sp or ROC curves: usually want high Se (to avoid FNs) • PVs of more use to clinicians (prevalence dependent) • LRs? • Prob (outcome) = survival curves • Was reproducibility of predictors and the rule itself assessed? • Many signs and symptoms (S/S) are not very reliable • Concerned with inter-observer variability (kappa, κ)
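The accuracy and reproducibility measures listed above can be computed from simple 2 x 2 tables. The sketch below uses hypothetical counts and illustrative function names; it assumes specificity is below 1 and above 0 so that both likelihood ratios are defined.

```python
def accuracy_measures(tp, fn, fp, tn):
    """Se, Sp, predictive values and likelihood ratios from a 2 x 2 table."""
    se = tp / (tp + fn)
    sp = tn / (tn + fp)
    return {
        "Se": se,
        "Sp": sp,
        "PPV": tp / (tp + fp),   # depends on prevalence in the table
        "NPV": tn / (tn + fn),   # depends on prevalence in the table
        "LR+": se / (1 - sp),
        "LR-": (1 - se) / sp,
    }


def cohen_kappa(both_yes, only_first, only_second, both_no):
    """Chance-corrected agreement between two observers on a yes/no finding."""
    n = both_yes + only_first + only_second + both_no
    observed = (both_yes + both_no) / n
    first_yes = (both_yes + only_first) / n
    second_yes = (both_yes + only_second) / n
    expected = first_yes * second_yes + (1 - first_yes) * (1 - second_yes)
    return (observed - expected) / (1 - expected)


# Hypothetical rule performance: 100 patients with the outcome, 200 without.
measures = accuracy_measures(tp=90, fn=10, fp=30, tn=170)
# Hypothetical agreement between two observers rating the same 100 patients.
kappa = cohen_kappa(both_yes=40, only_first=10, only_second=10, both_no=40)
print({name: round(value, 2) for name, value in measures.items()}, round(kappa, 2))
```

Recomputing the predictive values with a different mix of diseased and non-diseased patients shows why they are prevalence dependent, while Se, Sp and the likelihood ratios are unchanged.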
