The Early Detection of Disease: Statistical Challenges. Marvin Zelen, Harvard University. The R.A. Fisher Memorial Lecture, August 1, 2007, Joint Statistical Meetings, Salt Lake City, Utah.
Outline • Background and Motivation • Statistical Challenges • The Early Detection Process • Applications: Breast cancer: screening with mammography. Do women under 50 benefit? (Controversial.) Public health programs: the U.S., U.K., and Nordic countries have different recommendations. What are the tradeoffs? Prostate cancer: probability of over diagnosis.
Background and Rationale • Screening Programs : Special exams to diagnose disease when it is asymptomatic. • Motivation: Diagnosing and treating the disease early, before signs/symptoms appear, may result in more cures and lower mortality.
Examples of Screening Programs • Tuberculosis • Hypertension • Diabetes • Coronary Artery Disease • Cancer • Thyroid Disease • Breast Cancer • Osteoporosis • Cervical Cancer • HIV • Colorectal Cancer • Lung Cancer • Prostate Cancer
Scientific Evidence of Screening Benefit • Diagnosing disease early does not necessarily result in benefit; e.g. diagnosing a primary cancer earlier may not be of benefit if the disease has already metastasized. • A necessary condition for benefit from early detection is that the disease tends to be diagnosed at an earlier stage. • If an effective treatment does not exist, there is no benefit in diagnosing disease early. • The general consensus is that randomized clinical trials are the only way to evaluate screening programs for potential benefit.
Some Statistical Challenges • Planning early detection clinical trials: Early detection clinical trials are different from therapeutic trials. Power depends on the number of exams and the time between exams. There exists an optimum time for follow up and analysis. • Public health programs (recommendations): Initial age to begin screening, intervals between exams, high risk individuals. Recommendations should be made by risk status. Costs may be an important consideration. • Over diagnosis: Disease may be diagnosed early but may never evolve clinically in a person’s lifetime. It is important to estimate the probability of over diagnosis.
Early Detection Randomized Clinical Trials • A typical trial consists of two groups. One group (control) receives usual care; the other group (study group) receives an invitation to have a finite number of special examinations. • Follow up for disease occurrence and death continues after the last exam. • The endpoint is death from disease. • Randomization may be carried out on an individual basis or by cluster randomization; e.g. by geographical region or physician practice.
Early Detection vs. Therapeutic Trials. Statistical problem: design of early detection clinical trials. How many subjects, how many exams, what exam spacing, how much follow up, what optimal analysis time, etc.?
Early Detection Clinical Trials • Only subjects who are diagnosed with disease carry information about benefit. • Trials need a very large number of subjects. • Relatively low incidence is characteristic of many chronic diseases; e.g. female breast cancer incidence is about 80-100 per 100,000 women per year depending on age. • A typical trial will require 10-20 years. During this time the technology for diagnosing disease may have changed, so conclusions may be of limited interest. • Statistical challenge: Is it possible to carry out an early analysis, with limited follow up time?
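The incidence figure above gives a rough sense of why trials must be so large. The sketch below is only back-of-the-envelope arithmetic: the trial arm of 50,000 women followed for 10 years is a hypothetical choice, not a figure from the lecture, and the calculation ignores deaths from other causes, drop-out, and the change of incidence with age.

```python
def expected_cases(n_subjects, years, incidence_per_100k):
    """Rough expected number of diagnosed cases in one trial arm.

    Assumes a constant annual incidence and no attrition (a simplification).
    """
    return n_subjects * years * incidence_per_100k / 100_000


# Hypothetical arm: 50,000 women followed for 10 years.
low = expected_cases(50_000, 10, 80)    # 400.0 cases at 80 per 100,000 per year
high = expected_cases(50_000, 10, 100)  # 500.0 cases at 100 per 100,000 per year
print(f"Expected cases per arm: {low:.0f} to {high:.0f}")
```

Under these assumptions only about 1% of the women in an arm would carry information about benefit, and far fewer would reach the endpoint of death from disease.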
Public Health Programs • Screening program: a schedule of exams usually composed of (1) the age to begin screening exams, (2) the intervals between exams, and (3) possibly the age to end exams. • A positive screening exam would motivate a more definitive exam (e.g. biopsy). • Costs of a public health screening program may be very large. • Statistical challenge: How does one optimize public health screening programs? There are too many variables to carry out clinical trials to find optimal schedules.
Example: Breast Cancer Screening Using Mammography • The American Cancer Society recommends that annual screening begin at age 40 for women at average risk. The cost of a screening mammogram ranges from $100 to $150. There are about 70 million women over the age of 40 in the U.S., so the cost would be in the billions of dollars if a significant number of women complied. • United Kingdom: The National Health Service offers screening beginning at age 50 with three year intervals for subsequent exams. • Nordic countries: The recommendation is that screening begin at age 50 with two year intervals for subsequent exams. • Statistical challenge: How to choose appropriate public health programs based on risk.
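The "billions of dollars" statement follows directly from the numbers on the slide. The sketch below multiplies them out; the 50% compliance level is a hypothetical value chosen only for illustration, and the calculation counts the screening mammogram alone (no follow-up exams or biopsies).

```python
def annual_screening_cost(n_women, cost_per_exam, compliance):
    """Annual cost of one screening mammogram per complying woman."""
    return n_women * cost_per_exam * compliance


N_WOMEN = 70_000_000  # women over the age of 40 in the U.S. (from the slide)

# Full compliance: $7.0 to $10.5 billion per year.
full_low = annual_screening_cost(N_WOMEN, 100, 1.0)
full_high = annual_screening_cost(N_WOMEN, 150, 1.0)

# Hypothetical 50% compliance: $3.5 to $5.25 billion per year.
half_low = annual_screening_cost(N_WOMEN, 100, 0.5)
half_high = annual_screening_cost(N_WOMEN, 150, 0.5)

print(f"Full compliance: ${full_low / 1e9:.2f}B to ${full_high / 1e9:.2f}B")
print(f"50% compliance:  ${half_low / 1e9:.2f}B to ${half_high / 1e9:.2f}B")
```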
Over Diagnosis • It is possible for some diseases to be diagnosed early which would never have clinical symptoms in a person’s lifetime. • Ordinarily the disease is treated when diagnosed; it is not known whether the disease may exhibit clinical symptoms during a person’s lifetime. • Statistical challenge: Estimate the probability of over diagnosis.
Need for Models • Issues in the previous slides (optimal schedules, over diagnosis) cannot be addressed by RCTs: there are too many variables, trials take too long and are too costly, and there are ethical concerns. • These issues may be addressed by models. • The need for stochastic models is the principal statistical challenge in the theory and practice of early detection of disease.
Models • S0: Disease free state: does not have disease or has disease which cannot be detected by exam. • Sp: Pre-clinical state: has disease but no signs or symptoms; capable of being detected by exam. Individual is asymptomatic. • Sc: Clinical state: diagnosis by usual care. • S0 → Sp → Sc: Progressive disease model (breast cancer). • S0 → Sp → Sc, with a subgroup in Sp that never goes on to clinical disease: Progressive disease model with subgroup (prostate cancer). • S0 ↔ Sp → Sc: Non-progressive disease model (HPV, cervical cancer).
Issues in the Interpretation of Data • Suppose a group of patients undergoes screening for a particular disease and a number of subjects are diagnosed and treated. • The subjects in this screened group have longer survival than a control group (no screening). Is this scientific evidence of the benefit of screening? • No. Length biased sampling and lead time bias may introduce significant biases.
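Natural History of Progressive Disease [Figure: age axis showing the S0 → Sp transition (age of inception of disease), the screening point (early diagnosis), and the Sp → Sc transition (clinical diagnosis). The duration of the pre-clinical state runs from inception to clinical diagnosis; the lead time (forward recurrence time) runs from the screening point to clinical diagnosis.]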
Length Biased Sampling. Consider a population of cases. [Figure: horizontal lines of varying length plotted against time, crossed by a vertical line at the screening point.] • Horizontal line: duration of time in the pre-clinical state. • Diagnosis: equivalent to placing a random vertical line. An intersection represents a case diagnosed. • The vertical line is more likely to intersect longer horizontal lines.
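The picture of a vertical line crossing horizontal lines can be checked with a small simulation. The sketch below is illustrative only: the exponential sojourn distribution, the 2-year mean, the uniform onset times, and the function name `simulate_length_bias` are all assumptions made for the example, not part of the lecture.

```python
import random


def simulate_length_bias(n_cases=200_000, mean_sojourn=2.0, window=100.0, seed=0):
    """Compare pre-clinical durations of all cases with screen-detected cases.

    Each case is a horizontal line: it starts at a uniform onset time and
    lasts for an exponentially distributed duration (assumed distribution).
    A single screen (the vertical line) is placed at the middle of the window
    and detects every case that is pre-clinical at that moment.
    """
    rng = random.Random(seed)
    screen_time = window / 2.0
    all_durations = []
    detected_durations = []
    for _ in range(n_cases):
        onset = rng.uniform(0.0, window)
        duration = rng.expovariate(1.0 / mean_sojourn)
        all_durations.append(duration)
        if onset <= screen_time < onset + duration:
            detected_durations.append(duration)
    population_mean = sum(all_durations) / len(all_durations)
    detected_mean = sum(detected_durations) / len(detected_durations)
    return population_mean, detected_mean


population_mean, detected_mean = simulate_length_bias()
print(f"Mean duration, all cases:        {population_mean:.2f}")
print(f"Mean duration, screen-detected:  {detected_mean:.2f}")
```

For an exponential duration the length-biased mean is twice the population mean, so the screen-detected cases should show a mean close to double that of all cases, even though the screen does nothing to the disease.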
Lead Time Bias: Usual Care [Figure: age axis with the S0 → Sp transition at age 50, the Sp → Sc transition (clinical diagnosis) at age 55, and death at age 60.] Survival from clinical diagnosis = 60 – 55 = 5 years. S0 = disease free state, Sp = pre-clinical state, Sc = clinical state.
Early Detection But Survival Is Not Enhanced [Figure: age axis with the S0 → Sp transition at age 50, the screening point and diagnosis at age 53, the Sp → Sc transition (diagnosis under usual care) at age 55, and death at age 60.] Survival from screening diagnosis = 60 – 53 = 7 years. Survival with usual care diagnosis = 60 – 55 = 5 years.
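The arithmetic of the two slides above can be written out as a short sketch. The ages are the ones shown on the slides; the variable names are illustrative.

```python
# Ages taken from the slides.
screen_diagnosis_age = 53    # diagnosis at the screening point
clinical_diagnosis_age = 55  # diagnosis under usual care
death_age = 60               # death occurs at the same age in both scenarios

survival_usual_care = death_age - clinical_diagnosis_age  # 5 years
survival_screened = death_age - screen_diagnosis_age      # 7 years
lead_time = clinical_diagnosis_age - screen_diagnosis_age # 2 years

# The apparent gain in survival is exactly the lead time: the age at death
# has not changed, only the age at which the survival clock starts.
apparent_gain = survival_screened - survival_usual_care
print(survival_usual_care, survival_screened, lead_time, apparent_gain)
```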
Dynamics of the Natural History (1): Usual Care. Disease states • S0: Disease free state: disease free or disease state which cannot be detected. • Sp: Pre-clinical state: asymptomatic with no signs/symptoms. • Sc: Clinical state: when diagnosed by routine methods. • Sd: Death state (death due to disease). [Figure: age axis with the S0 → Sp transition at age x (not observed), the Sp → Sc transition at age t (disease incidence), and the Sc → Sd transition at age y (death from disease). Survival = (y – t).] Usual care: disease is diagnosed and treated at t.
Dynamics with Screening (2): Exam detected case at ts. S0 = disease free, Sp = pre-clinical, Sc = clinical, Sd = death from disease. [Figure: age axis with x (S0 → Sp), ts (exam detection; disease interrupted at ts), t (Sp → Sc, not observed), and y (death from disease, Sd). Lead time spans ts to t; observed survival spans ts to y; imputed survival spans t to y.] • Ages t and x are not observed. • Treatment begins at ts. • Observed survival time is (y – ts). • (t – ts) is the lead time. • Imputed survival = survival with origin t = (observed survival) – (lead time) = (y – ts) – (t – ts) = y – t.
Dynamics with Screening (3): Exam detected case at ts, with multiple exams. S0 = disease free, Sp = pre-clinical, Sc = clinical, Sd = death from disease. [Figure: age axis with exam times t0, t1, …, tj-1, tj, …, together with x (S0 → Sp), ts (exam detection), t (Sp → Sc), and y (death from disease, Sd). Lead time spans ts to t; observed survival spans ts to y; imputed survival spans t to y.] • Ages t and x are not observed. • Treatment begins at ts. • Observed survival time is (y – ts). • (t – ts) is the lead time. • Imputed survival = survival with origin t = (observed survival) – (lead time). • There may be an unknown number of false negative exams.
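Dynamics with Screening (3): Interval case. The case is diagnosed clinically between the exams at tr-1 and tr. [Figure: time axis with exam times t0, t1, …, tj-1, tj, …, tr-1, tr, together with x (S0 → Sp), t (Sp → Sc, falling between tr-1 and tr), and y (death from disease, Sd).] Survival = (y – t). Exams at t0 < t1 < … < tr-1.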
Notes on Modeling • Survival begins at the point of clinical diagnosis for the usual care group (control). • In order to make comparisons with the control group, all cases in the screened group (early diagnosis, interval) must have survival beginning at the point of “clinical diagnosis”. This is true for interval cases, but not true for screen diagnosed cases. • It is necessary for the model to subtract the lead time (a random variable, not observed) from survival for screened cases so that survival is measured from the point of imputed clinical diagnosis (not observed). • Screened cases are subject to length biased sampling. This feature must be incorporated in the model.
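The effect of measuring survival from the wrong origin can be illustrated with a simulation in which screening confers no benefit at all. Everything in the sketch below is an assumption made for illustration: exponential sojourn and survival times, the chosen means, a single perfectly sensitive exam at age 55, and the function name `simulate_lead_time_bias`. In the simulation the lead time is known; in real data it is not observed and has to come from the model.

```python
import random


def simulate_lead_time_bias(n=200_000, screen_age=55.0, mean_sojourn=4.0,
                            mean_survival=10.0, seed=1):
    """Mean survival under three definitions when screening has no benefit.

    x: age entering the pre-clinical state, t: age of clinical diagnosis,
    y: age at death. Death age y is unchanged by screening (no benefit).
    """
    rng = random.Random(seed)
    clinical, observed, imputed = [], [], []
    for _ in range(n):
        x = rng.uniform(40.0, 70.0)
        t = x + rng.expovariate(1.0 / mean_sojourn)
        y = t + rng.expovariate(1.0 / mean_survival)
        clinical.append(y - t)               # usual care: origin is t
        if x <= screen_age < t:              # pre-clinical at the exam
            observed.append(y - screen_age)  # origin is the exam age ts
            imputed.append(y - t)            # observed survival minus lead time
    mean = lambda values: sum(values) / len(values)
    return mean(clinical), mean(observed), mean(imputed)


clinical_mean, observed_mean, imputed_mean = simulate_lead_time_bias()
print(f"Usual care survival (origin t):       {clinical_mean:.2f}")
print(f"Screen-detected, observed (origin ts): {observed_mean:.2f}")
print(f"Screen-detected, imputed (origin t):   {imputed_mean:.2f}")
```

The observed survival of screen-detected cases is inflated by the lead time, while the imputed survival matches the usual care group. Survival is drawn independently of sojourn time here, so length biased sampling does not affect the imputed mean in this sketch; a model for real data has to handle both.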
Applications to Breast and Prostate Cancer • Breast cancer screening (mammography): Benefit for women in their 40’s? Public health programs: choosing screening intervals according to risk; comparison of U.S., U.K., and Nordic countries. • Prostate cancer: Over diagnosis.
Data Inputs for Breast Cancer Applications : ( From Clinical Trials) • Mean sojourn time in pre-clinical state varies by age: ♦ age 40: ~ 2 years ♦ age 50 and above: ~ 4 years • Sensitivity varies by age: ♦ age 40: sensitivity ~ 0.7 ♦ age 50 and above: sensitivity ~ 0.9
Screening Younger Women [40, 49] for Breast Cancer Using Mammography • There is a dispute about whether women in their 40’s benefit from screening (clinical trials are inadequate in this age group). • Screening women in age group [40, 49]: relatively low chance of developing breast cancer; mammogram sensitivity is lower for this age group; relatively high cost. • 1997 NIH Consensus Development Panel: review of data from 8 clinical trials. “The available data did not warrant a single recommendation for all women in their forties.” • Nevertheless, ACS and NCI recommend screening women in their 40’s.
Use of Model: Evaluating Benefit for Women Aged 40-49 • Strategy: Compare the mortality of a screened group (exams only for women in their 40’s) with a control group. • Note that these subjects may die of disease past the age of 49. The population who were in the pre-clinical state in their 40’s is the target population who can benefit. • Clinical trials and recent data indicate a stage shift (relative to usual diagnosis) with early detection for this age group. Node negative (good prognosis): ~77% (screening) vs. 53% (usual care).
Mortality Reduction: Screening in 40’s Only. Mortality reduction counts all breast cancer deaths for ages 40-79. Exam schedules by age: • 40, 41, 42, 43, 44, 45, 46, 47, 48, 49 (annual) • 40, 42, 44, 46, 48, 49 • 40, 43, 46, 49 • 40, 45, 49 • 40, 49. Conclusion: Women benefit from screening in their 40’s. However, it would take an enormous clinical trial to demonstrate this benefit.
Public Health Programs: Choosing Exam Schedules • An exam schedule consists of the initial age to begin screening, the time between exams, and the age to terminate exams. • The schedule should depend on risk status. Risk status depends on: • Natural history of disease (most chronic diseases are age dependent): model for disease; incidence, prevalence; special factors such as family history, co-morbid diseases, etc. • Characteristics of examinations: sensitivity; specificity. • Costs.
Equal Intervals Between Exams • When are equal intervals optimal? • Equal intervals between exams are optimal if and only if disease incidence is independent of time (age). • This is not true for many chronic diseases: incidence may increase with age. • Hence many recommendations are sub-optimal.
Choosing Intervals According to Risk: Threshold Method • Choose an age t0 to begin the initial screening exam. This age corresponds to a probability P(t0) of being in the pre-clinical state (calculated from the model). • Have an exam whenever the probability of an individual being in the pre-clinical state reaches this threshold probability. • Alternatively, choose a threshold probability (P0) and have exams at ages ti whenever P(ti) = P0.
Illustration of Threshold Method: Breast Cancer. Exam whenever risk is the same as at age 50. [Figure: probability of being in the pre-clinical state (0 to 0.008) versus age in years (40 to 90), showing the intervals between exams.]
Threshold Method: • Women: ages 50-79 • Threshold value = P0(50)=0.0062 • 11 exams at ages (rounded) 50, 54, 57, 61, 63, 66, 69, 71, 74, 76, 78. • Avg. interval between exams = 2.5 years • Proportion of cases diagnosed by screening exam for ages 50-79 = 73% • Proportion of cases diagnosed by screening exam for ages 0-79 = 61%
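The threshold rule can be sketched numerically under a deliberately simple stand-in model. The sketch below is not the model used in the lecture: it assumes an exponential sojourn time, so that the probability P of being in the pre-clinical state follows dP/dt = w(t) – P/m, it assumes that each exam removes a fraction of pre-clinical cases equal to the sensitivity, and it uses a made-up linear incidence function. All names and numerical values are hypothetical, so the output will not reproduce the ages listed on the slide.

```python
def incidence(age):
    """Hypothetical yearly rate of entering the pre-clinical state."""
    return max(0.0, 0.0015 * (1.0 + 0.03 * (age - 50.0)))


def threshold_schedule(first_exam=50.0, last_age=79.0, mean_sojourn=4.0,
                       sensitivity=0.9, start_age=20.0, dt=0.01):
    """Return (threshold, exam ages) under the simplified model above.

    The threshold P0 is the pre-clinical probability at the first exam age;
    a new exam is scheduled each time the probability climbs back to P0.
    """
    p = 0.0
    threshold = None
    exams = []
    steps = int(round((last_age - start_age) / dt))
    for i in range(steps + 1):
        age = start_age + i * dt
        if threshold is None and age >= first_exam - 1e-9:
            threshold = p                 # P0 = P(first exam age)
        if threshold is not None and p >= threshold:
            exams.append(round(age, 2))
            p *= (1.0 - sensitivity)      # detected cases leave the pre-clinical pool
        # Euler step: inflow from S0, outflow to the clinical state.
        p += dt * (incidence(age) - p / mean_sojourn)
    return threshold, exams


threshold, exams = threshold_schedule()
print(f"Threshold P0 = {threshold:.4f}")
print("Exam ages:", exams)
```

Because the assumed incidence rises with age, the probability returns to the threshold faster at older ages, so the intervals between exams shorten over time. This is the qualitative pattern of the schedule on the slide.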
Mammogram Exam Schedules for Ages [50, 79] • Annual: U.S. (ACS/NCI recommendation) • Every 2 years: Scandinavian recommendation • Every 3 years: U.K. recommendation. Mortality Reduction = [Mortality (controls) – Mortality (screened)] / Mortality (controls)
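The mortality reduction defined above is a relative difference. A minimal sketch follows; the function name and the example mortality figures are hypothetical and are not results from the lecture.

```python
def mortality_reduction(mortality_controls, mortality_screened):
    """Relative reduction in disease mortality for a screened group."""
    if mortality_controls <= 0:
        raise ValueError("control mortality must be positive")
    return (mortality_controls - mortality_screened) / mortality_controls


# Hypothetical example: 40 vs. 30 deaths per 100,000 gives a 25% reduction.
example = mortality_reduction(40, 30)
print(f"Mortality reduction: {example:.0%}")
```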
Over Diagnosis: Prostate Cancer. Over diagnosis: lead time > residual survival, where residual survival is the time from early diagnosis to death from other causes. Background: The Prostate Specific Antigen (PSA) test is widely used to diagnose prostate cancer. A positive result triggers a biopsy. Nearly all cases diagnosed by PSA are asymptomatic. Question: Would the prostate cancer exhibit clinical symptoms during a man’s lifetime? If not, the PSA diagnosis is an over diagnosis. [Figure: age axis showing the S0 → Sp transition, the PSA diagnosis, death, and the Sp → Sc transition that would have occurred after death; the lead time runs from the PSA diagnosis to the Sp → Sc transition.]
Numerical Calculation: Prostate Cancer • Men, ages 50 to 80, have a positive PSA test which leads to a positive biopsy. What is the probability of over diagnosis? • Prob {no clinical cancer in man’s lifetime | PSA diagnosis at age A} • The probability of over diagnosis depends on age and on the mean sojourn time in the pre-clinical state.
[Figure: probability of over diagnosis (0.0 to 1.0) versus age of early detection (50 to 80), with one curve for each mean sojourn time: 5, 7.5, 10, 12.5, and 15 years.] Probability of over diagnosis conditional on age of early detection: Prostate cancer.
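The definition "lead time > residual survival" can be turned into a rough Monte Carlo sketch. The assumptions here are strong and are not from the lecture: the lead time is taken to be exponential with mean equal to the mean sojourn time, and the residual lifetime comes from a Gompertz hazard whose parameters `b` and `c` are illustrative values, not fitted to any life table. The function names are hypothetical. With an exponential lead time L and residual survival R, P(L > R) = E[exp(–R / mean sojourn)], which is what the code averages.

```python
import math
import random


def residual_life(age, rng, b=6e-5, c=0.088):
    """Draw years until death from other causes for a man alive at `age`.

    Uses inverse-transform sampling from a Gompertz hazard b * exp(c * age).
    The parameters b and c are illustrative assumptions only.
    """
    u = 1.0 - rng.random()  # uniform on (0, 1], avoids log(0)
    return math.log(1.0 - c * math.log(u) / (b * math.exp(c * age))) / c


def prob_over_diagnosis(age, mean_sojourn, n=100_000, seed=0):
    """Estimate P(lead time > residual survival | diagnosis at `age`)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        r = residual_life(age, rng)
        total += math.exp(-r / mean_sojourn)  # P(exponential lead time > r)
    return total / n


for age in (50, 60, 70, 80):
    row = [prob_over_diagnosis(age, m, n=20_000) for m in (5.0, 10.0, 15.0)]
    print(age, [f"{p:.2f}" for p in row])
```

Under these assumptions the estimated probability rises with both the age at diagnosis and the mean sojourn time, which is the direction of dependence stated on the slide. The actual values depend entirely on the assumed distributions.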
Conclusions • Early detection of chronic diseases has the potential for significant benefit (lower mortality, increased cure rates). • Current recommendations for special exam programs are not based on analytic considerations weighing costs vs. benefits. • Clinical trials to evaluate benefit require long term follow-up. Statistical models may be able to predict outcome using early clinical trial data. • The advances in genomics are likely to generate candidate markers which may be used for the early detection of disease. This requires a way of carrying out clinical trials which do not take a long time to complete. There is also a need to estimate the probability of over diagnosis with the discovery of markers.
My Collaborators • Sandra J. Lee, Dana-Farber Cancer Institute and Harvard School of Public Health • Yu Shen, M.D. Anderson Cancer Center • Ping Hu, National Cancer Institute • Ori Davidov, Haifa University
Why Would Screening Result in Benefit? • If screen diagnosed cases are found in an earlier disease stage compared to usual care, then there is likely to be benefit. This is referred to as a stage shift. • Stage shift can be due to a long lead time; i.e. cases are diagnosed before they transit to a more advanced prognostic stage. • Stage shift may also arise from length biased sampling. The selection of cases by screening may also be associated with earlier prognostic stages.
Natural History of Disease: Time in Stages I and II [Figure: time (age) axis from the S0 → Sp transition to the Sp → Sc transition, showing the time spent in Stage I and in Stage II.] S0: Disease free state or cannot be detected. Sp: Pre-clinical state. Sc: Clinical state.
Stage Shift and Earlier Diagnosis [Figure: time (age) axis from the S0 → Sp transition to the Sp → Sc transition, showing Stage I, Stage II, and the point of early diagnosis.] S0: Disease free state or cannot be detected. Sp: Pre-clinical state. Sc: Clinical state. Note: the longer the mean lead time, the greater the probability of diagnosing disease in an earlier stage.
Mean lead time is calculated from the theoretical distribution. Proportion of negative nodes is data from clinical trials.