Understanding and Evaluating Screening Programs in Health

Evaluating Screening Programs Dr. Jørn Olsen Epi 200B January 19, 2010

Definition of Screening • Last: The presumptive identification of unrecognized disease or defect by the application of tests, examinations or other procedures which can be applied rapidly. Screening tests sort out apparently well persons who probably have a disease from those who probably do not. A screening test is not intended to be diagnostic. Persons with positive or suspicious findings must be referred to their physicians for diagnosis and necessary treatment.

Screening is not about diagnosing patients. • The aim is to identify people at high risk of having the disease. • The screening test is not a diagnostic test. • A positive screening test is a cue for using a diagnostic test, not a cue for treatment.

Which conditions speak in favor of a screening program?

A health problem that can be treated • An acceptable test with no or few side effects

Justification for screening: early treatment improves prognosis at reasonable cost • Screening requires a screening test. • We talk about a test’s sensitivity, specificity and predictive value. • What characterizes a good screening test?

Safe • Acceptable • Inexpensive • Good predictive values

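A population, cross-classified by test result and disease status (a = true positives, b = false positives, c = false negatives, d = true negatives; M1 = a + c diseased, M2 = b + d non-diseased, M3 = a + b test positive):
sens = a/M1 • spec = d/M2 • predictive value of a positive test = a/M3
Parameters:
sens = $P(\text{test}+ \mid D)$ • spec = $P(\text{test}- \mid \bar{D})$
Predictive value of a positive test = $P(D \mid \text{test}+)$ • Predictive value of a negative test = $P(\bar{D} \mid \text{test}-)$
These are conditional probabilities; $\bar{D}$ denotes absence of the disease.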

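Predictive value of a positive test:
$$P(D \mid \text{test}+) = \frac{PP \times \text{sens}}{PP \times \text{sens} + (1 - PP)(1 - \text{spec})}$$
Predictive value of a negative test:
$$P(\bar{D} \mid \text{test}-) = \frac{(1 - PP) \times \text{spec}}{PP \times (1 - \text{sens}) + (1 - PP) \times \text{spec}}$$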
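The two predictive-value formulas can be sketched in Python as follows. The function name `predictive_values` and the input values are illustrative assumptions, not part of the lecture; the inputs were chosen only so that the result is easy to check by hand.

```python
def predictive_values(pp, sens, spec):
    """Return (PPV, NPV) from prevalence proportion, sensitivity and specificity."""
    true_pos = pp * sens                # diseased and test positive
    false_pos = (1 - pp) * (1 - spec)   # non-diseased and test positive
    true_neg = (1 - pp) * spec          # non-diseased and test negative
    false_neg = pp * (1 - sens)         # diseased and test negative
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Hypothetical inputs, used only to illustrate the call.
ppv, npv = predictive_values(pp=0.10, sens=0.90, spec=0.90)
print(f"PPV = {ppv:.3f}, NPV = {npv:.3f}")
```

With a prevalence proportion of 0.10, a test that is 90% sensitive and 90% specific yields a positive predictive value of only about one half, which is the point of the formula: the prior matters as much as the test.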

Bayes’ formula: the predictive value depends on sens, spec and PP, the prevalence proportion. It takes us from the prior probability, PP, to the a posteriori probability $P(D \mid \text{test}+)$. • In 1763 Richard Price presented a paper by Thomas Bayes, “An Essay towards solving a Problem in the Doctrine of Chances”.

Is epoxy carcinogenic?

Sens = 157/(18 + 157) = 0.90 • Spec = 94/108 = 0.87 • Epoxy had a positive Ames test: is there a 90% probability that it is carcinogenic? • That depends on the prior probability • Suppose 1% of all chemicals screened are carcinogenic

Predictive value of a positive test = 900/13770 = 6.5%: from 1% to 6.5% • Predictive value of a negative test = 86130/86230 = 99.9%: from 99% to 99.9%
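The counts behind these two fractions can be reproduced with a short Python sketch. The population size of 100,000 chemicals is an assumption: it is not stated on the slide, but it is the size that gives the denominators 13770 and 86230. The function name `screen_counts` is illustrative.

```python
def screen_counts(n, pp, sens, spec):
    """Expected 2x2 cell counts (a, b, c, d) when n items are screened."""
    diseased = n * pp
    healthy = n - diseased
    a = diseased * sens   # true positives
    c = diseased - a      # false negatives
    d = healthy * spec    # true negatives
    b = healthy - d       # false positives
    return a, b, c, d


# Assumed population of 100,000 chemicals, 1% of them carcinogenic.
a, b, c, d = screen_counts(n=100_000, pp=0.01, sens=0.90, spec=0.87)
ppv = a / (a + b)   # carcinogenic among test positives
npv = d / (c + d)   # non-carcinogenic among test negatives
print(f"test positive: {a + b:.0f}, of which carcinogenic: {a:.0f}")
print(f"test negative: {c + d:.0f}, of which non-carcinogenic: {d:.0f}")
print(f"PPV = {ppv:.3f}, NPV = {npv:.3f}")
```

Writing the calculation as counts rather than probabilities makes it visible that the 12,870 false positives swamp the 900 true positives.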

Benefits and side effects of screening
• a: True positives detected at screening; they would benefit if detected before the critical point
• c: False negatives are diseased but not detected at screening. Screening may delay their diagnosis
• b: False positives are called in for diagnostic work-up; they are worried, and diagnostic tests may carry risks
• d: True negatives are happy and like the program
Main design issue: screening may have positive as well as negative effects. The sensitivity and specificity of the test are key parameters, together with the nature of the test, the disease and its treatment.

Test values for HEME Select (colorectal cancer), population data • Sens = 22/32 = 0.688 • Spec = 7043/7461 = 0.944 • Predictive value of a positive test = 22/440 = 0.050
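As a sketch, the same measures can be computed directly from the four cell counts implied by these fractions (a = 22, c = 32 − 22 = 10, d = 7043, b = 7461 − 7043 = 418). The function name `two_by_two_measures` is an illustrative assumption, and the negative predictive value it returns is derived here rather than quoted from the slide.

```python
def two_by_two_measures(a, b, c, d):
    """Test measures from a 2x2 table.

    a = true positives, b = false positives,
    c = false negatives, d = true negatives.
    """
    return {
        "sens": a / (a + c),
        "spec": d / (b + d),
        "ppv": a / (a + b),
        "npv": d / (c + d),
    }


# Cell counts implied by the population data above.
m = two_by_two_measures(a=22, b=418, c=10, d=7043)
for name, value in m.items():
    print(f"{name} = {value:.3f}")
```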

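In a clinical setting the data could be • Predictive value of a positive test = 688/744 = 0.925 (consistent with 688 true positives among 1,000 patients with the disease and 56 false positives among 1,000 without it)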

Sensitivity will often depend on the stage of the disease and may well be lower for early stages. • The predictive value of the test depends closely on the prevalence proportion of the disease. • For this test, the predictive value of a positive test is 0.11 if colon cancer has a prevalence proportion of 0.01, and 0.01 if PP is 0.001.
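The dependence on prevalence can be tabulated with a small Python sketch that uses the rounded HEME Select values (sens = 0.688, spec = 0.944). The prevalence proportions 0.1 and 0.5 are added here as assumptions to extend the range; only 0.001 and 0.01 come from the slide.

```python
def ppv(pp, sens, spec):
    """Predictive value of a positive test via Bayes' formula."""
    return pp * sens / (pp * sens + (1 - pp) * (1 - spec))


sens, spec = 0.688, 0.944
# 0.001 and 0.01 are the slide's values; 0.1 and 0.5 are illustrative additions.
for pp in (0.001, 0.01, 0.1, 0.5):
    print(f"PP = {pp:<5}  PPV = {ppv(pp, sens, spec):.3f}")
```

The same test moves from a predictive value of about 1% in a very low-prevalence population to above 90% when half of those tested have the disease.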

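Likelihood ratios (LR): an easy way to use Bayes’ theorem
$$\mathrm{LR}^{+} = \frac{P(\text{test}+ \mid D)}{P(\text{test}+ \mid \bar{D})} = \frac{\text{sens}}{1 - \text{spec}}$$
$$\mathrm{LR}^{-} = \frac{P(\text{test}- \mid D)}{P(\text{test}- \mid \bar{D})} = \frac{1 - \text{sens}}{\text{spec}}$$
$$\text{Prior odds} = \frac{\text{prior probability}}{1 - \text{prior probability}}$$
$$\text{Posterior odds} = \text{prior odds} \times \mathrm{LR}$$
$$\text{Posterior probability} = \frac{\text{posterior odds}}{1 + \text{posterior odds}}$$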

Screening for alcoholism; test sens = 0.90, spec = 0.60. If the prior probability of alcoholism is 0.30, then:
Prior odds = 0.30/0.70 = 0.43
LR+ = 0.90/0.40 = 2.25
Posterior odds = 0.43 × 2.25 = 0.97
Posterior probability = 0.97/(1 + 0.97) = 0.49
You have increased your probability from 0.30 to 0.49 given that the test was positive.
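The odds-based update can be sketched in Python as below. The function name `posterior_probability` and its `test_positive` flag are illustrative assumptions; the result for a negative test is computed here and does not appear on the slide.

```python
def posterior_probability(prior, sens, spec, test_positive=True):
    """Update a prior probability with a likelihood ratio."""
    if test_positive:
        lr = sens / (1 - spec)      # LR+
    else:
        lr = (1 - sens) / spec      # LR-
    prior_odds = prior / (1 - prior)
    posterior_odds = prior_odds * lr
    return posterior_odds / (1 + posterior_odds)


# Alcoholism example: prior 0.30, sens 0.90, spec 0.60.
post_pos = posterior_probability(0.30, 0.90, 0.60)
post_neg = posterior_probability(0.30, 0.90, 0.60, test_positive=False)
print(f"after a positive test: {post_pos:.2f}")
print(f"after a negative test: {post_neg:.2f}")
```

Working on the odds scale means a second, independent test could be applied by multiplying the posterior odds by that test's likelihood ratio, without going back to the full Bayes formula.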

Screening may have negative as well as positive effects; a screening program should therefore be evaluated. It is not enough to show that those who were detected in a screening program had a longer survival than those not screened. For this patient, the clinical survival time is td - tc and the screening survival time is td - ts, which is tc - ts longer. This time interval produces “lead time bias”.
[Figure: timeline from healthy to dead, marking ts (detection at screening), tc (clinical diagnosis) and td (death).]
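Lead time bias can be illustrated with a few lines of Python. The ages used are hypothetical, chosen only for illustration, and the function name `survival_times` is an assumption. The time of death td is the same in both scenarios, so screening has changed nothing for this patient except the starting point of the clock.

```python
def survival_times(ts, tc, td):
    """Survival measured from screen detection (ts) and from clinical diagnosis (tc)."""
    screening_survival = td - ts
    clinical_survival = td - tc
    lead_time = tc - ts
    return screening_survival, clinical_survival, lead_time


# Hypothetical ages in years: detected at screening at 60,
# clinical diagnosis would have come at 63, death at 68 either way.
screened, clinical, lead = survival_times(ts=60, tc=63, td=68)
print(f"survival after screen detection:   {screened} years")
print(f"survival after clinical diagnosis: {clinical} years")
print(f"lead time:                         {lead} years")
```

The apparent survival gain equals the lead time exactly, which is why survival from diagnosis is not a valid outcome for comparing screened and unscreened patients.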

[Figure: incidence rate (IR) after screening, compared with no screening.]

All classical designs have been used in evaluating screening programs. Main concerns: since screening programs usually have both positive and negative effects, the case-control design may not be the best choice. Why is that?
• RCT: needs to be large, may be out of date when finished, unbiased cause-specific mortality may be difficult to obtain, difficult to randomize at the individual level. Does not address normal practice. No “confounding by indication” argument for doing an RCT.
• Follow-up: who complies with the program, high risk or low risk?
• Case-control: no possibility to include all effects of interest
• Ecological: ecological fallacy, but may be the best evidence after all

Additional design issues • Screening may address an early pre-disease lesion (adenoma) or cancer at an early stage. In the first situation, screening may reduce incidence but may have little impact on case fatality. In the second situation, screening should reduce incidence (and case fatality?). In both situations, cause specific mortality should be reduced (and total mortality?).

Additional design issues • A case-control study addressing the first issue includes incident cases. For the second issue, cases are cause-specific deaths. • The source population consists of those who are invited to be screened and belong to the population at risk.

Additional design issues • Incidence density sampling of controls is usually the only option. • Exposure is being screened in a given time interval up to case selection.
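A minimal sketch of incidence density (risk-set) sampling in Python follows. The cohort, the field names (`exit`, `case`, `screens`), the two-year exposure window and both function names are hypothetical, introduced only for illustration. For each case, controls are drawn from everyone still under follow-up at the case's event time, and exposure is defined as having been screened within the window before that time.

```python
import random


def density_sample(cohort, controls_per_case=2, seed=0):
    """For each case, sample controls from those still at risk at the case's event time."""
    rng = random.Random(seed)
    matched_sets = []
    for case in (p for p in cohort if p["case"]):
        t = case["exit"]
        # Risk set: everyone else still under follow-up at t (future cases are eligible).
        risk_set = [p for p in cohort if p["id"] != case["id"] and p["exit"] >= t]
        k = min(controls_per_case, len(risk_set))
        matched_sets.append((case, rng.sample(risk_set, k)))
    return matched_sets


def screened_in_window(person, t, window):
    """Exposure: at least one screening in the interval [t - window, t)."""
    return any(t - window <= s < t for s in person["screens"])


# Hypothetical cohort: 'exit' is the time of event or end of follow-up (years).
cohort = [
    {"id": 1, "exit": 4.0, "case": True, "screens": [1.0, 3.0]},
    {"id": 2, "exit": 9.0, "case": False, "screens": [2.0]},
    {"id": 3, "exit": 6.5, "case": True, "screens": []},
    {"id": 4, "exit": 10.0, "case": False, "screens": [5.0, 8.0]},
    {"id": 5, "exit": 3.0, "case": False, "screens": [1.5]},
    {"id": 6, "exit": 10.0, "case": False, "screens": []},
]

sets = density_sample(cohort, controls_per_case=2, seed=0)
for case, controls in sets:
    t = case["exit"]
    exposure = [(p["id"], screened_in_window(p, t, window=2.0)) for p in [case] + controls]
    print(f"case {case['id']} at t = {t}: (id, exposed) = {exposure}")
```

Controls are assessed for exposure at the case's event time, not at their own end of follow-up, so that cases and controls have had the same opportunity to be screened.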

Gotzsche et al. Is screening for breast cancer with mammography justifiable? Lancet. 2000;355:129-34.

France • In France, breast cancer (BC) incidence is increasing • Cause-specific mortality rates are stable • BC screening is increasing • How could this be explained?

Randomized trials include 500,000 women, but results differ and no conclusion has been reached. Review of trials according to: • quality of randomization • blinding of outcome assessment • exclusion after randomization

New York trial • Pairs of women were matched and the pairs were randomized, but there was imbalance on previous lump in the breast, menopause and education

Edinburgh trial • Cluster randomization of GPs • Difference in social conditions

Canadian trial Individual randomization

Stockholm trial • Allocated according to date of birth • Those born on the 11th to 20th of any month were allocated to the control group • Inconsistency in numbers

Gothenburg trial • Date of birth and individual randomization • Difference in age

Other Swedish trials • Randomization of counties • Differences in age

Peter C Gotzsche, et al. Public Health

Other quality criteria?

Conclusion: Screening for breast cancer with mammography is unjustified?
