Is it True? Evaluating Research about Diagnostic Tests

The Case of Baby Jeff
• CPK testing for Muscular Dystrophy
• Sensitivity: 100%
• Specificity: 99.98%
• Prevalence: 1 in 5,000 (0.02%)

100,000 newborn boys (prevalence = 1 in 5,000 = 0.02%)
• 20 will have M.D. (sensitivity 100%): 20 correctly positive, 0 false negative
• 99,980 with no M.D. (specificity 99.98%): 99,960 correctly negative, 20 false positive
Positive results: 40 positive tests, 50% truly positive, 50% falsely positive (PPV 50%)
Negative results: 99,960 negative tests, 100% truly negative, 0 falsely negative (NPV 100%)
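The tally on this slide can be reproduced with a few lines of Python. This is a minimal sketch, not part of the original deck: the function name `natural_frequencies` and its dictionary keys are illustrative, and counts are rounded to whole patients as the slide does.

```python
def natural_frequencies(population, prevalence, sensitivity, specificity):
    """Split a population into the four test outcomes (whole patients)."""
    diseased = round(population * prevalence)
    healthy = population - diseased

    true_positive = round(diseased * sensitivity)
    false_negative = diseased - true_positive
    true_negative = round(healthy * specificity)
    false_positive = healthy - true_negative

    positive_tests = true_positive + false_positive
    negative_tests = true_negative + false_negative

    return {
        "true_positive": true_positive,
        "false_positive": false_positive,
        "false_negative": false_negative,
        "true_negative": true_negative,
        # Predictive values are undefined if no test falls in that group.
        "ppv": true_positive / positive_tests if positive_tests else None,
        "npv": true_negative / negative_tests if negative_tests else None,
    }


# Baby Jeff: CPK screening for muscular dystrophy in 100,000 newborn boys.
baby_jeff = natural_frequencies(100_000, 1 / 5000, 1.0, 0.9998)
print(baby_jeff["true_positive"], baby_jeff["false_positive"])  # 20 20
print(baby_jeff["ppv"], baby_jeff["npv"])                       # 0.5 1.0
```

Even with a nearly perfect test, half of the positive results are false, because the 20 false positives come from the much larger unaffected group.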

Why this is important http://today.msnbc.msn.com/id/42829175

Other examples
• Lyme disease (sensitivity = 95%; specificity = 95%)
  • High prevalence (20%): PPV = 83%
  • Low prevalence (2%): PPV = 28%
• Echocardiogram as part of executive physical
  • Prevalence = 10%; PPV = 50%
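The Lyme disease figures follow from Bayes' rule applied to sensitivity, specificity, and prevalence. The sketch below is illustrative only; the function name `positive_predictive_value` is not from the original slides.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """PPV = P(disease | positive test), by Bayes' rule."""
    true_positive_rate = sensitivity * prevalence
    false_positive_rate = (1 - specificity) * (1 - prevalence)
    return true_positive_rate / (true_positive_rate + false_positive_rate)


# Same Lyme test (95% sensitive, 95% specific) at two prevalences.
for prevalence in (0.20, 0.02):
    ppv = positive_predictive_value(0.95, 0.95, prevalence)
    print(f"prevalence {prevalence:.0%}: PPV {ppv:.0%}")
```

The test itself is unchanged between the two runs; only the prevalence differs, which is why the same positive result means something different in a high-risk and a low-risk population.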

Technical vs. Clinical Precision
• Technical precision (sensitivity, specificity): unaffected by prevalence
• Clinical precision (predictive values): changed by prevalence

Predictive Values
• Positive Predictive Value: the percentage of patients with a positive test who have the disease
• Negative Predictive Value: the percentage of patients with a negative test who don’t have the disease

Let’s practice Task 1. A serum test screens pregnant women for babies with Down’s syndrome. The test is a very good one, but not perfect. Roughly, 1% of babies have Down’s syndrome. If the baby has Down’s syndrome, there is a 90% chance that the result will be positive. If the baby is unaffected, there is still a 1% chance that the result will be positive. A pregnant woman has been tested and the result is positive.

1,000 similar patients
• Prevalence = 1% = ___ patients/1,000?
• Positive: 90% correctly identified
• Negative: 99% correctly identified
Positive results: ___
Negative results: ___

Down’s Syndrome: 1,000 similar patients (prevalence = 1%)
• 10 with Down’s (positive: 90% correctly identified): 9 correctly positive, 1 false negative
• 990 with no Down’s (negative: 99% correctly identified): 980 correctly negative, 10 false positive
Positive results: 19 positive tests, 47.4% truly positive (9 of 19), 52.6% falsely positive
Negative results: 981 negative tests, 99.9% truly negative (980 of 981), 0.1% falsely negative

Task 2. A 45-year-old woman presents with a sore throat and cough but without fever, tonsillar exudate, or cervical nodes. Using a clinical decision rule, you determine her likelihood of having strep throat is 1%. However, according to your office protocol, your medical assistant has already performed a rapid strep (antigen) test, which is positive. What is the likelihood the patient has strep throat now? Antigen test: sensitivity 88%, specificity 96%.

Strep throat: 100,000 similar patients (prevalence = 1%)
• 1,000 with strep (sensitivity 88%): 880 correctly positive, 120 false negative
• 99,000 viral (specificity 96%): 95,040 correctly negative, 3,960 false positive
Positive results: 4,840 positive tests, 18% truly positive, 82% falsely positive (PPV 18%)
Negative results: 95,160 negative tests, 99.87% truly negative, 0.126% falsely negative (NPV 99.87%)

Adopting new screening/diagnostic tests
• Sensitivity/specificity not enough
• Testing as an intervention
• Did the authors study an outcome patients care about?

Levels of “POEMness” for Diagnostic Tests
• Sensitivity & specificity
• Does it change diagnoses?
• Does it change treatment?
• Does it change outcomes?
• Is it worthwhile (to patients and/or society)?
(examples: HbA1C for DM, CPK vs T4/PKU in newborns, electron beam tomography for CAD)
Fryback DG, Thornbury JR. The efficacy of diagnostic imaging. Med Decis Making 1991; 11:88-94

Screening pulse oximetry for CHD
Diagnostic performance of abnormal pulse oximetry for congenital heart defects:
• For all major congenital defects: sensitivity 49.06%, specificity 99.016%, positive predictive value 13.33%, negative predictive value 99.86%
• For critical congenital defects: sensitivity 75%, specificity 99.12%, positive predictive value 9.23%, negative predictive value 99.97%
Lancet 2011 Aug 27;378(9793):785

Screening pulse oximetry for CHD
Jaundice, terminating breast-feeding, and the vulnerable child
Breast-feeding was more common in the jaundiced group (61% vs 79%; percentages here are given as comparison group vs jaundiced group). By 1 month, more mothers of jaundiced infants had completely stopped breast-feeding (19% vs 42%). They were also more likely to have never left the baby with anyone else (including the father), or to have left the baby at most once for less than 1 hour (15% vs 31%), and their infants had more well visits and more ED visits (2% vs 11%, not including bilirubin measurements). Thus, neonatal jaundice and its treatment may increase the risk of premature termination of breast-feeding and of development of the VULNERABLE CHILD SYNDROME.
Pediatrics 1989 Nov;84(5):773-8

Naming is not curing
• In the 1600s, astrology dominated medicine as a healing profession. Neither worked, but astrology was much more popular because it focused on fixing people's problems. Medicine, on the other hand, focused mainly on categorizing illnesses (i.e., diagnosing) and not so much on treatment.
• 400 years later there is still a priority on categorizing, regardless of whether it's helpful. A correct diagnosis is only useful when it results in the selection of a treatment that benefits the patient; otherwise, it's only a label.
James Burke. The Day the Universe Changed. Boston: Little, Brown and Company, 1985, p. 333.

[Figures: a series of 2×2 table slides labeled TEST + and TEST −, titled in turn Sensitivity, Specificity, Positive Predictive Value, and Negative Predictive Value]
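The four measures named on these slides are conventionally defined from the cells of a 2×2 table of test result against true disease status. The cell labels TP, FP, FN, and TN (true/false positives and negatives) are standard shorthand assumed here, not notation taken from the slides.

```latex
\begin{array}{l|cc}
                 & \text{Disease present} & \text{Disease absent} \\ \hline
\text{TEST } +   & TP                     & FP                    \\
\text{TEST } -   & FN                     & TN
\end{array}

\text{Sensitivity} = \frac{TP}{TP + FN}
\qquad
\text{Specificity} = \frac{TN}{TN + FP}

\text{PPV} = \frac{TP}{TP + FP}
\qquad
\text{NPV} = \frac{TN}{TN + FN}
```

Sensitivity and specificity are computed down the columns (within the diseased or the disease-free group), so they do not depend on prevalence. The predictive values are computed across the rows (within one test result), so they do.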

Likelihood Ratios
• Similar to the concepts of “ruling in” and “ruling out” disease
• Pre-test odds × LR = post-test odds
• The problem: we don’t think in terms of odds
• Clinical decision rules do the hard math for us, but we need to enter the appropriate data and interpret the results
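The odds arithmetic can be sketched as follows, using the strep throat numbers from Task 2 (pre-test probability 1%, sensitivity 88%, specificity 96%). The function names are illustrative, and the likelihood ratio formulas are the standard definitions rather than something stated on the slides.

```python
def positive_likelihood_ratio(sensitivity, specificity):
    """LR+ = P(test + | disease) / P(test + | no disease)."""
    return sensitivity / (1 - specificity)


def negative_likelihood_ratio(sensitivity, specificity):
    """LR- = P(test - | disease) / P(test - | no disease)."""
    return (1 - sensitivity) / specificity


def post_test_probability(pre_test_probability, likelihood_ratio):
    """Convert probability to odds, apply the LR, convert back."""
    pre_test_odds = pre_test_probability / (1 - pre_test_probability)
    post_test_odds = pre_test_odds * likelihood_ratio
    return post_test_odds / (1 + post_test_odds)


# Task 2: rapid strep test in a patient with a 1% pre-test probability.
lr_pos = positive_likelihood_ratio(0.88, 0.96)
lr_neg = negative_likelihood_ratio(0.88, 0.96)
print(f"LR+ = {lr_pos:.1f}, LR- = {lr_neg:.3f}")
print(f"after a positive test: {post_test_probability(0.01, lr_pos):.1%}")
print(f"after a negative test: {post_test_probability(0.01, lr_neg):.2%}")
```

The post-test probability after a positive result comes out at about 18%, the same answer as counting patients in the 100,000-patient tree, which shows that the two methods are equivalent ways of doing the same calculation.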

II. Are The Results Valid?
• Diagnostic test compared with the “Gold standard” on all patients
• Blinded comparison
• Independent testing
• Consecutive patient enrollment (adequate spectrum of disease)
(Must have all for LOE = 1b)

II. Are The Results Valid?
What are the results?
• Sensitivity, specificity and predictive values
• Likelihood ratio calculation
• Prevalence of disease in the study population: Typical? Similar to your practice?

