Evidence-Based Medicine: Effective Use of the Medical Literature. Edward G. Hamaty Jr., D.O., FACCP, FACOI
Diagnosis • A diagnosis study is a prospective study with an independent, blind comparison. • Diagnosis research design differs from the other research designs discussed in this module and is not represented in the levels-of-evidence pyramid. It involves comparing two or more diagnostic tests, all applied to the same study population. One of the tests is the reference standard, or “gold” standard; it serves as the benchmark against which the sensitivity and specificity of the other test are measured. Sensitivity and specificity are two measures that describe the efficacy of a diagnostic tool relative to the reference standard. • Sensitivity is the proportion of people with the target disorder who have a positive test result. • Specificity is the proportion of people without the target disorder who have a negative test result. • To reduce bias in diagnosis research, the reference standard and the test in question are applied independently, and the researchers interpreting each test are blinded to the results of the other. • Diagnosis research design is also used to evaluate screening tools.
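The two definitions above translate directly into arithmetic on a 2 x 2 table of test results against the reference standard. A minimal sketch in Python; the counts are illustrative, not taken from any study:

```python
def sensitivity(tp, fn):
    """Proportion of people WITH the target disorder who test positive."""
    return tp / (tp + fn)

def specificity(tn, fp):
    """Proportion of people WITHOUT the target disorder who test negative."""
    return tn / (tn + fp)

# Illustrative 2x2 counts (not from any real study):
# tp = true positives, fp = false positives,
# fn = false negatives, tn = true negatives
tp, fp, fn, tn = 90, 5, 10, 95

print(sensitivity(tp, fn))   # 0.9
print(specificity(tn, fp))   # 0.95
```

Note that sensitivity uses only the diseased column (tp and fn) and specificity only the disease-free column (tn and fp), which is why neither measure depends on how common the disease is in the study population.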
Is the Study Valid? • 1) Was there a clearly defined question? • What question has the research been designed to answer? Was the question focused in terms of the population group studied, the target disorder and the test(s) considered?
Is the Study Valid? • 2) Was the presence or absence of the target disorder confirmed with a validated test ('gold' or reference standard)? • How did the investigators know whether or not a patient in the study really had the disease? • To do this, they will have needed some reference standard test (or series of tests) which they know 'always' tells the truth. You need to consider whether the reference standard used is sufficiently accurate. • Were the reference standard and the diagnostic test interpreted blind and independently of each other? • If the study investigators know the result of the reference standard test, this might influence their interpretation of the diagnostic test and vice versa.
Is the Study Valid? • 3) Was the test evaluated on an appropriate spectrum of patients? • A test may perform differently depending upon the sort of patients on whom it is carried out. A test is going to perform better in terms of detecting people with disease if it is used on people in whom the disease is more severe or advanced. • Similarly, the test will produce more false positive results if it is carried out on patients with other diseases that might mimic the disease that is being tested for. • The issue to consider when appraising a paper is whether the test was evaluated on the typical sort of patients on whom the test would be carried out in real life.
Is the Study Valid? • 4) Was the reference standard applied to all patients? • Ideally, both the test being evaluated and the reference standard should be carried out on all patients in the study. For example, if the test under investigation proves positive, there may be a temptation not to bother administering the reference standard test. • Therefore, when reading the paper you need to find out whether the reference standard was applied to all patients. If it wasn't, look at what steps the investigators took to find out what the 'truth' was in patients who did not have the reference test.
Is the Study Valid? • Is it clear how the test was carried out? • To be able to apply the results of the study to your own clinical practice, you need to be confident that the test is performed in the same way in your setting as it was in the study.
Is the Study Valid? • Is the test result reproducible? • This is essentially asking whether you get the same result if different people carry out the test, or if the test is carried out at different times on the same person. • Many studies will assess this by having different observers perform the test, and measuring the agreement between them by means of a kappa statistic. The kappa statistic takes into account the amount of agreement that you would expect by chance. If agreement between observers is poor, then the test is not useful.
Is the Study Valid? Kappa is often judged as providing agreement which is: Poor if k ≤ 0.20; Fair if 0.21 ≤ k ≤ 0.40; Moderate if 0.41 ≤ k ≤ 0.60; Substantial if 0.61 ≤ k ≤ 0.80; Good if k > 0.80
Is the Study Valid? • k = 1 implies perfect agreement and k = 0 suggests that the agreement is no better than that which would be obtained by chance. • There are no objective criteria for judging intermediate values. • However, kappa is often judged as providing agreement which is: • Poor if k ≤ 0.2 • Fair if 0.21 ≤ k ≤ 0.40 • Moderate if 0.41 ≤ k ≤ 0.60 • Substantial if 0.61 ≤ k ≤ 0.80 • Good if k > 0.80
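The kappa calculation itself is short: observed agreement minus chance agreement, divided by the maximum possible agreement beyond chance. A sketch for the simplest case of two observers rating the same subjects positive/negative; the counts here are made up for illustration:

```python
def cohens_kappa(both_pos, obs1_only, obs2_only, both_neg):
    """Cohen's kappa for two observers rating the same subjects pos/neg.

    both_pos  = both observers say positive
    obs1_only = observer 1 positive, observer 2 negative
    obs2_only = observer 1 negative, observer 2 positive
    both_neg  = both observers say negative
    """
    n = both_pos + obs1_only + obs2_only + both_neg
    p_observed = (both_pos + both_neg) / n
    # Chance agreement from each observer's marginal positive rate
    p1_pos = (both_pos + obs1_only) / n
    p2_pos = (both_pos + obs2_only) / n
    p_chance = p1_pos * p2_pos + (1 - p1_pos) * (1 - p2_pos)
    return (p_observed - p_chance) / (1 - p_chance)

# Illustrative counts: 40 agree positive, 45 agree negative, 15 disagree
print(round(cohens_kappa(40, 10, 5, 45), 2))   # 0.7 (substantial agreement)
```

With these counts the raw agreement is 85%, but half of that would be expected by chance, so kappa works out to 0.7, in the "substantial" band of the table above.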
Is the Study Valid? • The extent to which the test result is reproducible may depend upon how explicit the guidance is for how the test should be carried out. • It may also depend upon the experience and expertise of the observer.
Appraising diagnostic tests 1. Are the results valid? 2. What are the results? 3. Will they help me look after my patients?
Basic design of diagnostic accuracy study Series of patients Index test Reference (“gold”) standard Blinded cross-classification
Validity of diagnostic studies 1. Was an appropriate spectrum of patients included? 2. Were all patients subjected to the gold standard? 3. Was there an independent, blind or objective comparison with the gold standard?
1. Was an appropriate spectrum of patients included? Spectrum bias Selected Patients Index test Reference standard Blinded cross-classification
1. Was an appropriate spectrum of patients included? Spectrum bias • You want to find out how good chest X-rays are for diagnosing pneumonia in the Emergency Department • Best = all patients presenting with difficulty breathing get a chest X-ray • Spectrum bias = only those patients in whom you really suspect pneumonia get a chest X-ray
2. Were all patients subjected to the gold standard? Verification (work-up) bias Series of patients Index test Reference standard Blinded cross-classification
2. Were all patients subjected to the gold standard? Verification (work-up) bias • You want to find out how good an exercise ECG (“treadmill test”) is for identifying patients with angina • The gold standard is angiography • Best = all patients get angiography • Verification (work-up) bias = only patients who have a positive exercise ECG get angiography
3. Was there an independent, blind or objective comparison with the gold standard? Observer bias Series of patients Index test Reference standard Unblinded cross-classification
3. Was there an independent, blind or objective comparison with the gold standard? Observer bias • You want to find out how good an exercise ECG (“treadmill test”) is for identifying patients with angina • All patients get the gold standard (angiography) • Observer bias = the cardiologist who does the angiography knows what the exercise ECG showed (not blinded)
Incorporation Bias Series of patients Index test Reference standard….. includes parts of Index test Unblinded cross-classification
Differential Reference Bias Series of patients Index test Ref. Std A Ref. Std. B Blinded cross-classification
Validity of diagnostic studies 1. Was an appropriate spectrum of patients included? 2. Were all patients subjected to the Gold Standard? 3. Was there an independent, blind or objective comparison with the Gold Standard?
DOR (Diagnostic Odds Ratio) Another measure of the diagnostic accuracy of a test is the diagnostic odds ratio (DOR): the odds of a positive test result in diseased persons relative to the odds of a positive result in non-diseased persons. The DOR is a single statistic summarizing the results in a 2 x 2 table, incorporating both sensitivity and specificity. Expressed in terms of sensitivity and specificity, the formula is: DOR = [Sensitivity/(1 - Sensitivity)] / [(1 - Specificity)/Specificity]
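The formula above is algebraically the same as the cross-product of the 2 x 2 table, (true positives x true negatives) / (false positives x false negatives). A quick sketch with illustrative numbers showing the two forms agree:

```python
def dor_from_rates(sensitivity, specificity):
    """Diagnostic odds ratio from sensitivity and specificity."""
    return (sensitivity / (1 - sensitivity)) / ((1 - specificity) / specificity)

def dor_from_counts(tp, fp, fn, tn):
    """Equivalent cross-product form from the 2 x 2 table: (a*d)/(b*c)."""
    return (tp * tn) / (fp * fn)

# Illustrative: sensitivity 90%, specificity 95%
# (the same test as a table with tp=90, fp=5, fn=10, tn=95)
print(dor_from_rates(0.90, 0.95))       # approximately 171
print(dor_from_counts(90, 5, 10, 95))   # 171.0
```

A DOR of 1 means the test discriminates no better than chance; larger values indicate better overall discrimination, though the DOR alone cannot tell you whether that comes from sensitivity or specificity.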
Are the Results Important? • What is meant by test accuracy? • a The test can correctly detect disease that is present (a true positive result). • b The test can incorrectly detect disease when it is really absent (a false positive result). • c The test can incorrectly identify someone as being free of a disease when it is present (a false negative result). • d The test can correctly identify that someone does not have a disease (a true negative result). • Ideally, we would like a test which produces a high proportion of a and d and a low proportion of b and c.
Are the Results Important? Sensitivity and specificity • Sensitivity is the proportion of people with the disease who have a positive test (the true positive rate). • Specificity is the proportion of people free of the disease who have a negative test (the true negative rate).
Appraising diagnostic tests 1. Are the results valid? 2. What are the results? 3. Will they help me look after my patients?
Sensitivity, specificity, positive & negative predictive values, likelihood ratios …aaarrrggh!!
2 by 2 table

              Disease
              +                     -
Test   +   a  True positives    b  False positives
       -   c  False negatives   d  True negatives
2 by 2 table: sensitivity Proportion of people with the disease who have a positive test result. …a highly sensitive test will not miss many people. Sensitivity = a / (a + c)
2 by 2 table: sensitivity With a = 99 true positives and c = 1 false negative: Sensitivity = a / (a + c) = 99/100 = 99%
2 by 2 table: specificity Proportion of people without the disease who have a negative test result. …a highly specific test will not falsely identify people as having the disease. Specificity = d / (b + d)
Tip… • Sensitivity is useful to me • Specificity isn’t… I want to know about the false positives • …so use 1 - specificity, which is the false positive rate
2 by 2 table: Sensitivity = a / (a + c); False positive rate = b / (b + d) (same as 1 - specificity)
2 by 2 table: With a = 99, b = 10, c = 1, d = 90: Sensitivity = 99/100 = 99%; False positive rate = 10/100 = 10% (same as 1 - specificity)
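The slide’s numbers are easy to verify in a few lines. Since likelihood ratios were name-checked earlier but not defined, this sketch also computes the positive likelihood ratio using its standard formula, sensitivity / (1 - specificity); that formula is standard usage, not something given on these slides:

```python
# 2x2 table from the slide: 99 true positives, 10 false positives,
# 1 false negative, 90 true negatives
tp, fp, fn, tn = 99, 10, 1, 90

sensitivity = tp / (tp + fn)        # 99/100 = 0.99
false_pos_rate = fp / (fp + tn)     # 10/100 = 0.10, i.e. 1 - specificity

# Positive likelihood ratio (standard definition, not from the slides):
# how much more likely a positive result is in diseased vs disease-free people
lr_positive = sensitivity / false_pos_rate

print(sensitivity, false_pos_rate, lr_positive)   # 0.99 0.1 (approx 9.9)
```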
Example Your father went to his doctor and was told that his test for a disease was positive. He is really worried, and comes to ask you for help! • After doing some reading, you find that for men of his age: • The prevalence of the disease is 30% • The test has sensitivity of 50% and specificity of 90% • “Son/Daughter, tell me what’s the chance I have this disease?”
What’s the chance he has the disease? • 100% (always) • 50% (maybe) • 0% (never) A disease with a prevalence of 30%. The test has sensitivity of 50% and specificity of 90%.
Prevalence of 30%, Sensitivity of 50%, Specificity of 90% • Of 100 people, 30 have the disease (prevalence 30%) and 70 do not • Sensitivity = 50%, so 15 of the 30 people with the disease test positive • False positive rate = 10% (1 - specificity), so 7 of the 70 disease-free people test positive • In total, 22 people test positive, of whom 15 have the disease • So the chance of disease given a positive test is 15/22, about 70%
Try it again • A disease with a prevalence of 4% must be diagnosed. • The diagnostic test has a sensitivity of 50% and a specificity of 90%. • If the patient tests positive, what is the chance they have the disease?
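The notional-cohort arithmetic from the previous worked example can be scripted so you can check any combination of prevalence, sensitivity and specificity. A minimal sketch; the function name is my own, not from the slides:

```python
def prob_disease_given_positive(prevalence, sensitivity, specificity):
    """Post-test probability of disease after a positive test
    (the positive predictive value), via a notional cohort."""
    true_pos = prevalence * sensitivity              # diseased who test positive
    false_pos = (1 - prevalence) * (1 - specificity) # disease-free who test positive
    return true_pos / (true_pos + false_pos)

# Father's example: prevalence 30%, sensitivity 50%, specificity 90%
print(round(prob_disease_given_positive(0.30, 0.50, 0.90), 2))   # 0.68 (15/22)

# "Try it again": prevalence 4%, same test
print(round(prob_disease_given_positive(0.04, 0.50, 0.90), 2))   # 0.17
```

The second call shows the point of the exercise: with the same test, dropping the prevalence from 30% to 4% drops the chance of disease after a positive result from about 70% to about 17%.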