The Script-Concordance Test as a Measure of Clinical Reasoning: a National Validation Study
Sarkis Meterissian, Marylise Boutros, Thamer Nouh, Robert Gagnon, Susan Reid, Ken Leslie, David Pace, Dennis Pitt, Ross Walker, Dan Schiller, Anthony MacLean, Mourad Hameed, Paola Fata and Bernard Charlin
From the Universities of Memorial, Montreal, Ottawa, Queen’s, Western, McMaster, Alberta, Calgary, UBC and McGill
Association for Surgical Education, Paper Session 1, March 22, 2011
Disclosures The authors have no disclosures to declare
Assessment tool formats • MCQ • Rich context MCQ • Orals • Short answer questions… • Long answer questions… The perfect assessment tool?
Theory of Mental Representation Schmidt et al: clinicians move through 3 kinds of mental representations: 1. Basic mechanisms of disease 2. Illness scripts 3. Bank of cases derived from experience
SCRIPT Theory • In clinical situations, clinicians consider alternative hypotheses • Then a deductive process is used to seek information to either accept or refute each respective hypothesis • A test has been developed that measures this deductive-reasoning process – The Script Concordance Test (SCT)
Preliminary Study • Conducted at McGill University • 100-question exam • Cronbach alpha = 0.70 • After optimisation = 0.85 (62 items)
Preliminary Study: Examination Scores (62-item test) by Resident Level
Resident Level | n | Mean Score (%) | Std Deviation (%)
R1 | 8 | 52.5 | 9.96
R2 | 6 | 62.5 | 5.12
R3 | 9 | 68.3 | 9.19
R4 | 6 | 75.7 | 9.61
R5 | 7 | 68.0 | 6.44
Purpose To determine if the script-concordance test maintains its reliability and validity across nine General Surgery programs in Canada.
Script Concordance Test
• A challenging context: authentic clinical situation, not enough data
• A Likert scale: captures opinion
• Scoring system: takes into account variability of experts; aggregate scoring method
The Qualities of a Good SCT • There must be at least 50-60 questions to achieve an acceptable reliability coefficient (Cronbach alpha = 0.80) • The reference panel must include at least 15 experts (Adv Health Sci Educ Theory Pract 2011 Feb., epub ahead of print) • Ideally there should be 2-4 questions nested into 15-25 cases (Adv Health Sci Educ Theory Pract 2009;14:367-75)
A 28-year-old pregnant woman presents to the ER with severe biliary colic. This is her third attack and an ultrasound reveals gallstones. In your management:
If you were planning… | and found… | the planned management is:
An open cholecystectomy | That the patient was in the first trimester | -2 -1 0 +1 +2
A laparoscopic cholecystectomy | That the patient is in the second trimester and has gestational diabetes | -2 -1 0 +1 +2
Methods Statistical Analysis: • Reliability: Cronbach alpha coefficient • Item-to-total correlation analysis used to select the best items for the final analysis • Construct validity tested with a one-way ANOVA with post-hoc comparison tests and planned contrasts • p values below 0.05 were considered significant
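The reliability coefficient named above can be sketched as follows. This is a minimal illustration of the standard Cronbach alpha formula, alpha = k/(k-1) × (1 − Σ item variances / variance of total scores), and not the authors' analysis code. The function name `cronbach_alpha` and the layout of `scores` (one row per resident, one column per item) are assumptions made for the example.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach alpha for a score matrix.

    scores: list of rows, one per examinee; each row holds the credit
    obtained on every item (hypothetical layout, assumed for this sketch).
    """
    k = len(scores[0])  # number of items
    # Sample variance of each item across examinees.
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    # Sample variance of the examinees' total scores.
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Toy data: two items that always agree give alpha = 1.
toy = [[1, 1], [0, 0], [1, 1], [0, 0]]
alpha = cronbach_alpha(toy)
```

With real exam data each row would hold a resident's per-item credits, and a value near 0.80 or above would meet the threshold quoted for a good SCT.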
Demographics
Universities: McGill, Memorial, Queen’s, Ottawa, Western, McMaster, U of Alberta, Calgary, UBC
Residents: 202 total (R1: 51, R2: 45, R3: 45, R4: 28, R5: 33)
Demographics
University | Number of Residents | Panel
McGill | 39 | 7
Alberta | 36 | 3
Calgary | 17 | 0
McMaster | 25 | 0
Memorial | 10 | 0
Ottawa | 19 | 4
Queen’s | 17 | 5
UBC | 20 | 0
Western | 19 | 2
Statistics • 153-question examination (face-validated by 4 PDs) • 22 questions with a negative item-total correlation were eliminated • Cronbach alpha (131 questions): 0.850
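The item-elimination step can be sketched as below. The slides do not say whether each item was correlated with the full total or with the total of the remaining items; this example assumes the latter (often called the corrected item-total correlation). The names `pearson`, `item_rest_correlations`, and `keep_items` are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation; returns 0.0 when either series is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def item_rest_correlations(scores):
    """Correlate each item with the total of all other items.

    scores: list of rows, one per examinee, one column per item
    (assumed layout for this sketch).
    """
    totals = [sum(row) for row in scores]
    result = []
    for i in range(len(scores[0])):
        item = [row[i] for row in scores]
        rest = [t - v for t, v in zip(totals, item)]
        result.append(pearson(item, rest))
    return result


def keep_items(scores):
    """Indices of items whose item-rest correlation is not negative."""
    return [i for i, r in enumerate(item_rest_correlations(scores)) if r >= 0]


# Toy data: the last item runs against the other three and is dropped.
toy = [[1, 1, 1, 0], [1, 1, 1, 0], [0, 0, 0, 1], [0, 0, 0, 1]]
kept = keep_items(toy)
```

In the study this kind of filter reduced the exam from 153 to 131 questions before the reliability coefficient was computed.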
Scores by Resident Level
Level | Mean | SEM | Min. | Max.
R1 | 60.65 | 1.22 | 40.7 | 76.3
R2 | 67.94 | 0.80 | 57.7 | 76.0
R3 | 68.15 | 1.05 | 49.1 | 77.8
R4 | 69.05 | 1.24 | 55.7 | 81.1
R5 | 66.5 | 1.80 | 51.8 | 79.7
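For readers who want to see how a one-way ANOVA relates to summary figures like these, here is a rough sketch that rebuilds the F statistic from group sizes, means, and standard errors. It assumes the group sizes from the Demographics slide apply to this table and that SEM equals SD divided by the square root of n. It is an approximation from rounded slide values, not a reproduction of the authors' analysis, and `anova_from_summary` is a hypothetical helper name.

```python
def anova_from_summary(groups):
    """One-way ANOVA F statistic from per-group summary statistics.

    groups: list of (n, mean, sem) tuples.
    Returns (F, df_between, df_within).
    """
    total_n = sum(n for n, _, _ in groups)
    grand_mean = sum(n * m for n, m, _ in groups) / total_n
    # Between-group sum of squares.
    ss_between = sum(n * (m - grand_mean) ** 2 for n, m, _ in groups)
    # Within-group sum of squares; the group variance is sem**2 * n.
    ss_within = sum((n - 1) * (sem ** 2 * n) for n, _, sem in groups)
    df_between = len(groups) - 1
    df_within = total_n - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# (n, mean, SEM) for R1..R5; n is taken from the Demographics slide.
levels = [
    (51, 60.65, 1.22),
    (45, 67.94, 0.80),
    (45, 68.15, 1.05),
    (28, 69.05, 1.24),
    (33, 66.5, 1.80),
]
f_stat, df_b, df_w = anova_from_summary(levels)
```

Turning the F statistic into a p value would need an F-distribution function from a statistics library, which is left out to keep the sketch dependency-free.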
Juniors vs Seniors
Program | R1/R2 | R3/R4/R5
Alberta | 66 ± 7.6 | 68 ± 5.8
Calgary | 66 ± 8.7 | 73 ± 6.5
McMaster | 68 ± 7.3 | 71 ± 7.4
McGill | 63 ± 7.2 | 67 ± 7.8
Memorial | 67 ± 1.6 | 73 ± 3.9
Queen’s | 71 ± 3.8 | 76 ± 3
UBC | 67 ± 7.5 | 68 ± 5.7
UWO | 66 ± 8.3 | 69 ± 8.9
Ottawa | 70 ± 5.4 | 72 ± 7.4
Total | 67 ± 7.2 (N = 96) | 70 ± 7.3 (N = 106)
P < 0.001
Discussion The largest SCT study to date has shown that: • An SCT with face validity can be developed and administered to General Surgery residents • An SCT exceeding 100 items can be highly reliable • The SCT can distinguish junior from senior residents across 9 Canadian General Surgery Programs
Discussion • Why did the scores of the R5s fall? - use sub-specialists to score the exam • Can the exam be used to identify residents having difficulty with clinical decision-making? The outliers!!! • Can the exam itself be used as a remediation tool? • Can an SCT be used for high-stakes examinations such as certification?
We so long for certainty in this changing world, and the younger we are the more we seem to need it, so that it seems hard for me to tell you that you must not look for it in diagnosis. Sir William Osler
Scores per University
University | Score | SD
McGill | 65.446 | 7.7414
Alberta | 67.031 | 6.8713
Calgary | 69.287 | 8.2948
McMaster | 70.203 | 7.4304
Memorial | 70.306 | 4.3231
Ottawa | 70.811 | 6.7664
Queen’s | 73.857 | 4.1723
UBC | 67.272 | 6.9838
UWO | 67.683 | 8.5443
A 31-year-old male is undergoing an appendectomy and after completion of the operation, an incidental Meckel’s diverticulum is found. In your management:
If you were planning… | and found… | the planned management is:
To resect the diverticulum | A wide base to the diverticulum | -2 -1 0 +1 +2
To resect the diverticulum | A perforated appendicitis with a pelvic abscess | -2 -1 0 +1 +2
To leave the diverticulum | That the tip was adherent to the umbilicus | -2 -1 0 +1 +2
Methods: Scoring an SCT • The scoring grid of an SCT should be derived by administering it to at least 10 experts (aggregate scoring method) • Based on their responses, some questions (if too vague or too clear-cut) can be eliminated • The scoring grid takes into account variability due to the uncertainty of the clinical situation
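The slides do not spell out the scoring arithmetic. The sketch below assumes the convention commonly described for aggregate scoring: each answer earns credit equal to the number of panel members who chose it divided by the number who chose the most popular (modal) answer. The panel data and the names `scoring_key` and `score_exam` are hypothetical.

```python
from collections import Counter

# The five anchors of the Likert scale shown on the sample items.
LIKERT = (-2, -1, 0, 1, 2)


def scoring_key(panel_answers):
    """Build the credit table for one question from the panel's answers."""
    counts = Counter(panel_answers)
    modal = max(counts.values())  # votes for the most popular answer
    return {option: counts.get(option, 0) / modal for option in LIKERT}


def score_exam(responses, keys):
    """Percentage score for one examinee across all questions."""
    credits = [key[answer] for answer, key in zip(responses, keys)]
    return 100.0 * sum(credits) / len(credits)


# Hypothetical panel of 15 experts answering a single question.
panel = [-2] * 2 + [-1] * 8 + [0] * 4 + [1]
key = scoring_key(panel)
# key[-1] is 1.0 (modal answer), key[0] is 0.5, key[2] is 0.0

# A resident answering -1 and 0 on two questions sharing this key.
percent = score_exam([-1, 0], [key, key])
```

Because partial credit follows the spread of expert opinion, a question on which every expert agrees behaves like a conventional single-answer item, which is one reason overly clear-cut questions can be dropped.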