The Script-Concordance Test as a Measure of Clinical Reasoning: a National Validation Study

Sarkis Meterissian, Marylise Boutros, Thamer Nouh, Robert Gagnon, Susan Reid, Ken Leslie, David Pace, Dennis Pitt, Ross Walker, Dan Schiller, Anthony MacLean, Mourad Hameed, Paola Fata and Bernard Charlin
From the Universities of Memorial, Montreal, Ottawa, Queen’s, Western, McMaster, Alberta, Calgary, UBC and McGill
Association for Surgical Education, Paper Session 1, March 22, 2011

Disclosures The authors have no disclosures to declare

Decision-making in Surgery

Assessment tool formats • MCQ • Rich context MCQ • Orals • Short answer questions… • Long answer questions… The perfect assessment tool?

Theory of Mental Representation Schmidt et al: clinicians move through 3 kinds of mental representations: 1. Basic mechanisms of disease 2. Illness scripts 3. Bank of cases derived from experience

SCRIPT Theory • In clinical situations, clinicians consider alternative hypotheses • Then a deductive process is used to seek information to either accept or refute each respective hypothesis • A test has been developed that measures this deductive-reasoning process – The Script Concordance Test (SCT)

Preliminary Study • Conducted at McGill University • 100-question exam • Cronbach alpha = 0.70 • After optimisation = 0.85 (62 items)

Preliminary Study Examination Scores (62-item test) by Resident Level
Resident Level   n   Mean Score (%)   Std Deviation (%)
R1               8   52.5             9.96
R2               6   62.5             5.12
R3               9   68.3             9.19
R4               6   75.7             9.61
R5               7   68.0             6.44

Purpose To determine if the script-concordance test maintains its reliability and validity across nine General Surgery programs in Canada.

Script Concordance Test • A challenging context • Authentic clinical situation • Not enough data • A Likert scale • Capture opinion • Scoring system • Takes into account variability of experts • Aggregate scoring method

The Qualities of a Good SCT • There must be at least 50-60 questions to achieve an acceptable reliability coefficient (Cronbach alpha = 0.80) • The reference panel must include at least 15 experts (Adv Health Sci Educ Theory Pract 2011 Feb., epub ahead of print) • Ideally there should be 2-4 questions nested into 15-25 cases (Adv Health Sci Educ Theory Pract 2009;14:367-75)

A 28-year-old pregnant woman presents to the ER with severe biliary colic. This is her third attack and an ultrasound reveals gallstones. In your management:
1. If you were planning an open cholecystectomy and found that the patient was in the first trimester, the planned management is: -2 -1 0 +1 +2
2. If you were planning a laparoscopic cholecystectomy and found that the patient is in the second trimester and has gestational diabetes, the planned management is: -2 -1 0 +1 +2

The Scoring Grid (Modal Experts’ Choice)

Methods Statistical Analysis: • Reliability: Cronbach alpha coefficient • Item-total correlations were analysed to select the best items for the final analysis • Construct validity was tested with a one-way ANOVA with post-hoc comparison tests and planned contrasts • p values below 0.05 were considered significant
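The reliability coefficient named above can be illustrated with a minimal sketch. It uses the textbook Cronbach alpha formula, alpha = k/(k-1) × (1 − sum of item variances / variance of total scores); the function name `cronbach_alpha` and the toy data are hypothetical and are not taken from the study.

```python
from statistics import pvariance


def cronbach_alpha(scores):
    """Cronbach alpha for a score matrix.

    scores: list of rows, one row per examinee, one column per item.
    Hypothetical helper for illustration; not the authors' code.
    """
    k = len(scores[0])
    if k < 2:
        raise ValueError("need at least two items")
    # Variance of each item across examinees.
    item_vars = [pvariance([row[j] for row in scores]) for j in range(k)]
    # Variance of each examinee's total score.
    total_var = pvariance([sum(row) for row in scores])
    if total_var == 0:
        raise ValueError("total scores have zero variance")
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Toy data: three examinees, two items that rank them identically.
toy = [[1, 1], [2, 2], [3, 3]]
alpha = cronbach_alpha(toy)
```

Population variance is used for both the item and the total variances; sample variance would give the same alpha, because the scaling factor cancels in the ratio.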

Demographics
Universities: McGill, Memorial, Queen’s, Ottawa, Western, McMaster, U of Alberta, Calgary, UBC
Residents: 202 total
R1: 51
R2: 45
R3: 45
R4: 28
R5: 33

Demographics
University   Number of Residents   Panel
McGill       39                    7
Alberta      36                    3
Calgary      17                    0
McMaster     25                    0
Memorial     10                    0
Ottawa       19                    4
Queen’s      17                    5
UBC          20                    0
Western      19                    2

Statistics • 153-question examination (face-validated by 4 PDs) • 22 questions with a negative item-total correlation were eliminated • Cronbach alpha (131 questions): 0.850
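The item-elimination step can be sketched as follows. This is only an illustration under stated assumptions: it uses the corrected item-total (item-rest) correlation and a single pass, whereas the slides do not say whether the correlation was corrected or whether elimination was repeated. The names `pearson`, `item_rest_correlations` and `keep_items` are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either series is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def item_rest_correlations(scores):
    """Correlate each item with the total of the remaining items.

    scores: list of rows, one row per examinee, one column per item.
    """
    k = len(scores[0])
    totals = [sum(row) for row in scores]
    result = []
    for j in range(k):
        item = [row[j] for row in scores]
        rest = [t - v for t, v in zip(totals, item)]  # total without item j
        result.append(pearson(item, rest))
    return result


def keep_items(scores):
    """Indices of items whose item-rest correlation is not negative."""
    return [j for j, r in enumerate(item_rest_correlations(scores)) if r >= 0]


# Toy data: the last item runs against the other three.
toy = [[1, 1, 1, 3], [2, 2, 2, 2], [3, 3, 3, 1]]
kept = keep_items(toy)
```

In this toy matrix the fourth item is the only one dropped, because its scores fall as the rest of the test rises.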

Scores by Resident Level
Level   Mean    SEM    Min.   Max.
R1      60.65   1.22   40.7   76.3
R2      67.94   0.80   57.7   76.0
R3      68.15   1.05   49.1   77.8
R4      69.05   1.24   55.7   81.1
R5      66.5    1.80   51.8   79.7

Scores by Resident Level

Juniors vs Seniors
           Alberta   Calgary   McMaster   McGill   Memorial
R1/R2      66±7.6    66±8.7    68±7.3     63±7.2   67±1.6
R3/R4/R5   68±5.8    73±6.5    71±7.4     67±7.8   73±3.9

Juniors vs Seniors
           Queen’s   UBC      UWO      Ottawa
R1/R2      71±3.8    67±7.5   66±8.3   70±5.4
R3/R4/R5   76±3      68±5.7   69±8.9   72±7.4

Juniors vs Seniors

Juniors vs Seniors
           Total    N
R1/R2      67±7.2   96
R3/R4/R5   70±7.3   106
P<0.001

Discussion The largest SCT study to date has shown that: • An SCT with face validity can be developed and administered to General Surgery residents • An SCT exceeding 100 items can be highly reliable • The SCT can distinguish junior from senior residents across 9 Canadian General Surgery Programs

Discussion • Why did the scores of the R5s fall? - use sub-specialists to score the exam • Can the exam be used to identify residents having difficulty with clinical decision-making? The outliers!!! • Can the exam itself be used as a remediation tool? • Can an SCT be used for high-stakes examinations such as certification?

We so long for certainty in this changing world, and the younger we are the more we seem to need it, so that it seems hard for me to tell you that you must not look for it in diagnosis. Sir William Osler

Scores per University
University   Score    SD
McGill       65.446   7.7414
Alberta      67.031   6.8713
Calgary      69.287   8.2948
McMaster     70.203   7.4304
Memorial     70.306   4.3231
Ottawa       70.811   6.7664
Queen’s      73.857   4.1723
UBC          67.272   6.9838
UWO          67.683   8.5443

McGill Panel vs non-McGill

Scores by Resident Level

A 31-year-old male is undergoing an appendectomy and after completion of the operation, an incidental Meckel’s diverticulum is found. In your management:
1. If you were planning to resect the diverticulum and found a wide base to the diverticulum, the planned management is: -2 -1 0 +1 +2
2. If you were planning to resect the diverticulum and found a perforated appendicitis with a pelvic abscess, the planned management is: -2 -1 0 +1 +2
3. If you were planning to leave the diverticulum and found that the tip was adherent to the umbilicus, the planned management is: -2 -1 0 +1 +2

Methods: Scoring an SCT • The scoring grid of an SCT should be derived by administering it to at least 10 experts (aggregate scoring method) • Based on their responses, some questions (if too vague or too clear-cut) can be eliminated • The scoring grid takes into account variability due to the uncertainty of the clinical situation
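The aggregate scoring method can be illustrated with a short sketch. It assumes the commonly described SCT convention in which the modal panel answer earns full credit and every other answer earns credit in proportion to the number of experts who chose it. The slides do not spell out the exact formula, so this convention, the function names `scoring_key` and `score_test`, and the panel data below are all hypothetical.

```python
from collections import Counter

LIKERT = (-2, -1, 0, 1, 2)  # response scale shown on the example items


def scoring_key(panel_answers):
    """Credit for each Likert option on one item.

    panel_answers: the answers the reference panel gave to this item.
    Assumed rule: credit = experts choosing the option / experts choosing the mode.
    """
    counts = Counter(panel_answers)
    modal_count = max(counts.values())
    return {opt: counts.get(opt, 0) / modal_count for opt in LIKERT}


def score_test(responses, panel):
    """Percent score for one examinee.

    responses: the examinee's answer to each item.
    panel: for each item, the list of panel answers.
    """
    credits = [scoring_key(p)[r] for r, p in zip(responses, panel)]
    return 100 * sum(credits) / len(credits)


# Hypothetical 10-member panel: six chose +1, three chose 0, one chose +2.
panel_item = [1, 1, 1, 1, 1, 1, 0, 0, 0, 2]
key = scoring_key(panel_item)
percent = score_test([1, 0], [panel_item, panel_item])
```

With this panel, an examinee answering +1 earns full credit, one answering 0 earns half credit, and one answering -2 earns none, which is how the grid reflects disagreement among experts rather than a single correct answer.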
