Assessment of Clinical Competence Using Mannequin-based Simulation Howard A. Schwid, M.D. University of Washington and VA Puget Sound Health Care System
Objectives • Review current evaluation of anesthesia residents • Describe elements of validity and reliability in performance assessment • Summarize published studies of simulator assessment in anesthesiology • Describe preliminary results of the multi-institutional study of simulator assessment
Why Improve Evaluation? • Current evaluation methods have limitations • Identify the problem resident early • Improve training program • Objective documentation of unacceptable performance prior to dismissal
Current Evaluation of Anesthesia Residents • Multiple choice test • Departmental faculty clinical evaluation • Mock oral boards
Current Evaluation - Multiple Choice Exam • Objective assessment • Questions may not be clinically relevant • Resident may not be able to apply book knowledge to clinical practice
Current Evaluation - Faculty Clinical Evaluation • Subjective • May be based on minimal contact • May be based on single incident • Halo effect
Current Evaluation - Mock Oral Board Exam • Subjective • Evaluators may have inadequate training • Highly dependent on specific case • May be biased since evaluators usually know resident
Simulator Assessment of Clinical Competence • Management of anesthetic emergencies • Esophageal intubation • Anaphylaxis • Bronchospasm • Myocardial ischemia • Standardized scenarios • Standardized grading checklist
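As a concrete illustration of how a standardized grading checklist can be scored, here is a minimal sketch of a few hypothetical items for one emergency scenario; the item names are invented for illustration and are not the study's actual checklist.

```python
# Hypothetical fragment of a standardized grading checklist for one
# simulated emergency (item names are illustrative, not the study's own)
anaphylaxis_items = {
    "verbalizes diagnosis of anaphylaxis": 1,   # 1 = performed, 0 = not performed
    "administers epinephrine": 1,
    "gives IV fluid bolus": 1,
    "increases FiO2 to 1.0": 0,
    "calls for help": 1,
}

score = sum(anaphylaxis_items.values())
print(f"{score}/{len(anaphylaxis_items)} checklist items completed")
```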
Validity and Reliability in Performance Assessment • Validity - ability of a test to measure what it is supposed to measure • Reliability - ability of a test to return the same results on repeated administration - reproducibility
Validity of a Test • Does it measure what it's supposed to? • Difficult to measure validity unless there is a gold standard • Content (face) validity • Construct validity • Criterion validity
Content Validity • Seems reasonable that a competent anesthesia resident should be able to manage those events • Does not assess many areas of clinical competence: • Anesthetic plan • Communication with co-workers • Charting • Psychomotor skills
Construct Validity • Realism of the scenarios • Problems: • Human - different behavior in simulator • Hypervigilance • Cavalier • Technical • Breath sounds, pulses • CO2 waveform, ECG waveform • Model predictions
Criterion Validity • Compare to items external to the test • No gold standard • Correlation of simulator scores with current assessment methods • Progression of simulator scores with training
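One way to examine criterion validity is to correlate simulator scores with scores from an existing assessment for the same residents. The sketch below computes a Pearson correlation on hypothetical paired scores; the numbers are invented for illustration only.

```python
import numpy as np

# Hypothetical paired scores for the same residents (invented data)
simulator_scores = [72, 65, 80, 90, 55, 78, 84, 60]
written_exam     = [68, 70, 75, 88, 60, 72, 90, 58]

def pearson_r(x, y):
    """Pearson correlation coefficient between two paired score lists."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    x, y = x - x.mean(), y - y.mean()
    return float((x @ y) / np.sqrt((x @ x) * (y @ y)))

print(round(pearson_r(simulator_scores, written_exam), 2))
```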
Reliability • Ability of a test to return the same results • Internal consistency - do items in test give same results? • Cronbach’s alpha statistic • Inter-rater reliability - do different raters give same results? • Kappa statistic • Intraclass correlation coefficient (ICC)
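Cronbach's alpha summarizes internal consistency from the item variances and the variance of the total score. The sketch below shows one common way to compute it for a subjects-by-items score matrix; the data are hypothetical.

```python
import numpy as np

def cronbach_alpha(item_scores):
    """Cronbach's alpha: rows = subjects, columns = checklist items."""
    x = np.asarray(item_scores, dtype=float)
    k = x.shape[1]                                # number of items
    item_var = x.var(axis=0, ddof=1).sum()        # sum of item variances
    total_var = x.sum(axis=1).var(ddof=1)         # variance of total scores
    return (k / (k - 1)) * (1 - item_var / total_var)

# Hypothetical 0/1 checklist results for 5 subjects on 4 items
data = [[1, 1, 0, 1],
        [1, 0, 0, 1],
        [1, 1, 1, 1],
        [0, 0, 0, 0],
        [1, 1, 1, 0]]
print(round(cronbach_alpha(data), 2))
```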
Published Studies • Standardized Patients (SP) - normal person or actual patient coached to present an illness in a standardized, unvarying way • Mannequin-based anesthesia simulator = the electronic standardized patient • OSCE - Objective Structured Clinical Exam - multiple stations of SPs
Published Studies • OSCE • Major source of error is variation of examinee performance from station to station. Test must include many stations, requiring 3-4 hours of testing. • Inter-rater reliability is good if scoring criteria are clear; only one grader is needed. • Scores for history, physical exam, and communication are more reproducible than those for differential diagnosis or treatment.
Published Studies • Morgan and Cleave-Hogg: Toronto. Evaluation of medical students’ performance using the anaesthesia simulator. Med Educ 2000; 34: 42-45 • 24 students, 6 cases each • Poor correlation between simulator scores and written and clinical evaluations • Good inter-rater reliability
Published Studies • Devitt and Kurrek, et al: Toronto. Testing the raters: inter-rater reliability of standardized anaesthesia simulator performance. Can J Anaesth 1997; 44: 924-928 • 10 cases, three subjects each, two raters • Defined grading criteria • Excellent inter-rater reliability
Published Studies • Devitt and Kurrek, et al: Toronto. Testing internal consistency and construct validity during evaluation of performance in a patient simulator. Anesth Analg 1998; 86: 1160-4 • 25 residents and faculty, 10 cases each • 4 scenarios had poor internal consistency • Validity - Faculty outscored residents
Published Studies • Mayer, Freid, Boysen: UNC. Validation of simulation as an evaluation method of anesthesiology residents. Anesthesiology 1999; 91: A1130 • 15 residents, 1 case each • Validity - progression of scores from CA-1 to CA-2 to CA-3 residents
Published Studies • Gaba, et al: Stanford. Assessment of clinical performance during simulated crises using both technical and behavioral ratings. Anesthesiology 1998; 89: 8-18 • 2 scenarios, 14 teams • Found good inter-rater reliability for technical performance • Found minimally acceptable to poor inter-rater reliability for behavioral performance
Multi-Institutional Evaluation of Mannequin-based Simulator Assessment of Clinical Competence • 4 standardized scenarios • Grading checklist - 65 items • 10 institutions • METI and MedSim simulators
Multi-Institutional Study • Cleveland Clinic Foundation • Indiana University • Penn State • University of California - Los Angeles • University of Kansas • University of Pittsburgh • University of Rochester • University of Washington • Wake Forest University • West Virginia University
Multi-Institutional Study • 105 subjects • 11 CBL-1 • 55 CA-1 • 25 CA-2 • 14 CA-3 • Testing completed September 1, 2000
Multi-Institutional Study • Construct validity • Survey subjects about realism • Criterion validity • Compare simulator scores to written exam, faculty evaluations, mock oral exam • Progression of scores with training
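Evidence for a progression of scores can be checked by comparing score distributions across training levels, for example with a one-way ANOVA. The sketch below uses invented scores grouped by level; it is an illustration, not the study's actual analysis.

```python
import numpy as np
from scipy.stats import f_oneway

# Invented simulator scores grouped by training level (illustration only)
scores = {
    "CA-1": [55, 60, 58, 62, 57],
    "CA-2": [66, 70, 68, 64],
    "CA-3": [74, 72, 78],
}

for level, s in scores.items():
    print(level, "mean =", round(float(np.mean(s)), 1))

f_stat, p_value = f_oneway(*scores.values())
print("one-way ANOVA:", round(float(f_stat), 2), round(float(p_value), 4))
```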
Multi-Institutional Study • Reliability • Internal consistency of items scored • Between simulators - METI and MedSim • Inter-rater reliability • Two raters within department • Raters between departments
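For inter-rater reliability on categorical checklist items, Cohen's kappa corrects raw agreement for the agreement expected by chance. The sketch below computes it for two raters' hypothetical pass/fail ratings of the same items; the data are invented.

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters judging the same items."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a.keys() | freq_b.keys()) / (n * n)
    return (observed - expected) / (1 - expected)

# Hypothetical item ratings (1 = performed, 0 = not performed) from two raters
rater_1 = [1, 1, 0, 1, 1, 0, 1, 1, 0, 1]
rater_2 = [1, 1, 0, 1, 0, 0, 1, 1, 0, 1]
print(round(cohens_kappa(rater_1, rater_2), 2))
```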
Multi-Institutional Study • Practicality • Time for each assessment - 75 min/subject • Cost of each assessment • Simulator failures • Failures of other equipment
Assessment of Clinical Competence Using Mannequin-based Simulation • Summary • Shows promise as objective measure • Further work needed to improve simulator realism • Further work needed to develop valid and reliable grading system • Practicality to be decided by each department