Development and application of a simulation-based assessment center to screen medical school candidates for non-cognitive attributes. This project aims to improve the selection process by incorporating non-cognitive measurements.
Development and Application of a Simulation-Based Assessment Center for Non-Cognitive Attributes: Screening of Candidates to Tel-Aviv University Medical School Presented to: National Examinations Centre (NAEC) Tbilisi, Georgia 25 September, 2007
Tel Aviv University (TAU) Sackler Faculty of Medicine Israel Center for Medical Simulation (MSR) National Institute for Testing and Evaluation (NITE)
Authors & Researchers: • NITE: Naomi Gafni, Orit Rubin, Avital Moshinsky, Avi Allalouf • MSR: Amitai Ziv • TAU: Moshe Mittelman, Dov Lichtenberg
Background • Despite growing acknowledgment of the importance of non-cognitive factors, most medical schools currently rely primarily, and sometimes exclusively, on cognitive factors in student selection. • The tool most commonly used in an attempt to consider non-cognitive factors is the interview, which: • has strong face validity • suffers from rater biases, context biases, unsatisfactory reliability & validity
The Motivation Behind the Project • Growing dissatisfaction among TAU faculty members: “While the selected candidates show extremely high academic capabilities, some of them have problematic personal and interpersonal characteristics.” • Medical School Leadership – Ready for a revolutionary change… • In the pipeline for over 2 years… • The Task: to select the 100 candidates most suited to the study and practice of medicine from among some 1,700 candidates • Given: • A Psychometric Entrance Test Score (PET) • High School Grade Point Average (GPA) • Disqualifying Interviews (“Lip-Service”) • Missing: • Non-cognitive measures • Value message to candidates
The Goal of the Assessment Center To improve the screening of medical school candidates by introducing non-cognitive measurements into the equation
Assessment Center: Essential Elements • Job analysis • Definition of behavioral profile • Simulations • Multiple methods and multiple assessments • Multiple raters • Raters’ training • Data integration
“Job Analysis” – Behavioral Profile • Ethical attitude • Honesty • Professional commitment and responsibility • Empathy • Service awareness • Commitment to the patient • Interpersonal communication skills • Self-confidence • Sensitivity • The ability to observe details • The ability to identify a need for help, to seek help, and to accept help • Openness • Initiative • Attitude towards authority • Self-awareness • Maturity • The ability to function under stress
Developmental & Logistical Milestones • Development of structure, content (cases and tasks), rating scales and rating workshops • 4 months of work • Test Development Committee – admission committee members, MSR experts, NITE experts • Faculty (MDs & PhDs) recruitment and training • Half-day “Train the Rater” workshop • >150 faculty members per year • SP training – behavioral (roles) & rating (a workshop similar to the faculty members’)
The Structure of the Assessment Center • Biographical Questionnaire – 120 minutes, 21 questions: essay questions related to the candidate’s past experiences • Judgment and Decision-Making – 45 minutes, 3 dilemmas: short descriptions of dilemmas that require the candidate to make decisions • Simulations – 90 minutes, 8 behavioral stations assessing communication skills, handling of stress, initiative & responsibility, and consciousness & self-awareness
Train the Raters Workshops • Half a day – mandatory for participation • Groups of 20 faculty each • Includes: • Overview of the new admission process • Awareness of biases (halo, cultural, etc.) • Introduction of rating scales and behavioral anchors • Actual rating exercises based on videos of (“standardized”) candidates prepared in advance • “Calibration” of raters through open discussion of the metrics and reference to group ratings
Developmental & Logistical Milestones • Logistics – 98 candidates a day (× 3–4 days) • Two sessions – 5 hours each • Two parallel modules in each session (24 candidates each) • Test security – different modules for different days • 32 SPs – all day (1 hour work / 1 hour rest) • 36 faculty members per session (72 per day) • 25 staff members for logistics and administration tasks • Data analysis system & score reporting • By NITE – 3–4 weeks
The Location: MSR – Virtual Hospital (floor plan)
Behavioral (“OSCE Like”) Stations – Rationale • Simulation is the component that makes the assessment center unique in comparison with conventional tests, traditional interviews and questionnaires • Observing people’s present behavior is a better predictor of future behavior than their own subjective account of how they would behave • Since the candidates do not yet possess any professional knowledge, the simulation should not be based on such knowledge • The simulations reflect common situations similar to those encountered by doctors, through which inter-personal and communication skills can be assessed
Behavioral Stations – Structure • 8 behavioral assessment stations • 6 individual stations (7 minutes per station) • 2 group stations (25 minutes per station) • The candidate’s behavior is observed by faculty members, who score it according to an assessment form consisting of four dimensions of personal characteristics: • Communication skills • Handling of stress • Initiative and responsibility • Consciousness and self-awareness • Scoring is on a scale of 1–6.
Behavioral Stations – Examples • MMI* – a structured mini-interview • Group station – groups of 6 candidates perform a task together (* MMI – Reiter H, Eva K et al., McMaster, Canada)
Judgment and Decision-Making Questionnaire – Rationale • One of the characteristics medical professionals are expected to possess is a well-developed capacity for moral reasoning • The goal is to examine the candidate’s ability to contend with moral dilemmas… • And to measure the candidate’s ability to comprehend all aspects of a complex situation and reach a considered decision as to how to act
Judgment and Decision-Making Questionnaire – Structure • 3 short scenarios, each describing a real-life situation that raises questions or doubts vis-à-vis the appropriate decision to be made. • The dilemmas have no correct solutions. • Candidates should state the reasons for and against the decision to be made. After providing a detailed account of their considerations, they should state how they would act, and explain their decision. • The score is based on the number of arguments and their quality: complexity, reference to the conflict between law and morality, application of professional and moral considerations simultaneously, and the ability to reach a final, justifiable decision. • Scoring conducted by two independent / trained psychologists
Judgment and Decision-Making Questionnaire – Example • Jane, 32, is a new teacher at a boarding school. After spending a short time at the school she discovers that the older students are conducting humiliating initiation ceremonies for new students and even having them do their chores. Jane reports this to the school principal, who tells her that this is a longstanding tradition at the school and that she should not be concerned. He implies that should she make this information public, she would no longer be trusted by the students or veteran teachers. • What would you advise Jane to do with the information she possesses? Why? What considerations should she take into account?
The Biographical Questionnaire – Rationale • The most reliable and valid predictor of future behavior is past behavior. • In an attempt to estimate qualities and attitudes such as motivation, the tendency to help, consistency, curiosity and leadership, it is reasonable to look to evidence from the candidate’s biographical details, such as hobbies, studies, military service, social activities, and voluntary activities. • The questionnaire is standardized and objective. It is less biased than an interview and is scored according to a detailed scoring guide (by two independent / trained psychologists)
The Biographical Questionnaire – Structure • 21 questions divided into two sections: (1) Past experience – questions regarding experiences and activities during and after high school (military service, job experience, volunteer activities etc.) (2) Emotional awareness – questions regarding past experience in coping with challenging emotional situations
The Biographical Questionnaire – Example Briefly describe a situation in which someone approached you for help/advice and you provided it. • What was the problem for which you were asked to provide help/advice? • Why were you the person approached? • Describe the situation in which you provided help/advice. • Describe how you felt in this situation.
Scoring and Reporting: 1st and 2nd Year Data
Candidate Demographics • 283 tested in the first year (3 days) and 280 tested in the second year (4 days) • Allocated randomly between days (the day groups proved to be equivalent) • Mean age: ~20 years • ~50/50 male/female ratio • Mother tongue: • 75% Hebrew speakers • 20% Arabic speakers • 5% other languages
Scoring of the Simulation Stations • Each item was evaluated on a scale of 1 (lowest) to 6 (highest); in stations with two observers, the average of the two evaluations was computed. • 4 scores on the 4 aspects: the sum of the item evaluations relating to each factor (see the sketch below).
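A minimal sketch of the aggregation described above: each item is averaged across its observers and the items are then summed within each of the four dimensions. The dimension and item names and the data layout are illustrative assumptions, not the actual MOR scoring forms.

from statistics import mean

def dimension_scores(item_ratings):
    # item_ratings: {dimension: {item: [rater scores on the 1-6 scale]}}
    # Average each item across its observers, then sum the items per dimension.
    return {dim: sum(mean(scores) for scores in items.values())
            for dim, items in item_ratings.items()}

# Hypothetical example: two-observer items for communication, one observer for stress.
example = {"communication": {"conveys_message": [5, 4], "listens": [6, 5]},
           "handling_stress": {"copes_with_task": [4]}}
print(dimension_scores(example))  # {'communication': 10.0, 'handling_stress': 4}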
The Assessment Center – 3 Major Components • Simulations – four dimensions: (1) Communication skills, (2) Handling stress, (3) Initiative & responsibility, (4) Consciousness & self-awareness • Biographical Questionnaire – 21 questions • Judgment and Decision-Making – Dilemma 1, Dilemma 2, Dilemma 3
The Scoring Process • Preparatory workshops for raters – faculty, SPs and psychologists • Simulations – for 3 out of 8 stations: 2 assessors (real-time assessment) • Biographical Questionnaire – two assessors (10 assessors overall); a third assessor is added if there is a large discrepancy; 90 minutes of evaluation per candidate • Judgment and Decision-Making – two assessors (9 assessors overall); a third assessor is added if there is a large discrepancy; 65 minutes of evaluation per candidate (see the sketch below)
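An illustrative sketch of the two-assessor rule with escalation to a third assessor. The discrepancy threshold and the way the third score is combined are assumptions chosen for illustration; the deck does not specify them.

from statistics import mean

def questionnaire_score(score_a, score_b, get_third_score, threshold=1.0):
    # Two independent assessors; if they disagree by more than the (assumed)
    # threshold, a third trained assessor is added and all three are averaged.
    if abs(score_a - score_b) <= threshold:
        return mean([score_a, score_b])
    return mean([score_a, score_b, get_third_score()])

# e.g. questionnaire_score(4.0, 5.5, lambda: 5.0) -> third rater called, result ~4.83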
The Scoring Process – Weights • We assumed that the populations participating in the three days were similar. • Simulations – 60% weight in the final score • Biographical Questionnaire – 20% weight in the final score • Judgment and Decision-Making – 20% weight in the final score • Standardized scale: Mean = 200; SD = 20 (see the sketch below)
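A minimal sketch of the 60/20/20 weighting and the rescaling to the reported scale (mean = 200, SD = 20). It assumes the three component scores are z-standardized within the cohort before weighting; that ordering is an assumption, as the deck states only the weights and the final scale.

from statistics import mean, pstdev

def rescale(values, new_mean=200.0, new_sd=20.0):
    m, s = mean(values), pstdev(values)
    return [new_mean + new_sd * (v - m) / s for v in values]

def mor_scores(sim, bio, jdm):
    # sim, bio, jdm: parallel lists of component scores for the whole cohort
    z = lambda xs: rescale(xs, new_mean=0.0, new_sd=1.0)   # standardize each component
    weighted = [0.6 * s + 0.2 * b + 0.2 * j
                for s, b, j in zip(z(sim), z(bio), z(jdm))]
    return rescale(weighted)                               # MOR scale: mean 200, SD 20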
Attributes and Behaviors Assessed by Simulation Stations: What Do the Simulations Assess? 1. Communication skills a. Ability to convey a message clearly and coherently b. Maintenance of boundaries (respectful attitude towards others) c. Candidate’s ability to engender trust (sincerely convey intentions and limitations) d. Ability to listen (be attentive without interrupting) e. Openness to the other person’s opinion and position (flexibility) f. Ability to behave sensitively and express empathy towards others
Attributes and Behaviors Assessed by Simulation Stations: What Do the Simulations Assess? 2. Handling stress a. Coping with the situation (low score – nervous, non-functioning, pressured by time) b. Coping with the task (involved, contends with frustration, does not blame others, non-judgmental) c. Extent to which the candidate maintained a high level of functioning throughout the task
Attributes and Behaviors Assessed by Simulation Stations: What Do the Simulations Assess? 3. Initiative and responsibility a. Responsibility and initiative (takes personal responsibility and tries to solve the problem, initiates) b. Extent to which the candidate controls the situation (leads, plans, organizes) 4. Consciousness and self-awareness a. Capacity for introspection (to describe own behavior, emotions and feelings) b. Ability to recognize the ethical complexity of the situation
9 Scores Calculated for Each MOR Candidate • 1. Communication skills (39 items) • 2. Handling stress (15 items) • 3. Initiative and responsibility (10 items) • 4. Consciousness and self-awareness (7 items)
9 Scores Calculated for Each MOR Candidate (cont.) • General score for the simulation stations, based on the weighted average of the four dimensions • Judgment and decision-making questionnaire score • Personal/biographical questionnaire score • General MOR score based on agreed-upon weights of the scores in the three previous items (60/20/20) • Final score, calculated as a simple average (50/50) of the general MOR score and the candidate’s aggregate score (GPA + PET) – see the formula below
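A compact restatement of the last two scores, with symbols introduced here for illustration (they are not the deck’s notation), assuming all scores are expressed on a common standardized scale before they are combined:

\text{MOR} = 0.6\,S_{\text{sim}} + 0.2\,S_{\text{bio}} + 0.2\,S_{\text{jdm}},
\qquad
\text{Final} = 0.5\,\text{MOR} + 0.5\,S_{\text{cog}}

where S_sim, S_bio and S_jdm are the simulation, biographical-questionnaire and judgment/decision-making scores, and S_cog is the candidate’s aggregate cognitive score (GPA + PET).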
Reliability-Consistency • Literature: • Reliability of simulations around 0.7 – 0.8 • Estimation methods: • Test-retest 0.70 (N=34) • Internal consistency (Cronbach Alpha) • Inter-rater reliability • Methods based on inclusion criteria
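For reference, the internal-consistency coefficient named above, in its standard form (general background, not taken from the deck):

\alpha = \frac{k}{k-1}\left(1 - \frac{\sum_{i=1}^{k} \sigma^2_{i}}{\sigma^2_{X}}\right)

where k is the number of items, \sigma^2_i the variance of item i, and \sigma^2_X the variance of the total score.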
Inter-Rater Reliability • Stations • Median inter-rater correlation = 0.58 • Corrected for two evaluators (Spearman-Brown) = 0.72 • J&D (3 dilemmas X 3 days) • Median inter-rater correlation = 0.72 • Biographical Questionnaire (3 Days) • Median inter-rater correlation = 0.94
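The Spearman-Brown correction behind the two-evaluator figure, in its standard two-rater form:

r_{2} = \frac{2\,r_{1}}{1 + r_{1}}

where r_1 is the single-rater correlation and r_2 the estimated reliability of the average of two raters; with r_1 ≈ 0.58 this gives roughly 0.73, consistent with the reported 0.72 up to rounding of the underlying median.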
Participants’ Feedback (rating scale: 1 – not at all, 2 – to a minor extent, 3 – to a great extent, 4 – to a very great extent)
Participants’ Feedback: “To what extent is each of the following measures fair as a selection tool for candidates to medical schools?” (same rating scale as above)
The Scoring Process: Conclusions • The scoring process and score reporting for 300 candidates lasted 3–4 weeks. • Reliability measures – good inter-rater reliability (“train the rater”) and high internal consistency. • The make-up of the student body changed by 20%. • Validity measures – current & future research: • Already in process: comparing qualities of students accepted based on MOR with those of students accepted through the previous selection process. • Planned: a longitudinal validity study at several milestones down the road – pre-clinical, clinical, internship…
The Assessment Center – Summary • The Assessment Center – a very complex endeavor – successfully conducted for 2 consecutive years (600 candidates). • Very smooth recruitment and enthusiastic collaboration with faculty members (The Hidden Agenda…). • Both candidates and faculty members expressed high satisfaction regarding the fairness and implementation of the new selection process – high face validity. • Very high national interest – the Technion Medical School joined the process in 2006 (HU is modifying it toward MMI). • High international interest – encourages more elaborate discussion and research regarding non-cognitive admission methods.
The Assessment Center – Discussion • Potential shortcomings • A small and highly select group of candidates, which may show low variability on the different measures • The validation process is long and difficult • Potential cultural biases • Cost…
The Assessment Center – Discussion • Strengths – Reliability & Validity: • Theoretical basis • Multiple methods • Multiple independent ratings (20) • Standard measurements • Significant weight (50%) in the final score • Other consequences • Social message: to candidates, faculty & the public • The make-up of the student body changed by 20% • Dramatic change in atmosphere at TAU: non-cognitive factors are important