Transforming Multiple Choice Questions to Effectively Assess Application of Knowledge. STReME Series, August 11, 2011. Brenda Roman, MD, Professor of Psychiatry, BSOM; Paul Koles, MD, Associate Professor of Pathology and Surgery, BSOM.
Journey through Lunch • Power and Purposes of Assessment • Learning Approaches and Assessment • Assessment Using Multiple-Choice Questions (MCQs) • Evaluation of MCQ Quality • Identification of Flaws in MCQs • Practice: Find the Flaws • Practice: Choose the Highest-Quality MCQ
Q1: Of the criteria listed below, which one do you believe is most important for judging the quality of a multiple choice question (MCQ)? • The MCQ assesses knowledge that is considered important by the writer of the question. • The MCQ is directly related to one or more of the course’s learning objectives. • The MCQ asks the student to make a decision that is based on critical interpretation of data. • The MCQ requires the student to appropriately apply knowledge, not just to recall facts.
Flaws in the previous MCQ • Options: non-homogeneous ((a) and (b) are about content; (c) and (d) are about format and purpose); unnecessarily long; only (d) has a contrasting clause • Stem: the question cannot be answered if the answer options are covered up; “judging the quality” does not say which aspect of quality is meant; “do you believe” implies that the best answer is a matter of personal opinion (there is no single best answer)
Power of Assessment • “Assessment drives student learning. Student assessment can be designed to foster the development of elaborated knowledge structure by making relationships and understanding—rather than isolated facts—the objects of assessment.” Bordage G: Elaborated Knowledge: A Key to Successful Diagnostic Thinking. Acad Med 69:883-885, 1994
Purposes of Assessment (using written questions) • Assumption: performance on a sample of questions allows inferences about the skills of examinees in a broader domain • Communicate what instructor views as important • Motivate students to learn • Allow objective comparisons among students who often experience variations in curriculum • Compensate for instructional gaps by encouraging students to read broadly and utilize a variety of educational tools Case SM, Swanson DB; Constructing Written Test Questions for the Basic and Clinical Sciences, 3rd edition, NBME 2002
Assumption Refuted: Physicians who pass licensure exams may lack some essential skills for practicing medicine
Learning Behavior • Learning behavior: “. . .the set of cognitive and metacognitive processes that learners draw on to acquire knowledge, skills, and understanding” (Mitchell R; Acad Med 84:918-926, 2009) • 424 residents from 7 IM residencies completed a cognitive behavior survey (140 items, 7-point Likert scale) • Seven learning behavior scales developed from survey data: memorization, conceptualization, reflection, independent learning, critical thinking, meaningful learning experience, attitude toward educational experience • RESULTS • Memorization was not positively correlated with the other 6 scales • Memorization was negatively correlated with critical thinking • Residents in the top 20% on the reflection scale also conceptualized, learned independently, and thought critically more than the bottom 20%
Competent Physicians • Integrate: “to bring together parts into a whole” (Webster’s)
Assessment in Medical Education • Primary purpose: measure student’s competence in course, clerkship, or residency • Secondary purpose: develop competent physicians • Motivate student to integrate new knowledge with previously mastered knowledge (longitudinal learning) • Foster critical thinking skills (clinical decision-making) • Impart direction for future learning (subliminal messages embedded in assessments)
Learning Approaches and Assessment • Students adapt learning approaches to the context in which learning occurs • Three basic approaches identified • Surface (memorization) • Deep (comprehension and application) • Strategic (adapted to meet perceived expectations of faculty) • Teaching methods influence students’ approach to learning • Some teaching methods hinder development of a deep learning approach • Education of competent physicians requires “substantial changes in teaching, curriculum and, particularly, assessment . . .” (Newble DI, Entwistle NJ: Learning Styles and Approaches: Implications for Medical Education. Medical Education 1986; 20:162-175)
Can MCQs assess learner’s ability to apply knowledge by critical thinking and problem solving?
Note: Bloom’s taxonomy of cognitive learning collapsed into 3 levels: (1) knowledge; (2) comprehension and application; (3) problem solving
MCQs using clinical vignettes in the stem • “Questions with rich descriptions of clinical context invite the more complex cognitive processes that are characteristic of clinical practice.” • “Conversely, context-poor questions can test basic factual knowledge but not its transferability to real clinical problems.” Epstein RM: Assessment in Medical Education, New England Journal of Medicine 2007; 356:387-396.
“There is nothing new under the sun” (Ecclesiastes 1:9) • “No teaching should be done without a patient for a text.” (Osler William: On the Need of A Radical Reform in our Methods of Teaching Medical Students; Medical News 82:49-53, 1904.) • NBME announcement 2010-2011: decision to use only clinical or experimental vignette formats on USMLE step 1.
Format of Clinical Vignette • Outline (not all parts necessary) • Age and gender (“42-year-old woman”) • Site of care (“comes to the emergency department”) • Presenting complaint (“because of headache”) • Duration (“has persisted for 2 days”) • Past history (may not be relevant) • Physical findings (“pulsating artery anterior to ear”) • +/- diagnostic studies; +/- treatments • Example • “What area is supplied with blood by the posterior inferior cerebellar artery?” • “A 62-year-old man develops left-sided limb ataxia, Horner’s syndrome, nystagmus, and loss of appreciation of facial pain and temperature sensations. Which of the following arteries is most likely to be occluded?”
How good is this MCQ? • Subjective methods to evaluate quality • Opinion of question author • Opinions of other content experts • Opinions of experienced MCQ writers • Opinions of students (pre-test, post-test) • Systematic identification of flaws by question author and trusted consultants (YOU ARE THE CONSULTANTS!) • Gold standard: performance of MCQ in an exam, as demonstrated by difficulty index and discrimination factor
Gold Standard: Performance of MCQ on an examination • [Table of item performance data not reproduced; columns: Year, N, difficulty index, top 25%, bottom 25%, discrimination factor, answer, A, B, C, D, E] • Difficulty index: percentage of examinees who answered the question correctly • Discrimination factor: how well the item discriminates between students who performed highest on the exam (top 25%) and students who performed lowest on the exam (bottom 25%). A higher discrimination factor suggests the item is a more reliable measure of competence
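The two item statistics defined above can be sketched in a few lines of Python. This is a minimal illustration, not the scoring method used for the exams in this talk: the slide does not give a formula for the discrimination factor, so the sketch assumes one common convention (proportion correct in the top 25% minus proportion correct in the bottom 25%). The function names and sample data are hypothetical.

```python
from typing import Sequence


def difficulty_index(correct: Sequence[bool]) -> float:
    """Percentage of examinees who answered the item correctly."""
    return 100.0 * sum(correct) / len(correct)


def discrimination_factor(
    exam_scores: Sequence[float],
    correct: Sequence[bool],
    fraction: float = 0.25,
) -> float:
    """Assumed formula: proportion correct among the top-scoring group
    minus proportion correct among the bottom-scoring group."""
    n = len(exam_scores)
    k = max(1, int(n * fraction))  # group size, at least one examinee
    # Rank examinees by total exam score, highest first.
    order = sorted(range(n), key=lambda i: exam_scores[i], reverse=True)
    top, bottom = order[:k], order[-k:]
    p_top = sum(correct[i] for i in top) / k
    p_bottom = sum(correct[i] for i in bottom) / k
    return p_top - p_bottom


# Hypothetical data: 8 examinees, total exam scores and item results.
scores = [95, 90, 85, 80, 75, 70, 65, 60]
item_correct = [True, True, True, True, True, False, True, False]
item_difficulty = difficulty_index(item_correct)
item_discrimination = discrimination_factor(scores, item_correct)
```

Under this assumed convention the value ranges from -1 to 1; a negative value would mean the lowest-scoring students answered the item correctly more often than the highest-scoring students.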
Systematic Identification of Flaws in MCQs • A) 5 common flaws in stems • B) 8 common flaws in answer options
Systematic Identification of Flaws Pre-Exam: MCQ Stems A1. Stem does not end with a question (lead-in) that can be answered by covering up answer options. • A 39-year-old female is seen for an annual exam. She had been on oral contraceptive pills as a teenager but discontinued that form of contraception over 15 years ago. Because of her contraceptive practice she has . . . • According to the best scientific evidence available to date, HIV-1 came from . . . • Prostate cancer is best treated . . . • Corticosteroid therapy . . .
Systematic Identification of Flaws Pre-Exam: MCQ Stems A2. Stem is unnecessarily complicated: too long, lots of irrelevant information. • A 48-year-old woman presents to the physician with lower back pain. She states that she has had the pain for about 2 weeks and that it has become steadily more severe. An x-ray film shows a lytic bone lesion in her lumbar spine. Review of systems reveals the recent onset of mild headaches, nausea, and weakness. Her CBC shows a normocytic anemia, and her erythrocyte sedimentation rate is elevated. Urinalysis shows heavy proteinuria, and a serum protein electrophoresis shows a monoclonal peak of IgG. Which of the following is responsible for this patient’s spinal lesions? • Bence-Jones protein • lymphoplasmacytoid proliferation • osteoblast activating factor • osteoclast activating factor • primary amyloidosis
Systematic Identification of Flaws Pre-Exam: MCQ Stems A3. Stem contains vague terms that invite a wide range of interpretations. A B-cell-deficient toddler recovers as well as a normal child does to infection with the chickenpox virus. This child's immune system is capable of developing . . .
Systematic Identification of Flaws Pre-Exam: MCQ Stems • A4. Stem contains abbreviations that are not clearly understood by all examinees. A 32yo WF in her 1st trimester of pregnancy experiences GERD 3-4x/week and c/o heartburn. She has not responded to MOM. Which medication will be best to treat this patient?
Systematic Identification of Flaws Pre-Exam: MCQ Stems • A5. Stem contains words about quantity that are difficult or impossible to quantify: probably, usually, infrequently, sometimes, in most cases, in few cases, etc. In most cases, men who develop prostate cancer usually have limited dietary intake of which of the following food groups?
Systematic Identification of Flaws Pre-Exam: MCQ Answer Options B1. One or more options do not follow grammatically from the stem. • Which of the following behaviors is most frequently observed in adolescents who smoke cigarettes? • intelligence quotient below 80 • overeating • body mass index < 25 • disrespect for authority • alcohol abuse
Systematic Identification of Flaws Pre-Exam: MCQ Answer Options B2. Options are heterogeneous in language or domains. • Which is necessary for the development of Burkitt lymphoma? • creation, by translocation, of a bcr/abl fusion gene in B-lymphocytes • deletion of p53 tumor suppressor gene in B-lymphocytes • infection of B-lymphocytes by Epstein-Barr virus • over-expression of the c-myc oncogene in B-lymphocytes • trisomy of chromosome 8
Systematic Identification of Flaws Pre-Exam: MCQ Answer Options B3. Option includes absolute terms that make it unlikely to be correct: “always”, “never” • In patients with advanced dementia due to Alzheimer disease, the memory defect • can be treated adequately with phosphatidylcholine (lecithin). • could be a sequela of early parkinsonism. • is never seen in patients with neurofibrillary tangles in the cerebral cortex. • is never severe. • possibly involves the cholinergic system.
Systematic Identification of Flaws Pre-Exam: MCQ Answer Options B4. Correct option is longer, more specific, or more complete than other options (“sore thumb”). • Secondary gain is • synonymous with malingering. • a frequent problem in obsessive-compulsive disorder. • a complication of a variety of illnesses and tends to prolong many of them. • never seen in organic brain damage.
Systematic Identification of Flaws Pre-Exam: MCQ Answer Options B5. Correct option contains the most elements in common with other options (“convergence”). • Intramedullary destruction of red blood cells in beta-thalassemia is best explained by which mechanism? • beta-4 tetramer oxidation and precipitation • excessive iron accumulation in macrophages • increased formation of alpha chain aggregates • increased formation of Hb H (beta 4) • increased formation of Hb F (alpha 2 gamma 2)
Systematic Identification of Flaws Pre-Exam: MCQ Answer Options B6. Options are long, complicated, or composed of 2-3 parts, imposing irrelevant difficulty. • The figure below shows the dose-response curves for four different derivatives of a muscarinic receptor agonist. Each derivative acts by binding to the same site on the muscarinic receptor. The Heptyl derivative • has a lower binding affinity for the receptor than does the Hexyl derivative. • has a lower intrinsic activity than does the Hexyl derivative because it has a lower receptor affinity. • is a full agonist when compared with the Octyl derivative. • is more potent than the Hexyl derivative. • may act as a mixed agonist-antagonist if it has a higher receptor affinity than the Hexyl derivative.
Systematic Identification of Flaws Pre-Exam: MCQ Answer Options B7. Options contain words about quantity that are difficult or impossible to quantify: probably, usually, infrequently, sometimes, in most cases, in few cases, etc. • Severe obesity in early adolescence • usually responds dramatically to dietary regimens. • often is related to endocrine disorders. • has a 75% chance of resolving spontaneously. • shows a poor prognosis. • usually responds to pharmacotherapy and intensive psychotherapy.
Systematic Identification of Flaws Pre-Exam: MCQ Answer Options B8. “none of the above” or “all of the above” is used as an option. • Which of the following cities is closest to New York City? • Boston • Chicago • Dallas • Los Angeles • None of the above
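Some of the flaws above depend only on surface wording, so a rough first-pass screen can be automated before human review. The sketch below is a hypothetical helper, not part of the workshop materials. It covers only flaws A1, A5, B3, B7, and B8, uses only the trigger words named on the slides, and treats "stem ends with a question mark" as a crude stand-in for the cover-the-options test. Flaws that require judgment (heterogeneous options, convergence, the "sore thumb" option) are out of its reach.

```python
import re
from typing import List, Sequence

# Trigger words taken from the slides (B3, and A5/B7 respectively).
ABSOLUTE_TERMS = ("always", "never")
VAGUE_QUANTITY_TERMS = (
    "probably", "usually", "infrequently", "sometimes",
    "in most cases", "in few cases",
)
CATCH_ALL_OPTIONS = ("none of the above", "all of the above")


def _mentions(text: str, terms: Sequence[str]) -> bool:
    """True if any term appears in text as a whole word or phrase."""
    lowered = text.lower()
    return any(
        re.search(r"\b" + re.escape(term) + r"\b", lowered) for term in terms
    )


def screen_mcq(stem: str, options: Sequence[str]) -> List[str]:
    """Return the codes of wording-based flaws detected (heuristic only)."""
    flaws = []
    if not stem.rstrip().endswith("?"):
        flaws.append("A1")  # stem does not end with a question
    if _mentions(stem, VAGUE_QUANTITY_TERMS):
        flaws.append("A5")  # vague quantity words in the stem
    if any(_mentions(o, ABSOLUTE_TERMS) for o in options):
        flaws.append("B3")  # absolute terms in an option
    if any(_mentions(o, VAGUE_QUANTITY_TERMS) for o in options):
        flaws.append("B7")  # vague quantity words in an option
    if any(o.strip().lower().rstrip(".") in CATCH_ALL_OPTIONS for o in options):
        flaws.append("B8")  # "none/all of the above" used as an option
    return flaws


# The B8 example from the slides.
nyc_flaws = screen_mcq(
    "Which of the following cities is closest to New York City?",
    ["Boston", "Chicago", "Dallas", "Los Angeles", "None of the above"],
)
```

A screen like this can only flag candidates for a closer look; the systematic review by the question author and trusted consultants remains the actual check.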
Identify those flaws: Practice MCQ 1 • P1) Which of the following applies to pseudogout? • It occurs frequently in women. • It is seldom associated with acute pain in a joint. • It may be associated with a finding of chondrocalcinosis. • It is clearly hereditary in most cases. • It responds well to treatment with allopurinol. • P1) Of 13 flaws listed in your worksheet, how many flaws are present in this MCQ? • 1 • 2 • 3 • 4 • 5
Identify those flaws: Practice MCQ 2 • P2) A 17-year-old male presents with a two-year history of "severe" acne. He has previously been treated with numerous topical treatments and several different oral antibiotics. Multiple nodules and cysts are present diffusely on the face, shoulders, back, and upper chest. He has multiple depressed scars on the cheeks. He is administered an oral agent which leads to significant improvement in his condition. This agent works by • disruption of bacterial cell membranes. • exfoliation. • increased sebum production. • reduction of androgen levels. • suppression of sebum production. • P2) Of 13 flaws listed in your worksheet, how many flaws are present in this MCQ? • 1 • 2 • 3 • 4 • 5
Identify those flaws: Practice MCQ 3 • P3) A 25-year-old woman consults her physician because she has decided to use oral contraceptives. After the physician asks about history of thrombophlebitis, pulmonary embolus, and smoking (all negative), he proceeds to the physical exam. Vital signs: within normal limits; height 4'0"; weight 85 lbs; HEENT: large head with prominent, rounded forehead; heart, lungs, abdomen: within normal limits; extremities: short arms and legs (compared to trunk length). He writes a prescription for oral contraceptives, but also records her most likely physical diagnosis in the chart. Which molecular abnormality best explains her diagnosis? • constitutive activation of fibroblast growth factor receptor 2 • constitutive activation of fibroblast growth factor receptor 3 • expansion mutation in HOXD13 with altered length of transcription factor • mutation in COL1A1 with deficient synthesis of type 1 collagen • mutation in COL2A1 with deficient synthesis of type 2 collagen • P3) Of 13 flaws listed in your worksheet, how many flaws are present in this MCQ? • 1 • 2 • 3 • 4 • 5
High-Quality MCQ: in principle • “A high-quality multiple-choice question is one that assesses content considered to be important, is free of flaws in both stem and options, and effectively identifies those who can use their knowledge to skillfully assess data and make decisions.” (modified from Case SM, Swanson DB: Constructing Written Test Questions for the Basic and Clinical Sciences, National Board of Medical Examiners, 2002)
Statistical Definition of High-Quality MCQs: ones that perform well on an exam, as judged by difficulty index and discrimination factor • [Table of item performance data not reproduced; columns: Year, N, difficulty index, top 25%, bottom 25%, discrimination factor, answer, A, B, C, D, E] • Discrimination factor: how well the item discriminates between students who performed highest on the exam (top 25%) and students who performed lowest on the exam (bottom 25%). A higher discrimination factor suggests the item is a more reliable measure of competence. • Difficulty index: percentage of examinees who answered the question correctly
Mastery MCQs The data below show performance of 3 MCQs used in a final course exam for BSOM year 2 students. All three assessed the same content domain. All three were classified as “mastery” questions (answered correctly by ≥ 90% of students). QM1) Based on the performance data shown below, which one is the highest-quality MCQ?
Intermediate Difficulty MCQs The data below show performance of 4 MCQs used in a final course exam for BSOM year 2 students. All four assessed the same content domain. All four were classified as “intermediate difficulty” questions (answered correctly by 70.0–89.9% of students). QM2) Based on the performance data shown below, which one is the highest-quality MCQ?
Challenging MCQs The data below show performance of 3 MCQs used in a final course exam for BSOM year 2 students. All three assessed the same content domain. All three were classified as “challenging” questions (answered correctly by < 70% of students). QM3) Based on the performance data shown below, which one is the highest-quality MCQ?
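The three difficulty bands used in these exercises map directly onto the difficulty index. A minimal sketch, assuming the cutoffs quoted on the slides (≥ 90%, 70.0–89.9%, < 70%); the function name is hypothetical, and values between 89.9 and 90 are assigned to the intermediate band as an assumption, since the slides do not say how they are rounded.

```python
def classify_difficulty(percent_correct: float) -> str:
    """Map a difficulty index (percent correct) to the band named on the slides."""
    if not 0.0 <= percent_correct <= 100.0:
        raise ValueError("percent_correct must be between 0 and 100")
    if percent_correct >= 90.0:
        return "mastery"
    if percent_correct >= 70.0:
        return "intermediate difficulty"
    return "challenging"


# Hypothetical difficulty indices for three items.
bands = [classify_difficulty(p) for p in (94.0, 78.5, 61.0)]
```

Within a band, the items are then compared on discrimination factor, which is what questions QM1 through QM3 ask the audience to do.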