Written Assessment
Viera Wardhani
Contents
1. Why Assessment Is Important
2. Types of Assessment & Competency
3. Written Assessment
4. Item Analysis
Reward systems direct performance
[Diagram labels: Motivation, Carrot, Stick, Behavior, Performance]
What do we examine?
[Diagram labels: Subject, Learning Outcome, Competency, Learning Objective, Learning Process, Examination]
Exams drive learning
What is BEST learned is NOT what is TAUGHT; it IS what is TESTED.
Criteria for a good exam
[Diagram labels: Appropriate, Representative, Sufficient; Level of ability, Format of test, Sample of skills tested, Number of tests]
Assessment & Competency
Testing what is intended to be tested
Krathwohl affective levels (highest to lowest)
• Characterization (becoming a defining trait)
• Organization (organizing oneself)
• Valuing (appreciating)
• Responding
• Receiving
Bloom cognitive levels (highest to lowest)
• CREATE: create, design, plan
• EVALUATE: review, critique, judge
• ANALYZE: sort, break down, itemize
• APPLY: apply, calculate, use
• UNDERSTAND: explain, clarify, summarize
• REMEMBER: recall, state
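Psychomotor levels (highest to lowest; credited to Harrow on the slide, although this five-level scheme is commonly attributed to Dave)
• NATURALIZATION: spontaneous, automatic
• ARTICULATION: accurate and fast
• PRECISION: fluent and exact
• MANIPULATION: performed without a model
• IMITATION: imitating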
Assessing "knows how" (Miller's pyramid): specific lessons learned
• Simple, short scenario-based formats work best (Case & Swanson, 2002)
• Validity is a matter of good quality assurance around item construction (Verhoeven et al., 1999)
• Generally, medical schools can do a much better job (Jozefowicz et al., 2002)
• Sharing of (good) test material across institutions is a smart strategy (Van der Vleuten et al., 2004)
[Pyramid levels, top to bottom: Does, Shows how, Knows how, Knows]
Moving from assessing "knows"
Knows: What is arterial blood gas analysis most likely to show in patients with cardiogenic shock?
A. Hypoxemia with normal pH
B. Metabolic acidosis
C. Metabolic alkalosis
D. Respiratory acidosis
E. Respiratory alkalosis
To assessing "knows how"
Knowing how: A 74-year-old woman is brought to the emergency department because of crushing chest pain. She is restless, confused, and diaphoretic. On admission, temperature is 36.7 °C, blood pressure is 148/78 mm Hg, pulse is 90/min, and respirations are 24/min. During the next hour, she becomes increasingly stuporous, blood pressure decreases to 80/40 mm Hg, pulse increases to 120/min, and respirations increase to 40/min. Her skin is cool and clammy. An ECG shows sinus rhythm and 4 mm of ST segment elevation in leads V2 through V6. Arterial blood gas analysis is most likely to show:
A. Hypoxemia with normal pH
B. Metabolic acidosis
C. Metabolic alkalosis
D. Respiratory acidosis
E. Respiratory alkalosis
Assessing "knows how": general lessons learned
• Competence is specific, not generic
• Assessment is only as good as the effort you are prepared to put into it
[Pyramid levels, top to bottom: Does, Shows how, Knows how, Knows]
Miller's pyramid of competence
[Pyramid levels, top to bottom: Does, Shows how, Knows how, Knows]
[Annotations: "Best test: written assessment", "Verbally mediated", "Describe"]
Miller GE. The assessment of clinical skills/competence/performance. Academic Medicine (Supplement) 1990; 65: S63-S7.
Threats to validity in written assessment
Construct underrepresentation:
• Too few items to sample the domain adequately
• Biased, unrepresentative sample of the domain
• Mismatch of sample to domain
• Low score reliability
Construct-irrelevant variance:
• Flawed item formats
• Biased items
• Reading level of items inappropriate
• Items too easy, too hard, or non-discriminating
• Cheating (insecure items)
• Indefensible passing score method
• Teaching to the test
Types of written assessment
Constructed response:
• Short answer (three sentences max)
• Long answer (five pages max)
Selected response:
• MCQ (A type)
• T/F (X type)
• Alternate choice
• Multiple T/F
• Traditional matching
• Extended matching (R type)
• Complex MC (K type)
• Testlets
Strengths
Constructed response:
• Non-cued writing
• Easy to create
• Logic, reasoning, steps in problem solving
• Ease of partial-credit scoring
• In-depth assessment
Selected response:
• Broad, representative content
• Accurate, objective, and reproducible scores
• Defensibility
• Accurate, timely feedback
• Secure reuse of banked items
• Efficient: time, cost, information
Limitations
Constructed response:
• Subjective human scoring
• Limited breadth of content
• Reproducibility issues
• Inefficient
• Limited psychometric & quality control
Selected response:
• Difficult to write well
• Bad public relations: guessing, memorable items
Examples of CR items
• Short:
  • Name and describe the function of each of the bones of the human inner ear
• Long:
  • Discuss the human inner ear, describing in detail how the inner structure relates to hearing
Preparing CR items
• Question stems:
  • Clarity in meaning
  • Depth of understanding
• Number of questions:
  • Representative of the content/domain to be tested
• Scoring:
  • Prepare a model answer (answer key)
  • Structure the scoring:
    • Analytic: composite score
    • Holistic: one score
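The difference between analytic and holistic scoring can be sketched in a few lines. This is only an illustration under stated assumptions: the rubric criteria, their point values, and the names `Criterion` and `analytic_score` are hypothetical and do not come from the slides. An analytic score is built up as a composite of per-criterion marks, whereas a holistic score would be a single overall mark entered directly by the rater.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Criterion:
    """One line of a hypothetical analytic rubric."""
    name: str
    max_points: int


def analytic_score(rubric, awarded):
    """Composite score: sum the marks awarded per criterion.

    Marks are clamped to the range 0..max_points so that a data-entry
    slip cannot push the composite outside the rubric's range.
    """
    total = 0
    for criterion in rubric:
        points = awarded.get(criterion.name, 0)
        total += max(0, min(points, criterion.max_points))
    return total


# Hypothetical rubric for the long inner-ear question (not from the slides).
rubric = [
    Criterion("structures_named", 3),
    Criterion("functions_described", 3),
    Criterion("link_to_hearing", 4),
]

awarded = {"structures_named": 3, "functions_described": 2, "link_to_hearing": 4}
print(analytic_score(rubric, awarded))  # 9
```

Writing the criteria down before marking is what makes the analytic approach easier to reproduce between raters; the holistic approach trades that for speed.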
T/F (X-type) MCQs
Often assumed that they are:
• simple to prepare
• easy to understand
• low in ambiguity
• able to sample more content
• widely adaptable
Research has found that they are:
• difficult to write well
• more likely to be misinterpreted
• more subject to ambiguity
• often sampling less content
• more likely to be discarded by medical test committees after review
To answer T/F (X-type) items (NBME, p.15):
1. The examinee needs to know the content
2. The examinee may also have to decide to what extent a choice is "true"
  • if options are not completely true or completely false, the students have to guess what the item writer had in mind
  • this latter judgement can be unrelated to their clinical or scientific expertise
Criteria for T/F MCQs (NBME, p.16)
Stems must not have:
• imprecise phrases: "is associated with", "is useful for"
• cues: "may be", "could be"
• vague terms: "usually", "frequently"
Options must be absolutely true or false.
A word of caution (regarding trying to write good T/F questions):
In trying to avoid ambiguity in T/F items, medical item writers often find themselves being pushed toward assessing the recall of isolated facts, something normally to be avoided. (NBME, p.18)
Caution:
A true/false decision is not necessarily a trivial exercise (e.g., deciding whether a verdict of guilty or innocent is the correct decision). But writing a T/F question that measures the complex skills behind why one arrived at a T/F decision is very difficult.
Good example (NBME, p.14)
Which of the following is/are X-linked recessive conditions?
A. Hemophilia A (classic hemophilia)
B. Cystic fibrosis
C. Duchenne's muscular dystrophy
D. Tay-Sachs disease
[Continuum: B and D sit at "totally wrong (false)"; A and C sit at "totally correct (true)"]
O & T (multiple T/F item)
The following are autosomal dominant conditions.
A. fibrous dysplasia
B. osteogenesis imperfecta
C. osteoporosis
D. Duchenne's muscular dystrophy
E. achondroplasia
[Continuum from "totally false" to "totally true", with options A to E placed along it]
Flawed example
Which of the following statements about osteoarthritis are true and which are false?
A. there is a genetic predisposition
B. males are more affected than females
C. hip joints are more affected than knee joints
D. obesity is a risk factor in its development
A, B, and C are not absolutely true or false. Experts would unanimously agree only that D is true.
[Continuum from "false" to "true", with options A to D placed along it]
True statements about cystic fibrosis (CF) include (NBME, p.14):
A. The incidence of CF is 1:2000
B. Children with CF usually die in their teens
C. Males with CF are sterile
D. CF is an autosomal recessive disease
A, B, and C are not absolutely true or false; experts can't agree.
The second major type of MCQ: best one of N choices ("A"-type selected response)
A-type writers also face problems in composing a good item. If we can avoid making these common mistakes, students' test-wiseness will not be a major issue.
1. Avoid grammatical cues (NBME, p.19)
e.g., one or more distracters may not follow grammatically from the stem.
Much like a typical medical student, you too can work out that some options could not possibly be the correct answer; in so doing, you have a better chance of selecting the correct answer without knowing it.
2. Avoid irrelevant techniques for making an item more difficult (NBME, p. 22-26)
• options are sometimes made unnecessarily complicated
  - numeric data are not stated in a consistent unit
    A. 20 mg B. 40 g C. 45 oz D. 50 oz
  - options are in non-logical (or non-sequential) order
    A. 120 ml B. 100 ml C. 150 ml D. 115 ml
In valid tests, the difficulty level depends on the level of clinical & scientific reasoning.
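The two flaws in the numeric examples above (mixed units, options out of order) are mechanical enough to check automatically when reviewing an item bank. The following is a minimal sketch, not an established tool: the function name `check_numeric_options` and the assumption that each option has the form "number unit" are illustrative only.

```python
import re

# Assumed option format: a number followed by a unit, e.g. "120 ml".
_OPTION = re.compile(r"^\s*(\d+(?:\.\d+)?)\s*([A-Za-z]+)\s*$")


def check_numeric_options(options):
    """Return a list of problems found in a list of numeric option strings."""
    parsed = []
    for text in options:
        match = _OPTION.match(text)
        if match is None:
            return [f"cannot parse option: {text!r}"]
        parsed.append((float(match.group(1)), match.group(2).lower()))

    units = {unit for _, unit in parsed}
    if len(units) > 1:
        # Ordering cannot be judged until the units agree.
        return ["mixed units: " + ", ".join(sorted(units))]

    values = [value for value, _ in parsed]
    if values != sorted(values) and values != sorted(values, reverse=True):
        return ["options are not in ascending or descending order"]
    return []


print(check_numeric_options(["20 mg", "40 g", "45 oz", "50 oz"]))
print(check_numeric_options(["120 ml", "100 ml", "150 ml", "115 ml"]))
print(check_numeric_options(["100 ml", "115 ml", "120 ml", "150 ml"]))
```

The first call reports the mixed units, the second reports the ordering problem, and the third returns an empty list because the options share one unit and run in ascending order.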
3. Avoid using these imprecise terms in the stem and/or choices (NBME, p. 22-26):
• Most of the time
• Usually
• Frequently
• Likely to occur
• Probably
• Commonly
• Associated with
• Rarely
• Almost never
Consider how much writers vary in what they take these terms to mean.
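A list like this lends itself to a simple screening pass over draft items. The sketch below is an assumption-laden illustration: the term list is copied from the slide, but the function name `find_imprecise_terms` and the plain whole-phrase matching are hypothetical choices, and a human reviewer would still need to judge each hit.

```python
import re

# Terms taken from the slide above; matching is case-insensitive.
IMPRECISE_TERMS = [
    "most of the time",
    "usually",
    "frequently",
    "likely to occur",
    "probably",
    "commonly",
    "associated with",
    "rarely",
    "almost never",
]


def find_imprecise_terms(text):
    """Return the imprecise terms that appear as whole phrases in text."""
    lowered = text.lower()
    return [
        term
        for term in IMPRECISE_TERMS
        if re.search(r"\b" + re.escape(term) + r"\b", lowered)
    ]


print(find_imprecise_terms("Osteoarthritis is usually associated with obesity."))
print(find_imprecise_terms("Which of the following is the most likely diagnosis?"))
```

The first stem is flagged for "usually" and "associated with"; the second returns an empty list, since a "most likely diagnosis" lead-in does not contain any of the listed phrases.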
4. Avoid other common item-writer mistakes (NBME, p. 22-26):
a. logical cues (a subset of options is collectively exhaustive)
b. absolute terms ("always", "never")
c. long correct answer (phrased to be qualified)
d. word repeats (in the stem and the correct choice)
e. convergence strategy (the correct answer includes the most elements in common with the other options)
Having seen what not to do, examine what should be done
Criteria to meet in writing A-type items
• The stem asks a question that a knowledgeable student can answer without looking at the set of answer choices
• A common continuum underlies all answer choices
  • thereby enabling a "best" answer to be found
• All distracters are plausible
• All distracters are roughly the same length as the answer
• The question deals with important concept(s)
  • i.e., does not deal with trivial fact(s)
Example: good A-type item
(Clinical vignette) A 32-year-old male has a 4-day history of progressive weakness in his extremities. He has been healthy except for an upper respiratory tract infection 10 days ago. His temperature is 37.8 °C, blood pressure is 130/80 mm Hg, pulse is 94/min, and respirations are 42/min and shallow. He has symmetric weakness of both sides of the face and the proximal and distal muscles of the extremities. Sensation is intact. No deep tendon reflexes can be elicited; the plantar responses are flexor.
(Lead-in) Which of the following is the most likely diagnosis?
Which of the following is the most likely diagnosis?
(Alternatives)
A. Acute disseminated encephalomyelitis
B. Guillain-Barré syndrome (*)
C. Myasthenia gravis
D. Poliomyelitis
E. Polymyositis
Even though the incorrect answers are not completely wrong, they are less correct than the "keyed" answer. (NBME, p.17)
[Continuum, least correct to most correct: D, C, A, E, B]
Surgery: example of a good A-type item
(Clinical vignette) A 10-month-old infant takes his evening feeding normally and falls asleep. Two hours later he awakens, cries for 10 minutes, and vomits a small amount of thin greenish fluid before falling asleep again. Two hours later he awakens again, cries harder, then has a bloody bowel movement.
(Lead-in) Which of the following is the most likely diagnosis?
Surgery
Which of the following is the most likely diagnosis?
A. Bleeding Meckel's diverticulum
B. Intussusception (*)
C. Midgut volvulus
D. Intestinal obstruction due to ingested foreign body
E. Gastroenteritis
Basic rules for writing A-type items
• Focus on an important concept
  • e.g., a common or potentially catastrophic clinical problem
• Include items that assess application of knowledge
  • not only recall of isolated facts
• Pose a clear question
  • the student can arrive at an answer with the options covered
• Make all distractors homogeneous
  • a common continuum exists for all choices
• Avoid technical flaws in composition
  • see previous slides & Appendix 1 re errors to avoid
Thus: A-type items are also difficult to write (not unlike T/F). But when well written, the A-type format less often forces the writer to deal with isolated facts.
Issue of sampling: A- vs. X-type items
• If the number of questions is kept constant in two different exams:
  • A-type items will normally sample the skills of interest more broadly
• But remember:
  • 100 A-type items take longer to answer than 20 stems each with 5 T/F choices
  • allow an average of 0.75 min per question if English is the mother tongue
  • for students who are examined in English but whose first language is not English, the time needed is ???
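As a rough worked example of the timing point, the sketch below multiplies the number of items by the per-question average. It assumes the 0.75 min figure applies to each A-type item, which the slide does not state explicitly, and the function name `estimated_minutes` is hypothetical. No adjustment for non-native English speakers is included, because the slide leaves that figure open.

```python
def estimated_minutes(n_items, minutes_per_item=0.75):
    """Estimated testing time in minutes for n_items questions.

    The 0.75 default is the slide's average for examinees whose mother
    tongue is English; pass a different value for other assumptions.
    """
    if n_items < 0:
        raise ValueError("n_items must be non-negative")
    return n_items * minutes_per_item


print(estimated_minutes(100))  # 75.0
```

Under that assumption, a 100-item A-type paper needs about 75 minutes of answering time.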