Computer Based Testing of Medical Knowledge.
Tom Mitchell, Nicola Aldridge, Intelligent Assessment Technologies Ltd.
Walter Williamson, Faculty of Medicine, University of Dundee.
Peter Broomhead, Brunel University.
Overview.
• Project carried out in the Medical School at Dundee University in autumn 2002 / spring 2003.
• Computerisation of an existing paper-based test of medical knowledge.
• The test comprised 270 short-answer free-text items.
• Marking of the paper-based tests consumed an unsustainable amount of faculty resources.
• A computer system was developed and rolled out for the 2003 tests.
Background.
The GMC defines “core” knowledge which is essential for a medical student.
• The Medical School at Dundee has implemented this by teaching to 12 learning outcomes.
• Assessment of the course involves written and practical tests.
• The GMC review team rated Dundee “Excellent”, but also recommended a new assessment to improve student feedback and course audit: a Progress Test.
Progress Tests.
What is a Progress Test?
• A comprehensive assessment of medical knowledge.
• Informs students about their year-on-year progress against learning outcomes.
• Highlights gaps in their knowledge, and their performance relative to their peers.
• At Dundee the Progress Test is administered annually throughout the five years of the undergraduate programme – each year group sits the same test.
The Dundee Progress Test.
Piloted in April / June 2001.
• Test designed by Professor M. Friedman.
• MCQ format discounted:
  • Testing recall of knowledge, not recognition.
  • “A doctor does not get five choices.”
  • Many US schools are moving to an open-ended format.
• The first test comprised 250 short-answer free-text items. The longer-term aim is to build up a bank of items.
Progress Test Items (1).
Items are short-answer free-text.
• What simple clinical test can distinguish between solid and cystic scrotal swellings?
• Accept: transillumination, shining light through swelling.
• Allow: light goes through cyst.
• Don’t accept on own: shine light at/on/behind…
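As an illustration only, a mark scheme like the one above could be held in a structured form along the following lines; the layout and field names (accept, allow, reject_alone) are hypothetical and are not IAT's actual representation.

```python
# Hypothetical sketch of one progress-test item and its marking guidelines.
# Field names and structure are illustrative, not IAT's actual format.
item = {
    "question": ("What simple clinical test can distinguish between "
                 "solid and cystic scrotal swellings?"),
    "accept": ["transillumination", "shining light through swelling"],
    "allow": ["light goes through cyst"],
    # Phrases that must not earn the mark on their own:
    "reject_alone": ["shine light at", "shine light on", "shine light behind"],
}
```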
Progress Test Items (2).
Free-text responses (mark awarded, response text):
1  transillumination of the area with a light source in a darkened room, cystic lesions will transilluminate but solid ones wont.
1  shine a light through it - cystic lesions allow light through, solid lesions don't
1  Illumination - can light pass throught the swelling - cystic if it does
1  shine a torch behind the swelling. cystic swelling will transilluminate
1  using a torch to shine a light through the swelling
1  Tranillumination of the scrotum with a torch
1  trans illumination of the scrotal swelling
0  using a pen torch to illuminate the swelling
0  illumination of the swelling using a light source
Paper-Based Testing (1).
• 150+ students per academic year, 750 – 800 students in total.
• 3 hour test, 250 – 270 short-answer free-text items.
• Admin: printing and collation of different 30-page test booklets (items in different order), test administration, script storage, etc.
• Marking: 800 scripts, 750 x 240 = 180,000 items to mark, plus data entry; rapid feedback required.
• Plus, moderation of the marking guidelines is required.
Paper-Based Testing (2).
Moderation.
• To achieve consistent marking, the marking guidelines must be moderated in light of real student responses.
• The approach at Dundee was to use the Year 5 marking process to moderate the marking guidelines.
• A group of senior academics marks Year 5; the resulting marking guidelines are then used by a team of 6 markers to mark all other years.
Paper-Based Testing (3).
Problems with the paper test.
• Moderation. Script-by-script marking is a tedious and inefficient way to moderate marking guidelines, and required a significant time commitment from senior academics.
• Marking. ≈160 scripts per year group; a team of 6 markers can together mark around 15 scripts per hour, amounting to roughly 30 man-days just to mark the scripts (see the rough calculation below).
• Admin. Data entry for 180,000 marks.
• Feedback. Due to the intensity of the work required, timely feedback was not achieved.
Conclusion: the paper-based progress test was “unsustainable”.
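For a rough sense of where the "≈ 30 man-days" figure comes from, here is a back-of-the-envelope calculation. The assumptions are mine and not stated on the slide: the 6-marker team handles the four year groups other than Year 5 (about 160 scripts each), and a working day is 8 hours.

```python
# Back-of-the-envelope check of the marking workload quoted on the slide.
# Assumptions (mine, not from the slides): the 6-marker team marks the four
# year groups other than Year 5 (~160 scripts each); a working day is 8 hours.
scripts = 4 * 160      # scripts handled by the 6-marker team
team_rate = 15         # scripts marked per hour by the whole team
markers = 6
hours_per_day = 8

team_hours = scripts / team_rate                    # ~43 hours of team marking
person_days = team_hours * markers / hours_per_day  # ~32 person-days
print(f"{team_hours:.0f} team-hours, about {person_days:.0f} person-days")
```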
A Computerised Pilot (1).
A computerised pilot ran in autumn 2002:
• To assess the reaction of the students to a computerised progress test;
• To examine the accuracy of computerised marking for progress test items;
• To contribute towards defining the specification of a full system.
The pilot system used IAT’s free-text marking engine, AutoMark (see the 2002 CAA paper).
Computerised Marking.
• How do we mark free-text responses by computer?
• IAT’s marking engine does not operate on raw text, but on the output of a sentence analyser.
A Computerised Mark Scheme.
How do we represent the mark scheme?
• Each mark scheme answer is represented as a template.
• Each template specifies one particular form of acceptable or unacceptable answer (a toy illustration follows below).
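AutoMark's actual templates are matched against the output of the sentence analyser rather than against raw text. As a rough, hypothetical illustration of the template idea only, the sketch below matches simple keyword patterns against raw responses; the patterns and function names are assumptions of mine, not IAT's implementation.

```python
import re

# Toy illustration of template-based marking. AutoMark matches templates
# against parsed sentence structure; this sketch uses plain keyword patterns
# on raw text purely to show the idea of one template per acceptable (or
# unacceptable) form of answer. All patterns here are hypothetical.
TEMPLATES = [
    (re.compile(r"trans[\s-]*illuminat", re.I), 1),                # "transillumination"
    (re.compile(r"(shine|pass).*(light|torch).*through", re.I), 1),
    (re.compile(r"light.*through.*(cyst|swelling)", re.I), 1),
    (re.compile(r"illuminat|light|torch", re.I), 0),                # not acceptable alone
]

def mark(response: str) -> int:
    """Award the mark of the first matching template, else 0."""
    for pattern, score in TEMPLATES:
        if pattern.search(response):
            return score
    return 0

# Two of the sample responses from the earlier slide:
print(mark("shine a light through it - cystic lesions allow light through"))  # -> 1
print(mark("using a pen torch to illuminate the swelling"))                   # -> 0
```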
A Computerised Pilot (2).
The Pilot.
• Computerised mark schemes were developed for 25 items used in previous years’ progress tests.
• An online test comprising the items was delivered to approximately 30 students in November / December 2002.
• Student responses were computer marked, and the marking accuracy analysed.
• The error in computerised marking was ≈1%.
• Student feedback from the pilot was positive.
After Moderation.
Subsequent to moderation of the marking guidelines:
• Where necessary, computerised mark schemes were re-worked.
• Any outstanding tests were re-marked, and the results output.
• The re-worked computerised mark schemes are now considered “moderated”, and can be used to mark future tests with a high level of confidence.
Conclusions on Moderation.
The academics’ view:
• Being able to view all student responses to an item together is a major advantage.
• The process of moderation via computer is actually a positive experience for academics – it could lead to better item writing.
• On-screen moderation was quicker than expected; responses could be scanned quickly, and most items required little input.
• Computer-assisted moderation is a significant improvement over the previous “ordeal”.
Accuracy of Marking (1).
Data from Year 5 moderation:
• 5.8% of marks were changed by moderators.
• Most (4.2%) were due to omissions in the original marking guidelines or problems in item wording.
• Only 1.6% were due to errors in computerised marking.
After re-working the computerised mark schemes:
• Agreement between moderated marks and computerised marking was 99.4% for Year 5.
• The remaining 0.6% error was due to system errors in the marking engine.
Accuracy of Marking (2).
• Responses from 10 Year 2 and Year 3 students, selected at random, were hand marked.
• The mean error from the sample was 0.22%; the highest error was 0.74%.
Accuracy of Marking (3).
As a further check, 4 Year 5 students were chosen:
• Two who had unexpectedly over-performed, and two who had unexpectedly under-performed.
• Their responses were hand marked.
• No discrepancies between human and computer marking were encountered.
Human vs. Computerised Marking.
Hand-marking the progress test is onerous.
• 800 scripts, 270 items per script; a team of 6 markers can mark approx. 15 scripts per hour.
• The error in hand marking has been measured at between 5% and 5.5% (two studies).
• This is comparable with unmoderated computerised marking (5.8%).
• Moderated computerised marking is significantly better – of the order of 1%.
Conclusions (1).
Advantages of the computerised system include:
• Moderation is less painful, and more productive.
• After sample-based moderation, re-marking takes hours, not weeks of work.
• For this test, marking accuracy is actually improved.
• Production of reports is automated; data entry is not required.
• Moderated items can be re-used in future tests.
• Flexibility of test-taking is greatly increased.
Conclusions (2).
The model of computerised marking and computer-assisted moderation can benefit CAA:
• Enables the use of educationally valued free-text items.
• The “credibility gap” is addressed – marking can be checked and moderated on a sample of the cohort.
• Enables banks of moderated free-text items to be assembled.
• The moderation process benefits item-writing – better assessment, not just better CAA.
Future Work.
Project:
• Complete testing of the remaining 150+ students.
• Add new items for next year’s tests.
Technology:
• Enable item writers / academics to create, test, and modify computerised mark schemes.
• Integrate the marking / moderation functionality with QuestionMark Perception.
Computer Based Testing of Medical Knowledge. www.IntelligentAssessment.com