The trouble with resits … Dr Chris Ricketts Sub-Dean (Teaching Enhancement), Faculty of Technology and Director of Assessment, Peninsula College of Medicine and Dentistry but School of Mathematics and Statistics
Outline • The coincidences that led me here • Something about educational measurement • The literature • Some theory • The question to which I don’t know the answer (yet!)
(Mental) health warning • Mostly theory and speculation, no results.
Background • Had been working on ‘domain referenced testing’ in PCMD. • Had been thinking about ‘progress testing’. • Chairing university’s assessment review. • Received a paper with the title ‘The trouble with resits…’ to review. • Started to think … • Time to share my problem!
The ‘progress test’ prompt • A ‘progress test’ is a test set at graduation level but sat by students in all years.
The ‘progress test’ prompt • Concept goes back to Goulet (1955). • First practical application described by Arnold & Willoughby (1990). • My question - How can we use prior information (results on previous tests) to improve our estimate of what a student currently knows?
The referee prompt • Looked at resits in clinical examinations • Claimed that ‘it would be a brave assessment team which set a higher pass-mark for a resit …’ • My question – Why do we not do this?
Educational measurement • Educational measurement … is better conceived of as testing student performance on a sample of tasks from the area for purposes of predicting the extent of satisfactory performance in the area as a whole. Bock, Thissen & Zimowski (1997)
Educational measurement vs. competency testing? • Competency testing asks whether you can do a specific task. Multiple tries are sensible. • (Perhaps we need repeated competency assessments? Is a sample of one enough?) • This is different from educational measurement. • Educational measurement is an inference problem. • Take a sample of tasks representative of the whole domain. • On the basis of performance on the sample, we make an inference about the whole domain.
The trouble with resits … • A resit is another sample. How should we treat it?
The literature on re-sits • There’s some info about what people do but very little about why. • Educational Measurement: Issues and Practice has nothing. • Journal of Educational Measurement has nothing. • Assessment and Evaluation in Higher Education has nothing. • “Measurement and assessment in teaching” Linn & Gronlund has nothing. • Can you help?
Some theory (1) • All educational measurements are made with uncertainty • This is usually described as the ‘Standard error of measurement’. • The aim is to come to reliable decisions, usually implying a measurement with a small standard error.
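As a rough illustration of the standard error of measurement, the sketch below uses the classical test theory relation SEM = SD × sqrt(1 − reliability). The function names and the example figures (a score standard deviation of 10, a reliability of 0.8, a pass mark of 50) are assumptions for illustration only and do not come from the talk.

```python
import math


def standard_error_of_measurement(score_sd, reliability):
    """Classical test theory: SEM = SD * sqrt(1 - reliability)."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie between 0 and 1")
    return score_sd * math.sqrt(1.0 - reliability)


def score_band(observed, sem, z=1.96):
    """Approximate band around an observed score (normal errors assumed)."""
    return observed - z * sem, observed + z * sem


# Assumed example: SD of 10 marks, reliability 0.8, observed mark 48.
sem = standard_error_of_measurement(10.0, 0.8)
low, high = score_band(48.0, sem)
print(f"SEM = {sem:.2f}, band = ({low:.1f}, {high:.1f})")
```

With these assumed numbers the band around a mark of 48 spans a pass mark of 50, so a narrow fail on a single sitting may not be a reliable decision.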
Inference and uncertainty • Uncertainty arises because of 1) the sample, 2) other sources of error. • If someone fails, is this because the sample is inappropriate for them? - there is “case specificity”. • A resit is another sample. How should we treat it?
Some theory (2) • Adaptive testing or multi-stage testing. • In classical adaptive testing we give a student a task of average difficulty. • If they pass, they then get a harder task. This gives more information at their particular ability level. • If they fail, they get an easier task. • Assumes ‘unidimensionality’. • Students who hover around the pass mark generally sit longer tests to reduce the standard error of measurement.
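The up-and-down rule described on this slide can be sketched as a simple staircase. This is a deliberate simplification: operational adaptive tests usually choose items with an item response theory model, which is not shown here. The function name `adaptive_test`, the fixed step size, and the deterministic "student" who passes any task at or below their ability are all hypothetical.

```python
def adaptive_test(answers_correctly, n_items=6, start=0.0, step=1.0):
    """Staircase sketch: a harder task after a pass, an easier one after a fail.

    answers_correctly: callable taking a difficulty and returning True/False.
    Returns the list of (difficulty, passed) pairs in the order administered.
    """
    difficulty = start  # begin with a task of average difficulty
    administered = []
    for _ in range(n_items):
        passed = answers_correctly(difficulty)
        administered.append((difficulty, passed))
        # Move up after a pass, down after a fail.
        difficulty += step if passed else -step
    return administered


# Hypothetical student with ability 1.3 on the same scale as difficulty.
def student(difficulty):
    return difficulty <= 1.3


history = adaptive_test(student)
print([d for d, _ in history])
```

In this toy run the administered difficulties settle into a band around the student's ability, which is where each extra task is most informative.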
Some theory (3) • In multi-stage testing we give a student a sample of tasks. • If they are a clear pass or fail the test ends. • Students near the pass mark are given another sample of tasks. If the combined sample gives a clear pass/fail decision then the test stops. • If there is still too much uncertainty, another sample of tasks is given. • Again, students who hover around the pass mark generally sit longer tests to reduce the standard error of measurement.
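The stopping rule for multi-stage testing can be sketched as follows. The sketch assumes each task is scored right or wrong, uses a binomial standard error with a normal approximation, and treats a decision as "clear" when the interval around the cumulative proportion excludes the pass mark. The function name, the 0.5 pass mark, and z = 1.96 are assumptions, not values from the talk.

```python
import math


def multistage_decision(stages, pass_mark=0.5, z=1.96):
    """Pool successive samples of tasks until the pass/fail decision is clear.

    stages: list of (number_correct, number_of_tasks), one entry per stage.
    Returns (decision, stages_used); decision is 'pass', 'fail' or 'undecided'.
    """
    correct = total = 0
    for stage_number, (stage_correct, stage_total) in enumerate(stages, start=1):
        correct += stage_correct
        total += stage_total
        p = correct / total
        # Binomial standard error of the cumulative proportion (crude approximation).
        se = math.sqrt(p * (1.0 - p) / total)
        if p - z * se > pass_mark:
            return "pass", stage_number
        if p + z * se < pass_mark:
            return "fail", stage_number
    return "undecided", len(stages)


print(multistage_decision([(18, 20)]))                      # strong first stage
print(multistage_decision([(11, 20), (14, 20), (17, 20)]))  # borderline first stage
```

Here the strong candidate stops after one stage, while the borderline candidate is given further samples, and the earlier samples still count towards the final decision.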
How should we treat resits?? • Is a resit an independent sample? That’s how we (and everyone else) treat it. • Or is a resit a second sample? Are we using it to increase the sample size?
Should we use prior information?? • After the first test we have an indication that a student who fails has not mastered the content/tasks that we expect. • Should we use that information when we assess the resit? • If we should, how would it work?
How would it work? • The student’s mark on the combined first attempt and resit is used to make the pass/fail decision.
How would it work? • Students who narrowly fail on the first attempt would only have to improve slightly to pass. • Students who fail badly on the first attempt would have to improve substantially to pass.
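To make the arithmetic concrete, here is a minimal sketch in which the combined mark is a weighted average of the first attempt and the resit. The slides do not say how the two marks would be combined, so the equal weighting, the pass mark of 50, and the function names are all assumptions.

```python
def combined_mark(first_mark, resit_mark, resit_weight=0.5):
    """Weighted average of first attempt and resit (equal weights assumed)."""
    return (1.0 - resit_weight) * first_mark + resit_weight * resit_mark


def required_resit_mark(first_mark, pass_mark=50.0, resit_weight=0.5):
    """Resit mark needed for the combined mark to reach the pass mark."""
    return (pass_mark - (1.0 - resit_weight) * first_mark) / resit_weight


for first in (48.0, 30.0):
    needed = required_resit_mark(first)
    print(f"first attempt {first:.0f} -> resit mark needed {needed:.0f}")
```

Under these assumptions a student who scored 48 needs 52 on the resit, while a student who scored 30 needs 70, so the effective resit pass mark rises as the first-attempt mark falls.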
Implications and discussion • I need help! • Your thoughts? • Anyone know any literature on the ‘Theory of resits’???