
Lessons learned in assessment

Assessment of medical competence: Past, present, and future? Inaugural address as Honorary Professor in Assessment in Medical Education, Department of Surgery and Internal Medicine, University of Copenhagen. Cees van der Vleuten, Maastricht University.


Presentation Transcript


  1. Assessment of medical competence: Past, present, and future? Inaugural address as Honorary Professor in Assessment in Medical Education, Department of Surgery and Internal Medicine, University of Copenhagen. Cees van der Vleuten, Maastricht University, School of Health Professions Education (www.she.unimaas.nl), Faculty of Health, Medicine & Life Sciences, The Netherlands. Presentation: www.fdg.unimaas.nl/educ/cees. Lessons learned in assessment: moving beyond the psychometric discourse. 24 September 2009.

  2. Overview of presentation • Where is education going? • Lessons learned in assessment • Areas of development and research

  3. Where is education going? • School-based learning • Discipline-based curricula • (Systems) integrated curricula • Problem-based curricula • Outcome/competency-based curricula

  4. Where is education going? • Underlying educational principles: • Continuous learning of, or practicing with, authentic tasks (in steps of complexity; with constant attention to transfer) • Integration of cognitive, behavioural and affective skills • Active, self-directed learning & in collaboration with others • Fostering domain-independent skills, competencies (e.g. team work, communication, presentation, science orientation, leadership, professional behaviour….).

  5. Where is education going? The same underlying educational principles as above (continuous learning with authentic tasks; integration of cognitive, behavioural and affective skills; active, self-directed and collaborative learning; fostering of domain-independent skills and competencies), underpinned by instructional design theory, cognitive psychology, collaborative learning theory, cognitive load theory, and empirical evidence.

  6. Where is education going? • Work-based learning • Practice, practice, practice…. • Optimising learning by: • More reflective practice • More structure in the haphazard learning process • More feedback, monitoring, guiding, reflection, role modelling • Fostering of learning culture or climate • Fostering of domain-independent skills (professional behaviour, team skills, etc).

  7. Where is education going? The same principles of work-based learning as above (practice with reflection, structure, feedback, monitoring, guiding, role modelling, a supportive learning culture, and domain-independent skills), underpinned by deliberate practice theory, emerging work-based learning theories, and empirical evidence.

  8. Where is education going? • Educational reform is on the agenda everywhere • Education is professionalizing rapidly • A lot of ‘educational technology’ is available • How about assessment?

  9. Overview of presentation • Where is education going? • Lessons learned in assessment • Areas of development and research

  10. Miller’s pyramid of competence (Knows, Knows how, Shows how, Does): lessons learned while climbing this pyramid with assessment technology. Miller GE. The assessment of clinical skills/competence/performance. Academic Medicine (Supplement) 1990; 65: S63-S67.

  11. Assessing knowing how (1960s): written complex simulations (PMPs).

  12. Key findings for written simulations (Van der Vleuten, 1995) • Performance on one problem hardly predicted performance on another • High correlations with simple MCQs • Experts performed less well than intermediate experts • The stimulus format mattered more than the response format

  13. Assessing knowing how: specific lessons learned • Simple, short, scenario-based formats work best (Case & Swanson, 2002) • Validity is a matter of good quality assurance around item construction (Verhoeven et al., 1999) • Generally, medical schools can do a much better job (Jozewicz et al., 2002) • Sharing (good) test material across institutions is a smart strategy (Van der Vleuten et al., 2004).

  14. Moving from assessing knows1 Knows: What is arterial blood gas analysis most likely to show in patients with cardiogenic shock? A. Hypoxemia with normal pH B. Metabolic acidosis C. Metabolic alkalosis D. Respiratory acidosis E. Respiratory alkalosis 1Case, S. M., & Swanson, D. B. (2002). Constructing written test questions for the basic and clinical sciences. Philadelphia: National Board of Medical Examiners.

  15. To assessing knowing how1 Knowing How: A 74-year-old woman is brought to the emergency department because of crushing chest pain. She is restless, confused, and diaphoretic. On admission, temperature is 36.7 C, blood pressure is 148/78 mm Hg, pulse is 90/min, and resp are 24/min. During the next hour, she becomes increasingly stuporous, blood pressure decreases to 80/40 mm Hg, pulse increases to 120/min, and respirations increase to 40/min. Her skin is cool and clammy. An ECG shows sinus rhythm and 4 mm of ST segment elevation in leads V2 through V6. Arterial blood gas analysis is most likely to show: A. Hypoxemia with normal pH B. Metabolic acidosis C. Metabolic alkalosis D. Respiratory acidosis E. Respiratory alkalosis 1Case, S. M., & Swanson, D. B. (2002). Constructing written test questions for the basic and clinical sciences. Philadelphia: National Board of Medical Examiners.

  16. http://www.nbme.org/publications/item-writing-manual.html

  17. The Maastricht item review process [diagram]: items from disciplines such as anatomy, physiology, internal medicine, surgery and psychology are managed in an item pool and item bank, with pre-test review by a review committee, test administration, post-test review (item analyses, student comments), and information fed back to users.

  18. Assessing knowing how: general lessons learned • Competence is specific, not generic • Assessment is only as good as what you are prepared to put into it.

  19. Assessing showing how (1970s): performance assessment in vitro (OSCE).

  20. Key findings around OSCEs1 • Performance on one station poorly predicted performance on another (many OSCEs are unreliable) • Validity depends on the fidelity of the simulation (many OSCEs are testing fragmented skills in isolation) • Global rating scales do well (improved discrimination across expertise groups; better intercase reliabilities; Hodges, 2003) • OSCEs impacted on the learning of students 1Van der Vleuten & Swanson, 1990

  21. Reliabilities across methods, by testing time in hours:

                                 1 h    2 h    4 h    8 h
  MCQ1                           0.62   0.76   0.93   0.93
  Case-based short essay2        0.68   0.73   0.84   0.82
  PMP1                           0.36   0.53   0.69   0.82
  Oral exam3                     0.50   0.69   0.82   0.90
  Long case4                     0.60   0.75   0.86   0.90
  OSCE5                          0.47   0.64   0.78   0.88

  1Norcini et al., 1985; 2Stalenhoef-Halling et al., 1990; 3Swanson, 1987; 4Wass et al., 2001; 5Petrusa, 2002
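  Reliability-per-testing-time figures like these are commonly extrapolated from a single test administration with the Spearman-Brown prophecy formula; the following is a minimal sketch of that standard psychometric step (an assumption about how such values are derived, not something stated on the slide):

  \rho_k = \frac{k \, \rho_1}{1 + (k - 1)\,\rho_1}

  where \rho_1 is the reliability of one hour of testing and k is the factor by which testing time is extended. For example, starting from the OSCE value \rho_1 = 0.47, the formula gives \rho_2 = 2(0.47)/(1 + 0.47) ≈ 0.64 and \rho_8 = 8(0.47)/(1 + 7 × 0.47) ≈ 0.88, which reproduces the tabulated OSCE values.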

  22. Checklist or rating scale reliability in the OSCE (Van Luijk & van der Vleuten, 1990) [figure].

  23. Assessing showing how: specific lessons learned • OSCE-ology: patient training, checklist writing, standard setting, etc. (Petrusa, 2002) • OSCEs are neither inherently valid nor inherently reliable; that depends on the fidelity of the simulation and the sampling of stations (Van der Vleuten & Swanson, 1990).

  24. Assessing showing how: general lessons learned • Objectivity is not the same as reliability (Van der Vleuten, Norman, De Graaff, 1991) • Subjective expert judgment has incremental value (Van der Vleuten & Schuwirth, Eva in prep) • Sampling across content and judges/examiners is eminently important • Assessment drives learning.

  25. Assessing does (1990s): performance assessment in vivo by judging work samples (Mini-CEX, CBD, MSF, DOPS, portfolio).

  26. Key findings for assessing does • Ongoing work; this is where we currently are • Reliable findings point to feasible sampling (8-10 judgments seems to be the magical number; Williams et al., 2003; see the sketch below) • Scores tend to be inflated (Dudek et al., 2005) • Qualitative/narrative information is (more) useful (Sargeant et al., 2010) • Lots of work still needs to be done • How (much) to sample across instruments? • How to aggregate information?
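  A minimal sketch of where such a rule of thumb can come from, assuming (purely for illustration, not a figure from the studies cited above) that a single work-based judgment has a reliability of about 0.30 and that judgments combine as exchangeable samples under the Spearman-Brown formula:

      import math

      def judgments_needed(single_rel, target_rel):
          # Spearman-Brown prophecy formula, solved for the number of independent
          # judgments whose average reaches the target composite reliability.
          k = target_rel * (1 - single_rel) / (single_rel * (1 - target_rel))
          return math.ceil(k)

      # Illustrative assumption: one judgment carries a reliability of ~0.30.
      print(judgments_needed(0.30, 0.80))  # -> 10, in line with the 8-10 rule of thumb

  With a more optimistic single-judgment reliability of 0.40 the same calculation gives 6 judgments, which is one reason the required sample size varies by instrument and setting.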

  27. Reliabilities across methods, by testing time in hours:

                                 1 h    2 h    4 h    8 h
  MCQ1                           0.62   0.76   0.93   0.93
  Case-based short essay2        0.68   0.73   0.84   0.82
  PMP1                           0.36   0.53   0.69   0.82
  Oral exam3                     0.50   0.69   0.82   0.90
  Long case4                     0.60   0.75   0.86   0.90
  OSCE5                          0.47   0.64   0.78   0.88
  Mini-CEX6                      0.73   0.84   0.92   0.96
  Practice video assessment7     0.62   0.76   0.93   0.93
  Incognito SPs8                 0.61   0.76   0.92   0.93

  1Norcini et al., 1985; 2Stalenhoef-Halling et al., 1990; 3Swanson, 1987; 4Wass et al., 2001; 5Petrusa, 2002; 6Norcini et al., 1999; 7Ram et al., 1999; 8Gorter, 2002

  28. Assessing does: specific lessons learned • Reliable sampling is possible • Qualitative information carries a lot of weight • Assessment impacts on work-based learning (more feedback, more reflection…) • Validity strongly depends on the users of these instruments and therefore on the quality of implementation.

  29. Assessing does: general lessons learned • Work-based assessment cannot (yet) replace standardised assessment; or, no single measure can do it all (Tooke report, UK) • Validity strongly depends on the implementation of the assessment (Govaerts et al., 2007) • But there is a definite place for (more subjective) expert judgment (Van der Vleuten et al., 2010).

  30. Competency/outcome categorizations • CanMEDS roles: medical expert, communicator, collaborator, manager, health advocate, scholar, professional • ACGME competencies: medical knowledge, patient care, practice-based learning & improvement, interpersonal and communication skills, professionalism, systems-based practice

  31. Measuring the unmeasurable: ‘domain-independent’ skills, alongside the ‘domain-specific’ skills captured by Miller’s pyramid (Knows, Knows how, Shows how, Does).

  32. Measuring the unmeasurable • Importance of domain-independent skills • If things go wrong in practice, these skills are often involved (Papadakis et al., 2005; 2008) • Success in the labour market is associated with these skills (Meng, 2006) • Practice performance is related to school performance (Papadakis et al., 2004).

  33. Measuring the unmeasurable: domain-independent skills are assessed (mostly in vivo) by relying heavily on expert judgment and qualitative information.

  34. Measuring the unmeasurable • Self assessment • Peer assessment • Co-assessment (combined self, peer, teacher assessment) • Multisource feedback • Log book/diary • Learning process simulations/evaluations • Product-evaluations • Portfolio assessment

  35. Eva, K. W., & Regehr, G. (2005). Self-assessment in the health professions: a reformulation and research agenda. Acad Med, 80(10 Suppl), S46-54.

  36. Falchikov, N., & Goldfinch, J. (2000). Student peer assessment in higher education: A meta-analysis comparing peer and teacher marks. Review of Educational Research, 70(3), 287-322.

  37. Driessen, E., van Tartwijk, J., van der Vleuten, C., & Wass, V. (2007). Portfolios in medical education: why do they meet with mixed success? A systematic review. Med Educ, 41(12), 1224-1233.

  38. General lessons learned • Competence is specific, not generic • Assessment is only as good as what you are prepared to put into it • Objectivity is not the same as reliability • Subjective expert judgment has incremental value • Sampling across content and judges/examiners is eminently important • Assessment drives learning • No single measure can do it all • Validity strongly depends on the implementation of the assessment

  39. Practical implications • Competence is specific, not generic • One measure is no measure • Increase sampling (across content, examiners, patients…) within measures • Combine information across measures and across time • Be aware of (sizable) false positive and negative decisions • Build safeguards in examination regulations.

  40. Practical implications • No single measure can do it all • Use a cocktail of methods across the competency pyramid • Arrange methods in a programme of assessment • Any method may have utility, including the ‘old’ assessment methods, depending on its function within the programme • Compromises on the quality of individual methods should be made in light of their function in the programme • Approach assessment design as you would curriculum design • Responsible people/committee(s) • Use an overarching structure • Involve your stakeholders • Implement, monitor and change (assessment programmes ‘wear out’)

  41. Practical implications • Validity strongly depends on the implementation of the assessment • Pay special attention to implementation (good educational ideas often fail due to implementation problems) • Involve your stakeholders in the design of the assessment • Many naive ideas exist around assessment; train and educate your staff and students.

  42. Overview of presentation • Where is education going? • Where are we with assessment? • Where are we going with assessment? • Conclusions

  43. Areas of development and research • Understanding expert judgment

  44. Understanding human judgment • How does the mind of expert judges work? • How is it influenced? • What is the link between clinical expertise and judgment expertise? • There is a clash between the psychology literature on expert judgment and psychometric research.

  45. Areas of development and research • Understanding expert judgment • Building non-psychometric rigour into assessment

  46. Qualitative methodology as an inspiration

  Criterion       Quantitative approach   Qualitative approach
  Truth value     Internal validity       Credibility
  Applicability   External validity       Transferability
  Consistency     Reliability             Dependability
  Neutrality      Objectivity             Confirmability

  Strategies for establishing trustworthiness: prolonged engagement, triangulation, peer examination, member checking, structural coherence, time sampling, stepwise replication, dependability audit, thick description, confirmability audit.

  Procedural measures and safeguards: assessor training & benchmarking, appeal procedures, triangulation across sources and saturation, assessor panels, intermediate feedback cycles, decision justification, moderation, scoring rubrics, …

  47. Driessen, E. W., Van der Vleuten, C. P. M., Schuwirth, L. W. T., Van Tartwijk, J., & Vermunt, J. D. (2005). The use of qualitative research criteria for portfolio assessment as an alternative to reliability evaluation: a case study. Medical Education, 39(2), 214-220.

  48. Areas of development and research • Understanding expert judgment • Building non-psychometric rigour into assessment • Construction and governance of assessment programmes • Design guidelines (Dijkstra et al., 2009) • Theoretical model for design (Van der Vleuten et al., under editorial review)

  49. Assessment programmes • How to design assessment programmes? • Strategies for governance (implementation, quality assurance)? • How to aggregate information for decision making? When is enough enough?

  50. Areas of development and research • Understanding expert judgment • Building non-psychometric rigour into assessment • Construction and governance of assessment programmes • Understanding and using assessment impacting learning
