Formative and summative classroom assessment: Where we are and where we might go Dylan Wiliam, UCL



  1. Formative and summative classroom assessment: Where we are and where we might go. Dylan Wiliam, UCL. NCME Special Conference on Classroom Assessment: Assessment in the Disciplines, October 2018

  2. Outline • Three strands of classroom assessment research • Formative assessment: Critiques and responses • Putting the pieces back together

  3. Three important strands • Classroom evaluation • Teaching as a contingent activity • Student voice

  4. I. Classroom assessment • Classroom assessment is not (necessarily) formative assessment (and vice-versa) • Location • Relation to instruction (synchronous vs. asynchronous) • Purpose (instructional guidance, evaluation) • Environment, resources and conditions • Authority (teacher, peer, learner, other) • Agent (teacher, peer, learner) • Subject (individuals, groups, class) • Assessor (teacher, peer, learner, machine) Black and Wiliam (2004)

  5. Research on classroom evaluation processes • “The impact of evaluation processes on students” (Natriello, 1987) • “The impact of classroom evaluation practices on students” (Crooks, 1988) “For the purpose of this review, classroom evaluation is defined as evaluation based on activities that students undertake as an integral part of the educational programs in which they are enrolled.” (p. 437) • inside and outside the classroom • curriculum-embedded and terminal tests • teacher designed and “off-the-shelf” tests • adjunct questions and other exercises in learning materials • oral questions

  6. II. Teaching as a contingent activity • Individualized instructional systems • 1912: The Individual System (Burke) • 1919: The Winnetka Plan (Washburne) • 1919: The Dalton Plan (Parkhurst) • 1925: Teaching machines (Pressey) • … • 1966: Learning for mastery (Bloom) “In fact, we may even insist that our educational efforts have been unsuccessful to the extent to which our distribution of achievement approximates the normal distribution.” (Bloom, 1968)

  7. Evaluation: summative and formative (Cronbach) “But to call in the evaluator only upon the completion of course development, to confirm what has been done, is to offer him a menial role and to make meager use of his services. To be influential in course improvement, evidence must become available midway in curriculum development, not in the home stretch, when the developer is naturally reluctant to tear open a supposedly finished body of materials and techniques. Evaluation, used to improve the course while it is still fluid, contributes more to improvement of education than evaluation used to appraise a product already placed on the market.” (Cronbach, 1963)

  8. Evaluation: summative and formative (Scriven) “And there are many contexts in which calling in an evaluator to perform a final evaluation of the project or person is an act of proper recognition of responsibility to the person, product, or taxpayers. It therefore seems a little excessive to refer to this as simply ‘a menial role’, as Cronbach does.” “It is obviously a great service if this kind of terminal evaluation (we might call it summative as opposed to formative evaluation) can demonstrate that an expensive textbook is not significantly better than the competition, or that it is enormously better than any competitor.” (Scriven, 1963, emphasis in original)

  9. Summative and formative evaluation (Bloom) “Much of what we have been discussing in the section on the effects of examinations has been concerned with what may be termed “summative evaluation.” This is the evaluation which is used at the end of a course, term, or educational program. Although the procedures for such evaluation may have a profound effect on the learning and instruction, much of this effect may be in anticipation of the examination or as a short- or long-term consequence of the examination after it has been given.”

  10. Teaching as a contingent activity “Quite in contrast is the use of “formative evaluation” to provide feedback and correctives at each stage in the teaching-learning process. By formative evaluation we mean evaluation by brief tests used by teachers and students as aids in the learning process. While such tests may be graded and used as part of the judging and classificatory function of evaluation, we see much more effective use of formative evaluation if it is separated from the grading process and used primarily as an aid to teaching.” (Bloom, 1969, pp. 47-48)

  11. III. The student’s voice: Records of achievement • School leaving examinations in England • Top 20%: General Certificate of Education (1951-1987) • Next 40%: Certificate of Secondary Education (1965-1987) • “Half our future” (Newsom Report, 1963) “Boys and girls who stay at school until they are 16 may reasonably look for some record of achievement when they leave. Some form of leaver's certificate which combined assessment with a record of the pupil's school career would be valued by parents, future employers and colleges of further education and should, we believe, be available to all pupils who complete a full secondary course.” (p. 80)

  12. Research synthesis: Configurative and aggregative (configurative pole listed first, aggregative pole second) • Philosophy: Idealist to Realist • Relation to theory: Generate, Explore, Test • Approach to synthesis: Configurating to Aggregating • Methods: Iterative to A priori; Theoretical search to Exhaustive search • Quality assessment: Value contribution to Avoid bias • Product: Emergent concepts to Magnitude/precision • Use: Enlightenment to Instrumental Gough (2012)

  13. Where should our efforts be focused? Which of these is most strongly associated with high student achievement? • Student speaks the language of instruction at home • Student behavior in the school is good • The amount of inquiry-based instruction • The amount of teacher-directed instruction • The school’s socio-economic profile Top 3 factors • Student’s socio-economic profile • Index of adaptive instruction • The amount of teacher-directed instruction OECD (2016, Fig II.7.2)

  14. “More research is (always) needed…” “Furthermore, despite the existence of some marginal and even negative results, the range of conditions and contexts under which studies have shown that gains can be achieved must indicate that the principles that underlie achievement of substantial improvements in learning are robust. Significant gains can be achieved by many different routes, and initiatives here are not likely to fail through neglect of delicate and subtle features.” (Black & Wiliam, 1998 pp. 61-62)

  15. Critiques and responses

  16. Formative assessment: A critical review Bennett (2011) The definitional issue The domain-dependency issue The effectiveness issue The measurement issue The professional development issue The system issue

  17. The definitional issue Need for clear definitions • So that research outcomes are commensurable • To communicate effectively Theorization and definition • Theorizing what? • Prescriptive: formative assessment as we would like it to be • in terms of what students should learn • in terms of what happens when learning takes place • in terms of how instruction should be organized • in terms of how teachers should teach • Descriptive: formative assessment as it is

  18. Theorization and definition Possible variables • Category (instruments, outcomes, functions) • Beneficiaries (teachers, learners) • Timescale (months, weeks, days, hours, minutes) • Consequences (outcomes, instruction, decisions) • Theory of action (what gets formed?)

  19. Formative Assessment: A contested term • Long-cycle: Span: across terms, teaching units; Length: four weeks to one year; Impact: monitoring, curriculum alignment • Medium-cycle: Span: within and between teaching units; Length: one to four weeks; Impact: student-involved assessment • Short-cycle: Span: within and between lessons; Length: minute-by-minute and day-by-day; Impact: engagement, responsiveness

  20. Assessment for learning (Mittler, 1973) “Assessment for learning is any assessment for which the first priority in its design and practice is to serve the purpose of promoting pupils’ learning. It thus differs from assessment designed primarily to serve the purposes of accountability, or of ranking, or of certifying competence. An assessment activity can help learning if it provides information to be used as feedback, by teachers, and by their pupils, in assessing themselves and each other, to modify the teaching and learning activities in which they are engaged. Such assessment becomes ‘formative assessment’ when the evidence is actually used to adapt the teaching work to meet learning needs.” (Black, Harrison, Lee, Marshall & Wiliam, 2004 p. 2)

  21. How does assessment improve learning?

  22. An inclusive definition of formative assessment An assessment functions formatively to the extent that evidence about student achievement is elicited, interpreted, and used by teachers, learners, or their peers, to make decisions about future instruction that are likely to be better, or better founded, than the decisions that would have been taken in the absence of that evidence.

  23. The domain-dependency issue

  24. Both/and rather than either/or Domain dependency • A theoretical stance or an empirical question? Trade-offs • Domain dependent • Questions, feedback • Domain independent • Strategies, techniques Formative assessment is, trivially, both domain dependent and domain-independent Key question: How far can we take formative assessment as a domain-independent process?

  25. The effectiveness issue Problems with meta-analysis • The “file drawer” problem • Variations in intervention quality • Selection of studies • Variation in population variability • Sensitivity of outcome measures

  26. Annual growth in achievement, by age • A 50% increase in the rate of learning for six-year-olds is equivalent to an effect size of 0.76 • A 50% increase in the rate of learning for 15-year-olds is equivalent to an effect size of 0.1 Bloom, Hill, Black, and Lipsey (2008)
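The arithmetic behind the two equivalences on this slide can be sketched as follows. The relationship assumed here is that the effect size of a faster rate of learning equals the proportional increase multiplied by the normal annual growth in standard-deviation units. The annual growth values (1.52 and 0.20 SD) are back-calculated from the slide's own figures and are assumptions for illustration, not values quoted from Bloom, Hill, Black, and Lipsey (2008).

```python
def effect_size_from_rate_increase(annual_growth_sd: float, rate_increase: float) -> float:
    """Extra learning over one year, in SD units, for a proportional speed-up."""
    return annual_growth_sd * rate_increase


# Assumed annual growth in SD units, implied by the slide (0.76 / 0.5 and 0.1 / 0.5).
ASSUMED_ANNUAL_GROWTH_SD = {"age 6": 1.52, "age 15": 0.20}

for age, growth in ASSUMED_ANNUAL_GROWTH_SD.items():
    es = effect_size_from_rate_increase(growth, 0.50)
    print(f"{age}: a 50% faster rate of learning gives an effect size of about {es:.2f}")
```

The same intervention, measured in effect-size units, therefore looks far larger with young children than with teenagers, simply because young children's annual growth is larger relative to the spread of achievement.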

  27. Recent meta-analytic findings Mean effect size ≈ 0.20 All but 2 of these effect sizes involved students over the age of 10 A big effect size Equivalent to a 50% to 70% increase in the rate of learning Kingston and Nash (2011, 2015)
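Reading an effect size of 0.20 as a 50% to 70% increase in the rate of learning amounts to dividing the effect size by the annual growth expected for students of that age. A minimal sketch, in which the annual growth values (0.40 and 0.29 SD) are assumptions back-calculated from the slide's stated range and are not taken from Kingston and Nash:

```python
def rate_increase_from_effect_size(effect_size: float, annual_growth_sd: float) -> float:
    """Proportional increase in the rate of learning implied by an effect size."""
    if annual_growth_sd <= 0:
        raise ValueError("annual growth must be positive")
    return effect_size / annual_growth_sd


MEAN_EFFECT_SIZE = 0.20  # from the slide

# Hypothetical annual growth values (SD per year) for students over the age of 10.
for growth in (0.40, 0.29):
    increase = rate_increase_from_effect_size(MEAN_EFFECT_SIZE, growth)
    print(f"annual growth {growth:.2f} SD: about {increase:.0%} more learning per year")
```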

  28. The measurement issue Formative assessment as assessment • An assessment is a procedure for making inferences (Cronbach, 1971 p. 447) • We give students things to do • We observe their responses • We collect evidence • We make inferences • The terms “formative” and “summative” are best thought of as descriptions of inferences

  29. Data and evidence “Evidence is data related to a claim” (Wainer, 2011 p. 148) • Some radical shifts

  30. The professional development issue

  31. Implementation issues Articulation with other policy priorities • Teacher evaluation frameworks (Marzano, Danielson) • Differentiated instruction • Response to (instruction and) intervention Policy environment • Teacher pre-service education • Commitment to continuous improvement • District-level policies that require all teachers to improve • Improvement focused on evidence-based practices • Focus

  32. Unpacking Formative Assessment Clarifying, sharing, and understanding learning intentions and success criteria Eliciting evidence of learning Providing feedback that moves learners forward Activating students as learning resources for one another Activating students as owners of their own learning

  33. The relationship of formative assessment to other policy priorities

  34. Education Endowment Foundation toolkit (1)

  35. Education Endowment Foundation toolkit (2)

  36. Education Endowment Foundation toolkit

  37. Unpacking Formative Assessment Clarifying, sharing, and understanding learning intentions and success criteria Eliciting evidence of learning Providing feedback that moves learners forward Activating students as learning resources for one another Activating students as owners of their own learning

  38. The system issue: Embedding formative assessment Whole-school 2-year PD programme Focus on five strategies of formative assessment • clarifying, sharing and understanding learning intentions • eliciting evidence of achievement • feedback that moves learning forward • activating students as learning resources for one another • activating students as owners of their own learning Detailed resource packs for groups of 8 to 14 teachers • 18 monthly Teacher Learning Community (TLC) meetings • Peer observations between meetings

  39. A “signature pedagogy” for teacher learning Every monthly TLC meeting follows the same structure • Introduction (5 minutes) • Starter activity (5 minutes) • Feedback (25–50 minutes) • New learning about formative assessment (20–40 minutes) • Personal action planning (15 minutes) • Review of learning (5 minutes)

  40. Evaluation “Intention to treat” design • Detect an effect size of 0.2 with 80% power Participants • 140 schools recruited (70 treatment, 70 control) • Excluding those with previous involvement in similar work • 58 treatment, 66 control • 22,709 students in year 10 (age 15+) in Sep 2015 Outcome measure • “Attainment 8” • Average score on externally set exams in 8 subjects • Taken in May 2017 (i.e., 5/6 of the way through the school year)
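What "detect an effect size of 0.2 with 80% power" demands can be illustrated with a textbook normal-approximation calculation. This is a sketch under stated assumptions and not the evaluators' actual power analysis: the two-sided significance level of 0.05 is assumed (the slide does not state it), no covariates are modelled, and the intraclass correlation of 0.15 is a purely hypothetical value. The school and student counts are those on the slide.

```python
from math import ceil, sqrt
from statistics import NormalDist


def z_sum(alpha: float = 0.05, power: float = 0.80) -> float:
    """Sum of the critical values for a two-sided test and the target power."""
    nd = NormalDist()
    return nd.inv_cdf(1 - alpha / 2) + nd.inv_cdf(power)


def n_per_arm_simple(effect_size: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Students per arm if individual students were randomized."""
    return ceil(2 * (z_sum(alpha, power) / effect_size) ** 2)


def mdes_cluster(schools_a: int, schools_b: int, pupils_per_school: float,
                 icc: float, alpha: float = 0.05, power: float = 0.80) -> float:
    """Minimum detectable effect size when whole schools are randomized."""
    variance = (icc + (1 - icc) / pupils_per_school) * (1 / schools_a + 1 / schools_b)
    return z_sum(alpha, power) * sqrt(variance)


print(n_per_arm_simple(0.2))  # students per arm under individual randomization

pupils = 22709 / (58 + 66)    # average cohort size per school, from the slide
HYPOTHETICAL_ICC = 0.15       # assumption for illustration only
print(round(mdes_cluster(58, 66, pupils, HYPOTHETICAL_ICC), 3))
```

Because schools rather than students were assigned to conditions, the number of schools, not the number of students, largely determines what effect the trial can detect.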

  41. English literature (Macbeth) Read the following extract from Act 1 Scene 5 of Macbeth and then answer the question that follows. At this point in the play Lady Macbeth is speaking. She has just received the news that King Duncan will be spending the night at her castle.

  42. Question (45 minutes) • Starting with this speech, explain how far you think Shakespeare presents Lady Macbeth as a powerful woman. • Write about: • how Shakespeare presents Lady Macbeth in this speech • how Shakespeare presents Lady Macbeth in the play as a whole Assessment and Qualifications Alliance (2014)

  43. History Pearson (2015)

  44. Impact on student achievement Speckesser, Runge, Foliano, Bursnall, Hudson-Sharpe, Rolfe, and Anders (2018)

  45. Cost-benefit analysis Class size reduction (e.g., Tennessee STAR study) • Additional cost: $5,000 per student per year • Benefit: 12% more learning Embedded Formative Assessment • Additional cost: $3 per student per year • Benefit: 25% more learning
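The comparison on this slide can be made explicit by computing the cost of each additional percentage point of learning. A minimal sketch using only the four figures on the slide; the function name and the "cost per percentage point" metric are illustrative choices, not part of the original analysis.

```python
def cost_per_point(cost_per_student: float, pct_more_learning: float) -> float:
    """Dollars per student per year for each extra percentage point of learning."""
    return cost_per_student / pct_more_learning


class_size = cost_per_point(5000, 12)  # class size reduction
efa = cost_per_point(3, 25)            # Embedded Formative Assessment

print(f"Class size reduction: ${class_size:,.2f} per point")
print(f"Embedded Formative Assessment: ${efa:,.2f} per point")
print(f"Ratio: about {class_size / efa:,.0f} to 1")
```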

  46. Unfinished business Links with • Pedagogy (Black & Wiliam, 2018) • Instructional design • Learning versus performance • Cognitive load theory

  47. The challenge: Making classroom formative assessment cohere with all the other kinds of assessment going on in a school

  48. Before we can assess… • The ‘backward design’ of an assessment system • Where do we want our students to get to? • ‘Big ideas’ • What are the ways they can get there? • Learning progressions • “Degree of difficulty” • “Marks for style” • Support model • When should we check on/report progress? • Inherent checkpoints • “Troublesome knowledge” • Useful checkpoints • Key transitions

  49. “All models are wrong, some are useful” (Box, 1976) “That’s another thing we’ve learned from your Nation,” said Mein Herr, “map-making. But we’ve carried it much further than you. What do you consider the largest map that would be really useful?” “About six inches to the mile.” “Only six inches!” exclaimed Mein Herr. “We very soon got to six yards to the mile. Then we tried a hundred yards to the mile. And then came the grandest idea of all! We actually made a map of the country, on the scale of a mile to the mile!” “Have you used it much?” I enquired. “It has never been spread out, yet,” said Mein Herr: “the farmers objected: they said it would cover the whole country, and shut out the sunlight! So we now use the country itself, as its own map, and I assure you it does nearly as well.” (Carroll, 1893)

  50. Mapping out the terrain [Figure: assessments plotted by timescale against function] • Timescale axis: Annually, Quarterly, Monthly, Weekly, Daily, Hourly • Function axis: Instructional Guidance (“formative”), Describing Individuals (“summative”), Institutional Accountability (“evaluative”) • Items plotted: High-stakes accountability, Academic promotion, End-of-course exams, Growth measures, Benchmarks, Common assessments, End-of-unit tests, Before the end-of-unit tests, Graded work, Exit pass, Hinge-point questions
