The future of assessment. Tim Oates CBE | Group Director of ARD | January 2019.
The future of assessment • Tim Oates CBE | Group Director of ARD | January 2019
It’s all about technology… • …technological development is only a small part of well-evidenced, coherent development of assessment…
It’s all about technology… • On-screen marking – remote standardisation and moderation • Marker monitoring • On-screen assessment • Automated marking • E-portfolios • Item banking • Curation of all our questions • On-demand testing – versioning, time zones • Adaptive testing and automated test construction • Rapid feedback • Results analysis – us, schools, candidates • Test evaluation and development • Experimental work – rank ordering, on-screen marking environment • Formative assessment environments and assessment embedded in learning
It’s all about technology…no…it’s still about these things • Learners • Users • The content of learning • The accuracy and fairness of measurement • Access to assessment • Reporting • Impact
Assessment… Why… Exactly? • To understand what is happening in a young person’s mind • Production
Production • The externalisation of thinking • Making thoughts an object of study for the pupil • Revealing pupils’ thinking to teachers
WOWS Project (With Others We Succeed) – a focus on marking practices • 1 no meaningless summarisation – no ‘levels’ – a construct focus • 2 immediate feedback and action • 3 production – a focus on pupil work • 4 effective assessment is more than marking • 5 meaningful, manageable and motivational • 6 parental understanding – all actors agreeing on approach and action • Report soon to be available
Validity • Using an elaborated ‘consequentialist’ model • Uses - impact
Working upwards from the construct • Assessment in service of curriculum aims • Evidence on key constructs
Criteria relating to assessment – ‘Cambridge Approach’ • Attending to the PURPOSES OF ASSESSMENT • Reliable: consistent measurement • Valid: measures precisely what it claims to measure • Sound construct base: measures something consistent with curriculum aims • Consequential validity: the uses to which the assessment is put are technically and ethically sound • Beneficial impact: the full range of effects are beneficial • Utility: cost, resource
In assessment, the concept of ‘construct’ is vital • Can multiply two three-digit numbers • Understands and is inventive with metaphor • Reads a wide range of books for pleasure • Understands diffusion across a membrane • Can understand and use familiar everyday expressions and very basic phrases aimed at the satisfaction of needs of a concrete type • Understand the concept of percentage and calculate pc • Use the concept of inequality to analyse social relations • Understands conservation of mass • Measures accurately to quantify oxidation • Has successfully converted to Alouette III • Is intellectually well prepared for pre-clinical medical education • Has a certain level of verbal reasoning • Manifests externalising behaviour
What about ‘21st Century Skills’? • 21st Century skills: Ancient, ubiquitous, enigmatic? • Irenka Suto Research Division Cambridge Assessment • Paper published in January 2013 in • Research Matters: A Cambridge Assessment Publication • Contact details: • Dr Irenka Suto • Principal Research Officer • Research Division • Cambridge Assessment • 1 Hills Rd, Cambridge, CB1 2EU • E-mail: suto.i@cambridgeassessment.org.uk
Some false dichotomies • Formative assessment: low stakes (lower quality) • Summative assessment: high stakes (high quality)
Many are reluctant to ask questions out of fear of failure • An ice cube floats in a glass of water • Will the level of water in the glass rise when the cube melts?
Around the world • Singapore: examination of impact of PSLE and tracking • Portugal: retrenchment after beneficial introduction of national tests • England: introduction of carefully designed diagnostic testing and revisions to accountability • Sweden: concern to re-establish clear sense of national standards in the system • Iceland: national reading tests being developed and trialled • Finland: continued high levels of use of tests in primary, with local accountability arrangements
Case 1: Sweden – 1995-2018 • 1995 Sweden the top performer in TIMSS Advanced • 2008 scores in mathematics and physics fell sharply • Crisis in maths proficiency of entrants to HE compared to 1970 levels • 2015 Swedish Commission to examine means of improving educational quality • 2015 improved results in PISA • 1990 right to run schools transferred to municipalities • 1992 introduction of voucher programme and formation of new schools • National Curriculum elaborate but capable of variable interpretation • High dependence on teacher assessment – A-F grading variably applied • Underdeveloped strategy regarding textbooks and learning resources
Source: 2017 Henrekson-Jävervall IVA report in English, Jan 2017 (PDF)
Analysis and remedy • Curriculum specification revised in 2011; adequate but poorly linked to other instruments • Not all instruments for control of standards were in place • Competitive forces drove creation of schools but adversely affected standards • Inadequate monitoring of innovation in order to allow refinement of policy • Mandatory tests years 3, 6 and 9 – teacher assignment of grades • Now • Recognition of importance of high quality learning materials; work with educational publishers • Tightening of system scrutiny – school monitoring and random checking of tests • Inspection frequent (municipality and free schools every three years) but under review
Case 2: Finland – 1965-2018 • Finland: full system reform – pedagogic and curriculum content • First phase: from 1968, fundamental reform based on fully comprehensive model, highly centralised, heavy State involvement. Revision of teacher training, grade tests, State-approved textbooks, heavy school inspection. • Second phase: strategic move to higher institutional autonomy, office for textbook approval closed in early 90s, inspection eased, data submission on school performance continued – phase culminated in superlative performance in PISA 2000 • Third phase: decay in attainment, large programme of school closure, urban choice issues, introduction of project-based cross-curriculum learning (20%) • Throughout, Abitur (matriculation examination) fundamentally unchanged.
The importance of the right kinds of assessment • In Dunlosky, J., Rawson, K. A., Marsh, E. J., Nathan, M. J., & Willingham, D. T. (2013). Improving students’ learning with effective learning techniques: Promising directions from cognitive and educational psychology. Psychological Science in the Public Interest, 14(1), 4-58.
From Bert Johnson presentation on ‘Test-Enhanced Learning: Evidence from behavioural and brain imaging studies’
England KS3 tests 2008 abolition • Sats for 14-year-olds are scrapped • Pupils will no longer have to sit externally marked tests at 14; American-style report cards to be produced for schools • Polly Curtis, Guardian, Tue 14 Oct 2008 15.40 BST • The government is to abolish Sats for 14-year-olds in a historic move triggered by the collapse of this year's marking process and a string of high-profile critical reports on the tests. The changes mean pupils will no longer have to sit externally marked tests at the age of 14, but ministers have insisted that primary school pupils will still have to undergo the most controversial tests at 11. • The schools secretary, Ed Balls, today informed parliament of plans for sweeping changes to the national testing system, which sees 1.2 million pupils sit 9.5m papers every year.
China https://qz.com/1075162/are-chinese-schools-better-at-teaching-kids/
China http://ncee.org/2016/05/international-spotlight-cieb-visits-teaching-and-research-groups-in-shanghai/
England http://sheredesprimary.herts.sch.uk/
120 questions • ‘…children...what’s multiplication...’ • Wroxham School, Potters Bar 2011 • The most obvious difference between mathematics lessons in England and lessons in Shanghai is the amount of time spent in ‘whole class teaching’ – ie directed by the teacher from the front of the classroom. This includes carefully planned lecturing, but isn’t all one-sided; teachers actually ask the students huge numbers of questions (on average 50-120 per lesson) during this demonstration period, making it highly interactive. Some of the questions are deliberately very easy, so that the teachers start where the children are, and build up their explanations and questions, gradually moving to more difficult mathematical concepts. • Lucy Crehan Cleverlands 2016
‘The other important point to emphasise is the feedback. Practising at length is not useful, and can even be harmful, if you’re practising in the wrong way. Chinese teachers make the most of their extra non-teaching time to offer feedback to pupils in three ways. Firstly, they will often mark the students’ classwork and homework on the same day it’s handed in, using a set of symbols to indicate what the students got wrong so the students get immediate feedback. This doesn’t always happen; in some schools I saw students in the staffroom marking their peers’ work using the mark scheme, but this still gives the teacher an idea about distribution of mistakes, which they can use in their planning. Secondly, they discuss common mistakes or misunderstandings at the beginning of the very next lesson, and ask students who got the tricky questions to demonstrate how they did it on the board to the rest of the class. On one occasion a maths teacher was hesitant to let me observe her class because, she said, ‘we’re only going over homework’, yet this is probably where the most learning gains happen’. Lucy Crehan, Cleverlands (p.183), 2016
Locked-in low expectations in ‘personalised learning’ • Pupil A • Age 11 formal assessment Age 16 RAG Higher tier • Assessment for allocation to lower group route grade prediction • Level 4 Low score 4/5 LOWER TIER • Pupil B • Age 11 formal assessment Age 16 RAG Higher tier • Assessment for allocation to lower group route grade prediction • Level 4 Low score 4/5 LOWER TIER • Pupil C • Age 11 formal assessment Age 16 RAG Higher tier • Assessment for allocation to lower group route grade prediction • Level 4 Low score 4/5 LOWER TIER
Trapped by data – ‘self fulfilling prophecy’ • Age 11 Age 13 • Student A – met national standard Level 4 – placed in low set maths – entered for lower tier GCSE - predicted 4-5 in GCSE • Student B – met national standard Level 4 – placed in low set maths – entered for lower tier GCSE - predicted 4-5 in GCSE • (higher tier grades 4-9, lower tier 1-5 – grade 5 = national ‘pass’ grade) • Teacher concepts of progression ‘we may consider allocating a higher predicted grade’ • Poor quality maths provision age 5-11 – impact in data-driven systems • Different motivation and engagement in maths • Different levels of support • High levels of differentiation in programme content higher set and lower sets • Allocation of quality teachers • Different forms of social learning and behaviour in higher and lower sets • Different learning identities
Reading - PIRLS results England • 2001 553 points • 2006 539 points • 2011 552 points 10th place • 2016 559 points 8th place • Closing gap between boys and girls • Pupils did worst if their teacher had between 10 and 20 years’ experience, but scored similarly if their teacher had between five and 10, or over 20, years’ experience.
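The trend in the PIRLS figures above is easier to read as cycle-to-cycle changes. The following is a minimal sketch of that arithmetic, using only the four scores quoted on the slide; the names `pirls_england` and `cycle_changes` are illustrative assumptions, not part of any PIRLS tooling.

```python
# PIRLS reading scores for England, copied from the slide above.
# The variable and function names are hypothetical, chosen for illustration.
pirls_england = {2001: 553, 2006: 539, 2011: 552, 2016: 559}


def cycle_changes(scores):
    """Return the score change between each pair of consecutive cycles."""
    years = sorted(scores)
    return {(a, b): scores[b] - scores[a] for a, b in zip(years, years[1:])}


changes = cycle_changes(pirls_england)
net_change = pirls_england[2016] - pirls_england[2001]

for (start, end), delta in changes.items():
    print(f"{start}->{end}: {delta:+d} points")
print(f"2001->2016 net: {net_change:+d} points")
```

On these figures, the 14-point fall between 2001 and 2006 is recovered by 2011, and 2016 sits 6 points above the 2001 score.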
‘Wrong ways’ and ‘right ways’ • Wrong ways: curriculum narrowing; narrow and de-motivating extended test preparation; manufactured test anxiety; ‘ambush’ assessment • Right ways: a system flooded with high quality items; assessment seamlessly wrapped into instruction; broadening of pupils’ thinking; using assessment for focussed, immediate support