Teacher-led assessment in language education: weighing the costs and benefits
Geoff Brindley
Macquarie University, Sydney, Australia
The move away from standardized tests: some examples
• Australia
- School-based assessment (SBA) component in most state systems
- External testing abolished in 1982 in Queensland
• Hong Kong
- Move from norm-referenced to standards-based assessment
- Incorporation of SB oral assessment in HKCEE
• UK
- Teacher assessment to be used for statutory reporting at 11 and 14 in Wales
• US
The move away from standardized tests (contd)
• UK
- Teacher assessment given greater importance following review of National Curriculum Assessment
- Teacher assessment to be used for reporting from 2007 in Wales
- Government “Excellence and Enjoyment” report (DfES 2003) endorses greater role for teachers in assessment
- Tomlinson Report (2002) endorses use of TA for national reporting
A new role for teacher assessment?
“The key to this is to use external scrutiny not principally to mark and grade the performance of individuals but to maintain the quality and professionalism of teachers’ own judgements.”
Mike Tomlinson, 18 October 2004
A new role for teacher assessment?
• USA
- Widespread adoption of “authentic” assessment in schools & adult contexts
- Use of AA in some high-stakes contexts (e.g. Kentucky, Nebraska)
Formative assessment
“… formative assessments are always made in relation to where the pupils are in their learning in terms of specific content or skills. To this extent, formative assessment is, by definition, criterion-referenced. At the same time, it may also be pupil-referenced (or ipsative). This means that a judgement of a pupil’s work or progress takes into account such things as the effort put in, the particular context of the pupil’s work and the progress the pupil has made over time.” (Harlen and James 1998)
Teacher-led formative assessment: some examples
• Observation and feedback on performance
• Portfolios
• Teacher-developed tasks
• Project work
• Conferences
Formative assessment: the case for
“The evidence from studies so far indicates that significant learning gains lie within our grasp: it is shown conclusively that formative assessment does improve learning. The gains in achievement appear to be quite considerable, and among the largest ever reported for educational interventions. As an illustration of just how big the gains are, an effect size of 0.7, if it could be achieved on a nation-wide scale, would be equivalent to raising the mathematics attainment score of an 'average' country like England, New Zealand or the USA into the 'top five' after the Pacific Rim countries.” (Black and Wiliam 1998)
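In this literature, “effect size” standardly means the standardized mean difference (Cohen’s d): the gain of the intervention group over the control group expressed in pooled standard-deviation units. A minimal worked illustration, assuming that reading of the term; the scores below are invented for illustration and are not Black and Wiliam’s data:

% Cohen's d: standardized mean difference; figures are illustrative only
d = \frac{\bar{x}_{\text{formative group}} - \bar{x}_{\text{control group}}}{s_{\text{pooled}}}, \qquad \text{e.g. } \frac{512 - 498}{20} = 0.7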
Formative assessment: the case for (contd): effects of feedback
Research on feedback
• 131 studies of feedback examined
• In 40% of studies, feedback made performance worse
• These studies involved “ego-involving” feedback (focused on the person rather than the quality of the work), e.g. comparison with others, grades
• Studies where feedback had a positive impact used “task-involving” feedback focused on what to improve and how to improve it
Formative assessment: the case for (contd)
“It is the holistic and collaborative nature of the school-based assessment initiative that ultimately makes it more trustworthy than traditional testing. Like qualitative research, SBA builds into its actual design the capacity for triangulation, the collection of multiple sources and types of evidence under naturalistic conditions over a lengthy period of time. Ultimately, these features of SBA are the key to the criteria by which SBA should be judged.” (Davison, 2007)
Official discourse of formative assessment
• Effective assessment for all pupils should:
- recognise what pupils can do and reward achievement
- be based on different kinds of evidence
- be a valid reflection of what has been taught or covered in class
- be reliable in terms of enabling someone else to repeat the assessment and obtain comparable results
- be manageable, both in terms of the time needed to complete the task, and in providing results which can be reported or passed on to other teachers
(DfES, 2003:2)
Formative assessment: rhetoric or reality?
• The notions embedded within ‘reliability’, ‘repeat the task’ suggest a standalone assessment activity.
• This positions the teacher within a ‘testing’ paradigm without due attention to ongoing teacher assessment embedded within instruction.
(Leung and Rea-Dickins, 2007)
Formative assessment: rhetoric or reality?
No reference to:
• monitoring learners’ emerging language awareness and development: achievement is highlighted
• assessment as integrated within instructional discourse. “Taught or covered” suggests one-off, measurement-focused assessment
• formative assessment as an on-going process
(Leung and Rea-Dickins, 2007)
Problems with teacher assessment: observation
• We have identified a number of potential threats to the dependability of data, e.g. at the level of transcription, interpretation of pupil language samples, the nature of the assessment activity and the opportunities it provides for different types of language to be elicited.
• Sources of unreliability, therefore, are not exclusive to formal paper-and-pencil tests. The validity of inferences made for individual learners depends on their reliability, and we have shown that this reliability cannot be entirely assumed. Erroneous decisions concerning an individual’s language learning development may also be costly in classroom-based assessment contexts (Rea-Dickins and Gardner 2000: 238).
Problems with teacher assessment: teacher-developed tasks
Rater inconsistency
“The difference in severity between the CSWE raters… means that a candidate's chances of being awarded Competency 10 or 12 would be reduced by approximately 45% if her script were judged by the most severe rater instead of the most lenient.” (Brindley 2000)
Problems with teacher assessment: teacher-developed tasks (contd)
“These problems of construct validity are exacerbated by the fact that under normal conditions it is unlikely that teachers would have time to develop a test of sufficient length to meet minimum standards of reliability, even for low stakes assessment.” (Brindley and Slatyer 2002)
Obstacles to adoption of teacher-led assessment
• Popular beliefs about testing
• Political pressures
• Resource issues (time, workloads, $$)
• Over-enthusiasm of proponents
Popular beliefs about testing
• There is a test for every population/purpose
• Norm-referenced tests are a fair and objective measure of student ability
• Standardized testing raises standards
• Tests have pass marks (usually 50%)
• Teachers can’t be trusted to do their own assessment
The critics
• Formative assessment also embraces a developmental approach to learning, based on the argument that “students develop and learn at different rates and in different ways”…
• The result? Instead of pass or fail, student progress or lack of progress is clouded by such politically correct terms as beginning, established, consolidating or emerging, solid, comprehensive.
• Instead of students facing regular examinations with consequences for failure, as do those students in stronger performing education systems overseas, students are automatically promoted from year to year, even though many have not mastered the basics.
(Donnelly 2005)
The politicians
The reports I saw allowed for the teacher to assess students from a range of choices – “usually, consolidating, sometimes and not yet”. What kind of nonsense is this? The educational “experts” with whom I seem to be in constant battle, give me the constant refrain of “outcomes assessment”. The ranking of students against one another is opposed by teacher advocates. Try telling that to parents. Worse still, what do they think happens in the real world? (Nelson 2005)
Resource issues
Practical issues and concerns included access to appropriate assessment resources, activities and techniques as models/resources, concerns about the recordings expected, lack of recognition at the school level, the adequacy of training and… lack of time and competing priorities (Davison, 2007)
Over-enthusiasm of proponents
I was left with the impression that the assessment programs are relatively problem free and that all students in these schools thrive under these programs. The two chapters had the unsettling feel of “info-mercials” rather than case studies. … some of the case studies seemed to provide a more balanced picture of a particular program than others. While reading certain case studies, I found myself questioning whether the authors had chosen to interview and observe an adequate sample of staff, students, and parents from a school in order to gather sufficient information to portray an assessment program realistically (Myford 1996)
The next chapter
• Need for educational assessment community to promote “assessment literacy” for all stakeholders; need for more meaningful dialogue
• Need to investigate assessment as interaction: interfaces with SLA research on effects of feedback
• Need for more empirical research into (beneficial) effects of teacher-led formative assessment
Epilogue
“While we favour more teacher assessment, appropriately and rigorously quality assured, no such change should be contemplated unless two requirements can be met:
• The new system must be at least as dependable and credible as that it replaces; and
• The workload for teachers and lecturers should not increase.
Until and unless these requirements are met, we do not believe that major change in assessment can be contemplated.”
Mike Tomlinson, press conference, 18 October 2004