Investigating the impact of language assessment systems within a state educational context. Nick Saville. Bridging the gap between theory and practice, EALTA, Krakow, May 2006.
Outline • Background - a personal perspective • 1980s • Bachman – early 1990s • The literature on washback/impact • early work and recent progress • gaps? where next? • Analysis of three case studies – what can be learnt? • Towards a comprehensive model of impact • Applying the model in a state educational context • the Asset Languages Project
Background • The 1980s – a personal perspective • assessment in Italian universities • entrance exams in Japan • the influence of TOEIC/TOEFL e.g. in Japan/Korea • developing Cambridge exams • Tests affect individuals and society! • How can this be managed better? • What is needed to “do a better job”?
Background – 1987-1990: Japan • Considerations in developing fair tests • Practicality? The art of the possible [Diagram: the test shown with Validity (V) and Reliability (R)]
Practicality • “Practicality in Language Testing: an educational management model” • Main argument: test development is a form of educational innovation - and needs to be managed as such • “... achieving a balance between the purpose of the test, its validity for the purpose, the required reliability for the purpose and the constraints imposed by the context is essentially the task facing the test designer ….” • Saville (1990), University of Reading - based on a test development project in Japan, 1987-9 [Diagram: the test shown with Validity (V), Reliability (R) and Practicality (P)]
Practicality • Aspects of Practicality within a context and educational setting: • Acceptability • Applicability • Availability • Difficulty • Economy • Interpretability • Relevance • Replicability • “… a principled approach to Practicality should provide the test designer with the means of approaching test development so that a suitable balance can be achieved without overlooking factors which cause possible solutions to fall down in practice”. [Diagram: the test shown with Validity (V), Reliability (R) and Practicality (P)]
Putting the test into context • “The aim … is not only to encourage good testing practice, but to prevent bad tests being produced .... a bad test is not only one with low reliability and dubious validity but also one which has a damaging backwash on the curriculum.” • “A logical consequence …. is that ethicality will be achieved as a result .... this is because any test which is produced should be appropriate to the educational context in which it is to be used and the effect on learners and institutions will be a major consideration.” • Saville 1990:11-13 [Diagram: the test shown with Validity (V), Reliability (R) and Practicality (P)]
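Putting the test into context [Diagram: the test shown with Validity (V), Reliability (R) and Practicality (P)]
Impact Ripples [Diagram: the test shown with Validity (V), Reliability (R) and Practicality (P)]
Impact Ripples • Local Impact (“micro” level) [Diagram: Impact (I) added next to the test, alongside V, R and P]
Impact Ripples • Wider Impact (“macro” level) [Diagram: further Impact (I) ripples spreading outwards from the test, alongside V, R and P]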
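Usefulness as overall validity • U = V + R + I + P (Usefulness = Validity + Reliability + Impact + Practicality) • Bachman - Cambridge 1990/91
Usefulness as overall validity • U = V + R + I + P • Bachman and Palmer, 1996: U = Cv + A + I + R + I + P (Construct validity, Authenticity, Interactiveness, Reliability, Impact, Practicality) • Developing “useful tests”, fit for purpose • Balancing the test qualities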
Starting to develop a model • 1993 – 1995 • Using VRIP to develop and revise exams e.g. IELTS 1995 • The IELTS impact project
The literature on washback/impact • Readings in the language testing literature: • Hamp-Lyons (1989) • Wall and Alderson (1993) - “Does washback exist?” etc. • Bailey (1996) • Hamp-Lyons (1997) • Watanabe (1997) • Cheng and Watanabe (eds) (2004) • Recent PhD studies and subsequent books based on research conducted in the 1990s: • Cheng (SILT 21 - 2005) • Wall (SILT 23 - 2005) • Green (2004 – SILT forthcoming 2007) • Hawkey (SILT 24 - forthcoming 2006) • Current work in Lancaster, ETS, UCLA, Cambridge etc.
The literature on washback/impact • So: • Impact is relatively new in the field of language assessment - an extension of the notion of washback and related to ethicality • It is now considered to be of growing importance • It is part of a validity argument and evidence needs to be provided • Broadly speaking there is consensus: • impact deals with wider influences and includes the “macro contexts” - tests and examinations in society • washback is an aspect of impact related to the “micro contexts” of the classroom and the school • BUT: • The dynamics between the micro and macro contexts mean that this is a complex rather than a simple relationship - a “complex dynamic system”
The literature on washback/impact And currently: • there is no comprehensive model of test or examination impact within educational contexts • impact has not yet been fully integrated into an approach to test development and validation in a systematic way
Three case studies – 1995 to 2004 • Case 1 - the world-wide survey of the impact of IELTS • a starting point for the work and the original model for what has followed • a conceptualisation of impact and design/validation of suitable instruments to investigate it • Case 2 - the Italian PL2000 project • an application of the model within a macro educational context • an initial attempt at applying the approach on a limited basis within a state educational context • Case 3 - the Florence Learning Gains Project • an extension and re-application of the model within a single school context • at the micro level, focusing on individual stakeholders within a single language teaching institution
Learning from the case studies • What can be learned using these specific impact projects as meta-data?
Learning from the case studies • Three key factors of contemporary educational systems need to be accounted for: • the nature of complex dynamic systems • the roles that stakeholders play within such systems • the need to see assessment projects as educational innovations within the systems and to manage change effectively
Hybrid Model of the Diffusion/Implementation Process • 3. the need to see assessment projects as educational innovations and to manage change effectively • See Wall (2005), a case study using insights from testing and innovation theory • E.g. Henrichsen (1989) [Diagram labels: Antecedents; Process; Consequences]
Learning from the case studies • When applied to language assessment, two key factors also need to be accounted for: • the nature of language itself as a socio-cognitive phenomenon (the latest views on validity) • the nature of the test development and validation process • from conception to routine data collection and analysis • Impact research, therefore, is no different from any other kind of validation activity.
1. A SOCIO-COGNITIVE FRAMEWORK • Messick, Bachman, Kane, Mislevy, Weir etc.
A SOCIO-COGNITIVE FRAMEWORK [Diagram labels: The testing system; Construct]
The contexts • Learning contexts • Testing contexts • Use of results contexts
Model of the Test Development Process • Identifying stakeholders and their needs • Linking these needs to the requirements of test usefulness - including predicted impact - theoretical - practical • Long term, Iterative Processes - a key feature of validation
Involvement of the stakeholder constituency E.g. during test design and development • presentation and consultation to do with specifications and detailed syllabus designs • professional support programmes for institutions and individual teachers/students etc. who plan to use the examinations • training and employment of suitable personnel within the field to work on all aspects of the examination cycle – to be question/item writers, to act as examiners, etc.
After an examination becomes operational • Procedures also need to be in place to routinely collect data which allows impact to be estimated, e.g.: • who is taking the examination (i.e. a profile of the candidates) • who is using the examination results and for what purpose • who is teaching towards the examination and under what circumstances • what kinds of courses and materials are being designed and used to prepare candidates • what effect the examination has on public perceptions generally (e.g. regarding educational standards) • how the examination is viewed by those directly involved in educational processes (e.g. by students, examination takers, teachers, parents, etc.) • how the examination is viewed by members of society outside education (e.g. by politicians, business people, etc.)
Towards a comprehensive model • How can these considerations be combined to produce a comprehensive, integrated model?
Next phase: applying the model • Asset Languages within the UK educational context
Contacts: • www.assetlanguages.org.uk • saville.n@cambridgeESOL.org • www.cambridgeesol.org/rs_notes