Nick Saville Bridging the gap between theory and practice EALTA Krakow May 2006

Investigating the impact of language assessment systems within a state educational context

Outline
• Background - a personal perspective
  • 1980s
  • Bachman – early 1990s
• The literature on washback/impact
  • early work and recent progress
  • gaps? where next?
• Analysis of three case studies – what can be learnt?
• Towards a comprehensive model of impact
• Applying the model in a state educational context
  • the Asset Languages Project

Background
• The 1980s – a personal perspective
  • assessment in Italian universities
  • entrance exams in Japan
  • the influence of TOEIC/TOEFL, e.g. in Japan/Korea
  • developing Cambridge exams
• Tests affect individuals and society!
• How can this be managed better?
• What is needed to “do a better job”?

Background – 1987-1990: Japan
Considerations in developing fair tests
Practicality? The art of the possible
[Diagram: Test with V and R]

Practicality
“Practicality in Language Testing: an educational management model”
• Main argument: test development is a form of educational innovation - and needs to be managed as such
• “... achieving a balance between the purpose of the test, its validity for the purpose, the required reliability for the purpose and the constraints imposed by the context is essentially the task facing the test designer ….”
• Saville (1990), University of Reading - based on a test development project in Japan, 1987-9
[Diagram: Test with V, R and P]

Practicality
• Aspects of Practicality within a context and educational setting:
  • Acceptability
  • Applicability
  • Availability
  • Difficulty
  • Economy
  • Interpretability
  • Relevance
  • Replicability
“… a principled approach to Practicality should provide the test designer with the means of approaching test development so that a suitable balance can be achieved without overlooking factors which cause possible solutions to fall down in practice”.
[Diagram: Test with V, R and P]

Putting the test into context
The aim … is not only to encourage good testing practice, but to prevent bad tests being produced .... ... a bad test is not only one with low reliability and dubious validity but also one which has a damaging backwash on the curriculum.
Saville 1990:11-13
A logical consequence …. is that ethicality will be achieved as a result .. ……. this is because any test which is produced should be appropriate to the educational context in which it is to be used and the effect on learners and institutions will be a major consideration.
[Diagram: Test with V, R and P]

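Putting the test into context
[Diagram: Test with V, R and P]

Impact Ripples
[Diagram: Test with V, R and P]

Impact Ripples: Local Impact (“micro” level)
[Diagram: Test with V, R and P, and I at the local level]

Impact Ripples: Wider Impact (“macro” level)
[Diagram: Test with V, R and P, and I repeated at wider levels]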

Usefulness as overall validity
U = V + R + I + P
Bachman - Cambridge 1990/91

Usefulness as overall validity
U = V + R + I + P
Bachman and Palmer, 1996: U = Cv + A + I + R + I + P (construct validity, authenticity, interactiveness, reliability, impact, practicality)
Developing “useful tests”, fit for purpose
Balancing the test qualities

Starting to develop a model
• 1993 – 1995
• Using VRIP to develop and revise exams, e.g. IELTS 1995
• The IELTS impact project

The literature on washback/impact
• Readings in the language testing literature:
  • Hamp-Lyons (1989)
  • Wall and Alderson (1993) Does washback exist? etc.
  • Bailey (1996)
  • Hamp-Lyons (1997)
  • Watanabe (1997)
  • Cheng and Watanabe (eds) (2004)
• Recent PhD studies and subsequent books based on research conducted in the 1990s:
  • Cheng (SILT 21 - 2005)
  • Wall (SILT 23 - 2005)
  • Green (2004 – SILT forthcoming 2007)
  • Hawkey – SILT 24 (forthcoming - 2006)
• Current work in Lancaster, ETS, UCLA, Cambridge etc.

The literature on washback/impact
So
• Impact is relatively new in the field of language assessment - an extension of the notion of washback and related to ethicality
• It is now considered to be of growing importance
• It is part of a validity argument and evidence needs to be provided
Broadly speaking there is consensus
• impact deals with wider influences and includes the “macro contexts” - tests and examinations in society
• washback is an aspect of impact related to the “micro contexts” of the classroom and the school
BUT
• The dynamics between the micro and macro contexts mean that this is a complex rather than a simple relationship - a “complex dynamic system”

The literature on washback/impact
And currently:
• there is no comprehensive model of test or examination impact within educational contexts
• impact has not yet been fully integrated into an approach to test development and validation in a systematic way

Three case studies – 1995 to 2004
• Case 1 - the world-wide survey of the impact of IELTS
  • a starting point for the work and the original model for what has followed
  • a conceptualisation of impact and design/validation of suitable instruments to investigate it
• Case 2 - the Italian PL2000 project
  • an application of the model within a macro educational context
  • an initial attempt at applying the approach on a limited basis within a state educational context
• Case 3 - the Florence Learning Gains Project
  • an extension and re-application of the model within a single school context
  • at the micro level, focusing on individual stakeholders within a single language teaching institution

Learning from the case studies • What can be learned using these specific impact projects as meta-data?

Learning from the case studies
• Three key factors of contemporary educational systems need to be accounted for:
  • the nature of complex dynamic systems
  • the roles that stakeholders play within such systems
  • the need to see assessment projects as educational innovations within the systems and to manage change effectively

1. the nature of complex dynamic systems

2. the roles that stakeholders play

Hybrid Model of the Diffusion / Implementation Process
• 3. the need to see assessment projects as educational innovations and to manage change effectively
• See Wall (2005), a case study using insights from testing and innovation theory, e.g. Henrichsen (1989)
[Diagram: Antecedents, Process, Consequences]

Learning from the case studies
• When applied to language assessment, two key factors also need to be accounted for:
  • the nature of language itself as a socio-cognitive phenomenon (the latest views on validity)
  • the nature of the test development and validation process
    • from conception to routine data collection and analysis
• Impact research, therefore, is no different from any other kind of validation activity........

1. A SOCIO-COGNITIVE FRAMEWORK
Messick, Bachman, Kane, Mislevy, Weir etc.

A SOCIO-COGNITIVE FRAMEWORK The testing system Construct

The contexts
• Learning contexts
• Testing contexts
• Use of results contexts

Impact

2. Model of the Test Development Process

Model of the Test Development Process
• Identifying stakeholders and their needs
• Linking these needs to the requirements of test usefulness - including predicted impact
  • theoretical
  • practical
• Long term, iterative processes - a key feature of validation

Involvement of the stakeholder constituency
E.g. during test design and development
• presentation and consultation to do with specifications and detailed syllabus designs
• professional support programmes for institutions and individual teachers/students etc. who plan to use the examinations
• training and employment of suitable personnel within the field to work on all aspects of the examination cycle – to be question/item writers, to act as examiners, etc.

After an examination becomes operational
• Procedures also need to be in place to routinely collect data which allows impact to be estimated, e.g.
  • who is taking the examination (i.e. a profile of the candidates)
  • who is using the examination results and for what purpose
  • who is teaching towards the examination and under what circumstances
  • what kinds of courses and materials are being designed and used to prepare candidates
  • what effect the examination has on public perceptions generally (e.g. regarding educational standards)
  • how the examination is viewed by those directly involved in educational processes (e.g. by students, examination takers, teachers, parents, etc.)
  • how the examination is viewed by members of society outside education (e.g. by politicians, business people, etc.)
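
The routine data collection listed above could be captured in a simple record structure. The following is a minimal sketch in Python and is not part of the original presentation: the class `ImpactRecord`, its fields, and the helpers `profile_by_role` and `views_by_group` are hypothetical names, chosen only to show how the listed questions might map onto stored fields and simple tallies. The sample values are invented for illustration and are not research data.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass(frozen=True)
class ImpactRecord:
    """One hypothetical observation collected after an exam is operational."""
    respondent_role: str    # e.g. "candidate", "teacher", "parent", "politician"
    inside_education: bool  # True for those directly involved in education
    result_use: str         # purpose the results are used for ("" if none)
    view_of_exam: str       # coded view, e.g. "positive", "neutral", "negative"


def profile_by_role(records: Iterable[ImpactRecord]) -> Counter:
    """Count records per respondent role: a crude profile of who is involved."""
    return Counter(r.respondent_role for r in records)


def views_by_group(records: Iterable[ImpactRecord]) -> Dict[str, Counter]:
    """Tally coded views separately for people inside and outside education."""
    tallies = {"inside education": Counter(), "outside education": Counter()}
    for r in records:
        key = "inside education" if r.inside_education else "outside education"
        tallies[key][r.view_of_exam] += 1
    return tallies


if __name__ == "__main__":
    # Invented sample records, for illustration only.
    sample = [
        ImpactRecord("candidate", True, "university entry", "positive"),
        ImpactRecord("teacher", True, "", "neutral"),
        ImpactRecord("business person", False, "recruitment", "positive"),
    ]
    print(profile_by_role(sample))
    print(views_by_group(sample))
```

Keeping the "inside education" and "outside education" tallies separate mirrors the distinction the slide draws between those directly involved in educational processes and members of society outside education; any real instrument would need validated coding schemes rather than the free strings assumed here.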

Towards a comprehensive model • How can these considerations be combined to produce a comprehensive, integrated model?

Next phase: applying the model • Asset Languages within the UK educational context

Contacts:
www.assetlanguages.org.uk
saville.n@cambridgeESOL.org
www.cambridgeesol.org/rs_notes

