
International Assessment of Language Competence: (How) Can it be Done?




  1. International Assessment of Language Competence: (How) Can it be Done? Sauli Takala University of Jyväskylä, Finland 7th Intercultural Rhetoric and Discourse Conference Indiana University-Purdue University Indianapolis (IUPUI) August 9-11, 2012

  2. Overview • Why engage in international assessments? • Challenges and methodological approaches • International surveys of language competence: some cases • Some main findings • Some future challenges and opportunities

  3. International assessment is a “growth industry”. By the end of the 1960s: 2 international surveys; 1970s: 10; 1980s: 13; 1990s: 25; 2010s: 33. To date: 36. Why has this happened?

  4. Main actors: IEA (International Association for the Evaluation of Educational Achievement), since 1958; first survey 1964 • OECD/PISA (Organisation for Economic Co-operation and Development; Programme for International Student Assessment), since 1997; first survey 2000 • EU/European Union, since 2002 (“Barcelona Indicator”), first survey 2012

  5. Rationale • Education assumed to mainly promote individual growth and development. • Increasingly expected to serve social, cultural and economic policies. • Current high policy priority: to enable nations and their citizens to take full advantage of an increasingly globalized economy. • -> Need to provide high-quality and sustainable education. • -> Requirement: an acceptable degree of equity in the distribution of opportunities to learn (OTL) but also clear incentives for achieving greater efficiency in schooling.

  6. Claim: successful educational policy and well-informed planning and implementation depend on indicators showing how well the educational systems are functioning. • National monitoring systems of various kinds (e.g., National Assessment of Educational Progress/NAEP) set up, but… • International yardsticks were also called for: assessments undertaken by an international team/organization: same design, same procedures and instruments

  7. Emergence of a systematic approach to international assessments (IEA, late 1950s). Yield: theoretical and practical information on patterns of variables related to the levels of achievement across countries. The variation in educational systems was seen to provide a “natural laboratory”, a natural “experimental setting”. Note: not just “educational olympics” or league tables

  8. Prioritization: reading L1/literacy, mathematics, and science. Less attention to: social studies and humanities (civics, literature, foreign languages). Why is this?

  9. The Study of Reading Comprehension (1968-1972). Thorndike, 1973. • The Study of Literature Education (1968-1973). Purves, 1973. • The Study of French as a Foreign Language (1968-1973). Carroll, 1975. • The Study of English as a Foreign Language (1968-1973). Lewis & Massad, 1975. • The Study of Written Composition (1983-1989). Gorman, Purves & Degenhart, 1988; Purves, 1992. • The Language Education Study (1993-1996). Dickson & Cumming, 1996. (Not completed as originally planned.) • Reading Literacy Study (1986-1994). Elley, 1992. • PISA: Reading Literacy (OECD), 2003, 2006, 2009… • PIRLS (IEA), 4th graders, reading; 2001, 2006 (5-year cycles) • European Survey of Language Competences/ESLC (2012) (European Commission)

  10. Definition of Models/Constructs

  11. From the 1980s onwards, a distinction between • the intended curriculum (systemic level), • the implemented curriculum (instructional level), and • the realized curriculum (student level) became an important design feature.

  12. Definition of Models/Constructs Design of IEA Study of Writing

  13. IEA Study of Writing: Domain of Writing

  14. PISA: Reading literacy - students' ability to understand, use and reflect on written text to achieve their purposes. An "active" element — the capacity not just to understand a text but to reflect on it, drawing on one's own thoughts and experiences.

  15. OECD/PISA (2000): Reading literacy is assessed in relation to: • Text format: continuous texts and non-continuous texts. • Reading processes (aspects): proficiency in retrieving information, forming a broad general understanding of the text, interpreting it, reflecting on its contents and reflecting on its form and features

  16. OECD/PISA (2000): Reading literacy is assessed also in relation to: • The situation, defined by the use for which the text was constructed. • E.g., a novel, personal letter or biography is written for people's personal use; official documents or announcements for public use; a manual or report for occupational use, educational use… • Desirable to include a range of types of reading in the assessment items.
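
Slides 15 and 16 list the dimensions along which PISA 2000 classified its reading items (text format, reading process, situation). The sketch below is purely illustrative and not part of any official PISA toolkit; it simply records such a classification for a hypothetical item, and all class and field names are invented for the example.

# Illustrative sketch only: one way to record how a reading item might be
# classified along the PISA 2000 framework dimensions named on the slides.
# None of these names come from an official PISA instrument.
from dataclasses import dataclass
from enum import Enum

class TextFormat(Enum):
    CONTINUOUS = "continuous"
    NON_CONTINUOUS = "non-continuous"

class ReadingProcess(Enum):
    RETRIEVE_INFORMATION = "retrieving information"
    BROAD_UNDERSTANDING = "forming a broad general understanding"
    INTERPRETATION = "interpreting the text"
    REFLECT_ON_CONTENT = "reflecting on content"
    REFLECT_ON_FORM = "reflecting on form and features"

class Situation(Enum):
    PERSONAL = "personal"
    PUBLIC = "public"
    OCCUPATIONAL = "occupational"
    EDUCATIONAL = "educational"

@dataclass
class ReadingItem:
    item_id: str
    text_format: TextFormat
    process: ReadingProcess
    situation: Situation

# A hypothetical item: a question about a personal letter (continuous text)
# that asks the reader to interpret the writer's intention.
example_item = ReadingItem("R001", TextFormat.CONTINUOUS,
                           ReadingProcess.INTERPRETATION, Situation.PERSONAL)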

  17. Translation of Instruments: Concern about Equivalence and Validity

  18. In PISA, high requirements are set for the translators: • Professional translators with a good command of the two source languages and cultures (English or French) • Familiarity with the educational systems and cultures of the countries involved.

  19. Translation process • Double (forward) translation from two parallel/calibrated (En/Fr) source texts • National and international verification (feedback) • Reconciled by a third translator into one national version • Test booklets sent to the International Project Centre for a final optical check of the layout of the texts • Verified by yet a fourth translator from the International Project Centre

  20. Specific instructions for layout, choice of vocabulary/syntax, and for avoidance of irrelevant clues. • Guidelines act as advice and stress that cumbersome translations are to be avoided. • Specific translation notes attached to all the texts. • For every question item, it is explained what skill it is intended to measure, in order to avoid changing the nature of the questions and the strategies required to answer them correctly (enhance equivalence). • Laborious and time-consuming process.

  21. Some Results and Interpretations

  22. John B. Carroll (1975): • General proficiency in French strongly related to verbal ability (score on a word knowledge test in L1). • Higher scores: French used mainly during class time, use of the mother tongue reduced but not eliminated. • All “four skills”: a strong correlation between country mean score and the average number of years the students had studied French.

  23. Teacher impact: indeterminate (neither the amount of university training nor the amount of travel or residence in a French-speaking country led to any differences in students’ French achievement)! • Carroll estimated, among other things, that 5-6 years with 3-4 weekly lessons were required to achieve a satisfactory level of reading comprehension. • What might be the findings today?

  24. The Writing Study: Research tasks • conceptualize the domain of school-based written composition • develop an internationally appropriate set of writing tasks and a system for assessment • describe recent developments and the current state of instruction in the participating countries/school systems (using very extensive curriculum and teacher questionnaires) • identify factors which explain differences and patterns in written composition and other outcomes, with particular attention to cultural background, curriculum, and teaching practices.

  25. Key findings • The construct “written composition” is situated in a cultural context; it cannot be considered a general cognitive capacity or activity. • Marked variation across the countries both in the ideology of the teachers and in instructional practices. Written performance was also found to be task dependent. • Good compositions shared common qualities of handling of content and appropriateness of style, but these qualities had their national or local characteristics in organization, use of detail, and other aspects of rhetoric.

  26. A Retrospect: Thirty Years Later • For most of the participating countries, a new venture. • Brought together a large number of mother tongue specialists from all over the world and helped to establish much needed international networks. • As in all of IEA’s work, participation in its projects is/was hands-on training in how large-scale assessments of writing could be carried out.

  27. It seems to me that the main long-term contribution of the written composition study differs from a “typical” IEA study. • It did not lead to sufficiently reliable and valid comparisons of the level of achievement among the participating countries nor to clear patterns of explanations of the achievements. • Thus it basically failed in one task, but…

  28. I believe that its main contribution was: • development of useful approaches to the assessment of writing, yielding a better conceptualization of the domain • while a huge challenge, it also was an opportunity for raising awareness about the complexities of such an endeavour, • and about the strong cultural and context dependence of the teaching and assessment of writing

  29. For me personally: working with Alan Purves was a source of constant inspiration, leading to the most productive phase in my career. • The study was accompanied by considerable time pressure and constant worry about the shortage of funding. Life with the project was not easy. Paraphrasing Gilbert & Sullivan: a coordinator’s “lot is not a happy one”.

  30. However, these problems were more than compensated for by getting to know and working with great colleagues all over the world. • Life with the study was sometimes quite exhausting, occasionally quite frustrating, but always exciting, interesting and instructive.

  31. IEA Reading Literacy Study: 1992

  32. Comparative perspective: Finland on top. However, our students did not see themselves as good readers (also Singapore, Hong Kong), unlike students in e.g. Greece, USA, Canada (BC) • Economic and social context: RL closely related to economic development, health, and adult literacy. • Home language: RL lower if home language was different from school language. Singapore a clear exception. • High- and low-scoring countries: importance of books and libraries, frequent silent reading, frequent story reading aloud by teachers.

  33. Urban-rural differences: RL higher among urban children • Voluntary reading: RL influenced by the amount of voluntary out-of-school book reading • Becoming a good reader: Students emphasized such factors as Liking it, Having lots of time to read, and Concentrating well.

  34. European Survey of Language Competences/ESLC (2012): • to provide comparative data on foreign language competence • to provide insights into good practice in language learning (information about language learning, teaching methods and curricula).

  35. 14 EU countries: Belgium, Bulgaria, Croatia, Estonia, France, Greece, Malta, Netherlands, Poland, Portugal, Slovenia, Spain, Sweden and UK-England. Altogether 54,000 pupils. • Last year of lower secondary (ISCED2) or the second year of upper secondary education (ISCED3). • Minimum requirement: L2 studied at least one whole school year. • Each country tested the two languages most widely taught (first L2 and second L2); five languages: English, French, German, Italian and Spanish. • Each sampled pupil was tested in one language only.

  36. Unique features • Results related to the levels of the Common European Framework (CEFR) • Standard setting used to establish cutscores • Levels: C2/C1 = Proficient User, B2/B1 = Independent User, A2/A1 = Basic User • “Blue Bible”, CUP, 2001
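
Slide 36 notes that standard setting was used to establish cutscores linking ESLC results to CEFR levels. The sketch below shows only the final mapping step from a pupil's score to a band; the numeric cutscores and the Pre-A1 to B2 range are invented assumptions for illustration, not the values used in the ESLC.

# Minimal sketch of mapping a pupil's score to a CEFR band once standard
# setting has produced cutscores on the reporting scale. The cutscore
# values below are invented for illustration, not the ESLC cutscores.
CUTSCORES = [  # (minimum score, level), highest band first
    (80, "B2"),
    (60, "B1"),
    (40, "A2"),
    (20, "A1"),
]

def cefr_level(score: float) -> str:
    """Return the CEFR band for a score, or 'Pre-A1' if below every cutscore."""
    for minimum, level in CUTSCORES:
        if score >= minimum:
            return level
    return "Pre-A1"

print(cefr_level(72))  # -> 'B1' with these illustrative cutscores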

  37. Council of Europe, Strasbourg; Cambridge University Press, 2001


  40. Some key findings: positive impact • Early start of FL study • Parents’ knowledge of the language • Exposure to and use of the language through traditional and new media • Sense of usefulness in learning the language • A greater use of the foreign language in lessons by both teachers and pupils • As in Carroll’s study of French (1975), various indices of initial and in-service teacher education showed little relation to language proficiency. • -> Several policy recommendations provided.

  41. Reviews and criticisms of international assessments • In spite of the growth and the increasing interest in the outcomes by many stakeholders, there has been an undercurrent of critical response. • As expected, the research community has found several grounds for critical views, especially concerning the methodology used and the validity of the findings (e.g., translation, opportunity to learn, sampling…).

  42. There has been “hand-wringing”, national challenges (“We’re better than that!”), occasionally some drastic policy measures in countries that have done less well than expected, and admiration (and envy?) of the high-achieving countries. • What is the inevitable impact of large-scale, comparative studies, whether perceived as positive or negative? • Impact: Are there any signs of adapting teaching, testing and examinations, and even national curricula, to be aligned with the PISA approach – “teaching to the test” in order to obtain a higher ranking?

  43. Widespread agreement: international assessments are extremely challenging and complex, posing questions about validity ranging from construct definition and coverage to interpretation, use and consequences. • Equally important, however, is the fact that viewing the world as an “educational laboratory” or “educational experiment” holds promise for exploring and generating hypotheses, testing them and gaining a better understanding of systemic and cultural effects. • This means that, at best, international assessments can inform policy in positive directions concerning the learning of students as well as teachers, decision-makers and politicians.

  44. Some references
Communication from the Commission to the European Parliament and the Council. The European Indicator of Language Competence. COM(2005) 356 final. (21 pages)
Carroll, J. B. (1975). The Teaching of French as a Foreign Language in Eight Countries. Stockholm: Almqvist & Wiksell.
Dickson, P. & Cumming, A. (1996). Profiles of Language Education in 25 Countries. Slough: National Foundation for Educational Research.
Elley, W. B. (1992). How in the world do students read? The Hague: IEA.
Lewis, G. E., & Massad, C. E. (1975). The Teaching of English as a Foreign Language in Ten Countries: An Empirical Study. Stockholm: Almqvist & Wiksell.
Purves, A. C. (1973). Literature Education in Ten Countries I. Stockholm: Almqvist & Wiksell.
Gorman, T. P., Purves, A. C. & Degenhart, E. R. (1988). The IEA Study of Written Composition I: The International Writing Tasks and Scoring Guides. London: Pergamon Press.
Purves, A. C. (1992). The IEA Study of Written Composition II: Education and Performance in Fourteen Countries. London: Pergamon Press.
Thorndike, R. L. (1973). Reading Comprehension Education in Fifteen Countries. Stockholm: Almqvist & Wiksell.
PISA & PIRLS reports (consult their websites)
