A Language Assessment Kit – Relating to the CEFR – for French and English
Peter Lenz, IBE Seminar Warsaw, 20/10/2011
Overview of the presentation Context Development Product / Use Looking back and forward / some thoughts
2001 – European Year of Languages: launch of CEFR & ELP 15+ in Switzerland
In 2001 the Swiss Conference of Cantonal Ministers of Education recommended that the cantons
• consider the CEFR in curricula (objectives and levels) and in the recognition of diplomas
• facilitate wide use of the ELP 15+: make the ELP accessible to learners; help teachers integrate the ELP into their teaching
• develop ELPs for younger learners
Common European Framework of Reference… (CEFR)
A common reference for many foreign-language professionals:
• Course providers
• Curriculum/syllabus developers
• Materials authors
• Teacher trainers
• Examination providers, etc.
A basis for the description of
• Objectives
• Contents
• Methods
The CEFR isn't prescriptive but asks the right questions and favors certain answers…
An action-oriented approach and Reference levels CEFR favors an action-oriented approach (language use in context) Main objectives relate to communicative language proficiency CEFR describes 6 reference levels: A1 through C2 • Means of description: • Descriptors of communicative language activities • Descriptors of "competences" (or "language resources" or qualitative aspects of language use)
Core elements of CEFR & ELP: scaled descriptors Proficiency or can-do descriptors I can deal with most situations likely to arise whilst travelling in an area where the language is spoken. I can enter unprepared into conversation on topics that are familiar, of personal interest or pertinent to everyday life (e.g. family, hobbies, work, travel and current events).
Core elements of CEFR: scaled descriptors Descriptors of competences or qualitative aspects Consistently maintains a high degree of grammatical accuracy; errors are rare, difficult to spot and generally corrected when they do occur.
The Concept of Illustrative Descriptors
Illustrative descriptors may be considered as spotlights illuminating small areas of competence/proficiency while other areas remain in the dark.
[Diagram: numbered descriptors, e.g. "Can briefly give reasons and explanations for opinions, plans and actions.", scattered across Spoken Production, Spoken Interaction, Listening, Reading and Writing]
Descriptors outline and illustrate competence/proficiency levels but never define them exhaustively.
European Language Portfolios
For the hands of the learners: 3 parts – 2 main functions:
Parts: Language Passport, Language Biography, Dossier
Functions: Documentation, Facilitation of learning
From the ELP 15+ to an ELP for learners aged 11 to 15? – Teachers' wish list:
• More descriptors tailored to young learners' needs
• Less abstract formulations
• Self-assessment grid and checklists with finer levels
• Tools facilitating "hard" assessment
• Test tasks relating to descriptors
• Marked and assessed learner texts
• Assessed spoken learner performances on video
Beyond an ELP's reach: assessment criteria for Speaking (and Writing) relating to finer levels
The initiators
The German-speaking cantons of Switzerland and the Principality of Liechtenstein (FL)
The authorities' rationale
Promotion of the quality and effectiveness of school-based foreign-language teaching and learning by improving the quality, coherence and transparency of assessment
• CEFR as a basis; further elaboration of the reference levels
• Assessment and self-assessment instruments building upon descriptors
• Teacher-training material and early involvement of teachers to prepare the dissemination and introduction of the instruments in the school context
Overview of the presentation Context Development Product / Use Looking back and forward / some thoughts
ELP 11-15 – Overview of expected products
Benchmark performances (Speaking, Writing)
Ready-made "diagnostic" test sets
Assessment criteria (Speaking, Writing)
Bank of validated test tasks (5 "skills"; C-tests)
(Self-)assessment grid & checklists
Bank of target-group-specific descriptors (levels A1.1-B2.1)
Developing a Descriptor Bank Bank of target-group-specific descriptors (levels A1.1-B2.1)
Development of the descriptors
How were the new can-do descriptors developed?
1) Collect from written sources (ELPs, textbooks, other sources)
2) Validate and complement the collection in teacher workshops
• Teachers decide on relevance for target learners and on suitability for assessment
• Teachers complement the collection
3) Fine-tune and select descriptors
• Make formulations unambiguous and accessible; add examples
• Select descriptors to cover the whole range of levels A1.1 - B2.1
• Represent a wide range of skills and tasks
~330 descriptors for the empirical phase
Development of the descriptors
Data collection – teachers assess their pupils, following Schneider & North's methodology for the CEFR
Development of the descriptors
Scaling: linked and anchored assessment questionnaires of 50 descriptors each, for different levels
• 2 parallel sets of descriptors of similar difficulty per assumed level
• Identical descriptors as links (& sometimes CEFR anchors)
(Note: too few learners at B2.)
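The linking idea – adjacent questionnaire forms share a handful of identical descriptors so that all forms can later be scaled onto one common dimension – can be sketched in code. A minimal illustration in Python, in which the function name, the number of link items and the data layout are all hypothetical (the actual design also reused calibrated CEFR descriptors as anchors):

```python
def build_linked_forms(items_by_level, n_links=10):
    """Build one questionnaire form per assumed level from an ordered
    dict {level: [descriptor, ...]}.  The first n_links descriptors of
    each level are also copied into the previous level's form, so every
    pair of adjacent forms shares link items."""
    levels = list(items_by_level)
    forms = {}
    for i, level in enumerate(levels):
        form = list(items_by_level[level])
        if i + 1 < len(levels):
            # link items: these also appear as regular items in the next form
            form += items_by_level[levels[i + 1]][:n_links]
        forms[level] = form
    return forms
```

Because every form overlaps with its neighbour, responses to the shared descriptors tie the separate forms together in a single Rasch analysis.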
Development of the descriptors Statistical analysis and scale-building (A1.1 - B1.2)
ELP 11-15 Self-assessment Grid and Checklists (Self-)assessment grid & checklists Bank of target-group-specific descriptors (levels A1.1-B2.1)
Reformulations: I can ...
1) Some "Can do" descriptors are transformed into "I can" statements; classes use the descriptors for self-assessment and give feedback – can learners understand?
2) The whole bank of "Can do" descriptors is transformed into "I can" statements
ELP 11-15 – Overview of products
Bank of validated test tasks (5 "skills"; C-tests)
(Self-)assessment grid & checklists
Bank of target-group-specific descriptors (levels A1.1-B2.1)
Test Tasks
1) Test tasks relating to communicative language proficiency
• Speaking tasks (production and interaction)
• Writing tasks
• Listening tasks
• Reading tasks
Test tasks correspond to (or operationalize) one or more descriptors.
2) C-Tests (integrative tests)
• C-Tests are a special type of cloze test.
• C-Tests are said to provide reliable information on a learner's linguistic resources.
• C-Tests are quick.
All test tasks were field-tested and attributed to CEFR levels using pupils' self-assessment or teacher assessment (common-person equating).
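For readers unfamiliar with the format: a C-test is conventionally built by leaving the first sentence of a short text intact and then deleting the second half of every second word. A minimal sketch in Python; the rounding rule and the treatment of one-letter words vary between authors, so this illustrates the principle rather than the project's actual generator:

```python
import re

def make_c_test(text, gap_every=2):
    """Turn a short text into a C-test: the first sentence stays intact;
    afterwards every gap_every-th word is truncated to its first half
    (rounded up), the deleted letters shown as underscores."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    out, counted = [sentences[0]], 0
    for sentence in sentences[1:]:
        words = sentence.split()
        for i, word in enumerate(words):
            core = word.rstrip(".,;:!?")   # keep punctuation out of the letter count
            if len(core) < 2:
                continue                   # one-letter words are usually skipped
            counted += 1
            if counted % gap_every == 0:
                keep = (len(core) + 1) // 2
                words[i] = core[:keep] + "_" * (len(core) - keep) + word[len(core):]
        out.append(" ".join(words))
    return " ".join(out)
```

For example, `make_c_test("This is intact. Reading texts helps learners.")` returns "This is intact. Reading tex__ helps lear____."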
ELP 11-15 – Criteria and Benchmark Performances
Benchmark performances (Speaking, Writing)
Assessment criteria (Speaking, Writing)
Bank of validated test tasks (5 "skills"; C-tests)
(Self-)assessment grid & checklists
Bank of target-group-specific descriptors (levels A1.1-B2.1)
CEFR Table 3 – the point of departure Descriptors of qualitative aspects of performance Consistently maintains a high degree of grammatical accuracy; errors are rare, difficult to spot and generally corrected when they do occur.
Assessment criteria for Speaking
Where did the new qualitative criteria come from? – Steps taken:
1) Collect criteria
• Collect criteria from various sources: CEFR, examination schemes ...
2) Generate & select criteria: teachers assess spoken performances
• Teachers bring video recordings
• Teachers describe differences between learner performances they can watch on video → criteria emerge
• Teachers select and apply descriptors from the existing collection
• Teachers agree on essential categories (e.g. Vocabulary Range, Pronunciation/Intonation) and agree on a scale for each analytical category
3) Prepare empirical validation (experts)
• Decide on the categories of criteria to be retained
• Revise and complete the proposed scales of analytical criteria
• … and produce performances to apply the criteria to
Phase IV Producing video recordings of spoken performances One learner - different tasks in various settings 10 learners of English, 11 learners of French
Validation of criteria for Speaking
Methodology: a total of 35 teachers (14 Fr, 21 En) apply
• 58 analytical criteria (some from the CEFR) belonging to 5 categories
• 28 task-based can-do descriptors (matching the tasks performed)
to 10 or 11 video-taped learners per language, each performing 3-4 spoken tasks
Analytical criteria categories: Interaction, Vocabulary range, Grammar, Fluency, Pronunciation & Intonation
Scaling the criteria for Speaking
Criteria and questionnaires – a linked and anchored design:
• Three assessment questionnaires for three different learner levels
• Links between questionnaires
• CEFR anchors
Rating scale: "Statement doesn't apply to this pupil" – "Statement applies to this pupil but s/he can do clearly better" – "Statement generally applies to this pupil"
Criteria for Speaking – analysis
Teacher severity and consistency
Severity: some extreme raters (severe or lenient) show a strong need for rater training, although every criterion makes a meaningful (but somewhat abstract) statement on mostly observable aspects of competence.
Consistency: 5 out of 35 raters were removed from the analysis due to misfit (infit mean square up to 2.39).
[Item map for English]
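Infit is the information-weighted mean-square residual statistic of the Rasch family: values near 1 indicate ratings that fit the model, while values well above 1 (such as the 2.39 cited above) flag noisy, inconsistent raters. A minimal sketch for the simple dichotomous case, for illustration only – the project's analysis used a more elaborate many-facet design:

```python
import math

def infit_mean_square(responses, abilities, difficulties):
    """Infit for one rater under a dichotomous Rasch model:
    the sum of squared residuals (observed minus expected score)
    divided by the sum of the model variances p * (1 - p)."""
    sq_resid, var_sum = 0.0, 0.0
    for x, theta, b in zip(responses, abilities, difficulties):
        p = 1.0 / (1.0 + math.exp(-(theta - b)))  # model-expected score
        sq_resid += (x - p) ** 2
        var_sum += p * (1.0 - p)
    return sq_resid / var_sum
```

When ability equals difficulty, every expected score is 0.5 and any response pattern yields an infit of exactly 1.0; screening cut-offs are a matter of convention, but raters far above 1 are candidates for removal or retraining.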
Criteria for Speaking – outcomes Statistical analysis indicates • that we have good quality criteria • which may be used to assess learners from A1.1 to B2 Statistical analysis also indicates • which of the video-taped learners are the least or most able • which raters (teachers) were severe or lenient • which raters rated consistently or inconsistently Useful findings for teacher training on the basis of these videos The assessment criteria for written performances were developed using a very similar methodology
ELP 11-15 – Ready-made sets of test tasks
Ready-made "diagnostic" test sets
Benchmark performances (Speaking, Writing)
Assessment criteria (Speaking, Writing)
Bank of validated test tasks (5 "skills"; C-tests)
(Self-)assessment grid & checklists
Bank of target-group-specific descriptors (levels A1.1-B2.1)
Ready-made sets of test tasks • Ready-made, class-specific bundles of test tasks for Listening, Reading, Speaking and Writing • Information and advice for teachers regarding preparations, use and scoring/score interpretation
Overview of the presentation Context Development Product / Use Looking back and forward / some thoughts
The Kit: ring binder and database
Limited, non-personal licence
Elements: Test tasks Test tasks building upon descriptors C-Tests
Example: Listening task Instructions in German, the local L1
Example: Listening task Answer key Interpretation of scores in relation to CEFR levels.
Example: Spoken interaction task For use by teachers and also by learners
Example: Spoken interaction task For learner A