Measuring Learning with WBIEG Level-2 Evaluation Toolkit
By Violaine Le Rouzic, Evaluation Officer, WBIEG
March 27, 2005
What is a Level-2 evaluation?
Objective
• To help refine a course by:
  • Measuring how much participants learned
  • Assessing what participants learned
Evaluation Design
• Test all participants’ knowledge of course contents at the start & the end of the course
• Same difficulty on pre- & post-tests, but different items, to avoid pre-test recall effect
Main Indicator
• Group’s Learning Gain = Post-test – Pre-test
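As a minimal sketch of the main indicator, the group's learning gain can be computed as the mean post-test score minus the mean pre-test score. The slide does not say how scores are scaled; percent-correct scores and the sample values below are assumptions made for illustration.

```python
from statistics import mean

def learning_gain(pre_scores, post_scores):
    """Group learning gain: mean post-test score minus mean pre-test score."""
    return mean(post_scores) - mean(pre_scores)

# Hypothetical percent-correct scores for five participants (not real data).
pre = [40, 55, 35, 50, 45]
post = [70, 80, 60, 75, 65]

gain = learning_gain(pre, post)
print(f"Group learning gain: {gain} percentage points")
```

A positive gain means the group scored higher at the end of the course than at the start.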
What is the WBIEG Level-2 Evaluation Toolkit?
• A set of guidelines, templates, databases, and a macro that enable course teams to assess their participants’ learning.
• Adapts psychometrics to the WB context to measure learning with fair confidence (a tradeoff between science and feasibility)
Main Toolkit features
• Measures learning of the participant group in a course
• Practical, short, step-by-step
• Tasks divided between content experts & assistants
• Accounts for short preparation time, few test takers
• No evaluation knowledge required
• Requires basic Word, Excel & Internet browsing skills
• Test form templates in over ten languages
Not: theoretical, state-of-the-art psychometrics
Not: for certification of individual participants
For whom is the Toolkit?
• World Bank course teams (for courses with external and/or internal participants)
• World Bank managers or any WB staff who want to compare learning outcomes data across various criteria (division, years, etc.)
• Course teams in other organizations, who can use the evaluation tools on the external WB web site (but the test items and results database is for World Bank users only)
When to use the Toolkit?
• Level-2 evaluation is feasible:
  • Learning knowledge/skills is the main objective
  • Learning objectives are clear before the course
  • Every participant follows the same curriculum
• Worth investing in Level-2 evaluation:
  • Course will be offered again
  • Long enough (at least 1 week recommended)
  • Many participants (30 or more recommended)
• Enough resources and commitment:
  • Two staff weeks’ time for the evaluation
  • Commitment to use the evaluation results
Toolkit’s 13 Steps
http://web.worldbank.org/WBSITE/EXTERNAL/WBI/0,,contentMDK:20270021~pagePK:209023~piPK:335094~theSitePK:213799,00.html
1. Plan the evaluation
For course director
• Course team resources needed:
  • 1 week of the content expert team (can be split)
  • 1 week of assistants
  Less time required on subsequent L2 evaluations
• Through the course cycle
  • From early design stage to course re-design
  • Most time for test development before delivery
• Start at early course design stage
http://siteresources.worldbank.org/WBIINT/Resources/Plan-for-Level-2.pdf
2. Map the test
For course director
• Build a test specification matrix to determine:
  • Which content areas should be tested
  • To which cognitive domain each area relates
  • How many test items are needed per content area and cognitive domain (Recommended minimum total: 20 item pairs per test)
• Objective:
  • Make the test representative of the course content
http://siteresources.worldbank.org/WBIINT/Resources/How-to-build-matrix.pdf
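One way to picture a test specification matrix is as a table of content areas by cognitive domains, where each cell holds the number of item pairs planned. The sketch below is illustrative only: the area names, domain names, and counts are hypothetical, and only the 20-pair minimum comes from the slide.

```python
# Hypothetical test specification matrix: content area -> cognitive domain -> item pairs.
spec_matrix = {
    "Area A": {"knowledge": 4, "comprehension": 3, "application": 2},
    "Area B": {"knowledge": 3, "comprehension": 3, "application": 2},
    "Area C": {"knowledge": 2, "comprehension": 2, "application": 1},
}

MIN_ITEM_PAIRS = 20  # recommended minimum total from the slide

def total_item_pairs(matrix):
    """Total number of item pairs planned across all cells."""
    return sum(sum(row.values()) for row in matrix.values())

def area_shares(matrix):
    """Share of the test devoted to each content area (to compare with course emphasis)."""
    total = total_item_pairs(matrix)
    return {area: sum(row.values()) / total for area, row in matrix.items()}

total = total_item_pairs(spec_matrix)
meets_minimum = total >= MIN_ITEM_PAIRS
print(total, meets_minimum, area_shares(spec_matrix))
```

Comparing the area shares with the time the course spends on each area is one simple check that the test is representative of the course content.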
3. Review past items (optional)
For World Bank content experts only
• Consult a database with over 5,000 items used in over 100 WB courses with Level-2 evaluations
• Search for keywords in offering titles or items
• Potential benefits
  • Save time if some items fit your needs
  • Identify issues to avoid from past items
  • Get ideas on writing new items
• Caution
  • Item quality is context-specific; don’t re-use blindly!
http://intranet.worldbank.org/WBSITE/INTRANET/UNITS/WBIINT/0,,contentMDK:20191931~pagePK:135700~piPK:135698~theSitePK:136975,00.html
4. Write items
For content experts
• Match the content area and cognitive domain of the test specification matrix
• Test items use multiple-choice format
• All items have five response options (Last option is always “I don’t know.”)
• Average difficulty level
• Clearly stated
http://siteresources.worldbank.org/WBIINT/Resources/How-to-write-items.pdf
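The format rules on this slide (five response options, the last one always "I don't know.") lend themselves to a simple automated check. The sketch below is an assumption about how such a check might look; the `Item` structure and its field names are hypothetical and are not part of the Toolkit.

```python
from dataclasses import dataclass

DONT_KNOW = "I don't know."

@dataclass
class Item:
    """Hypothetical record for one multiple-choice test item."""
    stem: str
    options: list          # response options, in display order
    key: int               # index of the correct option
    content_area: str
    cognitive_domain: str

def format_problems(item):
    """Return a list of format problems; an empty list means the item passes."""
    problems = []
    if len(item.options) != 5:
        problems.append("item must have exactly five response options")
    if not item.options or item.options[-1] != DONT_KNOW:
        problems.append("last option must be \"I don't know.\"")
    if not 0 <= item.key < len(item.options) - 1:
        problems.append("key must point to one of the substantive options")
    return problems

good = Item("Sample question?", ["A", "B", "C", "D", DONT_KNOW], 2, "Area A", "knowledge")
bad = Item("Sample question?", ["A", "B", "C"], 0, "Area A", "knowledge")
print(format_problems(good))
print(format_problems(bad))
```

Checks like difficulty level and clarity still need a content expert's judgment; only the mechanical rules are covered here.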
5. Pair items
For content experts
• For each item, write an equivalent item
  • Same difficulty level
  • Same content area
  • Same cognitive domain
  • Same length
  • Same format
• Examples in guidelines
• Objective: Make pre- and post-test equivalent
http://siteresources.worldbank.org/WBIINT/Resources/How-to-pair-items.pdf
6. Pilot tests
For content experts (with assistants)
• Have volunteers take the tests (or part of the test) before the course. Volunteers can be:
  • Other content experts (to check key)
  • Alumni
  • Participant look-alikes
  • Non-content experts
  • BUT NOT the actual participants!
• Collect comments and demographics along with test responses. Test without the key, then with it.
http://siteresources.worldbank.org/WBIINT/Resources/How-to-pilot-items.pdf
7. Review test items
For content experts (with assistants)
• Use the pilot test responses and the Toolkit checklist to review each item for:
  • Content
  • Wording
  • Statistical item analysis results (if any)
• Finalize the items
http://siteresources.worldbank.org/WBIINT/Resources/How-to-review-items.pdf
8. Produce test forms
For assistants
• Use automated template to randomly assign items to either pre- or post-test
• Use test form templates (customize the templates, as needed)
• Use formatting and production guidelines
• Poor test form production can ruin all results!
http://siteresources.worldbank.org/WBIINT/Resources/How-to-prepare-test-forms.pdf
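The random assignment step can be sketched as follows. This is not the Toolkit's automated template, whose internals the slides do not describe; it is a minimal illustration that assumes items arrive as equivalent pairs and that one member of each pair goes to the pre-test and the other to the post-test. The item labels are hypothetical.

```python
import random

def assign_pairs(item_pairs, seed=None):
    """Randomly send one item of each pair to the pre-test and the other to the post-test."""
    rng = random.Random(seed)  # a seed makes the assignment reproducible
    pre_test, post_test = [], []
    for first, second in item_pairs:
        if rng.random() < 0.5:
            first, second = second, first
        pre_test.append(first)
        post_test.append(second)
    return pre_test, post_test

# Hypothetical labels for 20 item pairs.
pairs = [(f"Q{i}a", f"Q{i}b") for i in range(1, 21)]
pre_form, post_form = assign_pairs(pairs, seed=2005)
print(pre_form[:3], post_form[:3])
```

Randomizing within each pair keeps the two forms balanced by content area and cognitive domain, while ensuring that no item appears on both forms.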
9 & 10. Collect test forms
For any organizer on site
• Collect pre-test at course start and post-test at course end
• Have all participants answer
• Have all participants write their codes on both forms to match results by respondent
• Explain evaluation objectives & confidentiality
http://siteresources.worldbank.org/WBIINT/Resources/How-to-collect-pre-test.pdf
http://siteresources.worldbank.org/WBIINT/Resources/How-to-collect-post-test.pdf
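Participant codes exist so that each pre-test can be matched to the same person's post-test. A minimal sketch of that matching, assuming scores are stored in dictionaries keyed by code (the codes and scores below are hypothetical):

```python
def match_by_code(pre_scores, post_scores):
    """Pair pre- and post-test scores by participant code and list unmatched codes."""
    matched_codes = sorted(pre_scores.keys() & post_scores.keys())
    matched = {code: (pre_scores[code], post_scores[code]) for code in matched_codes}
    pre_only = sorted(pre_scores.keys() - post_scores.keys())
    post_only = sorted(post_scores.keys() - pre_scores.keys())
    return matched, pre_only, post_only

# Hypothetical codes and percent-correct scores.
pre_scores = {"K17": 40, "M02": 55, "R88": 35}
post_scores = {"K17": 70, "M02": 80, "Z41": 65}

matched, pre_only, post_only = match_by_code(pre_scores, post_scores)
print(matched)    # respondents with both forms
print(pre_only)   # took only the pre-test
print(post_only)  # took only the post-test
```

Unmatched forms still count toward all-respondent averages, but only matched respondents can contribute to a respondent-level comparison of pre- and post-test scores.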
11. Compute results
For assistants
• Follow tabulation guidelines
• Enter responses in tabulation template
• Click macro for automatic item analysis
http://siteresources.worldbank.org/WBIINT/Resources/How-to-tabulate-responses.pdf
11A. Results example: a course learning gain and post-test score
Figure callouts:
• Learning gain: green if statistically significant; orange if not
• Compare with other courses
• Pre- and post-test scores (matched respondents)
• Post-test scores (all respondents)
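The slides do not say which statistical test decides whether the learning gain is shown in green or orange. As one plausible illustration only, the sketch below applies a paired t-test to matched respondents' scores; the choice of test, the sample scores, and the critical value passed in (2.776, the two-sided 5% value for 4 degrees of freedom from a standard t table) are all assumptions.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(pre, post):
    """Paired t statistic for matched pre/post scores listed in the same respondent order."""
    diffs = [b - a for a, b in zip(pre, post)]
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))

def gain_color(t_stat, critical_value):
    """Green if the gain is statistically significant at the chosen level, else orange."""
    return "green" if abs(t_stat) > critical_value else "orange"

# Hypothetical matched percent-correct scores for five respondents.
pre = [40, 55, 35, 50, 45]
post = [70, 80, 60, 75, 65]

t_stat = paired_t(pre, post)
color = gain_color(t_stat, critical_value=2.776)  # df = 4, two-sided 5% level
print(round(t_stat, 2), color)
```

The critical value depends on the number of matched respondents, so it has to be looked up again for each course.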
11B. Results example: distribution of a course’s pre- and post-test scores
Figure callout:
• Post-test to the right of pre-test: participants learned
11C. Results example: post-test responses by item
Figure callouts:
• Key
• Items’ text
• Responses
11D. Results example: a course item analysis
Figure callouts:
• Check: item confusing or misconception not overcome
• Check: item too hard or course did not teach this well
• Check pre-post equivalence
• Most items statistically OK
• Confused high scorers
• Compare reliability (WB only)
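The slides do not list the statistics the Toolkit macro computes. Two classical item-analysis measures that relate to the callouts above are item difficulty (the proportion of respondents answering correctly) and an upper-lower discrimination index (how much better the top-scoring half does on an item than the bottom-scoring half). The sketch below uses those as stand-ins, with hypothetical responses in which "E" stands for the "I don't know." option.

```python
def item_analysis(responses, answer_key):
    """Per-item difficulty and upper-lower discrimination from raw responses."""
    scored = [[int(ans == k) for ans, k in zip(resp, answer_key)] for resp in responses]
    totals = [sum(row) for row in scored]
    # Rank respondents by total score, then split into bottom and top halves.
    order = sorted(range(len(scored)), key=lambda i: totals[i])
    half = len(order) // 2
    low, high = order[:half], order[-half:]
    results = []
    for j in range(len(answer_key)):
        difficulty = sum(row[j] for row in scored) / len(scored)
        p_high = sum(scored[i][j] for i in high) / half
        p_low = sum(scored[i][j] for i in low) / half
        results.append({"item": j + 1, "difficulty": difficulty,
                        "discrimination": p_high - p_low})
    return results

# Hypothetical post-test responses: four respondents, three items.
answer_key = ["A", "B", "C"]
responses = [
    ["A", "B", "C"],
    ["A", "B", "E"],
    ["A", "E", "E"],
    ["E", "E", "E"],
]
analysis = item_analysis(responses, answer_key)
for row in analysis:
    print(row)
```

Under these measures, a low difficulty value flags an item that may be too hard or poorly taught, and a negative discrimination value would mean high scorers missed the item more often than low scorers, which is one way to read "confused high scorers".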
12. Send results to WBIEG
For assistants (WBI courses only)
• WBIEG will:
  • Check quality of processing
  • Include test items and results in database
  • Report evaluation efforts and results on request
http://web.worldbank.org/WBSITE/EXTERNAL/WBI/0,,contentMDK:20270039~pagePK:209023~piPK:335094~theSitePK:213799,00.html
13. Interpret results
For course director & content experts
• Review and interpret results using:
  • Interpretation guidelines
  • Results database (WB only)
  • Toolkit glossary
• Decide how to improve next offering and next test
http://siteresources.worldbank.org/WBIINT/Resources/How-to-interpret-results.pdf
Thanks to:
• Developers: Joy Behrens, Guangbin Liu, WBIEG staff
• Advisors: Marlaine Lockheed, William Eckert, Sukai Prom-Jackson, Zhengfang Shi, Gary Echternacht …
Main contact: Violaine Le Rouzic