
Presentation Transcript


  1. DEPT, FERRARI and MENDELOVITS: How to analyze and exploit field test results with a view to maximizing cross-language comparability of main survey data collection instruments. Washington D.C., March 23, 2012

  2. State of the Art
  • Proliferation of multilingual comparative studies
  • Survey design includes a pilot, or Field Test (FT), carried out on a smaller scale
  • Items adapted into multiple languages before the FT
  • Key moment for linguistic quality control (LQC): right before the FT – translation verification
  • Comprehensive documentation of adaptation, adjudication and validation processes

  3. Between Field Test and Main Survey
  • In survey designs that include an FT and an MS, analysis of FT results yields a wealth of information
  • It can be used to inform item selection
  • But also to perform a more focused linguistic and formal verification before the MS
  • Open communication channels between item writers, national experts and verification coordinators

  4. The PISA paradigm
  • Inception in 2000, currently in its 5th survey cycle
  • Double translation, double source design
  • 32 national versions (2000) -> 85 national versions (2012)
  • From pencil-and-paper to computer-delivered assessments and background questionnaires
  • Compiling data on the adaptation history of each item in each language

  5. Analysis of FT Results
  • At item level: item statistics ("itanals")
    • Item discrimination
    • Item fit
    • Ability ordering
    • Point-biserial correlation (MCQ) – sketched below
  • Differential item functioning (DIF) analysis, by:
    • gender
    • country
    • language
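The statistics listed above are standard item analysis quantities. Here is a minimal sketch of two of them, point-biserial correlation and a high/low discrimination index, assuming a simple data layout of per-student option choices and total scores. The function names, the upper/lower 27% convention, and the thresholds are illustrative assumptions; PISA's actual itanals come from its own IRT-based scaling software.

```python
# A minimal sketch of two itanal statistics, assuming one row per student.
# Names and conventions here are illustrative, not PISA's actual tooling.
import numpy as np

def point_biserial(option_selected: np.ndarray, total_score: np.ndarray) -> float:
    """Correlation between choosing an option (0/1) and overall score.

    Expected to be positive for the key answer, negative for distractors.
    """
    return float(np.corrcoef(option_selected.astype(float), total_score)[0, 1])

def discrimination_index(correct: np.ndarray, total_score: np.ndarray,
                         tail: float = 0.27) -> float:
    """Proportion-correct gap between high and low achievers.

    Uses the classical upper/lower 27% groups; values below roughly 0.2
    are the 'dodgy' territory described on the next slides.
    """
    order = np.argsort(total_score)
    n = max(1, int(tail * len(total_score)))
    low, high = correct[order[:n]], correct[order[-n:]]
    return float(high.mean() - low.mean())
```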

  6. Multiple choice item: not dodgy
  • Itanal shows, for each of options A, B, C and D, the mean ability and standard deviation of the group of students who selected that response
  • Item discrimination: higher than 0.2
  • Item fit: within the expected range
  • Point-biserial correlation: should be positive for the key answer and negative for each distractor

  7. Multiple choice item: Dodgy
  • Item discrimination less than 0.2: low discrimination between high and low achievers
  • Item fit value significantly higher than 1: the item discriminates between high and low achievers less than expected
  • (a combined flagging rule is sketched below)
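Taken together, slides 6 and 7 amount to a small set of flagging criteria. A hedged sketch of such a rule follows; the 1.2 fit ceiling, the field names, and the wording of the reasons are assumptions for illustration, not the study's actual adjudication rules.

```python
# Illustrative 'dodgy item' flagging rule; thresholds other than 0.2
# (notably the 1.2 fit ceiling) are assumptions, not the study's rules.
from dataclasses import dataclass

@dataclass
class ItemStats:
    item_id: str
    discrimination: float        # high/low discrimination index
    fit_mnsq: float              # weighted mean-square fit from scaling
    key_pb: float                # point-biserial of the key answer
    distractor_pbs: list[float]  # point-biserials of the distractors

def dodgy_reasons(s: ItemStats,
                  disc_floor: float = 0.2,
                  fit_ceiling: float = 1.2) -> list[str]:
    """Return the reasons an item is flagged; an empty list means not dodgy."""
    reasons = []
    if s.discrimination < disc_floor:
        reasons.append("low discrimination between high and low achievers")
    if s.fit_mnsq > fit_ceiling:
        reasons.append("fit significantly above 1: discriminates less than expected")
    if s.key_pb <= 0:
        reasons.append("point-biserial of the key answer is not positive")
    if any(pb >= 0 for pb in s.distractor_pbs):
        reasons.append("a distractor has a non-negative point-biserial")
    return reasons
```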

  8. Action
  • Dodgy item reports sent:
    • to countries
    • to cApStAn
  • Recipients reflect on the data, examine the national version, and explain why such results may have occurred
  • As a result, FT-to-MS corrections are proposed by:
    • item writers / test developers
    • countries / verifiers

  9. Dodgy item

  10. MS version management
  • Base national MS version prepared for countries (using the final FT version)
  • Segment status indicates the type of action required
  • Segments locked if there are no FT-to-MS changes
  • Country review followed by focused verification
  • Difference reports (before/after) generated automatically (see the sketch below)
  • Reports examined by a referee
  • Final check on key corrections
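A minimal sketch of the automatic before/after difference report, assuming each national version is stored as a mapping from segment id to text, with a separate set of locked segment ids. The data model and function name are assumptions for illustration, not the actual version-management platform.

```python
# Illustrative FT-vs-MS difference report; the data model is an assumption.
import difflib

def diff_report(ft: dict[str, str], ms: dict[str, str],
                locked: set[str]) -> list[dict]:
    """List changed segments, flagging edits to locked segments for the referee."""
    rows = []
    for seg_id, ft_text in ft.items():
        ms_text = ms.get(seg_id, "")
        if ft_text == ms_text:
            continue  # unchanged: nothing for the referee to examine
        diff = "\n".join(difflib.unified_diff(
            ft_text.splitlines(), ms_text.splitlines(),
            fromfile="FT", tofile="MS", lineterm=""))
        rows.append({
            "segment": seg_id,
            "diff": diff,
            "locked_violation": seg_id in locked,  # locked segments must not change
        })
    return rows
```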

  11. CHALLENGES
  • Convincing reviewers / verifiers that "if it isn't broken, don't fix it"
  • Document each change with its justification (a sketch follows this list)
  • Check that changes have not introduced new errors or inconsistencies
  • Make more systematic use of dodgy item reports, including for background questionnaires
  • Embed these processes in the platforms and IT adaptation management systems
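One way to read the "document each change" and "check for new errors" points above is as a per-change record that a referee can audit later. A sketch under that reading; all field names are hypothetical.

```python
# Hypothetical per-change documentation record; not the study's actual schema.
from dataclasses import dataclass, field
from datetime import date

@dataclass
class ChangeRecord:
    segment_id: str
    before: str
    after: str
    justification: str        # e.g. "dodgy item report: distractor B ambiguous"
    proposed_by: str          # item writer, country, or verifier
    approved_by_referee: bool = False
    logged_on: date = field(default_factory=date.today)

def needs_followup(changes: list[ChangeRecord]) -> list[ChangeRecord]:
    """Changes lacking a documented justification or referee approval."""
    return [c for c in changes
            if not c.justification.strip() or not c.approved_by_referee]
```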

  12. Any questions? Thank you. steve.dept@capstan.be
